Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

time to read 2 min | 361 words


Yep, another forum question. Unfortunately, in this case all I have is the title. Even more unfortunately, I already used the stripper metaphor before.

There are some questions that I am really not sure how to answer, because there are several underlying premises that are flat out wrong in the mere asking of the question.

“Can you design a square wheel carriage?” is a good example of that, and using a service bus for queries is another.

The short answer is that you don’t do that.

The longer answer is that you still don’t do that, but also explains why the question itself is wrong. One of the things that goes along with a service bus is the concept of services.

Services are autonomous.

Does this ring a bell?

You don’t query a service for its state, because that would violate the autonomy tenet.

“But I need the user’s data from the Personalization service to show the home page,” I can hear you say. Well, sure, but you don’t perform queries across a service boundary.

 

Notice the terminology here. You don’t perform queries across a service boundary.

But you can perform queries inside a service boundary. The image on the right shows one such example of that.

We have several services in a single application, and they communicate with one another using a service bus.

But a service isn’t just something that is running on a server somewhere. The personalization service also has a user interface, business logic that needs to run on the UI, etc.

That isn’t just some other part of the application that is accessing the personalization service operations. It is a part of the personalization service.

And inside a service boundary, there are no limitations on how you get the data you need to perform some operation.

You can perform a query using whatever method you like (a synchronous web service call, hitting the service database, using local state).

Personally, I would either access the data store directly (which usually means querying the service database) or use local state. I spoke about building a system where all queries are handled using local state here.
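
To make the distinction concrete, here is a minimal sketch (all names are invented for illustration): the personalization service maintains its own local state, kept up to date by events arriving over the bus, and the UI part of that same service queries the state directly. No query message ever crosses the service boundary.

```python
class PersonalizationService:
    def __init__(self):
        self._preferences = {}  # local state, owned by this service

    def handle_preferences_changed(self, event):
        # Event published on the service bus; this service only subscribes.
        self._preferences[event["user_id"]] = event["preferences"]

    def preferences_for(self, user_id):
        # Query performed *inside* the service boundary: the UI part of
        # this service calls this directly, against local state.
        return self._preferences.get(user_id, {})


service = PersonalizationService()
service.handle_preferences_changed(
    {"user_id": 42, "preferences": {"theme": "dark"}})
print(service.preferences_for(42))  # -> {'theme': 'dark'}
```

The point of the sketch is that the query is a plain in-process call against data the service already owns, not a request/response conversation over the bus.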

time to read 4 min | 711 words

Yes, I wrote my own CI server. I even did it in PowerShell, because that looks cool. You can find the source here. It is currently running in production and is responsible for pushing NH Prof builds out.

Now, what was I thinking when I built my own CI server? Put simply, I had the following goals:

  • Test WPF apps – CC.Net doesn’t allow it, since it is running as a service, and that affects the way WPF tests behave. Texo shells out to a different process, so it doesn’t have this limitation. Most other CI servers do the same.
  • Don’t expose passwords – the thing that really killed me with CC.Net was looking at the build log and seeing my password right there in plain text (this happens if there is a connectivity error to the repository). Yes, I am also surprised this is a feature.
  • Handle Git pushes – Git allows you to push several changes to the repository in a single shot. When I tried to use CC.Net to build NH Prof from git, it only showed the last commit. Texo understands the notion of a push (it takes it from the GitHub API) and can pass that information to the build script.
  • Reactive – Texo doesn’t poll the repository; in fact, most of the time it is completely passive (and is likely to be shut down). Whenever a push is made to the repository, GitHub will call Texo’s URL, providing the information about the current push. Texo will take that information and create the builder process, which will update/clone the repository and then execute the build command.
  • Small configuration footprint – there are only two types of configuration, the SMTP settings and the project information. Here is the full configuration file:
    <settings>
      <email>
        <smtpServer>smtp.gmail.com</smtpServer>
        <username>*****@gmail.com</username>
        <password>****</password>
        <useSSL>true</useSSL>
        <port>587</port>
        <from>*****@gmail.com</from>
      </email>
      <project
        url="https://github.com/ayende/Texo"
        name="Texo"
        git="C:\Work\Texo"
        ref="refs/heads/master"
        cmd="powershell .\psake.ps1 default.ps1 upload"
        build="3"
        workingDir="C:\Builds\Texo"
        email="ayende@ayende.com" />
    </settings>

What it doesn’t do:

  • UI, reports, tracking, whatever. Texo has one purpose in life: listen for changes and build the software, nothing else. The UI is a very simple email notification process, nothing more.
  • Hung build recovery.
  • Anything but git + github.

Texo is composed of two parts: a web endpoint, written in C#, that plugs into IIS, and a Builder PowerShell script. I assume that the IIS website user is going to be the same one that the tests will run under (which makes things much simpler). Once a notification arrives, the endpoint will invoke the Builder PowerShell script to perform the actual CI process.
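
For illustration, here is a rough stdlib-only Python sketch of the same flow (the function and field names are my own, not Texo’s; the real endpoint is C# under IIS invoking a PowerShell builder): a GitHub push payload is matched against the configured project, and the full list of commits in the push is handed to the build.

```python
import json

PROJECT = {  # mirrors the <project> element from the configuration above
    "url": "https://github.com/ayende/Texo",
    "ref": "refs/heads/master",
    "cmd": "powershell .\\psake.ps1 default.ps1 upload",
    "workingDir": "C:\\Builds\\Texo",
}

def handle_push(raw_payload):
    """Return the build invocation for a matching push, or None."""
    push = json.loads(raw_payload)
    if push.get("ref") != PROJECT["ref"]:
        return None  # push to a branch we do not build
    # A push can carry several commits; pass all of them to the build,
    # instead of only the last one.
    commits = [c["id"] for c in push.get("commits", [])]
    return {"cmd": PROJECT["cmd"], "cwd": PROJECT["workingDir"],
            "commits": commits}

payload = json.dumps({"ref": "refs/heads/master",
                      "commits": [{"id": "abc123"}, {"id": "def456"}]})
print(handle_push(payload))
```

The commit list is what distinguishes a push-aware server from a poll-based one that only ever sees the tip of the branch.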

At some point I really have to make a list of all the projects that I was involved in. I know how I am going to title it: “NIH R US”.

Oh, and can you figure out the naming?

time to read 3 min | 556 words

This is another forum question, this time from Brendan Rice:

A lot of developers are unsure of how best to go about making money from a product, how do you go about implementing licensing, what pay system do you use, how do you accept payment, are there any legal issues...

Well, talk about an open ended question. There are several aspects to the answer: legal, licensing, and payment processing. They are somewhat related, though.

From the legal side, you need to understand the basic legal concepts of software engineering: copyright, the idea of licensing software, what rights you care about and what you shouldn’t. I got a lot of my knowledge from simply researching the topic.

You might want to have a lawyer draft your EULA, but there are two major things that you want to remember about it. First, some people actually read the bloody things. If you put things in there that are too nefarious, people will get pissed at you. There is such a thing as bad publicity, and you want to avoid it.

The second important thing about a EULA is that if you take someone to court over it, you have already lost. I like to think about the EULA as just setting the grounds for what is expected from either side. By all means, get it through your lawyer, but be sure that you know what is in there. And be sure that it is an agreement that you would be willing to sign yourself.

From the licensing perspective, I had a disastrous experience using one licensing component, after which I decided that I might as well write my own. It is a pretty simple system, based on signed XML files: I hold the secret private key, and the application ships with the public key. It allows me to pass data around in a very simple form while protecting the license files from tampering. The code is available, and it is pretty simple, so I won’t get deeper into it.
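
The tamper-detection idea can be sketched in a few lines. Note the deliberate simplification: the post describes asymmetric signatures (private signing key, public verification key shipped with the app), while this stdlib-only sketch swaps in an HMAC over the license payload with a shared secret, which demonstrates the same "reject anything modified" property but is not the same trust model.

```python
import hashlib
import hmac

SECRET = b"vendor-signing-secret"  # stands in for the private key

def sign_license(payload: bytes) -> bytes:
    # Signature travels alongside the license payload.
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest().encode()

def verify_license(payload: bytes, signature: bytes) -> bool:
    expected = sign_license(payload)
    return hmac.compare_digest(expected, signature)

license_xml = b"<license product='NHProf' expires='2010-01-01'/>"
sig = sign_license(license_xml)
print(verify_license(license_xml, sig))          # valid license
print(verify_license(license_xml + b"!", sig))   # tampered -> rejected
```

In the real scheme the verification key can ship with the application because it cannot be used to forge new licenses; the HMAC variant above could not be shipped that way.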

The last part, payment processing, is probably the most interesting bit. I use a payment provider, because trying to manage something like that yourself is a nightmare. My payment provider handles all sorts of payment options, including things that require someone to answer the phone or manually clear mailed checks, etc.

They also provide a nice admin site, where I can do things like generate coupons, such as this one: NHP-45K2D46S27 (yes, it is a valid one, at least until someone uses it), refund people, handle taxation, view interesting reports, and in general administer all aspects of accepting payments.

They take a commission that isn’t significantly larger than most credit cards’, and in general they remove so many headaches that I am happy to pay them.

The result of a successful order with the payment provider is an email sent to a mailbox monitored by a service. That email is read, parsed, and the corresponding license file is then sent to the user.
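
The parsing step is trivial; here is a hypothetical sketch (the email format and field names are assumptions, not the provider’s actual format) of turning the notification email into the data needed to issue a license.

```python
import email

RAW_ORDER = """\
From: orders@payment-provider.example
Subject: New order

Product: NHProf
Customer: jane@example.com
"""

def parse_order(raw):
    # Pull "Key: Value" lines out of the notification email's body.
    msg = email.message_from_string(raw)
    fields = dict(line.split(": ", 1)
                  for line in msg.get_payload().splitlines() if ": " in line)
    return {"product": fields["Product"], "customer": fields["Customer"]}

order = parse_order(RAW_ORDER)
print(order)  # -> {'product': 'NHProf', 'customer': 'jane@example.com'}
```

From there, the service would generate the signed license file for the product and mail it to the customer address.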

Nothing really earth shattering in the whole process, yes, I know. But it is probably worth outlining it clearly for people who haven’t done it yet. It isn’t complex or hard by any measure.

time to read 3 min | 478 words

Well, so far, so good.

I started by defining a simple Linq to SQL model; there is nothing you need to do here to make things work:

[Model diagram]

And now to the actual code using this model:

static void Main()
{
    LinqToSqlProfiler.Initialize();
    using (var db = new BlogModelDataContext())
    {
        var q = from blog in db.Blogs
                where blog.Title == "The Lazy Blog"
                select blog;

        foreach (var blog in q)
        {
            Console.WriteLine(blog.Title);

            foreach (var post in blog.Posts)
            {
                Console.WriteLine(post.Title);
            }
        }
    }
}

I think that we can agree that this is pretty routine usage of Linq to SQL. The only thing extra that we need is to initialize the profiler endpoint.

The end result is:

[Profiler output]

We can detect data contexts opening and closing, we can detect queries and their parameters, format them to display properly, and show their results. We can even detect queries generated by lazy loading, along with the stack trace that caused each query (a hugely valuable feature).

Now, before you get too excited, this is a spike. A lot of the code is going to end up in the final bits, but there is a lot more to do yet.

Things that I am not going to be able to do:

  • Track local transactions (I’ll probably be able to track distributed transactions, however)
  • Show query row count
  • Track query duration
  • Show entities loaded by session

I am going to be able to show at least some statistics, however, which is pretty nice, all told.

Thoughts?

time to read 2 min | 376 words

Billy Newport is talking about Redis, showing some of the special APIs that Redis offers.

  • Redis gives us first-class List/Set operations, simplifying many tasks involving collections. It is easy to get into big problems afterward, though.
  • Can do 100,000 operations per second.
  • Redis encourages a column oriented view; you use things like:
R.set("user:123@firstname", "billy")
R.set("user:123@surname", "newport")
R.set("uid:bewport", 123)

Ayende’s comment: I really don’t like that. No transactions or consistency, and this requires lots of remote calls. 

  • Bugs in your code can corrupt the entire data store, causing severe issues in development.
  • There is a sample Twitter-like implementation, and the code is pretty interesting; it is a work-on-write implementation.
  • List/set operations are a problem. What happens when you have a big set? Case in point: Ashton has 4 million followers; work-on-write doesn’t work in this case.
  • 100,000 operations per second doesn’t mean much when a routine scenario results in millions of operations.
  • This is basically the usual SELECT N+1 issue.
  • Async approach is required, processing large operations in chunks.
  • Changing the way we work, instead of getting the data and working on it, send the code to the data store and execute it there (execute near the data).
    • Ayende’s note: That is still dangerous; what happens if you send a piece of code to the data store and it hangs?
  • The usual problems with a column oriented approach: no reports, need for export tools.
  • Maybe use closures as a way to send the code to the server?
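
The Ashton point is easy to quantify with a back-of-the-envelope sketch (a plain dict stands in for Redis, and all names are invented): with work-on-write, posting costs one store operation per follower, so a single post by a 4-million-follower account turns into millions of operations, and the quoted 100,000 ops/sec stops sounding impressive.

```python
store = {}  # stands in for Redis lists keyed by "timeline:<user>"

def post_tweet(author, followers, text):
    # Work-on-write: materialize the post into every follower's
    # timeline -- one store operation (remote call) per follower.
    for follower in followers:
        store.setdefault(f"timeline:{follower}", []).append((author, text))
    return len(followers)

ops = post_tweet("billy", ["a", "b", "c"], "hi")
print(ops)  # 3 followers -> 3 operations

# Scale that to the example from the talk: 4M followers at 100k ops/sec.
print(4_000_000 / 100_000)  # -> 40.0 seconds of server time for one post
```

This is exactly the SELECT N+1 shape mentioned above: a routine action whose cost is linear in the size of a collection you don’t control.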

Ayende’s thoughts:

I need to think about this a bit more, I have some ideas based on this presentation that I would really like to explore more.

time to read 3 min | 425 words

I have a tremendous amount of respect for Michael Feathers, so it is a no-brainer to see his presentation.

Michael is talking about why global variables are not evil. We already have global state in the application; removing it is bad/impossible. Avoiding global variables leads to very deep argument passing chains, where something needs an object and it is passed through dozens of objects that just pass it down. We already have the notions of how to test systems using globals (Singletons). He also talks about Repository Hubs & Factory Hubs, which provide the scope for the usage of a global variable.

  • Refactor toward explicit seams, do not rely on accidental seams, make them explicit.
  • Test Setup == Coupling, excessive setup == excessive coupling.
  • Slow tests indicate insufficient granularity of coupling <- I am not sure that I agree with that; see my previous posts about testing for why.
  • It is often easier to mock outward interfaces than inward interfaces (try to avoid mocking stuff that return data)
  • One of the hardest things in legacy code is making a change and not knowing what it is affecting. Functional programming makes it easier, because of immutability.
  • Seams in a functional languages are harder. You parameterize functions in order to get those seams.
  • TUF – Test Unfriendly Feature – IO, database, long computation
  • TUC – Test Unfriendly Construct – static method, ctor, singleton
  • Never Hide a TUF within a TUC
  • No Lie principle – code should never lie to you. Ways that code can lie:
    • Dynamically replacing code in the source
    • Addition isn’t a problem
    • System behavior should be “what I see in the code + something else”, never “what I see minus something else”
    • Weaving & aspects
    • Impact on inheritance
  • The Fallacy of Restricted Languages
  • You want to rewrite if the architecture itself is bad; if you have issues making changes rapidly, it is time to refactor the rough edges out.

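The "never hide a TUF within a TUC" rule can be made concrete with a small sketch (all names invented): the test-unfriendly feature (file IO) is reached through an explicit, parameterized seam rather than being buried in a construct nothing can replace.

```python
def load_config_bad():
    # TUF (file IO) hidden inside a hard-wired function (a TUC-like
    # construct): nothing a test can substitute without monkey-patching.
    with open("/etc/myapp.conf") as f:
        return f.read()

def load_config(read_file=lambda path: open(path).read()):
    # Explicit seam: the TUF is a parameter with a production default,
    # so a test can slip in a replacement.
    return read_file("/etc/myapp.conf")

# A test passes a fake through the seam -- no real IO happens:
fake = lambda path: "mode=test"
print(load_config(read_file=fake))  # -> mode=test
```

This is the "refactor toward explicit seams" point from the list above: the seam is declared in the signature instead of being an accident of the implementation.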
time to read 2 min | 378 words

With the release of NH Prof v1.0, I started to look into whether I can extend what I am doing for NHibernate to other OR/Ms in the .NET space. My initial spiking makes me optimistic: this is certainly possible. I’ll probably talk at length about the actual architectural implementation, but for now I want to concentrate on the high level requirements. I want to be able to support the following:

  • Linq to SQL
  • SubSonic
  • LLBLGen
  • Plug your own DAL

While none of them are going to provide me with the detailed information that I can get from NHibernate, it turns out that I can get pretty good mileage from just pushing the basics along. The first spikes with Linq to SQL are promising (more about that will show up starting next week or the one after), and I intend to allow you to:

  • Show DataContext
  • Show SQL Statements
    • Show you the actual formatted SQL, including parameters
    • Show you the stack trace of where that SQL was generated
  • Generate alerts for bad practices such as SELECT N+1 or issuing too many queries

There are things that I can do with NHibernate that are simply not possible with other OR/Ms (something like tracking loaded entities per session, for example, or showing cached queries), but since most of those are actually capabilities that NHibernate has and the others do not, I think it is still great.

Currently the plan is to have a separate product for each OR/M; that means that buying NH Prof will not get you L2S Prof, but we will most likely have some uber license that covers all of them.

You’ll notice that the Entity Framework isn’t listed in my initial targets. That is for a very simple reason: plugging into EF seems to be about nine times harder than doing it with anything else. I would need strong feedback that this is something enough people are willing to pay for.

time to read 2 min | 213 words

Mike Rettig has left a somewhat snarky comment on a post detailing a deadlock issue that I ran into:

Locking on shared state? I thought you were a proponent of message based concurrency.  This post demonstrates exactly why concurrency combined with shared state is so hard.
Looking forward to your next thread race or deadlock,

The problem with message passing concurrency is the underlying assumption that there isn’t any shared state. But in my situation, that is not a valid assumption.

Let us see if I can give a good example of what I mean. Let us assume that we have a message passing system that exchanges the following messages:

  • Session Created { Session Id }
  • Statement Executed { Session Id, Statement Text }
  • Query Sessions And Statements { }

Furthermore, you are not going to make use of something like a DB to manage the state (which would handle the sharing issue for you); you have to manage everything in memory.

I would be very interested in hearing how you can design such a system without having shared state and locking.
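
To make the challenge concrete, here is a sketch of the scenario (the message names come from the post; the implementation details are my own): the handlers for Session Created and Statement Executed both touch the same in-memory index that Query Sessions And Statements reads, so some coordination over that shared state seems unavoidable.

```python
import threading

sessions = {}              # shared state: session id -> statements
lock = threading.Lock()    # the coordination the post argues you need

def on_session_created(msg):
    with lock:
        sessions[msg["session_id"]] = []

def on_statement_executed(msg):
    with lock:
        sessions[msg["session_id"]].append(msg["statement"])

def on_query_sessions_and_statements(_msg):
    # The query spans state produced by *both* other message types.
    with lock:
        return {sid: list(stmts) for sid, stmts in sessions.items()}

on_session_created({"session_id": 1})
on_statement_executed({"session_id": 1, "statement": "SELECT * FROM Blogs"})
print(on_query_sessions_and_statements({}))
```

One could serialize all three handlers onto a single consumer thread and drop the lock, but that only relocates the coordination; the state is still shared between the message types.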

time to read 2 min | 262 words


I was a bit quiet on the NH Prof front lately, not because I didn’t work on it, but because I was fighting a really nasty bug. The way NH Prof makes use of WPF exposed a memory leak scenario inside WPF.

Luckily, once we were able to isolate the actual problem, it was relatively easy to find a workaround. If you care, the resolution was to keep a single instance bound to the view and replace its values, instead of providing a new instance when replacement was required. The problem was that this bug took forever to isolate.
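
The pattern itself is language-neutral; here is a minimal sketch of it (the WPF specifics are only described in the post, and these class and property names are invented): one long-lived instance stays bound to the view, and only its values are replaced, so the binding target’s identity never changes.

```python
class StatementModel:
    """Stands in for a view model bound to the UI exactly once."""

    def __init__(self):
        self.sql = ""
        self.duration_ms = 0

    def update_from(self, sql, duration_ms):
        # Replace the values; the bound instance's identity is stable,
        # so nothing ever rebinds (the rebinding was what leaked).
        self.sql = sql
        self.duration_ms = duration_ms

bound = StatementModel()           # bound to the view once
first_identity = id(bound)
bound.update_from("SELECT 1", 3)   # new data, same instance
bound.update_from("SELECT 2", 5)
assert id(bound) == first_identity
print(bound.sql)  # -> SELECT 2
```

The trade-off is mutability: the view model must support in-place updates instead of being a cheap immutable snapshot.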

Also included are a bunch of performance optimizations that I made along the way while resolving the OOM error. Those relate to better handling of batch statements, caching the results of SQL parsing, and optimizing NH Prof’s idle state (I’ll have a separate post about that).

In addition to that, I gave some additional love to DDL statements, making sure that NH Prof treats them specially and doesn’t generate DML errors for DDL statements. The same goes for cached statements.

It is a bunch of small changes, the biggest of them being tracking down and viciously attacking the OOM error. For the next week, I am going to be busy at JAOO, but I also intend to spend some free time continuing the final polishing & touch-ups.

The week after that, I intend to seriously start working on the 1.1 feature set.

Should be interesting.

time to read 3 min | 579 words

This is just something that came up recently in a mailing list; we were talking about copyright, ownership, and such. Specifically, the question of who owns the code you write on your own time (and on your own machines).

The opinion of some people was that the employer may own the code even under those circumstances. It seems that it usually isn’t part of the law (that depends on where you are, of course), but it is part of standard employment contract templates.

When I started looking for a job, I insisted on taking the employment contract home and going over it with:

  • a calm mind
  • having another set of eyes go over it

I had one case of not properly reading what I was signing, with bad consequences; I have learned since then.

There is no such thing as a standard contract, you can always negotiate.

For that matter, I rejected an offer from one place after verbal agreements that we had reached didn’t make it into the contract (twice!). I decided that if they were trying to effectively cheat me before I was even working for them, I had better things to do than to put my head into this sickbed.

Some of the things that I have found in employment contracts are of the sort that would make your hair curl. Non compete agreements that basically say that you are not allowed to do any work (for anyone) for 2 years after you stop working for the company. Ownership of anything you do (be it software artifacts, a book about flowers, and quite possibly any children you have during your employment term).

Some of them are unenforceable in court, but you would be in a much better position if you didn’t have to deal with an annoying section in a contract that you signed in the first place.

My usual approach to reading contracts is to debug them, assuming that the other side is nefarious, evil, double dealing, and likes kicking puppies before breakfast. Most places will go with the “Try and you shall succeed” method for contracts. If you sign without complaints, they are good. If you object to something, they can amend the contract to be more reasonable. It isn’t that they are nefarious, or that they even plan to act according to the contract. But it is best if they don’t have any leverage on you.

An interesting point that I ran into is that it is often useful to be bold when negotiating a contract. I deleted the non compete clause from my employment contract when I reviewed it, and required a lot of clarifications about which of my work amounts to the company’s property. I followed the same logic as they did, “Try and you shall succeed”; if they didn’t care about that, I was good.

We ended up with a 1 year limitation for clients that they sent me to, and an agreement that any software work that I do on the company’s time or using their equipment belongs to the company, which I considered reasonable.

Not reading the contract is a crime. Once you have read it, be very careful in deciding what is acceptable and what isn’t. And if you have already signed a contract, make sure that you know what is in it.
