Bug stories: How do I call myself?


This bug is actually one of the primary reasons we had a Beta 2 release for RavenDB 4.0 so quickly.

The problem is easy to state: in any non-trivial deployment setup, clients were utterly unable to connect to us. Let us examine what I mean by a non-trivial setup, shall we?

A trivial setup is when you are running locally, binding to “http://localhost:8080”. In this case, everything is simple: you bind to the appropriate interface, and when a client connects to you, you let it know that your URL is “http://localhost:8080”.

Hm… this doesn’t make sense. If a client just connected to us, why do we need to tell it what URL it needs to use to connect to us?

Well, if there is just a single node, we don’t. But RavenDB 4.0 allows you to connect to any node in the cluster and ask it where a particular database is located. So the first thing that happens when you connect to a RavenDB server is that you find out where you really need to go. In the case of a single node, the answer is “you are going to talk to me”, but in the case of a cluster, it might be some other node entirely. And this is where things begin to get a bit problematic. The problem is that we need to know what to call ourselves when a client connects to us.
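To make that first hop concrete, here is a minimal sketch of the idea, not the actual RavenDB client code or wire protocol: the client asks whichever node it already knows about where the database lives, and then talks to whatever URL comes back. The `/databases/{name}/topology` endpoint and the response shape are made up for illustration.

```python
import json
from urllib.request import urlopen

SEED_NODE = "http://localhost:8080"  # any node we already know about

def find_database_node(seed_url, database):
    # Ask the seed node: "where does this database live?"
    # (Hypothetical endpoint and response shape, for illustration only.)
    with urlopen(f"{seed_url}/databases/{database}/topology") as resp:
        topology = json.load(resp)
    # The answer contains the URLs of the nodes hosting the database.
    # Each of those URLs is whatever the node *reports about itself* --
    # which is exactly the value this bug story is about.
    return topology["Nodes"][0]["Url"]

node_url = find_database_node(SEED_NODE, "Orders")
print(f"Talking to {node_url} from now on")
```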

That isn’t as easy as it might sound. Consider the case where the user configures the server URL to be “http://0.0.0.0:8080”. We can’t give that to the client, so in that case we default to sending back the host name. And this is where things started to get tricky. In many cases, the host name is not something that makes sense.
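Roughly, that fallback looks like the sketch below: if the configured URL uses a wildcard address, substitute the machine’s host name before handing the URL to a client. This is a simplified illustration, and `public_url_from_binding` is a hypothetical helper, not the server’s actual code.

```python
import socket
from urllib.parse import urlparse

def public_url_from_binding(configured_url):
    """Sketch of the 'what do we tell the client' fallback."""
    parsed = urlparse(configured_url)
    host = parsed.hostname
    if host in ("0.0.0.0", "::"):
        # Can't hand a wildcard address to a client -- fall back to the
        # machine's host name and hope it resolves from the outside.
        host = socket.gethostname()
    port = parsed.port or 80
    return f"{parsed.scheme}://{host}:{port}"

print(public_url_from_binding("http://0.0.0.0:8080"))
# e.g. "http://my-machine-name:8080" -- fine on an internal network,
# not so fine when that name means nothing to the client.
```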

Oh, for internal deployments, you can usually rely on it, but if you are deploying to AWS, for example, the machine host name is of very little use in routing to that particular machine. Or, for that matter, a docker container host name isn’t particularly useful when you consider it from the outside.

The problem is that with RavenDB, we had a single configuration value that was used both for binding to the network and for letting the user know how to connect to us. That didn’t work when you had routers in the middle. For example, if my public docker IP is 10.0.75.2, that doesn’t mean that this is the IP that I can bind to inside the container. And the same is true whenever you have any complex network topology (putting nginx in front of the server, for example).
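A quick way to see the conflict is the sketch below: from inside the container, the externally visible address is owned by the docker host, so you can’t actually bind to it, while the address you can bind to is useless to anyone on the other side of the router. The 10.0.75.2 address is just the example from the post.

```python
import socket

# What we'd like clients to use -- but it belongs to the docker host,
# not to us, so the bind fails from inside the container.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    s.bind(("10.0.75.2", 8080))
except OSError as e:
    print(f"can't bind to the public address: {e}")
finally:
    s.close()

# What we *can* bind to is the container's own interface (or 0.0.0.0) --
# but that address means nothing to a client sitting behind the port
# mapping, the router, or nginx.
s2 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s2.bind(("0.0.0.0", 0))  # any free port, just to show the bind succeeds
s2.close()
```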

The resolution for that was pretty simple: we added a new configuration value that separates the host that we bind to from the host that we report to the outside world. In this manner, you can bind to one IP but let the world know that you should be reached via another.
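Conceptually, the end result is something like this sketch: two separate values, one for the socket we listen on and one for the URL we advertise, and every “who are you / where do I go” answer uses the advertised one. The setting names here are illustrative, not RavenDB’s actual configuration keys.

```python
import http.server

# Two separate values: where we actually listen vs. what we tell clients.
# The names are illustrative; the point is that they are no longer the
# same configuration entry.
BIND_ADDRESS = ("0.0.0.0", 8080)        # inside the container / behind nginx
PUBLIC_URL   = "http://10.0.75.2:8080"  # what clients should be told to use

class TopologyHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # When a client asks where to go, answer with the public URL,
        # never with the address we happened to bind to.
        body = PUBLIC_URL.encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    http.server.HTTPServer(BIND_ADDRESS, TopologyHandler).serve_forever()
```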

More posts in "Bug stories" series:

  1. (29 Jun 2017) The memory ownership in the timeout
  2. (28 Jun 2017) How do I call myself?
  3. (27 Jun 2017) The data corruption in the cluster