Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

time to read 3 min | 466 words

Recently I have been finding myself writing more and more infrastructure-level code. There are several reasons for that, mostly that the architectural approaches I advocate don't have good enough infrastructure support in the environments I usually work in.

Writing infrastructure is both fun and annoying. It is fun because you usually don't have business rules to deal with; it is annoying because it takes time to get it to do something that gives the business real value.

That said, there are some significant differences between writing application-level code and infrastructure-level code. I usually think about it like this:

[Image: a layered diagram, with application code on top of framework code on top of infrastructure code]

Infrastructure code is usually the base. It provides basic services such as communication, storage, thread management, etc. It should also provide strong guarantees about what it is doing; it should be simple and understandable, and it should provide the hooks to understand what happens when things go wrong.

Framework code sits on top of the infrastructure and provides easy-to-use semantics on top of it. It usually takes away some of the options that the infrastructure gives you in order to present a more focused solution for a particular scenario.

Application code is even more specific than that, making use of the underlying framework to deal with much of the complexity we have to handle.

Writing application code is easy; it is single-purpose code. Writing framework and infrastructure code is harder, because they have much wider applicability.

So far, I don’t believe that I said anything new.

What is important to understand is that practices that work for application-level code do not necessarily work for infrastructure code. A good example would be this nasty bit of work. It doesn't read very well, it has some really long methods, and... it handles a lot of important infrastructure concerns that you have to deal with. For example, it is completely async, it has good error handling and reporting, and it has absolutely no knowledge about what exactly it is doing. That is left for higher-level pieces of the code. Trying to apply application-level practices to it will not really work; different constraints and different requirements.

By the same token, testing such code follows a different pattern than testing application-level code. Tests are often more complex, requiring more behavior in the test to reproduce real-world scenarios. The tests can rarely be isolated bits; they usually have to include significant pieces of the infrastructure. And what they test can be complex as well.

Different constraints and different requirements.

Test refactoring

time to read 3 min | 404 words

I just posted about a horribly complicated test; I thought I might as well share the result of refactoring it:

[TestFixture]
public class IndexedEmbeddedAndCollections : SearchTestCase
{
	private Author a;
	private Author a2;
	private Author a3;
	private Author a4;
	private Order o;
	private Order o2;
	private Product p1;
	private Product p2;
	private ISession s;
	private ITransaction tx;

	protected override IList Mappings
	{
		get
		{
			return new string[]
			{
				"Embedded.Tower.hbm.xml",
				"Embedded.Address.hbm.xml",
				"Embedded.Product.hbm.xml",
				"Embedded.Order.hbm.xml",
				"Embedded.Author.hbm.xml",
				"Embedded.Country.hbm.xml"
			};
		}
	}

	protected override void OnSetUp()
	{
		base.OnSetUp();

		a = new Author();
		a.Name = "Voltaire";
		a2 = new Author();
		a2.Name = "Victor Hugo";
		a3 = new Author();
		a3.Name = "Moliere";
		a4 = new Author();
		a4.Name = "Proust";

		o = new Order();
		o.OrderNumber = "ACVBNM";

		o2 = new Order();
		o2.OrderNumber = "ZERTYD";

		p1 = new Product();
		p1.Name = "Candide";
		p1.Authors.Add(a);
		p1.Authors.Add(a2); //be creative

		p2 = new Product();
		p2.Name = "Le malade imaginaire";
		p2.Authors.Add(a3);
		p2.Orders.Add("Emmanuel", o);
		p2.Orders.Add("Gavin", o2);


		s = OpenSession();
		tx = s.BeginTransaction();
		s.Persist(a);
		s.Persist(a2);
		s.Persist(a3);
		s.Persist(a4);
		s.Persist(o);
		s.Persist(o2);
		s.Persist(p1);
		s.Persist(p2);
		tx.Commit();

		tx = s.BeginTransaction();

		s.Clear();
	}

	protected override void OnTearDown()
	{
		// Tidy up
		s.Delete("from System.Object");

		tx.Commit();

		s.Close();

		base.OnTearDown();
	}

	[Test]
	public void CanLookupEntityByValueOfEmbeddedSetValues()
	{
		IFullTextSession session = Search.CreateFullTextSession(s);

		QueryParser parser = new MultiFieldQueryParser(new string[] { "name", "authors.name" }, new StandardAnalyzer());

		Lucene.Net.Search.Query query = parser.Parse("Hugo");
		IList result = session.CreateFullTextQuery(query).List();
		Assert.AreEqual(1, result.Count, "collection of embedded (set) ignored");
	}

	[Test]
	public void CanLookupEntityByValueOfEmbeddedDictionaryValue()
	{
		IFullTextSession session = Search.CreateFullTextSession(s);
		
		//PhraseQuery
		TermQuery  query = new TermQuery(new Term("orders.orderNumber", "ZERTYD"));
		IList result = session.CreateFullTextQuery(query).List();
		Assert.AreEqual(1, result.Count, "collection of untokenized ignored");
		query = new TermQuery(new Term("orders.orderNumber", "ACVBNM"));
		result = session.CreateFullTextQuery(query).List();
		Assert.AreEqual(1, result.Count, "collection of untokenized ignored");
	}

	[Test]
	[Ignore]
	public void CanLookupEntityByUpdatedValueInSet()
	{
		Product p = s.Get<Product>(p1.Id);
		p.Authors.Add(s.Get<Author>(a4.Id));
		tx.Commit();

		QueryParser parser = new MultiFieldQueryParser(new string[] { "name", "authors.name" }, new StandardAnalyzer());
		IFullTextSession session = Search.CreateFullTextSession(s);
		Query query = parser.Parse("Proust");
		IList result = session.CreateFullTextQuery(query).List();
		//HSEARCH-56
		Assert.AreEqual(1, result.Count, "update of collection of embedded ignored");

	}
}

It is almost the same as before, and the changes are mainly structural, but the result is so much easier to read, understand and debug.

time to read 2 min | 327 words

I think, without question, that this is one of the most horribly nasty tests that I have seen. It was ported as-is from the Java code, and you can literally feel the nastiness in reading it:

[Test]
public void IndexedEmbeddedAndCollections()
{
	Author a = new Author();
	a.Name = "Voltaire";
	Author a2 = new Author();
	a2.Name = "Victor Hugo";
	Author a3 = new Author();
	a3.Name = "Moliere";
	Author a4 = new Author();
	a4.Name = "Proust";

	Order o = new Order();
	o.OrderNumber = "ACVBNM";

	Order o2 = new Order();
	o2.OrderNumber = "ZERTYD";

	Product p1 = new Product();
	p1.Name = "Candide";
	p1.Authors.Add(a);
	p1.Authors.Add(a2); //be creative

	Product p2 = new Product();
	p2.Name = "Le malade imaginaire";
	p2.Authors.Add(a3);
	p2.Orders.Add("Emmanuel", o);
	p2.Orders.Add("Gavin", o2);


	ISession s = OpenSession();
	ITransaction tx = s.BeginTransaction();
	s.Persist(a);
	s.Persist(a2);
	s.Persist(a3);
	s.Persist(a4);
	s.Persist(o);
	s.Persist(o2);
	s.Persist(p1);
	s.Persist(p2);
	tx.Commit();

	s.Clear();

	IFullTextSession session = Search.CreateFullTextSession( s );
	tx = session.BeginTransaction();

	QueryParser parser = new MultiFieldQueryParser( new string[] { "name", "authors.name" }, new StandardAnalyzer() );

	Lucene.Net.Search.Query query = parser.Parse( "Hugo" );
	IList result = session.CreateFullTextQuery( query ).List();
	Assert.AreEqual( 1, result.Count, "collection of embedded ignored" );

	//update the collection
	Product p = (Product) result[0];
	p.Authors.Add( a4 );

	//PhraseQuery
	query = new TermQuery( new Term( "orders.orderNumber", "ZERTYD" ) );
	result = session.CreateFullTextQuery( query).List();
	Assert.AreEqual( 1, result.Count, "collection of untokenized ignored" );
	query = new TermQuery( new Term( "orders.orderNumber", "ACVBNM" ) );
	result = session.CreateFullTextQuery( query).List();
	Assert.AreEqual( 1, result.Count, "collection of untokenized ignored" );

	tx.Commit();

	s.Clear();

	tx = s.BeginTransaction();
	session = Search.CreateFullTextSession( s );
	query = parser.Parse( "Proust" );
	result = session.CreateFullTextQuery( query ).List();
	//HSEARCH-56
	Assert.AreEqual( 1, result.Count, "update of collection of embedded ignored" );

	// Tidy up
	s.Delete(a);
	s.Delete(a2);
	s.Delete(a3);
	s.Delete(a4);
	s.Delete(o);
	s.Delete(o2);
	s.Delete(p1);
	s.Delete(p2);
	tx.Commit();

	s.Close();
}

A point to anyone who wants to start a list of how many good testing practices this test violates.

Oh, and right now, this test fails.

time to read 2 min | 208 words

My recent post drew quite a few comments, many of them about two major topics. The first is the mockability of my approach, and the second is the separation of layers.

This post is about the first concern. I don't usually mock my database when using NHibernate; I use an in-memory database and leave it at that. There are usually two common patterns for loading data from the database: loading by primary key, usually using either Get or Load, and using a query.

If we are using Get or Load, this is extremely easy to mock, so I won’t touch that any further.

If we are using a query, I categorically don't want to mock that. Querying is a business concern, and should be treated as such. Setting up an in-memory database to be used with NHibernate is easy; you'll have to wait until the 28th for the post about that to be published, but it is basically ten lines of code that you stick in a base class.

As such, I have no real motivation to try to abstract that away, I gain a lot, and lose nothing.
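To give a sense of the shape of such a base class, here is a sketch under my own assumptions (the class name, the entity assembly and the exact SchemaExport overload are illustrative; the overload signatures vary between NHibernate versions, so this is not the code from the upcoming post):

```csharp
using System;
using NHibernate;
using NHibernate.Cfg;
using NHibernate.Dialect;
using NHibernate.Driver;
using NHibernate.Tool.hbm2ddl;

// Hypothetical base class: every test gets a fresh in-memory SQLite database.
public abstract class InMemoryDatabaseTest : IDisposable
{
    private static Configuration configuration;
    private static ISessionFactory sessionFactory;
    protected readonly ISession session;

    protected InMemoryDatabaseTest()
    {
        if (sessionFactory == null)
        {
            configuration = new Configuration()
                .SetProperty(Environment.ReleaseConnections, "on_close")
                .SetProperty(Environment.Dialect, typeof(SQLiteDialect).AssemblyQualifiedName)
                .SetProperty(Environment.ConnectionDriver, typeof(SQLite20Driver).AssemblyQualifiedName)
                .SetProperty(Environment.ConnectionString, "Data Source=:memory:")
                .AddAssembly(typeof(Product).Assembly); // the assembly holding your mappings
            sessionFactory = configuration.BuildSessionFactory();
        }

        session = sessionFactory.OpenSession();

        // The schema must be created on the session's own connection: an
        // in-memory SQLite database lives only as long as that connection.
        new SchemaExport(configuration)
            .Execute(false, true, false, session.Connection, null);
    }

    public void Dispose()
    {
        session.Dispose();
    }
}
```

A test fixture then just inherits from this class and uses `session` directly, and each test runs against a clean, real database with no mocking involved.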

time to read 14 min | 2722 words

While testing Rhino Service Bus, I ran into several pretty annoying issues. The most consistent one is that the actual work done by the bus happens on another thread, so we have to have some synchronization mechanisms built into the bus just so we can get consistent tests.

In some tests, this is not really needed, because I can utilize the existing synchronization primitives in the platform. Here is a good example of that:

   1: [Fact]
   2: public void when_start_load_balancer_that_has_secondary_will_start_sending_heartbeats_to_secondary()
   3: {
   4:     using (var loadBalancer = container.Resolve<MsmqLoadBalancer>())
   5:     {
   6:         loadBalancer.Start();
   7:  
   8:         Message peek = testQueue2.Peek();
   9:         object[] msgs = container.Resolve<IMessageSerializer>().Deserialize(peek.BodyStream);
  10:  
  11:         Assert.IsType<HeartBeat>(msgs[0]);
  12:         var beat = (HeartBeat)msgs[0];
  13:         Assert.Equal(loadBalancer.Endpoint.Uri, beat.From);
  14:     }
  15: }

Here, the synchronization happens in line 8: Peek() will wait until a message arrives in the queue, so we don't need to manage that ourselves.

This is not always possible, however, and this actually breaks down for more complex cases. For example, let us inspect this test:

   1: [Fact]
   2: public void Can_ReRoute_messages()
   3: {
   4:     using (var bus = container.Resolve<IStartableServiceBus>())
   5:     {
   6:         bus.Start();
   7:         var endpointRouter = container.Resolve<IEndpointRouter>();
   8:         var original = new Uri("msmq://foo/original");
   9:  
  10:         var routedEndpoint = endpointRouter.GetRoutedEndpoint(original);
  11:         Assert.Equal(original, routedEndpoint.Uri);
  12:  
  13:         var wait = new ManualResetEvent(false);
  14:         bus.ReroutedEndpoint += x => wait.Set();
  15:  
  16:         var newEndPoint = new Uri("msmq://new/endpoint");
  17:         bus.Send(bus.Endpoint,
  18:                  new Reroute
  19:                  {
  20:                      OriginalEndPoint = original,
  21:                      NewEndPoint = newEndPoint
  22:                  });
  23:  
  24:         wait.WaitOne();
  25:         routedEndpoint = endpointRouter.GetRoutedEndpoint(original);
  26:         Assert.Equal(newEndPoint, routedEndpoint.Uri);
  27:     }
  28: }

Notice that we are doing explicit synchronization in the test, at lines 14 and 24. ReroutedEndpoint is an event that we added for the express purpose of allowing us to write this test.
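One practical note of my own (not part of the original test): an unguarded WaitOne() will hang the entire test run if the event never fires, so it is usually worth bounding the wait and failing loudly. The 30-second budget below is an arbitrary choice:

```csharp
var wait = new ManualResetEvent(false);
bus.ReroutedEndpoint += x => wait.Set();

// ... send the Reroute message exactly as in the test above ...

// Fail the test instead of hanging forever if the endpoint is never rerouted.
bool signaled = wait.WaitOne(TimeSpan.FromSeconds(30), false);
Assert.True(signaled, "timed out waiting for ReroutedEndpoint to fire");
```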

I remember the big debates several years ago on whether it is okay to change your code to make it more testable. I haven't heard the issue raised in a while; I guess the argument has been decided.

As a side note, in order to get rerouting to work, we had to change the way Rhino Service Bus viewed endpoints. That was a very invasive change, yet we did it in less than two hours, by simply making the change and fixing the tests where they broke.

time to read 3 min | 592 words

My post about tests got the expected response, and I think that Denis's comment is a good sample of the type of feedback that I got:

...you won't be able to ship periodically at sustainable costs and quality without relevant engineering practices like unit tests, refactoring and continuous integration...

Despite several attempts at repeating it, it seems that many people missed the part that I explicitly stated: I wasn't trying to say that tests are not valuable. I was trying to talk about the blind faith that people put in tests.

Now, to Denis's comment. Yes, I would be able to ship periodically without tests.

I am not talking into thin air; I have the practical experience and two years of a running project to back me up on that. We had a CI process in place, which deployed the application to a staging server, and that was it. Refactoring... is not a subject that I can talk about in the context of this application without first explaining the architecture.

I came to that project shortly after finishing up a project that was... problematic, and I was determined to avoid the same mistakes. One of the first things we did was define the core parts of the system, and build just enough of that to serve us. So far, this is normal, except that we added a tiny twist.

We applied the Open Closed Principle as a holistic view of the entire project. If we needed to change a line of existing code, that was a bug, and we had to refactor that area of the system to make sure that the next time we needed to change it, we wouldn't have to touch existing code. Adding a new feature consisted entirely of adding new code.

We didn't touch existing code. Well, not very often, to be honest. This is the same project where I spent three months straight without even once going into the infrastructure directory. Indeed, after three months we ran into a problem, and it took me 10 minutes to remember that we were using a container in the app, and that the new class didn't match our convention, so Binsor did not register it.
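To make that concrete, here is the general shape of the idea (my own sketch; the interface and handler names are illustrative, not the project's actual code):

```csharp
// An existing contract that new features plug into.
public interface IFeatureHandler
{
    bool CanHandle(Request request);
    void Handle(Request request);
}

// Adding a feature means adding a new class like this one. A convention in
// the container setup ("register every IFeatureHandler in the assembly")
// picks it up automatically, so no existing code has to change. Forgetting
// to match that convention is exactly the failure mode described above.
public class ExportToExcelHandler : IFeatureHandler
{
    public bool CanHandle(Request request)
    {
        return request.Action == "export";
    }

    public void Handle(Request request)
    {
        // feature-specific work goes here
    }
}
```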

A lot of my ideas about zero friction development came directly from this project.

Because we always wrote new code, and each feature was isolated from all other features, we were able to keep a rapid pace of development for the lifetime of the project. I mentioned that we didn't have automated tests; we didn't feel the lack. Off the top of my head, I can't recall any regression bugs we had, except when infrastructure went bad (IT admins removing our shared storage, BizTalk changes we weren't told about, stuff like that).

I am pretty sure there were some, but I can't think of any, so that puts them at a very low number. And some stats about the project: it has been running for 2 years (I left after the first year, and the team continued to deliver new features and releases after I left), it is currently at over 200,000 lines of code, it is considered a major success from the business point of view, and its scope has tripled since the original inception.

And just to reiterate: I like tests. I think that they are important and that they have proven themselves in the field. But they aren't a silver bullet; having them wouldn't have made the project succeed, and not having them wouldn't have made it fail.

time to read 3 min | 510 words

At DevTeach we had a panel, which Kathleen Dollard has covered in depth, in which we talked about what the bare minimum aspects of an agile project would be. The first thing brought up was testing.

That is quite predictable, and I objected to it. One of the things that bothers me about much of the discussion in the agile space is the hard focus on tests, often to the exclusion of much else.

My most successful (commercial) project was done without tests, and it is a huge success (ongoing now, by the way). A previous project had tests, quite a few of them, and I consider it a big failure (over time, over budget, an overly complex code base, a lot of the logic in the UI, etc.).

Update: I wanted to make clear the distinction between Agile and TDD. I consider the project without tests to be a fully agile project. The project with tests was a heavy waterfall project.

I think that I can safely say that it would be hard to accuse me of not getting testing, or not getting TDD. Not that I don't expect the accusations anyway, but I am going to try to preempt them.

I want to make it explicit and understood: what I am railing against isn't testing. I think tests are very valuable, but I think that some people focus on them too much. For myself, I have a single metric for creating successful software:

Ship it, often.

There are many reasons for that, from the political ones and monetary ones to feedback, scope and tracer bullets.

Tests are a great tool to aid you in shipping often, but they aren't the only one. Composite architectures and JFHCI are two other ways that allow us to create a stable software platform that we can develop on.

Tests are a tool, and their use should be evaluated against the usual metrics before applying them in a project. There are many reasons not to use tests, but most of them boil down to: "They add friction to the process."

Testing UI, for example, is a common case where it is just not worth the time and effort. Another scenario is a team that is not familiar with testing; introducing testing at that point would hinder my topmost priority, shipping.

Code quality, flexibility and the ability to change are other things often attributed to tests. Tests certainly help, but they are by no means the only (or even the best) way to approach those.

And finally, just to give an example, Rhino Igloo was developed without tests, using the F5 testing style. I applied the same metric: trying to test what it does would have been painful, therefore I made the decision not to write tests for it. I don't really like that code base, but that is because it is strongly tied to the Web Forms platform, not because it is hard to change or extend; it isn't.

Okay, I am done, feel free to flame.

time to read 1 min | 154 words

I have a system in which one of the core parts is required to work in a non-deterministic fashion. In particular, some part of the system behaves randomly (by design). Here is a small description:

  • 90% of the time, we select only items that are above the score limit.
    • Selection between those is made randomly, with no bias
  • 10% of the time, we select only items that are _below_ the score limit.
    • Selection among those is done based on the oldest last shown date
    • To skip "rotten apples", we calculate the median of the lowest group and exclude all items that are below 20% of the median.

The idea behind this is to have a bare bone system that can adapt in response to input.
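For concreteness, the policy above might be sketched like this (entirely my own reading of the description; the Item type, the injected Random and the omitted empty-group handling are assumptions, not the system's actual code). Injecting the Random at least makes the branching reproducible under test:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public double Score;
    public DateTime LastShown;
}

public class RandomizedSelector
{
    private readonly Random random;

    public RandomizedSelector(Random random)
    {
        this.random = random;
    }

    public Item Select(IList<Item> items, double scoreLimit)
    {
        if (random.NextDouble() < 0.9)
        {
            // 90% of the time: uniform, unbiased pick among items above the limit.
            var above = items.Where(x => x.Score >= scoreLimit).ToList();
            return above[random.Next(above.Count)];
        }

        // 10% of the time: look below the limit, drop the "rotten apples"
        // (items scoring under 20% of the group's median), and pick the
        // item with the oldest last-shown date.
        var below = items.Where(x => x.Score < scoreLimit)
                         .OrderBy(x => x.Score)
                         .ToList();
        double median = below[below.Count / 2].Score;
        return below.Where(x => x.Score >= median * 0.2)
                    .OrderBy(x => x.LastShown)
                    .First();
    }
}
```

With a seeded Random, a test can force either branch and assert on the deterministic parts (the median cutoff, the oldest-date ordering), leaving only the unbiased-selection claim to statistical checks.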

But I am not sure how I can test this in a way that will actually reveal something meaningful about the system.

Thoughts?

Meta tests

time to read 3 min | 402 words

I found this test fixture hilarious.

[TestFixture]
public class ValidateNamingConventions
{
    private System.Type[] types;

    [SetUp]
    public void Setup()
    {
        types = typeof(ISearchFactory).Assembly.GetExportedTypes();            
    }

    [Test]
    public void InterfacesStartsWithI()
    {
        foreach (System.Type type in types)
        {
            if(type.IsInterface==false)
                continue;
            Assert.IsTrue(type.Name.StartsWith("I"),type.Name);
        }
    }

    [Test]
    public void FirstLaterOfMethodIsCapitalized()
    {
        foreach (System.Type type in types)
        {
            foreach (MethodInfo method in type.GetMethods())
            {
                Assert.IsTrue(char.IsUpper(method.Name[0]), type.FullName + "." + method.Name);
            }
        } 
    }

    [Test]
    public void AttributesEndWithAttribute()
    {
        foreach (System.Type type in types)
        {
            if (type.IsAssignableFrom(typeof(Attribute)))
                continue;
            Assert.IsTrue(type.Name.EndsWith("Attribute"), type.Name);
     
        }
    }
}
