Modeling reference data in RavenDB

time to read 2 min | 352 words

The question came up in the Mailing list, and I thought it would be a good thing to post about.

How do you handle reference data in RavenDB? The typical example in most applications would be something like the list of states, their names and abbreviations. You might want to refer to a state by its abbreviation, but still allow for something like this:

How do you handle this in RavenDB? Using a relational database, you would probably have a separate table just for states, and it is certainly possible to create something similar to that in RavenDB:

The problem with this approach is that is it a very wrong headed approach one for a document database. In a relational database, you have no choice but to threat anything that has many items as a table with many rows. In a document database, we have much better alternatives.

Instead of threating each state as an individual document, we can treat them as a whole value, like this:

In the sample, I included just a couple of states, but I think that you get the idea. Note the name of this document “Config/States”.

What are the benefits of this approach?

We only have to retrieve a single document.
We are almost always going to treat the states as a list, never as individual items.
There is never a transactional boundary for modifying just a single state.

In fact, I would usually say that this isn’t good enough. I would try to centralized any/all of the reference data that we use in the application into a single document. There is a high likelihood that it would be a fairly small document even after doing so, and it would be insanely easy to cache.

Tweet Share Share 34 comments

Tags:

Raven

Comments

31 Aug 2011
09:08 AM

Rafal

This is a nice approach but has a drawback: sometimes you need to modify your dictionaries or let your customer modify them. In case of sql database it's a simple insert/update, but in Raven you'll have to edit Json document that can get huge and unmaintainable if there are many dictionaries. Of course you could provide a CRUD gui for that purpose but it's an additional burden for develpoers.

31 Aug 2011
09:22 AM

Matt Warren

@Rafal

With the Raven Client API it's not that hard, you certainly don't have to edit the Json directly. You just do something like this:

var stateLookup = session.Load<List<State>("Config/states"); ... make changes.. session.Store(stateLookup)

31 Aug 2011
09:36 AM

bavo

How would you handle localization of those values?

31 Aug 2011
09:48 AM

Matt Warren

@braco

I guess one way is to have several docs, "Config/en-gb/States", "Config/fr-ca/States", "Config/en-ca/States" etc

31 Aug 2011
10:27 AM

Rafal

@Matt Great, you have shown that Raven has API. And? Am I supposed to tell my customer that they can write a simple program to update a list? Almost everyone in IT can use SQL to modify data and it's quite common practice, but with Raven it's not so easy. Especially if all static data is in a huge Json document.

31 Aug 2011
10:38 AM

Frank Quednau

Please don't threat all those poor many-item'ed tables and states, they are but artifacts of sick minds.

31 Aug 2011
10:50 AM

Daniel Lang

Rafal,

As a serious developers you wouldn't want your customers to insert and update rows in a SQL-table manually but instead have a simple UI for that. In a RDBMS situation it is even more critical since you have to maintain forey-key-relations and I would doubt that someone who gives his customers database-access can assure that he has all the key-constraints set-up correctly in SQL-server...

From the developer perspective, the only drawback I find here is the work needed to maintain the relation, but because the names of states aren't likely to change it doesn't matter here.

31 Aug 2011
10:57 AM

Giedrius

not familiar with ravendb, so my question may look dumb, but I'm wondering how mailing list entity looks like (reference to state) and how you load state for each of mailing list item, what displaying grid of mailing list items.

31 Aug 2011
12:17 PM

Nieve

@Matt What happens if you've got a few documents using the same States document and you need to update one or a few states? Will that still work: var stateLookup = session.Load("Config/states"); ... make changes.. session.Store(stateLookup)

31 Aug 2011
12:25 PM

Jason Meckley

@Giedrius this is document DB, not a relational DB. there isn't a relationship between the states config and the mailing list. Instead you would save the state information directly in the the mailing list document.

This is the most difficult concept for developers coming from a relational data background. they ask "how does data persistance work without relations?" or "how do I make raven work like a RDBMS?". for more information on this check out Ayende's nosql tag.

31 Aug 2011
12:47 PM

Matt Warren

@Rafal

I didn't realize that you were talking about letting the customer edit database tables by hand. In that case, is it really much easier for them to write SQL insert/update statements than using the RavenDB management studio and editing a Json doc?

See http://blogs.hibernatingrhinos.com/5/ravendb-in-practice-part-1-an-introduction-to-ravendb for screen-shots.

BTW I'm not pretending that RavenDB and doc-databases are the answer to everything, just that they make some things easy.

31 Aug 2011
12:51 PM

Matt Warren

@Nieve

RavenDB has transactions and writing a doc is fully ACID, so the next time you load the doc you'll get the new values. The line I missed out from the code sample was: session.SaveChanges()

This writes the doc to the store as an atomic transaction, see http://ravendb.net/documentation/client-api/docs-documentsession-characteristics for more info.

31 Aug 2011
13:29 PM

Ayende Rahien

Rafal, What is the problem with modifying the document? That is what RavenDB is _for_.

31 Aug 2011
13:30 PM

Ayende Rahien

Bavo,

{ "States": [ { "Name": [ { "Culture": "en", "Value": "Texas"}, { "Culture": "fr", "Value": "French Texas" } ], "Abbrevation": "TX" } ] }

31 Aug 2011
13:30 PM

Ayende Rahien

Rafal, I still fail to see the problem. RavenDB has a management UI that you can use. Beside, if you are advocating FOR having random people execute SQL against the production db, you lost me

31 Aug 2011
14:12 PM

Robert Pickering

What’s the cut off limit for this approach? I thinking what if you’re reference data was very city and town in the US rather than just a list of states? I guess this would still fit in memory without too much of a problem, but if you were going to provide some sort of auto-complete on this then you’d need to provide some sort of efficient way of search this, if each was a document in RavenDb then you could just use an index to ensure an efficient search time on the fields you’re interested in.

31 Aug 2011
14:18 PM

Ayende Rahien

Robert, You can still use an index to search inside a single document. Cut off point would be greater than a few megs, but I would probably do is something like: Config/States Config/Texas/Cities Config/Florida/Cities

Etc

31 Aug 2011
15:01 PM

Marco

And what is the best approach to store this inside a NoSql "db"?

Address Street Country (reference) State (reference) Country Code Name-En/Name-Nl/etc. States (one or more states)

And when showing an address to the user it should show: Street Country.Name-Nl (current culture) State.Name-Nl (current culture)

Thanks

31 Aug 2011
15:36 PM

Rafal

Ayende, The problem i see is general lack of tools for modifying database contents, especially for modifying large Json documents, and general unfamiliarity of db admins with this new database type. But maybe the problem can be alleviated by providing scripts that automate basic tasks and show how to perform typical operations. This is not advocating to allow random people do random things to database, I'm talking about adoption of Raven by IT - usually they are the only people allowed to touch production databases.

31 Aug 2011
17:50 PM

Winston Junior

@Rafal, The problem with the DB Admins is that relational database was the silver bullet for more than 20 years and now we have new solutions that can be mixed with relational databases or not, they need to get back to the book and learn like all the people who write systems do every single day. I don't know if this is possible with RavenDB but with Mongo theres a command line where you can modify things using "scripts". A kind of javascript if I'm not wrong. In RavenDB I saw that there are a silverlight app, but I don't use this a lot yet, I really don't know all the resources available, you can modify documents there, but I don't know if you can do this using "scripts".

31 Aug 2011
18:39 PM

Winston Junior

@Marco As far as I'm concerned, the best approach for domain data is replicate this information, since this information almost never change (how many times you see an state or city that changed name?), there will be no problem with this approach.

01 Sep 2011
01:00 AM

Justin A

@Matt or @Ayende ...

can you please elaborate how you would use these states, in a simple document? eg. a User poco/document.

public class { public string Id { get; set; } public string Name { get; set; } public ????? State { get; set; } // <-- state this person was born in. }

Also, why the namespace Config/States ??? is 'Config/' something special?

01 Sep 2011
05:14 AM

Ayende Rahien

Marco, The Country / State usually just contains the id of the relevant item, not the actual value. You load that item and then access its name. We have very efficient ways of doing that.

01 Sep 2011
05:16 AM

Ayende Rahien

Rafal, Scripts are really easy to write against RavenDB. Hell, you can provide pure JS solution in a few minutes that requires nothing else. As for "general lack of tools", that isn't an issue for RavenDB, it comes already with such a tool to do just that.

01 Sep 2011
05:17 AM

Ayende Rahien

Justin, "Config/" is just my conventions for reference data, nothing special about it.

public class Address { public string Id { get; set; } public string Name { get; set; } public string State { get; set; } // <-- state this person was born in. }

01 Sep 2011
05:28 AM

Justin A

@Ayende .. so the value of that Address class is a string .. meaning .. there's no reference to the Config/States document? Just happens to use a value, from there?

01 Sep 2011
05:33 AM

Ayende Rahien

Justin, That value can uniquely identify the state in question in that document, but yes. There are no references in RavenDB

01 Sep 2011
05:34 AM

Justin A

Cheers! As always, much appreciated!

19 Oct 2011
22:12 PM

Ronnie Overby

FRENCH TEXAS! HAHA

21 Nov 2011
12:29 PM

Rasmus

Would you model this as a Config class with a single list? What about nexttime you need som "basic data" for at list? Would you then just extend this existing class/document?

17 Jan 2012
09:26 AM

Germán

Hi!

I was loading yesterday existing reference/read-most data from sql server into a document. I ended up loading data from SQL Server with EF, shaping it with linq and persisting the shaped data to ravendb in one single document, following this post's ideas.

Is this the recommended way of loading existing data to RavenDB? Wouldn't it be nice to have this kind of one-time loading of data built-in into RavenDB Studio? A tool that would let you specify a sql data source, a linq query to shape/filter results, a destination document and done!

Thanks!

Germán

17 Jan 2012
10:16 AM

Ayende Rahien

German, That is how we do it, yes. Having such a tool isn't really useful from our perspective, you already have those tools outside of RavenDB, no need to replicate them inside RavenDB.

16 Feb 2012
03:09 AM

Germán

Hi!

Is this approach also advisable if you have a lot of information? For example, I have at least 12000 locations with a lot of textual data, country data and so on, but in some parts of this application only id and name are needed as I want to display the name to the user after a webservice gives me the id.

With this in mind I could come up with two solutions: -make one document for each location with all the data and a map/reduce index just to get the subset of information needed in order to populate a select list or decode the id

-make one document for each location and another big document that has just the id and name. I would need to get them in sync.

In general, as you said, with sql there are just columns, rows, and relations. With RavenDb I think that we have more choices regarding how to shape our data and for me is just a little bit more diffcullt.

Any chance we get a good blog posts series on data modelling in document databases?

Germán

16 Feb 2012
03:13 AM

Ayende Rahien

Germán, I would go with the map/reduce approach, because you are always going to want to search this, never actually show the entire data set.

Comment preview

Comments have been closed on this topic.

Markdown turns plain text formatting into fancy HTML formatting.

Phrase Emphasis

*italic*   **bold**
_italic_   __bold__

Links

Inline:

An [example](http://url.com/ "Title")

Reference-style labels (titles are optional):

An [example][id]. Then, anywhere
else in the doc, define the link:
  [id]: http://example.com/  "Title"

Images

Inline (titles are optional):

![alt text](/path/img.jpg "Title")

Reference-style:

![alt text][id]
[id]: /url/to/img.jpg "Title"

Headers

Setext-style:

Header 1
========
Header 2
--------

atx-style (closing #'s are optional):

# Header 1 #
## Header 2 ##
###### Header 6

Lists

Ordered, without paragraphs:

1.  Foo
2.  Bar

Unordered, with paragraphs:

*   A list item.
    With multiple paragraphs.
*   Bar

You can nest them:

*   Abacus
    * answer
*   Bubbles
    1.  bunk
    2.  bupkis
        * BELITTLER
    3. burper
*   Cunning

Blockquotes

> Email-style angle brackets
> are used for blockquotes.
> > And, they can be nested.
> #### Headers in blockquotes
> 
> * You can quote a list.
> * Etc.

Horizontal Rules

Three or more dashes or asterisks:

---
* * *
- - - -

Manual Line Breaks

End a line with two or more spaces:

Roses are red,   
Violets are blue.

Fenced Code Blocks

Code blocks delimited by 3 or more backticks or tildas:

```
This is a preformatted
code block
```

Header IDs

Set the id of headings with {#<id>} at end of heading line:

## My Heading {#myheading}

Tables

Fruit    |Color
---------|----------
Apples   |Red
Pears	 |Green
Bananas  |Yellow

Definition Lists

Term 1
: Definition 1
Term 2
: Definition 2

Footnotes

Body text with a footnote [^1]
[^1]: Footnote text here

Abbreviations

MDD <- will have title
*[MDD]: MarkdownDeep

Oren Eini

Oren Eini

CEO of RavenDB