“Key-Value”, “Document”, “Wide Column”, “Graph”, “Search” – these are the terms used to categorize a very long list of technology offerings under the heading of NoSQL. I think if I was shown a diagrammatic representation of data for each, and was offered a million dollars to label them correctly – I could do it. But it’s worth taking a look at the basic tenets before jumping into specific technologies.
All I’m looking for now is to give myself a basic mental model of data storage. A few simple images for a quick refresher when I return from wrestling with SQL Server replication conflicts.
Database.Guide has a nice explanatory article, with simple visual examples. Here’s my own example, using a list as the value. There are two disparate sets of data here – one describing NoSQL technologies, and the other describing blog posts.
So no foreign keys, pardon me while I find my smelling salts. No defined schema, I may not recover.
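To pin the idea down for myself, here's a rough sketch in Python. A plain dict stands in for the store; this is not any real product's API, and the keys and blog post titles are made up for illustration.

```python
# A toy key-value store: a plain dict standing in for the real thing.
# All keys and values below are hypothetical.
store = {}

# Two unrelated sets of data sit side by side; nothing links them.
store["nosql:types"] = ["Key-Value", "Document", "Wide Column", "Graph", "Search"]
store["blog:posts"] = ["Hello NoSQL", "Replication Woes"]

# The key is the only access path: no joins, no foreign keys, no schema.
print(store["nosql:types"])

# Asking for a key that was never written simply yields nothing.
print(store.get("blog:drafts"))
```

The value is just "whatever I stored under that key"; the store doesn't know or care that one list holds technology categories and the other holds post titles.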
The Database.Guide article mentions Redis as top of the list of Key-Value technologies. A little way down that list…I promised myself not to fall down rabbit holes, but…Voldemort…I can’t resist following the link for a peek at He Who Shall Not be Named.
Okay, I’m back. Now to peruse Document Stores. This Database.Guide article is clear, and gave me a chuckle. The Wiggles! And here’s my own example:
Hmm. After reading first about Key-Value stores and then about Document stores, I’m having a little trouble seeing what the difference is. I mean, a document store seems to be a key-value store with tags? Reading around a bit more…well, I’m not altogether wrong.
Okay, so document data can be organized into collections, e.g. Customers, Orders, Products. These collections can be partitioned, indexed, and otherwise tuned to give performance benefits for querying.
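A minimal sketch of what a collection plus an index might look like, again in plain Python rather than any real document database's API. The `customers` data and the `find` and `build_index` helpers are my own inventions for illustration.

```python
from collections import defaultdict

# Hypothetical "Customers" collection: documents keyed by id.
# Fields can differ from one document to the next.
customers = {
    "c1": {"name": "Ada", "city": "Dublin", "tags": ["vip"]},
    "c2": {"name": "Brian", "city": "Cork"},
    "c3": {"name": "Cara", "city": "Dublin", "phone": "555-0100"},
}

def find(collection, field, value):
    """Scan every document; the cost grows with the size of the collection."""
    return [doc_id for doc_id, doc in collection.items() if doc.get(field) == value]

def build_index(collection, field):
    """Map each value of `field` to the ids of the documents that hold it."""
    index = defaultdict(list)
    for doc_id, doc in collection.items():
        if field in doc:
            index[doc[field]].append(doc_id)
    return index

city_index = build_index(customers, "city")

print(find(customers, "city", "Dublin"))  # full scan
print(city_index["Dublin"])               # single lookup in the index
```

Both calls return the same ids; the difference is that the store can see inside the documents, so it can answer a question about a field without my code fetching and unpacking every value.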
So when to use which? This seems a good breakdown: “if you usually retrieve data by key or ID value and don’t need to support complex queries, a key-value database is a good option…If you have different types of entities and need complex querying, choose a document database.”
Or “Wide Column”? Or “Columnar”? C’mon, community, pick one, ‘cos it’s confusing.
For some reason, I don’t find the Database.Guide article as explanatory as the others. Here is an alternative, which is a quick read with a good graphic. The actual product described in that article offers both row-based and column-based features, so the side-by-side was useful.
This is another good article. So basically, we’ve got columns instead of rows. Got it.
Main points I’m taking are that each attribute is stored in its own file or memory region, which gives faster queries that touch only specific columns, and allows greater compression (there’s less variety in the data within a single column).
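Here's a small sketch of both points, assuming a made-up four-row table. Run-length encoding is just one simple compression scheme I picked to illustrate the "less variety" idea; I'm not claiming it's what any particular columnar product uses.

```python
# Hypothetical rows for a small table.
rows = [
    {"id": 1, "country": "IE", "amount": 10},
    {"id": 2, "country": "IE", "amount": 25},
    {"id": 3, "country": "IE", "amount": 5},
    {"id": 4, "country": "US", "amount": 40},
]

# Column layout: one list per attribute, with positions lining up across lists.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# An aggregate reads only the one column it needs.
total = sum(columns["amount"])

def run_length_encode(values):
    """Collapse consecutive repeats into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded

print(total)                                  # 80
print(run_length_encode(columns["country"]))  # [['IE', 3], ['US', 1]]
```

In the row layout, summing `amount` means walking past `id` and `country` on every row; in the column layout those values are never touched.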
Whew, so I finish up with a quick overview of Graph databases. Most articles on the Net seem to be from Neo4j. Let’s see if I can find something from somewhere other than the leading vendor.
No pics in this old-skool white-on-black page, but it’s a good short write-up.
However, I feel that I will need some passing familiarity with graph theory to really get my head around this. I enjoyed these two short-ish videos from mycodeschool.com:
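For my own mental model in the meantime, here's a minimal sketch of a graph as nodes plus labelled edges, with a breadth-first walk over it. The node names, relationship labels, and the `reachable` helper are all hypothetical, and a real graph database would offer a query language rather than hand-written traversal.

```python
from collections import deque

# Hypothetical edges: (from, relationship, to).
edges = [
    ("alice", "FOLLOWS", "bob"),
    ("bob", "FOLLOWS", "carol"),
    ("alice", "WROTE", "post1"),
    ("carol", "LIKES", "post1"),
]

# Adjacency list: each node maps to its outgoing (relationship, neighbour) pairs.
graph = {}
for source, relationship, target in edges:
    graph.setdefault(source, []).append((relationship, target))
    graph.setdefault(target, [])

def reachable(graph, start):
    """Breadth-first traversal; returns nodes in the order they are first seen."""
    seen = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for _, neighbour in graph[node]:
            if neighbour not in seen:
                seen.append(neighbour)
                queue.append(neighbour)
    return seen

print(reachable(graph, "alice"))
```

The relationships are first-class data here, which seems to be the whole point: the question "what can I get to from alice?" is answered by following edges rather than joining tables.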