The purpose of this exercise is to get a little familiarity with a range of NoSQL technologies. See my learning plan for some context.
Redis seems to be the prevailing key-value technology at the moment. I thought I would have to download and install Redis to play with it, but there is a great interactive tutorial on the Redis website which introduces how developers will use the technology.
It naturally starts with the commands to set and get a key's value, and in doing so introduces the concept of atomic versus non-atomic operations, and of wrapping a transaction around a series of commands. I wanted to read more on transactions, and got the lowdown on them here.
This little nugget jumped out at me: “if the Redis server crashes or is killed by the system administrator in some hard way it is possible that only a partial number of operations are registered. Redis will detect this condition at restart, and will exit with an error.”
Oh. So you have to explicitly do something about this? Why doesn’t it roll back? Seems strange.
Then all became clear a few paragraphs later:
“If you have a relational databases background, the fact that Redis commands can fail during a transaction, but still Redis will execute the rest of the transaction instead of rolling back, may look odd to you.”
Well, you took the words right out of Keanu’s mouth.
The Redis website offers two reasons for not supporting rollbacks. One is perfectly understandable: “Redis is internally simplified and faster because it does not need the ability to roll back.” I get it. You compromise in one area to improve another (seems a mighty big compromise, but that’s another story).
Their other reason is less understandable to me: basically, errors won’t happen (much). “in practical terms a failing command is the result of a programming errors, and a kind of error that is very likely to be detected during development, and not in production.” Nope, not really buying that. But I will accept that anyone using Redis needs to understand and accept the compromise being made.
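To make the no-rollback behaviour concrete, here is a toy Python model of queuing commands and then executing them in one go. It is only a sketch: the `ToyStore` class and its methods are made up for illustration and are not how Redis is implemented, and it only models a command that fails when it runs, not one that is rejected while being queued.

```python
class ToyStore:
    """Toy in-memory store that mimics MULTI/EXEC queuing; not Redis itself."""

    def __init__(self):
        self.data = {}
        self.queue = None  # None means no transaction is open

    def multi(self):
        self.queue = []

    def call(self, name, *args):
        # Inside a transaction, commands are only queued, not run.
        if self.queue is not None:
            self.queue.append((name, args))
            return "QUEUED"
        return self._run(name, args)

    def exec(self):
        queued, self.queue = self.queue, None
        results = []
        for name, args in queued:
            try:
                results.append(self._run(name, args))
            except ValueError as err:
                # The failure is recorded; earlier writes stay and later commands still run.
                results.append(err)
        return results

    def _run(self, name, args):
        if name == "SET":
            key, value = args
            self.data[key] = value
            return "OK"
        if name == "INCR":
            (key,) = args
            new_value = int(self.data.get(key, 0)) + 1  # ValueError if not a number
            self.data[key] = str(new_value)
            return new_value
        raise NotImplementedError(name)


store = ToyStore()
store.multi()
store.call("SET", "name", "neo")
store.call("INCR", "name")        # will fail at exec time: "neo" is not an integer
store.call("SET", "visits", "1")
results = store.exec()
```

After `exec()`, both `SET` commands have taken effect and only the middle result is an error, which is the behaviour the quoted documentation describes.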
Jumping back to the tutorial, it takes us through setting an expiration date on a key, and then moves on to commands for interacting with lists and sets. Then it gets into hashes, which basically can represent an object and its multiple attributes.
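As a rough mental model of key expiry, here is a minimal Python sketch. The `ExpiringStore` class, the key names and the injected clock are all assumptions made up for this example; the toy only drops expired keys when they are next read, which is a simplification.

```python
import time


class ExpiringStore:
    """Toy key store with per-key expiry, loosely modelled on EXPIRE."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.values = {}
        self.deadlines = {}

    def set(self, key, value):
        self.values[key] = value
        self.deadlines.pop(key, None)  # a plain write clears any old expiry

    def expire(self, key, seconds):
        if key in self.values:
            self.deadlines[key] = self.clock() + seconds

    def get(self, key):
        deadline = self.deadlines.get(key)
        if deadline is not None and self.clock() >= deadline:
            # This toy drops an expired key lazily, on the next read.
            self.values.pop(key, None)
            self.deadlines.pop(key, None)
        return self.values.get(key)


# A fake clock keeps the example deterministic.
now = [0.0]
store = ExpiringStore(clock=lambda: now[0])
store.set("session:42", "logged-in")
store.expire("session:42", 120)
before = store.get("session:42")   # still there
now[0] = 121.0
after = store.get("session:42")    # gone once the 120 seconds have passed

# Rough Python equivalents of the other types the tutorial covers.
friends = ["Alice", "Bob"]             # list: ordered, duplicates allowed
superpowers = {"flight", "x-ray"}      # set: unordered, no duplicates
user = {"name": "John", "age": "29"}   # hash: field/value pairs under one key
```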
At which point it wraps up. It would have been useful if the tutorial had given an indication of progress, i.e. I was wondering how far through I was and hurried a little. But it only takes about 30 minutes to complete, with a little bit of playing and experimenting.
Having played with it, I looked through a few articles for an overview and commentary, and took away some “key” points (heh, heh) which make it stand out from the crowd. This was a good overview, from a “web astronaut” no less.
The database is in-memory. A corollary of this is that it won’t support a data set that is larger than the available memory (i.e. it will not spill the overflow to disk). Basically, it’s not for large data, and is optimized for fast read/write of small textual information.
So how does Redis deal with the server shutting down? There are two persistence mechanisms, RDB and AOF. At configured intervals, Redis snapshots its data set to the RDB file. Typically, this might be every five minutes, so it still brings data loss into the equation in the event of an unexpected shutdown. The SAVE command will also write a snapshot to the RDB file on demand. There is also an AOF (append-only file), which works by logging every write operation to the file as it happens. This is akin to the transaction log in SQL Server. The general recommendation is to use both methods if data loss is unacceptable.
There is no dedicated backup tool, so the administrator needs to work out a way to do this, e.g. a file copy of the RDB file via a cron job.
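The difference between the two approaches can be sketched in a few lines of Python. This is a toy under stated assumptions: JSON stands in for the real file formats, and the `apply`, `write` and `replay` helpers are invented for the example.

```python
import json


def apply(state, command):
    """Apply one write, given as ["SET", key, value] or ["DEL", key]."""
    if command[0] == "SET":
        state[command[1]] = command[2]
    elif command[0] == "DEL":
        state.pop(command[1], None)


def replay(log_lines):
    """Rebuild the data set from an append-only log, starting from empty."""
    state = {}
    for line in log_lines:
        apply(state, json.loads(line))
    return state


live = {}   # the in-memory data set
aof = []    # append-only log: one line per write


def write(command):
    apply(live, command)
    aof.append(json.dumps(command))


write(["SET", "a", "1"])
rdb = json.dumps(live)        # point-in-time snapshot, taken now
write(["SET", "b", "2"])      # these two writes happen after the snapshot
write(["DEL", "a"])

# Pretend the server dies here and has to recover from disk.
from_rdb = json.loads(rdb)    # only knows about the state at snapshot time
from_aof = replay(aof)        # replays every write, so nothing is lost
```

Recovering from the snapshot alone loses the two writes made after it, while replaying the log gets back to the state at the moment of the crash.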
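A backup job along those lines might look like the Python sketch below, which a cron entry could run on a schedule. The function name, the `dump-<timestamp>.rdb` naming scheme and the retention count are my own assumptions, not anything Redis prescribes.

```python
import shutil
import time
from pathlib import Path


def backup_rdb(rdb_path, backup_dir, keep=7):
    """Copy the RDB file to a timestamped name and prune the oldest copies."""
    rdb_path, backup_dir = Path(rdb_path), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"dump-{stamp}.rdb"
    shutil.copy2(rdb_path, target)
    # Timestamped names sort chronologically, so anything before the
    # last `keep` entries is an old copy that can go.
    copies = sorted(backup_dir.glob("dump-*.rdb"))
    for old in copies[:-keep]:
        old.unlink()
    return target


# Hypothetical usage; both paths are assumptions for illustration:
# backup_rdb("/var/lib/redis/dump.rdb", "/backups/redis")
```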
However, fault-tolerance is provided through replication of data to slaves. Redis replication is full replication: all slaves contain the same data as the master. When a slave is added, it is automatically synced from the master. If a slave fails, the master will sync to it when it comes back online. If the master fails, then a slave can be promoted to master. Replication also helps performance, as read-only work such as large reads and slow sorts can be sent to the slaves rather than the master.
How is it configured? Once installed, there is a configuration file (redis.conf) in the root directory. This can be edited directly, or there is a CONFIG command (CONFIG GET and CONFIG SET) that can read and change most settings on a running server, e.g. loglevel, maxmemory, maxclients.
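The configuration file is plain text, one directive per line followed by its value, with `#` starting a comment. As a small sketch, this Python snippet reads that shape into a dictionary. The values in the sample are illustrative only, not recommendations, and the parser is deliberately naive (it does not handle quoted arguments).

```python
def parse_redis_conf(text):
    """Parse redis.conf-style text into {directive: [values, ...]}."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, value = line.partition(" ")
        # Some directives, such as `save`, may appear more than once.
        settings.setdefault(name.lower(), []).append(value.strip())
    return settings


sample = """
# Illustrative values only
loglevel notice
maxmemory 256mb
maxclients 10000
save 900 1
save 300 10
"""
settings = parse_redis_conf(sample)
```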
This was an interesting read on configuring for production. My main take-away was on how this thing should be monitored.
How does monitoring work? There is a log file that may be the first port of call. Other than that, the administrator should set up a monitoring service of some kind. There are several open source utilities which are installed as a service and will poll the status of Redis and provide alerts for RAM or CPU thresholds. They can also be configured to restart the Redis server if it is stopped.
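To give a feel for what such a poller does, here is a minimal Python sketch that checks thresholds against the text returned by the Redis INFO command, which reports statistics as `field:value` lines. The helper names and the thresholds are assumptions for the example, and the sample text is hand-written rather than captured from a real server.

```python
def parse_info(text):
    """Turn INFO-style output (`field:value` lines) into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers such as "# Memory"
        field, _, value = line.partition(":")
        stats[field] = value
    return stats


def check(stats, max_memory_bytes, max_clients):
    """Return a list of alert messages; an empty list means all is well."""
    alerts = []
    if int(stats["used_memory"]) > max_memory_bytes:
        alerts.append("memory above threshold")
    if int(stats["connected_clients"]) > max_clients:
        alerts.append("too many clients")
    return alerts


sample = """# Clients
connected_clients:12

# Memory
used_memory:900000000
"""
alerts = check(parse_info(sample), max_memory_bytes=800_000_000, max_clients=100)
```

A real monitor would fetch the INFO text from the server on a timer and send the alerts somewhere useful; this only shows the threshold check itself.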