We’ve been running a version of Redis 2.2 RC at Craigslist for a few months now and it has been flawless. It has rapidly become the backbone of one of our internal systems.
When I upgraded from the 2.0 series and started testing 2.2, I made a few changes to our configuration. The meat of the config file looks like this:
appendonly no
appendfsync no
rdbcompression yes
maxmemory 6gb
maxmemory-policy volatile-lru
maxmemory-samples 3
glueoutputbuf yes
hash-max-zipmap-entries 512
hash-max-zipmap-value 512
activerehashing yes
vm-enabled no
We’re running 4 instances on each of our servers. Two are masters and two are slaves of instances on another server. So our 10-machine cluster has 40 redis-server instances, half of which are masters and half of which are slaves.
The “maxmemory” and “maxmemory-policy” directives are fairly new (part of 2.2) and allow us to make sure that a single redis-server instance doesn’t go over 6GB. Since there are 4 per box, that means redis in total should never use more than 24GB (even though the server has 32GB). We don’t use BGSAVE, so the only time a redis-server should fork and trigger any serious copy-on-write (COW) behavior is when a new slave is attaching to a master.
Now here’s the interesting bit. All the servers are configured identically. But it turns out that the master doesn’t synthesize a delete event into the replication stream when it evicts (deletes) a key after hitting maxmemory under the LRU policy we’ve chosen. This is different from the normal expire/TTL mechanism that applies to “volatile” keys in redis. In that case, the master controls which keys are removed and sends a delete to each slave when that happens.
The result is that the master and slave are no longer identical once you hit maxmemory and LRU-based deletes kick in. And the longer this runs, the further they can drift apart. This is a little surprising.
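To make the drift concrete, here’s a small Python sketch — not Redis code, just a toy model of the behavior described above. The class and method names are my own invention for illustration: writes and TTL expiries are replicated, but maxmemory evictions are purely local to the master.

```python
from collections import OrderedDict

class Master:
    """Toy master with a maxmemory cap (counted in keys) and LRU eviction.

    Evictions are purely local: no delete is sent to the replica,
    mirroring the 2.2 behavior described in the post.
    """
    def __init__(self, max_keys, replica):
        self.data = OrderedDict()   # insertion/access order approximates LRU
        self.max_keys = max_keys
        self.replica = replica

    def set(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)          # mark as most recently used
        self.replica.apply_set(key, value)  # writes ARE replicated
        while len(self.data) > self.max_keys:
            evicted, _ = self.data.popitem(last=False)  # evict the LRU key
            # NOTE: no self.replica.apply_delete(evicted) here --
            # this omission is exactly what lets master and slave drift.

    def expire(self, key):
        """TTL expiry: the master deletes AND propagates a delete."""
        self.data.pop(key, None)
        self.replica.apply_delete(key)

class Slave:
    def __init__(self):
        self.data = {}
    def apply_set(self, key, value):
        self.data[key] = value
    def apply_delete(self, key):
        self.data.pop(key, None)

slave = Slave()
master = Master(max_keys=3, replica=slave)
for k in "abcd":
    master.set(k, 1)   # inserting "d" evicts "a" on the master only

print(sorted(master.data))  # ['b', 'c', 'd']
print(sorted(slave.data))   # ['a', 'b', 'c', 'd'] -- the slave has drifted
```

Each write past the cap widens the gap: the slave accumulates every key the master has ever evicted, which is exactly the divergence the post describes.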
The discussion going on now revolves around whether or not this is the right default behavior. I’m a bit on the fence but leaning toward “keep the master and slave consistent” because I’m a fan of the principle of least surprise.
If this matters to you, now is a good time to speak up.
Why do you run four instances on each server? I understand a master instance and then a slave instance for a master on another server, but why 2 masters and 2 slaves on each?
Redis is single-threaded, so to get more performance out of a machine you run multiple instances. I have questions of my own, however. What do you do to monitor failover at the machine level? Is this something application-specific or agnostic? I’m looking at creating a Redis deployment with no single point of failure.
If you need auto-expire, what was the key reason you chose redis over memcached? Replication? Could you please tell a bit more how replication is used?
I of course agree with you wrt. replicating deletes.
In tarantool we’re approaching auto-expiry completely differently. The system itself has no expiration mechanism; it keeps all data consistent. But since you can write simple stored procedures, and since tarantool is single-threaded so each procedure executes atomically on the server side, you can build the expiration logic into your “insert”, “select”, and “update” stored procedures.
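The idea of expiry living inside the access procedures rather than in the engine can be sketched like this. Tarantool procedures are not Python, so this is only a model of the logic; the class and the `ttl`/`now` parameters are illustrative assumptions. Each operation checks and purges stale entries itself, and in a single-threaded server each such procedure runs atomically, so no locking is needed.

```python
import time

class ExpiringStore:
    """Sketch of application-level expiry: the store never expires
    anything in the background; instead the read path itself checks
    the deadline and purges stale entries."""
    def __init__(self):
        self._data = {}  # key -> (value, deadline or None)

    def set(self, key, value, ttl=None):
        deadline = time.monotonic() + ttl if ttl is not None else None
        self._data[key] = (value, deadline)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self._data.get(key)
        if entry is None:
            return None
        value, deadline = entry
        if deadline is not None and now >= deadline:
            del self._data[key]   # expiry happens inside the read path
            return None
        return value

store = ExpiringStore()
store.set("session", "abc123", ttl=0.01)
print(store.get("session"))   # abc123
time.sleep(0.02)
print(store.get("session"))   # None -- purged on access
```

One trade-off of this approach: keys that are never read again linger until something touches them, so a real deployment usually pairs it with an occasional sweep procedure.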
Here is a case where it may be good not to replicate the deletions: the “master” runs on a high-performance system with limited memory, but we want the slave to keep all the data the master has ever held. E.g. the master is an embedded system while the slave is a server with much more memory. This way everything from the master appears on the slave, and as the master cleans data off itself, it persists on the slave.