1,250,000,000 Key/Value Pairs in Redis 2.0.0-rc3 on a 32GB Machine

Following up on yesterday’s 200,000,000 Keys in Redis 2.0.0-rc3 post, which was a worst-case test scenario to see what the overhead for top-level keys in Redis is, I decided to push the boundaries in a different way. I wanted to use the new Hash data type to see if I could store over 1 billion values on a single 32GB box. To do that, I modified my previous script to create 25,000,000 top-level hashes, each of which had 50 key/value pairs in it.

The code for redisStressHash was this:

#!/usr/bin/perl -w
$|++;

use strict;
use lib 'perl-Redis/lib';
use Redis;

my $r = Redis->new(server => 'localhost:63790') or die "$!";

## 1.25B values (25M hashes x 50 fields each)

for my $key (1..25_000_000) {
	my @vals;

	for my $k (1..50) {
		my $v = int(rand($key));
		push @vals, $k, $v;
	}

	$r->hmset("$key", @vals) or die "$!";
}

exit;

__END__

Note that I added a use lib in there to use a modified Redis Perl library that speaks the multi-bulk protocol used all over in the Redis 2.0 series.
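For anyone curious what "multi-bulk" means on the wire, here is a minimal sketch of how a command such as HMSET is framed: an argument count, then each argument prefixed with its byte length. The helper name encode_multibulk is made up for illustration and is not part of the Redis Perl library; the sketch also assumes plain byte strings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Frame a command in the Redis multi-bulk request format:
#   *<number of arguments>\r\n
#   then for each argument: $<byte length>\r\n<data>\r\n
# encode_multibulk is a hypothetical helper, not a library function.
sub encode_multibulk {
    my @args = @_;
    my $out = '*' . scalar(@args) . "\r\n";
    for my $arg (@args) {
        # length() counts bytes here because we assume byte strings
        $out .= '$' . length($arg) . "\r\n" . $arg . "\r\n";
    }
    return $out;
}

my $wire = encode_multibulk('HMSET', '1', 'field1', '42');

# Show the CRLFs as visible escapes so the framing is easy to read.
(my $printable = $wire) =~ s/\r\n/\\r\\n/g;
print "$printable\n";
```

The same framing is used no matter how many field/value pairs follow the key, which is why a single HMSET can carry all 50 pairs for one hash.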

If you do the math, that yields 1.25 billion (1,250,000,000) key/value pairs stored. This time I remembered to time the execution as well:

real	160m17.479s
user	58m55.577s
sys	5m53.178s

So it took about 2 hours and 40 minutes to complete. The resulting dump file (.rdb file) was 13GB in size (compared to the previous 1.8GB) and the memory usage was roughly 17GB.
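To put the timing in perspective, here is a quick back-of-the-envelope calculation using only the numbers above (25,000,000 hashes, 50 pairs each, and the "real" wall-clock time). It is just arithmetic, not a benchmark, and it ignores any time the script spent outside of Redis calls.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $hashes          = 25_000_000;   # top-level keys written
my $fields_per_hash = 50;           # key/value pairs per HMSET
my $pairs           = $hashes * $fields_per_hash;

# "real 160m17.479s" converted to seconds
my $elapsed = 160 * 60 + 17.479;

printf "pairs stored:    %.0f\n", $pairs;
printf "HMSET calls/sec: %.0f\n", $hashes / $elapsed;
printf "pairs/sec:       %.0f\n", $pairs / $elapsed;
```

That works out to roughly 2,600 HMSET calls per second, or about 130,000 key/value pairs per second, from a single-threaded Perl client.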

Here’s the INFO output again on the master:

redis_version:1.3.16
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
process_id:21426
uptime_in_seconds:12807
uptime_in_days:0
connected_clients:1
connected_slaves:1
blocked_clients:0
used_memory:18345759448
used_memory_human:17.09G
changes_since_last_save:774247
bgsave_in_progress:1
last_save_time:1280092860
bgrewriteaof_in_progress:0
total_connections_received:22
total_commands_processed:32937310
expired_keys:0
hash_max_zipmap_entries:64
hash_max_zipmap_value:512
pubsub_channels:0
pubsub_patterns:0
vm_enabled:0
role:master
db0:keys=25000000,expires=0
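The INFO output above is simple "name:value" text, so it is easy to pull numbers out of it. The sketch below parses a few of those lines and works out the memory cost per hash and per pair. The parse_info helper is my own illustration, not part of the Redis Perl library. It also compares the 50 fields per hash against hash_max_zipmap_entries; as I understand it, Redis 2.0 keeps hashes at or below that limit (and with values no longer than hash_max_zipmap_value) in the compact zipmap encoding, which likely explains the low overhead here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse "name:value" lines from INFO output into a hash reference.
# parse_info is a hypothetical helper written for this example.
sub parse_info {
    my ($text) = @_;
    my %info;
    for my $line (split /\r?\n/, $text) {
        next unless $line =~ /^([^:]+):(.*)$/;
        $info{$1} = $2;
    }
    return \%info;
}

# A few lines copied from the INFO output in this post.
my $info = parse_info(<<'END');
used_memory:18345759448
hash_max_zipmap_entries:64
hash_max_zipmap_value:512
db0:keys=25000000,expires=0
END

my ($keys) = $info->{db0} =~ /keys=(\d+)/;
my $fields_per_hash = 50;   # from the load script

printf "bytes per hash: %.1f\n", $info->{used_memory} / $keys;
printf "bytes per pair: %.1f\n", $info->{used_memory} / ($keys * $fields_per_hash);
printf "within zipmap entry limit: %s\n",
    $fields_per_hash <= $info->{hash_max_zipmap_entries} ? 'yes' : 'no';
```

That comes to roughly 734 bytes per hash, or a bit under 15 bytes per key/value pair, including all of Redis's own bookkeeping.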

Not bad, really. This provides a slightly more reasonable use case of storing many values in Redis. In most applications, I suspect people will have a number of “complex” values stored behind their top-level keys (unlike my previous simple test).

I’m kind of tempted to re-run this test using LISTS, then SETS, then SORTED SETS just to see how they all compare from a storage point of view.

In any case, a 10-machine cluster could handle roughly 12.5 billion key/value pairs this way (10 × 1.25 billion). Food for thought.
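Redis 2.0 has no built-in clustering, so spreading the data over 10 machines would have to happen in the client. Here is one very rough sketch of how that might look, assuming numeric top-level keys like the ones in my test and a plain modulo to pick a machine. The host names and the shard_for helper are hypothetical, and a real setup would need to think about resharding and failover.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical host names for a 10-machine setup.
my @shards = map { "redis$_.example.com:6379" } 1 .. 10;

# Pick a shard for a numeric top-level key by simple modulo.
# shard_for is an illustrative helper, not a library function.
sub shard_for {
    my ($key) = @_;
    return $shards[ $key % scalar(@shards) ];
}

my $pairs_per_machine = 25_000_000 * 50;   # what one 32GB box held in this test

printf "cluster capacity: %.0f pairs\n", $pairs_per_machine * scalar(@shards);
print  "key 12345 lives on ", shard_for(12345), "\n";
```

Modulo is the simplest possible choice; it keeps every hash whole on one machine, but changing the number of machines moves most keys.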

About Jeremy Zawodny

I'm a software engineer and pilot. I work at craigslist by day, hacking on various bits of back-end software and data systems. As a pilot, I fly Glastar N97BM, Just AirCraft SuperSTOL N119AM, Bonanza N200TE, and high performance gliders in the northern California and Nevada area. I'm also the original author of "High Performance MySQL" published by O'Reilly Media. I still speak at conferences and user groups on occasion.


9 Responses to 1,250,000,000 Key/Value Pairs in Redis 2.0.0-rc3 on a 32GB Machine

LL says:

July 25, 2010 at 8:56 pm

I wonder how much time it takes a slave to sync all that data to itself (assuming it starts when the master is already fully populated).

Jeremy Zawodny says:

July 25, 2010 at 9:45 pm

Oh, that wouldn’t be hard to find out at all. Lemme go try that. 🙂

Yogish Baliga says:

July 26, 2010 at 8:37 am

What about access time after storing billions of values? Most of the time access is concurrent.. I am not sure if Redis is using shared read lock or exclusive locks. In any case, there will be a locking overhead while accessing the keys.. was just wondering…

Jeremy Zawodny says:

July 26, 2010 at 8:42 am

There is no locking. Redis is a single process, single threaded, event-driven server.

nedm says:

August 8, 2010 at 2:50 pm

In light of the above, I’d be interested to hear your take on this question:

http://serverfault.com/questions/168247/mysql-working-with-192-trillion-records-yes-192-trillion
