Redis
Redis is an open-source, networked, in-memory, key-value data store with optional durability, written in ANSI C.
Setup
Cache
If you have not already done so, you'll need to configure a Redis instance and install a Redis client library for PHP. Most environments require the phpredis PHP extension. On Debian / Ubuntu, you can install the requirements with the following command:
apt-get install redis-server php-redis
In your "LocalSettings.php" file, set:
$wgObjectCaches['redis'] = [
    'class' => 'RedisBagOStuff',
    'servers' => [ '127.0.0.1:6379' ],
    // 'connectTimeout' => 1,
    // 'persistent' => false,
    // 'password' => 'secret',
    // 'automaticFailover' => true,
];
- Parameters explained
servers
: An array of server names. A server name may be a hostname, a hostname/port combination or the absolute path of a UNIX socket. If a hostname is specified but no port, the standard port number 6379 will be used. Array keys can be used to specify the tag to hash on in place of the host/port. Required.
connectTimeout
: The timeout for new connections, in seconds. Optional, default is 1 second.
persistent
: Set this to true to allow connections to persist across multiple web requests. False by default.
password
: The authentication password, which is sent to Redis in clear text. Optional; if it is unspecified, no AUTH command will be sent.
automaticFailover
: If this is false, then each key will be mapped to a single server, and if that server is down, any requests for that key will fail. If this is true, a connection failure will cause the client to immediately try the next server in the list (as determined by a consistent hashing algorithm). This has the potential to create consistency issues if a server is slow enough to flap, for example if it is in swap death. True by default.
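The forms a server name can take may be easier to see in a concrete configuration. The following is a minimal sketch, not a recommended setup: the hostnames, the socket path and the tag names (cache-a, cache-b, cache-c) are hypothetical and only illustrate the three server name forms and the use of array keys as hash tags described above.

```php
$wgObjectCaches['redis'] = [
    'class' => 'RedisBagOStuff',
    'servers' => [
        // Array keys are the tags to hash on in place of the host/port.
        // All names and paths below are made up for illustration.
        'cache-a' => 'redis1.example.org',        // hostname only, port 6379 is used
        'cache-b' => 'redis2.example.org:6380',   // hostname/port combination
        'cache-c' => '/var/run/redis/redis.sock', // absolute path of a UNIX socket
    ],
];
```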
You will now be able to acquire a Redis object cache object via ObjectCache::getInstance( 'redis' ). If you'd like to use Redis as the default cache for various data, you may set configuration options such as the following:
$wgMainCacheType = 'redis';
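Other cache types can be pointed at the same backend in the same way. The sketch below assumes the standard MediaWiki settings $wgSessionCacheType, $wgMessageCacheType and $wgParserCacheType, none of which are named in the text above. Whether Redis suits each of them depends on your setup, so treat this as an illustration rather than a recommendation.

```php
// Assumption: each of these settings accepts a key of $wgObjectCaches,
// here the 'redis' entry defined earlier in LocalSettings.php.
$wgSessionCacheType = 'redis'; // session data
$wgMessageCacheType = 'redis'; // interface messages
$wgParserCacheType = 'redis';  // parser output
```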
Job queue
$wgJobTypeConf['default'] = [
    'class' => 'JobQueueRedis',
    'redisServer' => '127.0.0.1:6379',
    'redisConfig' => [],
    'daemonized' => true
];
- Parameters explained
redisConfig
: An array of parameters to RedisConnectionPool::__construct(). Note that the serializer option is ignored as "none" is always used. If the same Redis server is used as for $wgObjectCaches, the Redis password needs to be set here as well (see the $wgObjectCaches config above).
redisServer
: A hostname/port combination or the absolute path of a UNIX socket. If a hostname is specified but no port, the standard port number 6379 will be used. Required.
compression
: The type of compression to use; one of (none, gzip).
daemonized
: Setting this to false is currently not supported.
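As an illustration of the parameters above, the following sketch shows a job queue configuration for a password-protected Redis server with gzip compression enabled. The password value 'secret' is a placeholder, and it is assumed here to be the same password used in the $wgObjectCaches entry.

```php
$wgJobTypeConf['default'] = [
    'class' => 'JobQueueRedis',
    'redisServer' => '127.0.0.1:6379',
    // Passed on to RedisConnectionPool::__construct(); the password must be
    // repeated here even if it is already set in $wgObjectCaches.
    'redisConfig' => [
        'password' => 'secret', // placeholder value
    ],
    'compression' => 'gzip',    // one of: none, gzip
    'daemonized' => true,       // false is not supported
];
```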
Once this is configured, jobs will be delivered to the Redis instance on the specified server.
Automatic handling of job recycling and abandons
Abandoned jobs aren't purged from Redis, and failed and delayed jobs need to be rescheduled. This requires a special job runner service.
Clone the git repository https://github.com/wikimedia/mediawiki-services-jobrunner
Create a configuration file named config.json:
{
    "groups": {
        "basic": {
            "runners": 0
        }
    },
    "limits": {},
    "redis": {
        "aggregators": [
            "127.0.0.1:6379"
        ],
        "queues": [
            "127.0.0.1:6379"
        ]
    },
    "dispatcher": "nothing"
}
Configure a daemon to run this at server start:
php redisJobChronService --config-file=config.json
The daemon itself can also run jobs from the queue, but this is not well documented. See also Nad's documentation on setting up the job queue (partially outdated).
MediaWiki & Wikimedia use cases for Redis
- History of job queue runners at WMF on Wikitech.
Further reading
General
- Official site (see esp. Introduction to Redis)
- The Redis article on the English Wikipedia.
- Redis Watch - an e-mail round-up of Redis news, articles, tools and libraries
- Redis/INCR
- Getting to Know Redis
- Redis, from the Ground Up
- Redis and Relational Data
- Redis Cookbook (book; not great, but see ch. "Analytics and Time-Based Data")
- Interview with Salvatore Sanfilippo (code-oriented but still useful)
- Redis DB (Google Group)
Analytics
- Redis at Disqus (their entire analytics platform runs on Redis)
- Effective Web App Analytics with Redis
- How YouPorn uses Redis (video)
- Realtime metrics using Redis bitmaps
Tooling
- Redsmin, a real-time, atomic, performant administration and monitoring interface for Redis
- redis-py is the library of choice for Python
- Redis and Python (presentation slides)
- Resque for jobs
- Redisco, a Python ORM for Redis
- py-analytics (I haven't used this)
- redis-bitops Ruby gem for sparse bitmap operations
Informed Opinions
Miscellaneous
- Storing hundreds of millions of simple key-value pairs (how Instagram uses Redis)
- Key performance metrics to monitor for Redis