Something about

Tim Bond

April 7 2015

About Me

  • Working with PHP for 11+ years
  • I do full stack LAMP dev
  • I work for Lightspeed GMI, a market research company in Bellevue

What is Redis?

Datatypes

Installation

Demos

What is Redis?

  • REmote DIctionary Store
  • Started out as LLOOGG
    • Realtime log analyzer web app
    • "web 2.0 tail -f access.log"
    • Written in PHP and MySQL
    • MySQL couldn't keep up
  • Redis 3.0.0 just came out last week!
  • Sponsored by Pivotal

Photo: @ObjectRocket

What is Redis?

  • Fast
  • Key-value store
  • Queue
  • Pub/Sub Messaging
  • Fast
  • Data structure server

Who uses Redis?

  • Craigslist
  • Twitter
  • Github
  • StackOverflow
  • Flickr
  • Instagram
  • Hulu
  • Just to name a few

Redis in the real world

  • Hulu's distributed setup sustains 1 million ops/second
  • RedisLabs project sustained 2.65 million votes per second on a low spec AWS instance
  • Twitter stores user timelines in Redis.  One datacenter:
    • ~40TB allocated heap
    • ~30 million queries/sec
    • >6,000 instances
  • Network is usually the bottleneck--up to 97% of time for an operation

Redis vs Memcached

  • More than just strings
  • Durable
  • Transactions
  • Scripting
  • No hard limit (unless you want it)
  • Single threaded
  • Not all client libraries serialize

What is Redis?

Datatypes

Installation

Demos

Datatypes: The basics

  • Regardless of datatype, every object has a unique key
  • Convention is to use colons as separators
    • e.g. user:123 or page:456
  • Keys can be up to 512 MB
  • Optional TTL (expiration)

Strings

  • Any sort of text
  • Binary safe
  • Can also be integers
    • Use incr/incrby commands
  • Commands are O(1)

Hashes

  • Think PHP's associative arrays
  • Each field pretty much the same as a Redis string
  • Can retreive all fields O(N), one field O(1), or just some O(n)

Hashes vs Strings

Use case: Instagram needed a quick way to get the user ID for a given media ID

  • Strings: Key = media ID, value = user ID
    • Example: media:1155315 = 939
    • 70 MB to store 1 million keys
  • Hashes: Same, but buckets of 1,000
    • Example: mediabucket:1155 = 1155315 => 939
    • 16 MB to store 1 million keys
  • Memcached used 52 MB

Hashes vs Strings

Based on a plain multi-dimensional object.  Encode/decode times for 10,000 ops

json serialize igbinary
Bytes 43,534 63,141 28,120
Encode 4.6174 4.4535 4.2700
Decode 15.5725 5.4071 3.5658

Lists

  • Same functionality of a doubly linked list
  • Internal representation varies depending on size
  • Accessing the head or tail is O(1)
  • Accessing by index is O(n)
  • Can set TTL on the list but not the members

List operations

  • LPUSH/RPUSH
  • LPOP/RPOP
  • LRANGE
  • BLPOP/BRPOP
    • Blocking, good for job queue daemons
  • (B)RPOPLPUSH
    • (Blocking) right pop, left push on to another list

Lists: Examples

  • Job queues: Process uploaded files, build reports, etc.
    • Webapp will PUSH to list
    • Background daemon uses blocking POP
  • Recent interactions: user uploaded photo, new article published, etc
    • Webapp will PUSH to list (and might also insert into MySQL)
    • Homepage will run LRANGE 0 9 which will show the first 10 items in the list

Sets

  • A lot like a list, sort of like a hash
  • No field names, unlike hash
  • A value can only exist once
  • No order
  • Can set TTL on the set but not the members
  • Lots of comparison functions
    • Intersection (members that exist in both sets)
    • Difference (members that only exist in one set)

Sorted Sets

  • Just like sets, but each member has a score (float)
    • Two members can have the same score
      • String sort is used to break "ties"
  • Members are sorted when stored
    • Inserts are fast: O(log(set length))
    • Lookups are more or less the same
  • Commands similar to list
  • Can also retrieve by score, range of scores, or sort by values

Sorted sets: Examples

  • Leaderboards
    • Every time a player finishes a game, insert their ID in to a list with their points as the score
    • ZRANGE myset 0 4 will return the top five
    • ZREVRANGE myset 0 4 will return the bottom five
  • Most popular articles
  • Autocomplete
    • Store all members with the same score
    • Retrieve by value (ZRANGEBYLEX)

Rate Limiter using Sorted Sets

  • Just one of many ways to accomplish it
    • INCR, sets, lists, others also possible
  • Each IP/user ID has its own set
  • Score and value is the timestamp
  • Transactions (MULTI/EXEC) prevent race conditions

Rate Limiter using Sorted Sets

  1. Start transaction
  2. Drop all elements that occurred more than (interval) ago (ZREMRANGEBYSCORE)
  3. Fetch all elements of the set
  4. Add the current timestamp to the set
  5. Run those commands; Redis returns three values
  6. See if the length of the set is > allowed actions

Bonus: Can compare current timestamp to last timestamp

Caveat: Every action increments the counter

Pub/Sub

  • One or more clients subscribes to a channel
  • One or more clients publishes to a channel
  • If a message is published to a channel and no one is around to subscribe to it, it does not make a sound.

What is Redis?

Datatypes

Installation

Demos

Installation

  • Redis Server
  • PHP client
    • phpredis
    • predis

phpredis

  • Written in C
  • PECL extension

predis

  • Written in PHP
  • Composer project

Neither does serialization!

What is Redis?

Datatypes

Installation

Demos