You spent already 50GB on the memcache cluster, but you still see many evictions and the cache hit ratio doesn't look good since a few days. The developers swear that they didn't change the caching recently, they checked the code twice and have found no problem.
What now? How to get some insight into the black box of memcached? One way would be to add logging to the application to see and count what is being read and written and then to guess from this about the cache efficiency. For to debug what's happening we need to set how the cache keys are used by the application.
An Easier Way
Memcache itself provides a means to peek into its content. The memcache protocol provides commands to peek into the data that is organized by slabs (categories of data of a given size range). There are some significant limitations though:
- You can only dump keys per slab class (keys with roughly the same content size)
- You can only dump one page per slab class (1MB of data)
- This is an unofficial feature that might be removed anytime.
The second limitation is propably the hardest because 1MB of several gigabytes is almost nothing. Still it can be useful to watch how you use a subset of your keys. But this might depend on your use case.
If you don't care about the technical details just skip to the tools section to learn about what tools allow you to easily dump everything. Alternatively follow the following guide and try the commands using telnet against your memcached setup.
How it Works
First you need to know how memcache organizes its memory. If you start memcache with option "-vv" you see the slab classes it creates. For example
$ memcached -vv
slab class 1: chunk size 96 perslab 10922
slab class 2: chunk size 120 perslab 8738
slab class 3: chunk size 152 perslab 6898
slab class 4: chunk size 192 perslab 5461
In the configuration printed above memcache will keep fit 6898 pieces of data between 121 and 152 byte in a single slab of 1MB size (6898*152). All slabs are sized as 1MB per default. Use the following command to print all currently existing slabs:
If you've added a single key to an empty memcached 1.4.13 with
set mykey 0 60 1
you'll now see the following result for the "stats slabs" command:
STAT 1:chunk_size 96
STAT 1:chunks_per_page 10922
STAT 1:total_pages 1
STAT 1:total_chunks 10922
STAT 1:used_chunks 1
STAT 1:free_chunks 0
STAT 1:free_chunks_end 10921
STAT 1:mem_requested 71
STAT 1:get_hits 0
STAT 1:cmd_set 2
STAT 1:delete_hits 0
STAT 1:incr_hits 0
STAT 1:decr_hits 0
STAT 1:cas_hits 0
STAT 1:cas_badval 0
STAT 1:touch_hits 0
STAT active_slabs 1
STAT total_malloced 1048512
The example shows that we have only one active slab type #1. Our key being just one byte large fits into this as the smallest possible chunk size. The slab statistics show that currently on one page of the slab class exists and that only one chunk is used.
Most importantly it shows a counter for each write operation (set, incr, decr, cas, touch) and one for gets. Using those you can determine a hit ratio!
You can also fetch another set of infos using "stats items" with interesting counters concerning evictions and out of memory counters.
STAT items:1:number 1
STAT items:1:age 4
STAT items:1:evicted 0
STAT items:1:evicted_nonzero 0
STAT items:1:evicted_time 0
STAT items:1:outofmemory 0
STAT items:1:tailrepairs 0
STAT items:1:reclaimed 0
STAT items:1:expired_unfetched 0
STAT items:1:evicted_unfetched 0
What We Can Guess Already...
Given the statistics infos per slabs class we can already guess a lot of thing about the application behaviour:
- How is the cache ratio for different content sizes?
- How good is the caching of large HTML chunks?
- How much memory do we spend on different content sizes?
- How much do we spend on simple numeric counters?
- How much do we spend on our session data?
- How much do we spend on large HTML chunks?
- How many large objects can we cache at all?
Of course to answer the questions you need to know about the cache objects of your application.
Now: How to Dump Keys?
Keys can be dumped per slabs class using the "stats cachedump" command.
stats cachedump <slab class> <number of items to dump>
To dump our single key in class #1 run
stats cachedump 1 1000
ITEM mykey [1 b; 1350677968 s]
The "cachedump" returns one item per line. The first number in the braces gives the size in bytes, the second the timestamp of the creation. Given the key name you can now also dump its value using
VALUE mykey 0 1
This is it: iterate over all slabs classes you want, extract the key names and if need dump there contents.
There are different dumping tools sometimes just scripts out there that help you with printing memcache keys: