Caching Expensive Results with MemCached

Credit: Michael Granger with Ben Bleything

Problem

You want to transparently cache the results of expensive operations, so that the code that triggers the operations doesn't need to know how to use the cache. The memcached program, described in Recipe 16.16, lets you use other machines' RAM to store key-value pairs. The question is how to hide the use of this cache from the rest of your code.

Solution

If you have the luxury of designing your own implementation of the expensive operation, you can design in transparent caching from the beginning. The following code defines a get method that delegates to expensive_get only if it can't find an appropriate value in the cache. In this case, the expensive operation that gets cached is the (relatively inexpensive, actually) string reversal operation:

require 'rubygems'
require 'memcache'

class DataLayer
  def initialize(*cache_servers)
    @cache = MemCache.new(*cache_servers)
  end

  def get(key)
    @cache[key] ||= expensive_get(key)
  end
  alias_method :[], :get

  protected

  def expensive_get(key)
    # ...do expensive fetch of data for key
    puts "Fetching expensive value for #{key}"
    key.to_s.reverse
  end
end

Assuming you've got a memcached server running on your local machine, you can use this DataLayer as a way to cache the reversed versions of strings:

layer = DataLayer.new('localhost:11211')
3.times do
  puts "Data for foo: #{layer['foo']}"
end
# Fetching expensive value for foo
# Data for foo: oof
# Data for foo: oof
# Data for foo: oof

Discussion

That's the easy case. But you don't always get the opportunity to define a data layer from scratch. If you want to add memcaching to an existing data layer, you can create a caching strategy and add it to your existing classes as a mixin.

Here's a data layer, already written, that has no caching:

class MyDataLayer
  def get(key)
    puts "Getting value for #{key} from data layer"
    return key.to_s.reverse
  end
end

The data layer doesn't know about the cache, so all of its operations are expensive. In this instance, it's reversing a string every time you ask for it:

layer = MyDataLayer.new

"Value for foo: #{layer.get('foo')}"
# Getting value for foo from data layer
# => "Value for foo: oof"

"Value for foo: #{layer.get('foo')}"
# Getting value for foo from data layer
# => "Value for foo: oof"

"Value for foo: #{layer.get('foo')}"
# Getting value for foo from data layer
# => "Value for foo: oof"

Let's improve performance a little by defining a caching mixin. It'll wrap the get method so that it only runs the expensive code (the string reversal) if the answer isn't already in the cache:

require 'memcache'

module GetSetMemcaching
  SERVER = 'localhost:11211'

  def self::extended(mod)
    mod.module_eval do
      alias_method :__uncached_get, :get
      remove_method :get

      def get(key)
        puts "Cached get of #{key.inspect}"
        get_cache()[key] ||= __uncached_get(key)
      end

      def get_cache
        puts "Fetching cache object for #{SERVER}"
        @cache ||= MemCache.new(SERVER)
      end
    end
    super
  end

  def self::included(mod)
    mod.extend(self)
    super
  end
end

Once we mix GetSetMemcaching into our data layer, the same code we ran before will magically start to use the cache:

# Mix caching into the pre-existing class
MyDataLayer.extend(GetSetMemcaching)

"Value for foo: #{layer.get('foo')}"
# Cached get of "foo"
# Fetching cache object for localhost:11211
# Getting value for foo from data layer
# => "Value for foo: oof"

"Value for foo: #{layer.get('foo')}"
# Cached get of "foo"
# Fetching cache object for localhost:11211
# => "Value for foo: oof"

"Value for foo: #{layer.get('foo')}"
# Cached get of "foo"
# Fetching cache object for localhost:11211
# => "Value for foo: oof"

The examples above are missing a couple of features you'd see in real life. Their API is very simple (just get methods), and they have no cache invalidation: items will stay in the cache forever, even if the underlying data changes.
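One way to address the invalidation problem (a minimal sketch, not part of the original recipe) is to give cached entries an expiry time and to update the cache whenever the data layer writes new data. Here, expensive_set is a hypothetical stand-in for whatever actually writes to your data store, and the one-hour expiry is an arbitrary choice:

require 'rubygems'
require 'memcache'

class ExpiringDataLayer
  ONE_HOUR = 60 * 60

  def initialize(*cache_servers)
    @cache = MemCache.new(*cache_servers)
  end

  def get(key)
    # Cache misses fall through to the expensive fetch; each cached entry
    # expires after an hour, so stale values eventually get refreshed.
    @cache.get(key) || begin
      value = expensive_get(key)
      @cache.set(key, value, ONE_HOUR)
      value
    end
  end

  def set(key, value)
    expensive_set(key, value)        # write to the real data store...
    @cache.set(key, value, ONE_HOUR) # ...and keep the cache in sync
  end

  protected

  # Hypothetical expensive operations standing in for a real data store.
  def expensive_get(key)
    key.to_s.reverse
  end

  def expensive_set(key, value)
    value
  end
end

If updating the cache on every write is too aggressive, calling @cache.delete(key) in set and letting the next get repopulate the entry works just as well.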

The same basic principles apply to more complex caches, though. When you need a value that's expensive to find or calculate, you first ask the cache for the value, keyed by its identifying feature. The cache might map a SQL query to its result set, a primary key to the corresponding database object, an array of compound keys to the corresponding database object, and so on. If the object is missing from the cache, you fetch it the expensive way, and put it in the cache.
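For instance, a data layer that runs SQL might use the query text itself to identify the cached result set. This is only a sketch: run_query is a hypothetical method standing in for whatever actually talks to the database, and the MD5 hashing is just one way to turn an arbitrary query string into a key memcached will accept.

require 'digest/md5'

def cached_query(sql)
  # Memcached keys can't contain whitespace and max out at 250 characters,
  # so hash the query text to get a safe, fixed-length key.
  key = Digest::MD5.hexdigest(sql)
  @cache[key] ||= run_query(sql)
end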

See Also

Recipe 16.16