Sunday, April 06, 2008

Facebook using MySQL to replicate Memcached

Faced with the challenge "to figure out a way for memcached servers to replicate data concurrently with the MySQL databases" across the country, Facebook came up with a clever solution: "embedding extra information in to the MySQL replication stream that allows [Facebook] to properly update memcached [servers] in Virginia."

This is very smart! I am curious about how they implemented this. I wonder if by "replication stream" they are just referring to binary logs. The article didn't mention whether they hacked MySQL to do synchronous replication as well, like Google. That would be really neat: synchronous replication that updates memcached.

Synchronous or not, the idea is still uber cool and I would love to see more discussion from the Planet MySQL community regarding this.

Adding replication support to Brian Aker's memcached storage engine for MySQL could be another way, in the future, to make MySQL replicate to memcached. Brian's blog post shows:
ENGINE=MEMCACHE DEFAULT CHARSET=latin1 CONNECTION='localhost,piggy,bitters'
The multiple host specification looks very interesting. I would definitely love to talk about this with the brains at the conference.

Also, something like this would make a nice candidate for programs like Google Summer of Code.

Thanks to my colleague and friend A. Lee for bringing this to my attention.

2 comments:

Anonymous said...

I think I helped them think this through, along with about a dozen other memcached hackers.

If it's the solution we dreamed up at a hackathon at their HQ, it uses a blackhole table so that updates to memcached get in the binlogs and are replicatable, but there's no actual data sent to MySQL disks and little-to-no CPU time wasted on MySQL servers.

The binlog then is just parsed after replication and applied to any relevant memcached nodes.

I think when we were first talking about it, it was mostly being used just to pass deletes through, and then memcached would just lazily re-fill the new data on-demand, rather than actively filling it.

Things may have changed since then, but it sure seemed like a great start...
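
To make the flow above concrete, here is a rough Python sketch of the "parse the replicated log, delete from memcached, refill lazily" idea. It is only an illustration under stated assumptions, not Facebook's code: the table name memcache_deletes, the column cache_key, the statement format, and the helper names are all made up for the example, a plain dict stands in for a memcached node, and a list of SQL strings stands in for the parsed binlog.

```python
import re

# Hypothetical statement shape written to a BLACKHOLE table on the master.
# The table name `memcache_deletes` and column `cache_key` are assumptions.
INVALIDATION_RE = re.compile(
    r"INSERT\s+INTO\s+memcache_deletes\s*\(\s*cache_key\s*\)\s*"
    r"VALUES\s*\(\s*'([^']*)'\s*\)",
    re.IGNORECASE,
)


def extract_keys(replicated_statements):
    """Yield the cache keys named by invalidation statements in the stream."""
    for statement in replicated_statements:
        match = INVALIDATION_RE.search(statement)
        if match:
            yield match.group(1)


def apply_invalidations(replicated_statements, cache):
    """Delete each invalidated key from the local cache; return keys removed."""
    removed = []
    for key in extract_keys(replicated_statements):
        # A real deployment would call delete() on a memcached client here;
        # the dict is only a stand-in.
        if cache.pop(key, None) is not None:
            removed.append(key)
    return removed


def cached_read(cache, database, key):
    """Lazy refill: on a miss, read from the local replica and repopulate."""
    value = cache.get(key)
    if value is None:
        value = database[key]
        cache[key] = value
    return value
```

Because the invalidation rows travel in the same stream as the real updates, the remote cache is only purged after the corresponding data change has reached the remote replica, which is what makes the lazy refill safe in this sketch.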

Anonymous said...

Here you can see a sample setup for that:
http://golanzakai.blogspot.com/2008/11/memcached-replication-and-namespaces.html