Sunday, April 06, 2008

Is Backup Really Irrelevant?

Brian Aker writes in his "PostgreSQL to scale to 1 billion users" post:
Backup is irrelevant for those of you who care about this discussion. LVM/ZFS snapshots are the rule of the land.


While I agree with most of Brian's statements in the article, I respectfully disagree with the statement above, especially the bolded part. Copy-on-write snapshots are EVIL for very large databases operating in a high I/O environment and backup, by no means, is entirely irrelevant. Please correct me if I am wrong but it is my understanding that both LVM and ZFS implement copy-on-write snapshots. Backup may be irrelevant for most sites but not for us.

If, however, by "irrelevant" Brian meant that not important in choosing one database over another, I can agree with that. Why? Because no one benchmarks backup methodologies until backup process starts becoming a major PITA.

Backup methods can be a performance killer when dealing with very large databases. If you're interested in finding out why, and more importantly how, ask me at the conference, come to my scaling MySQL and InnoDB on Solaris session, or check on this blog after the conference.

3 comments:

mike503 said...

I still do a nightly mysqldump for (almost) all my clients and sites. It's the most guaranteed method for me.

I setup my latest servers using LVM2 expecting to do snapshots for backup, but I believe that involves downtime (or at least "read only time") - however I formatted them using XFS, and I believe there is an issue (still) with LVM2/XFS for MySQL snapshots...

With mysqldump I have a totally portable file that can be compressed and re-used/re-played anywhere. I like that. With snapshots, afaik, you have to figure out how to export them, then re-apply them using the same partitioning scheme (again, afaik) ... might be easy if you have multiple servers hanging around to spare, but I'm on a budget. Right now I can pull down last night's mysqldump and be back up as soon as it imports (and this I suppose can be where other methods would help out, not having to re-play every INSERT/etc. again)

This is one of the things I liked about solidDB's feature set. It came with a "high-performance" on-line backup utility. Sounded like it had the ability to dedicate a thread or something to a backup task and not interrupt or slow down normal processing.

I am not sure, but I would hope MySQL itself could emulate something like this.

Sadly, solidDB seems to be dead now. At least for MySQL. They've pulled all the product downloads and references to it on their site (besides for archived press...)

*Sigh*

Brian Aker said...

Hi!

So why use snapshots? The entire state of the machine is what matters. Think of the state of the machine as a "toaster". You want to have all of what made up the state of the machine.

Sysadmins understand this, and it gives them the ability to backup databases in the same manner they backup most everything else.

ZFS is always copy on write (that is my understanding at least).

Backups are always onerous on IO. Either you are writing more data then what you want, or you are blowing IO by reading more data off the disks then what you want (this is why I have always been a fan of backing up slaves... just special purpose up a slave for backups).

OSTG has a couple of master machines that run multi instance just so they can store multiple machines on to a single machine for backups. Livejournal at one point just used their second multi master node for this.

Cheers,
-Brian

Peter Zaitsev said...

Frank,

What kind of issue you're speaking about. Issue about write overhead with snapshot COV or issues with concurrent backup process reading the date ?

Do you have any benchmark numbers.

In our experience COV overhead is not that large (modified pages written sequentially) while backround read process just need to be throttled so it limits the load.