Thursday, June 23, 2011

XtraBackup Manager - Backup Strategies and Materialized Snapshots

Hi Folks,

I have now committed the changes for the new Backup Strategies feature to trunk! In addition, I'm pretty much finished implementing the majority of the Materialized Snapshot feature/option.

So let me talk a little bit about those features...

Enabling the "maintain_materialized_copy" feature for a backup will mean that while XBM takes FULL backups and INCREMENTAL backups and keeps them separately, it will maintain an additional directory that contains a complete backup with the latest deltas applied to it.

We only keep a materialized copy of the latest backup, not one for each and every possible restore point, as that would take up more space than most people can afford (or at least more than we can afford).

One benefit here is that if some problem should occur while applying the latest set of deltas, you do not risk completely voiding your backup. You can always restore from the seed and deltas that are stored separately, up until the snapshot before the problem, and then perhaps use binary logs to roll forward from there.

Using materialized snapshots also means that you are constantly testing the process of actually applying your deltas, so if something is wrong with that step, you will learn about it quickly, not later on when you are desperately trying to restore from your backups.

Another great thing about materialized snapshots is that there is no waiting around for multiple sets of deltas to apply in order to restore your latest backup. Simply copy the materialized snapshot to the restore location and fire up MySQL -- InnoDB will of course take the usual time to do final crash recovery steps, but it should be much faster to get back up and running.
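To make the idea concrete, here is a toy model in Python. It is only a sketch and not how XBM is implemented: a backup is pretended to be a map of page id to version, and the names `apply_delta`, `update_materialized` and `restore_from_parts` are made up for illustration. It shows the materialized copy being rolled forward with each delta while the seed and deltas stay untouched as a fallback.

```python
from typing import Dict, List

# Hypothetical model: a backup is a map of page id -> version number.
Pages = Dict[str, int]


def apply_delta(base: Pages, delta: Pages) -> Pages:
    """Return a new page map with the delta's changed pages laid over the base."""
    merged = dict(base)
    merged.update(delta)
    return merged


def update_materialized(materialized: Pages, new_delta: Pages) -> Pages:
    """Roll the materialized copy forward by one delta."""
    return apply_delta(materialized, new_delta)


def restore_from_parts(seed: Pages, deltas: List[Pages]) -> Pages:
    """Fallback path: rebuild a restore point from the separately stored parts."""
    result = dict(seed)
    for delta in deltas:
        result = apply_delta(result, delta)
    return result


seed = {"p1": 1, "p2": 1}
deltas = [{"p1": 2}, {"p2": 3, "p3": 1}]

# The materialized copy starts as a copy of the seed and follows each delta.
materialized = dict(seed)
for d in deltas:
    materialized = update_materialized(materialized, d)

# Restoring the latest point needs no replay: the materialized copy already
# matches what replaying every delta over the seed would produce.
print(materialized == restore_from_parts(seed, deltas))
```

If applying the second delta had gone wrong, `restore_from_parts(seed, deltas[:1])` would still give you the restore point before the problem.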

Now a little on Backup Strategies. There are three major strategies available and I'll talk a little on each below.

Full Backup Only

This is fairly cut and dried. XtraBackup Manager only takes full backups. You can configure the maximum number of these snapshots to keep. After each backup is complete, the retention policy will be applied and any backups beyond the maximum will be deleted, counting from the latest to the oldest. There is no need or option for materialized snapshots, since in this case all backups are always fully materialized.
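As a rough sketch of that retention rule (the function name `apply_full_retention` and its parameters are my own invention for illustration, not XBM's code), keeping the newest N backups could look like this in Python:

```python
from typing import List, Tuple


def apply_full_retention(backups: List[str], max_snapshots: int) -> Tuple[List[str], List[str]]:
    """Split backups (ordered oldest first) into (kept, deleted).

    Counting from the latest backwards, anything beyond max_snapshots is deleted.
    """
    if max_snapshots < 1:
        raise ValueError("max_snapshots must be at least 1")
    excess = max(0, len(backups) - max_snapshots)
    return backups[excess:], backups[:excess]


kept, deleted = apply_full_retention(["mon", "tue", "wed", "thu"], max_snapshots=3)
print(kept, deleted)
```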

Continuous Incremental

Take a full backup (aka seed) first and then after that only take incremental backups. The seed and deltas are all stored separately. Again, you can configure the maximum number of snapshots to maintain (retention policy). In this case, however, we apply the oldest set of deltas onto the seed and repeat that process until we have no more than the maximum number of snapshots configured. The retention policy is always applied after a successful backup.
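Here is a minimal sketch of that collapsing step in Python. It is a toy model, not XBM's implementation: backups are pretended to be maps of page id to version, `collapse_oldest_deltas` is a hypothetical name, and I am assuming the seed itself counts as one snapshot towards the maximum.

```python
from typing import Dict, List, Tuple

# Hypothetical model: a backup is a map of page id -> version number.
Pages = Dict[str, int]


def collapse_oldest_deltas(seed: Pages, deltas: List[Pages], max_snapshots: int) -> Tuple[Pages, List[Pages]]:
    """Merge the oldest deltas into the seed until seed + deltas <= max_snapshots.

    Assumption: the seed counts as one snapshot. Deltas are ordered oldest first.
    """
    if max_snapshots < 1:
        raise ValueError("max_snapshots must be at least 1")
    new_seed = dict(seed)
    remaining = list(deltas)
    while 1 + len(remaining) > max_snapshots:
        # The oldest delta is applied onto the seed and stops being a
        # separate restore point.
        new_seed.update(remaining.pop(0))
    return new_seed, remaining


new_seed, remaining = collapse_oldest_deltas(
    {"p1": 1}, [{"p1": 2}, {"p2": 1}, {"p1": 3}], max_snapshots=2
)
print(new_seed, remaining)
```

The oldest restore points disappear, but nothing newer is lost: the seed simply moves forward in time.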

Rotating Groups


This is the most complex backup strategy, but it allows a great deal of flexibility. The concept here is that we consider a backup group as a full backup with corresponding sets of deltas. You may configure the number of groups you keep, as well as when new groups should be created in a variety of ways.

The benefit of keeping more than one group is that should one seed or set of deltas be bad or broken in any way, you have another option. In addition, you may take full backups more frequently, which means that fewer sets of deltas need to be applied to reach a particular restore point.

When using rotating groups, you must select a rotation method - there are two options: DAY_OF_WEEK and AFTER_SNAPSHOT_COUNT.

With the snapshot count rotation method, the first backup will be a FULL backup, after which incremental backups are taken until the total number of backups in the group equals the number you configure. The next backup after that will be a full backup in a new group and the backups after that will be incrementals, based on the newly taken full backup. This cycle just repeats. Retention, however, is controlled based on the maximum number of groups to maintain. Once a new group is created beyond the maximum allowed, the oldest group will be removed until there are no more than the maximum.
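A small simulation may help here. This is a sketch only: `run_backups`, `max_per_group` and `max_groups` are hypothetical names, and I am assuming the configured count includes the full backup that starts the group.

```python
from typing import List


def run_backups(runs: int, max_per_group: int, max_groups: int) -> List[List[str]]:
    """Simulate snapshot-count rotation and return the groups left afterwards.

    Assumption: max_per_group counts the FULL backup plus its incrementals.
    """
    groups: List[List[str]] = []
    for n in range(1, runs + 1):
        if not groups or len(groups[-1]) >= max_per_group:
            # Rotate: start a new group with its own full backup.
            groups.append([f"full-{n}"])
            # Retention: drop the oldest groups beyond the maximum.
            while len(groups) > max_groups:
                groups.pop(0)
        else:
            groups[-1].append(f"incr-{n}")
    return groups


for group in run_backups(runs=7, max_per_group=3, max_groups=2):
    print(group)
```

With these numbers the seventh backup starts a third group, so the first group (backups 1 to 3) is removed as a whole, seed and deltas together.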

With day of week, you simply select which day(s) of the week you would like your FULL backups to be taken on -- XBM will "rotate" on the first snapshot taken for that day. "Rotate" essentially means it will create a new group with its own full backup and then proceed to take deltas until a "rotate_day_of_week" is encountered again. You can also configure a maximum number of deltas allowed, in case for some reason the backup never runs on the day of the week that it should. Once that maximum is reached, XBM will not take the backup. You may configure whether that is treated as a fatal error that raises an alert, or whether it should just silently skip that backup.

The benefit of using day of week over snapshot count is that it allows you to firmly control which days your full backups should happen on.

E.g. say I deploy backups on a new host and configure full backups to be taken on Sunday. I might kick off the first backup on a Wednesday -- in this case, because it is the first backup ever for the host, it will take a full backup and then continue to take deltas until Sunday, when it will take a full backup again and then continue to rotate every Sunday from then on.
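The Wednesday example can be played through with a short Python sketch. None of this is XBM code: `choose_backup`, `simulate`, `rotate_days` and `max_deltas` are names I made up, weekdays follow Python's convention (Monday is 0, Sunday is 6), and I am assuming one backup run per day.

```python
import datetime as dt
from typing import List, Optional, Set


def choose_backup(today: dt.date, rotate_days: Set[int], last_full: Optional[dt.date],
                  deltas_in_group: int, max_deltas: int) -> str:
    """Decide what to do for a backup run under day-of-week rotation."""
    if last_full is None:
        return "FULL"  # first backup ever for the host
    if today.weekday() in rotate_days and last_full < today:
        return "FULL"  # first snapshot taken on a rotate day
    if deltas_in_group >= max_deltas:
        return "SKIP"  # too many deltas: no backup (error or silent, per config)
    return "INCREMENTAL"


def simulate(start: dt.date, days: int, rotate_days: Set[int], max_deltas: int) -> List[str]:
    """Run one backup per day and record what was taken."""
    plan: List[str] = []
    last_full: Optional[dt.date] = None
    deltas = 0
    for offset in range(days):
        today = start + dt.timedelta(days=offset)
        action = choose_backup(today, rotate_days, last_full, deltas, max_deltas)
        if action == "FULL":
            last_full, deltas = today, 0
        elif action == "INCREMENTAL":
            deltas += 1
        plan.append(action)
    return plan


# Start on Wednesday 22 June 2011 with full backups configured for Sundays.
start = dt.date(2011, 6, 22)
for offset, action in enumerate(simulate(start, 12, rotate_days={6}, max_deltas=10)):
    print(start + dt.timedelta(days=offset), action)
```

The run starts with a full backup on the Wednesday, takes deltas until Sunday the 26th, takes a full backup there, and rotates again the following Sunday.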

Again for day of week rotation, retention is controlled based on the maximum number of groups to maintain. Once a new group is created beyond the maximum allowed, the oldest group will be removed until there are no more than the maximum.

Now with all of these complex behaviours and options to configure and close to zero up-to-date documentation, I am about the only person who could make use of XBM, so the next steps are a better configuration tool/interface and documentation.

In addition, I'm also planning to add support for backing up the MySQL binary logs.

Once again, if you're interested in contributing in any way or just checking out the project, it is hosted on Google Code here:

  http://code.google.com/p/xtrabackup-manager/

Comments and feedback are welcome!

Cheers,
Lachlan

8 comments:

  1. ...and I'm the other person knowing how to use it :-)

    Lachlan, this is amazing stuff. I honestly didn't expect you to get here this fast.

    To be honest, I'm not even sure I desperately need binary log backups now, I can just take frequent xtrabackup increments. Of course, the binlog still makes for a good undo tool where I can roll back to a specific statement that went wrong.

    I suggest you start with documentation, DBAs can use SQL as a configuration interface :-)

    Henrik

  2. Thanks for the words of encouragement, Henrik!

  3. Very nice stuff indeed. Keep up the good work.

  4. I second the more documentation request :)

  5. This is great work. I am happy to see a community building up around XtraBackup.

  6. Indeed. XtraBackup is a solid piece of work. Kudos, dude.

  7. It's indeed a nice piece o'work! I'm planning to look into it this week :)

  8. This comment has been removed by a blog administrator.
