An article I wrote was posted to the Facebook Engineering blog, about the automation system I worked on at Facebook for MySQL Database Provisioning.
It covers, in fairly intimate detail, a system called "Windex" that we use to provision and re-provision our MySQL databases at Facebook. This system basically provisioned the new Facebook Datacenter in Luleå, Sweden, with very little human effort, saving us loads of time.
So, if you're curious about some of what it is that has been taking up all my time for the last year and some, or if you're just always curious about how Facebook is doing things, go check it out.
MySQL Soapbox
Various musings on MySQL and other technical topics...
Friday, July 12, 2013
Tuesday, April 17, 2012
Support for XtraBackup 2.0 in XtraBackup Manager coming soon...
Hi Folks,
Just a quick note to let you know that I am planning to add support to XtraBackup Manager to work with XtraBackup 2.0 series releases fairly soon.
Using the XtraBackup 2.0 series will mean that XtraBackup Manager will no longer need to stage the incremental backups to a location on the remote host before copying them back to the XtraBackup Manager storage.
This can be a remarkable efficiency saving for systems with a lot of page changes between backups.
I will also be trying to address some of the feedback/requests that I have received in the Google Code Issues section.
Please check out the project in Google Code here, if you have not already. Feedback and contributions are welcomed!
http://code.google.com/p/xtrabackup-manager/
Cheers,
Lachlan
Monday, April 9, 2012
Talking At MySQL Conference
Hi Everyone!
Just a reminder to all of those who are attending the MySQL Conference in Santa Clara this week that I'll be presenting a session all about XtraBackup Manager.
My session will be entitled "Introducing XtraBackup Manager" and happens on 11 April 15:30-16:20 @ Ballroom D.
If you are interested in learning more about XtraBackup Manager, or would just like to come support me - I look forward to seeing you there!
Cheers,
Lachlan
Tuesday, February 7, 2012
XtraBackup Manager - Job Control, Better Debugging and some little fixes...
Hi Everyone,
Just a quick note to let you know that I've finished adding some new features to XtraBackup Manager.
You can now get better visibility into what is going on inside XtraBackup Manager with the "xbm status" command.
It will allow you to see which backup jobs are running and also those which may be waiting to start, due to the maximum number of concurrent backup tasks already running.
It looks/works as follows:
[xbm@localhost ~]$ xbm status
XtraBackup Manager v0.8 - Copyright 2011-2012 Marin Software

Currently Running Backups:

+--------+-----------+-------------+---------------------+-------------------+------+
| Job ID | Host      | Backup Name | Start Time          | Status            | PID  |
+--------+-----------+-------------+---------------------+-------------------+------+
| 14     | localhost | test-backup | 2012-02-07 14:19:19 | Performing Backup | 2525 |
+--------+-----------+-------------+---------------------+-------------------+------+
Note: I have to thank a tiny little BSD-licensed project I found on Google Code called PHP text table for saving me the need to reinvent the wheel in providing this very mysql command-line client-styled table output.
In addition to seeing which jobs are running/queued, if there is a backup job you would like to abort for some reason, then you can now simply use the "xbm kill" command with a Job ID taken from the "xbm status" output:
[xbm@localhost ~]$ xbm kill 14
XtraBackup Manager v0.8 - Copyright 2011-2012 Marin Software

Action: Backup Job ID 14 was killed.
The backup job itself will log an event at the ERROR level, like:
2012-02-07 14:19:30 -0800 [ERROR] : [ The backup job was killed by an administrator. Aborting... ]
2012-02-07 14:19:30 -0800 [INFO] : [ Cleaning up files... ]
2012-02-07 14:19:30 -0800 [INFO] : [ Released lock on port 10000. ]
2012-02-07 14:19:31 -0800 [ERROR] : [ Exiting after the backup job was killed... ]
I'm still not 100% on whether an aborted backup message should be considered an "Error" level event or an "Info" level event. My thinking is that I'd prefer to know if a job was aborted, so I figure putting it at the ERROR level will ensure it is always logged.
Now speaking quickly of the log levels -- it is now useful to increase your logging level in config.php from INFO to DEBUG.
You will see the exact commands used for running backups by XtraBackup Manager, which can be useful when troubleshooting XBM-related issues.
It will enable logging like the below -- Note: The password is _actually_ masked when writing the command to the log. You're welcome ;-)
2012-02-07 14:19:19 -0800 [INFO] : [ Staging an INCREMENTAL xtrabackup snapshot of /var/lib/mysql via ssh: mysql@localhost to /tmp/xbm-3592510/deltas... ]
2012-02-07 14:19:19 -0800 [DEBUG] : [ Attempting to run the incremental backup with command: ssh -o StrictHostKeyChecking=no -p 22 mysql@localhost 'cd /tmp/xbm-3592510 ; innobackupex --ibbackup=xtrabackup --slave-info --incremental-lsn=2325647 /tmp/xbm-3592510/deltas --user=root --safe-slave-backup --password=XXXXXXX --no-timestamp --incremental --throttle=0 1>&2 ' ]
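For anyone curious how this kind of masking can be done, here is a minimal Python sketch. It is not the code XtraBackup Manager uses (XBM is not written in Python); the function name is made up for illustration, and it assumes the password is passed as a single unquoted --password=value argument.

```python
import re

def mask_password(command):
    """Replace the value of any --password=... argument with XXXXXXX.

    Hypothetical helper for illustration only. Assumes the password
    contains no whitespace and is not wrapped in quotes.
    """
    return re.sub(r"(--password=)\S+", r"\1XXXXXXX", command)
```

The important part is to mask the command just before it is written to the log, while the unmasked command is still the one that actually gets executed.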
Aside from the above, some other small fixes were made, including ensuring that all temporary files created on the database host that you're backing up are created in the defined "staging_tmpdir" -- This is a parameter that is set at the host level in XtraBackup Manager.
e.g. shell> xbm host edit hostname staging_tmpdir /path/to/use
I encourage you to check out the XtraBackup Manager Project and open issues with any problems you encounter or other feedback.
Cheers,
Lachlan
Tuesday, January 24, 2012
I'm speaking at the MySQL Conference And Expo 2012!
Hello Everyone,
I'm very pleased to announce that my submission to talk at the MySQL Conference And Expo 2012 has been accepted! I'll be giving a talk entitled "Introducing XtraBackup Manager", which, as the title suggests, will serve as an introduction to XtraBackup Manager.
I'll be covering what it is, how it works and its features as well as reserving some time for Q+A.
If you are interested in learning more about this tool and plan to attend the conference, this will be a great way to get started!
I hope to see some of you there in April!
For more info on the conference, click here.
Cheers,
Lachlan
Thursday, January 5, 2012
XtraBackup Manager Pre-Release v0.8 - Try it out today!
Aloha Everybody!
I'm happy to announce XtraBackup Manager Pre-Release v0.8!
Now that XtraBackup 1.6.4 is released and I have completed some of my final show-stopper bug fixes, I feel that XtraBackup Manager is now in a state ready for more general consumption.
I have yet to package up tarballs, but the Quick Start Guide in the Project Wiki contains all the steps you should need to get up and running from the svn trunk.
There is also some great detailed documentation, including diagrams of all of the different Backup Strategies here.
So please, check out the Project and take it for a spin -- if you have problems or questions, join the discussion on the XtraBackup Manager Google Group!
Thanks and Happy 2012!!
Note: Release notes for XtraBackup Manager v0.8 can be found here.
Lachlan
Friday, December 2, 2011
XtraBackup Manager - XtraBackup Throttling
Hello again!
This week I have been adding throttling support to XtraBackup Manager, as it is considered a prerequisite for using the tool against our production databases.
The first thing I did was to look into what means of throttling are available.
It seems there are two methods, both of which are mentioned in Percona's docs or blogs.
#1. Use the --throttle=N parameter. You can give this to innobackupex or to xtrabackup directly. According to the documentation this will limit xtrabackup to N IOPS/sec when running in --backup mode.
For local machine backups this means N total read/write IOPS/sec and for incrementals this simply means N read IOPS/sec. When using streaming mode --throttle does not take effect (see #2).
#2. Use a nifty tool called "pv" (Pipe Viewer). It has a few features, but most notably it can be used as a simple rate limiter in your pipeline. An example:
shell> cat myFile | pv -q -L10m > myFileCopy
The above will limit the speed at which the file is "cat" into myFileCopy to 10 megabytes a second, assuming of course that the IO subsystem can reach at least that speed.
The best application for pv is to place it somewhere in the pipeline of your streaming backups to limit the rate at which things can flow through the pipeline.
Eg.
shell> innobackupex --stream | pv -q -L10m | nc targetHost 10000
The above will stream through pv and limit the maximum throughput to 10 megabytes/second.
Now that I understood which rate-limiting methods are available, I needed to consider how XtraBackup Manager uses XtraBackup and the best way to implement the throttling.
I know that:
a) XtraBackup Manager always uses streaming mode when it takes a full backup, so the only option to use there is #2, pv.
b) When performing an incremental backup, XtraBackup Manager will always have xtrabackup stage the deltas locally, before using netcat (nc) to shuttle the data back over the network to the backup host for storage. In this case, limiting using pv is not really useful, because xtrabackup is going to chew up as much IO as it can while calculating the deltas, so we need to opt for the --throttle option on xtrabackup.
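To make the two cases above concrete, here is a minimal Python sketch of how a tool could pick the throttling mechanism based on the backup type. The function name and the layout of the returned dictionary are hypothetical and not taken from XtraBackup Manager; only the pv -q -L<N>m and --throttle=N options come from the discussion above.

```python
def throttle_plan(backup_type, limit):
    """Return which throttling mechanism applies to a backup type.

    backup_type: "full" (streamed) or "incremental" (staged locally).
    limit: the numeric limit requested for the backup task.
    The name and return layout are hypothetical, for illustration only.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    if backup_type == "full":
        # Streaming mode ignores --throttle, so rate-limit the pipe with pv.
        return {"method": "pv", "args": ["pv", "-q", "-L%dm" % limit]}
    if backup_type == "incremental":
        # Deltas are staged locally, so xtrabackup itself must be throttled.
        return {"method": "xtrabackup", "args": ["--throttle=%d" % limit]}
    raise ValueError("unknown backup type: %r" % backup_type)
```

Note that the two mechanisms do not yet mean the same thing by "limit": pv works in bytes per second while --throttle works in IOPS.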
So once I understood that I'll need to actually implement throttling in two ways in XtraBackup Manager, I thought about how I would present it to the user for configuration.
I personally find it a bit annoying and confusing that I have to think in two units of measurement for different situations, so I wanted to see if I could insulate the user from that.
My aim was to see if I could present the user with a single configurable for the throttle on a backup task. After all, you don't care what type of backup is going on, you just want to say "Don't use more IO than this much…".
So in order to achieve this, I needed to understand the relationship between the two options as well as the characteristics of IO in both cases.
From my understanding, if you are taking a full backup, you are simply streaming each file sequentially - so we are talking about sequential reads here.
If we are talking about incrementals, we basically give xtrabackup a log sequence number and say "check all the pages and copy ones with a log sequence number above the one we gave" -- so we're finding the pages that have been changed since the given log sequence number.
In this case, it should also be a sequential read, as we're scanning pages end to end, and just checking the log sequence number.
So in both cases it seems we're talking about sequential reads.
When using pv, we're already dealing in a term that is easy to understand and fairly non-subjective. A rate limit in megabytes/sec of sequential read is straightforward.
Now when we're dealing with the --throttle option and thinking in IOPS we have some more to think about. Firstly, how big is an IOP?
Since I'm no good at reading C source code, I opted for the black box method of investigation and simply took an idle database server and started running xtrabackup against it with various --throttle values, while watching iostat on the data mount.
Here are some results:
--throttle value    Observed disk throughput
1                   3 MB/sec
2                   4 MB/sec
3                   5 MB/sec
4                   6 MB/sec
5                   7 MB/sec
Interestingly the pattern I observe is: throughput = N+2
My best interpretation, after even attempting a little digging into xtrabackup.c, is that on this idle system each unit of --throttle allows xtrabackup one 1MB IOP per second to scan the InnoDB data files, plus we burn 2MB per second to scan/copy the InnoDB log so that it can be applied later.
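As a quick sanity check of that pattern, the small Python snippet below compares the figures observed above against throughput = N + 2. The 1MB IOP size and the 2MB/sec log overhead are only my interpretation of the measurements, not documented xtrabackup behaviour.

```python
# Observed on an idle server: --throttle value -> disk throughput in MB/sec.
observed = {1: 3, 2: 4, 3: 5, 4: 6, 5: 7}

IOP_SIZE_MB = 1      # assumed size of one throttled IO (interpretation)
LOG_OVERHEAD_MB = 2  # assumed log scan/copy cost on an idle system

def predicted_throughput(throttle):
    """Predicted MB/sec for a given --throttle value under the N+2 model."""
    return throttle * IOP_SIZE_MB + LOG_OVERHEAD_MB

# Collect any measurement the model fails to explain.
mismatches = {n: mb for n, mb in observed.items() if predicted_throughput(n) != mb}
print(mismatches)
```

An empty result means every measurement fits the model, although five data points from one idle machine are hardly conclusive.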
Now the catch-22 in this whole thing is that I'm observing this on an idle system, so this 2MB per second of log IO would increase if there is more log activity -- surely on a busy system you would need to read more than 2MB of logs every second to keep up.
The catch part? If I actually make the system busy, I can no longer determine where all the different IO in iostat is coming from, so I can't determine how much IO xtrabackup is now using. I'm sure there is a better way to instrument that per process, but unfortunately it extends beyond my personal skill set right now.
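One possible avenue on Linux is the per-process counters that the kernel exposes in /proc/<pid>/io, where read_bytes counts the bytes a process has caused to be fetched from storage. The Python sketch below has not been tried against xtrabackup; it assumes a kernel with IO accounting enabled and permission to read the file, and the function names are made up for illustration.

```python
import time

def parse_proc_io(text):
    """Parse the contents of /proc/<pid>/io into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            counters[name.strip()] = int(value.strip())
    return counters

def read_rate_mb(before, after, seconds):
    """Average MB/sec read from storage between two counter samples."""
    delta = after["read_bytes"] - before["read_bytes"]
    return delta / (1024.0 * 1024.0) / seconds

def sample(pid):
    """Read the IO counters for one process (Linux only)."""
    with open("/proc/%d/io" % pid) as handle:
        return parse_proc_io(handle.read())

def watch(pid, interval=1.0):
    """Print the read rate of one process once per interval until interrupted."""
    previous = sample(pid)
    while True:
        time.sleep(interval)
        current = sample(pid)
        print("%.1f MB/sec" % read_rate_mb(previous, current, interval))
        previous = current
```

Calling watch() with the PID of a running xtrabackup process on a busy server should, in principle, separate its reads from everything else that shows up in iostat.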
In blogging this, I'm hoping someone reading this can help with ideas or clarification...
So coming back to how I should implement the throttling -- I'm fairly sure that each IOP is 1MB in xtrabackup and pv also allows me to throttle in MB/sec, so I should be able to give one simple "throttle" configurable to the XtraBackup Manager user and tell them it limits in MB/sec.
The question then becomes, should I adjust the value I pass to --throttle for XtraBackup to account for this "at least 2MB used for log scanning"?
I decided I wanted to try to be clever and go ahead and adjust it -- so the value passed to XtraBackup for --throttle is now the configured value minus 2. If the adjustment gives a throttle value less than 1, it is simply given as 1.
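In other words, the adjustment boils down to the following. This is a Python sketch of the rule only, not the actual XtraBackup Manager code, and the names are made up for illustration.

```python
LOG_SCAN_ALLOWANCE_MB = 2  # log scan overhead observed on an idle system

def xtrabackup_throttle(user_limit_mb):
    """Translate the user-facing MB/sec limit into a --throttle value.

    Subtracts the assumed log-scanning overhead and never goes below 1.
    """
    return max(1, user_limit_mb - LOG_SCAN_ALLOWANCE_MB)
```

So a limit of 10 becomes --throttle=8, while limits of 3 or lower all end up as --throttle=1.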
None of this is set in stone -- I'm still testing and experimenting, but I'm curious to know your thoughts.
Can anyone shed light on what xtrabackup is doing?
Should I bother adjusting this value or not?
Cheers,
Lachlan