Tuesday, April 19, 2011

When is an in-memory operation a BAD idea?

Recently I've learned a little more about how MySQL uses implicit in-memory temp tables that I felt it would be worth sharing.

A little background that perhaps many of you may wish to skip...

MySQL when handling many kinds of queries will implicitly create a temp table. This table will start off in-memory and if it exceeds a certain size (controlled by tmp_table_size and max_heap_table_size) it will be converted to an on-disk table in the location(s) defined by the tmpdir server variable.

This was not new to me and it may not be new to you, but I urge you to read on for the interesting part...

When MySQL creates this temporary table in memory, it will use fixed width rows. I assume this is done because in many cases it is easier/faster to allocate and manage memory this way, rather than measuring the size needed for each row in the temp table and then allocating the memory needed, MySQL just allocates <max_row_size> for each row and it's done.

What this means is that the maximum possible space that could be consumed by any one row is the amount of space allocated and consumed for all rows.

Consider, if you will, a VARCHAR(1024) field using the UTF8 character set. Given that a UTF8 character can be represented by up to three bytes (in MySQL), it means that the maximum theoretical size for storage of 1024 such characters becomes 3072 bytes (3K).

Suddenly your generous and forward-thinking schema design becomes your enemy. If such a field only contains simple words like "cat" and "dog" you will need 3K of memory to be allocated in your in-memory temp table regardless.

As you can imagine, a few such fields existing in your implicit temp table, combined with a high number of rows can cause the space needed for this to spiral out of control very quickly!

Now, to add insult to injury, when MySQL decides that your enormous implicit temp table is too big for memory, based on tmp_table_size / max_heap_table_size, it maintains the very same fixed width row format as it copies the table to disk and continues appending rows to it there.

In practise, I have seen this cause 2.3G of data balloon out to 43G -- this is an increase by a factor of over 18!

So how to avoid it?

It really depends on the situation, but I would suggest that if you know a query is going to need such a temp table that you split the query into multiple steps and employ the use of a pivot table.

The pivot table would be an on-disk MyISAM table (TEMPORARY or not - your choice) that you use to explicitly perform the work done by MySQL when performing the implicit temp table step. The benefit here is that when you define this table, you can use variable-width fields and only consume the space needed.

Depending on your system and environment, you could be a little sneaky and even consider defining your MySQL tmpdir as tmpfs (memory) -- this way you get the benefit of the speed of memory as well as only allocating the space you need for each row, rather than the maximum theoretical size.

In the case that I found, it makes a lot more sense to just materialize the temp table efficiently on disk than to be exposed to the risk that a fixed-width table could run amok.

Hopefully this is useful to some of you out there!


Note: Edited per Don McArthur for correctness. utf8 in MySQL only supports the Basic Multilingual Plane subset of utf8, meaning that it may consume only up to 3 bytes per character, not 4 as in the full utf8 spec.

Xtrabackup Manager - Updates and MySQL Conference Observations..

After talking to a number of people at the MySQL Conf last week, it seems there is a pretty high level of interest in a tool like Xtrabackup Manager. This is great news!

I also got a chance to discuss with some folks about what their needs might be and how they would use such a tool. Hopefully I can make sure that those needs are met as I'm developing things.

The other day I finally committed the xbm-conftool contribution. You can now manage the configuration of your hosts in your favourite CSV editor and then import it into the DB.

I have also now started work on making sure that Xtrabackup Manager will run on Nexenta. If you're not aware, Nexenta is a Solaris kernel based system with a Debian userland -- basically OpenSolaris with apt-get.

The main reason for this is that I really like the idea of using a ZFS based system to run as my backup host. It means I can have the filesystem do compression behind the scenes, which saves on disk usage, but I don't have to worry about it in the user space -- This makes it easier to manage backups because I don't have to worry about compressing and uncompressing stuff all the time. This simplifies operations like applying incremental deltas into full backups.

So far the main aspects of the Xtrabackup Manager code seem to "just work" on Nexenta which is promising, but more testing is needed. I've had to make a small change in the way flushing to the crontab is done, since it seems the crontab command in Nexenta does not support installing a file in the crontab of another user.

I've been side-lined with some other work tasks this week, but I'm hoping to get back to Xtrabackup Manager soon.


Monday, April 4, 2011

Xtrabackup Manager - Local Restores, ConfTools and Re-factoring!

Things have been moving along well in the world of Xtrabackup Manager.

In the last week I managed to fix a some bugs and overcome a number of implementation issues. Most notably the internals have been re-factored significantly and now make use of PHP Exceptions.

You probably don't care about the re-factoring all that much if all you want to do is use the tool, but rest assured that it makes development easier, which in turn is going to be better for users!

Aside from the refactoring and probably more interesting -- I added the functionality to be able to perform a local restore any backup snapshot.

If you are using the standard "rolling incremental" backup method, then this means Xtrabackup Manager would first take a FULL backup of your target MySQL host and following that it would take incrementals.

With Xtrabackup Manager you have the ability to set a snapshot retention policy and it is based on the count of snapshots to retain. For example, if you have scheduled backups to be at 11PM daily with a snapshot retention of 7, then you will, at most, keep snapshots for 7 days.

Once Xtrabackup Manager successfully takes the 8th backup, it will collapse the oldest set of incremental deltas by applying/merging them into the full backup snapshot.

Using the new local restore tool you can restore any snapshot with a simple command like:

shell> xbm-restore -s 17 -l /path/to/restore/to

This will restore backup snapshot ID 17 to local path /path/to/restore/to

It works by first copying the full snapshot for the relevant host -- we call this the SEED -- and then periodically applying each set of incremental deltas needed to effectively "roll forward" to the snapshot that you specified in the command.

So far this seems to work fairly well.

In addition to the local restore tool, I have received a patch to aid in managing your host configurations - it allows you to export everything to CSVs that can be more easily edited in something like Excel or OpenOffice, make a bunch of changes and then reimport over the top.

It is important to note that this is just one of _many_ ways that one will be able to manage their Xtrabackup Manager configuration.

I am still looking to add a nifty web interface in the future.... which leads me into reminding everyone and anyone that I am looking for contributors for the project!

MySQL, Linux, PHP and Web/UI experienced folks would be greatly appreciated!

Check out the project on Google Code for more info: