Thursday, June 3, 2010

ON DUPLICATE KEY UPDATE Gotcha!

I know it has been a long time between drinks/posts, but I've been busy -- I promise! :)

Today I spent a considerable amount of time trying to figure out why an INSERT SELECT ON DUPLICATE KEY UPDATE was not behaving as I would expect.

Here is an example to illustrate:

CREATE TABLE t1 (
  id INT AUTO_INCREMENT,
  num INT NOT NULL DEFAULT 0,
  PRIMARY KEY (id)
);

CREATE TABLE t2 (
  id INT NOT NULL,
  num INT
);

INSERT INTO t1 VALUES (1, 10);
INSERT INTO t2 VALUES (1, NULL);


INSERT INTO t1
  SELECT id, num
  FROM t2
ON DUPLICATE KEY UPDATE
  num=IFNULL(VALUES(num), t1.num);

To convert the above query into plain English -- I'm saying, INSERT into table t1 the id and num fields from the t2 table. If there are already row(s) for any UNIQUE key in the target table, t1, then we should instead UPDATE the existing row. Additionally, we should set the num field to the result of whatever this evaluates to:

IFNULL( VALUES(num), t1.num)

To explain: VALUES(num) means "The value that is to be placed into the "num" field when it is updated."

So we are saying, if a NULL is going to be put into the field "num" then we want to leave the value alone -- set it to "t1.num" -- eg. the value that is already there.

One might expect the result of my query to be as follows:


testDB:test> SELECT * FROM t1;
+----+-----+
| id | num |
+----+-----+
|  1 |  10 |
+----+-----+
1 row in set (0.00 sec)


However, that is not the case -- the actual result is:



testDB:test> SELECT * FROM t1;
+----+-----+
| id | num |
+----+-----+
|  1 |   0 |
+----+-----+
1 row in set (0.00 sec)


Why is this the case?

The hint lies in the result of the INSERT itself:


dbp16-int:test> INSERT INTO t1   SELECT id, num   FROM t2 ON DUPLICATE KEY UPDATE   num=IFNULL(VALUES(num), t1.num);
Query OK, 0 rows affected, 1 warning (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 1

Note: 1 warning...

testDB:test> SHOW WARNINGS;
+---------+------+-----------------------------+
| Level   | Code | Message                     |
+---------+------+-----------------------------+
| Warning | 1048 | Column 'num' cannot be null |
+---------+------+-----------------------------+
1 row in set (0.00 sec)

What is actually happening here is that because the column 'num' in the table t1 is defined as NOT NULL, MySQL is silently converting it to a valid value of 0.

So VALUES(num) actually equals 0, thus, it will not evaluate as NULL and the 0 will be INSERTed into the table.

This was not my intention and the solution in this case was to allow NULLs on the "num" field of the table t1. It may not always be possible to remove such restrictions if you rely on these automatic conversions by MySQL to "valid values".

Something to keep in mind - VALUES() will always evaluate as what would have ended up in the table, which may not necessarily be the same thing as the value that was attempted to be INSERTed.