How do I update a row in a table, or insert it if it doesn't exist?

I have the following table of counters:

CREATE TABLE cache (
key text PRIMARY KEY,
generation int
);

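I would like to increment one of the counters, or set it to zero if the corresponding row doesn't exist yet. Is there a way to do this in standard SQL without concurrency issues? The operation is sometimes part of a transaction and sometimes standalone.

The SQL must run on SQLite, PostgreSQL and MySQL, if possible.

A search turned up several ideas, which either suffer from concurrency issues or are specific to one database:

  • Try to INSERT a new row, and UPDATE if there was an error. Unfortunately, the error on INSERT aborts the current transaction.

  • UPDATE the row, and if no rows were modified, INSERT a new row.

  • MySQL has an ON DUPLICATE KEY UPDATE clause.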
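
As a rough illustration of the first idea, here is a minimal Python sketch that wraps the INSERT in a SAVEPOINT, so that a duplicate-key error undoes only the failed statement rather than the surrounding transaction. It runs against an in-memory SQLite database through the standard sqlite3 module. The helper name bump_insert_first and the key 'page' are made up for the example. SAVEPOINT also exists in PostgreSQL and MySQL (InnoDB), but this sketch has only been reasoned through for SQLite.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so that
# BEGIN / SAVEPOINT / COMMIT can be issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE cache (key TEXT PRIMARY KEY, generation INT)")


def bump_insert_first(conn, key):
    """Hypothetical helper: try INSERT, fall back to UPDATE on a duplicate key."""
    conn.execute("SAVEPOINT bump")
    try:
        conn.execute("INSERT INTO cache (key, generation) VALUES (?, 0)", (key,))
    except sqlite3.IntegrityError:
        # Undo only the failed INSERT; the outer transaction stays usable.
        conn.execute("ROLLBACK TO SAVEPOINT bump")
        conn.execute(
            "UPDATE cache SET generation = generation + 1 WHERE key = ?", (key,)
        )
    conn.execute("RELEASE SAVEPOINT bump")


conn.execute("BEGIN")
for _ in range(3):
    bump_insert_first(conn, "page")  # first call inserts 0, later calls increment
conn.execute("COMMIT")

insert_first_result = conn.execute(
    "SELECT generation FROM cache WHERE key = ?", ("page",)
).fetchone()[0]
```

The savepoint is what lets the same helper be used both inside a larger transaction and on its own.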

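Edit: Thanks for all the great replies. It looks like Paul is right, and there is no single, portable way of doing this. That surprises me, since it sounds like a very basic operation.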

I don't know that you are going to find a platform-neutral solution.

This is commonly called an "UPSERT".

If you don't have a common way to atomically update or insert (e.g., via a transaction), then you can fall back to another locking scheme: a 0-byte file, a system mutex, a named pipe, etc.
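
To make the locking fallback concrete, here is a small Python sketch that serializes an UPDATE-then-INSERT sequence with an in-process mutex. It assumes every writer lives in the same process and shares one SQLite connection. The names bump_locked and upsert_lock are invented for the example, and a threading.Lock gives no protection against writers in other processes (a lock file or a database-level lock would be needed there).

```python
import sqlite3
import threading

# One shared connection; check_same_thread=False is acceptable here only
# because every use of the connection is serialized by upsert_lock.
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE cache (key TEXT PRIMARY KEY, generation INT)")
upsert_lock = threading.Lock()


def bump_locked(key):
    """Hypothetical helper: UPDATE, then INSERT if nothing matched, under a mutex."""
    with upsert_lock:
        cur = conn.execute(
            "UPDATE cache SET generation = generation + 1 WHERE key = ?", (key,)
        )
        if cur.rowcount == 0:
            conn.execute("INSERT INTO cache (key, generation) VALUES (?, 0)", (key,))
        conn.commit()


threads = [threading.Thread(target=bump_locked, args=("page",)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Ten calls: one INSERT with generation 0, then nine increments.
generation = conn.execute(
    "SELECT generation FROM cache WHERE key = ?", ("page",)
).fetchone()[0]
```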

Could you use an insert trigger? If it fails, do an update.

MySQL (and subsequently SQLite) also supports the REPLACE INTO syntax:

REPLACE INTO my_table (pk_id, col1) VALUES (5, '123');

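This automatically identifies the primary key and looks for a matching row. If one is found, it is deleted and the new row is inserted in its place; if none is found, the new row is simply inserted. Because the old row is replaced rather than updated, any column you don't list is reset to its default.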

Documentation: https://dev.mysql.com/doc/refman/8.0/en/replace.html

SQLite supports replacing a row if it already exists:

INSERT OR REPLACE INTO [...blah...]

You can shorten this to

REPLACE INTO [...blah...]

This shortcut was added for compatibility with the MySQL REPLACE INTO statement.
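
For the counter table in the question, a plain REPLACE overwrites the stored value instead of incrementing it. The following Python sketch, using the standard sqlite3 module, shows one way around that in SQLite: a subquery reads the current value and COALESCE supplies 0 when the row is missing. The helper name bump_with_replace is made up, and the statement is only claimed to work on SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (key TEXT PRIMARY KEY, generation INT)")


def bump_with_replace(conn, key):
    """Hypothetical helper: increment via INSERT OR REPLACE plus a subquery."""
    # The subquery sees the existing row before it is replaced;
    # COALESCE turns "no such row" (NULL) into the initial value 0.
    conn.execute(
        "INSERT OR REPLACE INTO cache (key, generation) "
        "VALUES (?, COALESCE((SELECT generation + 1 FROM cache WHERE key = ?), 0))",
        (key, key),
    )


def read_generation(conn, key):
    row = conn.execute(
        "SELECT generation FROM cache WHERE key = ?", (key,)
    ).fetchone()
    return row[0]


for _ in range(4):
    bump_with_replace(conn, "page")  # 0, then 1, 2, 3
after_bumps = read_generation(conn, "page")

# A plain REPLACE, by contrast, discards the old row and its counter.
conn.execute("REPLACE INTO cache (key, generation) VALUES (?, 0)", ("page",))
after_plain_replace = read_generation(conn, "page")
```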

Standard SQL provides the MERGE statement for this task. Not all DBMS support the MERGE statement.

At the time of writing, PostgreSQL has no MERGE command (later releases added INSERT ... ON CONFLICT in 9.5 and MERGE in 15), and writing an equivalent by hand is not trivial: there are strange edge cases that make the task "interesting".

The best approach (as in: the one that works under the widest range of conditions) is to use a function, such as the one shown in the manual (merge_db).

If you don't want to use a function, you can usually get away with:

updated = db.execute("UPDATE ... RETURNING 1")
if (!updated)
    db.execute("INSERT ...")

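Just remember that it is not foolproof and it will fail eventually: two concurrent sessions can both see that the UPDATE matched no rows and both attempt the INSERT, in which case one of them gets a duplicate-key error.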
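
One common way to harden the UPDATE-then-INSERT pattern is to retry when the INSERT hits a duplicate key, loosely in the spirit of the loop in the manual's merge_db function. The Python sketch below shows the control flow against an in-memory SQLite database. The helper name bump_update_first and the retry limit are assumptions made for the example. On PostgreSQL the failed INSERT would also need to be wrapped in a savepoint, which is not shown here.

```python
import sqlite3


def bump_update_first(conn, key, retries=3):
    """Hypothetical helper: UPDATE first, INSERT if no row matched, retry on a race."""
    for _ in range(retries):
        cur = conn.execute(
            "UPDATE cache SET generation = generation + 1 WHERE key = ?", (key,)
        )
        if cur.rowcount > 0:
            return  # the row existed and was incremented
        try:
            conn.execute("INSERT INTO cache (key, generation) VALUES (?, 0)", (key,))
            return  # the row did not exist and was created
        except sqlite3.IntegrityError:
            # Someone else inserted the row between our UPDATE and INSERT;
            # go around again so the UPDATE can pick it up.
            continue
    raise RuntimeError("could not upsert %r after %d attempts" % (key, retries))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (key TEXT PRIMARY KEY, generation INT)")
for _ in range(3):
    bump_update_first(conn, "page")  # inserts 0, then increments twice
conn.commit()

update_first_result = conn.execute(
    "SELECT generation FROM cache WHERE key = ?", ("page",)
).fetchone()[0]
```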

I would do something like the following:

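-- `key` is a reserved word in MySQL, hence the backticks; 'some_key' is a placeholder value
INSERT INTO cache (`key`, generation) VALUES ('some_key', 0)
ON DUPLICATE KEY UPDATE generation = generation + 1;

Set the generation value to 0 either in application code or in the SQL, and let the ON DUPLICATE KEY UPDATE clause increment it when the row already exists. I think that's the syntax, anyway.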

The ON DUPLICATE KEY UPDATE clause is the best solution because REPLACE does a DELETE followed by an INSERT. For a very short period the record does not exist, which creates a slight possibility that a query run during the REPLACE (for example, one triggered by a page view) comes back without it.

I prefer INSERT ... ON DUPLICATE KEY UPDATE ... for that reason.
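
ON DUPLICATE KEY UPDATE is MySQL-only. For comparison, SQLite 3.24 and later has a similar clause, INSERT ... ON CONFLICT ... DO UPDATE, and PostgreSQL 9.5 and later has a comparable one. The Python sketch below exercises the SQLite form through the standard sqlite3 module. It assumes the bundled SQLite library is at least version 3.24, and the name UPSERT_SQL is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (key TEXT PRIMARY KEY, generation INT)")

# Requires SQLite >= 3.24. Inside DO UPDATE, the bare column name
# `generation` refers to the value already stored in the conflicting row.
UPSERT_SQL = """
INSERT INTO cache (key, generation) VALUES (?, 0)
ON CONFLICT (key) DO UPDATE SET generation = generation + 1
"""

for _ in range(3):
    conn.execute(UPSERT_SQL, ("page",))  # inserts 0, then increments twice
conn.commit()

upsert_result = conn.execute(
    "SELECT generation FROM cache WHERE key = ?", ("page",)
).fetchone()[0]
```

Unlike REPLACE, this updates the existing row in place rather than deleting and re-inserting it.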

jmoz's solution is the best, though I prefer the SET syntax to the parentheses:

-- `key` is a reserved word in MySQL, so it is quoted; 'some_key' is a placeholder value
INSERT INTO cache
SET `key` = 'some_key', generation = 0
ON DUPLICATE KEY
UPDATE generation = generation + 1
;

If you're OK with using a library that writes the SQL for you, then you can use Upsert (currently Ruby and Python only):

Pet.upsert({:name => 'Jerry'}, :breed => 'beagle')
Pet.upsert({:name => 'Jerry'}, :color => 'brown')

That works across MySQL, Postgres, and SQLite3.

It writes a stored procedure or user-defined function (UDF) in MySQL and Postgres. It uses INSERT OR REPLACE in SQLite3.