当触发器在表上时,不能将 UPDATE 与 OUTPUT 子句一起使用

我使用 OUTPUT查询执行 UPDATE:

UPDATE BatchReports
SET IsProcessed = 1
OUTPUT inserted.BatchFileXml, inserted.ResponseFileXml, deleted.ProcessedDate
WHERE BatchReports.BatchReportGUID = @someGuid

这个语句很好; 直到在表上定义了触发器。然后我的 UPDATE语句将得到错误 334:

如果语句包含没有 INTO 子句的 OUTPUT 子句,则 DML 语句的目标表“ BatchReports”不能具有任何启用的触发器

现在,SQLServer 团队在一篇博客文章中解释了这个问题—— 带 OUTPUT 子句的 UPDATE-Triggers-和 SQLMoreResults:

错误消息是不言而喻的

他们还提供了解决方案:

应用程序已更改为使用 INTO 子句

除了我不能理解整篇博客文章的含义。

那么让我来问问我的问题: 我应该把我的 UPDATE改成什么样才能正常工作?

参见

68485 次浏览

To work around this restriction you need to OUTPUT INTO ... something. e.g. declare an intermediary table variable to be the target then SELECT from that.

DECLARE @T TABLE (
BatchFileXml    XML,
ResponseFileXml XML,
ProcessedDate   DATE,
RowVersion      BINARY(8) )


UPDATE BatchReports
SET    IsProcessed = 1
OUTPUT inserted.BatchFileXml,
inserted.ResponseFileXml,
deleted.ProcessedDate,
inserted.Timestamp
INTO @T
WHERE  BatchReports.BatchReportGUID = @someGuid


SELECT *
FROM   @T

As cautioned in the other answer if your trigger writes back to the rows modified by the UPDATE statement itself in such a way that it affects the columns that you are OUTPUT-ing then you may not find the results useful but this is only a subset of triggers. The above technique works fine in other cases, such as triggers recording to other tables for audit purposes, or returning inserted identity values even if the original row is written back to in the trigger.

Visibility Warning: Don't the other answer. It will give incorrect values. Read on for why it's wrong.


Given the kludge needed to make UPDATE with OUTPUT work in SQL Server 2008 R2, I changed my query from:

UPDATE BatchReports
SET IsProcessed = 1
OUTPUT inserted.BatchFileXml, inserted.ResponseFileXml, deleted.ProcessedDate
WHERE BatchReports.BatchReportGUID = @someGuid

to:

SELECT BatchFileXml, ResponseFileXml, ProcessedDate FROM BatchReports
WHERE BatchReports.BatchReportGUID = @someGuid


UPDATE BatchReports
SET IsProcessed = 1
WHERE BatchReports.BatchReportGUID = @someGuid

Basically I stopped using OUTPUT. This isn't so bad as Entity Framework itself uses this very same hack!

Hopefully 2012 2014 2016 2018 2019 2020 will have a better implementation.


Update: using OUTPUT is harmful

The problem we started with was trying to use the OUTPUT clause to retrieve the "after" values in a table:

UPDATE BatchReports
SET IsProcessed = 1
OUTPUT inserted.LastModifiedDate, inserted.RowVersion, inserted.BatchReportID
WHERE BatchReports.BatchReportGUID = @someGuid

That then hits the well-know limitation ("won't-fix" bug) in SQL Server:

The target table 'BatchReports' of the DML statement cannot have any enabled triggers if the statement contains an OUTPUT clause without INTO clause

Workaround Attempt #1

So we try something where we will use an intermediate TABLE variable to hold the OUTPUT results:

DECLARE @t TABLE (
LastModifiedDate datetime,
RowVersion timestamp,
BatchReportID int
)
  

UPDATE BatchReports
SET IsProcessed = 1
OUTPUT inserted.LastModifiedDate, inserted.RowVersion, inserted.BatchReportID
INTO @t
WHERE BatchReports.BatchReportGUID = @someGuid


SELECT * FROM @t

Except that fails because you're not allowed to insert a timestamp into the table (even a temporary table variable).

Workaround Attempt #2

We secretly know that a timestamp is actually a 64-bit (aka 8 byte) unsigned integer. We can change our temporary table definition to use binary(8) rather than timestamp:

DECLARE @t TABLE (
LastModifiedDate datetime,
RowVersion binary(8),
BatchReportID int
)
  

UPDATE BatchReports
SET IsProcessed = 1
OUTPUT inserted.LastModifiedDate, inserted.RowVersion, inserted.BatchReportID
INTO @t
WHERE BatchReports.BatchReportGUID = @someGuid


SELECT * FROM @t

And that works, except that the value are wrong.

The timestamp RowVersion we return is not the value of the timestamp as it existed after the UPDATE completed:

  • returned timestamp: 0x0000000001B71692
  • actual timestamp: 0x0000000001B71693

That is because the values OUTPUT into our table are not the values as they were at the end of the UPDATE statement:

  • UPDATE statement starting
    • modifies row
      • timestamp is updated (e.g. 2 → 3)
    • OUTPUT retrieves new timestamp (i.e. 3)
    • trigger runs
      • modifies row again
        • timestamp is updated (e.g. 3 → 4)
  • UPDATE statement complete
  • OUTPUT returns 3 (the wrong value)

This means:

  • We do not get the timestamp as it exists at the end of the UPDATE statement (4)
  • Instead we get the timestamp as it was in the indeterminate middle of the UPDATE statement (3)
  • We do not get the correct timestamp

The same is true of any trigger that modifies any value in the row. The OUTPUT will not OUTPUT the value as of the end of the UPDATE.

This means you cannot trust OUTPUT to return any correct values ever.

This painful reality is documented in the BOL:

Columns returned from OUTPUT reflect the data as it is after the INSERT, UPDATE, or DELETE statement has completed but before triggers are executed.

How did Entity Framework solve it?

The .NET Entity Framework uses rowversion for Optimistic Concurrency. The EF depends on knowing the value of the timestamp as it exists after they issue an UPDATE.

Since you cannot use OUTPUT for any important data, Microsoft's Entity Framework uses the same workaround that I do:

Workaround #3 - Final - Do not use OUTPUT clause

In order to retrieve the after values, Entity Framework issues:

UPDATE [dbo].[BatchReports]
SET [IsProcessed] = @0
WHERE (([BatchReportGUID] = @1) AND ([RowVersion] = @2))


SELECT [RowVersion], [LastModifiedDate]
FROM [dbo].[BatchReports]
WHERE @@ROWCOUNT > 0 AND [BatchReportGUID] = @1

Don't use OUTPUT.

Yes it suffers from a race condition, but that's the best SQL Server can do.

What about INSERTs

Do what Entity Framework does:

SET NOCOUNT ON;


DECLARE @generated_keys table([CustomerID] int)


INSERT Customers (FirstName, LastName)
OUTPUT inserted.[CustomerID] INTO @generated_keys
VALUES ('Steve', 'Brown')


SELECT t.[CustomerID], t.[CustomerGuid], t.[RowVersion], t.[CreatedDate]
FROM @generated_keys AS g
INNER JOIN Customers AS t
ON g.[CustomerGUID] = t.[CustomerGUID]
WHERE @@ROWCOUNT > 0

Again, they use a SELECT statement to read the row, rather than placing any trust in the OUTPUT clause.

Why put all needed columns into table variable? We just need primary key and we can read all data after the UPDATE. There is no race when you use transaction:

DECLARE @t TABLE (ID INT PRIMARY KEY);


BEGIN TRAN;


UPDATE BatchReports SET
IsProcessed = 1
OUTPUT inserted.ID INTO @t(ID)
WHERE BatchReports.BatchReportGUID = @someGuid;


SELECT b.*
FROM @t t JOIN BatchReports b ON t.ID = b.ID;


COMMIT;

Triggers are a code smell that can silently assassinate other parts of code. Use stored procedures instead. Instead Of triggers are even worse. So, if you run into an Instead Of trigger, Output won't work even with an INTO, and neither will Scope_Identity.
For example:

CREATE   Trigger [DI].[TRG_AJC_BeforeInsert]
On [DI].[AJC]
INSTEAD OF INSERT
As
Set NoCount on
Insert Into [DI].[AJC] (val1, date2)
From Select [val1], getDate()

will return no results for Output or ScopeIdentity. No results for Scope Identity because it hasn't been created yet, and no results for Output because it wasn't created at the time the OUTPUT was read.

Your only hope is that somehow the trigger didn't do another insert in another table. Then you can use @@identity. But if another table was updated/inserted during that session, you will now have @@identity returning that other table's identity, not the ScopedIdentity you were looking for.