SQL Server query: maximum size of an IN clause, or is there a better approach?

Possible duplicate:
T-SQL WHERE col IN (…)

What is the maximum size of a SQL Server query?

What is the maximum size of an IN clause? I think I saw something about Oracle having a 1000-item limit, but that you could get around it by ANDing two INs together. Is there a similar issue in SQL Server?

Update: So if I need to get 1000 GUIDs from another system (a non-relational database) and perform a "JOIN in code" against SQL Server, what is the best approach? Submit the list of 1000 GUIDs in an IN clause? Or is there some other, more efficient technique?

I haven't tested this, but I wonder whether I could submit the GUIDs as an XML document:

<guids>
    <guid>809674df-1c22-46eb-bf9a-33dc78beb44a</guid>
    <guid>257f537f-9c6b-4f14-a90c-ee613b4287f3</guid>
</guids>

and then perform some kind of XQuery JOIN between the document and the table. Would that be less efficient than a 1000-item IN clause?


Per batch, the limit is 65,536 * network packet size; with the default 4 KB packet size that comes to 256 MB.

However, IN will fail well before that, though the exact limit isn't precise.

You end up with memory errors but I can't recall the exact error. A huge IN will be inefficient anyway.

Edit: Remus reminded me: the error is about "stack size"

The SQL Server maximums are documented at http://msdn.microsoft.com/en-us/library/ms143432.aspx (this is the 2008 version).

A SQL query can be a varchar(max), but is shown as limited to 65,536 * network packet size. Even then, what is most likely to trip you up is the limit of 2,100 parameters per query. If SQL Server chooses to parameterize the literal values in the IN clause, I would think you would hit that limit first, but I haven't tested it.

Edit: Tested it; even under forced parameterization it survived. I knocked up a quick test and had it executing with 30k items within the IN clause. (SQL Server 2005)

At 100k items, it took some time and then failed with:

Msg 8623, Level 16, State 1, Line 1 The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information.

So 30k is possible, but just because you can do it does not mean you should :)

Edit: Continued due to the additional question.

50k worked, but 60k dropped out, so the limit is somewhere in between on my test rig.

In terms of how to do that join of the values without using a large IN clause, personally I would create a temp table, insert the values into it, index it, and then use it in a join, giving the optimiser the best opportunity to optimise the join. (Generating the index on the temp table will create statistics for it, which helps the optimiser as a general rule, although 1000 GUIDs will not exactly find stats too useful.)
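
A minimal sketch of that temp-table approach, assuming a target table called MyTable with a uniqueidentifier column called guid (both are placeholder names, not from the original post), and reusing the two GUIDs from the question:

-- Scratch table for the incoming GUIDs; the PRIMARY KEY also gives it an index.
CREATE TABLE #guids (guid uniqueidentifier NOT NULL PRIMARY KEY);

-- Insert the values fetched from the other system (in practice this would be
-- done from client code, e.g. as a batched insert or via bulk copy).
INSERT INTO #guids (guid) VALUES ('809674df-1c22-46eb-bf9a-33dc78beb44a');
INSERT INTO #guids (guid) VALUES ('257f537f-9c6b-4f14-a90c-ee613b4287f3');
-- ... remaining GUIDs ...

-- Join against the temp table instead of using a huge IN clause.
SELECT t.*
FROM MyTable AS t
JOIN #guids AS g ON t.guid = g.guid;

DROP TABLE #guids;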

Every SQL batch has to fit in the Batch Size Limit: 65,536 * Network Packet Size.

Other than that, your query is limited by runtime conditions. It will usually run out of stack space, because x IN (a,b,c) is nothing but x=a OR x=b OR x=c, which creates an expression tree similar to x=a OR (x=b OR (x=c)), so it gets very deep with a large number of values. SQL Server 7 would hit a stack overflow at about 10k values in the IN, but nowadays stacks are much deeper (because of x64), so it can go pretty deep.
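
For illustration only (MyTable and x are made-up names), the two queries below are equivalent; the first is what gets expanded into the nested OR chain of the second:

SELECT * FROM MyTable WHERE x IN (1, 2, 3);
SELECT * FROM MyTable WHERE x = 1 OR (x = 2 OR (x = 3));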

Update

You already found Erland's article on the topic of passing lists/arrays to SQL Server. With SQL Server 2008 you also have table-valued parameters, which allow you to pass an entire DataTable as a single table-typed parameter and join on it.
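
A minimal sketch of the table-valued parameter approach (SQL Server 2008+); the type name GuidList, the procedure name, and MyTable/guid are hypothetical names used only for illustration:

-- A reusable table type that holds the list of GUIDs.
CREATE TYPE GuidList AS TABLE (guid uniqueidentifier NOT NULL PRIMARY KEY);
GO

-- Table-valued parameters must be declared READONLY.
CREATE PROCEDURE GetRowsByGuid
    @guids GuidList READONLY
AS
    SELECT t.*
    FROM MyTable AS t
    JOIN @guids AS g ON t.guid = g.guid;
GO

From .NET the list would be passed as a single SqlDbType.Structured parameter (for example a DataTable), so the whole set of GUIDs travels to the server in one round trip.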

XML and XPath is another viable solution:

SELECT ...
FROM Table
JOIN (
    SELECT x.value(N'.', N'uniqueidentifier') AS guid
    FROM @values.nodes(N'/guids/guid') t(x)
) AS guids ON Table.guid = guids.guid;
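
For example, @values could be an xml parameter or local variable holding the document from the question; a minimal sketch:

-- @values holds the <guids> document (here a local variable for illustration;
-- it could just as well be an xml parameter sent from the client).
DECLARE @values xml;
SET @values = N'<guids>
    <guid>809674df-1c22-46eb-bf9a-33dc78beb44a</guid>
    <guid>257f537f-9c6b-4f14-a90c-ee613b4287f3</guid>
</guids>';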

Can you load the GUIDs into a scratch table then do a

... WHERE var IN (SELECT guid FROM #scratchtable)