ROW_NUMBER Without ORDER BY

I have to add a row number to my existing query so that I can track how much data has been added to Redis. If the query fails, I can restart from the row number that was last recorded in another table.

Query to get the data starting after row 1000 of the table:

SELECT * FROM (SELECT *, ROW_NUMBER() OVER (Order by (select 1)) as rn ) as X where rn > 1000

The query works fine. Is there any way to get the row number without using ORDER BY?

What is select 1 here?

Is the query optimized, or can I do this another way? Please suggest a better solution.


Try just order by 1. Read the error message. Then reinstate the order by (select 1). Realise that whoever wrote this has, at some point, read the error message and then decided that the right thing to do is to trick the system into not raising an error rather than realising the fundamental truth that the error was trying to alert them to.

Tables have no inherent order. If you want some form of ordering that you can rely upon, it's up to you to provide enough deterministic expression(s) to any ORDER BY clause such that each row is uniquely identified and ordered.

Anything else, including tricking the system into not emitting errors, is hoping that the system will do something sensible without using the tools provided to you to ensure that it does something sensible - a well specified ORDER BY clause.
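As a minimal sketch of what a well specified ORDER BY could look like for the resume-after-failure case in the question: the temporary table #Items, its columns, and the checkpoint variable @last_rn are hypothetical names chosen for illustration, not taken from the original query. It assumes the table has a unique key to order by.

```sql
-- Hypothetical table for illustration; names are not from the original question.
CREATE TABLE #Items (
    id      INT         NOT NULL PRIMARY KEY,  -- unique key, so the ordering is deterministic
    payload VARCHAR(50) NOT NULL
);

INSERT INTO #Items (id, payload)
VALUES (10, 'a'), (20, 'b'), (30, 'c'), (40, 'd'), (50, 'e');

-- Assumed checkpoint: rows 1 and 2 were already pushed before the failure.
DECLARE @last_rn INT = 2;

SELECT id, payload, rn
FROM (
    SELECT id, payload,
           ROW_NUMBER() OVER (ORDER BY id) AS rn  -- numbering follows the unique key
    FROM #Items
) AS X
WHERE rn > @last_rn   -- resume after the last processed row
ORDER BY rn;

DROP TABLE #Items;
```

With the sample rows above, the query returns the rows with id 30, 40 and 50. Note that the numbering only lines up with an earlier run if no rows were inserted or deleted ahead of the checkpoint in the meantime; storing the last processed key value instead of the row number would avoid that dependency.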

You can use any literal value inside the subquery, for example:

order by (select 0)

order by (select null)

order by (select 'test')

and so on.

For more information, see https://exploresql.com/2017/03/31/row_number-function-with-no-specific-order/

There is no need to worry about specifying a constant in the ORDER BY expression. The following is quoted from Microsoft SQL Server 2012 High-Performance T-SQL Using Window Functions by Itzik Ben-Gan (it was available as a free download from Microsoft's free e-books site):

As mentioned, a window order clause is mandatory, and SQL Server doesn’t allow the ordering to be based on a constant—for example, ORDER BY NULL. But surprisingly, when passing an expression based on a subquery that returns a constant—for example, ORDER BY (SELECT NULL)—SQL Server will accept it. At the same time, the optimizer un-nests, or expands, the expression and realizes that the ordering is the same for all rows. Therefore, it removes the ordering requirement from the input data. Here’s a complete query demonstrating this technique:

SELECT actid, tranid, val,
ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS rownum
FROM dbo.Transactions;


Observe in the properties of the Index Scan iterator that the Ordered property is False, meaning that the iterator is not required to return the data in index key order.


The above means that when you order by a constant, no sorting is performed. I strongly recommend reading the book, as Itzik Ben-Gan describes in depth how window functions work and how to optimize the various cases in which they are used.
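One way to check this yourself is to ask SQL Server for the estimated plan as text and look for a Sort operator. This is only a sketch: it assumes the dbo.Transactions table from the quoted example exists in your database, and that you run it from a client such as SSMS or sqlcmd that understands the GO batch separator.

```sql
-- SET SHOWPLAN_TEXT must be the only statement in its batch,
-- hence the GO separators (a client-side batch separator, not T-SQL).
SET SHOWPLAN_TEXT ON;
GO

-- With SHOWPLAN_TEXT on, the query is not executed; only its plan is returned.
SELECT actid, tranid, val,
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS rownum
FROM dbo.Transactions;
GO

SET SHOWPLAN_TEXT OFF;
GO
```

If the quoted description holds for your version and indexes, the plan text should contain no Sort operator for this query.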

What is select 1 here?

In this scenario, the author of the query does not really have any particular sorting in mind. ROW_NUMBER requires an ORDER BY clause, so providing one is a way to satisfy the parser.

Sorting by a "constant" produces a nondeterministic order (the query optimizer is free to choose whatever order it finds suitable).

The easiest way to think about it is:
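If a repeatable numbering matters, a possible alternative to the constant is to order by real columns and add a unique column as a tie-breaker. The sketch below uses a hypothetical temporary table #Events with made-up columns; it is an illustration, not part of the original query.

```sql
-- Hypothetical table: created_on is not unique, event_id is.
CREATE TABLE #Events (
    event_id   INT  NOT NULL PRIMARY KEY,
    created_on DATE NOT NULL
);

INSERT INTO #Events (event_id, created_on)
VALUES (1, '2024-01-02'), (2, '2024-01-01'), (3, '2024-01-01');

-- Ordering by created_on alone would leave the two 2024-01-01 rows tied,
-- so their row numbers could swap between runs. Adding event_id breaks the tie.
SELECT event_id, created_on,
       ROW_NUMBER() OVER (ORDER BY created_on, event_id) AS rn
FROM #Events
ORDER BY rn;

DROP TABLE #Events;
```

Here event 2 gets rn 1, event 3 gets rn 2, and event 1 gets rn 3 on every run, because the combination (created_on, event_id) identifies each row uniquely.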

ROW_NUMBER() OVER(ORDER BY 1)    -- error
ROW_NUMBER() OVER(ORDER BY NULL) -- error

There are a few ways to provide a constant expression that "tricks" the query optimizer:

ROW_NUMBER() OVER(ORDER BY (SELECT 1)) -- already presented

Other options:

ROW_NUMBER() OVER(ORDER BY 1/0)       -- should not be used
ROW_NUMBER() OVER(ORDER BY @@SPID)
ROW_NUMBER() OVER(ORDER BY DB_ID())
ROW_NUMBER() OVER(ORDER BY USER_ID())
