How to join to the first row

I'll use a concrete, but hypothetical, example.

Each order normally has only one line item:

Orders:

    OrderGUID   OrderNumber
    =========   ============
    {FFB2...}   STL-7442-1
    {3EC6...}   MPT-9931-8A

LineItems:

    LineItemGUID   Order ID Quantity   Description
    ============   ======== ========   =================================
    {098FBE3...}   1        7          prefabulated amulite
    {1609B09...}   2        32         spurving bearing

But occasionally there will be an order with two line items:

    LineItemID   Order ID    Quantity   Description
    ==========   ========    ========   =================================
    {A58A1...}   6,784,329   5          panametric fan
    {0E9BC...}   6,784,329   5          differential girdlespring

Normally, when showing the orders to the user:

    SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
    FROM Orders
        INNER JOIN LineItems
        ON Orders.OrderID = LineItems.OrderID

I want to show the single item on the order. But when this occasional order contains two (or more) items, the order appears duplicated:

    OrderNumber   Quantity   Description
    ===========   ========   ====================
    STL-7442-1    7          prefabulated amulite
    MPT-9931-8A   32         spurving bearing
    KSG-0619-81   5          panametric fan
    KSG-0619-81   5          differential girdlespring

What I really want is for SQL Server to just pick one, as that will be good enough:

    OrderNumber   Quantity   Description
    ===========   ========   ====================
    STL-7442-1    7          prefabulated amulite
    MPT-9931-8A   32         spurving bearing
    KSG-0619-81   5          panametric fan

If I get adventurous, I might show the user an ellipsis to indicate that there is more than one:

    OrderNumber   Quantity   Description
    ===========   ========   ====================
    STL-7442-1    7          prefabulated amulite
    MPT-9931-8A   32         spurving bearing
    KSG-0619-81   5          panametric fan, ...

So the question is how to either

  • eliminate "duplicate" rows
  • only join to one of the rows, to avoid duplication

First attempt

My first naive attempt was to join only to the "TOP 1" line item:

    SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
    FROM Orders
        INNER JOIN (
            SELECT TOP 1 LineItems.Quantity, LineItems.Description
            FROM LineItems
            WHERE LineItems.OrderID = Orders.OrderID) LineItems2
        ON 1=1

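But that gives the error:

    The column or prefix 'Orders' does not match with a table name or alias name used in the query.

Presumably because the inner select doesn't see the outer table.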

    SELECT   Orders.OrderNumber, LineItems.Quantity, LineItems.Description
    FROM     Orders
    JOIN     LineItems
    ON       LineItems.LineItemGUID =
             (
             SELECT  TOP 1 LineItemGUID
             FROM    LineItems
             WHERE   OrderID = Orders.OrderID
             )

In SQL Server 2005 and later, you can replace the INNER JOIN with CROSS APPLY:

    SELECT  Orders.OrderNumber, LineItems2.Quantity, LineItems2.Description
    FROM    Orders
    CROSS APPLY
            (
            SELECT  TOP 1 LineItems.Quantity, LineItems.Description
            FROM    LineItems
            WHERE   LineItems.OrderID = Orders.OrderID
            ) LineItems2

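Please note that TOP 1 without ORDER BY is not deterministic: this query will get you one line item per order, but it is not defined which one it will be.

Multiple invocations of the query can give you different line items for the same order, even if the underlying data did not change.

If you want a deterministic choice, add an ORDER BY clause to the innermost query.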
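To make the determinism point concrete, here is a minimal sketch of the join with an explicit tie-breaker. It runs against an in-memory SQLite database through Python's sqlite3 module rather than SQL Server, so LIMIT 1 stands in for TOP 1. The integer OrderID and LineItemID keys are assumptions made for the demo (the question uses GUIDs).

```python
import sqlite3

# In-memory stand-in for the question's tables; integer keys are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderNumber TEXT);
CREATE TABLE LineItems (LineItemID INTEGER PRIMARY KEY, OrderID INTEGER,
                        Quantity INTEGER, Description TEXT);
INSERT INTO Orders VALUES
    (1, 'STL-7442-1'), (2, 'MPT-9931-8A'), (3, 'KSG-0619-81');
INSERT INTO LineItems VALUES
    (1, 1, 7,  'prefabulated amulite'),
    (2, 2, 32, 'spurving bearing'),
    (3, 3, 5,  'panametric fan'),
    (4, 3, 5,  'differential girdlespring');
""")

query = """
SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
FROM Orders
JOIN LineItems
  ON LineItems.LineItemID = (
         SELECT li.LineItemID
         FROM LineItems AS li
         WHERE li.OrderID = Orders.OrderID
         ORDER BY li.LineItemID   -- explicit tie-breaker: lowest LineItemID wins
         LIMIT 1                  -- SQLite spelling of T-SQL's TOP 1
     )
ORDER BY Orders.OrderID
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the ORDER BY in place, the two-item order always resolves to the line item with the lowest LineItemID, no matter how often the query runs.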

You could do:

    SELECT
        Orders.OrderNumber,
        LineItems.Quantity,
        LineItems.Description
    FROM
        Orders INNER JOIN LineItems
        ON Orders.OrderID = LineItems.OrderID
    WHERE
        LineItems.LineItemID = (
            SELECT MIN(LineItemID)
            FROM   LineItems
            WHERE  OrderID = Orders.OrderID
        )

This requires an index (or primary key) on LineItems.LineItemID and an index on LineItems.OrderID, or it will be slow.

EDIT: Never mind, Quassnoi has a better answer.
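The index requirement mentioned above could look roughly like this. This is a sketch on SQLite via Python's sqlite3, not on SQL Server. The index name IX_LineItems_OrderID is hypothetical, and the integer keys are assumed for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderNumber TEXT);
-- LineItemID is the primary key, so lookups by LineItemID are already covered.
CREATE TABLE LineItems (LineItemID INTEGER PRIMARY KEY, OrderID INTEGER,
                        Quantity INTEGER, Description TEXT);
-- Hypothetical index name; it supports the "WHERE OrderID = ..." lookup
-- performed by the correlated subquery.
CREATE INDEX IX_LineItems_OrderID ON LineItems (OrderID);
INSERT INTO Orders VALUES
    (1, 'STL-7442-1'), (2, 'MPT-9931-8A'), (3, 'KSG-0619-81');
INSERT INTO LineItems VALUES
    (1, 1, 7,  'prefabulated amulite'),
    (2, 2, 32, 'spurving bearing'),
    (3, 3, 5,  'panametric fan'),
    (4, 3, 5,  'differential girdlespring');
""")

query = """
SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
FROM Orders
INNER JOIN LineItems ON Orders.OrderID = LineItems.OrderID
WHERE LineItems.LineItemID = (
          SELECT MIN(LineItemID)
          FROM LineItems
          WHERE OrderID = Orders.OrderID
      )
ORDER BY Orders.OrderID
"""
rows = conn.execute(query).fetchall()

# List the explicitly created indexes on LineItems.
index_names = [
    name for (name,) in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'LineItems'"
    )
]
print(index_names)
print(rows)
```

The CREATE INDEX statement has the same basic shape in T-SQL. Whether the optimizer actually uses the index is something to confirm in the execution plan on your own server.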

For SQL2K, something like this:

    SELECT
        Orders.OrderNumber, LineItems.Quantity, LineItems.Description
    FROM (
        SELECT
            Orders.OrderID, Orders.OrderNumber, FirstLineItemID = (
                SELECT TOP 1 LineItemID
                FROM LineItems
                WHERE LineItems.OrderID = Orders.OrderID
                ORDER BY LineItemID -- or whatever else
            )
        FROM Orders
    ) Orders
    JOIN LineItems
        ON LineItems.OrderID = Orders.OrderID
        AND LineItems.LineItemID = Orders.FirstLineItemID

I know this question was answered a while ago, but when dealing with large data sets, nested queries can be costly. Here is a different solution where the nested query runs only once, instead of once for each row returned.

    SELECT
        Orders.OrderNumber,
        LineItems.Quantity,
        LineItems.Description
    FROM
        Orders
        INNER JOIN (
            -- one LineItemID per order, computed in a single pass
            SELECT
                Orders.OrderNumber,
                MAX(LineItems.LineItemID) AS LineItemID
            FROM
                Orders
                INNER JOIN LineItems
                ON Orders.OrderID = LineItems.OrderID
            GROUP BY Orders.OrderNumber
        ) AS Items ON Orders.OrderNumber = Items.OrderNumber
        INNER JOIN LineItems
        ON Items.LineItemID = LineItems.LineItemID

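Tried the CROSS APPLY; it works nicely but takes slightly longer. I adjusted the line item columns to use MAX and added a GROUP BY, which kept the speed and dropped the extra record.

Here is the adjusted query:

    SELECT Orders.OrderNumber, MAX(LineItems.Quantity), MAX(LineItems.Description)
    FROM Orders
        INNER JOIN LineItems
        ON Orders.OrderID = LineItems.OrderID
    GROUP BY Orders.OrderNumber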
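One property of the grouped query is worth checking before relying on it: each MAX is computed independently, so the Quantity and Description in a result row need not come from the same line item. The sketch below illustrates this on SQLite via Python's sqlite3. The integer keys are assumed, and the quantities are deliberately changed from the question's 5 and 5 so the effect becomes visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderNumber TEXT);
CREATE TABLE LineItems (LineItemID INTEGER PRIMARY KEY, OrderID INTEGER,
                        Quantity INTEGER, Description TEXT);
INSERT INTO Orders VALUES (3, 'KSG-0619-81');
-- Assumed quantities (5 and 8) so that the two MAX values diverge.
INSERT INTO LineItems VALUES
    (3, 3, 5, 'panametric fan'),
    (4, 3, 8, 'differential girdlespring');
""")

grouped = conn.execute("""
SELECT Orders.OrderNumber, MAX(LineItems.Quantity), MAX(LineItems.Description)
FROM Orders
INNER JOIN LineItems ON Orders.OrderID = LineItems.OrderID
GROUP BY Orders.OrderNumber
""").fetchall()

actual_items = conn.execute(
    "SELECT Quantity, Description FROM LineItems"
).fetchall()

print(grouped)                          # quantity from one item, description from the other
print(grouped[0][1:] in actual_items)   # is the combined row a real line item?
```

If "any one item" has to be a real row of LineItems, one of the approaches that selects a single LineItemID per order is the safer choice.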

@Quassnoi's answer is good. In some cases (especially if the outer table is big), a more efficient query might use window functions, like this:

    SELECT  Orders.OrderNumber, LineItems2.Quantity, LineItems2.Description
    FROM    Orders
    LEFT JOIN
            (
            SELECT  LineItems.Quantity, LineItems.Description, OrderId,
                    ROW_NUMBER() OVER (PARTITION BY OrderId ORDER BY (SELECT NULL)) AS RowNum
            FROM    LineItems
            ) LineItems2 ON LineItems2.OrderId = Orders.OrderID AND RowNum = 1

Sometimes you just need to test which query gives better performance.
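A rough harness for that kind of comparison might look like the sketch below. It uses SQLite through Python's sqlite3 with randomly generated data, so the timings say nothing about SQL Server itself; only the pattern carries over (run each candidate on the same data, confirm the results agree, then compare elapsed time). The table contents, row counts, and index name are all assumptions.

```python
import random
import sqlite3
import time

random.seed(0)  # reproducible test data
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderNumber TEXT);
CREATE TABLE LineItems (LineItemID INTEGER PRIMARY KEY, OrderID INTEGER,
                        Quantity INTEGER, Description TEXT);
CREATE INDEX IX_LineItems_OrderID ON LineItems (OrderID);
""")

order_count = 2000
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [(i, f"ORD-{i}") for i in range(1, order_count + 1)],
)
items = []
for order_id in range(1, order_count + 1):
    # Most orders get one line item, a few get two or three.
    for _ in range(random.choice([1, 1, 1, 2, 3])):
        items.append((order_id, random.randint(1, 50), "item"))
conn.executemany(
    "INSERT INTO LineItems (OrderID, Quantity, Description) VALUES (?, ?, ?)",
    items,
)

candidates = {
    "correlated subquery": """
        SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
        FROM Orders
        JOIN LineItems ON LineItems.LineItemID = (
            SELECT MIN(li.LineItemID) FROM LineItems AS li
            WHERE li.OrderID = Orders.OrderID)
        ORDER BY Orders.OrderID
    """,
    "grouped derived table": """
        SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
        FROM Orders
        JOIN (SELECT OrderID, MIN(LineItemID) AS LineItemID
              FROM LineItems GROUP BY OrderID) AS FirstItems
          ON FirstItems.OrderID = Orders.OrderID
        JOIN LineItems ON LineItems.LineItemID = FirstItems.LineItemID
        ORDER BY Orders.OrderID
    """,
}

results = {}
for name, sql in candidates.items():
    start = time.perf_counter()
    results[name] = conn.execute(sql).fetchall()
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{name}: {len(results[name])} rows in {elapsed_ms:.1f} ms")
```

Checking that both candidates return the same rows before comparing times guards against "optimizing" into a query that answers a different question.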

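Correlated subqueries are subqueries that depend on the outer query. They work like a for loop in SQL: the subquery runs once for each row in the outer query:

    select * from users join widgets on widgets.id = (
        select id from widgets
        where widgets.user_id = users.id
        order by created_at desc
        limit 1
    )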

My favorite way to run this query is with a NOT EXISTS clause. I believe this is the most efficient way to run this sort of query:

    select o.OrderNumber,
           li.Quantity,
           li.Description
    from Orders as o
    inner join LineItems as li
        on li.OrderID = o.OrderID
    where not exists (
        select 1
        from LineItems as li_later
        where li_later.OrderID = o.OrderID
        and li_later.LineItemGUID > li.LineItemGUID
    )

But I have not tested this method against the other methods suggested here.
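As a quick correctness check (not a performance test), the NOT EXISTS form can be run against the question's sample rows. This is a sketch on SQLite via Python's sqlite3, with an integer LineItemID assumed in place of LineItemGUID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderNumber TEXT);
CREATE TABLE LineItems (LineItemID INTEGER PRIMARY KEY, OrderID INTEGER,
                        Quantity INTEGER, Description TEXT);
INSERT INTO Orders VALUES
    (1, 'STL-7442-1'), (2, 'MPT-9931-8A'), (3, 'KSG-0619-81');
INSERT INTO LineItems VALUES
    (1, 1, 7,  'prefabulated amulite'),
    (2, 2, 32, 'spurving bearing'),
    (3, 3, 5,  'panametric fan'),
    (4, 3, 5,  'differential girdlespring');
""")

# Keep a line item only if no other item of the same order has a higher key.
rows = conn.execute("""
SELECT o.OrderNumber, li.Quantity, li.Description
FROM Orders AS o
INNER JOIN LineItems AS li ON li.OrderID = o.OrderID
WHERE NOT EXISTS (
    SELECT 1
    FROM LineItems AS li_later
    WHERE li_later.OrderID = o.OrderID
      AND li_later.LineItemID > li.LineItemID
)
ORDER BY o.OrderID
""").fetchall()

for row in rows:
    print(row)
```

Each order appears once, and the two-item order resolves to the item with the highest key, so this variant is deterministic as long as the key is unique.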

Another approach, using a common table expression:

    with firstOnly as (
        select Orders.OrderNumber, LineItems.Quantity, LineItems.Description,
               ROW_NUMBER() over (partition by Orders.OrderID order by Orders.OrderID) lp
        FROM Orders
            join LineItems on Orders.OrderID = LineItems.OrderID
    ) select *
      from firstOnly
      where lp = 1

Or, in the end, maybe you would like to show all joined rows?

Comma-separated version:

    select *
    from Orders o
        cross apply (
            select CAST((select l.Description + ','
            from LineItems l
            where l.OrderID = o.OrderID
            for xml path('')) as nvarchar(max)) l
        ) lines
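The same "one row per order, all descriptions joined" idea can be tried out quickly with an aggregate. The sketch below uses SQLite's group_concat through Python's sqlite3 as a stand-in for the FOR XML PATH trick; on SQL Server 2017 and later, STRING_AGG plays a similar role. The integer keys are assumed for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderNumber TEXT);
CREATE TABLE LineItems (LineItemID INTEGER PRIMARY KEY, OrderID INTEGER,
                        Quantity INTEGER, Description TEXT);
INSERT INTO Orders VALUES
    (1, 'STL-7442-1'), (2, 'MPT-9931-8A'), (3, 'KSG-0619-81');
INSERT INTO LineItems VALUES
    (1, 1, 7,  'prefabulated amulite'),
    (2, 2, 32, 'spurving bearing'),
    (3, 3, 5,  'panametric fan'),
    (4, 3, 5,  'differential girdlespring');
""")

# One output row per order; all descriptions collapsed into a single string.
rows = conn.execute("""
SELECT Orders.OrderNumber,
       group_concat(LineItems.Description, ', ') AS Descriptions
FROM Orders
JOIN LineItems ON LineItems.OrderID = Orders.OrderID
GROUP BY Orders.OrderID, Orders.OrderNumber
ORDER BY Orders.OrderID
""").fetchall()

for order_number, descriptions in rows:
    print(order_number, "->", descriptions)
```

The order of the items inside each concatenated string is not guaranteed here, so treat it as a set of descriptions rather than an ordered list.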

From SQL Server 2012 onwards, I think this will do the trick:

    SELECT DISTINCT
            o.OrderNumber ,
            FIRST_VALUE(li.Quantity) OVER ( PARTITION BY o.OrderNumber ORDER BY li.Description ) AS Quantity ,
            FIRST_VALUE(li.Description) OVER ( PARTITION BY o.OrderNumber ORDER BY li.Description ) AS Description
    FROM    Orders AS o
            INNER JOIN LineItems AS li ON o.OrderID = li.OrderID
