SQL Aggregation for a smaller result set


I have a database in which I need to aggregate records into another, smaller set. This result set should contain the difference between the maximum and minimum of specific columns of the original records, where the records are grouped so that they add up to at most a certain sum: a constant C, treated as a closed bound.

The constant C determines how the original records are aggregated, and no entry in the resulting set ever exceeds it. Naturally, I am supposed to run this in natural primary key order.

To illustrate: table has:

  • [key]
  • [a]
  • [b]
  • [minColumn]
  • [maxColumn]
  • [N]

...all are int datatype.

I am after a result set in which each entry is a group of consecutive records whose MAX(maxColumn) - MIN(minColumn) is less than or equal to the constant C.

Apart from the MAX(maxColumn) and MIN(minColumn) values, each entry in this result set also needs column [a] from the FIRST record and column [b] from the LAST record of its group. Finally, the N column should be summed over all original records in the group.

Is there an efficient way to do this without cursors?

-----[Trivial Sample]------------------------------------------------------------

I am attempting to group by a slightly more complicated form of a running sum, bounded by the constant C.

There is only one table, and all of its columns are of int type. Sample data:

declare @t table (
    PK int primary key,
    a int, b int, minColumn int, maxColumn int, N int
)

insert @t values (1,5,6,100,200,1000)
insert @t values (2,7,8,210,300,2000)
insert @t values (3,9,10,420,600,3000)
insert @t values (4,11,12,640,800,4000)

Thus for:

key, a,   b, minColumn, maxColumn,    N
---------------------------------------
1,   5,   6,       100,       200, 1000 
2,   7,   8,       210,       300, 2000 
3,   9,  10,       420,       600, 3000 
4,   11, 12,       640,       800, 4000 

I need the result set to look like, for a constant C of 210 :

firstA | lastB | MIN_minColumn | MAX_maxColumn | SUM_N
5       8                  100             300    3000 
9       10                 420             600    3000 
11      12                 640             800    4000 

[ Adding the bounty and sample as discussed below]

For C = 381, It should contain 2 rows:

firstA | lastB | MIN_minColumn | MAX_maxColumn | SUM_N
5            8             100             300    3000 
9           12             420             800    7000

I hope this demonstrates the problem better. For a constant C of, say, 1000, you would get 1 record in the result:

firstA | lastB | MIN_minColumn | MAX_maxColumn | SUM_N
5           12             100             800   10000
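
One reading of the rule that reproduces all three sample results is a greedy walk in key order: keep extending the current group while MAX(maxColumn) - MIN(minColumn) over the group stays at or below C, otherwise start a new group. The sketch below uses plain Python rather than SQL, purely to pin that rule down. The names `Row` and `aggregate` are made up for illustration, and it assumes that a single record wider than C still forms a group of its own.

```python
from typing import NamedTuple


class Row(NamedTuple):
    key: int
    a: int
    b: int
    min_col: int
    max_col: int
    n: int


# Sample data from the question.
ROWS = [
    Row(1, 5, 6, 100, 200, 1000),
    Row(2, 7, 8, 210, 300, 2000),
    Row(3, 9, 10, 420, 600, 3000),
    Row(4, 11, 12, 640, 800, 4000),
]


def aggregate(rows, c):
    """Greedy grouping in key order.

    Returns (firstA, lastB, MIN_minColumn, MAX_maxColumn, SUM_N) tuples.
    """
    result = []
    current = None  # [first_a, last_b, lo, hi, sum_n] of the open group
    for row in sorted(rows, key=lambda r: r.key):
        if current is not None:
            lo = min(current[2], row.min_col)
            hi = max(current[3], row.max_col)
            if hi - lo <= c:
                # The row still fits: extend the open group.
                current = [current[0], row.b, lo, hi, current[4] + row.n]
                continue
            # The row does not fit: close the open group.
            result.append(tuple(current))
        # Start a new group with this row (assumption: even if it is wider than c).
        current = [row.a, row.b, row.min_col, row.max_col, row.n]
    if current is not None:
        result.append(tuple(current))
    return result


for c in (210, 381, 1000):
    print(c, aggregate(ROWS, c))
```

Because each group's boundary depends on where the previous group ended, the rule is sequential by nature, which is the part that is awkward to express as a single set-based GROUP BY.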

3 Answers

#1


2  

DECLARE @c int
SELECT @c = 210

-- The table variable names its key column PK, so PK is used throughout.
SELECT MIN(a) firstA,
       MAX(b) lastB,
       MIN(minColumn) MIN_minColumn,
       MAX(maxColumn) MAX_maxColumn,
       SUM(N) SUM_N
FROM @t t
JOIN (SELECT PK, [sum] / @c AS [rank]   -- int / int already truncates, so FLOOR is not needed
        FROM (SELECT t1.PK,
                     (SELECT SUM(t2.maxColumn - t2.minColumn)
                        FROM @t t2
                       WHERE t2.PK <= t1.PK) AS [sum]
                FROM @t t1) A
     ) B ON B.PK = t.PK
GROUP BY B.[rank]
ORDER BY B.[rank]

/*

Table A: for each key, calculates SUM(maxColumn - minColumn) over all keys up to and including it.
Table B: for each key, uses the sum from A to calculate a rank such that:
  sum = (rank + y) * @c where 0 <= y < 1.
  e.g. @c = 210: rank(100) = 0, rank(200) = 0, rank(220) = 1, ...
Finally, grouping by rank gives you what you want.
MIN(a) and MAX(b) stand in for the first a and last b, which assumes a and b increase with the key.

*/
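
The two steps described in the comment (table A's running sum, table B's rank) can be traced in plain Python on the question's sample rows. This is only a sketch of the arithmetic, not the SQL itself; the names `SPANS` and `running_ranks` are made up for illustration.

```python
from itertools import accumulate

# (key, minColumn, maxColumn) for the sample rows in the question.
SPANS = [
    (1, 100, 200),
    (2, 210, 300),
    (3, 420, 600),
    (4, 640, 800),
]


def running_ranks(spans, c):
    """Return (key, running_sum, rank) per row, with rank = running_sum // c."""
    ordered = sorted(spans)
    # Running sum of (maxColumn - minColumn) up to and including each key.
    totals = accumulate(hi - lo for _, lo, hi in ordered)
    return [(key, total, total // c) for (key, _, _), total in zip(ordered, totals)]


for c in (210, 381):
    print(c, running_ranks(SPANS, c))
```

With C = 210 the ranks come out as 0, 0, 1, 2, which matches the three-row sample. With C = 381 they come out as 0, 0, 0, 1, which would put keys 1 to 3 in one group, whereas the question's C = 381 sample groups keys 1-2 and 3-4. The bucketing therefore seems to agree with the expected output only for some values of C.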

#2


1  

declare @c int
select @c = 210

select firstA = min(a), lastB = max(b), MIN_minColumn = min(minColumn), MAX_maxColumn = max(maxColumn), SUM_N = sum(N)
from @t
where minColumn <= @c
union all
select a, b, minColumn, maxColumn, N
from @t
where minColumn > @c

#3


1  

I am a little confused about the grouping logic for the result you are trying to produce, but from the description of what you are looking for, I think you need a HAVING clause. You should be able to do something like:

SELECT groupingA, groupingB, MAX(a) - MIN(b)
FROM ...
GROUP BY groupingA, groupingB
HAVING (MAX(a) - MIN(b)) < C

...in order to filter on the difference between your max and min values, once you've determined your grouping. Hope this is helpful.
