各位先進好,問題有點複雜,希望用圖表有說明的更清楚~
需求說明:
有一堆資料集合(資料結構範例如圖一,有Type(A,B,C,D,E)),想要依條件分群去除不合條件的資料後,再篩選出不重複的資料show在DataTable上。
依範例資料結構,最終需篩選出Type(A,B,C)於DataTable上
原始資料結構如下圖所示
篩選條件情境如下圖所示,這邊將每組資料集的min及max條件都假定為一樣的示意圖來說明
篩選條件說明:
每組資料都有value, min, max的值,想要將所有資料都小於min,或所有資料都大於max,或所有資料都介於中間的資料篩選出來,若資料有橫跨兩個條件的資料集合則要排除掉,資料需全部都座落在同一個條件區域。
篩選出來的資料需要有type資訊及分群編號的資訊,這樣才能知道每一組資料集合是屬於那一群。
如果不分開處理,只能用一個SQL來篩選這些資料,不知道是否有可能做到。
個人的淺見好像是個不可能的任務。。。><
還請各位先進指點,謝謝
CREATE TABLE TypeGroupTest (
XTypeNo int NULL ,
Xtype nvarchar(40) NULL ,
Xvalue int NULL ,
Xmin int NULL ,
Xmax int NULL);
INSERT INTO TypeGroupTest (XTypeNo,Xtype,Xvalue,Xmin,Xmax)
VALUES
(1,'TypeA',10,100,200),
(2,'TypeA',20,100,200),
(3,'TypeA',30,100,200),
(4,'TypeB',410,200,300),
(5,'TypeB',420,200,300),
(6,'TypeB',430,200,300),
(7,'TypeC',310,300,400),
(8,'TypeC',320,300,400),
(9,'TypeC',330,300,400),
(12,'TypeD',410,400,500),
(13,'TypeD',490,400,500),
(14,'TypeD',550,400,500),
(15,'TypeE',510,500,600),
(16,'TypeE',590,500,600),
(17,'TypeE',450,500,600);
SELECT [Xtype],[XGROUP]
FROM (
SELECT [Xtype],[XGROUP],
COUNT(1) OVER(PARTITION BY [Xtype]) AS ROWNUM
FROM (
SELECT DISTINCT [Xtype],
CASE
WHEN [Xvalue] <= [Xmin]
THEN 'X1'
WHEN [Xvalue] >= [Xmax]
THEN 'X3'
WHEN [Xvalue] >= [Xmin] AND [Xvalue] <= [Xmax]
THEN 'X2'
ELSE 'X4'
END AS [XGROUP]
FROM [TypeGroupTest]) AS TempA
) AS TempB
WHERE ROWNUM=1
這麼無厘頭的需求都寫得出來
我先按三個
太利害了,感謝指教,太強了~第一次學到這語法~
想要進階請教問題,若不將資料排除,而想要將TypeD和E計算資料分佈比例來分群的話,有方法可以處理嗎?
例如:TypeD的大多數資料都座落在>max的範圍,所以將TypeD歸納為X3群組
TypeE的大多數資料都座落在> max AND < min 的範圍,所以將TypeE歸納為X2群組
進階的問題不易明瞭,請另外提問並詳細說明.
建議:
1.先在 https://dbfiddle.uk/?rdbms=sqlserver_2017 建好模擬的資料
2.貼上預期的結果
rogeryao測試有成功了,但不知怎麼的,在實際線上實作,我的ROWNUM都是從2開始計數,肉眼看明明PARTITION 完後只有一筆資料,為什麼都會計算出為2,不知道有藏什麼鬼。
接下來要處理進一步的問題,我再整理資料說明,謝謝~
原問題的邏輯有個疑點(我無法確定你的定義,所以沒改)
WHEN 內的 =
WHEN [Xvalue] >= [Xmin] AND [Xvalue] <= [Xmax]
可能要改成
WHEN [Xvalue] > [Xmin] AND [Xvalue] < [Xmax]
否則
[Xvalue] = [Xmin] 時是 X1 還是 X2
[Xvalue] = [Xmax] 時是 X3 還是 X2
嗯,這我有發現,原則上應該是去掉等號沒錯。
但我PARTITION出來真的都是從2開始計算,真的很神奇,一開始用WHERE ROWNUM=1一直找不到資料還以為自己那裡寫錯了,除錯了好久,攤開row data的ROWNUM才發現竟然都是2
進階的問題 :
1.依佔比方式分類
2.假設相同 Type 的 Group 佔比大於等於 67% 才視為有效
CREATE TABLE TypeGroupTest (
XTypeNo int NULL ,
Xtype nvarchar(40) NULL ,
Xvalue int NULL ,
Xmin int NULL ,
Xmax int NULL);
INSERT INTO TypeGroupTest (XTypeNo,Xtype,Xvalue,Xmin,Xmax)
VALUES
(1,'TypeA',10,100,200),
(2,'TypeA',20,100,200),
(3,'TypeA',30,100,200),
(4,'TypeB',410,200,300),
(5,'TypeB',420,200,300),
(6,'TypeB',430,200,300),
(7,'TypeC',310,300,400),
(8,'TypeC',320,300,400),
(9,'TypeC',330,300,400),
(12,'TypeD',410,400,500),
(13,'TypeD',490,400,500),
(14,'TypeD',550,400,500),
(15,'TypeE',510,500,600),
(16,'TypeE',590,500,600),
(17,'TypeE',450,500,600),
-- 額外加入
(18,'TypeD',505,400,500),
(19,'TypeD',506,400,500),
(20,'TypeF',590,500,600),
(21,'TypeF',450,500,600);
SELECT Xtype,XGROUP,PERCENTAGE
FROM (
SELECT Xtype,XGROUP,
ROUND(1.00*ROWNUM2/ROWNUM3,2)*100 AS PERCENTAGE,
DENSE_RANK() OVER(PARTITION BY Xtype ORDER BY Xtype,ROWNUM2 DESC) AS ROWID
FROM (
--# SELECT Xtype,XGROUP,ROWNUM2
--# FROM (
SELECT Xtype,XGROUP,
COUNT(1) OVER(PARTITION BY Xtype) AS ROWNUM1,
COUNT(1) AS ROWNUM2,
SUM(COUNT(1)) OVER(PARTITION BY Xtype) AS ROWNUM3
FROM (
SELECT Xtype,
CASE
WHEN Xvalue < Xmin
THEN 'X1'
WHEN Xvalue > Xmax
THEN 'X3'
WHEN Xvalue >= Xmin AND Xvalue <= Xmax
THEN 'X2'
ELSE 'X4'
END AS XGROUP
FROM TypeGroupTest) AS TempA
GROUP BY Xtype,XGROUP
--# ) AS TempB
--# WHERE ROWNUM1>1
) AS TempC
) AS TempD
WHERE ROWID=1
AND PERCENTAGE>=67
先不要急著做DISTINCT,先取得每筆資料的GROUP
然後用type
+GROUP
做條件GROUP BY,再用type
去GROUP BY取得COUNT()的數量
限制數量為一的資料
SELECT b.type
,b.GROUP
,COUNT(*) AS Num
FROM (SELECT a.type
,a.GROUP
FROM(SELECT type
,CASE .... END AS GROUP
FROM TypeGroupTest
) a GROUP BY a.type
,a.GROUP
) b GROUP BY b.type
HAVING Num
=1
謝謝指導~
最後一段的HAVING套用上去,顯示
Invalid column name 'Nums'.
不知道是什麼原因無法使用HAVING
SELECT b.Xtype,b.XGROUP,COUNT(*) AS Nums FROM (
SELECT a.Xtype,a.XGROUP FROM(
SELECT Xtype,
CASE
WHEN [Xvalue] <= [Xmin]
THEN 'X1'
WHEN [Xvalue] >= [Xmax]
THEN 'X3'
WHEN [Xvalue] >= [Xmin] AND [Xvalue] <= [Xmax]
THEN 'X2'
ELSE 'X4'
END AS [XGROUP] FROM TypeGroupTest) a
GROUP BY a.Xtype,a.XGROUP) b
GROUP BY b.Xtype,b.XGROUP
HAVING b.Nums = 1
因為Nums
這個名稱來源自
COUNT(*) AS Nums
所以要把b.拿掉
create table t0225 (
id int not null
, ytype text not null
, yvalue int not null
, ymin int not null
, ymax int not null);
insert into t0225
values
(1,'TypeA',10,100,200),
(2,'TypeA',20,100,200),
(3,'TypeA',30,100,200),
(4,'TypeB',410,200,300),
(5,'TypeB',420,200,300),
(6,'TypeB',430,200,300),
(7,'TypeC',310,300,400),
(8,'TypeC',320,300,400),
(9,'TypeC',330,300,400),
(12,'TypeD',410,400,500),
(13,'TypeD',490,400,500),
(14,'TypeD',550,400,500),
(15,'TypeE',510,500,600),
(16,'TypeE',590,500,600),
(17,'TypeE',450,500,600);
with t1 as (
select id
, ytype
, case
when yvalue < ymin
then 'G1'
when yvalue > ymax
then 'G3'
when yvalue >= ymin and yvalue <= ymax
then 'G2'
else 'G4'
end grp
from t0225
), t2 as (
select ytype
,count(*) as typecnt
from t0225
group by ytype
), t3 as (
select t2.ytype
, round(count(*) filter(where grp = 'G1')::numeric / t2.typecnt,2) as percg1
, round(count(*) filter(where grp = 'G2')::numeric / t2.typecnt,2) as percg2
, round(count(*) filter(where grp = 'G3')::numeric / t2.typecnt,2) as percg3
from t2
join t1
using (ytype)
group by t2.ytype, t2.typecnt
), t4 as (
select ytype
, greatest(percg1, percg2, percg3) gst
from t3
)
select t3.ytype
, concat(case when percg1 = gst then 'G1' end
, case when percg2 = gst then 'G2' end
, case when percg3 = gst then 'G3' end)
from t3
join t4
using(ytype)
order by 1
;
按照佔比方式分類.