iT邦幫忙

0

MySQL GROUP 問題

結構
+-------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| name | varchar(20) | YES | | NULL | |
| fk | varchar(20) | YES | | NULL | |
+-------+------------------+------+-----+---------+----------------+

資料
SELECT * FROM gg;
+----+------+------+
| id | name | fk |
+----+------+------+
| 1 | a | aaa |
| 2 | b | aaa |
| 3 | c | bbb |
+----+------+------+

我想要用 fk 做 GROUP 撈出最新的資料
我的語法為

SELECT * FROM gg GROUP BY fk HAVING MAX(id);
結果卻是
+----+------+------+
| id | name | fk |
+----+------+------+
| 1 | a | aaa |
| 3 | c | bbb |
+----+------+------+

我想要得是 id = 2 跟 id = 3 的結果
請問該怎麼下才對呢?

4
fillano
iT邦超人 1 級 ‧ 2011-05-11 17:59:00
最佳解答
<pre class="c" name="code">
SELECT * FROM (SELECT * FROM gg ORDER BY id DESC) AS A GROUP BY fk;

不用subquery好像有點困難...而且又有table scan...

這樣可能比較好,不過還是subquery:

<pre class="c" name="code">
SELECT * FROM gg WHERE id IN (SELECT MAX(id) FROM gg GROUP BY fk);
看更多先前的回應...收起先前的回應...
fillano iT邦超人 1 級 ‧ 2011-05-11 18:07:51 檢舉

或是用join:

<pre class="c" name="code">
SELECT a.id, b.name, b.fk FROM (SELECT MAX(id) FROM gg GROUP BY fk) AS a JOIN gg AS b on a.id=b.id;
chan15 iT邦新手 5 級 ‧ 2011-05-11 18:10:54 檢舉

fillano 大這樣是可以
不過那有辦法同時用這個結構取得 GROUP COUNT 的結果嗎?

fillano iT邦超人 1 級 ‧ 2011-05-11 18:13:56 檢舉

SELECT a.id, b.name, b.fk FROM (SELECT MAX(id) FROM gg GROUP BY fk) AS a JOIN gg AS b on a.id=b.id;

寫錯...應該是:

<pre class="c" name="code">
SELECT a.id, b.name, b.fk FROM (SELECT MAX(id) as id FROM gg GROUP BY fk) AS a JOIN gg AS b on a.id=b.id;

忘記指定alias

fillano iT邦超人 1 級 ‧ 2011-05-11 18:18:31 檢舉

要取group count的話,那你用join好了,不過join的效率會比where in 差一些...例如:

<pre class="c" name="code">
SELECT a.id, a.count, b.name, b.fk 
FROM (SELECT MAX(id) as id, count(id) as count FROM gg GROUP BY fk) AS a 
JOIN gg AS b on a.id=b.id;
a802216 iT邦新手 4 級 ‧ 2011-05-12 13:45:52 檢舉

如果Join的欄位是主鍵(叢集索引)其速度甚至比SubQuery更快,更別提Where in... Where in在大部份狀況來說,是效能最差又是最佔記憶體運算空間的

fillano iT邦超人 1 級 ‧ 2011-05-12 17:33:50 檢舉

瞭解,是我的認知有誤,不好意思。

不過從explain看起來,這個join需要做filesort跟temptable,所以懷疑他會比較慢,不過只用三筆看不出效率。

<pre class="c" name="code">
mysql> explain SELECT * FROM (SELECT * FROM gg ORDER BY id DESC) AS A GROUP BY fk;
+----+-------------+------------+------+---------------+------+---------+------+------+---------------------------------+
| id | select_type | table      | type | possible_keys | key  | key_len | ref  | rows | Extra                           |
+----+-------------+------------+------+---------------+------+---------+------+------+---------------------------------+
|  1 | PRIMARY     | <derived2> | ALL  | NULL          | NULL | NULL    | NULL |    3 | Using temporary; Using filesort | 
|  2 | DERIVED     | gg         | ALL  | NULL          | NULL | NULL    | NULL |    3 | Using filesort                  | 
+----+-------------+------------+------+---------------+------+---------+------+------+---------------------------------+
2 rows in set (0.00 sec)
fillano iT邦超人 1 級 ‧ 2011-05-12 17:35:03 檢舉
<pre class="c" name="code">
mysql> explain SELECT * FROM gg WHERE id IN (SELECT MAX(id) FROM gg GROUP BY fk);  
+----+--------------------+-------+-------+---------------+--------+---------+------+------+----------------+
| id | select_type        | table | type  | possible_keys | key    | key_len | ref  | rows | Extra          |
+----+--------------------+-------+-------+---------------+--------+---------+------+------+----------------+
|  1 | PRIMARY            | gg    | ALL   | NULL          | NULL   | NULL    | NULL |    3 | Using where    | 
|  2 | DEPENDENT SUBQUERY | gg    | index | NULL          | idx_fk | 22      | NULL |    3 | Using filesort | 
+----+--------------------+-------+-------+---------------+--------+---------+------+------+----------------+
2 rows in set (0.00 sec)
fillano iT邦超人 1 級 ‧ 2011-05-12 17:36:52 檢舉
<pre class="c" name="code">
mysql> explain SELECT a.id, b.name, b.fk FROM (SELECT MAX(id) as id FROM gg GROUP BY fk) AS a JOIN gg AS b on a.id=b.id;
+----+-------------+------------+--------+---------------+---------+---------+------+------+---------------------------------+
| id | select_type | table      | type   | possible_keys | key     | key_len | ref  | rows | Extra                           |
+----+-------------+------------+--------+---------------+---------+---------+------+------+---------------------------------+
|  1 | PRIMARY     | <derived2> | ALL    | NULL          | NULL    | NULL    | NULL |    2 |                                 | 
|  1 | PRIMARY     | b          | eq_ref | PRIMARY       | PRIMARY | 4       | a.id |    1 |                                 | 
|  2 | DERIVED     | gg         | ALL    | NULL          | NULL    | NULL    | NULL |    3 | Using temporary; Using filesort | 
fillano iT邦超人 1 級 ‧ 2011-05-12 21:29:23 檢舉

從explain的結果看起來,使用join會多一個步驟,所以應該是會比較慢。要避免使用temporary table以及filesort,必須加上一個索引,例如:

<pre class="c" name="code">
ALTER TABLE `gg` ADD INDEX `idx_fkid` (fk, id);

這樣查詢方式就會變成Using index。但是用join還是會比較慢。(多了一個步驟來產生結果)不過這是在mysql,如果是在sql server,應該就會有所不同。

fillano iT邦超人 1 級 ‧ 2011-05-12 23:45:12 檢舉

不過實際塞多一點資料做profiling,看起來還是join比較快...
如果是join,只要:

<pre class="c" name="code">
starting	0.000116
Opening tables	0.000017
System lock	0.000007
Table lock	0.000061
optimizing	0.000008
statistics	0.000029
preparing	0.000015
executing	0.000011
Sorting result	0.000004
Sending data	0.000120
init	0.000019
optimizing	0.000015
statistics	0.000029
preparing	0.000013
executing	0.000003
Sending data	0.000113
end	0.000006
end	0.000003
query end	0.000004
freeing items	0.000010
closing tables	0.000004
removing tmp table	0.000011
closing tables	0.000006
logging slow query	0.000004
cleaning up	0.000004

查詢花了0.0007秒

fillano iT邦超人 1 級 ‧ 2011-05-13 00:07:56 檢舉

如果用where in,會使用65個步驟,多出許多executing->sorting result->sending data的過程。查詢會花0.0009秒。如果資料更多,步驟及花費時間更多。

看起來join花的是常數時間,where in是O(n)(n=where in list的筆數)...不過這個從explain看不太出來

2
liurambo0911
iT邦高手 1 級 ‧ 2011-05-11 16:35:09

chan15提到:
SELECT * FROM gg GROUP BY fk HAVING MAX(id)

SELECT MAX(id),* FROM gg GROUP BY fk

chan15 iT邦新手 5 級 ‧ 2011-05-11 16:40:07 檢舉

這樣怎麼會對…

啊 不然哩 請指教 我只是用看的猜的

2
seasuwang
iT邦新手 5 級 ‧ 2011-05-14 22:55:57

參考一下這個吧 mysql pivot query
http://www.artfulsoftware.com/infotree/queries.php#78

文章裡面找一下關鍵字..Automate pivot table queries

fillano iT邦超人 1 級 ‧ 2011-05-16 10:09:02 檢舉

這本書不錯耶開心

我要發表回答

立即登入回答