結構
+-------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| name | varchar(20) | YES | | NULL | |
| fk | varchar(20) | YES | | NULL | |
+-------+------------------+------+-----+---------+----------------+
資料
SELECT * FROM gg;
+----+------+------+
| id | name | fk |
+----+------+------+
| 1 | a | aaa |
| 2 | b | aaa |
| 3 | c | bbb |
+----+------+------+
我想要用 fk 做 GROUP 撈出最新的資料
我的語法為
SELECT * FROM gg GROUP BY fk HAVING MAX(id);
結果卻是
+----+------+------+
| id | name | fk |
+----+------+------+
| 1 | a | aaa |
| 3 | c | bbb |
+----+------+------+
我想要得是 id = 2 跟 id = 3 的結果
請問該怎麼下才對呢?
<pre class="c" name="code">
SELECT * FROM (SELECT * FROM gg ORDER BY id DESC) AS A GROUP BY fk;
不用subquery好像有點困難...而且又有table scan...
這樣可能比較好,不過還是subquery:
<pre class="c" name="code">
SELECT * FROM gg WHERE id IN (SELECT MAX(id) FROM gg GROUP BY fk);
或是用join:
<pre class="c" name="code">
SELECT a.id, b.name, b.fk FROM (SELECT MAX(id) FROM gg GROUP BY fk) AS a JOIN gg AS b on a.id=b.id;
fillano 大這樣是可以
不過那有辦法同時用這個結構取得 GROUP COUNT 的結果嗎?
SELECT a.id, b.name, b.fk FROM (SELECT MAX(id) FROM gg GROUP BY fk) AS a JOIN gg AS b on a.id=b.id;
寫錯...應該是:
<pre class="c" name="code">
SELECT a.id, b.name, b.fk FROM (SELECT MAX(id) as id FROM gg GROUP BY fk) AS a JOIN gg AS b on a.id=b.id;
忘記指定alias
要取group count的話,那你用join好了,不過join的效率會比where in 差一些...例如:
<pre class="c" name="code">
SELECT a.id, a.count, b.name, b.fk
FROM (SELECT MAX(id) as id, count(id) as count FROM gg GROUP BY fk) AS a
JOIN gg AS b on a.id=b.id;
如果Join的欄位是主鍵(叢集索引)其速度甚至比SubQuery更快,更別提Where in... Where in在大部份狀況來說,是效能最差又是最佔記憶體運算空間的
瞭解,是我的認知有誤,不好意思。
不過從explain看起來,這個join需要做filesort跟temptable,所以懷疑他會比較慢,不過只用三筆看不出效率。
<pre class="c" name="code">
mysql> explain SELECT * FROM (SELECT * FROM gg ORDER BY id DESC) AS A GROUP BY fk;
+----+-------------+------------+------+---------------+------+---------+------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+---------------+------+---------+------+------+---------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 3 | Using temporary; Using filesort |
| 2 | DERIVED | gg | ALL | NULL | NULL | NULL | NULL | 3 | Using filesort |
+----+-------------+------------+------+---------------+------+---------+------+------+---------------------------------+
2 rows in set (0.00 sec)
<pre class="c" name="code">
mysql> explain SELECT * FROM gg WHERE id IN (SELECT MAX(id) FROM gg GROUP BY fk);
+----+--------------------+-------+-------+---------------+--------+---------+------+------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+-------+---------------+--------+---------+------+------+----------------+
| 1 | PRIMARY | gg | ALL | NULL | NULL | NULL | NULL | 3 | Using where |
| 2 | DEPENDENT SUBQUERY | gg | index | NULL | idx_fk | 22 | NULL | 3 | Using filesort |
+----+--------------------+-------+-------+---------------+--------+---------+------+------+----------------+
2 rows in set (0.00 sec)
<pre class="c" name="code">
mysql> explain SELECT a.id, b.name, b.fk FROM (SELECT MAX(id) as id FROM gg GROUP BY fk) AS a JOIN gg AS b on a.id=b.id;
+----+-------------+------------+--------+---------------+---------+---------+------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------+---------+---------+------+------+---------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 2 | |
| 1 | PRIMARY | b | eq_ref | PRIMARY | PRIMARY | 4 | a.id | 1 | |
| 2 | DERIVED | gg | ALL | NULL | NULL | NULL | NULL | 3 | Using temporary; Using filesort |
從explain的結果看起來,使用join會多一個步驟,所以應該是會比較慢。要避免使用temporary table以及filesort,必須加上一個索引,例如:
<pre class="c" name="code">
ALTER TABLE `gg` ADD INDEX `idx_fkid` (fk, id);
這樣查詢方式就會變成Using index。但是用join還是會比較慢。(多了一個步驟來產生結果)不過這是在mysql,如果是在sql server,應該就會有所不同。
不過實際塞多一點資料做profiling,看起來還是join比較快...
如果是join,只要:
<pre class="c" name="code">
starting 0.000116
Opening tables 0.000017
System lock 0.000007
Table lock 0.000061
optimizing 0.000008
statistics 0.000029
preparing 0.000015
executing 0.000011
Sorting result 0.000004
Sending data 0.000120
init 0.000019
optimizing 0.000015
statistics 0.000029
preparing 0.000013
executing 0.000003
Sending data 0.000113
end 0.000006
end 0.000003
query end 0.000004
freeing items 0.000010
closing tables 0.000004
removing tmp table 0.000011
closing tables 0.000006
logging slow query 0.000004
cleaning up 0.000004
查詢花了0.0007秒
如果用where in,會使用65個步驟,多出許多executing->sorting result->sending data的過程。查詢會花0.0009秒。如果資料更多,步驟及花費時間更多。
看起來join花的是常數時間,where in是O(n)(n=where in list的筆數)...不過這個從explain看不太出來
chan15提到:
SELECT * FROM gg GROUP BY fk HAVING MAX(id)
SELECT MAX(id),* FROM gg GROUP BY fk
參考一下這個吧 mysql pivot query
http://www.artfulsoftware.com/infotree/queries.php#78
文章裡面找一下關鍵字..Automate pivot table queries