[MySQL] Top N rows per group with ordering, and generating automatic row numbers (like LIMIT after GROUP BY)
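Preface:
A colleague's business scenario: group rows by cid and author, order each group by id descending, and take the first 2 records of every group.
In Oracle this can be done with row_number() OVER (PARTITION BY cid,author ORDER BY id DESC): rows are partitioned by cid and author, ordered by id within each partition, and the function returns each row's position in that ordering (consecutive and unique within the partition). MySQL before version 8.0 has no such window function, so the same result has to be built by hand with more complex SQL.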
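For comparison, on MySQL 8.0 or later (an assumption; the rest of this article targets older versions) the same window function is available, so the Oracle-style expression can be used almost directly. A minimal sketch against the test table created in section 1 below; the alias names ranked and rn are illustrative only:

```sql
-- Assumes MySQL 8.0+ (window functions) and the test table from section 1.
SELECT id, cid, author
FROM (
    SELECT id, cid, author,
           -- Position of the row inside its (cid, author) group, newest id first.
           ROW_NUMBER() OVER (PARTITION BY cid, author ORDER BY id DESC) AS rn
    FROM test
) ranked
WHERE rn <= 2            -- keep the first 2 rows of each group
ORDER BY cid, author, id DESC;
```

Changing the constant in rn <= 2 gives the top N rows per group.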
1, Load the test data
USE csdn;
DROP TABLE IF EXISTS test;
CREATE TABLE test (
    id INT PRIMARY KEY,
    cid INT,
    author VARCHAR(30)
) ENGINE=INNODB;
INSERT INTO test VALUES
(1,1,'test1'),
(2,1,'test1'),
(3,1,'test2'),
(4,1,'test2'),
(5,1,'test2'),
(6,1,'test3'),
(7,1,'test3'),
(8,1,'test3'),
(9,1,'test3'),
(10,2,'test11'),
(11,2,'test11'),
(12,2,'test22'),
(13,2,'test22'),
(14,2,'test22'),
(15,2,'test33'),
(16,2,'test33'),
(17,2,'test33'),
(18,2,'test33');
INSERT INTO test VALUES (200,200,'200test_nagios');
2, The original, relatively inefficient subquery approach
The SQL is as follows:
SELECT * FROM test a
WHERE
    N > (
        SELECT COUNT(*)
        FROM test b
        WHERE a.cid=b.cid AND a.`author`=b.`author` AND a.id<b.id
    ) ORDER BY cid,author,id DESC;
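Replace N with the number you want. For example, 2 returns the first 2 records of each group, because a row is kept only when fewer than 2 rows in its group have a larger id: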
mysql> SELECT * FROM test a
    -> WHERE
    -> 2>(
    -> SELECT COUNT(*)
    -> FROM test b
    -> WHERE a.cid=b.cid AND a.`author`=b.`author` AND a.id<b.id
    -> )ORDER BY cid,author,id DESC;
+-----+------+----------------+
| id  | cid  | author         |
+-----+------+----------------+
|   2 |    1 | test1          |
|   1 |    1 | test1          |
|   5 |    1 | test2          |
|   4 |    1 | test2          |
|   9 |    1 | test3          |
|   8 |    1 | test3          |
|  11 |    2 | test11         |
|  10 |    2 | test11         |
|  14 |    2 | test22         |
|  13 |    2 | test22         |
|  18 |    2 | test33         |
|  17 |    2 | test33         |
| 200 |  200 | 200test_nagios |
+-----+------+----------------+
13 rows in set (0.00 sec)
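The correlated subquery above is typically evaluated once per outer row, which is why it scales poorly. As a sketch of one other way to express the same condition, the query below uses a self LEFT JOIN with GROUP BY and HAVING. The index name idx_test_cid_author is hypothetical, and whether the index or the rewrite actually helps depends on the data and the MySQL version, so it is worth checking with EXPLAIN:

```sql
-- Hypothetical index so that the rows of one (cid, author) group can be located quickly.
CREATE INDEX idx_test_cid_author ON test (cid, author);

-- For every row a, count the rows b in the same group that have a larger id.
-- A row belongs to the top 2 of its group when fewer than 2 such rows exist.
SELECT a.id, a.cid, a.author
FROM test a
LEFT JOIN test b
       ON a.cid = b.cid AND a.author = b.author AND a.id < b.id
GROUP BY a.id, a.cid, a.author
HAVING COUNT(b.id) < 2
ORDER BY a.cid, a.author, a.id DESC;
```

As with the subquery form, replacing the constant 2 changes how many rows are kept per group.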
3, Implementation with user-defined variables
First build the sequence number by introducing a variable @row that acts as the row number:
SET @row=0;SET @mid='';SELECT cid, author, @row:=@row+1 rownum FROM test ORDER BY cid, author LIMIT 10;
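With the sequence number in place, add a second variable @mid to handle grouping. The key is CASE WHEN @mid = author THEN @row:=@row+1 ELSE @row:=1 END rownum: the counter restarts at 1 whenever author changes and keeps incrementing until that group's rows have all been traversed. Note that only author is compared, so this assumes the same author value never appears in two adjacent cid groups, which holds for the test data above.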
SET @row=0;SET @mid='';SELECT cid, author,CASE WHEN @mid = author THEN @row:=@row+1 ELSE @row:=1 END rownum, @mid:=author FROM test ORDER BY cid,author DESC LIMIT 20;
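Because the counter above resets only when author changes, two neighbouring cid groups that share an author value would be numbered as one group. A minimal sketch of a variant that tracks both columns follows; the variable names @prev_cid and @prev_author are introduced here for illustration. Like the original, it relies on rows being processed in ORDER BY order, which MySQL does not guarantee, and assigning user variables inside expressions is deprecated in MySQL 8.0, so treat it as a workaround for older versions only:

```sql
SET @row = 0;
SET @prev_cid = NULL;
SET @prev_author = NULL;

SELECT id, cid, author,
       -- Continue counting only while both grouping columns match the previous row;
       -- the first row compares against NULL and therefore starts at 1.
       @row := IF(@prev_cid = cid AND @prev_author = author, @row + 1, 1) AS rownum,
       -- Remember the current row's group for the next comparison.
       @prev_cid := cid AS prev_cid,
       @prev_author := author AS prev_author
FROM test
ORDER BY cid, author, id DESC;
```

The extra prev_cid and prev_author columns exist only to carry the assignments and can be ignored by the caller.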
Finally, wrap this in an INNER JOIN and filter on rownum to get the target rows.
SET @row=0;
SET @mid='';
SELECT a.*,b.rownum FROM test a
INNER JOIN (
SELECT cid, author, id, CASE WHEN @mid = author THEN @row:=@row+1 ELSE @row:=1 END rownum, @mid:=author MID
FROM test
ORDER BY cid,author,id DESC
) b ON b.author=a.author AND b.cid=a.cid AND b.id=a.id WHERE b.rownum<3;
The result is as follows:
mysql> SET @row=0;
Query OK, 0 rows affected (0.00 sec)
mysql> SET @mid='';
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT a.*,b.rownum FROM test a
    -> INNER JOIN (
    -> SELECT cid, author, id, CASE WHEN @mid = author THEN @row:=@row+1 ELSE @row:=1 END rownum, @mid:=author MID
    -> FROM test
    -> ORDER BY cid,author,id DESC
    -> ) b ON b.author=a.author AND b.cid=a.cid AND b.id=a.id WHERE b.rownum<3;
+-----+------+----------------+--------+
| id  | cid  | author         | rownum |
+-----+------+----------------+--------+
|   2 |    1 | test1          |      1 |
|   1 |    1 | test1          |      2 |
|   5 |    1 | test2          |      1 |
|   4 |    1 | test2          |      2 |
|   9 |    1 | test3          |      1 |
|   8 |    1 | test3          |      2 |
|  11 |    2 | test11         |      1 |
|  10 |    2 | test11         |      2 |
|  14 |    2 | test22         |      1 |
|  13 |    2 | test22         |      2 |
|  18 |    2 | test33         |      1 |
|  17 |    2 | test33         |      2 |
| 200 |  200 | 200test_nagios |      1 |
+-----+------+----------------+--------+
13 rows in set (0.01 sec)
Reference articles:
http://blog.csdn.net/mchdba/article/details/22163223
http://blog.csdn.net/ylqmf/article/details/39005949
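Conceptually, the goal is what one might try to write as select id,channel_id,time from table where group by channel_id order by time desc limit 2, but that form does not work as intended, because LIMIT applies to the whole result set rather than to each group.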
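# MySQL before 8.0 does not support the rank() over style ranking and analytic queries found in other, more complex databases
# It can only be emulated with subqueries and row-by-row logic, which is acceptable for small to medium data volumes
-- rank implementation
select inline_rownum, aa, cc, amt, orderid FROM
(
select
# logic_cal only drives the counter: for each row, the current cc is compared with @last_cc; if they differ, cc is assigned to @last_cc and the counter is reset (@num := 1), otherwise the counter is incremented (@num := @num + 1)
(case when cc <> @last_cc then concat(@last_cc := cc, @num := 1 ) else concat(@last_cc, @num := @num + 1) end ) logic_cal
, @num as inline_rownum
, aa, cc, amt, orderid
from tb_rank,
( select @last_cc := '') t, # initialize @last_cc to ''; if the column being compared (the one the counter is based on) is an int, initialize it to 0, and for varchar initialize it to ''
( select @num := 0 ) t2 # initialize @num to 0
order by cc, orderid asc # the sort order affects how @num is generated, because logic_cal is evaluated row by row
) t
where inline_rownum <= floor(amt*0.8) # limit the number of rows; use a constant or another expression
order by cc,orderid asc
;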
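For readers on MySQL 8.0 or later, the counter logic above can be sketched with a window function instead of user variables. The definition of tb_rank is not given in the source, so the column types below are assumptions made only so that the example is self-contained:

```sql
-- Hypothetical definition; the real tb_rank schema is not shown in the article.
CREATE TABLE IF NOT EXISTS tb_rank (
    orderid INT PRIMARY KEY,
    aa      VARCHAR(30),
    cc      VARCHAR(30),
    amt     DECIMAL(10,2)
);

-- Assumes MySQL 8.0+: number the rows of each cc group by orderid,
-- then keep rows whose number does not exceed floor(amt * 0.8).
SELECT inline_rownum, aa, cc, amt, orderid
FROM (
    SELECT ROW_NUMBER() OVER (PARTITION BY cc ORDER BY orderid ASC) AS inline_rownum,
           aa, cc, amt, orderid
    FROM tb_rank
) t
WHERE inline_rownum <= FLOOR(amt * 0.8)
ORDER BY cc, orderid ASC;
```

This keeps the same filter and ordering as the variable-based query while leaving the numbering to the server, so it does not depend on the order in which rows happen to be evaluated.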