Hive面试题:hive如何实现in和not in,
In的实现:
Hive中的in的实现方式很多,简单说几种:
Not in的实现:
Left outer join+is null
举例说明:
有两个表如下:
skim表
| userID | itemID | time |
| 001 | 342 | 2015-05-08 |
| 002 | 382 | 2015-05-09 |
| 002 | 458 | 2015-05-09 |
| 004 | 325 | 2015-05-09 |
| userID | itemID | time |
| 001 | 342 | 2015-05-07 |
| 002 | 382 | 2015-05-08 |
| 003 | 458 | 2015-05-09 |
| 004 | 325 | 2015-05-09 |
IN实现:
如果要查询在skim表中并且也在buy表中的信息,需要用in查询,hive sql如下:
select skim.userId , skim.itemId from skim left outer join buy
on skim.userId = buy .userId and skim.itemId = buy .itemId where buy .userId is not null;
或
select skim.userId , skim.itemId from skim left semi join buy
on skim.userId = buy .userId and skim.itemId = buy .itemId;
或
select skim.userId , skim.itemId from skim join buy
on skim.userId = buy .userId and skim.itemId = buy .itemId;
结果如下:
| userID | itemID |
| 001 | 342 |
| 002 | 382 |
| 003 | 458 |
NOT IN实现:
如果要查询在skim表中并且不在buy表中的信息,需要用not in查询,hive sql如下:
select skim.userId, skim.itemId from skim left outer join buy
on skim.userId=buy .userId and skim.itemId=buy .itemId where buy .userId is null;
结果如下:
| userID | itemID |
| 004 | 468 |
本站文章为和通数据库网友分享或者投稿,欢迎任何形式的转载,但请务必注明出处.
同时文章内容如有侵犯了您的权益,请联系QQ:970679559,我们会在尽快处理。