欢迎投稿

今日深度:

Hive之复合类型,

Hive之复合类型,


hive arraymapstruct使用

hive提供了复合数据类型:
Structs
structs内部的数据可以通过DOT.)来存取,例如,表中一列c的类型为STRUCT{a INT; b INT},我们可以通过c.a来访问域a
Maps
K-V对):访问指定域可以通过["指定域名称"]进行,例如,一个Map M包含了一个group-gidkv对,gid的值可以通过M['group']来获取
Arrays
array中的数据为相同类型,例如,假如array A中元素['a','b','c'],则A[1]的值为'b'

Struct
使用

建表:

·        hive>create table student_test(id INT, infostruct<name:STRING, age:INT>

·            > ROW FORMAT DELIMITED FIELDS TERMINATEDBY ','                        

·            > COLLECTION ITEMS TERMINATED BY ':';                                  

·        OK 

·        Timetaken: 0.446 seconds 

'FIELDS TERMINATED BY' :字段与字段之间的分隔符
''COLLECTION ITEMS TERMINATED BY'
:一个字段各个item的分隔符
导入数据:

·        $ cattest5.txt  

·        1,zhou:30 

·        2,yan:30 

·        3,chen:20 

·        4,li:80 

·        hive>LOAD DATA LOCAL INPATH '/home/work/data/test5.txt' INTO TABLEstudent_test; 

·        Copyingdata from file:/home/work/data/test5.txt 

·        Copyingfile: file:/home/work/data/test5.txt 

·        Loadingdata to table default.student_test 

·        OK 

·        Timetaken: 0.35 seconds 

查询:

·        hive>select info.age from student_test; 

·        TotalMapReduce jobs = 1 

·        ...... 

·        TotalMapReduce CPU Time Spent: 490 msec 

·        OK 

·        30 

·        30 

·        20 

·        80 

·        Timetaken: 21.677 seconds 


Array
使用
建表:

·        hive>create table class_test(name string, student_id_list array<INT>

·            > ROW FORMAT DELIMITED                                             

·            > FIELDS TERMINATED BY ','                                         

·            > COLLECTION ITEMS TERMINATED BY':';                              

·        OK 

·        Timetaken: 0.099 seconds 

导入数据:

·        $ cattest6.txt  

·        034,1:2:3:4 

·        035,5:6 

·        036,7:8:9:10 

·        hive>  LOAD DATA LOCAL INPATH'/home/work/data/test6.txt' INTO TABLE class_test ; 

·        Copyingdata from file:/home/work/data/test6.txt 

·        Copyingfile: file:/home/work/data/test6.txt 

·        Loadingdata to table default.class_test 

·        OK 

·        Timetaken: 0.198 seconds 

查询:

·        hive>select student_id_list[3] from class_test; 

·        TotalMapReduce jobs = 1 

·        ...... 

·        TotalMapReduce CPU Time Spent: 480 msec 

·        OK 

·        

·        NULL 

·        10 

·        Timetaken: 21.574 seconds 


Map
使用
建表:

·        hive>create table employee(id string, perfmap<string, int>)      

·            > ROW FORMAT DELIMITED                                         

·            > FIELDS TERMINATED BY '\t'                               

·            > COLLECTION ITEMS TERMINATED BY','                      

·            > MAP KEYS TERMINATED BY ':';                                   

·        OK 

·        Timetaken: 0.144 seconds 

MAPKEYS TERMINATED BYkeyvalue分隔符

导入数据:

·        $ cattest7.txt  

·        1       job:80,team:60,person:70 

·        2       job:60,team:80 

·        3       job:90,team:70,person:100 

·        hive>  LOAD DATA LOCAL INPATH'/home/work/data/test7.txt' INTO TABLE employee; 

查询:

·        hive>select perf['person'] from employee; 

·        TotalMapReduce jobs = 1 

·        ...... 

·        TotalMapReduce CPU Time Spent: 460 msec 

·        OK 

·        70 

·        NULL 

·        100 

·        Timetaken: 20.902 seconds 

·        hive>select perf['person'] from employee where perf['person'] is not null;    

·        TotalMapReduce jobs = 1 

·        ....... 

·        TotalMapReduce CPU Time Spent: 610 msec 

·        OK 

·        70 

·        100 

·        Timetaken: 21.989 seconds 

·        hive>  

·        <span><span> 

·        </span></span>  

www.htsjk.Com true http://www.htsjk.com/hive/40350.html NewsArticle Hive之复合类型, hive array 、 map 、 struct 使用 hive 提供了复合数据类型: Structs : structs 内部的数据可以通过 DOT ( . )来存取,例如,表中一列 c 的类型为 STRUCT{a INT; b INT} ,我们可以...
相关文章
    暂无相关文章
评论暂时关闭