Hive Study Notes


  • Text File
  • SequenceFile
  • RCFile
      CREATE TABLE ... STORED AS RCFILE
  • Avro Files
      CREATE TABLE kst
        PARTITIONED BY (ds string)
        ROW FORMAT SERDE
        'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
        STORED AS INPUTFORMAT
        'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
        OUTPUTFORMAT
        'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
        TBLPROPERTIES (
          'avro.schema.url'='http://schema_provider/kst.avsc');
  • ORC Files
      CREATE TABLE ... STORED AS ORC
  • Parquet
      CREATE TABLE parquet_test (
        id int,
        str string,
        mp MAP<STRING,STRING>,
        lst ARRAY<STRING>,
        strct STRUCT<A:STRING,B:STRING>)
      PARTITIONED BY (part string)
      STORED AS PARQUET;
  • Custom INPUTFORMAT and OUTPUTFORMAT


STORED AS TEXTFILE: Stored as plain text files. TEXTFILE is the default file format unless the configuration parameter hive.default.fileformat is set to a different value.

Use the DELIMITED clause to read delimited files.

Enable escaping for the delimiter characters with the ESCAPED BY clause (for example, ESCAPED BY '\'). Escaping is needed when the data itself can contain these delimiter characters.

A custom NULL format can also be specified using the 'NULL DEFINED AS' clause (default is '\N').
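
The three clauses above can be combined in a single table definition. The following is a minimal sketch, not taken from the original notes: the table name page_views_text and its columns are hypothetical, and the comma delimiter and the 'NULL' marker are arbitrary choices made for illustration.

```sql
-- Hypothetical table; names and delimiters are illustrative assumptions.
CREATE TABLE page_views_text (
  user_id  BIGINT,
  url      STRING,
  referrer STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','   -- read comma-delimited fields
  ESCAPED BY '\\'            -- a backslash escapes a literal comma in the data
  LINES TERMINATED BY '\n'
  NULL DEFINED AS 'NULL'     -- override the default NULL marker '\N'
STORED AS TEXTFILE;
```

With this definition, a field value such as `a\,b` would be expected to load as the single string `a,b` rather than being split into two columns.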

STORED AS SEQUENCEFILE: Stored as a compressed Sequence File.
STORED AS ORC: Stored in the ORC file format. Supports ACID transactions and the cost-based optimizer (CBO). Stores column-level metadata.
STORED AS PARQUET: Stored in the Parquet columnar storage format, in Hive 0.13.0 and later. In Hive 0.10, 0.11, or 0.12, use the ROW FORMAT SERDE ... STORED AS INPUTFORMAT ... OUTPUTFORMAT syntax instead.
STORED AS AVRO: Stored in the Avro format, in Hive 0.14.0 and later (see Avro SerDe).
STORED AS RCFILE: Stored in the Record Columnar File format.
STORED BY: Stored by a non-native table format. Use it to create or link to a non-native table, for example a table backed by HBase, Druid, or Accumulo. See StorageHandlers for more information on this option.
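
As an illustration of STORED BY, the sketch below declares a Hive table backed by HBase. It assumes the Hive HBase handler is available on the cluster; the table name, the column family cf1, and the HBase table name xyz are placeholders, not values from these notes.

```sql
-- Sketch only: assumes the Hive HBase storage handler is installed.
-- hbase_table_1, cf1:val and xyz are placeholder names.
CREATE TABLE hbase_table_1 (key INT, value STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")  -- Hive column to HBase column mapping
TBLPROPERTIES ("hbase.table.name" = "xyz");                      -- name of the underlying HBase table
```

Here `:key` maps the first Hive column to the HBase row key, and `cf1:val` maps the second column to qualifier `val` in column family `cf1`.
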
INPUTFORMAT and OUTPUTFORMAT: In the file_format clause, give the names of the corresponding InputFormat and OutputFormat classes as string literals, for example 'org.apache.hadoop.hive.contrib.fileformat.base64.Base64TextInputFormat'.

For LZO compression, the values to use are
'INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"'
(see LZO Compression).
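
Put into a full statement, the LZO values above might look like the following sketch. The table name lzo_logs and its single column are hypothetical, and the statement assumes the hadoop-lzo library that provides DeprecatedLzoTextInputFormat is installed on the cluster.

```sql
-- Sketch only: lzo_logs is a hypothetical table, and the hadoop-lzo
-- classes must be on the cluster classpath for this to work.
CREATE TABLE lzo_logs (line STRING)
STORED AS
  INPUTFORMAT  'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
```

The same STORED AS INPUTFORMAT ... OUTPUTFORMAT shape applies to any custom format pair; only the class names change.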



