Choosing an HBase Data Compression Method

July 23, 2015, 17:53:59 · Reads: 2593

Official documentation: http://hbase.apache.org/book.html#_which_compressor_or_data_block_encoder_to_use


The compression or codec type to use depends on the characteristics of your data. Choosing the wrong type could cause your data to take more space rather than less, and can have performance implications.
In general, you need to weigh your options between smaller size and faster compression/decompression. The following are some general guidelines, expanded from a discussion at "Documenting Guidance on compression and codecs."

  • If you have long keys (compared to the values) or many columns, use a prefix encoder. FAST_DIFF is recommended, as more testing is needed for Prefix Tree encoding.

  • If the values are large (and not precompressed, such as images), use a data block compressor.

  • Use GZIP for cold data, which is accessed infrequently. GZIP compression uses more CPU resources than Snappy or LZO, but provides a higher compression ratio.

  • Use Snappy or LZO for hot data, which is accessed frequently. Snappy and LZO use fewer CPU resources than GZIP, but do not provide as high of a compression ratio.

  • In most cases, enabling Snappy or LZO by default is a good choice, because they have a low performance overhead and provide space savings.

  • Before Snappy was made available by Google in 2011, LZO was the default. Snappy has qualities similar to LZO's but has been shown to perform better, which is why it became the usual default choice.
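The guidelines above translate directly into column-family settings. The following is a minimal sketch in the HBase shell; the table name 'my_table' and column family 'cf' are hypothetical placeholders, and SNAPPY requires the native library to be installed on the RegionServers:

```
# Hot data: Snappy block compression plus FAST_DIFF prefix encoding
# (table/CF names 'my_table'/'cf' are for illustration only)
create 'my_table', {NAME => 'cf', COMPRESSION => 'SNAPPY', DATA_BLOCK_ENCODING => 'FAST_DIFF'}

# Cold data: switch an existing column family to GZIP (named 'GZ' in HBase)
alter 'my_table', {NAME => 'cf', COMPRESSION => 'GZ'}
```

Note that changing compression via `alter` only affects newly written HFiles; existing data is rewritten with the new codec as compactions run.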


Configuring Snappy compression for HBase: http://blog.csdn.net/maomaosi2009/article/details/47019913
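The CPU-versus-ratio tradeoff between GZIP and Snappy/LZO described above can be sketched with Python's standard-library `zlib` (this is an illustration, not HBase-specific: DEFLATE level 9 stands in for a GZIP-like "cold data" codec, level 1 for a fast, lower-ratio "hot data" codec):

```python
import zlib

# Repetitive, KeyValue-like payload, loosely resembling HBase cell data.
data = b"row-key:cf:qualifier:value " * 5000

fast = zlib.compress(data, level=1)   # cheap on CPU, lower compression ratio
small = zlib.compress(data, level=9)  # more CPU work, higher compression ratio

# The high-effort codec never produces a larger output than the fast one.
assert len(small) <= len(fast) < len(data)
print(len(data), len(fast), len(small))
```

On real workloads the interesting measurement is compression *speed* as well as size, which is why frequently read hot data favors the fast codec even at a worse ratio.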
