[HBase] HBase Filter



These are brief notes taken while studying HBase filters; more content will be added over time. The main reference is "HBase: The Definitive Guide".


Comparison Filters

Filters in this category derive from CompareFilter, whose constructor is:

CompareFilter(CompareOp valueCompareOp,WritableByteArrayComparable valueComparator)

RowFilter

Filters data by row key. Example:

		HTable table=new HTable(cfg,"t1");
		
		table.setAutoFlush(false);
		
		Scan scan =new Scan();
		
		//RowFilter: row key less than or equal to "row09"
		Filter filter1=new RowFilter(
				CompareFilter.CompareOp.LESS_OR_EQUAL,
				new BinaryComparator(Bytes.toBytes("row09")));
		//Regular expression: row key ends with "9"
		Filter filter2=new RowFilter(
				CompareFilter.CompareOp.EQUAL,
				new RegexStringComparator(".*9$"));
				
		//Substring match: row key contains "45" (anywhere, not only at the end)
		Filter filter3=new RowFilter(
				CompareFilter.CompareOp.EQUAL,
				new SubstringComparator("45"));
		
		Filter filter =filter3;
 
		scan.setFilter(filter);
		

		//Run the scan and print the results
		ResultScanner scanner=table.getScanner(scan);

		for(Result res: scanner){
			System.out.println(res.toString());
		}

		scanner.close();
		table.close();
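
BinaryComparator orders keys lexicographically, byte by byte, which is easy to misjudge with numeric-looking row keys. The following is a plain-Java sketch with no HBase dependency. The class RowKeyOrderSketch and its methods are hypothetical names, and compareBytes is only assumed to mirror the unsigned byte-wise ordering applied to row keys.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HBase: imitates an unsigned,
// byte-wise lexicographic comparison of two keys.
class RowKeyOrderSketch {

	static int compareBytes(byte[] a, byte[] b) {
		int n = Math.min(a.length, b.length);
		for (int i = 0; i < n; i++) {
			int x = a[i] & 0xff; // treat bytes as unsigned
			int y = b[i] & 0xff;
			if (x != y) {
				return x - y;
			}
		}
		// All shared bytes are equal: the shorter key sorts first
		return a.length - b.length;
	}

	// Same decision a LESS_OR_EQUAL comparison would make under this ordering
	static boolean lessOrEqual(String rowKey, String bound) {
		return compareBytes(rowKey.getBytes(StandardCharsets.UTF_8),
				bound.getBytes(StandardCharsets.UTF_8)) <= 0;
	}

	public static void main(String[] args) {
		System.out.println(lessOrEqual("row09", "row09"));  // true
		System.out.println(lessOrEqual("row089", "row09")); // true: '8' < '9'
		System.out.println(lessOrEqual("row1", "row09"));   // false: '1' > '0'
	}
}
```

Under this ordering "row1" sorts after "row09", so zero-padded row keys are needed if numeric order matters.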

FamilyFilter

Filters data by column family. Example:

		//FamilyFilter: family less than "f3"
		Filter filter1=new FamilyFilter(CompareFilter.CompareOp.LESS,
				new BinaryComparator(Bytes.toBytes("f3")));
		
		//FamilyFilter: family equal to "f2"
		Filter filter2=new FamilyFilter(CompareFilter.CompareOp.EQUAL,
				new BinaryComparator(Bytes.toBytes("f2")));
 
		//Apply to a scan
		Scan scan =new Scan();
		scan.setFilter(filter1);
		ResultScanner scanner=table.getScanner(scan);
		for(Result res: scanner){
			System.out.println(res.toString());
		}
		scanner.close();
		//Apply to a get
		Get get =new Get(Bytes.toBytes("row098"));
		get.setFilter(filter2);
		Result result=table.get(get);
		
		System.out.println(result.toString());

QualifierFilter

Filters data by column qualifier. Example:

		//QualifierFilter: qualifier less than "c2"
		Filter filter=new QualifierFilter(CompareFilter.CompareOp.LESS,
				new BinaryComparator(Bytes.toBytes("c2")));

ValueFilter

Filters data by cell value. Example:

		//ValueFilter: value less than "value01-9-3"
		Filter filter=new ValueFilter(CompareFilter.CompareOp.LESS,
				new BinaryComparator(Bytes.toBytes("value01-9-3")));

DependentColumnFilter

It has the following constructors:

DependentColumnFilter(byte[] family,byte[] qualifier)
DependentColumnFilter(byte[] family,byte[] qualifier,boolean dropDependentColumn)
DependentColumnFilter(byte[] family,byte[] qualifier,boolean dropDependentColumn,
	CompareOp valueCompareOp,WritableByteArrayComparable valueComparator)

Example:

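import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.BinaryPrefixComparator;
import org.apache.hadoop.hbase.filter.CompareFilter;
import org.apache.hadoop.hbase.filter.DependentColumnFilter;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.WritableByteArrayComparable;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseTest2 {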

	public static void filter(boolean drop,
			CompareFilter.CompareOp operator,
			WritableByteArrayComparable comparator) throws IOException{

		//create configuration,table
		Configuration cfg=HBaseConfiguration.create();	
		HTable table=new HTable(cfg,"t2");
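		Filter filter;
		
		if(comparator !=null){
			//"f1:c1" is the reference column: only cells whose timestamp matches a cell of that column are returned,
			//and the comparator is checked against the value of the reference column
			//drop == true: the reference column itself is left out of the result
			//drop == false: the reference column is included in the result
			filter=new DependentColumnFilter(Bytes.toBytes("f1"),
					Bytes.toBytes("c1"),drop,operator,comparator);
		}else{
			filter=new DependentColumnFilter(Bytes.toBytes("f1"),
					Bytes.toBytes("c1"),drop);
		}
		//Apply the filter to a scan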
		Scan scan =new Scan();
		scan.setFilter(filter);
		ResultScanner scanner=table.getScanner(scan);
		
		for (Result result:scanner){
			for(KeyValue kv:result.list()){
				System.out.println("kv="+kv.toString()+
						",value="+Bytes.toString(kv.getValue()));
			}
		}
		scanner.close();
		table.close();
	}
	

	public static void main(String[] args) throws IOException {
		//1. Return the cells whose timestamp equals that of the reference column "f1:c1", reference column included
		filter(false,CompareFilter.CompareOp.NO_OP,null);
		//2. Same as 1, but with the reference column "f1:c1" itself dropped from the result
		filter(true,CompareFilter.CompareOp.NO_OP,null);
		//3. Same as 2, but only where the value of the reference column "f1:c1" starts with "value02-9"
		filter(true,CompareFilter.CompareOp.EQUAL,
				new BinaryPrefixComparator(Bytes.toBytes("value02-9")));

		System.out.println("end!");	
	}
}
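
The timestamp matching done by DependentColumnFilter can be hard to picture, so here is a plain-Java simulation of the idea with no HBase dependency. Everything in it (DependentColumnSketch, Cell, filterRow, the refValuePrefix parameter) is a hypothetical name chosen for illustration, and the logic is a simplified reading of the behaviour described above rather than the filter's actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical simulation, not HBase code.
class DependentColumnSketch {

	// Stand-in for a cell: "family:qualifier", timestamp, value
	static final class Cell {
		final String column;
		final long timestamp;
		final String value;

		Cell(String column, long timestamp, String value) {
			this.column = column;
			this.timestamp = timestamp;
			this.value = value;
		}
	}

	// Keeps the cells of one row whose timestamp matches an accepted cell of
	// the reference column. refValuePrefix == null means "no value check".
	static List<Cell> filterRow(List<Cell> row, String refColumn,
			boolean dropRef, String refValuePrefix) {
		Set<Long> accepted = new HashSet<>();
		for (Cell c : row) {
			boolean valueOk = refValuePrefix == null || c.value.startsWith(refValuePrefix);
			if (c.column.equals(refColumn) && valueOk) {
				accepted.add(c.timestamp);
			}
		}
		List<Cell> kept = new ArrayList<>();
		for (Cell c : row) {
			if (!accepted.contains(c.timestamp)) {
				continue; // timestamp does not match the reference column
			}
			if (dropRef && c.column.equals(refColumn)) {
				continue; // reference column itself is dropped
			}
			kept.add(c);
		}
		return kept;
	}

	public static void main(String[] args) {
		List<Cell> row = Arrays.asList(
				new Cell("f1:c1", 5L, "value02-9"),
				new Cell("f1:c2", 5L, "x"),
				new Cell("f1:c3", 7L, "y"));
		System.out.println(filterRow(row, "f1:c1", false, null).size()); // 2
		System.out.println(filterRow(row, "f1:c1", true, null).size());  // 1
	}
}
```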

Dedicated Filters

Filters in this category are mainly meant for scan operations; using them with get is of little value.

SingleColumnValueFilter

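For each row, the filter tests the value of the specified column; if the condition holds, the whole row is returned. Its constructors are:

SingleColumnValueFilter(byte[] family,byte[] qualifier,CompareOp compareOp,byte[] value)
SingleColumnValueFilter(byte[] family,byte[] qualifier,CompareOp compareOp,
	WritableByteArrayComparable comparator)

Example:

		//For each row, return all of its data if the value of column "f1:c1" starts with "value00"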
		filter=new SingleColumnValueFilter(Bytes.toBytes("f1"),
				Bytes.toBytes("c1"),CompareFilter.CompareOp.EQUAL,
				new BinaryPrefixComparator(Bytes.toBytes("value00")));

SingleColumnValueExcludeFilter is used the same way as SingleColumnValueFilter. The difference is that its result set does not include the tested column.


PrefixFilter

Returns all rows whose row key starts with prefix. Its constructor is:

public PrefixFilter(byte[] prefix)

Example:

		//Match rows whose row key starts with "row"
		Filter filter;
		filter =new PrefixFilter(Bytes.toBytes("row"));

PageFilter

Pagination filter. Example:

		//Scan the whole table in pages of 5 rows, using PageFilter together with Scan
		
		//Create a PageFilter with a page size of 5 rows
		Filter filter= new PageFilter(5);
		//Row key of the last row on the current page
		byte[]lastrow=null;
		//Total number of rows scanned
		int totalRows=0;
		while(true){
			Scan scan =new Scan();
			scan.setFilter(filter);
			
			//Number of rows seen so far on the current page
			int localRows=0;
			//Start the next scan right after the last row of the previous page,
			//i.e. at that row key with a single zero byte appended
			if(lastrow!=null){
				lastrow=Bytes.add(lastrow, new byte[]{0x00});
				scan.setStartRow(lastrow);
			}
			//Run the scan
			ResultScanner scanner=table.getScanner(scan);
			Result result=null;
			//Iterate over the results
			while((result=scanner.next()) !=null){
				localRows++;
				totalRows++;
				lastrow=result.getRow();
				System.out.println("localRows="+localRows +", row="+Bytes.toString(lastrow));
			}
			scanner.close();
			//An empty page means the scan is complete
			if(localRows==0){
				break;
			}
		}
		//Print the total number of rows
		System.out.println("totalRows="+totalRows);
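
The paging loop above depends on one trick: the next page starts at the last row key of the previous page with a zero byte appended, which is the smallest key that sorts strictly after it. The sketch below reproduces that loop on an in-memory sorted set so it can be run without a cluster. PagingSketch, scanPage and countAllRows are hypothetical names, and the TreeSet is only a stand-in for a table sorted by row key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical simulation of page-by-page scanning, not HBase code.
class PagingSketch {

	// Returns up to pageSize keys >= startKey, or from the first key
	// when startKey is null (like a scan without a start row).
	static List<String> scanPage(TreeSet<String> rows, String startKey, int pageSize) {
		NavigableSet<String> source = (startKey == null) ? rows : rows.tailSet(startKey, true);
		List<String> page = new ArrayList<>();
		for (String row : source) {
			if (page.size() == pageSize) {
				break;
			}
			page.add(row);
		}
		return page;
	}

	// Walks the whole set page by page and counts the rows seen.
	static int countAllRows(TreeSet<String> rows, int pageSize) {
		int total = 0;
		String lastRow = null;
		while (true) {
			// lastRow + "\0" is the smallest key strictly greater than lastRow
			String start = (lastRow == null) ? null : lastRow + "\0";
			List<String> page = scanPage(rows, start, pageSize);
			if (page.isEmpty()) {
				break; // an empty page means the scan is complete
			}
			total += page.size();
			lastRow = page.get(page.size() - 1);
		}
		return total;
	}

	public static void main(String[] args) {
		TreeSet<String> rows = new TreeSet<>();
		for (int i = 0; i < 12; i++) {
			rows.add(String.format("row%02d", i));
		}
		System.out.println(countAllRows(rows, 5)); // 12
	}
}
```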

KeyOnlyFilter: to be added

FirstKeyOnlyFilter: to be added

InclusiveStopFilter

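By default, the start row set with Scan's setStartRow(startrow) is inclusive, while the stop row set with setStopRow(stoprow) is exclusive. InclusiveStopFilter makes the stop row inclusive as well, so the result covers [startrow,stoprow].

		//Set the scan range to ["row2","row5"]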
		Filter filter =new InclusiveStopFilter(Bytes.toBytes("row5"));
		Scan scan =new Scan();
		scan.setStartRow(Bytes.toBytes("row2"));
		scan.setFilter(filter);
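
The difference between an exclusive and an inclusive stop row can be shown on an ordinary sorted set. This is a minimal sketch with no HBase dependency; StopRowSketch and its scan method are hypothetical, and the inclusiveStop flag merely plays the role that InclusiveStopFilter plays in a real scan.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical simulation of scan ranges, not HBase code.
class StopRowSketch {

	// The start key is always inclusive; the stop key is inclusive only on request.
	static TreeSet<String> scan(TreeSet<String> rows, String start, String stop,
			boolean inclusiveStop) {
		return new TreeSet<>(rows.subSet(start, true, stop, inclusiveStop));
	}

	public static void main(String[] args) {
		TreeSet<String> rows = new TreeSet<>(
				Arrays.asList("row1", "row2", "row3", "row4", "row5", "row6"));
		System.out.println(scan(rows, "row2", "row5", false)); // [row2, row3, row4]
		System.out.println(scan(rows, "row2", "row5", true));  // [row2, row3, row4, row5]
	}
}
```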

TimestampsFilter

Returns the cells whose timestamp is contained in timestamps. Its constructor is:

public TimestampsFilter(List<Long> timestamps)

Example:

		List<Long> list=new ArrayList<Long>();
		list.add(Long.parseLong("1358516951448"));
		list.add(Long.parseLong("1358496418447"));
		list.add(Long.parseLong("1358516951449"));
		
		//Filter that keeps only the cells whose timestamp equals one of the three values above
		filter=new TimestampsFilter(list);
	
		Scan scan =new Scan();
		scan.setFilter(filter);
		//Max versions is set to 2: if several cells of the same row and column pass the filter, only the 2 newest are returned
		scan.setMaxVersions(2);

ColumnCountGetFilter

Returns the first n columns of a row. Example:

		//Fetch the first 5 columns of a row
		filter=new ColumnCountGetFilter(5);
		
		Get get =new Get(Bytes.toBytes("row1"));
		get.setFilter(filter);

ColumnPaginationFilter

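Column pagination filter: it pages through the columns of each row. For every row it skips the first offset columns and returns the next limit columns, so the column at zero-based index offset is the first one returned. Its constructor is:

public ColumnPaginationFilter(int limit,int offset)

Example:

		//For each row, return 4 columns starting at zero-based column index 3 (that is, skip the first 3 columns)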
		filter=new ColumnPaginationFilter(4,3);
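
The limit and offset arguments are easy to swap or misread, so here is a small plain-Java sketch of the same windowing applied to a list of column names. ColumnPageSketch and page are hypothetical names, and the offset is assumed to be a zero-based count of columns to skip.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical simulation of column paging, not HBase code.
class ColumnPageSketch {

	// Skips the first `offset` columns and returns at most `limit` of the rest.
	// Assumes limit and offset are non-negative.
	static List<String> page(List<String> columns, int limit, int offset) {
		int from = Math.min(offset, columns.size());
		int to = Math.min(offset + limit, columns.size());
		return new ArrayList<>(columns.subList(from, to));
	}

	public static void main(String[] args) {
		List<String> columns = Arrays.asList(
				"c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8", "c9");
		System.out.println(page(columns, 4, 3)); // [c3, c4, c5, c6]
	}
}
```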

ColumnPrefixFilter

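Example:

		//For each row, return the columns whose qualifier starts with "c0"
		filter=new ColumnPrefixFilter(Bytes.toBytes("c0"));

RandomRowFilter

Random row filter. It takes a parameter chance between 0.0 and 1.0 that gives the probability of each row being selected. Its constructor is: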

public RandomRowFilter(float chance)


Decorating Filters

SkipFilter

If any cell in a row fails filter1, the whole row is skipped. Its constructor is:

public SkipFilter(Filter filter1)

WhileMatchFilter

The scan stops at the first cell that fails filter1, and whatever was collected up to that point is returned. Its constructor is:

public WhileMatchFilter(Filter filter1)
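
The two decorators differ in what happens after a cell fails the wrapped filter: one discards the current row and carries on, the other ends the scan. The sketch below simulates both on rows of integer cell values, with no HBase dependency. DecoratorSketch, skipRows and whileMatch are hypothetical names, and the model is a deliberate simplification of the behaviour described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical simulation of the two decorating filters, not HBase code.
// rows[i] holds the cell values of row i in scan order.
class DecoratorSketch {

	// SkipFilter idea: a row is kept only if every one of its cells passes.
	static List<Integer> skipRows(int[][] rows, IntPredicate cellFilter) {
		List<Integer> kept = new ArrayList<>();
		for (int[] row : rows) {
			boolean allPass = true;
			for (int cell : row) {
				if (!cellFilter.test(cell)) {
					allPass = false;
					break;
				}
			}
			if (allPass) {
				for (int cell : row) {
					kept.add(cell);
				}
			}
		}
		return kept;
	}

	// WhileMatchFilter idea: stop the whole scan at the first failing cell.
	static List<Integer> whileMatch(int[][] rows, IntPredicate cellFilter) {
		List<Integer> kept = new ArrayList<>();
		for (int[] row : rows) {
			for (int cell : row) {
				if (!cellFilter.test(cell)) {
					return kept; // nothing after this point is scanned
				}
				kept.add(cell);
			}
		}
		return kept;
	}

	public static void main(String[] args) {
		int[][] rows = { {1, 2}, {3, 0}, {4, 5} };
		IntPredicate nonZero = v -> v != 0;
		System.out.println(skipRows(rows, nonZero));   // [1, 2, 4, 5]
		System.out.println(whileMatch(rows, nonZero)); // [1, 2, 3]
	}
}
```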

FilterList

Applies several filters to a scan or get. Example:

		//Scan the table for cells in rows "row3" through "row5" whose value ends in 3, 4, or 5
		
		Filter filter1=new RowFilter(CompareFilter.CompareOp.GREATER_OR_EQUAL,
				new BinaryComparator(Bytes.toBytes("row3")) );
		
		Filter filter2=new RowFilter(CompareFilter.CompareOp.LESS_OR_EQUAL,
				new BinaryComparator(Bytes.toBytes("row5")));
		Filter filter3=new ValueFilter(CompareFilter.CompareOp.EQUAL,
				new RegexStringComparator("[3-5]$"));
		
		List<Filter> list=new ArrayList<Filter>();
		list.add(filter1);
		list.add(filter2);
		list.add(filter3);
		
		//The operator defaults to Operator.MUST_PASS_ALL, so the filters in the list are combined with AND
		Filter filter=new FilterList(list);
		//Passing Operator.MUST_PASS_ONE instead combines the filters with OR:
		//Filter filter=new FilterList(FilterList.Operator.MUST_PASS_ONE,list);
		Scan scan =new Scan();
	   
		scan.setFilter(filter);
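
The AND/OR behaviour of the two operators can be tried out with ordinary Java predicates. This is a minimal sketch with no HBase dependency: FilterListSketch, its Operator enum and combine are hypothetical names that only borrow the MUST_PASS_ALL and MUST_PASS_ONE vocabulary, and String row keys stand in for real cells.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical simulation of combining filters, not HBase code.
class FilterListSketch {

	enum Operator { MUST_PASS_ALL, MUST_PASS_ONE }

	// MUST_PASS_ALL: every filter must accept the item (AND).
	// MUST_PASS_ONE: at least one filter must accept it (OR).
	static <T> Predicate<T> combine(Operator op, List<Predicate<T>> filters) {
		return item -> {
			for (Predicate<T> f : filters) {
				boolean pass = f.test(item);
				if (op == Operator.MUST_PASS_ALL && !pass) {
					return false;
				}
				if (op == Operator.MUST_PASS_ONE && pass) {
					return true;
				}
			}
			// No early exit: AND succeeded on all filters, OR found no match
			return op == Operator.MUST_PASS_ALL;
		};
	}

	public static void main(String[] args) {
		List<Predicate<String>> filters = new ArrayList<>();
		filters.add(row -> row.compareTo("row3") >= 0);
		filters.add(row -> row.compareTo("row5") <= 0);

		Predicate<String> all = combine(Operator.MUST_PASS_ALL, filters);
		Predicate<String> one = combine(Operator.MUST_PASS_ONE, filters);
		System.out.println(all.test("row4")); // true
		System.out.println(all.test("row6")); // false
		System.out.println(one.test("row6")); // true
	}
}
```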





