Hive Study Notes (12)


1 User Defined Functions

https://cwiki.apache.org/confluence/display/Hive/HivePlugins

  • UDF: one row in, one value out
  • UDAF (Aggregation): aggregate function, many rows in, one value out, similar to count / max / min
  • UDTF (Table-Generating): one row in, many rows out, for example lateral view explode()
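
The three shapes can be pictured with a plain-Java analogy. This is only a sketch: none of the methods below use the Hive API, and the class and method names (FunctionShapes, lower, max, explode) are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Plain-Java analogy only: nothing here touches the Hive API.
public class FunctionShapes {

    // UDF shape: one value in, one value out (like lower()).
    public static String lower(String value) {
        return value == null ? null : value.toLowerCase();
    }

    // UDAF shape: many rows in, one value out (like max()).
    public static Integer max(List<Integer> column) {
        return column.isEmpty() ? null : Collections.max(column);
    }

    // UDTF shape: one value in, many rows out (like explode() over a split string).
    public static List<String> explode(String csv) {
        return Arrays.asList(csv.split(","));
    }

    public static void main(String[] args) {
        System.out.println(lower("HIVE"));
        System.out.println(max(Arrays.asList(3, 9, 4)));
        System.out.println(explode("a,b,c"));
    }
}
```

In Hive itself these shapes correspond to different base classes and lifecycles; the sketch only shows how many values go in and come out.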

2 Hive UDF Programming Steps


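2.1 Notes

  • A UDF must declare a return type. It may return null, but the return type cannot be void.
  • Prefer Hadoop Writable types such as Text and LongWritable in UDF signatures; plain Java types are not recommended.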
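
Hive finds a UDF's evaluate method by reflection, so the return-type rule can be checked before packaging. The following is a hypothetical pre-flight check, not Hive's own resolver; EvaluateCheck, GoodUdf and BadUdf are invented names, and the stand-in classes use String only to keep the sketch free of Hive dependencies.

```java
import java.lang.reflect.Method;

// Hypothetical pre-flight check; this is not Hive's own method resolver.
public class EvaluateCheck {

    // True if the class has at least one public evaluate(...) method and
    // none of its public evaluate(...) methods returns void.
    public static boolean hasValidEvaluate(Class<?> udfClass) {
        boolean found = false;
        for (Method m : udfClass.getMethods()) {
            if (!m.getName().equals("evaluate")) {
                continue;
            }
            if (m.getReturnType() == void.class) {
                return false;
            }
            found = true;
        }
        return found;
    }

    // Stand-in classes for illustration; a real UDF would extend Hive's UDF
    // class and use Writable types such as Text instead of String.
    public static class GoodUdf {
        public String evaluate(String s) {
            return s == null ? null : s.toLowerCase();
        }
    }

    public static class BadUdf {
        public void evaluate(String s) {
            // void return type: this is the shape the note above rules out
        }
    }

    public static void main(String[] args) {
        System.out.println(hasValidEvaluate(GoodUdf.class));
        System.out.println(hasValidEvaluate(BadUdf.class));
    }
}
```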

3 UDF Testing

  • Add the dependencies
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-exec</artifactId>
        <version>1.1.0-cdh5.7.0</version>
    </dependency>

    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-jdbc</artifactId>
        <version>1.1.0-cdh5.7.0</version>
    </dependency>
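
  • Write the UDF
package hive; // matches the class name "hive.LowerUDF" used when registering the function

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Lower-cases a string column.
public class LowerUDF extends UDF {
    public Text evaluate(Text str) {
        // Hive passes null for NULL column values, so check the argument itself;
        // calling str.toString() first would throw a NullPointerException.
        if (str == null) {
            return null;
        }

        return new Text(str.toString().toLowerCase());
    }

    // Quick local check without a Hive session.
    public static void main(String[] args) {
        System.out.println(new LowerUDF().evaluate(new Text("HIVE")));
    }
}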

  • Package with Maven
  • Add the jar in Hive
hive (default)> add jar /home/hadoop/testhadoop-1.0.jar;
Added [/home/hadoop/testhadoop-1.0.jar] to class path
Added resources: [/home/hadoop/testhadoop-1.0.jar]

  • Register a temporary function: CREATE TEMPORARY FUNCTION my_lower AS "hive.LowerUDF";
  • Call it: SELECT ename, my_lower(ename) lowername FROM emp LIMIT 5;
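
Since hive-jdbc is among the dependencies, the same steps could also be driven from Java. This is a rough sketch under several assumptions: a HiveServer2 instance listening at localhost:10000, a user named hadoop, and a jar path that is readable on the HiveServer2 host. The class UdfSmokeTest and its methods are hypothetical names.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

public class UdfSmokeTest {

    // Session setup statements from the steps above.
    // Statements sent over JDBC carry no trailing semicolon.
    public static List<String> setupStatements(String jarPath, String name, String className) {
        return Arrays.asList(
                "ADD JAR " + jarPath,
                "CREATE TEMPORARY FUNCTION " + name + " AS '" + className + "'");
    }

    // Registers the function in this session, then prints the sample query result.
    public static void run(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            for (String sql : setupStatements(
                    "/home/hadoop/testhadoop-1.0.jar", "my_lower", "hive.LowerUDF")) {
                stmt.execute(sql);
            }
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT ename, my_lower(ename) lowername FROM emp LIMIT 5")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + "\t" + rs.getString(2));
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Load the HiveServer2 driver explicitly in case it is not auto-registered.
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Assumed address and user; adjust host, port, database and credentials.
        String url = "jdbc:hive2://localhost:10000/default";
        try (Connection conn = DriverManager.getConnection(url, "hadoop", "")) {
            run(conn);
        }
    }
}
```

A temporary function lives only for the session that created it, which is why the setup statements and the query share one connection here.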

3.2 Adding a UDF jar from HDFS

  • Available since Hive 0.13:
    CREATE FUNCTION myfunc AS 'myclass' USING JAR 'hdfs:///path/to/jar';
