hive的jdbc，

和通数据库htsjk.Com2020-01-23 22:54 来源:未知阅读:9684 评论 454 热度2

标签：

hive的jdbc，

原文地址： http://blog.csdn.net/nsrainbow/article/details/43002387 最新课程请关注原作者博客，获得更好的显示体验

声明

本文基于Centos 6.x + CDH 5.x

说到Hive就一定要说到写程序的时候怎么调用Hive。以下我通过一个例子说明如果通过java来调用hive查询数据

服务准备

使用Jdbc方式链接hive,首先需要启动hive的Thrift Server，否则会导致错误“Could not establish connection to localhost:10000/default: java.net.ConnectException: Connection refused”

hive --service hiveserver 是两”-“，

数据准备

建立一个文本文件叫 a.txt，内容是 [plain] view plain copy

1,terry
2,alex
3,jimmy
4,mike
5,kate

并上传到hive服务器的 /data/ 目录下

JDBC调用方法

加载Driver

加载driver (只说hive2的jdbc) [java] view plain copy

Class.forName("org.apache.hive.jdbc.HiveDriver");

连接数据库

[java] view plain copy

Connection con = DriverManager.getConnection("jdbc:hive2://host1:10000/default", "hive", "");

这里的 host1 是主机名
10000是hive默认的端口名
default是默认的database
hive是默认的用户名，默认密码是空

数据库操作语句

删除表 [java] view plain copy

stmt.execute("drop table if exists " + tableName);

创建表 [java] view plain copy

stmt.execute("create table " + tableName + " (key int, value string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054'");

查询数据 [java] view plain copy

ResultSet res = stmt.executeQuery("select * from " + tableName);

导入数据 [java] view plain copy

stmt.execute("load data local inpath '" + filepath + "' into table " + tableName);

例子

建立项目

先打开eclipse建立一个maven项目

pom.xml [html] view plain copy

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>org.crazycake</groupId>
<artifactId>play-hive</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>play-hive</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>0.14.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.2.0</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<version>2.0.2</version>
<configuration>
<source>1.6</source>
<target>1.6</target>
<encoding>UTF-8</encoding>
<optimise>true</optimise>
<compilerArgument>-nowarn</compilerArgument>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.3</version>
<configuration>
<transformers>
<transformer
implementation="org.apache.maven.plugins.shade.resource.ApacheLicenseResourceTransformer">
</transformer>
</transformers>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>

其中最重要的就是这两段 [html] view plain copy

<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>0.14.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.2.0</version>
</dependency>

其他的都无所谓

建表、导入以及查询数据

建立一个类 HiveJdbcClient [java] view plain copy

package org.crazycake.play_hive;
import java.sql.SQLException;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.sql.DriverManager;
/**
* 测试hive 的客户端连接
* @author alexxiyang (https://github.com/alexxiyang)
*
*/
public class HiveJdbcClient {
/**
* 注意：hive-server2 引用的driver是 org.apache.hive.* 而 hive-server 是 org.apache.hadoop.hive.*
*/
private static String driverName = "org.apache.hive.jdbc.HiveDriver";
/**
* @param args
* @throws SQLException
*/
public static void main(String[] args) throws SQLException {
try {
Class.forName(driverName);
} catch (ClassNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
System.exit(1);
}
//hive的默认端口是 10000，如果要修改就修改 hive-site.xml 文件的hive.server2.thrift.port 属性值
//默认用户名hive，默认密码为空
Connection con = DriverManager.getConnection("jdbc:hive2://host1:10000/default", "hive", "");
Statement stmt = con.createStatement();
//测试的表名 testhivedrivertable
String tableName = "testhivedrivertable";
//如果已经存在就删除
stmt.execute("drop table if exists " + tableName);
//创建这张表
stmt.execute("create table " + tableName + " (key int, value string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054'");
//看下创建是否成功
String sql = "show tables '" + tableName + "'";
System.out.println("Running: " + sql);
ResultSet res = stmt.executeQuery(sql);
if (res.next()) {
System.out.println(res.getString(1));
}
//看下表结构
sql = "describe " + tableName;
System.out.println("Running: " + sql);
res = stmt.executeQuery(sql);
while (res.next()) {
System.out.println(res.getString(1) + "\t" + res.getString(2));
}
// 加载数据到表里面
// NOTE: filepath 是本地文件所在的位置，注意这个本地不是你的电脑！
// 你得先把这个文件上传到服务器，然后这里的路径是服务器上这个文件的路径
// NOTE: /data/a.txt
String filepath = "/data/a.txt";
sql = "load data local inpath '" + filepath + "' into table " + tableName;
System.out.println("Running: " + sql);
stmt.execute(sql);
// select * query
sql = "select * from " + tableName;
System.out.println("Running: " + sql);
res = stmt.executeQuery(sql);
while (res.next()) {
System.out.println(String.valueOf(res.getInt(1)) + "\t" + res.getString(2));
}
}
}

输出是 [plain] view plain copy

Running: show tables 'testhivedrivertable'
testhivedrivertable
Running: describe testhivedrivertable
key int
value string
Running: load data local inpath '/data/a.txt' into table testhivedrivertable
Running: select * from testhivedrivertable
1 terry
2 alex
3 jimmy
4 mike
5 kate
Running: select count(1) from testhivedrivertable

其实java调用很简单的，就是把你在hive shell 里面执行的语句用jdbc执行一遍而已，所以你传输过去的语句的环境是hive server机器，里面写的路径也是从hive server主机的根目录路径出发去寻找数据，所以我们的 a.txt 得上传到服务器上，这段代码才会运行正常。