Setting Up a Hadoop Pseudo-Distributed Environment
A brief record of the steps for setting up a Hadoop pseudo-distributed environment.
Software environment
VM: VMware Workstation running Red Hat Linux, IP 192.168.2.6
Hadoop: hadoop-0.20.2

1. Create a hadoop user
groupadd hadoop
useradd hadoop -g hadoop
passwd hadoop
All subsequent steps are performed as this hadoop user.
2. Set up SSH
Create a new SSH key with an empty passphrase to enable passwordless login:
ssh-keygen -t rsa
This generates a private key file and a public key file under ~/.ssh.
Append the public key to authorized_keys so that ssh no longer prompts for a password:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
The permissions must be restrictive:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
If ssh localhost (or ssh <ip>) logs in without a password, SSH is ready.
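The SSH setup above boils down to key generation plus strict permissions, and the permissions are the part that most often silently breaks passwordless login. A minimal sketch of a permission check (check_ssh_perms is a made-up helper, not part of Hadoop or OpenSSH), demonstrated against a throwaway directory so it is safe to run anywhere:

```shell
# Hypothetical helper: verify the permissions passwordless SSH requires
# (directory mode 700, authorized_keys mode 600).
check_ssh_perms() {
  local dir="$1" dir_mode key_mode
  dir_mode=$(stat -c '%a' "$dir" 2>/dev/null) || { echo "missing $dir"; return 1; }
  key_mode=$(stat -c '%a' "$dir/authorized_keys" 2>/dev/null) || { echo "missing authorized_keys"; return 1; }
  if [ "$dir_mode" = "700" ] && [ "$key_mode" = "600" ]; then
    echo "permissions ok"
  else
    echo "fix with: chmod 700 $dir; chmod 600 $dir/authorized_keys"
  fi
}

# Demonstrate against a throwaway directory instead of the real ~/.ssh:
tmp=$(mktemp -d)
mkdir "$tmp/.ssh" && chmod 700 "$tmp/.ssh"
touch "$tmp/.ssh/authorized_keys" && chmod 600 "$tmp/.ssh/authorized_keys"
check_ssh_perms "$tmp/.ssh"   # prints: permissions ok
rm -rf "$tmp"
```

To check the real setup, call it as check_ssh_perms ~/.ssh.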

3. Configure Hadoop
tar -xzvf hadoop-0.20.2.tar.gz
Edit the configuration files in /home/hadoop/hadoop-0.20.2/conf.
In hadoop-env.sh, set:
export JAVA_HOME=/usr/local/jdk1.6.0_33
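A wrong JAVA_HOME is a common reason the daemons later fail to start, so it is worth verifying the path first. A small sketch (check_java_home is a made-up helper; the JDK path is the one assumed above and should be replaced with yours):

```shell
# Hypothetical helper: confirm JAVA_HOME points at a usable JDK
# by checking for an executable bin/java and printing its version.
check_java_home() {
  if [ -x "$1/bin/java" ]; then
    "$1/bin/java" -version
  else
    echo "no JDK at $1"
  fi
}

# The path used in hadoop-env.sh above; substitute your own installation.
check_java_home /usr/local/jdk1.6.0_33
```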

Edit core-site.xml as follows:

<configuration>
	<property>
		<name>fs.default.name</name>
		<value>hdfs://localhost:9000</value>
	</property>

	<property>
		<name>hadoop.tmp.dir</name>
		<value>/home/hadoop/tmp</value>
		<description>A base for other temporary directories.</description>
	</property>
</configuration>
Edit mapred-site.xml:

<configuration>
<property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
</property>
</configuration>
Edit hdfs-site.xml:
<configuration>
	<property>
		<name>dfs.name.dir</name>
		<value>/home/hadoop/dfs/name</value>
		<description>Determines where on the local filesystem the DFS name node should store the name table.</description>
	</property>
	<property>
		<name>dfs.data.dir</name>
		<value>/home/hadoop/dfs/data</value>
		<description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, data is stored in all named directories.</description>
	</property>
	<property>
		<name>dfs.replication</name>
		<value>1</value>
		<description>Default block replication. The actual number of replications can be specified when the file is created.</description>
	</property>
	<property>
		<name>dfs.permissions</name>
		<value>false</value>
	</property>
</configuration>
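Before starting the daemons, it can help to sanity-check that each file actually contains the properties set above. A minimal grep-based sketch (check_prop is a made-up helper, and the conf path is the one used in this walkthrough):

```shell
# Hypothetical check: confirm a property name appears in a config file.
conf=/home/hadoop/hadoop-0.20.2/conf
check_prop() {  # usage: check_prop <file> <property-name>
  if grep -q "<name>$2</name>" "$conf/$1" 2>/dev/null; then
    echo "$1: $2 present"
  else
    echo "$1: $2 MISSING"
  fi
}

check_prop core-site.xml   fs.default.name
check_prop core-site.xml   hadoop.tmp.dir
check_prop mapred-site.xml mapred.job.tracker
check_prop hdfs-site.xml   dfs.replication
```

This only checks that the property names are present, not that the values are valid; it is a quick guard against editing the wrong file.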
4. Run
cd /home/hadoop/hadoop-0.20.2/bin
Format HDFS:
./hadoop namenode -format
Start the Hadoop daemons:
./start-all.sh
Check Hadoop's status in a browser:
NameNode - http://localhost:50070/
JobTracker - http://localhost:50030/

5. Test
Run the following commands from the /home/hadoop/hadoop-0.20.2 directory.

Copy the local conf files into HDFS's input directory:
[hadoop@srv6 hadoop-0.20.2]$ bin/hadoop fs -put conf input

Run one of the examples shipped with Hadoop:
[hadoop@srv6 hadoop-0.20.2]$ bin/hadoop jar hadoop-0.20.2-examples.jar grep input output 'dfs[a-z.]+'

List the DFS output files:
[hadoop@srv6 hadoop-0.20.2]$ bin/hadoop fs -ls output

Copy the DFS files to the local filesystem and view them there:
[hadoop@srv6 hadoop-0.20.2]$ bin/hadoop fs -get output output
[hadoop@srv6 hadoop-0.20.2]$ cat output/*

Or view the DFS files directly:
[hadoop@srv6 hadoop-0.20.2]$ bin/hadoop fs -cat output/*
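What the grep example computes is simply a count of every match of the regex 'dfs[a-z.]+' across the input files, sorted by frequency. The same shape of result can be reproduced locally with plain shell tools; sample.txt below is a made-up stand-in for the conf/ input:

```shell
# Stand-in input imitating matches found in Hadoop's conf files:
cat > sample.txt <<'EOF'
dfs.name.dir dfs.data.dir
dfs.replication dfs.name.dir
EOF

# Extract each regex match, then count occurrences, highest count first --
# the same shape of output the MapReduce grep job writes to output/.
# First line shows a count of 2 for dfs.name.dir.
grep -oE 'dfs[a-z.]+' sample.txt | sort | uniq -c | sort -rn

rm -f sample.txt
```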

Stop the Hadoop daemons:
[hadoop@srv6 hadoop-0.20.2]$ bin/stop-all.sh

When connecting from the Eclipse plugin on Windows to HDFS on the Linux machine, the following error occurred:
java.net.ConnectException: Call to /192.168.2.6:9001 failed on connection exception:
The cause was that the Hadoop configuration files on Linux used localhost; after changing it to the IP 192.168.2.6, the connection succeeded.
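A quick way to narrow down this kind of ConnectException is to test raw TCP reachability of the JobTracker port before touching any configs. A sketch using bash's /dev/tcp (check_port is a made-up helper; nc -z <host> <port> does the same job where netcat is available):

```shell
# Hypothetical helper: test whether a TCP port accepts connections.
# Uses bash's /dev/tcp redirection; the subshell closes the socket on exit.
check_port() {  # usage: check_port <host> <port>
  if (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; then
    echo "$1:$2 reachable"
  else
    echo "$1:$2 unreachable"
  fi
}

# The address and port Eclipse was trying to reach:
check_port 192.168.2.6 9001
```

If the port is unreachable from the client but reachable on the server itself, the daemon is likely bound to localhost only, which is exactly what the config change above fixes.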
