/opt/sohuhadoop/hadoop/bin/hadoop-daemon.sh start tasktracker
<property>
<name>dfs.balance.bandwidthPerSec</name>
<value>10485760</value>
<description>
Specifies the maximum bandwidth that each datanode can utilize for balancing, in bytes per second.
</description>
</property>
Balancing took 2.9950980555555557 hours
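A quick arithmetic check of the value above, plus a sketch of kicking off the balancer by hand (the -threshold percentage is illustrative, not taken from this post):

```shell
# dfs.balance.bandwidthPerSec is given in bytes per second,
# so the 10485760 above is a 10 MB/s cap per datanode.
BW=$((10 * 1024 * 1024))
echo "$BW"   # prints 10485760
# To run the balancer manually (script ships with Hadoop 0.20; not executed here):
# /opt/sohuhadoop/hadoop/bin/start-balancer.sh -threshold 5
```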
December 13th, 2010 - Removing nodes from a Hadoop cluster with Decommission
10.15.10.42
10.15.10.43
<property>
<name>dfs.hosts.exclude</name>
<value>/opt/sohuhadoop/conf/excludes</value>
<final>true</final>
</property>
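A sketch of the decommission flow using the two hosts above (writing to ./excludes here for illustration; the cluster path is /opt/sohuhadoop/conf/excludes as configured):

```shell
# The excludes file simply lists one host per line.
printf '10.15.10.42\n10.15.10.43\n' > excludes
cat excludes
# Then make the NameNode re-read dfs.hosts.exclude (run on the NameNode):
# hadoop dfsadmin -refreshNodes
# The nodes then show up as "Decommission In Progress" until their blocks
# have been re-replicated elsewhere.
```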
November 19th, 2010 - Setting up an HBase environment on Hadoop
Download the source tarball and extract it:
cd /opt/hadoop/
tar zxvf hbase-0.20.6.tar.gz
ln -s hbase-0.20.6 hbase
export HBASE_LOG_DIR=/opt/log/hbase
export HBASE_MANAGES_ZK=true
<property>
<name>hbase.rootdir</name>
<value>hdfs://zw-hadoop-master:9000/hbase</value>
<description>The directory shared by region servers.</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed Zookeeper
true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
</description>
</property>
<property>
<name>hbase.master</name>
<value>hdfs://zw-hadoop-master:60000</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>zw-hadoop-slave225,zw-hadoop-slave226,zw-hadoop-slave227</value>
<description>Comma separated list of servers in the ZooKeeper Quorum. For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com". By default this is set to localhost for local and pseudo-distributed modes of operation. For a
fully-distributed setup, this should be set to a full list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh this is the list of servers which we will start/stop ZooKeeper on.
</description>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/opt/log/zookeeper</value>
<description>Property from ZooKeeper's config zoo.cfg.
The directory where the snapshot is stored.
</description>
</property>
hbase.rootdir sets the HBase directory on HDFS; the hostname must be the host where the HDFS NameNode runs
hbase.cluster.distributed set to true means this is a fully distributed HBase cluster
hbase.master sets the HBase master's hostname and port
hbase.zookeeper.quorum sets the ZooKeeper hosts; the official recommendation is an odd number of nodes such as 3, 5, or 7
/opt/sohuhadoop/hbase/bin/stop-hbase.sh
http://10.10.71.1:60030/regionserver.jsp
November 18th, 2010 - Backing up a Hadoop cluster's NameNode
<property>
<name>dfs.name.dir</name>
<value>/pvdata/hadoopdata/name/,/opt/hadoopdata/name/</value>
</property>
fs.checkpoint.size defines the maximum size of the edits log; once it is exceeded, a checkpoint is forced even if the checkpoint time interval has not yet elapsed. The default is 64 MB.
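For reference, the two knobs that control checkpointing, shown here with the stock Hadoop 0.20 defaults (one hour and 64 MB):

```xml
<property>
<name>fs.checkpoint.period</name>
<value>3600</value>
</property>
<property>
<name>fs.checkpoint.size</name>
<value>67108864</value>
</property>
```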
<property>
<name>fs.checkpoint.dir</name>
<value>/opt/hadoopdata/secondname,/pvdata/hadoopdata/secondname</value>
<description>Determines where on the local filesystem the DFS secondary
name node should store the temporary images to merge.
If this is a comma-delimited list of directories then the image is
replicated in all of the directories for redundancy.
</description>
</property>
November 17th, 2010 - A great tragedy: Hadoop's rmr and trash
First, keep tight control over each user's permissions on Hadoop, so that every user can only touch their own directories. Operate as the Hadoop superuser as little as possible; that alone reduces the chance of a fatal mistake. Hadoop's rm and rmr commands are appallingly designed: there is no confirmation prompt at all, they just delete. Someone raised this with the project, but the reply was that the trash mechanism already exists, so no prompt is needed. Speechless... As for Hadoop's trash feature: unfortunately, trash had not been configured beforehand, so the data was simply gone. After this accident, trash was configured immediately, with a retention period of seven days.
<property>
<name>fs.trash.interval</name>
<value>10080</value>
<description>
Number of minutes between trash checkpoints. If zero, the trash feature is disabled.
</description>
</property>
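A sanity check that the interval above matches the seven-day retention mentioned in the post:

```shell
# fs.trash.interval is in minutes; 7 days = 7 * 24 * 60 minutes.
RETENTION=$((7 * 24 * 60))
echo "$RETENTION"   # prints 10080
```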
hadoop fs -put *.txt /user/oplog/test
hadoop fs -rmr /user/oplog/test
hadoop fs -ls /user/oplog/.Trash/Current/user/oplog
drwxr-xr-x - oplog oplog 0 2010-11-16 10:44 /user/oplog/.Trash/Current/user/oplog/test
hadoop fs -mv /user/oplog/.Trash/Current/user/oplog/test /user/oplog/
hadoop fs -ls /user/oplog/.Trash/Current/user/oplog
drwxr-xr-x - oplog oplog 0 2010-11-16 10:44 /user/oplog/.Trash/Current/user/oplog/test
drwxr-xr-x - oplog oplog 0 2010-11-16 10:47 /user/oplog/.Trash/Current/user/oplog/test.1
September 19th, 2010 - Setting mapred.tasktracker.map.tasks.maximum in Hadoop
<property>
<name>mapred.tasktracker.map.tasks.maximum</name>
<value>8</value>
<description>The maximum number of map tasks that will be run
simultaneously by a task tracker.
</description>
</property>
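One way to pick a value, using a common rule of thumb (an assumption on my part, not taken from this post): size map slots against the core count, leaving a couple of cores for reduce slots and the datanode/tasktracker daemons themselves.

```shell
# Derive a starting value for mapred.tasktracker.map.tasks.maximum
# from the number of online cores on the tasktracker host.
cores=$(nproc)
map_slots=$(( cores > 2 ? cores - 2 : 1 ))
echo "suggested mapred.tasktracker.map.tasks.maximum = $map_slots"
```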
September 14th, 2010 - Using Hadoop and Hive from outside the cluster
export JAVA_HOME=/opt/java/jdk
export HADOOP_CONF_DIR=/opt/sohuhadoop/conf
export HADOOP_HOME=/opt/sohuhadoop/hadoop
export HIVE_HOME=/opt/sohuhadoop/hive
vi /etc/hosts
10.10.1.1 hadoop-master. hadoop-master
September 13th, 2010 - Permission issues in Hadoop and Hive
hadoop fs -chmod -R //
hadoop fs -chown -R : //
For each log's MetaStore, create a separate database in MySQL
Create a separate conf directory for each user; for example, user test's Hive conf directory sits at /opt/sohuhadoop/hive/conf/test
In that test directory, edit hive-default.xml to point at the corresponding db
Start a dedicated hiveserver instance listening on its own port: HIVE_PORT=10020 nohup hive --config $HIVE_HOME/conf/test --service hiveserver &
In JDBC, connect to the matching port, e.g. 10020
September 2nd, 2010 - Setting up an Eclipse-based Hadoop test environment
Download
// MapperTest.java
package test; // the original package and import lines were lost in extraction

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MapperTest extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);

    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        // the original field extraction was garbled; splitting on tab is an assumption
        String userid = value.toString().split("\t")[0];
        context.write(new Text(userid), one);
    }
}

// ReducerTest.java
package test;

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class ReducerTest extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}

// DriverTest.java
package test;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class DriverTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: DriverTest <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "DriverTest"); // job name lost in extraction
        job.setJarByClass(DriverTest.class);
        job.setMapperClass(MapperTest.class);
        job.setCombinerClass(ReducerTest.class);
        job.setReducerClass(ReducerTest.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // compress the job output with gzip (0.20-era property names, reconstructed)
        conf.setBoolean("mapred.output.compress", true);
        conf.setClass("mapred.output.compression.codec", GzipCodec.class, CompressionCodec.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}