Hadoop Cluster Configuration (CentOS / CDH3)

    <Category: Hadoop>

    ref:http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
    https://docs.cloudera.com/display/DOC/CDH3+Installation

    A note to self; it has been a while since I last touched Hadoop and I have gotten rusty.

    ENV

    jdk1.6
    wget http://cds.sun.com/is-bin/INTERSHOP.enfinity/WFS/CDS-CDS_Developer-Site/en_US/-/USD/VerifyItem-Start/jdk-6u24-linux-x64-rpm.bin?BundledLineItemUUID=eE6J_hCxGTcAAAEuiVIWty2U&OrderID=TPKJ_hCxXJYAAAEudFIWty2U&ProductID=oSKJ_hCwOlYAAAEtBcoADqmS&FileName=/jdk-6u24-linux-x64-rpm.bin
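    Note that what wget saves is a self-extracting .bin, not a bare rpm (rename the file if wget kept the query string in its name). A minimal sketch of unpacking it, which should leave jdk-6u24-linux-amd64.rpm in the current directory:

    chmod +x jdk-6u24-linux-x64-rpm.bin
    ./jdk-6u24-linux-x64-rpm.bin    # accept the license to unpack the rpm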
    sudo rpm -Uvh ./jdk-6u24-linux-amd64.rpm

    Copy the Java installer to each slave:
    scp jdk-6u24-linux-amd64.rpm platformA:/home/dev    # then install Java there; mind the version

    Configure host aliases. Two machines: platformA (slave) and platformB (master):
    sudo vi /etc/hosts
    10.129.8.58     platformA
    10.129.8.74     platformB
    master
    cd /etc/yum.repos.d/
    sudo wget http://archive.cloudera.com/redhat/cdh/cloudera-cdh3.repo

    sudo yum install hadoop-0.20
    sudo yum install hadoop-0.20-namenode
    sudo yum install hadoop-0.20-datanode
    sudo yum install hadoop-0.20-jobtracker
    sudo yum install hadoop-0.20-tasktracker
    sudo yum install hadoop-0.20-secondarynamenode    # the jps output below shows a SecondaryNameNode on the master

    Make sure the master and slave use the same username:
    sudo /usr/sbin/adduser dev
    sudo passwd dev

    ssh-copy-id -i $HOME/.ssh/id_rsa.pub dev@platformA
    ssh dev@platformA

    ssh-copy-id -i $HOME/.ssh/id_rsa.pub dev@platformB
    ssh dev@platformB
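    ssh-copy-id assumes a key pair already exists; if $HOME/.ssh/id_rsa.pub is missing, generate one first (a sketch; the empty passphrase keeps cluster logins non-interactive, as in the referenced tutorial):

    ssh-keygen -t rsa -P ""    # writes $HOME/.ssh/id_rsa and $HOME/.ssh/id_rsa.pub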

    slave

    sudo rpm -Uvh ./jdk-6u24-linux-amd64.rpm

    cd /etc/yum.repos.d/
    sudo wget http://archive.cloudera.com/redhat/cdh/cloudera-cdh3.repo
    sudo yum install hadoop-0.20

    # install the daemons
    sudo yum install hadoop-0.20-datanode
    sudo yum install hadoop-0.20-tasktracker

    sudo chmod -R 777 /usr/lib/hadoop/logs
    sudo chmod -R 777 /usr/lib/hadoop/pids

    cd /var/lib
    sudo mkdir hadoop-0.20
    sudo chmod 777 -R  hadoop-0.20

    Edit the configuration: sudo vi conf/masters and add the master server:
    platformB

    test
    hadoop-0.20 jar /usr/lib/hadoop-0.20/hadoop-*-examples.jar grep input output20 'con[a-z.]+'

    config

    A.platformB(master)

    1. conf/masters: set the master node's address here. If there is a second master, add it too; at startup the second master is launched by the primary master (the node where bin/start-dfs.sh is run automatically becomes the primary master).
    platformB

    2. conf/slaves: set the slave node addresses; we currently have two:
    platformB
    platformA

    B.platformA/platformB(all machines)
    Important: You have to change the configuration files conf/core-site.xml, conf/mapred-site.xml and conf/hdfs-site.xml on ALL machines as follows.

    on platformB:
    Copy the sample configuration:
    cd conf.pseudo/
    sudo cp *.* ../../conf/

    Then make a few edits:
    <!-- In: conf/core-site.xml -->
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

    <configuration>
      <property>
        <name>fs.default.name</name>
        <!-- Use the alias here; it must match the slave configuration. Watch the hosts
             file: a hostname may map to only one IP, otherwise communication with the
             slaves fails. -->
        <value>hdfs://platformB:8020</value>
      </property>

      <property>
        <name>hadoop.tmp.dir</name>
        <value>/var/lib/hadoop-0.20/cache/${user.name}</value>
      </property>
    </configuration>

    <!-- In: conf/mapred-site.xml -->

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

    <configuration>
      <property>
        <name>mapred.job.tracker</name>
        <value>platformB:8021</value>
      </property>

      <!-- Enable Hue plugins -->
      <property>
        <name>mapred.jobtracker.plugins</name>
        <value>org.apache.hadoop.thriftfs.ThriftJobTrackerPlugin</value>
        <description>Comma-separated list of jobtracker plug-ins to be activated.
        </description>
      </property>
      <property>
        <name>jobtracker.thrift.address</name>
        <value>0.0.0.0:9290</value>
      </property>
    </configuration>

    <!-- In: conf/hdfs-site.xml -->
    Adjust dfs.replication as needed.
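    A minimal sketch of that file; the value 2 is my assumption for this two-datanode cluster (the default is 3):

    <!-- In: conf/hdfs-site.xml -->
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

    <configuration>
      <property>
        <name>dfs.replication</name>
        <!-- copies kept of each block; should not exceed the number of datanodes -->
        <value>2</value>
      </property>
    </configuration>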

    Cluster
    Copy the configuration files above to platformA, for example:
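    A sketch of one way to do it, assuming the conf directory is /usr/lib/hadoop-0.20/conf on both machines (the paths are my assumption):

    scp /usr/lib/hadoop-0.20/conf/*-site.xml dev@platformA:/tmp/
    ssh dev@platformA 'sudo cp /tmp/*-site.xml /usr/lib/hadoop-0.20/conf/'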
    1. Format the namenode
    Create the cache dir (on platformA and platformB):
    cd /var/lib
    sudo mkdir hadoop-0.20
    sudo chmod 777 -R  hadoop-0.20
    /usr/lib/hadoop/bin/hadoop namenode -format

    If you hit an error like the following:
    Re-format filesystem in /var/lib/hadoop-0.20/cache/hadoop/dfs/name ? (Y or N) y
    Format aborted in /var/lib/hadoop-0.20/cache/hadoop/dfs/name
    Delete the old directories under cache and re-format. (Note the prompt is case-sensitive: the lowercase y above is itself enough to abort the format; answer with a capital Y.)
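    A sketch of that cleanup, using the path from the message above:

    sudo rm -rf /var/lib/hadoop-0.20/cache/*
    /usr/lib/hadoop/bin/hadoop namenode -format    # answer Y at the prompt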

    2. Start the cluster (on platformB)
    sudo chmod -R 777 /usr/lib/hadoop/logs
    sudo chmod -R 777 /usr/lib/hadoop/pids
    bin/start-dfs.sh

    If you hit an error like:
    touch: cannot touch `/usr/lib/hadoop/bin/../logs/hadoop-dev-datanode-platformA.out': Permission denied
    run this on platformA:
    sudo chmod -R 777 /usr/lib/hadoop/logs
    sudo chmod -R 777 /usr/lib/hadoop/pids

    Check that the service port is listening properly:
    netstat -ano|grep 8020
    jps

    Check the master's log to make sure DFS came up on the master (platformB):
    cat logs/hadoop-dev-datanode-platformB.log

    Check the slave's log to make sure DFS came up on the slave (platformA):
    cat logs/hadoop-dev-datanode-platformA.log

    3. Start the MapReduce services
    bin/start-mapred.sh (on platformB , master)

    On each server, check the logs:
    cat  logs/hadoop-dev-tasktracker-platformB.log
    cat  logs/hadoop-dev-tasktracker-platformA.log

    Error:
    2011-02-18 18:03:48,924 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: platformB/10.129.8.74:8021. Already tried 6 time(s).
    This is the IP binding error; handle it as described above (one hostname must map to exactly one IP in /etc/hosts).
    At this point the installation is complete; run jps:
    jps

    master:
    [dev@platformB hadoop]$ jps
    11805 JobTracker
    11095 DataNode
    11902 TaskTracker
    10992 NameNode
    11233 SecondaryNameNode

    slave:
    7938 DataNode
    8255 TaskTracker

    Stop the cluster, in the reverse order of startup:
    bin/stop-mapred.sh    on master
    bin/stop-dfs.sh            on master

    Adding more nodes
    First install and configure the slave as above, then stop the cluster, add the new slave's address to the slaves file on the master, and start the cluster again.
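    A sketch of the whole sequence on the master, with platformC standing in for the hypothetical new slave:

    bin/stop-mapred.sh
    bin/stop-dfs.sh
    echo platformC >> conf/slaves    # also map platformC in /etc/hosts on every machine
    bin/start-dfs.sh
    bin/start-mapred.sh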


    A few web console addresses for checking live nodes, logs, and other information:
    http://localhost:50030/ – web UI for MapReduce job tracker(s)
    http://localhost:50060/ – web UI for task tracker(s)
    http://localhost:50070/ – web UI for HDFS name node(s)

    Running an MR job on the cluster

    Create the input directory

    [dev@platformB hadoop]$ hadoop fs -mkdir input

    Copy test files into the input directory on DFS; here, a few XML config files:

    [dev@platformB hadoop]$ hadoop fs -copyFromLocal conf/*.xml input

    Run the MapReduce job:

    [dev@platformB hadoop]$ bin/hadoop jar hadoop-0.20.2+737-examples.jar wordcount   input/* output1
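    When the job finishes, the counts can be read straight out of HDFS (output1 is the output directory named in the command above):

    [dev@platformB hadoop]$ hadoop fs -ls output1
    [dev@platformB hadoop]$ hadoop fs -cat output1/part-*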

    If you hit an error like this:

    16:17:48,327 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201102201606_0001_m_000008_0'
    2011-02-20 16:17:51,203 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201102201606_0001_r_000002_0: Error initializing attempt_201102201606_0001_r_000002_0:
    java.lang.IllegalArgumentException: Wrong FS: hdfs://10.129.8.74/var/lib/hadoop-0.20/cache/dev/mapred/system/job_201102201606_0001/jobToken, expected: hdfs://platformB
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:385)
        at org.apache.hadoop.hdfs.DistributedFileSystem.checkPath(DistributedFileSystem.java:106)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:162)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:515)
        at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:3968)
        at org.apache.hadoop.mapred.TaskTracker.localizeJobFiles(TaskTracker.java:1020)
        at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:967)
        at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:2209)
        at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:2174)
    2011-02-20 16:17:51,204 ERROR org.apache.hadoop.mapred.TaskStatus: Trying to set finish time for task attempt_201102201606_0001_r_000002_0 when no start time is set, stackTrace is : java.lang.Exception
        at org.apache.hadoop.mapred.TaskStatus.setFinishTime(TaskStatus.java:145)
        at org.apache.hadoop.mapred.ReduceTaskStatus.setFinishTime(ReduceTaskStatus.java:64)
        at org.apache.hadoop.mapred.TaskInProgress.incompleteSubTask(TaskInProgress.java:665)
        at org.apache.hadoop.mapred.JobInProgress.failedTask(JobInProgress.java:2729)
        at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:1069)
        at org.apache.hadoop.mapred.JobTracker.updateTaskStatuses(JobTracker.java:4481)
        at org.apache.hadoop.mapred.JobTracker.processHeartbeat(JobTracker.java:3455)
        at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:3154)
        at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:528)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1319)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1315)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1313)

    Check your /etc/hosts configuration: if one hostname maps to more than one IP, you get this error. Here the platformB entry pointed to both 127.0.0.1 and 10.129.8.74; remove the 127.0.0.1 mapping and restart the cluster.
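    A sketch of the fix on platformB (the exact surrounding entries are my assumption):

    # before: platformB resolves to two addresses, which breaks DFS/MR communication
    127.0.0.1       localhost platformB
    10.129.8.74     platformB

    # after: platformB resolves only to its LAN address
    127.0.0.1       localhost
    10.129.8.74     platformB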
