Hadoop集群部署实战(cdh发行版)

栏目: 服务器 · 发布时间: 7年前

内容简介:Hadoop集群部署实战(cdh发行版)

一、概要

由于工作需要,最近一段时间开始接触学习hadoop相关的东西,目前公司的实时任务和离线任务都跑在一个hadoop集群,离线任务的特点就是每天定时跑,任务跑完了资源就空闲了,为了合理的利用资源,我们打算在搭一个集群用于跑离线任务,计算节点和储存节点分离,计算节点结合aws的Auto Scaling(自动扩容、缩容服务)以及竞价实例,动态调整,在跑任务的时候拉起一批实例,任务跑完就自动释放掉服务器,本文记录下hadoop集群的搭建过程,方便自己日后查看,也希望能帮到初学者,本文所有软件都是通过yum安装,大家也可以下载相应的二进制文件进行安装,使用哪种方式安装,从属个人习惯。

二、环境

1、角色介绍

10.10.103.246 NameNode zkfc journalNode QuorumaPeerMain DataNode ResourceManager NodeManager WebAppProxyServer JobHistoryServer
10.10.103.144 NameNode zkfc journalNode QuorumaPeerMain DataNode ResourceManager NodeManager WebAppProxyServer
10.10.103.62       zkfc journalNode QuorumaPeerMain DataNode             NodeManager

2、基础环境说明

a、系统版本

我们用的是aws的ec2,用的aws自己定制过的系统,不过和redhat基本相同,内核版本:4.9.20-10.30.amzn1.x86_64

b、 java 版本

java version "1.8.0_121"

c、hadoop版本

hadoop-2.6.0

d、cdh版本

cdh5.11.0

e、关于主机名,因为我这里用的aws的ec2,默认已有主机名,并且内网可以解析,故就不单独做主机名的配置了,如果你的主机名内网不能解析,请一定要配置主机名,集群内部通讯很多组件使用的是主机名

三、配置部署

1、设置yum源

vim /etc/yum.repos.d/cloudera.repo
   
[cloudera-cdh5-11-0]
# Packages for Cloudera's Distribution for Hadoop, Version 5.11.0, on RedHat or CentOS 6 x86_64
name=Cloudera's Distribution for Hadoop, Version 5.11.0
baseurl=http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.11.0/
gpgkey=http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera   
gpgcheck=1
[cloudera-gplextras5b2]
# Packages for Cloudera's GPLExtras, Version 5.11.0, on RedHat or CentOS 6 x86_64
name=Cloudera's GPLExtras, Version 5.11.0
baseurl=http://archive.cloudera.com/gplextras5/redhat/6/x86_64/gplextras/5.11.0/
gpgkey=http://archive.cloudera.com/gplextras5/redhat/6/x86_64/gplextras/RPM-GPG-KEY-cloudera   
gpgcheck=1

PS:我这里安装的5.11.0,如果想安装低版本或者高版本,根据自己的需求修改版本号即可

2、安装配置zookeeper集群

yum -y  install zookeeper zookeeper-server
vi /etc/zookeeper/conf/zoo.cfg

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data/zookeeper
clientPort=2181
maxClientCnxns=0
server.1=10.10.103.144:2888:3888
server.2=10.10.103.226:2888:3888
server.3=10.10.103.62:2888:3888
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
mkdir /data/zookeeper           #创建datadir目录
/etc/init.d/zookeeper-server init    #所有节点先初始化
echo 1 > /data/zookeeper/myid      #10.10.103.144上操作
echo 2 > /data/zookeeper/myid      #10.10.103.226上操作
echo 3 > /data/zookeeper/myid      #10.10.103.62上操作
/etc/init.d/zookeeper-server       #启动服务
/usr/lib/zookeeper/bin/zkServer.sh status  #查看所有节点状态,其中只有一个节点是Mode: leader就正常 了

3、安装

a、10.10.103.246和10.10.103.144安装

yum -y install hadoop hadoop-client hadoop-hdfs hadoop-hdfs-namenode hadoop-hdfs-zkfc hadoop-hdfs-journalnode hadoop-hdfs-datanode hadoop-mapreduce-historyserver hadoop-yarn-nodemanager hadoop-yarn-proxyserver  hadoop-yarn hadoop-mapreduce hadoop-yarn-resourcemanager hadoop-lzo* impala-lzo

b、10.10.103.62上安装

yum -y install hadoop hadoop-client hadoop-hdfs hadoop-hdfs-journalnode hadoop-hdfs-datanode  hadoop-lzo* impala-lzo hadoop-yarn hadoop-mapreduce hadoop-yarn-nodemanager

PS:

1、一般小公司,计算节点(ResourceManager)和储存节点(NameNode)的主节点部署在两台服务器上做HA,计算节点(NodeManager)和储存节点(DataNode)部署在多台服务器上,每台服务器上都启动NodeManager和DataNode服务。

2、如果大集群,可能需要计算资源和储存资源分离,集群的各个角色都有服务器单独部署,个人建议划分如下:

a、储存节点

NameNode:

需要安装hadoop hadoop-client hadoop-hdfs hadoop-hdfs-namenode hadoop-hdfs-zkfc hadoop-lzo* impala-lzo

DataNode:

需要安装hadoop hadoop-client hadoop-hdfs hadoop-hdfs-datanode  hadoop-lzo* impala-lzo

QJM集群:

需要安装hadoop hadoop-hdfs  hadoop-hdfs-journalnode zookeeper zookeeper-server

b、计算节点

ResourceManager:

需要安装hadoop hadoop-client hadoop-yarn hadoop-mapreduce hadoop-yarn-resourcemanager

WebAppProxyServer:

需要安装 hadoop hadoop-yarn hadoop-mapreduce hadoop-yarn-proxyserver

JobHistoryServer:

需要安装 hadoop hadoop-yarn hadoop-mapreduce hadoop-mapreduce-historyserver

NodeManager:

需要安装hadoop hadoop-client hadoop-yarn hadoop-mapreduce hadoop-yarn-nodemanager

4、配置

a、创建目录并设置权限

mkdir -p /data/hadoop/dfs/nn             #datanode上操作
chown hdfs:hdfs /data/hadoop/dfs/nn/ -R  #datanode上操作
mkdir -p /data/hadoop/dfs/dn             #namenode上操作
chown hdfs:hdfs /data/hadoop/dfs/dn/ -R  #namenode上操作
mkdir -p /data/hadoop/dfs/jn             #journalnode上操作
chown hdfs:hdfs /data/hadoop/dfs/jn/ -R  #journalnode上操作
mkdir /data/hadoop/yarn -p               #nodemanager上操作
chown yarn:yarn  /data/hadoop/yarn  -R   #nodemanager上操作

b、撰写配置文件

5、服务启动

a、启动journalnode(三台服务器上都启动)

/etc/init.d/hadoop-hdfs-journalnode start

b、格式化namenode(在其中一台namenode10.10.103.246上操作)

sudo -u hdfs hadoop namenode -format

c、初始化zk中HA的状态(在其中一台namenode10.10.103.246上操作)

sudo -u hdfs hdfs zkfc -formatZK

d、初始化共享Edits文件(在其中一台namenode10.10.103.246上操作)

sudo -u hdfs hdfs namenode -initializeSharedEdits

e、启动10.10.103.246上namenode

/etc/init.d/hadoop-hdfs-namenode start

f、同步源数据并启动10.10.103.144上namenode

sudo -u hdfs hdfs namenode -bootstrapStandby
/etc/init.d/hadoop-hdfs-namenode start

g、在两台namenode上启动zkfc

/etc/init.d/hadoop-hdfs-zkfc start

h、启动datanode(所有机器上操作)

/etc/init.d/hadoop-hdfs-journalnode start

i、在10.10.103.246上启动WebAppProxyServer、JobHistoryServer、httpfs

/etc/init.d/hadoop-yarn-proxyserver start
/etc/init.d/hadoop-mapreduce-historyserver start
/etc/init.d/hadoop-httpfs start

j、在所有机器上启动nodemanager

/etc/init.d/hadoop-yarn-nodemanager restart

四、功能验证

1、hadoop功能

a、查看hdfs根目录

[root@ip-10-10-103-246 ~]# hadoop fs -ls /
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
Found 3 items
drwxr-xr-x   - hdfs   hdfs          0 2017-05-11 11:40 /tmp
drwxrwx---   - mapred hdfs          0 2017-05-11 11:28 /user
drwxr-xr-x   - yarn   hdfs          0 2017-05-11 11:28 /var

b、上传一个文件到根目录

[root@ip-10-10-103-246 ~]# hadoop fs -put /tmp/test.txt  /
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
[root@ip-10-10-103-246 ~]# hadoop fs -ls /               
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
Found 4 items
-rw-r--r--   2 root   hdfs         22 2017-05-11 15:47 /test.txt
drwxr-xr-x   - hdfs   hdfs          0 2017-05-11 11:40 /tmp
drwxrwx---   - mapred hdfs          0 2017-05-11 11:28 /user
drwxr-xr-x   - yarn   hdfs          0 2017-05-11 11:28 /var

c、直接删除文件不放回收站

[root@ip-10-10-103-246 ~]# hadoop fs -rm -skipTrash /test.txt
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
Deleted /test.txt

d、跑一个wordcount用例

[root@ip-10-10-103-246 ~]# hadoop fs -put /tmp/test.txt /user/hdfs/rand/
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
[root@ip-10-10-103-246 conf]# sudo -u hdfs  hadoop  jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.6.0-cdh5.11.0.jar wordcount /user/hdfs/rand/ /tmp
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
17/05/11 11:40:08 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to 10.10.103.246
17/05/11 11:40:09 INFO input.FileInputFormat: Total input paths to process : 1
17/05/11 11:40:09 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
17/05/11 11:40:09 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev 674c65bbf0f779edc3e00a00c953b121f1988fe1]
17/05/11 11:40:09 INFO mapreduce.JobSubmitter: number of splits:1
17/05/11 11:40:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1494472050574_0003
17/05/11 11:40:09 INFO impl.YarnClientImpl: Submitted application application_1494472050574_0003
17/05/11 11:40:09 INFO mapreduce.Job: The url to track the job: http://10.10.103.246:8100/proxy/application_1494472050574_0003/
17/05/11 11:40:09 INFO mapreduce.Job: Running job: job_1494472050574_0003
17/05/11 11:40:15 INFO mapreduce.Job: Job job_1494472050574_0003 running in uber mode : false
17/05/11 11:40:15 INFO mapreduce.Job:  map 0% reduce 0%
17/05/11 11:40:20 INFO mapreduce.Job:  map 100% reduce 0%
17/05/11 11:40:25 INFO mapreduce.Job:  map 100% reduce 100%
17/05/11 11:40:25 INFO mapreduce.Job: Job job_1494472050574_0003 completed successfully
17/05/11 11:40:25 INFO mapreduce.Job: Counters: 53
        File System Counters
                FILE: Number of bytes read=1897
                FILE: Number of bytes written=262703
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=6431
                HDFS: Number of bytes written=6219
                HDFS: Number of read operations=6
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters
                Launched map tasks=1
                Launched reduce tasks=1
                Data-local map tasks=1
                Total time spent by all maps in occupied slots (ms)=2592
                Total time spent by all reduces in occupied slots (ms)=5360
                Total time spent by all map tasks (ms)=2592
                Total time spent by all reduce tasks (ms)=2680
                Total vcore-milliseconds taken by all map tasks=2592
                Total vcore-milliseconds taken by all reduce tasks=2680
                Total megabyte-milliseconds taken by all map tasks=3981312
                Total megabyte-milliseconds taken by all reduce tasks=8232960
        Map-Reduce Framework
                Map input records=102
                Map output records=96
                Map output bytes=6586
                Map output materialized bytes=1893
                Input split bytes=110
                Combine input records=96
                Combine output records=82
                Reduce input groups=82
                Reduce shuffle bytes=1893
                Reduce input records=82
                Reduce output records=82
                Spilled Records=164
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=120
                CPU time spent (ms)=1570
                Physical memory (bytes) snapshot=501379072
                Virtual memory (bytes) snapshot=7842639872
                Total committed heap usage (bytes)=525860864
                Peak Map Physical memory (bytes)=300183552
                Peak Map Virtual memory (bytes)=3244224512
                Peak Reduce Physical memory (bytes)=201195520
                Peak Reduce Virtual memory (bytes)=4598415360
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=6321
        File Output Format Counters
                Bytes Written=6219
[root@ip-10-10-103-246 conf]#

2、namenode高可用验证

查看http://10.10.103.246:50070

Hadoop集群部署实战(cdh发行版)

查看http://10.10.103.144:50070

Hadoop集群部署实战(cdh发行版)

停掉10.10.103.246节点的namenode进程,查看10.10.103.144节点是否会提升为active节点

Hadoop集群部署实战(cdh发行版)

3、resourcemanager高可用验证

查看http://10.10.103.246:8088

Hadoop集群部署实战(cdh发行版)

查看http://10.10.103.144:8088

Hadoop集群部署实战(cdh发行版)

在浏览器输入http://10.10.103.144:8088,会跳转到 http://ip-10-10-103-246.ec2.internal:8088/ ,ip-10-10-103-246.ec2.internal是10.10.103.246的主机名,说明resourcemanager高可用配置ok,停掉10.10.103.144的

resourcemanager进程,在浏览器输入http://10.10.103.144:8088,就不会在跳转了,说明10.10.103.144已经切成了master。

五、总结

1、hadoop集群能成本部署完成,这才是开始,后期的维护,业务方问题的解决这些经验需要一点一点积累,多出差多折腾总是好的。

2、对应上面部署的集群后期需要扩容,直接把10.10.103.62这台机器做个镜像,用镜像启动服务器即可,服务会自动启动并且加入到集群

3、云上hadoop集群的成本优化,这里只针对aws而言

a、冷数据存在在s3上,hdfs可以直接支持s3,在hdfs-site.xml里面添加s3的key参数(fs.s3n.awsAccessKeyId和fs.s3n.awsSecretAccessKey)即可,需要注意的是程序上传、下载的逻辑需要多加几个重试机制,s3有时候不稳定会导致上传或者下载不成功

b、使用Auto Scaling服务结合竞价实例,配置扩展策略,比如当cpu大于50%的时候就扩容5台服务器,当cpu小于10%的时候就缩容5台服务器,当然你可以配置更多阶梯级的扩容、缩容策略,Auto Scaling还有一个计划任务的功能,你可以向设置crontab一样设置,让Auto Scaling帮你扩容、缩容服务器

本文出自 “�潘吭宋�男” 博客,谢绝转载!


以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持 码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

算法的陷阱

算法的陷阱

阿里尔•扎拉奇 (Ariel Ezrachi)、莫里斯•E. 斯图克 (Maurice E. Stucke) / 余潇 / 中信出版社 / 2018-5-1 / CNY 69.00

互联网的存在令追求物美价廉的消费者与来自世界各地的商品只有轻点几下鼠标的距离。这诚然是一个伟大的科技进步,但却也是一个发人深思的商业现象。本书中,作者扎拉奇与斯图克将引领我们对由应用程序支持的互联网商务做出更深入的检视。虽然从表面上看来,消费者确是互联网商务兴盛繁荣过程中的获益者,可精妙的算法与数据运算同样也改变了市场竞争的本质,并且这种改变也非总能带来积极意义。 首当其冲地,危机潜伏于计算......一起来看看 《算法的陷阱》 这本书的介绍吧!

图片转BASE64编码
图片转BASE64编码

在线图片转Base64编码工具

随机密码生成器
随机密码生成器

多种字符组合密码

HEX HSV 转换工具
HEX HSV 转换工具

HEX HSV 互换工具