Summary: This article continues the pseudo-distributed Hadoop setup from the previous post, covering running MapReduce programs on HDFS and on YARN, the job history server, and log aggregation.
1. Preparation
This article builds on the steps from the previous post: https://segmentfault.com/a/11...
1.1 Changes and additions
```
# Change the hostname
hostnamectl set-hostname hadoop104
```
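The rest of this article addresses the machine both by IP (192.168.119.104) and by the hostname hadoop104, so the name must resolve locally. A minimal sketch of the verification and the hosts entry (assuming 192.168.119.104 is your host's address, as used throughout this article):

```bash
# Verify the new hostname took effect
hostnamectl status
# Map the hostname to this machine's IP so commands like `ssh hadoop104` resolve
# (assumption: 192.168.119.104 is this host's address)
echo "192.168.119.104 hadoop104" | sudo tee -a /etc/hosts
```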
```
# Passwordless SSH login setup
## Remove any existing SSH material
[admin@hadoop104 ~]$ cd ~/.ssh
[admin@hadoop104 .ssh]$ rm -rf *
## Generate a key pair without a passphrase (just press Enter three times)
[admin@hadoop104 .ssh]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/admin/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/admin/.ssh/id_rsa.
Your public key has been saved in /home/admin/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:kjL6k939tz4wgrdYIlA7/r5EgzGVJ12YlQB7BfkNMSU admin@hadoop104
The key's randomart image is:
+---[RSA 2048]----+
|        o+oOE+.  |
|       ..o.*.oo  |
|      .o..o.. o  |
|     . o= . . .  |
|      oo+.S.     |
|   . oooo.+ o    |
|  . o +.* o o    |
|   .o ..+ o o    |
|  .. .o. ..ooo   |
+----[SHA256]-----+
[admin@hadoop104 .ssh]$ ll
total 8
-rw------- 1 admin admin 1675 Apr  2 21:26 id_rsa       # id_rsa is the private key
-rw-r--r-- 1 admin admin  397 Apr  2 21:26 id_rsa.pub   # id_rsa.pub is the public key
## Send the public key to hadoop104
[admin@hadoop104 .ssh]$ ssh-copy-id hadoop104
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/admin/.ssh/id_rsa.pub"
The authenticity of host 'hadoop104 (192.168.119.104)' can't be established.
ECDSA key fingerprint is SHA256:X25gXFFr2vsKVxn7LLOpQtYBb1OHOmRGj9XmJpQQ9Vs.
ECDSA key fingerprint is MD5:d6:55:be:36:9b:b6:33:f7:4d:75:5a:c5:40:89:a1:7c.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
admin@hadoop104's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'hadoop104'"
and check to make sure that only the key(s) you wanted were added.

## Passwordless login now works
[admin@hadoop104 .ssh]$ ssh hadoop104
Last login: Thu Apr  4 10:48:25 2019 from hadoop104
```
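If `ssh-copy-id` is not available, the same result can be achieved by hand. A minimal sketch, assuming the key pair generated above:

```bash
# Append the public key to the target account's authorized_keys
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# sshd ignores the file unless permissions are strict
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
```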
The following configuration is for section 2.1, "Running a MapReduce program on HDFS". After completing it, carry out the steps in 2.1 first.
```
# core-site.xml
vi /opt/module/hadoop-3.1.1/etc/hadoop/core-site.xml

<!-- Address of the HDFS NameNode -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop104:9000</value>
</property>
<!-- Storage directory for files Hadoop generates at runtime -->
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/module/hadoop-3.1.1/data/tmp</value>
</property>

# hdfs-site.xml
vi /opt/module/hadoop-3.1.1/etc/hadoop/hdfs-site.xml

<!-- Number of HDFS replicas -->
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
```
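Note that these `<property>` blocks must sit inside each file's `<configuration>` root element. A quick way to confirm the values were picked up, as a sketch using the stock `hdfs getconf` tool (run from /opt/module/hadoop-3.1.1):

```bash
# Print effective configuration values as Hadoop sees them
bin/hdfs getconf -confKey fs.defaultFS       # expect hdfs://hadoop104:9000
bin/hdfs getconf -confKey dfs.replication    # expect 1
```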
The following configuration is for section 2.2, "Running a MapReduce program on YARN". After completing it, carry out the steps in 2.2.
```
# Configure yarn-site.xml

<!-- How reducers fetch data -->
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<!-- Address of the YARN ResourceManager -->
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop104</value>
</property>

# Configure mapred-site.xml

<!-- Run MapReduce on YARN -->
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
```
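As a quick sanity check before starting the daemons, a sketch that greps the edited files to confirm the properties were saved (paths assume the install location used in this article):

```bash
cd /opt/module/hadoop-3.1.1
# Each grep should print the <name> line followed by its <value> line
grep -A1 'yarn.nodemanager.aux-services' etc/hadoop/yarn-site.xml
grep -A1 'mapreduce.framework.name' etc/hadoop/mapred-site.xml
```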
2. Hands-on operations
2.1 Running a MapReduce program on HDFS
```
# Format the NameNode (only on the very first start; do not re-format afterwards)
[admin@centos104 hadoop-3.1.1]$ bin/hdfs namenode -format
# Start HDFS
[admin@hadoop104 hadoop-3.1.1]$ sbin/start-dfs.sh
# Check the daemons
[admin@hadoop104 hadoop-3.1.1]$ jps
14448 NameNode
14769 SecondaryNameNode
14571 DataNode
14892 Jps
# View the HDFS web UI in a browser. In Hadoop 3.x the NameNode's default web port changed from 50070 to 9870:
http://192.168.119.104:9870/dfshealth.html#tab-overview
or
http://hadoop104:9870/dfshealth.html#tab-overview (a local Windows machine needs a hosts entry)
# Create an input directory on HDFS
[admin@hadoop104 hadoop-3.1.1]$ bin/hdfs dfs -mkdir -p /user/qianxkun/mapreduce/wordcount/input
# Upload the test file to HDFS
[admin@hadoop104 hadoop-3.1.1]$ bin/hdfs dfs -put wcinput/wc.input /user/qianxkun/mapreduce/wordcount/input/
# Check the uploaded file
[admin@hadoop104 hadoop-3.1.1]$ bin/hdfs dfs -ls /user/qianxkun/mapreduce/wordcount/input/
Found 1 items
-rw-r--r--   1 admin supergroup         47 2019-04-04 16:07 /user/qianxkun/mapreduce/wordcount/input/wc.input
[admin@hadoop104 hadoop-3.1.1]$ bin/hdfs dfs -cat /user/qianxkun/mapreduce/wordcount/input/wc.input
hadoop yarn
hadoop mapreduce
qianxkun
qianxkun
# Run the MapReduce program on HDFS
[admin@hadoop104 hadoop-3.1.1]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1.jar wordcount /user/qianxkun/mapreduce/wordcount/input/ /user/qianxkun/mapreduce/wordcount/output
```
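The examples jar refuses to run if the output directory already exists, so a pre-check is handy when re-running the job. A small sketch using the same paths as above:

```bash
# Remove the output directory if it already exists, then run wordcount again
bin/hdfs dfs -test -d /user/qianxkun/mapreduce/wordcount/output && \
    bin/hdfs dfs -rm -r /user/qianxkun/mapreduce/wordcount/output
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1.jar wordcount \
    /user/qianxkun/mapreduce/wordcount/input /user/qianxkun/mapreduce/wordcount/output
```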
View the output:
- From the command line
```
[admin@hadoop104 hadoop-3.1.1]$ bin/hdfs dfs -cat /user/qianxkun/mapreduce/wordcount/output/*
hadoop      2
mapreduce   1
qianxkun    2
yarn        1
```
- From the browser
```
# Download the result file to the local filesystem
[admin@hadoop104 hadoop-3.1.1]$ hadoop fs -get /user/qianxkun/mapreduce/wordcount/output/part-r-00000 ./wcoutput/
# Delete the output on HDFS
[admin@hadoop104 hadoop-3.1.1]$ hdfs dfs -rm -r /user/qianxkun/mapreduce/wordcount/output
```
2.2 Running a MapReduce program on YARN
```
# Start YARN
[admin@hadoop104 hadoop-3.1.1]$ sbin/start-yarn.sh
# Check the daemons
[admin@hadoop104 hadoop-3.1.1]$ jps
14448 NameNode
14769 SecondaryNameNode
15939 ResourceManager
16374 Jps
14571 DataNode
16063 NodeManager
# View the YARN web UI in a browser
http://192.168.119.104:8088/cluster
or
http://hadoop104:8088/cluster
# Delete the output directory on HDFS
[admin@hadoop104 hadoop-3.1.1]$ bin/hdfs dfs -rm -R /user/qianxkun/mapreduce/wordcount/output
Deleted /user/qianxkun/mapreduce/wordcount/output
# Run the MapReduce program
[admin@hadoop104 hadoop-3.1.1]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1.jar wordcount /user/qianxkun/mapreduce/wordcount/input /user/qianxkun/mapreduce/wordcount/output
## This fails with "Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster"
## Fix
### Stop YARN
[admin@hadoop104 hadoop-3.1.1]$ sbin/stop-yarn.sh
### Put the output of `hadoop classpath` into yarn-site.xml
[admin@hadoop104 hadoop-3.1.1]$ hadoop classpath
/opt/module/hadoop-3.1.1/etc/hadoop:/opt/module/hadoop-3.1.1/share/hadoop/common/lib/*:/opt/module/hadoop-3.1.1/share/hadoop/common/*:/opt/module/hadoop-3.1.1/share/hadoop/hdfs:/opt/module/hadoop-3.1.1/share/hadoop/hdfs/lib/*:/opt/module/hadoop-3.1.1/share/hadoop/hdfs/*:/opt/module/hadoop-3.1.1/share/hadoop/mapreduce/lib/*:/opt/module/hadoop-3.1.1/share/hadoop/mapreduce/*:/opt/module/hadoop-3.1.1/share/hadoop/yarn:/opt/module/hadoop-3.1.1/share/hadoop/yarn/lib/*:/opt/module/hadoop-3.1.1/share/hadoop/yarn/*
[admin@hadoop104 hadoop-3.1.1]$ vi etc/hadoop/yarn-site.xml

<property>
    <name>yarn.application.classpath</name>
    <value>
        /opt/module/hadoop-3.1.1/etc/hadoop:/opt/module/hadoop-3.1.1/share/hadoop/common/lib/*:/opt/module/hadoop-3.1.1/share/hadoop/common/*:/opt/module/hadoop-3.1.1/share/hadoop/hdfs:/opt/module/hadoop-3.1.1/share/hadoop/hdfs/lib/*:/opt/module/hadoop-3.1.1/share/hadoop/hdfs/*:/opt/module/hadoop-3.1.1/share/hadoop/mapreduce/lib/*:/opt/module/hadoop-3.1.1/share/hadoop/mapreduce/*:/opt/module/hadoop-3.1.1/share/hadoop/yarn:/opt/module/hadoop-3.1.1/share/hadoop/yarn/lib/*:/opt/module/hadoop-3.1.1/share/hadoop/yarn/*
    </value>
</property>

## After restarting YARN, the job succeeds
[admin@hadoop104 hadoop-3.1.1]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1.jar wordcount /user/qianxkun/mapreduce/wordcount/input /user/qianxkun/mapreduce/wordcount/output
# Check the result
[admin@hadoop104 hadoop-3.1.1]$ bin/hdfs dfs -cat /user/qianxkun/mapreduce/wordcount/output/*
hadoop      2
mapreduce   1
qianxkun    2
yarn        1
```
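Rather than pasting the long classpath by hand, the property block can be generated from `hadoop classpath` directly. A sketch, assuming the same file locations as above:

```bash
# Capture the classpath Hadoop itself reports...
CP=$(hadoop classpath)
# ...and print a ready-to-paste property block for yarn-site.xml
cat <<EOF
<property>
    <name>yarn.application.classpath</name>
    <value>${CP}</value>
</property>
EOF
```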
3. Configuring, starting, and viewing the job history server
```
# Configure mapred-site.xml

<property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop104:10020</value>
</property>
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop104:19888</value>
</property>

# Find the history server startup script
[admin@hadoop104 hadoop-3.1.1]$ ls sbin/ | grep mr
mr-jobhistory-daemon.sh
# Start the history server
[admin@hadoop104 hadoop-3.1.1]$ sbin/mr-jobhistory-daemon.sh start historyserver
# Check that the history server is running
[admin@hadoop104 hadoop-3.1.1]$ jps
19442 SecondaryNameNode
19800 NodeManager
19257 DataNode
19146 NameNode
19692 ResourceManager
20142 Jps
18959 JobHistoryServer
# View the job history UI
http://192.168.119.104:19888/jobhistory
or
http://hadoop104:19888/jobhistory
```
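In Hadoop 3.x the per-daemon `*-daemon.sh` scripts are deprecated in favor of a `--daemon` flag on the main launchers, so an equivalent alternative to the script above is (a sketch; the deprecated script still works):

```bash
# Start/stop the JobHistoryServer the Hadoop 3 way
bin/mapred --daemon start historyserver
bin/mapred --daemon stop historyserver
```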
4. Log aggregation
```
# Configure yarn-site.xml

<!-- Enable log aggregation -->
<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
<!-- Retain logs for 7 days -->
<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
</property>

# Start HDFS, YARN, and the history server
[admin@hadoop104 hadoop-3.1.1]$ sbin/start-dfs.sh
[admin@hadoop104 hadoop-3.1.1]$ sbin/start-yarn.sh
[admin@hadoop104 hadoop-3.1.1]$ sbin/mr-jobhistory-daemon.sh start historyserver
# Delete the existing output directory on HDFS
[admin@hadoop104 hadoop-3.1.1]$ bin/hdfs dfs -rm -R /user/qianxkun/mapreduce/wordcount/output
# Run the wordcount program
[admin@hadoop104 hadoop-3.1.1]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1.jar wordcount /user/qianxkun/mapreduce/wordcount/input /user/qianxkun/mapreduce/wordcount/output
2019-04-04 18:23:59,294 INFO client.RMProxy: Connecting to ResourceManager at hadoop104/192.168.119.104:8032
2019-04-04 18:24:00,525 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/admin/.staging/job_1554373388828_0001
# View the logs
http://192.168.119.104:19888/jobhistory
or
http://hadoop104:19888/jobhistory
```
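With aggregation enabled, the logs can also be fetched from the command line once the application has finished. A sketch, assuming the application id that corresponds to the job id shown in the output above:

```bash
# Dump the aggregated container logs for a finished application
# (assumption: job_1554373388828_0001 maps to this application id)
bin/yarn logs -applicationId application_1554373388828_0001
```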