NameNode HA configuration: FsImage and edits log

Category: Servers · Published: 6 years ago


The metadata in Hadoop NN consists of:

  • fsimage: contains the complete state of the file system at a point in time
  • edit logs: contain every file system change (file creation, deletion, modification) made after the most recent fsimage.

If you list the files inside your NN workspace directory, you will see files such as:

  • fsimage_0000000000000000000 (the fsimage)
  • fsimage_0000000000000000000.md5
  • edits_0000000000000003414-0000000000000003451 (an edit log segment; there are many of these, each with a different name)
  • seen_txid (a separate file that holds the last seen transaction ID)
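The naming scheme in this listing can be made explicit with a small sketch. It assumes only the name patterns shown in this article (plus the edits_inprogress files mentioned further down); the function name classify is hypothetical, and a real directory may contain files this does not recognize.

```python
import re

# Patterns taken from the file names quoted in this article.
# They are an assumption about naming, not an official Hadoop API.
PATTERNS = (
    ("fsimage", re.compile(r"fsimage_(\d+)")),
    ("fsimage_md5", re.compile(r"fsimage_(\d+)\.md5")),
    ("edits", re.compile(r"edits_(\d+)-(\d+)")),
    ("edits_inprogress", re.compile(r"edits_inprogress_(\d+)")),
)


def classify(name):
    """Return (kind, txids) for a metadata file name, or ("other", ())."""
    if name == "seen_txid":
        return "seen_txid", ()
    for kind, pattern in PATTERNS:
        match = pattern.fullmatch(name)
        if match:
            # Transaction IDs are zero-padded decimal numbers in the name.
            return kind, tuple(int(g) for g in match.groups())
    return "other", ()


if __name__ == "__main__":
    for name in (
        "fsimage_0000000000000000000",
        "fsimage_0000000000000000000.md5",
        "edits_0000000000000003414-0000000000000003451",
        "seen_txid",
    ):
        print(name, "->", classify(name))
```

For a finalized edit log segment, the two numbers in the name are the first and last transaction IDs the segment covers.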

When the NN starts, Hadoop loads the fsimage and applies all edit logs, running many consistency checks along the way; it aborts if a check fails. To trigger this, I removed edits_0000000000000000001-0000000000000000002 from the edit logs in my NN workspace and then ran sbin/start-dfs.sh. The log then showed an error like:

java.io.IOException: There appears to be a gap in the edit log. We expected txid 1, but got txid 3.

This error message indicates that your edit logs are inconsistent (they may be corrupted, or some of them may be missing). If you are only experimenting with Hadoop locally and do not care about the data, you can simply run hadoop namenode -format to reformat the NN and start from scratch; otherwise you need to recover your edit logs from the SNN or from a backup you made earlier.
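Before starting the NN, a gap of this kind can be spotted from the file names alone. The following is a minimal sketch, not how Hadoop itself performs the check: it assumes finalized segments are named edits_<first txid>-<last txid> as in the listing above, and the function name find_gaps and the start_txid parameter are hypothetical (in practice the first expected transaction would follow the fsimage being loaded).

```python
import re

# Finalized segment names as quoted in this article: edits_<first>-<last>.
EDITS = re.compile(r"edits_(\d+)-(\d+)")


def find_gaps(names, start_txid=1):
    """Return (expected_txid, found_txid) pairs where segments are not contiguous.

    start_txid is an assumption: the first transaction ID that should
    appear in the edit logs.
    """
    segments = sorted(
        (int(m.group(1)), int(m.group(2)))
        for m in map(EDITS.fullmatch, names)
        if m
    )
    gaps = []
    expected = start_txid
    for first, last in segments:
        if first > expected:
            gaps.append((expected, first))
        # Move past this segment even if it overlaps an earlier one.
        expected = max(expected, last + 1)
    return gaps


if __name__ == "__main__":
    # The first segment (txids 1-2) has been deleted, as in the experiment above.
    names = [
        "edits_0000000000000000003-0000000000000000005",
        "edits_0000000000000000006-0000000000000000009",
    ]
    for expected, found in find_gaps(names):
        print(f"expected txid {expected}, but got txid {found}")
```

To check a real directory, the names could come from os.listdir on the NN storage directory; the path depends on your own configuration.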

In an HA setup, you normally start the JournalNodes first and then run:

hdfs namenode -initializeSharedEdits

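to copy the NameNode's edits log to each JournalNode. A JournalNode stores only the edits log, not the FsImage. From my own runs and observation, this copy does not necessarily match the NameNode exactly, and I do not understand what criterion the program uses to synchronize. After the sync, edits_inprogress on the JournalNodes and on the NameNode will differ, but there is no need to worry and no extra action is required: after a while edits_inprogress becomes consistent. One more point: the edits log copied by hdfs namenode -initializeSharedEdits is only the most recent portion. If the cluster previously ran in non-HA mode for some time, there may be many edits log files, perhaps several thousand, and the command syncs only a recent part of them.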
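To see which segments were actually copied, the file names on both sides can be compared. This is a rough sketch under the assumption that finalized segments use the edits_<first txid>-<last txid> naming on both the NameNode and the JournalNodes; the function names are hypothetical, and edits_inprogress files are deliberately ignored because they are expected to differ for a while.

```python
import re

# Finalized segment names only; edits_inprogress_* files do not match.
FINALIZED = re.compile(r"edits_(\d+)-(\d+)")


def finalized_segments(names):
    """Set of (first_txid, last_txid) for every finalized segment name."""
    return {
        (int(m.group(1)), int(m.group(2)))
        for m in map(FINALIZED.fullmatch, names)
        if m
    }


def compare_segments(nn_names, jn_names):
    """Report segments present on only one side."""
    nn = finalized_segments(nn_names)
    jn = finalized_segments(jn_names)
    return {
        "only_on_namenode": sorted(nn - jn),
        "only_on_journalnode": sorted(jn - nn),
    }


if __name__ == "__main__":
    # Illustrative names, not output from a real cluster.
    nn_names = [
        "edits_0000000000000000001-0000000000000000002",
        "edits_0000000000000000003-0000000000000000005",
        "edits_inprogress_0000000000000000006",
    ]
    jn_names = ["edits_0000000000000000003-0000000000000000005"]
    print(compare_segments(nn_names, jn_names))
```

Older segments that appear only on the NameNode side would match the observation above that only a recent portion is synchronized.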

That is all for now.

