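Summary: analysis of a Greenplum cluster startup failure
A developer colleague told me that the Greenplum instance in the test environment had suddenly become unreachable. I logged in to the server and found no Greenplum processes running. I asked the developers whether anyone had changed anything in Greenplum, and they said nobody had touched it. Strange. So what had happened?
I tried starting the cluster manually with gpstart, which failed: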
[gpadmin@00_mdw ~]$ gpstart
20170517:10:53:59:017586 gpstart:00_mdw:gpadmin-[INFO]:-Starting gpstart with args:
20170517:10:53:59:017586 gpstart:00_mdw:gpadmin-[INFO]:-Gathering information and validating the environment...
20170517:10:53:59:017586 gpstart:00_mdw:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 4.3.10.0 build commit: f413ff3b006655f14b6b9aa217495ec94da5c96c'
20170517:10:53:59:017586 gpstart:00_mdw:gpadmin-[INFO]:-Greenplum Catalog Version: '201310150'
20170517:10:53:59:017586 gpstart:00_mdw:gpadmin-[INFO]:-Starting Master instance in admin mode
20170517:10:54:01:017586 gpstart:00_mdw:gpadmin-[CRITICAL]:-Failed to start Master instance in admin mode
20170517:10:54:01:017586 gpstart:00_mdw:gpadmin-[CRITICAL]:-Error occurred: non-zero rc: 1
 Command was: 'env GPSESSID=0000000000 GPERA=None $GPHOME/bin/pg_ctl -D /home/gpadmin/gpdata/gpmaster/gpseg-1 -l /home/gpadmin/gpdata/gpmaster/gpseg-1/pg_log/startup.log -w -t 600 -o " -p 5432 -b 1 -z 0 --silent-mode=true -i -M master -C -1 -x 0 -c gp_role=utility " start'
rc=1, stdout='waiting for server to start...... stopped waiting
', stderr='pg_ctl: PID file "/home/gpadmin/gpdata/gpmaster/gpseg-1/postmaster.pid" does not exist
pg_ctl: could not start server
Examine the log output.
'
[gpadmin@00_mdw ~]$
The log output was sparse and I could not find anything useful in it. What now?
2017-05-16 11:18:20.666964 CST,,,p16542,th251283232,,,,0,,,seg-1,,,,,"LOG","00000","removing all temporary files",,,,,,,,"RemovePgTempFiles","fd.c",1873,
2017-05-16 11:18:20.692596 CST,,,p16542,th251283232,,,,0,,,seg-1,,,,,"LOG","00000","temporary files using default filespace",,,,,,,,"primaryMirrorPopulateFilespaceInfo","primary_mirror_mode.c",2569,
2017-05-16 11:18:20.693209 CST,,,p16542,th251283232,,,,0,,,seg-1,,,,,"LOG","00000","transaction files using default pg_system filespace",,,,,,,,"primaryMirrorPopulateFilespaceInfo","primary_mirror_mode.c",2629,
2017-05-16 13:27:17.059691 CST,,,p16630,th930637600,,,,0,,,seg-1,,,,,"LOG","00000","removing all temporary files",,,,,,,,"RemovePgTempFiles","fd.c",1873,
2017-05-16 13:27:17.062897 CST,,,p16630,th930637600,,,,0,,,seg-1,,,,,"LOG","00000","temporary files using default filespace",,,,,,,,"primaryMirrorPopulateFilespaceInfo","primary_mirror_mode.c",2569,
2017-05-16 13:27:17.063528 CST,,,p16630,th930637600,,,,0,,,seg-1,,,,,"LOG","00000","transaction files using default pg_system filespace",,,,,,,,"primaryMirrorPopulateFilespaceInfo","primary_mirror_mode.c",2629,
2017-05-17 10:53:59.610428 CST,,,p17597,th695740192,,,,0,,,seg-1,,,,,"LOG","00000","removing all temporary files",,,,,,,,"RemovePgTempFiles","fd.c",1873,
2017-05-17 10:53:59.643630 CST,,,p17597,th695740192,,,,0,,,seg-1,,,,,"LOG","00000","temporary files using default filespace",,,,,,,,"primaryMirrorPopulateFilespaceInfo","primary_mirror_mode.c",2569,
2017-05-17 10:53:59.644220 CST,,,p17597,th695740192,,,,0,,,seg-1,,,,,"LOG","00000","transaction files using default pg_system filespace",,,,,,,,"primaryMirrorPopulateFilespaceInfo","primary_mirror_mode.c",2629,
I went to the log directory and listed all of the log files. The newest one was a .csv file, gpdb-2017-05-17_112454.csv.
Original post: http://blog.csdn.net/mchdba/article/details/72383684 , by mchdba (黄杉). Please do not repost.
[gpadmin@00_mdw pg_log]$ ll -t
total 740
-rw-------. 1 gpadmin gpadmin   386 May 17 11:24 gpdb-2017-05-17_112454.csv
-rw-------. 1 gpadmin gpadmin  3951 May 17 11:24 startup.log
-rw-------. 1 gpadmin gpadmin   384 May 17 10:53 gpdb-2017-05-17_105359.csv
-rw-------. 1 gpadmin gpadmin   384 May 16 13:27 gpdb-2017-05-16_132717.csv
-rw-------. 1 gpadmin gpadmin   384 May 16 11:18 gpdb-2017-05-16_111820.csv
-rw-------. 1 gpadmin gpadmin 30004 May 16 11:17 gpdb-2017-05-16_000000.csv
-rw-------. 1 gpadmin gpadmin     0 May 15 00:00 gpdb-2017-05-15_000000.csv
-rw-------. 1 gpadmin gpadmin     0 May 14 00:00 gpdb-2017-05-14_000000.csv
-rw-------. 1 gpadmin gpadmin     0 May 13 00:00 gpdb-2017-05-13_000000.csv
-rw-------. 1 gpadmin gpadmin     0 May 12 00:00 gpdb-2017-05-12_000000.csv
-rw-------. 1 gpadmin gpadmin     0 May 11 00:00 gpdb-2017-05-11_000000.csv
-rw-------. 1 gpadmin gpadmin     0 May 10 00:00 gpdb-2017-05-10_000000.csv
-rw-------. 1 gpadmin gpadmin 13073 May  9 21:14 gpdb-2017-05-09_000000.csv
-rw-------. 1 gpadmin gpadmin 18458 May  8 11:38 gpdb-2017-05-08_000000.csv
-rw-------. 1 gpadmin gpadmin     0 May  7 00:00 gpdb-2017-05-07_000000.csv
[gpadmin@00_mdw pg_log]$ more gpdb-2017-05-17_112454.csv
2017-05-17 11:24:54.936656 CST,,,p17681,th-400611552,,,,0,,,seg-1,,,,,"LOG","F0000","invalid authentication method ""127.0.0.1/28""",,,,,"line 87 of configuration file ""/home/gpadmin/gpdata/gpmaster/gpseg-1/pg_hba.conf""",,0,,"hba.c",1095,
2017-05-17 11:24:54.936871 CST,,,p17681,th-400611552,,,,0,,,seg-1,,,,,"FATAL","XX000","could not load pg_hba.conf",,,,,,,0,,"postmaster.c",1529,
[gpadmin@00_mdw pg_log]$
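The file gpdb-2017-05-17_112454.csv states the problem clearly: the pg_hba.conf configuration file contains an error. I opened /home/gpadmin/gpdata/gpmaster/gpseg-1/pg_hba.conf and commented out the line the log points at (line 87 of the configuration file, the one containing "127.0.0.1/28"):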
#local all all 127.0.0.1/28 trust
Then I started the Greenplum cluster again, and this time it came up:
[gpadmin@00_mdw pg_log]$ gpstart
20170517:11:28:20:017745 gpstart:00_mdw:gpadmin-[INFO]:-Starting gpstart with args:
20170517:11:28:20:017745 gpstart:00_mdw:gpadmin-[INFO]:-Gathering information and validating the environment...
20170517:11:28:20:017745 gpstart:00_mdw:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 4.3.10.0 build commit: f413ff3b006655f14b6b9aa217495ec94da5c96c'
20170517:11:28:20:017745 gpstart:00_mdw:gpadmin-[INFO]:-Greenplum Catalog Version: '201310150'
20170517:11:28:20:017745 gpstart:00_mdw:gpadmin-[INFO]:-Starting Master instance in admin mode
20170517:11:28:21:017745 gpstart:00_mdw:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20170517:11:28:21:017745 gpstart:00_mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20170517:11:28:21:017745 gpstart:00_mdw:gpadmin-[INFO]:-Setting new master era
20170517:11:28:21:017745 gpstart:00_mdw:gpadmin-[INFO]:-Master Started...
20170517:11:28:21:017745 gpstart:00_mdw:gpadmin-[INFO]:-Shutting down master
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[WARNING]:-Skipping startup of segment marked down in configuration: on 02_sdw directory /home/gpadmin/gpdata/gpdatam1/gpseg0 <<<<<
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[WARNING]:-Skipping startup of segment marked down in configuration: on 02_sdw directory /home/gpadmin/gpdata/gpdatam2/gpseg1 <<<<<
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[WARNING]:-Skipping startup of segment marked down in configuration: on 01_sdw directory /home/gpadmin/gpdata/gpdatam1/gpseg4 <<<<<
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[WARNING]:-Skipping startup of segment marked down in configuration: on 01_sdw directory /home/gpadmin/gpdata/gpdatam2/gpseg5 <<<<<
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:---------------------------
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:-Master instance parameters
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:---------------------------
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:-Database = template1
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:-Master Port = 5432
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:-Master directory = /home/gpadmin/gpdata/gpmaster/gpseg-1
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:-Timeout = 600 seconds
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:-Master standby = Off
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:---------------------------------------
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:-Segment instances that will be started
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:---------------------------------------
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:- Host     Datadir                                 Port   Role
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:- 01_sdw   /home/gpadmin/gpdata/gpdatap1/gpseg0   40000  Primary
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:- 01_sdw   /home/gpadmin/gpdata/gpdatap2/gpseg1   40001  Primary
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:- 02_sdw   /home/gpadmin/gpdata/gpdatap1/gpseg2   40000  Primary
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:- 03_sdwm  /home/gpadmin/gpdata/gpdatam1/gpseg2   50000  Mirror
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:- 02_sdw   /home/gpadmin/gpdata/gpdatap2/gpseg3   40001  Primary
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:- 03_sdwm  /home/gpadmin/gpdata/gpdatam2/gpseg3   50001  Mirror
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:- 03_sdwm  /home/gpadmin/gpdata/gpdatap1/gpseg4   40000  Primary
20170517:11:28:23:017745 gpstart:00_mdw:gpadmin-[INFO]:- 03_sdwm  /home/gpadmin/gpdata/gpdatap2/gpseg5   40001  Primary

Continue with Greenplum instance startup Yy|Nn (default=N):
> y
20170517:11:28:25:017745 gpstart:00_mdw:gpadmin-[INFO]:-Commencing parallel primary and mirror segment instance startup, please wait...
...
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[INFO]:-Process results...
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[INFO]:-----------------------------------------------------
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[INFO]:- Successful segment starts = 8
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[INFO]:- Failed segment starts = 0
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[WARNING]:-Skipped segment starts (segments are marked down in configuration) = 4 <<<<<<<<
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[INFO]:-----------------------------------------------------
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[INFO]:-
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[INFO]:-Successfully started 8 of 8 segment instances, skipped 4 other segments
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[INFO]:-----------------------------------------------------
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[WARNING]:-****************************************************************************
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[WARNING]:-There are 4 segment(s) marked down in the database
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[WARNING]:-To recover from this current state, review usage of the gprecoverseg
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[WARNING]:-management utility which will recover failed segment instance databases.
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[WARNING]:-****************************************************************************
20170517:11:28:28:017745 gpstart:00_mdw:gpadmin-[INFO]:-Starting Master instance 00_mdw directory /home/gpadmin/gpdata/gpmaster/gpseg-1
20170517:11:28:29:017745 gpstart:00_mdw:gpadmin-[INFO]:-Command pg_ctl reports Master 00_mdw instance active
20170517:11:28:30:017745 gpstart:00_mdw:gpadmin-[INFO]:-No standby master configured. skipping...
20170517:11:28:30:017745 gpstart:00_mdw:gpadmin-[WARNING]:-Number of segments not attempted to start: 4
20170517:11:28:30:017745 gpstart:00_mdw:gpadmin-[INFO]:-Check status of database with gpstate utility
[gpadmin@00_mdw pg_log]$
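By the way, the interesting part is that Greenplum's key error message was not in the .log file at all. It was recorded in a .csv file in the same directory, which really surprised me, haha.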
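Since the decisive message only showed up in a CSV log, a small filter can save some reading next time. The following is a minimal Python sketch, not a Greenplum tool: the function name `suspicious_rows` is made up for illustration, and the column positions (severity, SQLSTATE and message at 0-based fields 16, 17 and 18) are an assumption read off the sample rows in this post, so verify them against your own log files. It keeps every row whose SQLSTATE is not "00000", which catches both the FATAL row and the LOG-level row that names the bad pg_hba.conf line.

```python
import csv
import io

# Assumed 0-based column positions, inferred from the sample rows in this post.
SEVERITY, SQLSTATE, MESSAGE = 16, 17, 18


def suspicious_rows(csv_text):
    """Return (timestamp, severity, sqlstate, message) for each row
    whose SQLSTATE is not "00000" (successful completion)."""
    found = []
    # newline="" lets the csv module handle line breaks inside quoted fields.
    for row in csv.reader(io.StringIO(csv_text, newline="")):
        if len(row) <= MESSAGE:
            continue  # too short to be a complete log row
        if row[SQLSTATE] != "00000":
            found.append((row[0], row[SEVERITY], row[SQLSTATE], row[MESSAGE]))
    return found
```

To use it, read each gpdb-*.csv file in the pg_log directory (newest first, as `ll -t` does above) and pass its text to `suspicious_rows`. Filtering on SQLSTATE rather than on severity is a deliberate choice here: in this incident the most informative row was logged at LOG level, so a filter for FATAL alone would have missed it.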
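Finally, the root-cause analysis: why did this one 127 entry stop Greenplum from starting? Looking at pg_hba.conf, I could think of the following possible causes:
(1) There was already an entry for 127.0.0.1/28, so the two conflicted. (This is the less likely explanation: pg_hba.conf records are matched from top to bottom and the first match wins, so overlapping entries are not an error in themselves.)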
[gpadmin@00_mdw ~]$ more /home/gpadmin/gpdata/gpmaster/gpseg-1/pg_hba.conf |grep 127
host all gpadmin 127.0.0.1/28 trust
#local all all 127.0.0.1/28 trust
[gpadmin@00_mdw ~]$
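(2) A local record can only be followed by an authentication method such as ident; it cannot carry an address like 127.0.0.1/28 before trust. This matches the logged error: a local record (Unix-domain socket) has no address field, so the fourth field, 127.0.0.1/28, is read as the authentication method and rejected as invalid.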
[gpadmin@00_mdw ~]$ more /home/gpadmin/gpdata/gpmaster/gpseg-1/pg_hba.conf |grep local |grep -v "#"
local all gpadmin ident
local replication gpadmin ident
#local all all 127.0.0.1/28 trust
[gpadmin@00_mdw ~]$
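Cause (2) can be checked mechanically before restarting. Below is a minimal Python sketch of such a check, under stated assumptions: `check_local_records` is a hypothetical helper, not part of Greenplum or PostgreSQL; it splits on whitespace and ignores quoting; and the `AUTH_METHODS` set is an illustrative list of common method names, not the exact set accepted by any particular Greenplum release. It only inspects local records, where the fourth field must be the authentication method.

```python
# Illustrative set of method names; NOT the authoritative list for any release.
AUTH_METHODS = {
    "trust", "reject", "md5", "password", "ident", "peer",
    "gss", "sspi", "krb5", "ldap", "pam", "cert", "radius",
}


def check_local_records(hba_text):
    """Return (line_number, offending_field) for each `local` record whose
    fourth field is missing or is not a recognised authentication method."""
    problems = []
    for lineno, line in enumerate(hba_text.splitlines(), start=1):
        # Drop comments, then split on whitespace (quoting is ignored here).
        fields = line.split("#", 1)[0].split()
        if not fields or fields[0] != "local":
            continue
        # Record layout: local DATABASE USER METHOD [OPTIONS]
        if len(fields) < 4:
            problems.append((lineno, ""))
        elif fields[3] not in AUTH_METHODS:
            problems.append((lineno, fields[3]))
    return problems
```

Run against the master's pg_hba.conf as it was before the fix, this would report the record `local all all 127.0.0.1/28 trust` with `127.0.0.1/28` as the offending field, which is the same field the server complained about. The host record with the same address passes untouched, because host records do take an address.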