PostgreSQL: analyzing the pg_ctl start timeout

Category: Databases · Published: 5 years ago


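I. The problem

pg_ctl start exits with the error "pg_ctl: server did not start in time". How long is the timeout, and between which two points of the startup sequence is it measured?

II. Analysis: where the message is printed, as shown by the do_start function in the code section below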

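1. After pg_ctl start launches the PostgreSQL main process (the postmaster) through start_postmaster, it checks the postmaster.pid file every 0.1 s (10 checks per second) to see whether the status ready or standby has been written.

2. With the default settings it performs at most 600 checks, that is, it waits up to 60 s after launching the main process. If neither ready nor standby has been written by then, it prints the message above and exits.

3. The default wait time is 60 s. If pg_ctl start -t is given, the specified number of seconds is used instead.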
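
The arithmetic behind the 60 s limit can be checked with a small sketch. The constant names WAITS_PER_SEC and USEC_PER_SEC are taken from the pg_ctl loop quoted in the code section; DEFAULT_WAIT and the helpers max_polls and poll_interval_usec are hypothetical names introduced only for this illustration.

```c
#include <assert.h>

#define USEC_PER_SEC  1000000L
#define WAITS_PER_SEC 10    /* polls per second, as in the pg_ctl loop */
#define DEFAULT_WAIT  60    /* assumed name for the default timeout, in seconds */

/* Hypothetical helper: number of loop iterations for a given timeout. */
static long max_polls(int wait_seconds)
{
    assert(wait_seconds >= 0);
    return (long) wait_seconds * WAITS_PER_SEC;
}

/* Hypothetical helper: sleep between two polls, in microseconds. */
static long poll_interval_usec(void)
{
    return USEC_PER_SEC / WAITS_PER_SEC;
}
```

With the default of 60 s this gives 600 polls spaced 100000 microseconds (0.1 s) apart. A call such as pg_ctl start -t 120 would double the number of polls and leave the interval unchanged.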

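III. When ready/standby is written to postmaster.pid

1. On a primary, regardless of whether hot standby is configured:

1) When the startup process finishes recovery, it calls proc_exit and exits. Its termination delivers SIGCHLD to the main process.

2) On receiving the signal, the main process runs its SIGCHLD handler, reaper, which calls AddToDataDirLockFile to write ready to postmaster.pid.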

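2. On a standby (recovery.conf exists in the data directory) with hot standby enabled, where the consistent point has not been reached before replay actually begins and is reached while WAL is being replayed:

1) The startup process sends PMSIGNAL_RECOVERY_STARTED to the main process, which runs the signal handler sigusr1_handler and sets pmState = PM_RECOVERY.

2) Before each read of the next xlog record, CheckRecoveryConsistency is called to check for consistency:

2.1 Once the consistent state is reached, the startup process sends PMSIGNAL_BEGIN_HOT_STANDBY to the main process, which calls sigusr1_handler->AddToDataDirLockFile to write ready to postmaster.pid.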

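3. On a standby (recovery.conf exists in the data directory) with hot standby enabled, where the consistent point has not been reached before replay actually begins and is still not reached while the local WAL is replayed:

1) The startup process sends PMSIGNAL_RECOVERY_STARTED to the main process, which runs the signal handler sigusr1_handler and sets pmState = PM_RECOVERY.

2) Before each read of the next xlog record, CheckRecoveryConsistency is called to check for consistency, but the consistent state is not reached.

3) When replay of the local WAL completes and the WAL source is switched, CheckRecoveryConsistency is called again:

3.1 Once the consistent state is reached, the startup process sends PMSIGNAL_BEGIN_HOT_STANDBY to the main process, which calls sigusr1_handler->AddToDataDirLockFile to write ready to postmaster.pid.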

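4. On a standby (recovery.conf exists in the data directory) with hot standby enabled, where the consistent point has already been reached before replay actually begins:

1) The startup process sends PMSIGNAL_RECOVERY_STARTED to the main process, which runs the signal handler sigusr1_handler and sets pmState = PM_RECOVERY.

2) CheckRecoveryConsistency finds the state consistent and sends PMSIGNAL_BEGIN_HOT_STANDBY to the main process, which calls sigusr1_handler->AddToDataDirLockFile to write ready to postmaster.pid.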

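5. On a standby (recovery.conf exists in the data directory) with hot standby not enabled:

1) The startup process sends PMSIGNAL_RECOVERY_STARTED to the main process.

2) On receiving the signal, the main process writes standby to postmaster.pid and sets pmState = PM_RECOVERY.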
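
The five cases above reduce to two inputs: whether recovery.conf is present and whether hot standby is enabled. The following is a minimal summary sketch, not PostgreSQL code. The StartupConfig type and the function status_that_ends_wait are hypothetical names, and the returned strings are the status words as this article spells them.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical description of how the server is configured. */
typedef struct
{
    bool has_recovery_conf;   /* recovery.conf present: the server is a standby */
    bool hot_standby;         /* hot standby enabled */
} StartupConfig;

/* Hypothetical summary: which status word ends the pg_ctl wait. */
static const char *status_that_ends_wait(const StartupConfig *cfg)
{
    assert(cfg != NULL);
    if (!cfg->has_recovery_conf)
        return "ready";       /* case 1: written by reaper after startup exits */
    if (cfg->hot_standby)
        return "ready";       /* cases 2-4: written once consistency is reached */
    return "standby";         /* case 5: written when recovery starts */
}
```

The sketch only says which word is written. How long it takes to get there differs: in cases 2 and 3 the wait lasts until replay reaches the consistent point, which is where a 60 s limit is most likely to be exceeded.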

IV. Code analysis

1. The pg_ctl start flow

do_start->
    pm_pid = start_postmaster();
    if (do_wait){
        print_msg(_("waiting for server to start..."));
        switch (wait_for_postmaster(pm_pid, false)){
            case POSTMASTER_READY:
                print_msg(_(" done\n"));
                print_msg(_("server started\n"));
                break;
            case POSTMASTER_STILL_STARTING:
                print_msg(_(" stopped waiting\n"));
                write_stderr(_("%s: server did not start in time\n"), progname);
                exit(1);
                break;
            case POSTMASTER_FAILED:
                print_msg(_(" stopped waiting\n"));
                write_stderr(_("%s: could not start server\n" "Examine the log output.\n"), progname);
                exit(1);
                break;
        }
    }else
        print_msg(_("server starting\n"));

wait_for_postmaster->
    for (i = 0; i < wait_seconds * WAITS_PER_SEC; i++){
        if ((optlines = readfile(pid_file, &numlines)) != NULL && numlines >= LOCK_FILE_LINE_PM_STATUS){
            pmpid = atol(optlines[LOCK_FILE_LINE_PID - 1]);
            pmstart = atol(optlines[LOCK_FILE_LINE_START_TIME - 1]);
            if (pmstart >= start_time - 2 && pmpid == pm_pid){
                char      *pmstatus = optlines[LOCK_FILE_LINE_PM_STATUS - 1];
                if (strcmp(pmstatus, PM_STATUS_READY) == 0 || strcmp(pmstatus, PM_STATUS_STANDBY) == 0){
                    /* postmaster is done starting up */
                    free_readfile(optlines);
                    return POSTMASTER_READY;
                }
            }
        }
        free_readfile(optlines);
        if (waitpid((pid_t) pm_pid, &exitstatus, WNOHANG) == (pid_t) pm_pid)
            return POSTMASTER_FAILED;
        pg_usleep(USEC_PER_SEC / WAITS_PER_SEC);
    }
    /* out of patience; report that postmaster is still starting up */
    return POSTMASTER_STILL_STARTING;
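
The decisive test inside the loop is the comparison of the status line with ready or standby. Here is a standalone sketch of that test. The function status_line_means_started is a hypothetical helper, and trimming trailing blanks and newlines is an assumption made so that the sketch does not depend on how the real file pads or terminates the line.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: true when the status line, ignoring trailing
 * blanks and line terminators, is exactly "ready" or "standby".
 */
static bool status_line_means_started(const char *line)
{
    size_t len;

    assert(line != NULL);
    len = strlen(line);
    while (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\n' ||
                       line[len - 1] == '\r'))
        len--;

    return (len == 5 && strncmp(line, "ready", 5) == 0) ||
           (len == 7 && strncmp(line, "standby", 7) == 0);
}
```

Any other content of the status line keeps the caller polling until either the postmaster exits or the timeout expires.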

2. The server main process and its signal handlers

PostmasterMain->
    pqsignal_no_restart(SIGUSR1, sigusr1_handler);  /* message from child process */
    pqsignal_no_restart(SIGCHLD, reaper);  /* handle child termination */

// in the startup process:
...
StartupXLOG();
...
proc_exit(0);  // the startup process exits; its termination delivers SIGCHLD to the main process

reaper->  // SIGCHLD handler: a child process terminated or stopped
    AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_READY);

Signals received by the postmaster process:

sigusr1_handler->
    if (CheckPostmasterSignal(PMSIGNAL_RECOVERY_STARTED) &&
        pmState == PM_STARTUP && Shutdown == NoShutdown){
        CheckpointerPID = StartCheckpointer();
        BgWriterPID = StartBackgroundWriter();
        if (XLogArchivingAlways())
            PgArchPID = pgarch_start();
        // hot_standby enabled in postgresql.conf means
        // connections are allowed during recovery
        if (!EnableHotStandby){
            // write standby to postmaster.pid: the server is up but does not accept connections
            AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STANDBY);
        }
        pmState = PM_RECOVERY;
    }
    if (CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) &&
        pmState == PM_RECOVERY && Shutdown == NoShutdown){
        PgStatPID = pgstat_start();
        // write ready to postmaster.pid: connections are allowed
        AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_READY);
        pmState = PM_HOT_STANDBY;
    }
    ...
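
The two branches of the handler form a small state machine. The sketch below simulates it outside the server. Every Sim* name is hypothetical, a status write is modelled as a field assignment, and the Shutdown check and the child-process launches are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { SIM_PM_STARTUP, SIM_PM_RECOVERY, SIM_PM_HOT_STANDBY } SimPmState;
typedef enum { SIM_RECOVERY_STARTED, SIM_BEGIN_HOT_STANDBY } SimSignal;
typedef enum { SIM_STATUS_NONE, SIM_STATUS_STANDBY, SIM_STATUS_READY } SimStatus;

/* Hypothetical stand-in for the postmaster state touched by the handler. */
typedef struct
{
    SimPmState state;
    bool       hot_standby;   /* stands in for EnableHotStandby */
    SimStatus  status;        /* last status "written" to the pid file */
} SimPostmaster;

/* Apply one signal, following the two branches of sigusr1_handler. */
static void sim_handle_signal(SimPostmaster *pm, SimSignal sig)
{
    assert(pm != NULL);
    if (sig == SIM_RECOVERY_STARTED && pm->state == SIM_PM_STARTUP)
    {
        if (!pm->hot_standby)
            pm->status = SIM_STATUS_STANDBY;   /* up, no connections */
        pm->state = SIM_PM_RECOVERY;
    }
    else if (sig == SIM_BEGIN_HOT_STANDBY && pm->state == SIM_PM_RECOVERY)
    {
        pm->status = SIM_STATUS_READY;         /* connections allowed */
        pm->state = SIM_PM_HOT_STANDBY;
    }
    /* any other combination is ignored, as in the guarded branches above */
}
```

With hot standby enabled, nothing is written until the second signal arrives, so the time pg_ctl waits is the time recovery needs to reach consistency.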

3. The startup process

StartupXLOG->
    ReadCheckpointRecord
    // ArchiveRecoveryRequested is TRUE when a recovery.conf file exists
    if (ArchiveRecoveryRequested && IsUnderPostmaster){
        PublishStartupProcessInformation();
        SetForwardFsyncRequests();
        // send PMSIGNAL_RECOVERY_STARTED to the postmaster process
        SendPostmasterSignal(PMSIGNAL_RECOVERY_STARTED);
        bgwriterLaunched = true;
    }
    CheckRecoveryConsistency();-->...
        |-- if (standbyState == STANDBY_SNAPSHOT_READY && !LocalHotStandbyActive &&
        |       reachedConsistency && IsUnderPostmaster){
        |       SpinLockAcquire(&XLogCtl->info_lck);
        |       XLogCtl->SharedHotStandbyActive = true;
        |       SpinLockRelease(&XLogCtl->info_lck);
        |       LocalHotStandbyActive = true;
        |       SendPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY);
        |-- }
    ...

After a record has been replayed, CheckRecoveryConsistency is called each time before the next record is read.
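
The "check before every read, signal only once" pattern can be sketched as a small simulation. All names here (SimRecovery, sim_check_consistency, sim_replay) are hypothetical. Consistency is reduced to comparing a replay position with a target, whereas the real function also requires standbyState == STANDBY_SNAPSHOT_READY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified recovery state. */
typedef struct
{
    unsigned long consistency_target;  /* position at which data is consistent */
    unsigned long replayed_upto;       /* end position of the last replayed record */
    bool          hot_standby_active;  /* plays the role of LocalHotStandbyActive */
    int           signals_sent;        /* counts simulated BEGIN_HOT_STANDBY signals */
} SimRecovery;

/* Simplified consistency check: signals at most once. */
static void sim_check_consistency(SimRecovery *r)
{
    assert(r != NULL);
    if (!r->hot_standby_active && r->replayed_upto >= r->consistency_target)
    {
        r->hot_standby_active = true;
        r->signals_sent++;
    }
}

/*
 * Replay n records. Returns how many records had been replayed when the
 * signal was sent, or -1 if consistency was never reached.
 */
static long sim_replay(SimRecovery *r, const unsigned long *end_pos, size_t n)
{
    long   signaled_after = -1;
    size_t i;

    for (i = 0; i <= n; i++)
    {
        sim_check_consistency(r);          /* before reading record i, and once at the end */
        if (signaled_after < 0 && r->signals_sent == 1)
            signaled_after = (long) i;
        if (i < n)
            r->replayed_upto = end_pos[i]; /* "replay" record i */
    }
    return signaled_after;
}
```

With records ending at 100, 200, 300 and 400 and a target of 250, the signal is raised by the check that precedes the fourth record, after three records have been replayed. A target of 0 corresponds to case 4 of section III, where consistency already holds before the first record is read.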


Permanent link to this article: https://www.linuxidc.com/Linux/2019-02/157093.htm

