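I. The problem
pg_ctl start fails and exits with the message "pg_ctl: server did not start in time". How long is the timeout, and what interval does it cover: from which point to which stage of startup?
II. Analysis: where the message is printed (see the do_start function in the code excerpts below)
1. After pg_ctl start launches the PostgreSQL postmaster through start_postmaster, it checks the postmaster.pid file every 0.1 s (USEC_PER_SEC / WAITS_PER_SEC microseconds) to see whether the status ready or standby has been written.
2. By default it performs 600 checks, so it waits at most 60 s after launching the postmaster. If neither ready nor standby has been written by then, it prints the message above and exits.
3. The default wait is 60 s. If a wait time is given with pg_ctl start -t, that value is used instead.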
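III. When ready/standby is written to postmaster.pid
1. On a primary, regardless of whether hot standby is set
  1) When the startup process finishes recovery, it calls proc_exit and exits; the postmaster then receives SIGCHLD for the terminated child.
  2) The postmaster's SIGCHLD handler, reaper, calls AddToDataDirLockFile to write ready to postmaster.pid.
2. On a standby (recovery.conf exists in the data directory) with hot standby set, where the consistent point has not been reached before replay begins but is reached while replaying local WAL
  1) The startup process sends PMSIGNAL_RECOVERY_STARTED to the postmaster; the postmaster's handler sigusr1_handler sets pmState = PM_RECOVERY.
  2) CheckRecoveryConsistency is called before each next xlog record is read:
    2.1 Once the consistent state is reached, the startup process sends PMSIGNAL_BEGIN_HOT_STANDBY to the postmaster, which calls sigusr1_handler -> AddToDataDirLockFile to write ready to postmaster.pid.
3. On a standby (recovery.conf exists in the data directory) with hot standby set, where the consistent point has not been reached before replay begins and is not reached while replaying local WAL either
  1) The startup process sends PMSIGNAL_RECOVERY_STARTED to the postmaster; the postmaster's handler sigusr1_handler sets pmState = PM_RECOVERY.
  2) CheckRecoveryConsistency is called before each next xlog record is read, but in this case none of those checks finds the consistent state.
  3) When replay of the local WAL is complete and the WAL source is switched, CheckRecoveryConsistency is called again:
    3.1 Once the consistent state is reached, the startup process sends PMSIGNAL_BEGIN_HOT_STANDBY to the postmaster, which calls sigusr1_handler -> AddToDataDirLockFile to write ready to postmaster.pid.
4. On a standby (recovery.conf exists in the data directory) with hot standby set, where the consistent point has already been reached before replay begins
  1) The startup process sends PMSIGNAL_RECOVERY_STARTED to the postmaster; the postmaster's handler sigusr1_handler sets pmState = PM_RECOVERY.
  2) CheckRecoveryConsistency finds the consistent state and sends PMSIGNAL_BEGIN_HOT_STANDBY to the postmaster, which calls sigusr1_handler -> AddToDataDirLockFile to write ready to postmaster.pid.
5. On a standby (recovery.conf exists in the data directory) without hot standby
  1) The startup process sends PMSIGNAL_RECOVERY_STARTED to the postmaster.
  2) On receiving the signal, the postmaster writes standby to postmaster.pid and sets pmState = PM_RECOVERY.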
IV. Code analysis
1. pg_ctl start flow
do_start->
    pm_pid = start_postmaster();
    if (do_wait){
        print_msg(_("waiting for server to start..."));
        switch (wait_for_postmaster(pm_pid, false)){
            case POSTMASTER_READY:
                print_msg(_(" done\n"));
                print_msg(_("server started\n"));
                break;
            case POSTMASTER_STILL_STARTING:
                print_msg(_(" stopped waiting\n"));
                write_stderr(_("%s: server did not start in time\n"), progname);
                exit(1);
                break;
            case POSTMASTER_FAILED:
                print_msg(_(" stopped waiting\n"));
                write_stderr(_("%s: could not start server\n" "Examine the log output.\n"), progname);
                exit(1);
                break;
        }
    }else
        print_msg(_("server starting\n"));
wait_for_postmaster->
    for (i = 0; i < wait_seconds * WAITS_PER_SEC; i++){
        if ((optlines = readfile(pid_file, &numlines)) != NULL && numlines >= LOCK_FILE_LINE_PM_STATUS){
            pmpid = atol(optlines[LOCK_FILE_LINE_PID - 1]);
            pmstart = atol(optlines[LOCK_FILE_LINE_START_TIME - 1]);
            if (pmstart >= start_time - 2 && pmpid == pm_pid){
                char *pmstatus = optlines[LOCK_FILE_LINE_PM_STATUS - 1];
                if (strcmp(pmstatus, PM_STATUS_READY) == 0 || strcmp(pmstatus, PM_STATUS_STANDBY) == 0){
                    /* postmaster is done starting up */
                    free_readfile(optlines);
                    return POSTMASTER_READY;
                }
            }
        }
        free_readfile(optlines);
        if (waitpid((pid_t) pm_pid, &exitstatus, WNOHANG) == (pid_t) pm_pid)
            return POSTMASTER_FAILED;
        pg_usleep(USEC_PER_SEC / WAITS_PER_SEC);
    }
    /* out of patience; report that postmaster is still starting up */
    return POSTMASTER_STILL_STARTING;
2. Postmaster and its signal handlers
PostmasterMain->
    pqsignal_no_restart(SIGUSR1, sigusr1_handler); /* message from child process */
    pqsignal_no_restart(SIGCHLD, reaper); /* handle child termination */
    ...
StartupProcessMain-> // runs in the startup child process
    StartupXLOG();
    ...
    proc_exit(0); // the startup process exits; the postmaster receives SIGCHLD
reaper-> // SIGCHLD handler: a child process terminated or stopped
    AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_READY);
Signals received by the postmaster:
sigusr1_handler->
    if (CheckPostmasterSignal(PMSIGNAL_RECOVERY_STARTED) &&
        pmState == PM_STARTUP && Shutdown == NoShutdown){
        CheckpointerPID = StartCheckpointer();
        BgWriterPID = StartBackgroundWriter();
        if (XLogArchivingAlways())
            PgArchPID = pgarch_start();
        // hot_standby set to on in postgresql.conf
        // means connections are allowed during recovery
        if (!EnableHotStandby){
            // write standby to postmaster.pid: the server is up but does not accept connections
            AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STANDBY);
        }
        pmState = PM_RECOVERY;
    }
    if (CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) &&
        pmState == PM_RECOVERY && Shutdown == NoShutdown){
        PgStatPID = pgstat_start();
        // write ready to postmaster.pid: connections are allowed
        AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_READY);
        pmState = PM_HOT_STANDBY;
    }
    ...
3. Startup process
StartupXLOG->
    ReadCheckpointRecord
    if (ArchiveRecoveryRequested && IsUnderPostmaster){ // ArchiveRecoveryRequested is TRUE when recovery.conf exists
        PublishStartupProcessInformation();
        SetForwardFsyncRequests();
        // send PMSIGNAL_RECOVERY_STARTED to the postmaster
        SendPostmasterSignal(PMSIGNAL_RECOVERY_STARTED);
        bgwriterLaunched = true;
    }
    CheckRecoveryConsistency();-->...
        |-- if (standbyState == STANDBY_SNAPSHOT_READY && !LocalHotStandbyActive &&
        |       reachedConsistency && IsUnderPostmaster){
        |       SpinLockAcquire(&XLogCtl->info_lck);
        |       XLogCtl->SharedHotStandbyActive = true;
        |       SpinLockRelease(&XLogCtl->info_lck);
        |       LocalHotStandbyActive = true;
        |       SendPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY);
        |-- }
    ...
After a record is replayed, CheckRecoveryConsistency is called before each next record is read.