内容简介:之前在1, db alert log of primary sidetip:
之前在 Oracle 12c 关于密码(password)的几个新特性小结 曾经记录过12C 密码文件可以放到ASM DG的特性, 最近处理一个案例时就遇到了,简单记录一套修复他人搭建的问题RAC+ DG的过程, 环境Oracle 12cR2 2-nodes RAC+DG。
1, db alert log of primary side
2018-12-25 09:32:18.111000 +08:00 Error 1034 received logging on to the standby 2018-12-25 09:37:18.335000 +08:00 Error 1034 received logging on to the standby
tip:
提示目标实例未启动,后确认备库有一个节点1启动,节点2关闭,理论上Standby 端不应该因部分实例停而影响DG的传输。
2. 确认DG 配置 on primary side
SQL>show parameter log_archive NAME VALUES ----------------- -------------------------------------------------------------------------------------------- log_archive_dest_2 SERVICE=STD_ANBOB2 LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=stdANBOB oracle@DB01:/home/oracle> tnsping STD_ANBOB2 Used TNSNAMES adapter to resolve the alias Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 133.96.43.38)(PORT = 1531))) (CONNECT_DATA = (SID = ANBOB2))) OK (0 msec)
tip:
使用了非默认的1531 port, 但是配置的只是节点2的IP,而且还使用的SID而不是service_name, 后来发现standby side的实例监听中只配置了SERVICE_NAME=SID的服务,无同名服务说明静态监听配置也是不正确的,需要先修复standby side上增加共同的服务名。
3, 修正primary side 传输SERVICE配置
oracle@DB01:/home/oracle> tnsping STD_ANBOB Used TNSNAMES adapter to resolve the alias Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 133.96.43.137)(PORT = 1531)) (ADDRESS = (PROTOCOL = TCP)(HOST = 133.96.43.138)(PORT = 1531))) (CONNECT_DATA = (service_name = ANBOB.com))) OK (0 msec) SQL> alter system set log_archive_dest_2='SERVICE=STD_ANBOB LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=stdANBOB'; # Primary side db alert log Error 1017 received logging on to the standby ------------------------------------------------------------ Check that the primary and standby are using a password file and remote_login_passwordfile is set to SHARED or EXCLUSIVE, and that the SYS password is same in the password files. returning error ORA-16191 ------------------------------------------------------------ TT04: Attempting destination LOG_ARCHIVE_DEST_2 network reconnect (16191) TT04: Destination LOG_ARCHIVE_DEST_2 network reconnect abandoned Errors in file /oracle/app/oracle/diag/rdbms/ANBOB/anbob/trace/anbob_tt04_47574.trc: ORA-16191: Primary log shipping client not logged on standby Error 16191 for archive log file 5 to 'STD_ANBOB'
Tip:
增加了实例1,使用了服务名,支持STANDBY上的RAC部分实例关闭后的正常传输, 但是主库日志里又出现了ora-1017 ora-16191错误
4, STANDBY side ORA-7445 [kzsrgavs()+995]
# db alert log 2018-12-25 10:48:21.996000 +08:00 Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x7FFD269AF8A7] [PC:0x2897843, kzsrgavs()+995] [flags: 0x0, count: 1] Errors in file /oracle/app/oracle/diag/rdbms/stdANBOB/ANBOB1/trace/ANBOB1_pr00_36293.trc (incident=64809): ORA-07445: exception encountered: core dump [kzsrgavs()+995] [SIGSEGV] [ADDR:0x7FFD269AF8A7] [PC:0x2897843] [Address not mapped to object] [] Incident details in: /oracle/app/oracle/diag/rdbms/stdANBOB/ANBOB1/incident/incdir_64809/ANBOB1_pr00_36293_i64809.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Dumping diagnostic data in directory=[cdmp_20181225104822], requested by (instance=1, osid=36293 (PR00)), summary=[incident=64809]. 2018-12-25 10:48:51.764000 +08:00 alter database recover managed standby database cancel 2018-12-25 10:48:53.175000 +08:00 Logmerger died, shutting down parallel recovery slaves 2018-12-25 10:49:26.184000 +08:00 CLMN: delete dead process - failed 2018-12-25 10:53:59.078000 +08:00 ORA-1013 signalled during: alter database recover managed standby database cancel... # ora-7445 trace file *** 2018-12-25T10:48:22.063388+08:00 dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x3, level=3, mask=0x0) [TOC00004] 3> ***** SQL Statement (None) ***** Current SQL information unavailable - no cursor. 3< ***** current_sql_statement ***** [TOC00004-END] [TOC00005] 3> ***** Call Stack Trace ***** ksedst()+119 < dbkedDefDump()+1200 < ksedmp()+259 < ssexhd()+3188 < sslssSynchHdlr()+39 < sslsshandler()+118 < __sighandler() < kzsrgavs()+995 < krsu_pre11_pwd_conn() < krsu_oci_est_sec_conn()+1594 < krsu_wupi_log()+446 < krsu_upi_logc()+366 < krsu_connect()+104 < krsf_send_fal_request()+607 < krsf_get_gap_sequence()+1266 < krr_read_disk()+446 < krr_read_buffer() < krr_parse_redo() < krr_do_media_recovery()+6603 < krddmr()+1356 < krd_do_mrp()+1078 < krd_logmerger_driver()+6425 < krp_slave_main() < ksvrdp_int()+2010 < opirip()+602 < opidrv()+602 < sou2o()+145 < opimai_real()+202 < ssthrdmain()+417 < main()+262
tips:
Standby 实例具然出现了ora-7445 [kzsrgavs()+995].
KZSRGAVS >>>>> (Kzsr)Gavs – Kernel Security Privileges Roles/Remote Password File ??
KRSU_PRE11_PWD_CONN >>>>>(Krs)U – Kernel Recovery Standby/Dataguard
猜测DG 的db password file不一致,说明当时备库的节点1可能密码文件和主库和备库2节点不一致,这点可以使用md5sum验证文件。于是习惯性的从Primary节点1的$ORACLE_HOME/dbs下, scp过去一份PWD file到standby 下,此时执行alter database recover managed standby database cancel操作可能会hang, 可以shutdown abort重新启动. 再次启动log应用。
# standby side db alert log
Managed Standby Recovery starting Real Time Apply 2018-12-25 11:07:57.610000 +08:00 Parallel Media Recovery started with 80 slaves Media Recovery Log +ARCHDG/STDANBOB/ARCHIVELOG/2018_12_25/thread_2_seq_1906.301.995795603 Media Recovery Log +ARCHDG/STDANBOB/ARCHIVELOG/2018_12_24/thread_1_seq_5110.280.995721757 Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x7FFF32BC5EC7] [PC:0x2897843, kzsrgavs()+995] [flags: 0x0, count: 1] Errors in file /oracle/app/oracle/diag/rdbms/stdANBOB/ANBOB1/trace/ANBOB1_arc3_76413.trc (incident=128681): ORA-07445: exception encountered: core dump [kzsrgavs()+995] [SIGSEGV] [ADDR:0x7FFF32BC5EC7] [PC:0x2897843] [Address not mapped to object] [] Incident details in: /oracle/app/oracle/diag/rdbms/stdANBOB/ANBOB1/incident/incdir_128681/ANBOB1_arc3_76413_i128681.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. 2018-12-25 11:07:59.047000 +08:00 Dumping diagnostic data in directory=[cdmp_20181225110759], requested by (instance=1, osid=76413 (ARC3)), summary=[incident=128681]. 2018-12-25 11:08:00.972000 +08:00 ARCH: Detected ARCH process failure ARCH: STARTING ARCH PROCESSES Starting background process ARC3 ARC3 started with pid=167, OS id=29684 ARC3: Archival started ARCH: STARTING ARCH PROCESSES COMPLETE Media Recovery of Online Log [Thread=1, Seq=5111] Recovery of Online Redo Log: Thread 1 Group 52 Seq 5111 Reading mem 0 Mem# 0: +DATADG/STDANBOB/ONLINELOG/group_52.260.995277121 2018-12-25 11:09:01.424000 +08:00 Standby crash recovery failed to bring standby database to a consistent point because needed redo hasn't arrived yet. MRP: Wait timeout: thread 1 sequence# 5111 Media Recovery user canceled with status 16016 Managed Standby Recovery not using Real Time Apply Standby Crash Recovery aborted due to error 10877. Errors in file /oracle/app/oracle/diag/rdbms/stdANBOB/ANBOB1/trace/ANBOB1_ora_29171.trc: ORA-10877: error signaled in parallel recovery slave ORA-10877: error signaled in parallel recovery slave Recovery interrupted! Some recovered datafiles maybe left media fuzzy Media recovery may continue but open resetlogs may fail Errors in file /oracle/app/oracle/diag/rdbms/stdANBOB/ANBOB1/trace/ANBOB1_pr00_29289.trc: ORA-16043: Redo apply has been canceled. ORA-16016: archived log for thread 1 sequence# 5111 unavailable Completed Standby Crash Recovery.
Tip:
当时还以为redo archived log sequence# 5111有问题,手动传输应用一下。
5. 手动传输GAP ARCHIVED LOG file
RMAN> backup archivelog sequence between 5111 and 5122; # scp backuppiece file to standby side RMAN> catalog backuppiece '/home/oracle/bftllg3h_1_1'; RMAN> restore archivelog sequence between 5111 and 5122;
tip:
Standby 重启应用发现可以往下继续应用,但是很快把手动传输的日志应用完后,又ORA-7445挂起了, 此时判断还应该是密码文件问题。 难道密码文件不对,看来是忽略了12C 的password file in ASM的特性。
6, password file in ASM
oracle@DB01:/home/oracle> srvctl config database -d ANBOB Database unique name: ANBOB Database name: ANBOB Oracle home: /oracle/app/oracle/product/12.2.0/db_1 Oracle user: oracle Spfile: +DATADG/ANBOB/PARAMETERFILE/spfile.281.992037707 Password file: +DATADG/ANBOB/PASSWORD/pwdANBOB.269.992036729 Domain: com Start options: open Stop options: immediate Database role: PRIMARY Management policy: AUTOMATIC Server pools: Disk Groups: DATADG,ARCHDG Mount point paths: Services: Type: RAC Start concurrency: Stop concurrency: OSDBA group: dba OSOPER group: Database instances: ANBOB1,ANBOB2 ...
Note:
Password file记录确认了当前的实例密码文件是在ASM DiskGroup中。如果该行记录值为空表明使用使用之前的本地ORACLE_HOME/dbs下的目录文件,现在解决进来就简单了。
grid@DB01:/home/grid>asmcmd ASMCMD> cp +DATADG/ANBOB/PASSWORD/pwdANBOB.269.992036729 /tmp/orapwd.file copying +DATADG/ANBOB/PASSWORD/pwdANBOB.269.992036729 -> /tmp/orapwd.file grid@DB01:/home/grid>ls -l /tmp/orapwd.file -rw-r----- 1 grid oinstall 3584 Dec 25 14:15 /tmp/orapwd.file # scp the orapwd.file to standby instance both.
7. REGISTER Standby DB in CRS
oracle@qdord01:/home/oracle> srvctl add database -db stdANBOB -oraclehome $ORACLE_HOME -spfile +DATADG/STDANBOB/PARAMETERFILE/spfile.435.995277297 -role PHYSICAL_STANDBY -startoption MOUNT -dbtype RAC -dbname ANBOB -diskgroup "datadg,archdg" oracle@qdord01:/home/oracle> srvctl add instance -db stdANBOB -instance ANBOB1 -node qdord01 oracle@qdord01:/home/oracle> srvctl add instance -db stdANBOB -instance ANBOB2 -node qdord02 grid@qdord01:/home/grid> srvctl start instance -d stdANBOB -n qdord01 grid@qdord01:/home/grid> srvctl start instance -d stdANBOB -n qdord02
Conclusion:
12C中为了在不同实例间统一的密码管理, 支持把DB密码(ASM password same)存储到ASM DISKGROUP中,这样在维护DG环境时,当同步密码文件时就要先确认一下密码的位置, 同样DG端也可以把密码存储到ASM中,然后使用srvctl modify database修改pwd路径. 这个案例通过在标准化DG配置中因密码不一致产生了各种错误,后重新同步密码文件解决,提醒有相同问题的同学可以少走弯路。
— enjoy —
以上所述就是小编给大家介绍的《Oracle12c R2注意事项:DB password file in ASM (DataGuard环境重建PWD)》,希望对大家有所帮助,如果大家有任何疑问请给我留言,小编会及时回复大家的。在此也非常感谢大家对 码农网 的支持!
猜你喜欢:本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们。
Design for Hackers
David Kadavy / Wiley / 2011-10-18 / USD 39.99
Discover the techniques behind beautiful design?by deconstructing designs to understand them The term ?hacker? has been redefined to consist of anyone who has an insatiable curiosity as to how thin......一起来看看 《Design for Hackers》 这本书的介绍吧!