
On standby:

    RFS[5]: Possible network disconnect with primary database
    Aborting archivelog file creation: /oraarch/cersaidb/1_9036_797271941.arc
    If this a network disconnect, then this archivelog will be fetched again
    by GAP resolution mechanism.
    FAL[client, MRP0]: Error 12537 connecting to pri_cersaidb for fetching gap sequence
    ORA-12537: TNS:connection closed
    Error 12505 received logging on to the standby
    ORA-12505: TNS:listener does not currently know of SID given in connect descriptor
    FAL[client]: Failed to request gap sequence
     GAP - thread 1 sequence 9036-9135
     DBID 2977243809 branch 797271941
    FAL[client]: All defined FAL servers have been attempted.
    ------------------------------------------------------------
    Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization
    parameter is defined to a value that is sufficiently large
    enough to maintain adequate log switch information to resolve
    archivelog gaps.

FAL failing to resolve archive gaps

I had been having problems with FAL failing to resolve archive gaps. The classic error message on my 10.2.0.3 physical standby looked something like this:

    FAL[client]: Failed to request gap sequence
     GAP - thread 1 sequence 11402-11403
     DBID 2878580510 branch 589635294
    FAL[client]: All defined FAL servers have been attempted.
    Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization
    parameter is defined to a value that is sufficiently large
    enough to maintain adequate log switch information to resolve
    archivelog gaps.

Now, a FAL (Fetch Archive Log) request is triggered by the MRP process detecting a gap in the available archives to apply. The server to contact to fulfil the request is determined by the FAL_SERVER parameter, which you set to an appropriate tnsnames alias. This essentially enables your standby to contact the primary database and get it to ship the appropriate missing log using the arc process.

While the standby was logging the above, nothing on the primary indicated any problem. My first thought, of course, was connectivity, but that was easily disproved.
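The gap message in the standby alert log carries the thread number and the range of missing sequences, which is handy to have in machine-readable form when checking which archived logs need to arrive. Below is a minimal sketch of pulling those values out of a line of alert log text. The function names `parse_gap` and `missing_sequences` are made up for illustration, and the pattern is based only on the message layout quoted in this post, so other Oracle versions may format the line differently.

```python
import re

# Pattern modelled on the gap line quoted in this post. The dash after
# "GAP" is treated as optional in case it is lost when the log is copied.
GAP_RE = re.compile(
    r"GAP\s*-?\s*thread\s+(?P<thread>\d+)\s+"
    r"sequence\s+(?P<low>\d+)-(?P<high>\d+)"
)


def parse_gap(line):
    """Return (thread, low_seq, high_seq) from a FAL gap line, or None."""
    m = GAP_RE.search(line)
    if m is None:
        return None
    return int(m.group("thread")), int(m.group("low")), int(m.group("high"))


def missing_sequences(line):
    """List every sequence number covered by the reported gap."""
    gap = parse_gap(line)
    if gap is None:
        return []
    _, low, high = gap
    return list(range(low, high + 1))


if __name__ == "__main__":
    sample = "GAP - thread 1 sequence 11402-11403"
    print(parse_gap(sample))          # (1, 11402, 11403)
    print(missing_sequences(sample))  # [11402, 11403]
```

Feeding it the gap line from my standby gives thread 1 and sequences 11402 and 11403, the two logs the standby kept asking for.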
So to get more information I deployed extra logging on both the primary and standby, using the log_archive_trace parameter. This parameter can trace a number of background processes and is thus useful on both the primary and the standby.

In fact it was the arcn process on the primary that really showed what was wrong, as its trace continually had the following lines:

    FAL[server]: Duplicate request detected (dbid 0 branch 127 thread 1 sequence 11402 dest STANDBY)
    FAL[server]: Request not queued in this case
    FAL[server]: Duplicate request detected (dbid 0 branch 127 thread 1 sequence 11403 dest STANDBY)
    FAL[server]: Request not queued in this case
    *** 2008-06-03 11:57:18.229
    FAL[subfal]: Waiting on response from destination for thread 1 sequence 10890.

So, what is this actually telling us? The standby is asking for 11402 and 11403, and the primary knows the standby is after these. However, the primary also thinks the standby is after 10890, and the standby, having already applied that log, does not care about receiving it. Essentially this log is blocking the FAL process from working properly.

Turns out this is a bug. We were hitting bug 5576816, with an associated MetaLink Note: 5576816.8. It seems to be related to having max_connections set on the log_archive_dest parameter (which we did).
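The reasoning above (the primary is waiting on a sequence older than anything the standby is still asking for) can be checked mechanically against an arcn trace file. The sketch below is only a heuristic of my own, not an Oracle rule: the name `find_stale_waits` is hypothetical, the patterns are taken from the trace lines quoted in this post, and "older than the lowest re-requested sequence" is an assumption about what counts as stale.

```python
import re

# Patterns modelled on the arcn trace lines quoted in this post.
DUP_RE = re.compile(
    r"FAL\[server\]: Duplicate request detected "
    r"\(.*?thread (\d+) sequence (\d+)"
)
WAIT_RE = re.compile(
    r"FAL\[subfal\]: Waiting on response from destination "
    r"for thread (\d+) sequence (\d+)"
)


def find_stale_waits(trace_lines):
    """Return sorted (thread, sequence) pairs the primary is waiting on
    that are lower than every sequence re-requested on the same thread.

    Heuristic only: a lower sequence is assumed to be one the standby
    no longer needs.
    """
    requested = {}   # thread -> set of sequences the standby keeps requesting
    waiting = set()  # (thread, sequence) pairs the primary is waiting on
    for line in trace_lines:
        m = DUP_RE.search(line)
        if m:
            requested.setdefault(int(m.group(1)), set()).add(int(m.group(2)))
            continue
        m = WAIT_RE.search(line)
        if m:
            waiting.add((int(m.group(1)), int(m.group(2))))
    return sorted(
        (t, s) for t, s in waiting
        if t in requested and s < min(requested[t])
    )


if __name__ == "__main__":
    trace = [
        "FAL[server]: Duplicate request detected (dbid 0 branch 127 thread 1 sequence 11402 dest STANDBY)",
        "FAL[server]: Request not queued in this case",
        "FAL[server]: Duplicate request detected (dbid 0 branch 127 thread 1 sequence 11403 dest STANDBY)",
        "FAL[server]: Request not queued in this case",
        "*** 2008-06-03 11:57:18.229",
        "FAL[subfal]: Waiting on response from destination for thread 1 sequence 10890.",
    ]
    print(find_stale_waits(trace))  # [(1, 10890)]
```

On the trace from my primary this flags thread 1 sequence 10890, the request that was holding everything up.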
As an attempted workaround I tried setting log_archive_max_processes down from 6 to 1, with the following shown in the primary alert log:

    Shutting down archive processes
    ALTER SYSTEM SET log_archive_max_processes=1 SCOPE=BOTH;
    ARCH shutting down
    ARC4: Archival stopped
    ARCH shutting down
    ARC3: Archival stopped
    ARCH shutting down
    ARC2: Archival stopped
    ARCH shutting down
    ARC1: Archival stopped
    ARC5: Becoming the no FAL ARCH
    ARC5: Becoming the no SRL ARCH
    ARC5: Becoming the heartbeat ARCH
    ARCH shutting down
    ARC5: Archival stopped
    ARC0: Becoming the no FAL ARCH
    ARC0: Becoming the no SRL ARCH
    ARC0: Becoming the heartbeat ARCH

I then killed (OS kill, that is) the final running arc background process. Note that this is safe according to Oracle support: pmon detects that it is dead and restarts the arc process:

    ARCH: Detected ARCH process failure
    ARCH: STARTING ARCH PROCESSES
    ARC0: Archival started
    ARCH: STARTING ARCH PROCESSES COMPLETE
    ARC0 started with pid=276, OS id=24202
    ARC0: Becoming the no FAL ARCH
    ARC0: Becoming the no SRL ARCH
    ARC0: Becoming the heartbeat ARCH

This did not work. When the arc0 process restarted it still thought it should be sending the 10890 archive log. The only solution was to bounce the instance; thankfully, with it being a RAC primary, there was no real service outage.

I won't be caring about FAL next week, as I'm off to the seaside for a week.
