ROLE TRANSITIONS
In a Data Guard configuration, an Oracle database operates in one of two roles: Primary or Standby.
The role of a database can be changed with either a switchover or a failover operation. These are
the two types of role transition that Oracle Data Guard supports.
SWITCHOVER: Allows a reversible role transition between the primary database and one of its
standby databases. (The Primary database becomes a Standby, and the Standby database becomes the Primary.)
After the switchover, each database continues to participate in the Data Guard configuration
in its new role. There is no data loss during a switchover operation.
FAILOVER: Changes a Standby database to the primary role in response to a primary database
failure. During a failover operation one of the standby databases becomes the primary database,
and the old primary database can no longer participate in the Data Guard configuration. If the
Primary database was not operating in MAXIMUM PROTECTION or MAXIMUM AVAILABILITY mode before the
failure, data loss may occur.
The old Primary can be brought back into the configuration as a Standby database. If Flashback
Database was enabled on the old primary database, it can be reinstated as a standby for the new primary
database. Without flashback, the old primary has to be rebuilt from scratch.
SQL*Plus provides all kinds of administration and monitoring for DBAs. For Data Guard operations
such as switchover or failover, however, the DBA has to connect to each server and execute the
SQL statements separately, unlike with DGMGRL.
The Data Guard Broker command-line interface (DGMGRL) is a utility that automates and centralizes Data Guard
management. With DGMGRL, a multi-step operation such as a switchover or failover runs with just
one command, unlike with the SQL*Plus interface. DGMGRL configuration and operations are covered separately.
This section walks step by step through Data Guard operations using the SQL*Plus interface.
SWITCHOVER OPERATION
SWITCHOVER_STATUS
--------------------
TO STANDBY
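The query behind the output above did not survive in this copy. A minimal sketch, assuming it was the usual readiness check run on the primary before a switchover:

```sql
-- On the primary: TO STANDBY (or SESSIONS ACTIVE) means a switchover can proceed.
select switchover_status from v$database;
```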
PROCESS STEP 2:
Verify that there are no redo transport errors or redo gaps at the standby database by querying
the v$archive_dest_status view on the Primary database.
STATUS GAP_STATUS
--------- ------------------------
VALID NO GAP
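The statement that produced this output is missing here. It was presumably something like the following sketch, where dest_id = 2 is an assumption based on the LOG_ARCHIVE_DEST_2 destination this walkthrough uses:

```sql
-- On the primary: check transport status and gap status for the standby destination.
select status, gap_status
  from v$archive_dest_status
 where dest_id = 2;
```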
The above query checks the status of the standby database associated with LOG_ARCHIVE_DEST_2.
Do not start the switchover until the value of STATUS is VALID and the value of
GAP_STATUS is NO GAP. On the Primary database you can also verify the following parameter, whose
values correspond to the standby database.
PROCESS STEP 3:
To perform a switchover, all sessions connected to the database must be disconnected.
Add the WITH SESSION SHUTDOWN clause to the alter database statement to do this.
SYS> alter database commit to switchover to physical standby with session shutdown;
Database altered.
PROCESS STEP 4:
On the primary site, setting log_archive_dest_state_2=defer stops the shipping of archives/redo
to the standby site. This is part of the switchover process.
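The statement for this step is not shown. A minimal sketch, assuming destination 2 is the one that points at the standby site:

```sql
-- Stop shipping redo through destination 2.
alter system set log_archive_dest_state_2 = defer;

-- Confirm the new setting.
select name, value
  from v$parameter
 where name = 'log_archive_dest_state_2';
```

Setting the same parameter back to enable resumes shipping when the destination is needed again.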
PROCESS STEP 6:
SWITCHOVER_STATUS
--------------------
TO PRIMARY
If the status returns SESSIONS ACTIVE, append the WITH SESSION SHUTDOWN clause to the statement.
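The statements for this step are missing from this copy. As a rough sketch, assuming they were run on the standby that is about to become the primary:

```sql
-- On the standby: expect TO PRIMARY or SESSIONS ACTIVE.
select switchover_status from v$database;

-- WITH SESSION SHUTDOWN covers the SESSIONS ACTIVE case.
alter database commit to switchover to primary with session shutdown;
```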
SYS> startup;
...
SYS> alter database recover managed standby database using current logfile disconnect;
...
NOTE: In my case real-time apply is enabled, so RECOVERY_MODE is MANAGED REAL TIME APPLY.
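One way to read RECOVERY_MODE is from v$archive_dest_status on the primary. This is only a sketch; dest_id = 2 is an assumption about which destination serves the standby:

```sql
-- On the primary: shows MANAGED REAL TIME APPLY when real-time apply is active.
select dest_id, recovery_mode
  from v$archive_dest_status
 where dest_id = 2;
```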
SYS> select sequence#, first_time, next_time, applied from v$archived_log order by sequence#;
...
SYS> select thread#, max(sequence#) from v$archived_log where applied='YES' group by thread#;
...
The switchover has completed successfully. There is no data divergence between the original
and new primary databases after a successful switchover.
FAILOVER OPERATION
A failover is an unplanned role transition. It can be performed using SQL*Plus or the Data Guard
broker, or automatically using Fast-Start Failover with an observer. Once the failover is complete,
the new primary starts in MAXIMUM PERFORMANCE mode even if it was previously running in MAXIMUM
AVAILABILITY or MAXIMUM PROTECTION mode.
If Flashback Database is NOT enabled on the primary database, the old (failed) primary cannot be
reinstated into the Data Guard configuration after a failover.
If Flashback Database was enabled on the Primary database before the failover, you can bring
the old Primary back into the configuration as a Standby database by moving it back in time to
a point just before the failure occurred.
The only requirement is that Flashback Database was enabled on the Primary database before
the failover. It is also a good idea, and recommended, to enable flashback on the Standby database.
ENVIRONMENT DETAILS
Let’s see how to perform a failover to a Physical Standby database using SQL*Plus.
PROCESS STEP 1:
SWITCHOVER_STATUS FLASHBACK_ON
-------------------- ------------------
TO STANDBY YES
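The query for this step is not shown. Judging by the column headings, it was probably along these lines:

```sql
-- On the primary, while it is still reachable.
select switchover_status, flashback_on from v$database;
```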
Assume that the Primary database cannot be started again, so the standby has to be activated.
FLASHBACK_ON
------------------
YES
$ tail -f crms_alert.log
..
...
Recovered data files to a consistent state at change 70591961
ORA-16037: user requested cancel of managed recovery operation
MRP0: Background Media Recovery process shutdown (stbycrms)
Waiting for MRP0 pid 11448 to terminate
Managed Standby Recovery Canceled (stbycrms)
Completed: alter database recover managed standby database cancel
The CANCEL clause cancels Redo apply on a Physical Standby database after applying the current
archived redo log file.
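The alert log excerpt above records the statement that was issued. On the standby it looks like this:

```sql
-- On the standby: stop Redo Apply before starting the failover.
alter database recover managed standby database cancel;
```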
$ tail -f crms_alert.log
The FINISH clause initiates failover on the Physical Standby database and recovers the current
standby redo log files. The FORCE keyword terminates active RFS processes on the Physical Standby,
so that failover can proceed immediately without waiting for network connections to time out.
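The statement itself is missing from this copy. A sketch based on the FINISH and FORCE description above (check the documentation for your release to see whether FORCE is still needed):

```sql
-- On the standby: apply all remaining redo and prepare for failover.
alter database recover managed standby database finish force;
```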
$ tail -f crms_alert.log
The COMMIT TO SWITCHOVER TO PRIMARY clause changes the Standby database to the Primary database role.
If the SWITCHOVER_STATUS column shows SESSIONS ACTIVE, terminate the active sessions by adding the WITH SESSION
SHUTDOWN clause to the alter database commit … statement.
As a result, you can no longer use this database as a standby database and any subsequent redo
received from the original primary database cannot be applied.
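As a sketch of this step, run on the standby being promoted. The role check at the end is my addition and was not part of the original walkthrough:

```sql
-- On the standby: take over the primary role, closing any active sessions.
alter database commit to switchover to primary with session shutdown;

-- The role should now read PRIMARY.
select database_role from v$database;
```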
After this statement the new Primary is still in MOUNT state. Shut down and start up the new primary
database to open it. The failover operation is now complete. If Flashback Database was not enabled on
the old primary before the failover, you need to rebuild the old primary from scratch.
SYS> startup;
...
OPEN_MODE
--------------------
READ WRITE
If the Physical Standby database has been opened in read-only mode since the last time it was
started, you must shut down the target standby database and restart it:
STANDBY_BECAME_PRIMARY_SCN
--------------------------
70591960
After a failover (when the Physical Standby becomes the Primary), Oracle writes the failover SCN
to the control file. Querying the STANDBY_BECAME_PRIMARY_SCN column of the V$DATABASE fixed view
shows the SCN at which the old standby database became the new primary database.
Flashback was enabled on the failed primary, and we have found the SCN, which lets us
bring the former Primary database back as a standby database for the stbycrms instance.
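The SCN shown above can be read on the new primary with a query like this (the column alias is only illustrative):

```sql
-- On the new primary (stbycrms in this walkthrough).
select to_char(standby_became_primary_scn) as failover_scn
  from v$database;
```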
FLASHBACK_ON
------------------
YES
$ tail -f crms_alert.log
$ tail -f crms_alert.log
..
...
Use the following SQL commands on the standby database to create
standby redo logfiles that match the primary database:
ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 52428800;
Completed: alter database convert to physical standby
The ALTER DATABASE CONVERT TO PHYSICAL STANDBY statement dismounts the database after successfully
converting the control file to a standby control file.
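The statements used to reinstate the old primary are missing from this copy. A minimal sketch of the usual sequence, assuming the old primary (crms) can be mounted and using the SCN reported by the new primary:

```sql
-- On the old primary (crms). STARTUP and SHUTDOWN are SQL*Plus commands.
startup mount;

-- 70591960 is the STANDBY_BECAME_PRIMARY_SCN value shown earlier.
flashback database to scn 70591960;

-- Convert the control file to a standby control file.
alter database convert to physical standby;

-- Restart in MOUNT state so that Redo Apply can be started.
shutdown immediate;
startup mount;
```

Whether the final restart is needed depends on the release; it matches the dismount behaviour described above.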
STATUS
------------
STARTED
SYS> alter database recover managed standby database using current logfile disconnect;
Database altered.
$ tail -f crms_alert.log
When the new Physical Standby database reads the End-Of-Redo (EOR) marker, the MRP stops, even though
there is more redo beyond the archived redo log that contains the EOR marker. At this point it is a
good idea to check whether MRP is running on the current standby.
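The check itself is not shown. One common way to do it, as a sketch, is to look at v$managed_standby on the new standby:

```sql
-- On the new standby (crms): an MRP0 row means Redo Apply is running.
select process, status, thread#, sequence#
  from v$managed_standby;
```

If no MRP0 row is listed, Redo Apply has stopped and needs to be started again.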
5 rows selected.
SYS> alter database recover managed standby database using current logfile disconnect;
Database altered.
$ tail -f crms_alert.log
alter database recover managed standby database using current logfile disconnect
Attempt to start background Managed Standby Recovery process (crms)
MRP0 started with pid=42, OS id=28045
MRP0: Background Managed Standby Recovery process started (crms)
Serial Media Recovery started
Managed Standby Recovery starting Real Time Apply
..
The failover operation resets the log sequence number back to 1. Perform some log switches on the
new primary (stbycrms) and check that they reach the new standby (crms) site.
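A rough sketch of that check. The filter on resetlogs_change# is my addition, so that sequences from before the failover are not mixed in:

```sql
-- On the new primary (stbycrms): force a few log switches.
alter system switch logfile;
alter system switch logfile;

-- On the new standby (crms): logs of the current incarnation and whether they were applied.
select thread#, sequence#, applied
  from v$archived_log
 where resetlogs_change# = (select resetlogs_change# from v$database)
 order by sequence#;
```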
Before a switchover (to convert the Standby back to Primary), issue the following query
to check the current state of the archive destination on the new standby site.
REFERRED LINKS
Data Guard provides three data protection modes: MAXIMUM PROTECTION, MAXIMUM
AVAILABILITY and MAXIMUM PERFORMANCE. Understanding them helps you choose the mode that
serves your business needs in terms of data protection and performance. The data protection mode
controls what happens if the Primary database loses its connection to the standby database.
MAXIMUM PROTECTION
The Primary database does not complete a commit until the redo data needed to recover
the transaction has been written to both the primary online redo log and the standby redo log.
If the redo data cannot be written to at least one standby, the primary shuts down, because
this protection mode always prioritizes data protection over Primary database availability.
Oracle recommends configuring more than one Standby, to prevent a single standby database
failure from causing the primary database to shut down.
MAXIMUM AVAILABILITY
Transactions on the primary do not commit until the redo data needed to recover that
transaction has been written to both the Primary online redo log and at least one standby redo log.
If the redo stream cannot be written to at least one standby, the primary database does NOT shut
down; instead it operates as if it were in Maximum Performance mode.
Once the fault is corrected and all gaps are resolved, the Primary database automatically
resumes operation in Maximum Availability mode.
This mode also ensures no data loss if the primary database fails, provided the Standby database
was synchronized at the time. If it had not resynchronized before the failover, data loss can occur.
You need high bandwidth and low latency between the primary and standby databases.
Network bandwidth and latency are very important for the MAX PROTECTION and MAX AVAILABILITY modes.
MAXIMUM PERFORMANCE
A transaction commits as soon as the redo data needed to recover it is written to the
local online redo log. The redo data is then written to one or more standby databases, but this
is done asynchronously with respect to transaction commitment.
The Primary database continues its transaction processing without checking the availability
of any standby database, so the performance and availability of the Primary are not affected.
First we need to understand Redo Transport Services, which perform the automated transfer of redo
data between the Primary database and Standby databases.
Redo Transport Services can transmit redo data to local and remote destinations. Remote
destinations can include Physical Standby and Logical Standby databases. Data Guard supports two
redo transport methods using the LNS process: SYNCHRONOUS and ASYNCHRONOUS.
SYNC Vs ASYNC
Specifies whether the synchronous (SYNC) or asynchronous (ASYNC) redo transport mode is used.
Synchronous transport mode (SYNC) is required for the MAXIMUM AVAILABILITY and MAXIMUM PROTECTION modes.
Asynchronous transport mode (ASYNC) is used for the MAXIMUM PERFORMANCE mode.
Synchronous redo transport generally affects the performance of the Primary database.
To complete a commit, an acknowledgement is needed from the Standby database, so any
delay in writing the redo data on the Standby impacts the performance of the Primary database.
In MAXIMUM PROTECTION mode, any failure to write redo data to the standby database causes the
Primary database to shut down.
The LOG_ARCHIVE_DEST_11 through LOG_ARCHIVE_DEST_31 parameters do not support the SYNC attribute.
EX: log_archive_dest_2='service=STBY_CRMSDB LGWR SYNC'
Asynchronous redo transport does NOT affect the performance of the Primary. The Primary database
never waits for an acknowledgement from the standby database in order to complete a commit.
That is, LGWR does NOT wait for confirmation that the redo data has been written on the Standby database.
A delay in transferring redo data to the Standby database, or a failure to write redo data on the
Standby database, does NOT impact the availability of the Primary database.
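For comparison with the SYNC example above, an asynchronous destination might be set like this. It is only a sketch that reuses the STBY_CRMSDB service name from that example; a real configuration would normally carry further attributes:

```sql
-- On the primary: ship redo asynchronously, without waiting for the standby's acknowledgement.
alter system set log_archive_dest_2 = 'service=STBY_CRMSDB LGWR ASYNC NOAFFIRM';
```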
AFFIRM Vs NOAFFIRM
Controls whether a redo transport destination acknowledges received redo data before or after
writing it to the standby redo log. (You can check the AFFIRM column of the v$archive_dest view.)
# AFFIRM ATTRIBUTE
The AFFIRM attribute ensures that a redo transport destination acknowledges received redo data
after writing it to the standby redo log files.
# NOAFFIRM ATTRIBUTE
The NOAFFIRM attribute ensures that a redo transport destination acknowledges received redo
data before writing it to the standby redo log.
If neither AFFIRM nor NOAFFIRM is specified, the default is AFFIRM when the SYNC attribute is specified
and NOAFFIRM when the ASYNC attribute is specified.
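To see which combination a destination is actually using, the v$archive_dest view mentioned above can be queried. A small sketch, assuming destination 2 is the standby:

```sql
-- On the primary: transport mode and AFFIRM setting of the standby destination.
select dest_id, transmit_mode, affirm
  from v$archive_dest
 where dest_id = 2;
```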
NET_TIMEOUT
The NET_TIMEOUT attribute is optional. It specifies the number of seconds that the Primary database
LGWR background process waits for a response (acknowledgment) from a redo transport destination
before terminating the connection and marking the standby destination as failed.
If you do not specify a value for the NET_TIMEOUT attribute, the default is 30 seconds.
Before lowering this value, make sure you have good network bandwidth. If you
specify a low value (for example 5 seconds) on a poor network, the Primary database
may often disconnect from the Standby database.
REOPEN
The REOPEN attribute is optional. It specifies the minimum time in seconds that redo transport
services wait before attempting to access a failed Standby destination again. Redo transport services
attempt to reopen a failed destination at log switch time. The default value is 300 seconds. You can
also specify a value manually.
VALID_FOR
The VALID_FOR attribute is optional, but Oracle recommends defining the VALID_FOR
attribute for each redo transport destination in a Data Guard configuration so that redo transport
continues after a role transition.
PRIMARY_ROLE - Valid only when the database runs in the Primary role.
STANDBY_ROLE - Valid only when the database runs in the Standby role.
ALL_ROLES - Valid when the database runs in either the Primary or the Standby role.
You specify the VALID_FOR attribute when setting the LOG_ARCHIVE_DEST_n parameter. If VALID_FOR
is NOT specified, the default is (ALL_LOGFILES,ALL_ROLES): online redo log files and standby redo
log files are archived to the destination regardless of the role of the database.
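The effective VALID_FOR setting of each destination can be checked in v$archive_dest. A sketch, assuming destinations 1 and 2 are the ones in use:

```sql
-- VALID_TYPE and VALID_ROLE reflect the VALID_FOR setting of each destination.
select dest_id, valid_type, valid_role
  from v$archive_dest
 where dest_id in (1, 2);
```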
DB_UNIQUE_NAME
# ENVIRONMENT DETAILS
Before changing the protection mode there are a few preliminary steps to take.
To change the data protection mode from Max Performance to Max Availability or Max Protection,
we need Standby Redo Log files on the Standby database server, and redo must be transmitted
synchronously (SYNC) from the Primary database to the Standby database by the LGWR process.
In my case, I already have Standby Redo Log groups on both sides (Primary and Standby).
Standby Redo Log files on the Primary database are used only after a switchover.
It is always better to create SRLs on both sides, and Oracle recommends creating
standby redo log files on the primary database as well. SRLs are NOT mandatory for the Primary
database, but they are useful in a role transition from Primary to Standby, because the Primary
database can then switch over quickly to the Standby role without any extra step. It is important to
configure the Standby Redo Logs (SRL) with the same size as the online redo logs.
GROUP# BYTES/1024/1024
---------- ---------------
1 50
2 50
3 50
On the Primary database I have 3 online redo log groups and each file is 50M. It is better to
create one extra group for the standby redo logs. In my case, I have 4 SRL groups on the
Primary, with the same size as the online redo log groups.
GROUP# BYTES/1024/1024
---------- ---------------
4 50
5 50
6 50
7 50
If Standby redo log groups are not yet configured on the Primary database, you can use the following
SQL statements to create them.
SYS> alter database add standby logfile group 4 '/location path/.../sredo04.log' size 50m;
...
SYS> alter database add standby logfile group 5 '/location path/.../sredo05.log' size 50m;
...
SYS> alter database add standby logfile group 6 '/location path/.../sredo06.log' size 50m;
...
SYS> alter database add standby logfile group 7 '/location path/.../sredo07.log' size 50m;
...
Once you have created the SRL groups on the Primary database, you can verify them with the following query.
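The verification query is missing from this copy. A minimal sketch that compares the online and standby redo log groups (the size_mb alias is only illustrative):

```sql
-- Online redo log groups and their size in MB.
select group#, thread#, bytes/1024/1024 as size_mb
  from v$log;

-- Standby redo log groups; sizes should match the online groups.
select group#, thread#, bytes/1024/1024 as size_mb, status
  from v$standby_log;
```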
As I said, to change the protection mode to either MAX AVAILABILITY or MAX PROTECTION
you must have standby redo logs configured on the standby database.
To use REAL TIME APPLY, you first need to configure SRL files on the Standby side.
Once Standby Redo Logs are configured on the Standby database, LNS ships redo to RFS, and RFS writes
the redo to the Standby Redo Log files. Redo is applied directly from the SRL (Real Time Apply) and
does NOT have to wait for the SRLs to be archived. Finally the ARCn process archives the standby redo
logs to the archive destination.
In my case I already have SRL groups on the Standby database, because I am using Real-Time Apply.
Before you create SRL groups on the Standby database, first cancel the MRP (if MRP is running).
Once you have created the required SRL groups, enable MRP on the Standby database again.
Database altered.
We need to create 4 Standby Redo Log (SRL) groups of the same size as the online redo log groups (50M)
on the standby database.
SYS> alter database add standby logfile group 4 '/location path/.../sredo04.log' size 50m;
...
SYS> alter database add standby logfile group 5 '/location path/.../sredo05.log' size 50m;
...
SYS> alter database add standby logfile group 6 '/location path/.../sredo06.log' size 50m;
...
SYS> alter database add standby logfile group 7 '/location path/.../sredo07.log' size 50m;
...
SYS> alter database recover managed standby database disconnect using current logfile;
Database altered.
Next, set the LOG_ARCHIVE_DEST_n parameter to reflect the redo transport required by the new
protection mode. As discussed earlier, the attributes of the redo shipping parameter
LOG_ARCHIVE_DEST_2 should specify synchronous (SYNC) transport.
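The parameter setting itself did not survive in this copy. As a sketch only, a synchronous destination combining the attributes described earlier might look like this. The service name STBY_CRMSDB comes from the earlier SYNC example, NET_TIMEOUT and REOPEN are shown at their defaults, and a DB_UNIQUE_NAME attribute would normally be added as well:

```sql
-- On the primary: synchronous, affirmed redo transport to the standby.
alter system set log_archive_dest_2 = 'service=STBY_CRMSDB LGWR SYNC AFFIRM NET_TIMEOUT=30 REOPEN=300 VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)';
```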
# ON PRIMARY DATABASE
System altered.
# ON STANDBY DATABASE
# ON PRIMARY DATABASE
# ON STANDBY DATABASE
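The statements that belonged under the headings above are missing. A minimal sketch of raising the protection mode and verifying it follows. MAXIMIZE PROTECTION and MAXIMIZE PERFORMANCE work the same way, but whether the primary must be mounted rather than open depends on the release and on the current mode:

```sql
-- On the primary: raise the protection mode.
alter database set standby database to maximize availability;

-- On either database: PROTECTION_LEVEL shows what is currently being achieved.
select protection_mode, protection_level
  from v$database;
```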
Now I am going to crash the Standby database instance 'crms'. Once the 'crms' instance is killed,
let’s see how Oracle reacts in MAXIMUM PROTECTION mode.
$ kill -9 28213
$ tail -f alert_stbycrms.log
PROTECTION_MODE PROTECTION_LEVEL
-------------------- --------------------
MAXIMUM PROTECTION UNPROTECTED
PROTECTION_MODE PROTECTION_LEVEL
-------------------- --------------------
MAXIMUM PERFORMANCE MAXIMUM PERFORMANCE