Lokesh RAC

ORACLE RAC
Lokesh Aggarwal
7th Sept, 2011
AGENDA
Introduction to RAC
Availability
Manageability
Voting Disk Crash Scenarios
Global Cache Services
ASM
Grid infrastructure
Patching and upgrades
By others
    ACFS
    Patching in Data Guard
    New features in 11gR2
    RAC concepts
    Eviction scenarios
2006 IBM Corporation
INTRODUCTION
What is RAC ???

Multiple instances running on separate servers (nodes)
Single database on shared storage accessible to all nodes
Instances exchange information over an interconnect network
[Diagram: Instance 1 on Node 1 and Instance 2 on Node 2, each node with its own local disk, joined by the interconnect and both attached to shared storage]
What is a RAC Database ???

Located on shared storage accessible by all instances
Includes
    Control Files
    Data Files
    Online Redo Logs
    Server Parameter File
May optionally include
    Archived Redo Logs
    Backups
    Flashback Logs (Oracle 10.1 and above)
    Block change tracking files (Oracle 10.1 and above)
Contd
Contents similar to a single-instance database, except:
One redo thread per instance
    ALTER DATABASE ADD LOGFILE THREAD 2
        GROUP 3 SIZE 51200K,
        GROUP 4 SIZE 51200K;
    ALTER DATABASE ENABLE PUBLIC THREAD 2;
If using Automatic Undo Management, one UNDO tablespace per instance is also required
    CREATE UNDO TABLESPACE "UNDOTBS2" DATAFILE SIZE 25600K
        AUTOEXTEND ON MAXSIZE UNLIMITED
        EXTENT MANAGEMENT LOCAL;
Additional dynamic performance views (V$ and GV$, but not X$) are created by $ORACLE_HOME/rdbms/admin/catclust.sql
RAC Internal Structures and Services

Global Resource Directory (GRD)
    Records current state and owner of each resource
    Contains convert and write queues
    Distributed across all instances in cluster
Global Cache Services (GCS)
    Implements cache coherency for database
    Coordinates access to database blocks for instances
    Maintains GRD
Global Enqueue Services (GES)
    Controls access to other resources (locks) including
        library cache
        dictionary cache
Why Do Users Deploy RAC ???

Users may deploy RAC to
    Increase availability
    Increase scalability
    Improve maintainability
    Reduce total cost of ownership
Availability
What is Failover?
If one node or instance fails, the node detecting the failure will
    Read redo log of failed instance from last checkpoint
    Apply redo to datafiles including undo segments (roll forward)
    Roll back uncommitted transactions
Cluster is frozen during part of this process
[Diagram: Instance 1 on Node 1 and Instance 2 on Node 2, each with a local disk, connected by the interconnect and sharing storage]
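The roll forward and rollback steps of instance recovery can be illustrated with a small toy model. This is only a sketch under simplifying assumptions: blocks hold a single integer, each redo record carries both the old and the new value, and the names RedoRecord and recover are hypothetical, not part of Oracle.

```python
from dataclasses import dataclass


@dataclass
class RedoRecord:
    txn: str    # hypothetical transaction identifier
    block: str  # hypothetical block identifier
    old: int    # value before the change (what undo would restore)
    new: int    # value after the change


def recover(datafiles, redo_since_checkpoint, committed):
    """Return block contents after roll forward followed by rollback."""
    blocks = dict(datafiles)
    # Roll forward: reapply every change logged since the last checkpoint,
    # whether or not its transaction committed.
    for rec in redo_since_checkpoint:
        blocks[rec.block] = rec.new
    # Roll back: undo the changes of uncommitted transactions, newest first.
    for rec in reversed(redo_since_checkpoint):
        if rec.txn not in committed:
            blocks[rec.block] = rec.old
    return blocks


redo = [
    RedoRecord("T1", "A", 1, 2),
    RedoRecord("T2", "B", 10, 20),
    RedoRecord("T1", "A", 2, 3),
]
# T1 committed before the failure, T2 did not.
print(recover({"A": 1, "B": 10}, redo, committed={"T1"}))
```

In this run block A keeps the committed value 3 and block B returns to 10. The model ignores locking, so it assumes an uncommitted transaction never shares a block with a later committed change.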
What are Database Services???

Database Services are logical groups of sessions
Can be configured using
    DBCA
    Enterprise Manager (10.2 and above)
Can also be configured using
    SRVCTL (Oracle Cluster Registry only)
    SQL*Plus (Data Dictionary only)
In Oracle 10.1 and above, each service has
    Preferred Nodes (used by default)
    Available Nodes (used if preferred node fails)
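The preferred / available distinction can be pictured with a short sketch. It is a simplified model, not Oracle's placement logic: the function place_service and the node names are hypothetical, and it assumes one available node is brought in for each preferred node that is down.

```python
def place_service(preferred, available, running):
    """Return the nodes a service would run on in this simplified model."""
    # Preferred nodes that are up are used by default.
    placement = [n for n in preferred if n in running]
    # Each failed preferred node is replaced by one running available node.
    spares = [n for n in available if n in running]
    failed = len(preferred) - len(placement)
    placement.extend(spares[:failed])
    return placement


print(place_service(["node1", "node2"], ["node3"], {"node1", "node2", "node3"}))
print(place_service(["node1", "node2"], ["node3"], {"node2", "node3"}))
```

With all nodes up the service stays on node1 and node2; when node1 is down, node3 takes its place.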
What is Oracle Clusterware???

Introduced in Oracle 10.1 (Cluster Ready Services - CRS)
Renamed in Oracle 10.2 to Oracle Clusterware
Cluster Manager providing
    Node membership services
    Global resource management
    High availability functions
On Linux
    Configured in /etc/inittab
    Implemented using three daemons
        CRS - Cluster Ready Service
        CSS - Cluster Synchronization Service
        EVM - Event Manager
In Oracle 10.2 includes High Availability framework
    Allows non-Oracle applications to be managed
What is the OCR???

Oracle Cluster Registry (OCR)
    Configuration information for Oracle Clusterware / CRS
    Introduced in Oracle 10.1
    Replaced Server Management (SRVM) disk/file
    Similar to Windows Registry
    Located on shared storage
In Oracle 10.2 and above can be mirrored
    Maximum two copies
What is a Voting Disk???

Known as Quorum Disk / File in Oracle 9i
Located on shared storage accessible to all instances
Used to determine RAC instance membership
In the event of node failure voting disk is used to determine which instance takes control of cluster
    Avoids split brain
In Oracle 10.2 and above can be mirrored
    Odd number of copies (1, 3, 5 etc)
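The preference for an odd number of copies follows from majority arithmetic. The sketch below assumes the usual rule that a node must be able to access a strict majority of the configured voting disks; the function names are hypothetical.

```python
def has_quorum(accessible: int, configured: int) -> bool:
    """True when a strict majority of the voting disks is accessible."""
    return 2 * accessible > configured


def tolerated_failures(configured: int) -> int:
    """Largest number of voting disks that can be lost while keeping a majority."""
    return (configured - 1) // 2


for copies in (1, 2, 3, 4, 5):
    print(copies, "copies tolerate", tolerated_failures(copies), "failure(s)")
```

Under this rule a fourth copy tolerates no more failures than three copies do, which is why the counts go 1, 3, 5.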
What is VIP???
Node application introduced in Oracle 10.1
Allows Virtual IP address to be defined for each node
All applications connect using Virtual IP addresses
If node fails Virtual IP address is automatically relocated to another node
Only applies to newly connecting sessions
VIP Failover ???

[Diagram 1: the client alias mydb lists x.x.x.201 and x.x.x.202. One node holds VIP x.x.x.201 and static address x.x.x.101; the other holds VIP x.x.x.202 and static address x.x.x.102.]
[Diagram 2: both VIPs (x.x.x.201 and x.x.x.202) are shown together on one node, and the client receives a TCP reset. Static addresses x.x.x.101 and x.x.x.102 are unchanged.]
[Diagram 3: both VIPs remain together on one node; the client alias mydb still lists x.x.x.201 and x.x.x.202.]
What is TAF???
TAF is Transparent Application Failover
Requires additional coding in client
Requires configuration in TNSNAMES.ORA

RAC_FAILOVER =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (FAILOVER = ON)
      (ADDRESS = (PROTOCOL = TCP)(HOST = node1)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = node2)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = RAC)
      (SERVER = DEDICATED)
      (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 30)(DELAY = 5))
    )
  )
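One way to read the address list together with RETRIES and DELAY is as a retry loop over the listed hosts. The sketch below is an illustrative model only, not Oracle Net's implementation: connect_with_failover and try_connect are hypothetical names, and it assumes each retry walks the whole address list and waits DELAY seconds between rounds.

```python
import time


def connect_with_failover(addresses, try_connect, retries=30, delay=5,
                          sleep=time.sleep):
    """Return the first (host, port) that accepts a connection, else None."""
    for attempt in range(retries):
        # Walk the address list in order, as with FAILOVER = ON.
        for host, port in addresses:
            if try_connect(host, port):
                return (host, port)
        # Wait before the next round, except after the final one.
        if attempt < retries - 1:
            sleep(delay)
    return None


def node1_is_down(host, port):
    """Stand-in connector for the demo: only node2 accepts connections."""
    return host == "node2"


print(connect_with_failover([("node1", 1521), ("node2", 1521)], node1_is_down))
```

The sleep function is passed in so the loop can be exercised without real waiting.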
Does RAC Increase Availability???

Depends on definition of availability
    May achieve less unplanned downtime
    May have more time to respond to failures
Instance failover means any node can fail without total loss of service
Must provide overcapacity in cluster to survive failover
    Additional Oracle and RAC licenses
Load can be distributed over all running nodes
Can use Grid to provision additional nodes
Contd
Can still get data corruptions
    Human errors / software errors
    Only one logical copy of data
    Only one logical copy of application / Oracle software
Lots of possibility for human errors
    Power / network cabling / storage configuration
Upgrades and patches are more complex
    Can upgrade software on subset of nodes
    If database is affected then still need downtime
Manageability
Server Parameter File

Introduced in Oracle 9.0.1
Must reside on shared storage
Shared by all RAC instances
Binary (not text) files
Parameters can be changed using ALTER SYSTEM
Can be backed up using the Recovery Manager (RMAN)
Created using
    CREATE SPFILE [ = SPFILE_NAME ] FROM PFILE [ = PFILE_NAME ];
init.ora file on each node must contain SPFILE parameter
    SPFILE = <pathname>
Parameters
Some parameters must be the same on each instance, including:
    ACTIVE_INSTANCE_COUNT
    ARCHIVE_LAG_TARGET
    CLUSTER_DATABASE
    CONTROL_FILES
    DB_BLOCK_SIZE
    DB_DOMAIN
    DB_FILES
    DB_NAME
    DB_RECOVERY_FILE_DEST
    DB_RECOVERY_FILE_DEST_SIZE
    DB_UNIQUE_NAME
    TRACE_ENABLED
    UNDO_MANAGEMENT
Contd
Some parameters, if used, must be different on each instance, including:
    THREAD
    INSTANCE_NUMBER
    INSTANCE_NAME
    UNDO_TABLESPACE
    ROLLBACK_SEGMENTS
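The two rules (identical on every instance, or unique per instance) lend themselves to a simple consistency check. The sketch below is a hypothetical helper, not an Oracle tool: check_parameters and the sample values are assumptions, and only a subset of the parameter names listed on these slides is included.

```python
# Subsets of the parameter names from the slides above.
MUST_MATCH = {"CLUSTER_DATABASE", "CONTROL_FILES", "DB_BLOCK_SIZE",
              "DB_NAME", "UNDO_MANAGEMENT"}
MUST_DIFFER = {"THREAD", "INSTANCE_NUMBER", "INSTANCE_NAME", "UNDO_TABLESPACE"}


def check_parameters(instances):
    """instances maps instance name to {PARAMETER: value}; returns problems."""
    problems = []
    for name in sorted(MUST_MATCH):
        values = {cfg[name] for cfg in instances.values() if name in cfg}
        if len(values) > 1:
            problems.append(f"{name} differs across instances")
    for name in sorted(MUST_DIFFER):
        values = [cfg[name] for cfg in instances.values() if name in cfg]
        if len(values) != len(set(values)):
            problems.append(f"{name} is duplicated across instances")
    return problems


config = {
    "RAC1": {"DB_NAME": "RAC", "THREAD": 1, "UNDO_TABLESPACE": "UNDOTBS1"},
    "RAC2": {"DB_NAME": "RAC", "THREAD": 1, "UNDO_TABLESPACE": "UNDOTBS2"},
}
print(check_parameters(config))
```

Here both instances were given THREAD 1, so the check reports that THREAD is duplicated.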
DBCA
Can be used to
    Create RAC database and instances
    Create ASM instance
    Manage ASM instance (10.2)
    Add RAC instances
    Create clone RAC database (10.2)
    Create, Manage and Drop Services
    Drop instances and database
What is SRVCTL?
Utility used to manage cluster database
Configured in Oracle Cluster Registry (OCR)
Controls
    Database
    Instance
    ASM
    Listener
    Node Applications
    Services
Options include
    Start / Stop
    Enable / Disable
    Add / Delete
    Show current configuration
    Show current status
SRVCTL - Examples
Starting and Stopping a Database
    srvctl start database -d RAC
    srvctl stop database -d RAC
Starting and Stopping an Instance
    srvctl start instance -d RAC -i RAC1
    srvctl stop instance -d RAC -i RAC1
Starting and Stopping a Service
    srvctl start service -d RAC -s SERVICE1
    srvctl stop service -d RAC -s SERVICE1
Starting and Stopping ASM on a specified node
    srvctl start asm -n node1
    srvctl stop asm -n node1
What is CLUVFY?
Introduced in Oracle 10.2
Supplied with Oracle Clusterware
Can be downloaded from OTN (Linux and Windows)
Written in Java - requires JRE (supplied)
Also works with 10.1 (specify -10gR1 option)
Checks cluster configuration
    stages - verifies all steps for specified stage have been completed
    components - verifies specified component has been correctly installed
CLUVFY
Stages include
    -post hwos       post-check for hardware and operating system
    -pre cfs         pre-check for CFS setup
    -post cfs        post-check for CFS setup
    -pre crsinst     pre-check for Oracle Clusterware installation
    -post crsinst    post-check for Oracle Clusterware installation
    -pre dbinst      pre-check for database installation
    -pre dbcfg       pre-check for database configuration
contd..
Components include
    nodereach    Checks reachability between nodes
    nodecon      Checks node connectivity
    cfs          Checks CFS integrity
    ssa          Checks shared storage accessibility
    space        Checks space availability
    sys          Checks minimum system requirements
    clu          Checks cluster integrity
    clumgr       Checks cluster manager integrity
    ocr          Checks OCR integrity
    crs          Checks Oracle Clusterware (CRS) integrity
    nodeapp      Checks node applications exist
    admprv       Checks administrative privileges
    peer         Compares properties with peers
contd..
For example, to check configuration before installing Oracle Clusterware on node1 and node2 use:
    sh runcluvfy.sh stage -pre crsinst -n node1,node2
Checks:
    node reachability
    user equivalence
    administrative privileges
    node connectivity
    shared storage accessibility
If any checks fail, append -verbose to display more information
Other Utilities
Additional RAC utilities and diagnostics include
    OCRCONFIG
    OCRCHECK
    OCRDUMP
    CRSCTL
    CRS_STAT
Additional RAC diagnostics can be obtained using
    ORADEBUG utility
        DUMP option
        LKDEBUG option
    Events
Does RAC Improve Manageability?

Advantages
    Fewer databases to manage
    Easier to monitor
    Easier to upgrade
    Easier to control resource allocation
    Resources can be shared between applications
Disadvantages
    Upgrades potentially more complex
    Downtime may affect more applications
    Requires more experienced operational staff
    Higher cost / harder to replace
Voting Disk Crash Scenarios
contd
Losing one Voting Disk
Voting disks are used in a RAC configuration to maintain node membership. They are critical pieces of a cluster configuration. Starting with Oracle 10gR2, it is possible to mirror the OCR and the voting disks. With the default mirroring template, at least two voting disks must be available for normal operation.
Scenario Setup
Identify the voting disks:
    crsctl query css votedisk
    /dev/raw/raw1
    /dev/raw/raw2
    /dev/raw/raw3
contd
Corrupt one of the voting disks (as root):
    dd if=/dev/zero of=/dev/raw/raw3 bs=1M
Recoverability Steps
Check the $CRS_HOME/log/[hostname]/alert[hostname].log file. A message like the following should be written there, identifying which voting disk became corrupted:
    [cssd(9120)]CRS-1604:CSSD voting file is offline: /opt/oracle/product/10.2.0/crs_1/Voting1. Details in /opt/oracle/product/10.2.0/crs_1/log/ractest2/cssd/ocssd.log.
contd
According to the above listing, Voting1 is the corrupted disk. Shut down the CRS stack:
    srvctl stop database -d fitstest -o immediate
    srvctl stop asm -n ractest1
    srvctl stop asm -n ractest2
    srvctl stop nodeapps -n ractest1
    srvctl stop nodeapps -n ractest2
    crs_stat -t
On every node as root:
    crsctl stop crs
contd
Pick a good voting disk from the remaining ones and copy it over the corrupted one:
    dd if=/dev/raw/raw4 of=/dev/raw/raw3 bs=1M
Start CRS (on every node as root):
    crsctl start crs
Check the log file $CRS_HOME/log/[hostname]/alert[hostname].log. It should look like the following:
    [cssd(14463)]CRS-1601:CSSD Reconfiguration complete. Active nodes are ractest1 ractest1.
    2007-05-31 15:19:53.954
    [crsd(14268)]CRS-1012:The OCR service started on node ractest1.
    2007-05-31 15:19:53.987
    [evmd(14228)]CRS-1401:EVMD started on node ractest1.
    2007-05-31 15:19:55.861
    [crsd(14268)]CRS-1201:CRSD started on node ractest1.
contd
After a couple of minutes check the status of the whole CRS stack:
[oracle@ractest1 ~]$ crs_stat -t
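The recovery steps in this scenario depend on their order, so it can help to generate them as a dry-run checklist before touching a cluster. The sketch below only builds strings and executes nothing. The function voting_disk_recovery_plan is hypothetical, and running the srvctl commands as the oracle user is an assumption; the commands themselves are the ones shown on the slides.

```python
def voting_disk_recovery_plan(db_name, nodes, good_disk, bad_disk):
    """Return ordered (who, command) steps; nothing is executed."""
    plan = [("oracle", f"srvctl stop database -d {db_name} -o immediate")]
    plan += [("oracle", f"srvctl stop asm -n {n}") for n in nodes]
    plan += [("oracle", f"srvctl stop nodeapps -n {n}") for n in nodes]
    # CRS is stopped on every node before the voting disk is overwritten.
    plan += [(f"root@{n}", "crsctl stop crs") for n in nodes]
    plan.append(("root", f"dd if={good_disk} of={bad_disk} bs=1M"))
    plan += [(f"root@{n}", "crsctl start crs") for n in nodes]
    # Final status check of the whole stack.
    plan.append(("oracle", "crs_stat -t"))
    return plan


plan = voting_disk_recovery_plan(
    "fitstest", ["ractest1", "ractest2"], "/dev/raw/raw4", "/dev/raw/raw3"
)
for who, command in plan:
    print(f"{who:>14}  {command}")
```

Keeping the plan as data makes it easy to review the sequence, or to check that the dd step falls between stopping and starting CRS.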
Global Cache Services
Read with No Transfer

Instance 2 requests current read on block
    1 Request shared resource
    2 Request granted
    3 Read request
    4 Block returned
[Diagram: Instances 1 to 4 and the Resource Master; the resource mode changes from N to S; block value 1318]
Read to Write Transfer

Instance 1 requests exclusive read on block
    1 Request exclusive resource
    2 Transfer block to Instance 1 for exclusive access
    3 Block and resource status
    4 Resource status
[Diagram: Instances 1 to 4 and the Resource Master; resource modes N, S and X; block values 1318 and 1320]
Write to Write Transfer

Instance 4 requests exclusive read on block
    1 (label not legible)
    2 Transfer block to Instance 4 in exclusive mode
    3 Block and resource status
    4 Resource status
Note that Instance 1 will create a past image (PI) of the dirty block
[Diagram: Instances 1 to 4 and the Resource Master; resource modes N and X; block values 1318, 1320 and 1323]
Past Images
When an instance passes a dirty block to another instance it
    Flushes redo buffer to redo log
    Retains past image (PI) of block in buffer cache
PI is retained until another instance writes block to disk
Used to reduce recovery times
Recorded in V$BH.STATUS as PI
    Based on X$BH.STATE (value 8 in Oracle 10.2)
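The life cycle of a past image can be sketched with a toy buffer-cache model. This is an illustration under simplifying assumptions, not how GCS is implemented: the class Instance and the functions transfer_dirty_block and write_to_disk are hypothetical, blocks hold one integer, and the redo flush is left out.

```python
class Instance:
    def __init__(self, name):
        self.name = name
        self.current = {}      # block id -> value held as the current copy
        self.past_images = {}  # block id -> retained past image (PI)


def transfer_dirty_block(block, src, dst):
    """Pass a dirty block to another instance; the sender keeps a PI."""
    value = src.current.pop(block)
    src.past_images[block] = value
    dst.current[block] = value


def write_to_disk(block, writer, instances, disk):
    """Write the current copy to disk; PIs of that block are then released."""
    disk[block] = writer.current[block]
    for inst in instances:
        inst.past_images.pop(block, None)


disk = {42: 1318}
inst1, inst4 = Instance("Instance 1"), Instance("Instance 4")
inst1.current[42] = 1320               # Instance 1 holds a dirty copy
transfer_dirty_block(42, inst1, inst4)
inst4.current[42] = 1323               # Instance 4 changes the block again
print(inst1.past_images)
write_to_disk(42, inst4, [inst1, inst4], disk)
print(inst1.past_images, disk)
```

Until the write, Instance 1 still holds value 1320 as a PI; after Instance 4 writes the block the PI is dropped and the disk copy holds 1323.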
contd..
[Animated diagram, captions garbled in extraction. Recoverable elements: block 42 of table t1 is read from disk into the buffer cache. Instance 1 runs a series of UPDATE t1 SET c1 = ... statements (values 1324 to 1328) followed by COMMIT; Instance 2 runs UPDATE t1 SET c1 = 1329; COMMIT;. Redo Log 1 and Redo Log 2 record the changes of Instance 1 and Instance 2. The surviving caption fragments mention GCS transferring the block between the instances, a past image, an instance crash before DBWR has written the block back to disk, and recovery using the past image with undo/redo applied.]
THANK YOU !!!
What is the Interconnect???
Instances communicate with each other over the interconnect (network)
Information transferred between instances includes
    data blocks
    locks
    SCNs
Typically Gigabit Ethernet
    UDP protocol
Why Use Shared Storage ???

Mandatory for
    Database files
    Control files
    Online redo logs
    Server Parameter file (if used)
Optional for
    Archived redo logs (recommended)
    Executables (Binaries)
    Password files
    Parameter files
    Network configuration files
    Administrative directories
        Alert Log
        Dump Files