Smart Oracle Database Solutions-Oracle Database Extended RAC


Published by: Serkan Kiracı on Feb 18, 2011
Copyright: Attribution Non-commercial



Oracle Database Extended RAC
Oracle 10g Extended RAC on Windows 2003
Handbook For Achieving HA & Disaster Recovery Solution
By Database Manager

Index
· Introduction
· Test Bed overview
· Installation of Virtual Machines
· Installation of Oracle Clusterware Services
· Installation of Oracle Software/ASM/DB
· Post RAC Installation Health Check
· Applying Oracle Patch set
· Adding a third Node to the Cluster Database
· RAC Concepts Primer
· Troubleshooting RAC Environment
· Backup and Recovery - RAC Environment
· References

Introduction

This document is intended as a set of guidelines for Oracle and system administrators who are responsible for implementing an Oracle Extended RAC (also known as stretched clustering) whose nodes are located within 5 km of each other. I have implemented the same on IBM P590 series servers running AIX 5.3 with Oracle 10g Release, using Oracle Clusterware on SAN storage, with the two data centers connected over a 1GB laser link network with a latency of less than 5 ms. We had some problems implementing the solution, and most of the issues were related to configuring the Oracle RAC components that depend on network synchronization. At that time I decided to first fully test the Extended RAC implementation on my own, and thanks to VMWare software I was able to do so. This document is based on that VMWare installation on Windows, and I would highly recommend that before you implement Extended RAC on Unix in a UAT/Production environment, you have it fully tested on Windows to clear up the concepts behind it. For the RAC Handbook on IBM P590 Series-AIX OS, please refer to the RAC resources section on my site.

Test Bed overview

· Oracle 10g Release (later we will apply the latest patch set).
· Windows 2003 Enterprise Operating System, Service Pack 1.
· Windows Resource Kit, to be installed on every virtual machine.
· VMWare Workstation Version 4.5 (you can download a trial version from their site, but I would highly recommend buying it).
· 4 Windows XP workstation PCs attached to a network with a latency of 5 ms at most, with 1GB RAM and 1 CPU each, and at least 40GB of free storage space.

There are many articles on the internet that talk about implementing RAC on a single PC with VMWare installed. However, they all need at least 2GB of memory, and even when you have that, once RAC is installed you cannot test all possible RAC scenarios for lack of resources. What I did, and recommend, is to talk to the other DBAs at your workplace and agree that 4 DBAs will offer their PCs to simulate the Extended RAC testing. So all you need is 4 PCs with 1GB of memory each, connected over the LAN, on which you have administrative privileges.

Installation of Virtual Machines

Let us call the 4 XP workstations XPWS1, XPWS2, XPWS3 and XPWS4. On each of these workstations, install the VMware Workstation software, then launch it and create one Windows 2003 virtual machine per XP workstation. While creating a virtual machine, please make sure to assign the following to each virtual server:

· On XPWS1, create folder C:\RACVIRTUAL and under it create two subfolders, RAC1 and ASMDISK.
· On XPWS2, create folder D:\RACVIRTUAL and under it create two subfolders, RAC3 and ASMDISK.
· On XPWS3, create folder D:\RACVIRTUAL and under it create one subfolder, RAC5.
· The above folders are for the new virtual server hosting Windows 2003. The ASMDISK folder is for hosting the ASM raw devices.
· Each virtual OS should be assigned 524MB of RAM.
· During creation of the virtual servers, choose bridged network, IO adapter as LSI Logic, and disks as SCSI.
· Under Virtual machine settings, remove Drive A:
· As for Windows 2003, choose default settings.
· Make sure the swap space is 1GB and goes to Drive C:
· Create two logical drives, C: and D:, where C: should be 4GB for OS usage and D: 5GB, where we will install the Oracle 10g software.
· After installation is over, choose the VM menu option and install VMWare Tools for the new virtual machine/OS.

Workstation   Virtual Server   Remarks
XPWS1         RAC1             RAC Node1 with its storage defined
XPWS2         RAC3             RAC Node2 with its storage defined
XPWS3         RAC5             RAC Node3 with no storage of its own
XPWS4         RAC6             Only storage for 3rd voting disk

At this point you will have 4 XP workstations, each running a single virtual machine with the Windows 2003 OS. Let us focus now on the first two virtual machines. The remaining two virtual machines will be used later for a) adding a 3rd node, and b) adding a third voting disk/site, respectively. Therefore, at this point you will work with two virtual machines, which I named RAC1 (on PC XPWS1) and RAC3 (on PC XPWS2). RAC5 (on XPWS3) will be configured later as a 3rd node.

You now need to configure network settings for these two virtual servers. Shut down the RAC1 and RAC3 servers (from now on we will call them RAC1 and RAC3; they are created on two separate workstations connected over your home or office network). This point is very important to note, because all of my configurations below are based on this fact, so please do not create the two virtual machines on the same physical PC but have them created on different physical machines. If you would like to proceed with single PC RAC testing, then you should see other articles on the internet.

Let us now proceed with the NIC settings:

· Shut down RAC1, edit the virtual machine settings and add a new Ethernet adapter of Bridged type. Bring up the RAC1 server and you should see a window displayed for new hardware (H.W) found; press Next and complete it. It will fail at the end, but let it be, because if you press Cancel, the same new H.W found screen will appear again every time you reboot RAC1. Repeat the same for the RAC3 node.
· Go to Network Connections from the Control Panel and you should see two NICs identified as Local Area Connection1 and Local Area Connection2. Rename the two as Public and Private respectively. Click on Advanced Settings in the Network Connections window and make sure the order of the list is first Public and then Private, so that the host name RAC1 will resolve to the Public NIC. Click on their properties, choose Internet Protocol and assign the IP addresses. Make sure the subnet is different for Public and Private, because the Private NIC will be used for Cache Fusion/inter-node communication between the two RAC nodes. My settings are:

  RAC1 Node:
    Public NIC:  10.…  subnet: 255.…  Gateway: 10.…
    Private NIC: 10.…  subnet: 255.…  Gateway: leave empty
  RAC3 Node:
    Public NIC:  10.…  subnet: 255.…  Gateway: 10.…
    Private NIC: 10.…  subnet: 255.…  Gateway: leave empty

· Modify the hosts file of the RAC1 & RAC3 OS (c:\windows\system32\drivers\etc) as:

    10.…    RAC1
    10.…    RAC1-priv
    10.…    RAC1-vip
    10.…    RAC3
    10.…    RAC3-priv
    10.…    RAC3-vip

RAC1-VIP and RAC3-VIP are not physically linked to any network card but are logically defined on the Public subnet. When we have completed the RAC setup, these are the IP addresses (or names) which will be used to configure client connections to the RAC. The setup is such that they act as components of the cluster, so in case Node1 is down, its VIP service will fail over to the surviving node and clients will be redirected to the surviving node without any TCP timeout, which is what usually happens when a listener is listening on a port attached to a physical IP address.

Perform the following steps for all nodes (RAC1, RAC3, RAC5, RAC6):

· Oracle supports the TCP/IP protocol for the public and private networks and requires that Windows Media Sensing is disabled by setting the value of the DisableDHCPMediaSense parameter to 1. To do this, go to the Windows registry via Regedit.exe, navigate to HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters, and add the DWORD key DisableDHCPMediaSense=1. To know more about this parameter see http://www.microsoft.com/technet/prodtechnol/windows2000serv/reskit/regentry/94173.mspx?mfr=true
· Go to the command prompt and type:
    set devmgr_show_nonpresent_devices=1
    devmgmt.msc
  Then remove any greyed-out NIC under the Network category.
· From the Control Panel, go to add/remove Windows components and add the Terminal Services component. This will enable you to use Remote Desktop services to access both nodes from a remote laptop or workstation if required.

Shut down both servers (RAC1, RAC3) from their respective workstations (you could use XPWS1 and from there open a remote connection to XPWS2; that is what I did). Also create a mapped drive from each workstation to the other's C:\RACVIRTUAL (and D:\RACVIRTUAL) folder with full permissions granted. The mapped drive can be called Z:, mapped to C:\RACVIRTUAL.
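The NIC settings earlier in this section require the Public and Private interfaces to sit on different subnets. As a minimal sketch of how that could be checked before running the installer, the following Python uses the standard ipaddress module. The function name and the addresses in the demo are hypothetical placeholders, not the author's settings.

```python
import ipaddress


def networks_overlap(ip_a, mask_a, ip_b, mask_b):
    """Return True if the two interface addresses fall in overlapping subnets."""
    # strict=False accepts a host address plus netmask instead of a network address.
    net_a = ipaddress.ip_network(f"{ip_a}/{mask_a}", strict=False)
    net_b = ipaddress.ip_network(f"{ip_b}/{mask_b}", strict=False)
    return net_a.overlaps(net_b)


if __name__ == "__main__":
    # Hypothetical example values for one node's Public and Private NICs.
    public = ("192.168.10.11", "255.255.255.0")
    private = ("192.168.20.11", "255.255.255.0")
    if networks_overlap(*public, *private):
        print("Public and Private NICs share a subnet; change one of them")
    else:
        print("Public and Private NICs are on different subnets")
```

Running the same check for every node before the Clusterware installation is cheaper than discovering a shared subnet from a failed interconnect configuration.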

Now edit the main virtual server file C:\RACVIRTUAL\RAC1\winNetEnterprise.vmx on XPWS1 (for the RAC1 node settings) and add the following lines:

    disk.locking = "FALSE"
    diskLib.dataCacheMaxSize = "0"
    diskLib.dataCacheMaxReadAheadSize = "0"
    diskLib.dataCacheMinReadAheadSize = "0"
    diskLib.dataCachePageSize = "4096"
    diskLib.maxUnsyncedWrites = "0"
    scsi1.present = "TRUE"
    scsi1.virtualDev = "lsilogic"
    scsi1.sharedBus = "VIRTUAL"

· Repeat the same on the mapped drive, in Z:\RAC3\winNetEnterprise.vmx, for the virtual server RAC3 created on XPWS2.

We are now ready to create raw devices. Since we are going to have an Extended RAC setup, I will create a set of raw devices on the storage of both workstations (XPWS1, XPWS2) via VMware. Each set will have three raw devices: one to hold database files, the second for OCR and the third for the voting disk. Later, when configuring ASM with normal redundancy, I will mirror these two sets of raw devices (created on two different PCs' storage). Later I will also show you how to create additional raw disks to move different database files (like redo and backup sets) to their own separate raw disks.

From the RAC1 node, go to the command prompt.

Raw device to hold Oracle database files: you need to use the VMWare GUI to create this one. I had run out of the SCSI limit for the number of virtual devices, so I chose an IDE hard disk as the new virtual disk. Create the raw disk as C:\RACVIRTUAL\ASMDISK\oradata1.vmdk, make sure you pre-allocate the disk space, and choose a size of 4GB.

Raw device to hold OCR information:
    vmware-vdiskmanager.exe -c -s 300MB -a lsilogic -t 2 C:\RACVIRTUAL\ASMDISK\ocr1.vmdk

Raw device to hold voting disk information:
    vmware-vdiskmanager.exe -c -s 200MB -a lsilogic -t 2 C:\RACVIRTUAL\ASMDISK\votingdisk1.vmdk

OCR and the voting disk are explained in the next section.

Repeat the same from the RAC3 node (on XPWS2) as:
    vmware-vdiskmanager.exe -c -s 200MB -a lsilogic -t 2 D:\RACVIRTUAL\ASMDISK\ocr2.vmdk
    vmware-vdiskmanager.exe -c -s 200MB -a lsilogic -t 2 D:\RACVIRTUAL\ASMDISK\votingdisk2.vmdk
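The shared-disk settings added to both .vmx files above are easy to mistype when edited by hand on two machines. The following is a minimal Python sketch, not part of the original procedure, that merges those nine settings into the text of a .vmx file. It overwrites the first occurrence of a key that already exists and appends the ones that are missing. The function name is illustrative, and the sketch assumes every relevant line has the simple `key = "value"` form.

```python
import re

# The nine settings the text adds to each node's .vmx file for shared disks.
SHARED_DISK_SETTINGS = {
    "disk.locking": "FALSE",
    "diskLib.dataCacheMaxSize": "0",
    "diskLib.dataCacheMaxReadAheadSize": "0",
    "diskLib.dataCacheMinReadAheadSize": "0",
    "diskLib.dataCachePageSize": "4096",
    "diskLib.maxUnsyncedWrites": "0",
    "scsi1.present": "TRUE",
    "scsi1.virtualDev": "lsilogic",
    "scsi1.sharedBus": "VIRTUAL",
}

# Matches lines of the form: key = "value"
_LINE = re.compile(r'^\s*([^#\s=]+)\s*=\s*"(.*)"\s*$')


def merge_vmx(text, settings=SHARED_DISK_SETTINGS):
    """Return vmx text with each setting applied (overwritten or appended)."""
    pending = dict(settings)
    out = []
    for line in text.splitlines():
        match = _LINE.match(line)
        if match and match.group(1) in pending:
            key = match.group(1)
            # Replace the first existing occurrence with the wanted value.
            out.append(f'{key} = "{pending.pop(key)}"')
        else:
            out.append(line)
    # Append whatever was not already present in the file.
    out.extend(f'{key} = "{value}"' for key, value in pending.items())
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = 'memsize = "524"\ndisk.locking = "TRUE"\n'
    print(merge_vmx(sample), end="")
```

Because the merge is idempotent, it can be run against both winNetEnterprise.vmx files (with the virtual machines powered off) without creating duplicate entries.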

On XPWS2 I have used drive D: because I have more free space on drive D: on that PC; for the RAC1 node on XPWS1 I have used drive C:. Also note that the names of these raw devices end with the digit 2, because they will be used together with the raw devices created on XPWS1 and mirrored by ASM. And now create D:\RACVIRTUAL\ASMDISK\oradata2.vmdk via the GUI as an IDE hard disk.

· Bring up both nodes, RAC1 and RAC3, and perform the following:

1. Go to the command prompt, type Diskpart and then enter the command Automount enable. This is required to make sure the raw devices are auto-mounted every time the OS starts up.

2. The two RAC nodes must have their clocks synchronized. You can download third-party software which keeps the clocks in sync among different servers in one network; search for Time Sync Server on Windows in Google for that. Or you can use the net time command to configure the time from any time server available on the internet, but when you opt for this option, make sure the Windows Time server is disabled. What I did was basically sync the time of each virtual server to its host OS (which is the XP workstation), as follows. From the RAC1 node, check the current time server with:
    NET TIME /QUERYSNTP
  Set the initial time from the time server:
    NET TIME \\XPWS1 /SET
  Set the current time server (XPWS1) for RAC1:
    NET TIME /SETSNTP:XPWS1
  Repeat the same for the RAC3 node and make XPWS1 its time server as well. Then check the times on both servers from one place:
    NET TIME \\RAC1
    NET TIME \\RAC3
  However, you need to run the above commands on every machine start-up, so make these two commands part of a scheduled job that triggers every time the system starts up. Alternatively, right click on the VMware Tools icon and select time sync between host and virtual machines.

3. Now you need to make the new disks available to the VMWare Workstation software by editing the virtual machine settings as follows:

· Shut down both the RAC1 and RAC3 nodes.
· For the RAC1 node, you have already registered the raw device that holds the database files (oradata1.vmdk), as you created it via the VMWare GUI. However, for the OCR and voting disks, since you created them from the command prompt, use the VMWare GUI to add disks, and this time, instead of creating a new virtual disk, choose create from existing, browse to the location of these two files and select the vmdk files.
· Since the RAC1 node also needs to access the disks created on the RAC3 node, you need to repeat the same procedure as above, but this time when you add disks from existing, choose the remote location Z:\ and add the three raw devices from the RAC3 node.
· Repeat the whole procedure described above for RAC3.
· Bring up both nodes and verify all storage settings.
· Start up the RAC1 node. You should see a popup window which lists the three new disks you added. Accept the defaults and press Next. Then right click on the My Computer shortcut on the desktop, choose Manage, select the Storage section and click on Disk Management. You should now see your three disks appear as offline in the Disk Management tab. Click on each of them one by one and perform the following tasks:
  1. Right click on the new disk, select New Partition, choose Extended and proceed to finish, accepting the defaults. You should now see the disk with Online status.
  2. Right click again and choose Create Logical Drive. Here make sure to choose "Do not assign a drive letter" and "Do not format the disk", and continue until completion.
  Repeat the same for the remaining two disks (OCR and voting).

For simplicity I have copied the contents of the vmx files for both nodes below.

RAC1 Node vmx file (notice the remote raw device links with Z:)
Location: C:\RACVIRTUAL\RAC1\winNetEnterprise.vmx

    disk.locking = "FALSE"
    diskLib.dataCacheMaxSize = "0"
    diskLib.dataCacheMaxReadAheadSize = "0"
    diskLib.dataCacheMinReadAheadSize = "0"
    diskLib.dataCachePageSize = "4096"
    diskLib.maxUnsyncedWrites = "0"
    config.version = "7"
    virtualHW.version = "3"

startConnected = "TRUE" Ethernet0.virtualDev = "lsilogic" memsize = "524" scsi0:0.virtualDev = "lsilogic" scsi1.deviceType = "plainDisk" scsi1:2.vmdk" scsi1:1.fileName = "Windows Server 2003 Enterprise Edition.present = "FALSE" scsi1:1.fileName = "C:\RACVIRTUAL\ASMDISK\oradata1.fileName = "-1" displayName = "RAC1" guestOS = "winNetEnterprise" priority.mode = "persistent" scsi1:3.present = "TRUE" ide1:0.sharedBus = "VIRTUAL" scsi1:1.present = "TRUE" scsi1:2.addressType = "generated" uuid.location = "56 4d c5 0d df c2 22 83-78 96 e8 44 92 2d 88 e3" uuid.fileName = "C:\RACVIRTUAL\ASMDISK\votingdisk1.fileName = "C:\RACVIRTUAL\ASMDISK\ocr1.deviceType = "cdrom-raw" floppy0.bios = "56 4d c5 0d df c2 22 83-78 96 e8 44 92 2d 88 e3" ethernet0.present = "TRUE" scsi0:1.powerOff = "default" powerType.fileName = "auto detect" ide1:0.present = "TRUE" scsi0.scsi0.vmdk" ide1:0.present = "TRUE" scsi1.fileName = "Windows Server 2003 Enterprise Edition (3).ungrabbed = "normal" powerType.generatedAddress = "00:0c:29:2d:88:e3" ethernet0.deviceType = "plainDisk" scsi1:3.present = "TRUE" sound.syncTime = "FALSE" scsi0:1.grabbed = "normal" priority.present = "TRUE" scsi1:3.vmdk" scsi1:3.virtualDev = "es1371" scsi1.reset = "default" ide1:0.suspend = "default" powerType.mode = "persistent" scsi1:2.deviceType = "plainDisk" .mode = "persistent" scsi1:1.present = "TRUE" sound.vmdk" sound.fileName = "A:" Ethernet0.powerOn = "default" powerType.vmdk" scsi1:2.generatedAddressOffset = "0" tools.present = "TRUE" scsi0:0.

present = "TRUE" ide0:1.present = "TRUE" ide0:0.fileName = "C:\RACVIRTUAL\ASMDISK\test9.vmdk" scsi0:2.deviceType = "plainDisk" ide1:1." Ethernet0.mode = "persistent" scsi1:0.deviceType = "plainDisk" ide0:1.deviceType = "plainDisk" scsi0:5.fileName = "C:\RACVIRTUAL\ASMDISK\oradata1.present = "TRUE" scsi1:4.present = "FALSE" scsi0:6.dataCachePageSize = "4096" .deviceType = "plainDisk" Ethernet1.vmdk" scsi1:4.vmdk" scsi0:5.vmdk" ide0:0.fileName = "C:\RACVIRTUAL\ASMDISK\oradata1.generatedAddress = "00:0c:29:2d:88:ed" ethernet1.present = "FALSE" scsi0:3.generatedAddressOffset = "10" floppy0.vmdk" scsi1:0.present = "FALSE" scsi0:2.vmdk" scsi0:3.fileName = "C:\RACVIRTUAL\ASMDISK\oradata1.vmx disk.present = "TRUE" scsi1:0.fileName = "C:\RACVIRTUAL\ASMDISK\test.deviceType = "plainDisk" RAC3 Node vmx file: Note remote links to raw devices on RAC1 Location: Z:\RAC3\winNetEnterprise.fileName = "Z:\ASMDISK\ocr2.deviceType = "plainDisk" scsi1:0.mode = "persistent" scsi1:4.dataCacheMaxSize = "0" diskLib.present = "FALSE" redoLogDir = ".dataCacheMinReadAheadSize = "0" diskLib.locking = "FALSE" diskLib.present = "FALSE" scsi0:5.fileName = "Z:\ASMDISK\votingdisk2.fileName = "C:\RACVIRTUAL\ASMDISK\oradata1.dataCacheMaxReadAheadSize = "0" diskLib.connectionType = "bridged" scsi0:2.vmdk" scsi0:6.vmdk" ide0:1.scsi1:4.present = "FALSE" ide1:1.addressType = "generated" ethernet1.fileName = "Z:\ASMDISK\oradata2.vmdk" ide0:0.present = "TRUE" Ethernet1.

vmdk" .ungrabbed = "normal" powerType.location = "56 4d a4 62 67 78 2e 4e-bd 76 ea 69 ed 2d f9 03" uuid.present = "TRUE" scsi1.vmdk" ide1:0.fileName = "-1" displayName = "RAC3" guestOS = "winNetEnterprise" priority.powerOff = "default" powerType.fileName = "A:" Ethernet0.deviceType = "cdrom-raw" floppy0.generatedAddress = "00:0c:29:2d:f9:03" ethernet0.present = "FALSE" scsi1:1.vmdk" sound.fileName = "z:\ASMDISK\votingdisk1.addressType = "generated" uuid.deviceType = "plainDisk" scsi1:3.present = "TRUE" scsi0:0.fileName = "z:\ASMDISK\ocr1.generatedAddressOffset = "0" tools.fileName = "Z:\ASMDISK\oradata1.fileName = "Windows Server 2003 Enterprise Edition (3).sharedBus = "VIRTUAL" scsi1:1.present = "TRUE" scsi1:2.mode = "persistent" scsi1:3.virtualDev = "es1371" scsi1.bios = "56 4d a4 62 67 78 2e 4e-bd 76 ea 69 ed 2d f9 03" ethernet0.diskLib.virtualDev = "lsilogic" memsize = "524" scsi0:0.present = "TRUE" scsi0:1.startConnected = "TRUE" Ethernet0.present = "TRUE" sound.powerOn = "default" powerType.version = "3" scsi0.mode = "persistent" scsi1:2.present = "TRUE" scsi0.version = "7" virtualHW.mode = "persistent" scsi1:1.present = "TRUE" ide1:0.fileName = "auto detect" ide1:0.present = "TRUE" sound.suspend = "default" powerType.syncTime = "FALSE" scsi0:1.maxUnsyncedWrites = "0" config.fileName = "Windows Server 2003 Enterprise Edition.grabbed = "normal" priority.vmdk" scsi1:1.reset = "default" ide1:0.vmdk" scsi1:2.virtualDev = "lsilogic" scsi1.deviceType = "plainDisk" scsi1:2.present = "TRUE" scsi1:3.

addressType = "generated" ethernet1.deviceType = "plainDisk" scsi1:0.deviceType = "plainDisk" " Ethernet0.mode = "persistent" scsi1:4.deviceType = "plainDisk" scsi1:4.connectionType = "bridged" ide0:0.mode = "persistent" scsi1:0.present = "FALSE" redoLogDir = ".scsi1:3.generatedAddress = "00:0c:29:2d:f9:0d" ethernet1.vmdk" ide0:1.present = "TRUE" ide0:1.fileName = "Z:\ASMDISK\oradata1.generatedAddressOffset = "10" floppy0.deviceType = "plainDisk" Ethernet1.present = "TRUE" scsi1:4.present = "TRUE" Ethernet1.fileName = "D:\RACVIRTUAL\ASMDISK\oradata2.fileName = "D:\RACVIRTUAL\ASMDISK\ocr2.vmdk" ide0:0.present = "TRUE" ide0:0.deviceType = "plainDisk" ide0:1.fileName = "D:\RACVIRTUAL\ASMDISK\votingdisk2.present = "TRUE" scsi1:0.vmdk" scsi1:4.vmdk" scsi1:0.

Installation of Oracle Clusterware Services

· Before we go on installing Oracle CRS, let us verify that our two nodes fulfill all of the pre-requisites for CRS installation. This is achieved by running a verification utility provided by Oracle called "runcluvfy.bat". On the RAC1 command prompt, navigate to the CD-ROM where you have already inserted the Oracle 10g Enterprise CD:
    cd /d D:\clusterware\cluvfy
    runcluvfy.bat stage -pre crsinst -n rac1,rac3 -verbose
  Verify that the only failure in the output of the above command is the VIP verification and that all other tests pass.
· Execute D:\clusterware\Setup.exe, which will launch the Oracle installer for Cluster Services.
· In the Specify Home Details screen, set the Name as oracrs and the location as I:\oracle\product\10.2.0\crs.

· Oracle will then run a check on all of the pre-requisites; make sure all tests pass and then press Next.
· In the Specify Cluster Configuration screen, you will have to add RAC3 as the second node for RAC. You have to add the two nodes along with all of their Private, Public and VIP names. For example, for the second node enter RAC3, with RAC3-priv and RAC3-vip as the names of its Private and VIP interfaces.
· In the next screen, Specify Network Interface Usage, you have to edit the entries and make sure there are two networks, one for Public (10.…) and a second for Private (10.…) in my case.
· In the cluster configuration Storage screen, you have to specify the two disks (main and its mirror) for both OCR and voting disks. You should be able to recognize the disks first by their sizes (remember we chose 4GB for data files, and 200 and 300 MB for voting and OCR). You can also identify the disks by their names, verifying the names against the VMware machine settings for the disks added. You will see that for OCR you have an option for Primary OCR and mirrored OCR locations, but for voting you have to specify the location multiple times. That is because for the OCR disks, one RAC node becomes the master (responsible for reads and writes to the OCR and its mirror), and the other nodes communicate with the master node for OCR operations. With the voting disk, however, each node has to write to it separately. Here I should have specified three voting disks, but I chose only two and shall add the third later. The reason is simply that I would like to show how to change RAC configurations after installation.
· Next, Oracle will begin the installation. As you can see in the installation screen, Oracle will go through the following steps:
  1. Install successful
  2. Setup Successful
  3. Remote Operation Pending
  4. Configuration Pending
  5. Oracle Clusterware configuration
  6. Oracle Notification Server configuration
  7. Oracle Private inter connect configuration
  8. Virtual Private IP configuration
· Except for step 8, all steps should complete OK. Ignore the failure and proceed to complete the install and exit.
· At this point we will have the following services configured in the Windows service manager:
  1. Oracle Object service
  2. Oracle cluster volume service
  3. Oracle CR Service
  4. Oracle CS service
  5. Oracle EVM service

· You now need to run the VIP configuration assistant, as it was the one that failed. So run it from i:\oracle\product\10.2.0\crs\bin\vipca
· The assistant will show you a screen where you have to provide the RAC1-vip and RAC3-vip network names; then proceed to install the VIP services.
· Basically this will install the following 3 resources (not as Windows services):
  1. VIP application resource
  2. GSD Apps resource (Global Services Daemon)
  3. ONS Apps resource (Oracle Notification Service)
· At this point your CRS installation is complete and you should verify it by running:
    cluvfy.bat stage -post crsinst -n RAC1,RAC3
· The following commands can also be used to verify cluster health on both nodes:
    I:\oracle\product\10.2.0\crs\bin\crs_stat -t
    I:\oracle\product\10.2.0\crs\bin\crsctl check crs
    I:\oracle\product\10.2.0\crs\bin\ocrcheck
· Recycle both nodes and verify the CRS health again.

Installation of Oracle Software/ASM/DB

· Make sure that the cluster services are up and running on both nodes. Run d:\database\setup.exe from the CD-ROM.
· Select Enterprise Edition.
· Choose an Oracle Home (different from CRS) named oradb, with location i:\oracle\product\10.2.0\db_1
· Make sure to check mark both nodes for the software installation.
· Make sure all pre-requisite tests pass.
· In the Select Configuration Option screen, choose Install Database Software only.
· Complete the installation; you should receive no errors this time.
· Now launch the Oracle Database Configuration Assistant from the Programs group (not from the CD).
· Choose Oracle RAC Database in the Welcome screen.
· In the next screen select Configure Automatic Storage Management.
· Select both RAC nodes, provide a password for the ASM instance and choose pfile, which means each ASM instance on RAC1 and RAC3 will have its own init.ora located on NTFS.

· DBCA will then create the ASM instance, and then you will see a screen where you have to create disk groups.
· In the create disk group screen, click on Stamp disks and you should see all of your raw devices (from both nodes' storage). Select the two 4GB raw devices on the two nodes and accept the defaults.
· Now you should be able to see both disks appear as candidates in the Create Disk Group screen.
· Select Redundancy as Normal and create the disk group as shown in the following picture.
· The main group is DATA and the two sub groups (failure groups, in ASM terms) are DATAP and DATAS.
· Press OK and continue to complete.
· The ASM instance setup is now completed and we can begin creating a database.
· Launch DBCA from the Programs group and follow the screens as under.
· Choose Create Database, select both nodes, select the general purpose database template and name the database RACDB. The two instances will be RACDB1 and RACDB2.
· Select ASM as the storage for the new database.
· Select DATA as the ASM group for all database files. I will create more groups later and show how to distribute redo, archived logs and the flash recovery area to their respective ASM groups.
· Use Oracle Managed Files; it is always recommended to use OMF with ASM.
· Do not specify a flash recovery area, as that will be done later, but do specify the archived log location.
· Accept the defaults for the rest of the screens and continue until completion.

Post RAC Installation Health Check

Now that RAC is installed, let us perform basic health checks.

--Assignment of Environment Variables
Right click on My Computer, select Properties, go to the Advanced tab and define the following environment variables on both servers:
    SET CRS_HOME=I:\oracle\product\10.2.0\crs
    SET ORACLE_HOME=I:\oracle\product\10.2.0\db_1
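Since the health-check commands that follow rely on CRS_HOME and ORACLE_HOME being defined on both servers, a quick check can save some confusion. This is a minimal Python sketch, assuming only the two variable names given above. The function name is illustrative, and the comparison is case-insensitive because Windows paths are.

```python
import os


def check_homes(env):
    """Return a list of problems found with CRS_HOME and ORACLE_HOME."""
    problems = []
    crs_home = env.get("CRS_HOME", "").strip()
    oracle_home = env.get("ORACLE_HOME", "").strip()
    if not crs_home:
        problems.append("CRS_HOME is not set")
    if not oracle_home:
        problems.append("ORACLE_HOME is not set")
    # The text installs the database software in a home different from CRS.
    if crs_home and oracle_home and crs_home.lower() == oracle_home.lower():
        problems.append("CRS_HOME and ORACLE_HOME must be different homes")
    return problems


if __name__ == "__main__":
    for problem in check_homes(os.environ) or ["both homes look fine"]:
        print(problem)
```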

--Oracle Clusterware Services
    Service                               Process
    Oracle Object Service                 OracleOBJService.exe
    OracleCRService                       crsd.exe
    OracleCSService                       ocssd.exe
    OracleEVMService                      evmd.exe
    OracleClusterVolumeService            OcfsFindVol.exe

--Oracle ASM Services
    OracleASMService+ASM1                 oracle.exe

--Oracle Database Services
    OracleoradbTNSListenerLISTENER_RAC1   TNSLSNR.EXE
    OracleServiceRACDB1                   oracle.exe
    OracleDBConsoleRACDB1                 nmesrvc.exe
    OracleJobSchedulerRACDB1

--Link between Windows Services and Processes (in Task Manager)
Run the command TASKLIST /SVC (see the process links above).

Here is a short description of each of the CRS daemon processes (taken from Metalink Note 259301.1):

CRSD:
- Engine for HA operation
- Manages 'application resources'
- Starts, stops, and fails 'application resources' over
- Spawns separate 'actions' to start/stop/check application resources
- Maintains configuration profiles in the OCR (Oracle Cluster Registry)
- Stores current known state in the OCR
- Runs as root
- Is restarted automatically on failure

OCSSD:
- OCSSD is part of RAC and Single Instance with ASM
- Provides access to node membership
- Provides group services
- Provides basic cluster locking
- Integrates with existing vendor clusterware, when present
- Can also run without integration to vendor clusterware
- Runs as Oracle
- Failure exit causes machine reboot. This is a feature to prevent data corruption in the event of a split brain.

EVMD:
- Generates events when things happen
- Spawns a permanent child evmlogger
- Evmlogger, on demand, spawns children
- Scans callout directory and invokes callouts
- Runs as Oracle
- Restarted automatically when it fails

--CRS_STAT: Check Health of Resources (ASM, Listener, Instance etc)
    cd /d %CRS_HOME%\bin
    crs_stat -t

    Name           Type           Target    State     Host
    ------------------------------------------------------------
    ora....B1.inst application    ONLINE    ONLINE    rac1
    ora....B2.inst application    ONLINE    ONLINE    rac3
    ora.RACDB.db   application    ONLINE    ONLINE    rac3
    ora....SM1.asm application    ONLINE    ONLINE    rac1
    ora....C1.lsnr application    ONLINE    ONLINE    rac1
    ora.rac1.gsd   application    ONLINE    ONLINE    rac1
    ora.rac1.ons   application    ONLINE    ONLINE    rac1
    ora.rac1.vip   application    ONLINE    ONLINE    rac1
    ora....SM2.asm application    ONLINE    ONLINE    rac3
    ora....C3.lsnr application    ONLINE    ONLINE    rac3
    ora.rac3.gsd   application    ONLINE    ONLINE    rac3
    ora.rac3.ons   application    ONLINE    ONLINE    rac3
    ora.rac3.vip   application    ONLINE    ONLINE    rac3

crs_stat alone will list the full resource names.
crs_stat -f will provide detailed information about each of the components.
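The crs_stat -t listing above is easy to scan by eye on two nodes but gets tedious once it is part of a routine check. The following is a minimal Python sketch that parses the tabular output and reports any resource whose State is not ONLINE. It assumes the five-column layout shown above (Name, Type, Target, State, Host) and that resource rows start with "ora."; the function names are illustrative.

```python
def parse_crs_stat(text):
    """Parse `crs_stat -t` output into a list of row dictionaries."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Resource rows start with "ora."; header and separator rows do not.
        if len(parts) >= 4 and parts[0].startswith("ora."):
            rows.append({
                "name": parts[0],
                "type": parts[1],
                "target": parts[2],
                "state": parts[3],
                # An offline resource may have no host column at all.
                "host": parts[4] if len(parts) > 4 else "",
            })
    return rows


def resources_not_online(rows):
    """Return the names of resources whose State is not ONLINE."""
    return [row["name"] for row in rows if row["state"] != "ONLINE"]


if __name__ == "__main__":
    # Hypothetical sample with one resource deliberately shown OFFLINE.
    sample = (
        "Name           Type           Target    State     Host\n"
        "------------------------------------------------------------\n"
        "ora....B1.inst application    ONLINE    ONLINE    rac1\n"
        "ora....B2.inst application    ONLINE    OFFLINE\n"
    )
    print(resources_not_online(parse_crs_stat(sample)))
```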

--Start/stop all Oracle services
    crs_start -all
    crs_stop -all

Start/stop individual services:
    crs_start resource_name -c cluster_member
    crs_start resource_name
For example:
    crs_start ora.RACDB.RACDB1.inst
Please note that you can also use the srvctl command to start or stop services, and it is the recommended way, as it gives more control over each service group.

--CRSCTL: Controls RAC parameters

Check the health of the cluster only:
    crsctl check crs
    CSS appears healthy
    CRS appears healthy
    EVM appears healthy

Query the voting disk locations:
    crsctl query css votedisk
    0.     0    \\.\votedsk1
    1.     0    \\.\votedsk2

Check the version of Clusterware:
    crsctl query crs softwareversion/activeversion
    CRS software version on node [rac1] is [10.2.0.1.0]

You can also use the following utility to find the location of the OCR and voting disks:
    Run I:\oracle\product\10.2.0\crs\BIN\GUIOracleOBJManager.exe

Dump the cluster state to crsd.log:
    crsctl debug statedump crs

Debug specific components (level 2); see crsd.log:
    crsctl debug log "CRSTIMER:2"

Please note that dumping the cluster state is a one-time snapshot, while the other debug commands are modes of tracing with different levels. You should read a document on how to debug a Real Application Clusters environment; see details at http://download-uk.oracle.com/docs/cd/B19306_01/rac.102/b14197/appsupport.htm. See the Appendix at the end of this document for more details.
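The three "appears healthy" lines printed by crsctl check crs can be checked by script as well. This is a minimal Python sketch, assuming the exact output wording shown above; the function name is illustrative and how the command output is captured is left to the reader.

```python
EXPECTED_COMPONENTS = ("CSS", "CRS", "EVM")


def unhealthy_components(output, expected=EXPECTED_COMPONENTS):
    """Return the expected components that did not report 'appears healthy'."""
    healthy = set()
    for line in output.splitlines():
        words = line.split()
        # A healthy line looks like: "CSS appears healthy"
        if len(words) == 3 and words[1:] == ["appears", "healthy"]:
            healthy.add(words[0])
    return [name for name in expected if name not in healthy]


if __name__ == "__main__":
    sample = "CSS appears healthy\nCRS appears healthy\nEVM appears healthy\n"
    missing = unhealthy_components(sample)
    print("cluster stack healthy" if not missing else f"check: {missing}")
```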

--Displays health of Oracle Cluster Registry
    ocrcheck
    Status of Oracle Cluster Registry is as follows :
        Version                  : 2
        Total space (kbytes)     : 192652
        Used space (kbytes)      : 3800
        Available space (kbytes) : 188852
        ID                       : 1953799442
        Device/File Name         : \\.\ocrcfg
            Device/File integrity check succeeded
        Device/File Name         : \\.\ocrmirrorcfg
            Device/File integrity check succeeded
        Cluster registry integrity check succeeded

Make sure to check the log: I:\oracle\product\10.2.0\db_1\log\rac3\client

--Export OCR (take a backup, restore, and change location)
    ocrconfig -export ocr.dmp -s online
    ocrconfig -import ocr.dmp
    ocrconfig -replace ocrmirror <new location>
    ocrconfig -restore ocrbackup
    ocrconfig -repair ocr <ocr_location>
    ocrdump <file-name>   (dumps in ASCII format)

An OCR backup is taken automatically every 4 hours into the cdata folder under CRS_HOME, but only on the master node. Please make sure to keep a copy of the backup files. You can use the command ocrconfig -showbackup to see existing backups.
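The space figures in the ocrcheck output above lend themselves to a simple usage calculation. The following Python sketch extracts the three kbytes values and computes the percentage used. It assumes the field labels exactly as printed above; the function names are illustrative.

```python
import re

# Matches e.g. "Total space (kbytes)     : 192652"
_SPACE = re.compile(r"^\s*(Total|Used|Available) space \(kbytes\)\s*:\s*(\d+)\s*$")


def ocr_space(output):
    """Return a dict with 'total', 'used' and 'available' in kbytes."""
    space = {}
    for line in output.splitlines():
        match = _SPACE.match(line)
        if match:
            space[match.group(1).lower()] = int(match.group(2))
    return space


def percent_used(space):
    """Percentage of the OCR device that is in use."""
    return 100.0 * space["used"] / space["total"]


if __name__ == "__main__":
    sample = (
        "Total space (kbytes)     : 192652\n"
        "Used space (kbytes)      : 3800\n"
        "Available space (kbytes) : 188852\n"
    )
    space = ocr_space(sample)
    print(f"OCR used: {percent_used(space):.2f}%")
```

With the figures shown in the text, used plus available equals total (3800 + 188852 = 192652), which is a cheap sanity check on the parsed values.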

srvctl stop database -d RACDB
Step3: Stop all ASM instances
srvctl stop asm -n RAC1
srvctl stop asm -n RAC3
Step4: Stop VIP, GSD, ONS services
srvctl stop nodeapps -n RAC1
srvctl stop nodeapps -n RAC3
Step5: Start VIP, GSD, ONS and listener services
srvctl start nodeapps -n RAC1
srvctl start nodeapps -n RAC3
Step6: Start ASM instances
srvctl start asm -n RAC1
srvctl start asm -n RAC3
Step7: Start database and instances
srvctl start database -d RACDB
Step8: dbconsole and agent startup
set ORACLE_SID=RACDB1
emctl start dbconsole
Here you need to repeat this step on RAC3 as well. Make sure RAC is up by running the crs_stat -t command.
--Accessing RAC Environment from EM console
Make sure the agent is up and then open IExplorer: http://RAC1:1158/em where RAC1 is the dbconsole node. I would highly recommend that DBAs get familiar with EM, as it is an excellent GUI tool to monitor and manage your RAC environment. Whatever actions you need to perform, you can also see the corresponding SQL that will be run.
Applying Oracle 10.2.0.3 Patch set

Now that the RAC is installed on the two nodes, RAC1 and RAC3, and we are ready to add a third node as RAC5, I would recommend patching the two RAC nodes with the latest Oracle patch set for both the Cluster layer and the Database (ASM inclusive). Applying an Oracle patch has a prerequisite that all Oracle services are down.
Major steps in applying the patch:
· On both nodes, set all RAC services on Windows to Manual start except two services, Oracle Object Service and OracleClusterVolumeService.
· Unzip and copy the patch contents to the RAC1 node. Create a folder as C:\10203Patch and use that as the patch contents unzip folder.
· Run c:\10203Patch\Setup.exe.
· Select ORACRS as your first home to be patched, which is the clusterware stack.
· On the next screen make sure both RAC nodes are checked and proceed to complete the installation.
· After installation is over, on each of the two nodes, run the following from command prompt:
I:\oracle\product\10.2.0\crs\install\patch102.bat
· At this point, you should notice in Windows service manager that Cluster related services will be up and running. You can verify that the Cluster layer is patched with 10.2.0.3 by running the following commands:
crsctl query crs softwareversion
crsctl query crs activeversion
· Stop all cluster services once again. Now run the setup.exe again and this time choose the oradb_1 home to patch the Oracle ASM and Oracle database software. During db home patching, you may receive errors like file in use; in that case go to the file folder from Windows Explorer, remove that file and then retry the operation. Complete the patch installation.
· After the installation is over, you need to run the following on the remote node (RAC3): execute <Oracle Home>\bin\SelectHome.bat on remote nodes to activate the following products:
Oracle Data Provider for .NET
Oracle Provider for OLE DB
Oracle Objects for OLE
Oracle Counters for Windows Performance Monitor
Oracle Administration Assistant
· After this point, make sure all services/components of RAC are down, and bounce both nodes. Once up start the services in the

ora so that it will only contain the following line: SPFILE='+DATA/RACDB/spfileRACDB.2. Then revert back the saved init file to original initRACDB1. and also runalter system set cluster_database=FALSE scope=spfile. · Now startup the instance as mount. · Again startup the instance as : 1.0\db_1\RDBMS\ADMIN\catup grd. Save the pfile initRACDB1.0 00:01:40 · Oracle Database Java Packages VALID 10.0.0.ora from database folder as it is not required.3.3.0 00:19:33 · JServer JAVA Virtual Machine VALID 10. spool patch10203.0.following order: § OracleCSService § OracleEVMService § OracleCRService · Make sure database services are down otherwise run the following command: Srvctl stop database -d RACDB · Now you are ready to run the catupgrade against the data dictionary as part of the last step in patching database.2. log on to sqlplus after setting the ORACLE_SID=RACDB1.2.ora' from pfile.ora.0.0.0 00:00:33 .sql · Review the log file for any errors and make sure all database compoenents are showd updated with From sqlplus run create pfile from spfile.0 00:01:03 · Oracle Text VALID 10. Since am using Automatic SGA memory management.0. @I:\oracle\product\10.ora (located at local node location %ORACLE_HOME%/database to another file name.log 3.2.3 patch set as: · Component Status Version HH:MM:SS · Oracle Database Server VALID 10. change parameter values and then from sqlplus run create spfile=SPFILE='+DATA/RACDB/spfileRACDB.2. · From RAC1 node.2. you can revert it back to 150 after the patch is deployed. turn archive off by running alter database noarchivelog. thenshutdown the instance.0 00:06:23 · Oracle XDK VALID 10.ora' · Also remove the local file spfileracdb1. startup upgrade 2. However you need to make sure SGA components (shared pool and java pool should be at least 150 mb each). I will increase the SGA_TARGET from 150 to 300 MB for the instance. and then open the initRACDB1.

3.2.0. RAC5 needs to be configured with the following parameters: · IP addresses assigned and also need to be replicated to the host file of remaining two nodes.0 00:01:20 · Oracle interMedia VALID 10.2.SQL TO COMPILE ALLINVALID OBJECTS (ELSE THEY BE VALID WHEN ACCESSED) · alter system set cluster_database=TRUE scope=spfile.0.· Oracle XML Database 10. while the IP addressed of the existing two nodes need to be copied to the host file of RAC5: RAC5 will have the following IP addresses: 10.0 00:01:37 · Oracle Real Application Clusters VALID 10.0 00:08:31 · Spatial VALID 10.0.3. we are ready to create RAC5 as third node.3. RAC5 has already been created as a virtual machine on workstation XPWS3. 00:00:34 · OLAP Analytic Workspace VALID 10. RAC1.2.2.0 00:06:41 · Oracle Expression Filter VALID 10.2.0 00:00:13 · RUN UTLRP. 00:00:59 · OLAP Catalog VALID 10.3. At this point you have successfully deployed oracle 10.73 RAC5 . VALID Adding a third Node to the Cluster Database Now that the RAC is installed on the two nodes.0 00:00:30 · Oracle Enterprise Manager VALID patch set.0.3.0 00:01:41 · Oracle OLAP API VALID 10.3.0 00:00:02 · Oracle Data Mining VALID 10.0 00:02:32 · Oracle Rule Manager VALID 10. · SHUTDOWN · STARTUP MOUNT · ALTER DATABASE ARCHIVELOG. RAC3. · SHUTDOWN · STARTUP · srvctl start database -d RACDB · crs_stat -t should now show databases instances to be up and running.

present = "TRUE" sound.ungrabbed = "normal" powerType.version = "3" scsi0.virtualDev = "es1371" Ethernet1. · Following is the excerpt from RAC5 OS winNetEnterprise.locking = "FALSE" diskLib.present = "TRUE" scsi0:1.vmx: disk.generatedAddressOffset = "0" tools.dataCacheMaxSize = "0" diskLib.42.10.10.dataCacheMaxReadAheadSize = "0" diskLib.maxUnsyncedWrites = "0" scsi1.grabbed = "normal" priority.73 RAC5-PRIV 10.generatedAddress = "00:0c:29:6a:0b:18" ethernet0.virtualDev = "lsilogic" scsi1.present = "TRUE" .syncTime = "TRUE" scsi0:1.dataCachePageSize = "4096" diskLib.present = "TRUE" scsi0:0.addressType = "generated" uuid.location = "56 4d 49 04 57 26 bc 40-3f 27 76 e5 1c 6a 0b 18" uuid.reset = "default" ide1:0.present = "TRUE" sound.powerOff = "default" powerType.startConnected = "TRUE" Ethernet0.present = "FALSE" ide1:0.0.present = "TRUE" scsi0.virtualDev = "lsilogic" memsize = "540" scsi0:0.vmdk" ide1:0.252 RAC5-VIP · Map network drives on XPWS3 as Y and W to point to ASM folders of RAC1 and RAC3 with full permission.version = "7" virtualHW.powerOn = "default" powerType.10.suspend = "default" powerType.deviceType = "cdrom-raw" floppy0.bios = "56 4d 49 04 57 26 bc 40-3f 27 76 e5 1c 6a 0b 18" ethernet0.vmdk" sound.fileName = "Windows Server 2003 Enterprise Edition (3).dataCacheMinReadAheadSize = "0" diskLib.fileName = "-1" displayName = "rac5" guestOS = "winNetEnterprise" priority.fileName = "Windows Server 2003 Enterprise Edition.sharedBus = "VIRTUAL" config.fileName = "auto detect" ide1:0.present = "TRUE" scsi1.fileName = "A:" Ethernet0.

vmdk" ide0:0.0\crs on RAC5 and also cluster services but will not start cluster sevices (except first 2 obj serv and cluster volume) · cd I:\oracle\product\10.fileName = "W:\ASMDISK\ocr2.present = "TRUE" scsi0:2.generatedAddress = "00:0c:29:6a:0b:22" ethernet1.present = "TRUE" scsi1:0. clscfg: version 3 is 10G Release 2.fileName = "W:\ASMDISK\oradata2.0\crs\install\ · I:\oracle\product\10.ad d.present = "TRUE" scsi1:1. .present = "FALSE" ide0:0.generatedAddressOffset = "10" scsi0:2.present = "TRUE" ide0:1.fileName = "Y:\ASMDISK\votingdisk1.2.fileName = "W:\ASMDISK\votingdisk2.2.vmdk" floppy0.fileName = "Y:\ASMDISK\ocr1.0\crs\install>crssetup.present = "TRUE" scsi0:3.present = "TRUE" ide0:0.2.0\crs\oui\BIN addnode.vmdk" scsi0:3. · The above proc will install I:\oracle\product\10.deviceType = "plainDisk" · Now we are ready to add RAC Node2 as RACDB3 to Server RAC5.addressType = "generated" ethernet1. Step 1: checking status of CRS stack Step 2: Configuring basic cluster services Step 3: configuring OCR repository with new nodes clscfg: EXISTING configuration version 3 detected. Run the following commands from Existing Node RAC1: · cluvfy comp peer -refnode rac1 -n rac5 (Compare) · Install Clusterware stack software on RAC5 from RAC1 as: cd cd I:\oracle\product\10.vmdk" scsi1:0.fileName = "Y:\ASMDISK\oradata1.vmdk" ide0:1.2.vmdk" scsi1:1.deviceType = "plainDisk" ide0:1.bat · You should recive the following messages and make sure there are no errors even for VIP services.Ethernet1.bat · Press Next to the welcome screen and provide public and private IP address of the new new node and complete the installation.

Attempting to add 1 new nodes to the configuration
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 3: rac5 rac5-priv rac5
Creating OCR keys for user 'administrator', privgrp ''..
Operation successful.
Step 4: configuring safe mode for CRS components
Step 5: starting up the CRS stack on new nodes
Step 6: configuring OCR with new node VIP information
Creating VIP application resource on (1) nodes...
Creating GSD application resource on (1) nodes...
Creating ONS application resource on (1) nodes...
Starting VIP application resource on (1) nodes...
Starting GSD application resource on (1) nodes...
Starting ONS application resource on (1) nodes...
· At this point all cluster services on the new node RAC5 should be automatically started, and this marks the end of Cluster stack installation for the new node.
· If you run crs_stat -t from Node3 (RAC5) you should see the gsd, ons and vip services up and running.
· Now you are ready to install Oracle software on the new node.
cd I:\oracle\product\10.2.0\db_1\oui\bin
addnode.bat
Complete the install process.
· Go to the RAC5 node, run the network configuration assistant and create the listener with default settings. Make sure listener creation is only for the RAC5 node. Run crs_stat -t and you should see the listener component also appearing besides the ons, vip and gsd services.
Now you can go back to the RAC1 node, run the DBCA GUI tool and follow the screens to add the RACDB3 instance on the RAC5 node (third node). However I always prefer the manual approach, which is explained below. Perform the following steps from the RAC5 node.
· Creating ASM instance on RAC5 Node:
Create an admin folder under I:\oracle\product\10.2.0
Copy all the contents of the admin folder from RAC1. You now have two subfolders, +ASM and RACDB.
Modify init+asm3.ora to include +ASM3.instance_number=3. This is not required, but in case you would later like to create an spfile for ASM, copy +ASM3.instance_number=3 to the init.ora of the other instances.
Go to the command prompt of ORACLE_HOME/database
set ORACLE_SID=+ASM3

__shared_pool_size=121634816 racdb2.audit_file_dest='I:\oracle\product\10.0' *.control_files='+DATA/racdb/controlfile/current .ora password=password oradim -new -ASMSID +ASM3 (create windows service) make sure asmtoolg shows the oradata asm group disks via sqlplus mount instance as startup.background_dump_dest='I:\oracle\product\10.__large_pool_size=4194304 racdb3.__shared_pool_size=96468992 racdb1.__java_pool_size=4194304 racdb1.0 /admin/RACDB/bdump' *.__large_pool_size=4194304 racdb1.ora as shown below and then copy it back to spfile.260.__java_pool_size=4194304 racdb2.db_file_multiblock_read_count=16 *.db_create_file_dest='+DATA' *. · Creating Database instance on RAC5 Node: set ORACLE_SID=RACDB3 orapwd file=PWDRACDB3.__shared_pool_size=96468992 racdb3. so shutdown asm instance and go to crshome/bin and run the following command: srvctl add asm -n RAC5 -i +ASM3 -o %ORACLE_HOME% srvctl start asm -n rac5 Now execute srvctl start asm -n RAC5 and the ASM will be started.__db_cache_size=201326592 racdb2.2.__db_cache_size=159383552 racdb1.1.db_block_size=8192 *.db_name='RACDB' *.__streams_pool_size=0 *.__streams_pool_size=0 racdb2.2. racdb1.__streams_pool_size=0 racdb3.__java_pool_size=4194304 racdb3.__db_cache_size=159383552 racdb3.cluster_database=TRUE *.2.__large_pool_size=4194304 racdb2.0/admin /RACDB/cdump' *.cluster_database_instances=3 *.ORA password=password create pfile from spfile and edit the contents of the initRACDB3.0. You should now see asm instance running and you can verify by running command select * from v$asm_diskgroup However you need to add the asm service to cluster stack.0/admi n/RACDB/adump' *.core_dump_dest='I:\oracle\product\10.626900241' *.dispatchers='(PROTOCOL=TCP) (SERVICE=RACDBXDB)' .compatible='10.db_domain='' *.2.orapwd file=PWD+ASM3.

RACDB3.undo_management='AUTO' RACDB3. then that means it was not started with the proper spfile which has value set as cluster_database_instances=3.open_cursors=300 *. alter database enable public thread 3.undo_tablespace='UNDOTBS2' RACDB1.undo_tablespace='UNDOTBS1' *.user_dump_dest='I:\oracle\product\10.instance_number=2 RACDB1.sga_target=268435456 RACDB3.job_queue_processes=10 *.0/admin /RACDB/udump' Now create oracle db sid oradim -NEW -SID RACDB3 startup pfile=initRACDB3.thread=2 RACDB1.cannot mount To overcome this message you need to create the redo log for the new node from RAC1 as: alter database add logfile thread 3 group 5.processes=150 *.instance_number=3 RACDB2.remote_login_passwordfile='exclusive' *.pga_aggregate_target=16777216 *. You do not necessarily need a third party cluster software for RAC implementation as .instance_number=1 *.thread=1 *. Add new db instance to the node: adds instance srvctl add instance -d RACDB -i RACDB3 -n RAC5 Now shudown the database as srvctl stop database -d RACDB Now start the database as srvctl start database -d RACDB At this point all three instances are up and running and if you encounter issues like cluster_database_instances value is not in sync for any of the instance. RAC Concepts Primer Oracle Clusterware With 10g. ORA-01618: redo thread 3 is not enabled .2.remote_listener='LISTENERS_RACDB' *. Also create undo tablespace as: create undo tablespace UNDPTBS3. alter database add logfile thread 3 group 6.thread=3 RACDB2.undo_tablespace='UNDOTBS3' RACDB2. alter database mount.ora nomount.

Oracle Clusterware reads the ocr.Oracle Clusterware provides the clustering support.loc file during the system startup. 3. Database. It is important to note that Only one OCR process (designated as the master) in the cluster performs any disk I/O activity. It then reads the OCR file to determine the location of the voting disk. EVM. 8. 5. CSS then bring voting disks online. a new incarnation of the cluster . Finally. Each RAC node maintains a copy of the OCR in memory. All of the ACTIVE RAC nodes then register themselves with the MASTER node. it is then replicated from local OCR cache to the OCR cache on other nodes in the cluster. and to find out which resources need to be started on RAC Nodes after reading OCR file contents. Oracle Cluster Registry (OCR) OCR maintains RAC application resources and availability. Once the connection is established between the various RAC Nodes listeners. The OCR file contains information for all of cluster layers. 2. Oracle Clusterware determines the location of the OCR from the ocr. Oracle Clusterware software enables RAC nodes to communicate with each other and work as single logical RAC server. 4.loc(on Unix) or registry values(on windows). for the location of the ocr file. CSS performs the following: 1. 7. and CRS. It is created on a shared storage accessible to all Nodes. CSS then establishes a connection to all RAC nodes using private interconnect. This is the first process that is started in the Oracle Clusterware stack. The information relating to System includes CSS. CSS authorizes the first node that attains the ACTIVE state as the MASTER node unless a MASTER node is already assigned. these nodes are changed to ACTIVE status if the node(s) is able to access voting disk(s). The vote disk is required to determine the names/numbers of members in the cluster. Cluster Synchronization Services (CSS) CSS maintains membership of each RAC Nodes in the cluster through voting disk which is also stored in shared storage subsystem. 
ORA_CRS_HOME etc. The layers include System. 6. CRS. Once information is read by this master OCR process.

Oracle Clusterware Stack
The main processes that compose the Oracle Clusterware stack are:
1. Cluster Synchronization Service (CSS)
2. Event Manager Service (EVMD)
3. Cluster Ready Service (CRSD)
4. RACGIMON process
5. PROCD process
1. Cluster Synchronization Service Daemon
Cluster Synchronization Service Daemon (CSSD) is responsible for synchronization between the various resources in the cluster. A failure of this process will cause the relevant RAC node to reboot. These services are performed by the Node Membership (NM) and the Group Membership (GM) services.
Node Membership Service (NM) has the following role:
o Check the heartbeat across RAC nodes every second
o Check the heartbeat of the disk by performing a read/write operation every second
o If no heartbeat is received for more than 60 sec, the Master Node will evict the problematic node from the cluster
o Query the voting disk to determine if any RAC node is not able to write to it
The GM provides membership services. All clients that perform I/O operations register with the GM (e.g. LMON, DBWR). Reconfiguration of instances (when an instance joins or leaves the cluster) is also handled by GM. When a node fails, the GM sends out messages to other instances regarding the status.
2. Event Manager Daemon (EVMD)
The EVMD is an event-forwarding process that sends events through the Oracle Notification Service (ONS). All communications between the CRS and CSS happen via this process, so it also acts as a gateway for messages. If the daemon fails, it will restart automatically.
3. Cluster Ready Service Daemon (CRSD)
The CRSD process is used to define and manage resources. Resources have profiles that define metadata about them in OCR. The OCR information is cached inside CRS. This process manages the application resources, i.e. start, stop, and manage failover.
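The Node Membership rule described above (a heartbeat check every second, eviction after more than 60 seconds of silence) can be pictured with a small Python model. This is only an illustrative sketch and not how CSSD is implemented: the function name, the dictionary of timestamps and the sample node times are assumptions made for the example. Only the 60-second limit comes from the text.

```python
# Toy model of the Node Membership heartbeat rule described above.
# Only the 60-second limit comes from the text; all names are hypothetical.

MISS_LIMIT_SECONDS = 60


def nodes_to_evict(last_heartbeat, now, limit=MISS_LIMIT_SECONDS):
    """Return the nodes whose last heartbeat is more than `limit` seconds old.

    last_heartbeat maps a node name to the time (in seconds) at which its
    most recent heartbeat was received.
    """
    return sorted(
        node for node, seen in last_heartbeat.items() if now - seen > limit
    )


if __name__ == "__main__":
    # Hypothetical timestamps: rac3 has been silent for 65 seconds.
    heartbeats = {"rac1": 1000, "rac3": 935, "rac5": 940}
    print(nodes_to_evict(heartbeats, now=1000))
```

In this model a node that has been silent for exactly 60 seconds is not yet evicted, which matches the "more than 60 sec" wording.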

The CRSD process also starts and communicates with the RACGIMON process, and performs the tasks of starting, stopping, and failing over services. Resources that are managed by the CRS include: Global Service Daemon (GSD), ONS Daemon, Virtual Internet Protocol (VIP), Listeners, Databases, Instances, 3rd party Services.
4. RACGIMON Daemon
RACGIMON is a database health check process monitor. When the node that houses it fails, the RACGIMON process is started on the MASTER node of the surviving nodes by the CRS process.
5. PROCD Process
PROCD is also a process monitor. It runs on hardware platforms supporting other third-party cluster managers and is present only on platforms other than Linux; for example, it is present on AIX machines.
Additional Notes:
The voting disk is a shared disk that is accessed by all the nodes and used as a central reference; it keeps the heartbeat information between the nodes. If any of the nodes is unable to access the voting disk, the cluster immediately recognizes the communication failure and the Master node starts evicting the failed node from the cluster group to prevent data corruption. You should always have three voting disks in different locations to avoid a split brain issue, which can result in corruption.
Virtual IP is required to ensure that applications can be made highly available. Database listeners are configured to listen on the VIP addresses instead of the public ones. When a node goes down, the client connection will be rejected by that node; however its VIP resource will be failed over to another existing node, so there will be no TCP timeout and the clients will be connected to the RAC.
Cluster Interconnect is a communication network used by the cluster nodes for the synchronization of resources, and it is also used to transfer instance-specific data from one instance to another.
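The advice above to keep three voting disks can be illustrated with a simple majority rule. The sketch below assumes that a node must be able to reach a strict majority of the configured voting disks in order to stay in the cluster. That rule and the function names are assumptions made for illustration; this document itself only states that three disks should be used.

```python
# Majority-rule illustration for voting disks.
# ASSUMPTION: a node needs access to more than half of the voting disks.


def has_quorum(accessible, total):
    """True if `accessible` out of `total` voting disks is a strict majority."""
    if total <= 0 or not 0 <= accessible <= total:
        raise ValueError("need total > 0 and 0 <= accessible <= total")
    return accessible > total // 2


def tolerated_disk_failures(total):
    """Number of voting disks that can be lost while a majority remains."""
    if total <= 0:
        raise ValueError("need total > 0")
    return (total - 1) // 2


if __name__ == "__main__":
    for total in (1, 2, 3):
        print(total, "voting disk(s): can lose", tolerated_disk_failures(total))
```

Under this assumption a second voting disk adds no protection over a single one, while a third disk allows one disk to be lost, which is one way to read the recommendation to place three disks in different locations.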

The cluster interconnect network layer should be dedicated to the RAC and should have high bandwidth with low latency.
RAC Background Processes
A RAC instance will have the usual background processes that a single non-RAC instance has, plus additional processes specifically required for the RAC environment.
1. LMON (Lock Monitor): Global Enqueue Services Monitor
The LMON process is a monitor process which manages:
· Instance deaths and associated recovery for the failed node
· Cluster/locks reconfiguration when a new instance joins or an existing instance gets evicted from the RAC
· Consistency of GCS memory in case any LMSx process dies
2. LMS (Lock Manager Service): Global Cache Services Process
LMS is the process used in Cache Fusion. Cache Fusion uses a high-speed interprocess communication network for cache-to-cache transfer of data blocks between RAC instances. It addresses transaction concurrency between instances. Its functions are:
· Enables consistent copies of blocks to be transferred between instances.
· Rolls back uncommitted transactions for blocks that are being requested for consistent read by the remote instance.
· The number of LMS processes running is driven by the GCS_SERVER_PROCESSES parameter, for example ora_lms0..ora_lms9
3. LMD (Lock Manager Daemon): Global Enqueue Services Daemon
It is a process responsible for:
· Managing requests for resources and controlling access to blocks and global enqueues
· Handling global deadlock detection and remote resource requests
4. LCK: Lock Process
Its primary function is to manage non-Cache Fusion resource requests such as library cache, row cache, and lock requests that are local to the instance.
5. DIAG: Diagnostic Daemon
Monitors health of the RAC instances

and captures diagnostic data regarding process failures in an instance.
· Note that PMON starts a new DIAG process to continue its service in case the DIAG process dies.
Additional Notes:
The GCS and GES processes on each RAC node manage the cache synchronization by using the cluster interconnect network layer. In a clustered database environment, there are different scenarios of block sharing, which can be categorized as follows:
· Concurrent Reads on multiple nodes occur when two or more instances are required to read the same block of data.
· Concurrent Reads and Writes on different nodes are a combination of I/O operations for a single block of data. A block available on any of the instances is modified by another instance while maintaining a different copy of data.
· Concurrent Writes on different nodes occur where multiple instances want to change the same data block frequently.
Troubleshooting RAC Environment
Now that you have created a three node RAC with storage extended from RAC1 to RAC3 with normal redundancy, we are ready to create a third voting disk. But before we do that, I would like to share some of the issues faced and the methods to resolve them. It may be possible that these issues will not arise on a stable environment like AIX/HP over SAN storage, but having the RAC tested over VMWare/Windows has its benefits in terms of troubleshooting.
In a Windows environment, you have the following services for each RAC node. Please start the services in the following order.
Oracle Object service (Keep Auto Start)
Oracle cluster volume service (Keep Auto Start)
Oracle CS service (Keep Auto Start)
Oracle EVM service
Oracle CR Service
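The start order above can be captured in a small helper that prints the commands to run by hand for the services that are not left on automatic start. This is a sketch under assumptions: the internal service names (OracleObjectService and so on) are guesses based on names used elsewhere in this document and should be checked in the Windows Services console, and net start is used here only as the standard Windows command for starting a service.

```python
# Services in the start order listed above.
# Each entry: (ASSUMED Windows service name, kept on automatic start?)
SERVICE_ORDER = [
    ("OracleObjectService", True),
    ("OracleClusterVolumeService", True),
    ("OracleCSService", True),
    ("OracleEVMService", False),
    ("OracleCRService", False),
]


def manual_start_commands(services=SERVICE_ORDER):
    """Build 'net start' commands, in order, for services not on auto start."""
    return [f'net start "{name}"' for name, auto in services if not auto]


if __name__ == "__main__":
    for command in manual_start_commands():
        print(command)
```

Keeping the order in one list makes it harder to start the CR service before the EVM service by accident.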

version. On windows some times you are not able to start the CRSD service and in the crsd.ora to have the timout increase from 10 secs.node3] Value [] I used ocrdump ocr. so I did the following to resolve: srvctl remove database -d RACDB (this will move db resource and instances registered) Crs_unregister ora. please make sure that the CS service is started on all nodes.RACDB.oradb.RACDB RAC3 5. 1. I had this problem with the database resources and the instances. Cluster components/services not starting.log file you will notice network timeout ora-errors.db srvctl add database -d RACDB -o srvctl add instance -d RACDB -i srvctl add instance -d RACDB -i srvctl start database -d RACDB also all %ORACLE_HOME% RACDB1 -n RAC1 RACDB2 -n RAC1 You can also remove a particular instance by running the command: srvctl remove instance -d RACDB -i RACDB1 3. Key [SYSTEM.inbound_connect_timeout=600 4. OCR Corrupted when starting crsd service After I added the 3rd node and starting crsd service which failed with a message in crsd. you need to add this parameter in sqlnet. Miscellenous cluster commands: srvctl start instance -d RACDB -i RACDB2 srvctl status instance -d RACDB -i RACDB2 srvctl status database -d RACDB crs_stat -t -v srvctl add asm -n RAC5 -i +ASM3 -o %ORACLE_HOME% srvctl start asm -n rac5 ocrcheck --starts specific resourse Crs_start <resource> -c <member> Crs_start ora.log: Incorrect SV stored in OCR.node_numbers. Creation of OCR mirror: ocrconfig -replace ocrmirror \\. Some times you receive a message that the <name> resource is not registered with the cluster and although you are able to see the resource when you type crs_stat -t.\ocrcfg 2. sqlnet.Some times you will not be able to start EVM service.txt and opened the file in text editor and found out that the value of .

SYSTEM.version.node_numbers.node3 should be 10.2.0.3.0 like for other nodes. What I did was export the OCR using ocrconfig -export ocr.dmp, open the ocr.dmp file in a hex editor and add the values. Later I imported it back using ocrconfig -import ocr.dmp and it worked. This method however is not supported by Oracle, and the only other way out was to recreate the OCR or re-install the RAC5 node.
6. RAC Logs
While investigating various problems, you should be familiar with the following log files.
RAC Node Alert Log
Location: I:\oracle\product\10.2.0\crs\log\rac3\alertrac3.log
This log basically logs status information for the entire cluster. For example when you start cluster services, it will display status for voting disks being brought online. You can also find information about all cluster members being active, and OCR information when it is configured for changes, like when upgrading the OCR. You will not find detailed information about individual cluster components; however, summary information is logged here.
Location: I:\oracle\product\10.2.0\crs\log\rac3\client
Here you can find log files like cssn.log, which displays client information for any missing entries in the registry. For example when your mirrored OCR gets corrupted, you will see a log entry mentioning that the corresponding location could not be found. For example:
2007-07-16 10:43:47.784: [ OCROSD][2744]utgdv:11:could not read reg value ocrmirrorconfig_loc os error= The system could not find the environment option that was entered.
Cluster Services Log
Location: I:\oracle\product\10.2.0\crs\log\rac3\crsd
This is the most important log of all (cluster ready services). Depending on the level of debug mode, you should be able to find all relevant information on cluster resources (asm, listener, database, ons etc.) as to why they failed to start, plus a continuously running log of all failures. When you start the crsd service, you should check this log for any warnings or errors.
Location: I:\oracle\product\10.2.0\crs\log\rac3\css

0\crs\BIN\OOBJService. Location: I:\oracle\product\10.Cluster stack log for the CSS service which gets started before CRSD service. You can control the information being logged with various trace levels. crsctl debug statedump crs will dump status of crsd Suppose you want to debug specific modules for a service.2. Location: I:\oracle\product\10. crsctl lsmodules css will list: CSSD COMMCRS COMMNS crsctl lsmodules crs will list: CRSUI CRSCOMM CRSRTI CRSMAIN CRSPLACE CRSAPP CRSRES CRSCOMM CRSOCR CRSTIMER CRSEVT CRSD CLUCLS CSSCLNT COMMCRS .0\crs\log\rac3\racg This folder has log files for VIP service (even for other nodes when they failed over) as well as the main database service log ora. You should look in this log (even if the windows service gets started fine) for any errors relating to accessing the shared storage from a node.RACDB.db. first this you should do is to find out all of the modules related to a service.log This is the log for the first windows service Oracle Object Service which gets started and is responsible to links to storage management (ocr disk. voting etc).2. Location: I:\oracle\product\10.2.0\crs\log\rac3\evmd Cluster event management log for the EVMD service which gets started after CSS but before CRSD. For example.log which controls all RAC instances for High availability and monitoring.

Backup and Recovery .2.racdb2.ini Database Services Log Location: I:\oracle\product\10. For example you find here a log about ocr not being able to initialized as file name: I:\oracle\product\10. Location: I:\oracle\product\10.RAC Environment Here I will discuss about backing up RAC .0\db_1\log\rac3\racg\imon .log Here you can find information related to Cache fusion communication channel over private interface card. Look here when there is an issue between nodes for private interface channel.RACDB.log I:\oracle\product\10.2.db:5 crsctl debug log res ora.log.CRSCOMM:2" crsctl debug log res ora.log Instance monitor/RACGIMON logs Location: I:\oracle\product\10.0\db_1\log\rac3\* This location holds several logs and its worthwhile to look here when there is an issue with cluster database.2.0\db_1\log\rac3\racg\imon _RACDB.RACDB.inst:5 ocr looging: uncomment:I:\oracle\product\10.0\db_1\RDBMS\log\ ipcdbg.COMMNS:5 crsctl debug log crs "CRSRTI:1.2.COMMCRS:5.RACDB1.COMMNS crsctl lsmodules evm will list: EVMD EVMDMAIN EVMCOMM EVMEVT EVMAPP EVMAGENT CRSOCR CLUCLS CSSCLNT COMMCRS COMMNS Now suppose you want to debug css modules for level 5(detailed info): crsctl debug log css CSSD:5. Which holds useful information everytime you run ocrcheck utility.2.2.log and ocrcheck_600.0\crs\srvm\admi n\ocrlog.0\db_1\log\rac3\\client\o crconfig_1064.

environment including all of its components: RAC, Cluster and Database. As far as the OS and Oracle software layer is concerned, you should have a cold backup as the OS system backup, which should include the OS and Oracle software mount points.
Backup and Recovery for clustered database
You can always use an export method to back up the database or a specific schema; however this does not differ from single instance to RAC and I would not consider export to replace the standard backup procedures. Export is used when your requirements are closer to the application level for specific objects. Therefore do not consider export as your backup strategy for RAC, or even for a single instance non-RAC database. RMAN should always be the choice when considering backup strategies for a RAC database environment.
There are many articles available on Metalink that talk about RAC backup and recovery procedures/commands. My viewpoint is that a DBA needs to be more aware of backup and recovery concepts in a RAC environment than of the actual command differences. I have always maintained RAC databases in such a way that I did not have to issue different backup/recovery commands for single instance vs. RAC. The key here is that you should always define the archived log location in the shared storage (where the rest of the data files reside). Backup commands have specific switches when your archived logs are backed up locally on each node. For example, take a look at the following test case where archived logs are defined at a shared storage accessible to both nodes.
1. Create a folder in shared storage to hold your backup sets.
set ORACLE_SID=+ASM1
set ORACLE_HOME=I:\oracle\product\10.2.0\db_1
asmcmd -p
cd DATA
cd RACDB
mkdir BACKUP
+DATA/RACDB/BACKUP is the shared backup location unless you are using a tape device, in which case you should make sure to use an MML like Veritas and have the backup registered in Veritas as well as in the RMAN repository.

2. Point your archived logs to be created at shared storage.

    alter system set log_archive_dest='+DATA' scope=both;

3. Take full database backup

    run {
      change archivelog all crosscheck;
    }
    configure CONTROLFILE AUTOBACKUP on;
    run {
      backup as compressed backupset database format = '+DATA/RACDB/BACKUP/FULLB1%u';
      backup archivelog all format = 'i:\oracle\archbkup%u' delete input;
    }
    run {
      delete obsolete;
    }
    RMAN> list backup of database;

4. Create test data

    show parameter db_create_file_dest

    NAME                                 TYPE        VALUE
    ------------------------------------ ----------- ---------------------
    db_create_file_dest                  string      +DATA

    drop tablespace tbs_test1 including contents and datafiles;
    CREATE TABLESPACE tbs_test1 DATAFILE SIZE 20M;
    drop user user1;
    create user user1 identified by user1;
    alter user user1 default tablespace tbs_test1;
    create table dept (id number);
    --insert some values

Now from both nodes switch logfiles. Note down the archived logs created.

    SELECT name, thread#, sequence#, completion_time
    FROM gV$ARCHIVED_LOG
    where completion_time > TO_DATE('21-JUL-2007 12:00:31', 'DD-MON-YYYY HH24:MI:SS')
    and name is not null
    ORDER BY SEQUENCE# DESC;

5. Simulate Crash

Shut down the database (only on Windows). From ASMCMD, remove the datafile for the tbs_test1 tablespace. Start up the database in mount state.

    alter database datafile 7 offline;
    alter database open;

    select * from v$recover_file;
    exit

    rman target /
    restore datafile 7;
    recover datafile 7;
    sql 'alter database datafile 7 online';

As you can see in the above example, we did not have to specify any extra commands for the RAC environment because our archived logs are located in the common storage. However, if you decide to put archived logs locally, these are the changes to be aware of:

    run {
      allocate channel d1 type disk connect 'sys/rac@node1';
      allocate channel d2 type disk connect 'sys/rac@node2';
      # backup or restore commands go here
    }

Or

    CONFIGURE CHANNEL 1 DEVICE TYPE DISK connect 'SYS/rac@node1';
    CONFIGURE CHANNEL 2 DEVICE TYPE DISK connect 'SYS/rac@node2';

Now you run commands as normal, and those will have the scope for the relevant node specified above.

There is an excellent note on Metalink that deals with issues relating to recovery scenarios for RAC environments: Note:207059.1 and Note:220970.1. Document 207059.1 is an old one that deals with 9i and Parallel Server, but it does clear up some concepts about raw devices.

Backup and Recovery for OCR & VOTING disks

OCR Backup and Recovery
Reference: Metalink Note: 220970.1

OCR is the Oracle Cluster Registry; it holds all the cluster related information such as instances and services. The OCR file format is binary, and starting with 10.2 it is possible to mirror it. The location of the file(s) is recorded in /etc/oracle/ocr.loc, in the ocrconfig_loc and ocrmirrorconfig_loc variables.

The OCR raw device/file gets backed up every four hours on the master RAC node at the default location: $ORA_CRS_HOME\cdata\"clustername"\

To display backups: ocrconfig -showbackup
To restore a backup: ocrconfig -restore

The automatic backup mechanism keeps about a week old copy. If you want to take a logical copy of the OCR at any time, use ocrconfig -export, and use the -import option to restore the contents back.

Obviously, if you only have one copy of the OCR and it is lost or corrupt, then you must restore a recent backup; see the ocrconfig utility for details, specifically the -showbackup and -restore flags.
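As a side note on the ocr.loc file mentioned above, the following is a minimal Python sketch, assuming the file holds plain key=value lines, that reports the primary and mirror OCR locations. The helper names (parse_ocr_loc, ocr_locations) and the sample device paths are hypothetical and are used only for illustration; they are not part of any Oracle tool.

```python
from typing import Dict, Optional, Tuple


def parse_ocr_loc(text: str) -> Dict[str, str]:
    """Parse key=value lines of an ocr.loc-style file into a dict.

    Assumption: one setting per line, blank lines and '#' comments ignored.
    """
    settings: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip()
    return settings


def ocr_locations(settings: Dict[str, str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (primary, mirror); mirror is None when no mirror is configured."""
    return settings.get("ocrconfig_loc"), settings.get("ocrmirrorconfig_loc")


# Hypothetical sample content; real device names will differ per system.
SAMPLE = """\
ocrconfig_loc=/dev/raw/raw1
ocrmirrorconfig_loc=/dev/raw/raw2
"""

if __name__ == "__main__":
    primary, mirror = ocr_locations(parse_ocr_loc(SAMPLE))
    print("primary:", primary)
    print("mirror :", mirror if mirror else "(not configured)")
```

A check like this only reads the file; changes to OCR locations should still go through ocrconfig.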

Until a valid backup is restored, the Oracle Clusterware will not start up due to the corrupt/missing OCR file.

The interesting discussion is what happens if you have the OCR mirrored and one of the copies gets corrupt. You would expect that everything will continue to work seamlessly. Well, almost. The real answer depends on when the corruption takes place.

If the corruption happens while the Oracle Clusterware stack is up and running, then the corruption will be tolerated and the Oracle Clusterware will continue to function without interruptions, despite the corrupt copy. The DBA is advised to repair the hardware/software problem that prevents OCR from accessing the device as soon as possible; alternatively, the DBA can replace the failed device with another healthy device using the ocrconfig utility with the -replace flag.

If, however, the corruption happens while the Oracle Clusterware stack is down, then it will not be possible to start it up until the failed device becomes online again or some administrative action using the ocrconfig utility with the -overwrite flag is taken. When the Clusterware attempts to start, you will see messages similar to:

    total id sets (1), 1st set (1669906634,1958222370), 2nd set (0,0) my votes (1), total votes (2)
    2006-07-12 10:53:54.301: [OCRRAW][1210108256]proprioini:disk 0 (/dev/raw/raw1) doesn't have enough votes (1,2)
    2006-07-12 10:53:54.301: [OCRRAW][1210108256]proprseterror: Error in accessing physical storage [26]

This is because the software can't determine which OCR copy is the valid one. In the above example, one of the OCR mirrors was lost while the Oracle Clusterware was down. There are 3 ways to fix this failure:

a) Fix whatever problem (hardware/software?) prevents OCR from accessing the device.

b) Issue "ocrconfig -overwrite" on any one of the nodes in the cluster. This command will overwrite the vote check built into OCR when it starts up. Basically, if the OCR device is configured with a mirror, OCR assigns each device one vote. The rule is to have more than 50% of the total votes (quorum) in order to safely make sure the available devices contain the latest data. In 2-way mirroring, the total vote count is 2, so it requires 2 votes to achieve the quorum. In the example above, there aren't enough votes to start if only one device with one vote is available. (In the earlier example, where the device goes down while OCR is running, OCR assigns 2 votes to the surviving device, and that is why this surviving device, now with two votes, can start after the cluster is down.) See the ocrconfig utility for details.

c) This method is not recommended to be performed by customers. It is possible to manually modify ocr.loc to delete the failed device and restart the cluster. OCR won't do the vote check if the mirror is not configured.

How to move the OCR location:
- Stop the CRS stack on all nodes.
- Edit /var/opt/oracle/ocr.loc (or the Windows registry) on all nodes and set ocrconfig_loc=new OCR device.
- Restore from one of the automatic physical backups using ocrconfig -restore.
- Run ocrcheck to verify.
- Reboot to restart the CRS stack.

OCR locations can be changed with ocrconfig:

    ocrconfig -replace ocr|ocrmirror [<filename>]

In short, these are the commands to administer OCR:

    ocrconfig -replace ocr destination_file or disk

Here, do the following to add a mirror file:

    ocrconfig -replace ocrmirror destination_file or disk

To replace the OCR, do the following:

    ocrconfig -replace ocr destination_file or disk

and to replace the OCR mirror:

    ocrconfig -replace ocrmirror destination_file or disk

Repairing the OCR:

    ocrconfig -repair ocrmirror device_name

To remove an OCR, you need to have at least one OCR online:

    ocrconfig -replace ocr
    OR
    ocrconfig -replace ocrmirror

Voting Disk Backup and Recovery

On Unix, you can use the following to back up voting disks:

    dd if=voting_disk_name of=backup_file_name

In Windows environments you can use the ocopy command, along with the crsctl commands, to copy and administer the files.

List existing voting disks:
    crsctl query css votedisk
To delete an existing voting disk:
    crsctl delete css votedisk path
To add another voting disk:
    crsctl add css votedisk path
The above command should be run when CRS is up; however, use the force option if CRS is down:
    crsctl add css votedisk path -force

Test Case: Let's apply what we have learned to the RAC environment we created earlier.

Adding 3rd voting disk:
Our extended RAC environment already has one voting disk for RAC1 & RAC3 nodes. I would like to add a third voting disk on the RAC5 (3rd) node.

From the VMWare settings of the RAC5 node, create a new pre-allocated virtual disk (IDE) of 300MB in size. Now share the D:\RACVIRTUAL\RAC5 folder on XPWS5 to XPWS1 and XPWS2 with full access. On XPWS1, create a logical Y: drive to point to it, and likewise on XPWS2. Then from VMWare, add an existing virtual disk from both workstations (XPWS1 and XPWS2) to point to Y:\votedisk3.vmdk.

Run ASMTOOLG before and after adding the disks, so you will be able to identify the raw partition name. You should reboot RAC1, RAC3 and RAC5 and then use GUIOracleOBJManager.exe to see that the new disk of 300MB is visible. Now you need to run GUIOracleOBJManager.exe under CRS HOME/bin to assign the logical name (link) VOTEDSK3 to the new candidate disk.

Now, from the RAC1 node, make sure all cluster services are DOWN and verify this with the command crs_stat -t. There is a bug, fixed in 10.2.0.4, that crashes the CRS stack if voting disks are added online; therefore you need to use the force option.

From the node RAC5:

    I:\oracle\product\10.2.0\crs\BIN>crsctl add css votedisk \\.\votedsk3
    Cluster is not in a ready state for online disk addition

    I:\oracle\product\10.2.0\crs\BIN>crsctl add css votedisk \\.\votedsk3 -force
    Now formatting voting disk: \\.\votedsk3
    successful addition of votedisk \\.\votedsk3.

Verify this from all nodes by running crsctl query css votedisk:

    I:\oracle\product\10.2.0\crs\BIN>crsctl query css votedisk
     0.     0    \\.\votedsk1
     1.     0    \\.\votedsk2
     2.     0    \\.\votedsk3

    located 3 votedisk(s).

Now start the cluster node rac1 with all services, and it should be up and running.

Taking a backup of voting disk:
Shut down all cluster services across nodes. Use the Oracle-supplied ocopy command to take a backup as shown below.

From RAC5:

    I:\oracle\product\10.2.0\crs\BIN>ocopy
    OCOPY v2.0 - Copyright 1989-1993 Oracle Corp. All rights reserved.
    Usage: ocopy from_file [to_file [a size_1 [size_n]]]
           ocopy -b from_file to_drive
           ocopy -r from_drive to_dir

    I:\oracle\product\10.2.0\crs\BIN>ocopy \\.\votedsk3 votedsk3.bak
    VOTEDSK3.BAK

    I:\oracle\product\10.2.0\crs\BIN>

Restoring a backup of voting disk:
Suppose you lost your voting disk/device; follow the same procedures as described above to re-create the new voting disk. However, suppose you lost all of your voting disks but you had a backup; then follow the procedures to create a new raw voting disk/device as described earlier, until the point where you assign the link name. Then run the restore as shown below:

    I:\oracle\product\10.2.0\crs\BIN>ocopy votedsk3.bak \\.\votedsk3
    \\.\VOTEDSK3

Changing Location of Voting disk:
Use the add method described above.

What to do when OCR/Voting disks are lost and there is no backup:
Reference: Metalink ID 399482.1
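When verifying voting disks from several nodes, it can help to check the crsctl query css votedisk listing programmatically. The following is a minimal Python sketch, assuming the listing looks like the one shown in this test case (an index, a number, and a path per line, followed by a "located N votedisk(s)" summary). The function name parse_votedisks and the regular expressions are hypothetical; other Clusterware versions may print a different layout.

```python
import re
from typing import List

# Assumed line layout: "<index>. <number> <path>", as in the listing in this document.
DISK_LINE = re.compile(r"^\s*(\d+)\.\s+(\d+)\s+(\S+)\s*$")
SUMMARY_LINE = re.compile(r"located\s+(\d+)\s+votedisk")


def parse_votedisks(output: str) -> List[str]:
    """Return voting disk paths; raise if the summary count disagrees."""
    disks: List[str] = []
    reported = None
    for line in output.splitlines():
        disk_match = DISK_LINE.match(line)
        if disk_match:
            disks.append(disk_match.group(3))
            continue
        summary_match = SUMMARY_LINE.search(line)
        if summary_match:
            reported = int(summary_match.group(1))
    if reported is not None and reported != len(disks):
        raise ValueError(
            "summary reports %d disk(s) but %d were listed" % (reported, len(disks))
        )
    return disks


# Sample text copied from the test case in this document.
SAMPLE = r"""
 0.     0    \\.\votedsk1
 1.     0    \\.\votedsk2
 2.     0    \\.\votedsk3

located 3 votedisk(s).
"""

if __name__ == "__main__":
    for path in parse_votedisks(SAMPLE):
        print(path)
```

Running the same check against the output captured on each node gives a quick way to confirm that all nodes see the same set of voting disks.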

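Looking back at the OCR vote check described in the OCR section (more than 50% of the total votes are needed to start), the arithmetic can be sketched in Python as follows. This is a simplified model of the rule as stated in this document, not Oracle's actual implementation; the device labels "ocr" and "ocrmirror" and the function names are hypothetical.

```python
from typing import Dict, Set


def has_quorum(available_votes: int, total_votes: int) -> bool:
    """True when the available votes are strictly more than 50% of the total."""
    return 2 * available_votes > total_votes


def can_start(device_votes: Dict[str, int], online: Set[str]) -> bool:
    """Sum the votes of the online devices and apply the quorum rule."""
    total = sum(device_votes.values())
    available = sum(votes for name, votes in device_votes.items() if name in online)
    return has_quorum(available, total)


if __name__ == "__main__":
    # Two-way mirror, one vote per device.
    mirrored = {"ocr": 1, "ocrmirror": 1}
    print(can_start(mirrored, {"ocr", "ocrmirror"}))  # both devices online
    print(can_start(mirrored, {"ocr"}))               # mirror lost while the stack was down

    # Mirror failed while the stack was running: the survivor holds both votes.
    survivor = {"ocr": 2, "ocrmirror": 0}
    print(can_start(survivor, {"ocr"}))
```

With one vote per device, a single surviving device holds exactly half of the votes, which is not more than 50%, so the stack cannot start without intervention. A survivor that was given both votes while the stack was running does pass the check.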
Next

I thought of publishing this document at this point, and I will create additional articles on Performance Tuning and Failover strategies. For further details, please visit the Resource Section.

References

http://www.oracle.com/technology/products/database/clustering/index.html

For Educational Purpose Only

The information contained in this document represents my personal view on the issues discussed as of the date of publication, and I cannot guarantee the accuracy of any information presented after the date of publication. This document is for informational purposes only. I MAKE NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS DOCUMENT. For further information, please contact Support@OracleFusions.com

Copyright © 2007 www.OracleFusions.com All rights reserved.
