Smart Oracle Database Solutions-Oracle Database Extended RAC



Published by: Serkan Kiracı on Feb 18, 2011
Copyright: Attribution Non-commercial








Oracle Database Extended RAC
Oracle 10g Extended RAC on Windows 2003 Handbook
For Achieving HA & Disaster Recovery Solution
By Database Manager

Index
· Introduction
· Test Bed overview
· Installation of Virtual Machines
· Installation of Oracle Clusterware Services
· Installation of Oracle Software/ASM/DB
· Post RAC Installation Health Check
· Applying Oracle Patch set
· Adding a third Node to the Cluster Database
· RAC Concepts Primer
· Troubleshooting RAC Environment
· Backup and Recovery - RAC Environment
· References

Introduction
This document is intended as a guideline for Oracle and system administrators who are responsible for implementing an Oracle Extended RAC (also known as stretched clustering) for nodes located within 5 km of each other. I have implemented the same on IBM P590 series running AIX 5.3 with Oracle 10g Release, using Oracle Clusterware on SAN storage, with the two data centers connected over a 1GB Laser Link network with a latency of less than 5 ms. We had some problems in implementing the solution, and most of the issues were related to configuring the Oracle RAC components that deal with network synchronization. At that time, I decided to first fully test the Extended RAC implementation on my own, and thanks to VMWare software I was able to do so. This document is based on the VMWare installation on Windows, and I would highly recommend that before you implement Extended RAC on Unix in a UAT/Production environment, you have it fully tested on Windows to clear the concepts behind it. For the RAC Handbook on IBM P590 Series-AIX OS, please refer to the RAC resources section on my site.

Test Bed overview
· Oracle 10g Release (the latest patch set will be applied later).
· Windows 2003 Enterprise Operating System Service Pack 1.
· Windows resource kit, to be installed on every virtual machine.
· VMWare Workstation Version 4.5 (you can download a trial version from their site, but I would highly recommend buying it).
· 4 Windows XP workstation PCs attached to a network with a latency of 5 ms at most, with 1GB RAM, 1 CPU each, and at least 40GB of free storage space.
There are many articles on the internet which talk about implementing RAC on a single PC with VMWare installed. However, they all need at least 2GB of memory, and even when you have that, after the RAC is installed you cannot test all possible RAC scenarios because of the lack of resources. What I did, and recommend, is to talk to the other DBAs at your work place and agree that the PCs of 4 DBAs can be used to simulate the Extended RAC testing. So all you need to make sure of is 4 PCs with 1GB of memory each, connected over the LAN, with administrative privileges on the workstations.

Installation of Virtual Machines
Let's call the 4 XP workstations XPWS1, XPWS2, XPWS3 and XPWS4. On each of these workstations, install the VMware Workstation software, then launch it and create one Windows 2003 virtual machine on every XP workstation. While creating a virtual machine, please make sure to assign the following to each virtual server:
· On XPWS1, create the folder C:\RACVIRTUAL and under it create two subfolders, RAC1 and ASMDISK.
· On XPWS2, create the folder D:\RACVIRTUAL and under it create two subfolders, RAC3 and ASMDISK.
· On XPWS3, create the folder D:\RACVIRTUAL and under it create one subfolder, RAC5.
· The above folders are for the new virtual server hosting Windows 2003. The ASMDISK folder is for hosting the ASM raw devices.
· Each virtual OS should be assigned 524MB of RAM.
· During creation of the virtual servers, choose bridged network, IO adapter as LSI Logic, disks as SCSI.
· Under Virtual Machine Settings, remove Drive A:
· As for Windows 2003, choose default settings.
· Make sure the swap space is 1GB and goes to Drive C:
· Create two logical drives, C: and D:, where C: should be 4GB for OS usage and D: 5GB, where we will install the Oracle 10g software.
· After the installation is over, choose the VM menu option and install VMWare Tools for the new virtual machine/OS.

Workstation   Virtual Server   Remarks
XPWS1         RAC1             RAC Node1 with its storage defined
XPWS2         RAC3             RAC Node2 with its storage defined
XPWS3         RAC5             RAC Node3 with no storage of its own
XPWS4         RAC6             Only storage for 3rd voting disk

At this point you will have 4 XP workstations, each with a single virtual machine running the Windows 2003 OS. Let us focus now on the first two virtual machines created. The remaining two virtual machines will be used later for a) adding a 3rd node, and b) adding a third voting disk/site, respectively. Therefore, at this point you will work with two virtual machines, which I named RAC1 (on PC XPWS1) and RAC3 (on PC XPWS2). RAC5 (on XPWS3) will be configured later as a 3rd node. You now need to configure the network settings for these two virtual servers. Shut down the RAC1 and RAC3 servers (we will call them RAC1 and RAC3 from now on; they are created on two separate workstations connected over your home or office network). This point is very important to note, because all of my configurations below are based on this fact, so please do not create two virtual machines on the same physical PC but have them created on different physical servers. If you would like to proceed with single-PC RAC testing, then you should see other articles on the internet.

Let's now proceed with the NIC settings:
· Shut down RAC1, edit the virtual machine settings and add a new Ethernet adapter of Bridged type. Bring up the RAC1 server and you should see a window displayed for new hardware found; press Next and complete it. It will fail at the end, but let it be, because if you press Cancel, the same new-hardware-found screen will appear again every time you reboot RAC1.
· Go to Network Connections from Control Panel and you should see two NICs identified as Local Area Connection1 and Local Area Connection2. Rename the two as Public and Private respectively. Click on their properties, choose Internet Protocol and assign the IP addresses. Make sure the subnet is different for Public and Private, because the Private NIC will be used for Cache Fusion/inter-node communication between the two RAC nodes. Click on the Advanced Settings of the Network Connections window and make sure the order of the list is first Public, and then Private, so that the host name RAC1 will resolve to the Public NIC.
My settings are:
RAC1 Node:
Public NIC: 10.… subnet: 255.… Gateway: 10.…
Private NIC: 10.… subnet: 255.… Gateway: leave empty
RAC3 Node:
Public NIC: 10.… subnet: 255.… Gateway: 10.…
Private NIC: 10.… subnet: 255.… Gateway: leave empty
Repeat the same for the RAC3 node.
· Modify the hosts file of the RAC1 & RAC3 OS (c:\windows\system32\drivers\etc) as:
10.…  RAC1
10.…  RAC1-priv
10.…  RAC1-vip
10.…  RAC3
10.…  RAC3-priv
10.…  RAC3-vip
RAC1-VIP and RAC3-VIP are not physically linked to any network card but are logically defined on the Public subnet. When we have completed the RAC setup, these are the IP addresses (or names) which will be used to configure client connections to the RAC. The setup is such that these act as components of the cluster, so in case Node1 is down, its VIP service will fail over to the surviving node and the client will be redirected to the surviving node without any TCP timeout, which usually happens when the listener is listening on a port attached to a physical IP address.

Perform the following steps for all nodes (RAC1, RAC3, RAC5, RAC6):
· Oracle supports the TCP/IP protocol for the public and private networks and requires that Windows Media Sensing is disabled by setting the value of the DisableDHCPMediaSense parameter to 1. To do this, go to the Windows registry via Regedit, navigate to HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters, and add the DWORD key DisableDHCPMediaSense=1. To know more about this parameter see http://www.microsoft.com/technet/prodtechnol/windows2000serv/reskit/regentry/94173.mspx?mfr=true
· Go to the command prompt and type:
set devmgr_show_nonpresent_devices=1
devmgmt.msc
Then remove any greyed-out NIC under the Network category.
· From Control Panel, Add/Remove Windows Components, add the Terminal Services component; this will enable you to use Remote Desktop services to access both nodes from a remote laptop or workstation if required.

Shut down both servers (RAC1, RAC3) from their respective workstations (you could use XPWS1 and from there open a remote connection to XPWS2; that's what I did). Also create a mapped drive from each workstation to the other's C:\RACVIRTUAL (and D:\RACVIRTUAL) folder with full permissions granted. The mapped drive can be called Z:, mapped to C:\RACVIRTUAL.
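The requirement above that the Public and Private NICs sit on different subnets is easy to get wrong when typing addresses by hand. The following is a minimal sketch of such a check using Python's standard ipaddress module; the addresses in it are hypothetical placeholders, not the values used on RAC1 and RAC3.

```python
import ipaddress


def same_subnet(ip_a, mask_a, ip_b, mask_b):
    """Return True when the two interfaces fall into overlapping IP networks."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{mask_a}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{mask_b}").network
    return net_a.overlaps(net_b)


# Hypothetical example values; substitute the settings of your own nodes.
public_nic = ("192.168.10.11", "255.255.255.0")
private_nic = ("192.168.20.11", "255.255.255.0")

if same_subnet(*public_nic, *private_nic):
    print("Public and Private NICs share a subnet: change one of them")
else:
    print("Public and Private NICs are on different subnets")
```

Run it once per node with that node's Public and Private settings before starting the Clusterware installation.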

Now edit the main virtual server file C:\RACVIRTUAL\RAC1\winNetEnterprise.vmx on XPWS1 (for the RAC1 node settings) and add the following lines:
disk.locking = "FALSE"
diskLib.dataCacheMaxSize = "0"
diskLib.dataCacheMaxReadAheadSize = "0"
diskLib.dataCacheMinReadAheadSize = "0"
diskLib.dataCachePageSize = "4096"
diskLib.maxUnsyncedWrites = "0"
scsi1.present = "TRUE"
scsi1.virtualDev = "lsilogic"
scsi1.sharedBus = "VIRTUAL"
· Repeat the same for Z:\RAC3\winNetEnterprise.vmx (on the mapped drive) for the virtual server RAC3 created on XPWS2.

We are now ready to create the raw devices. Since we are going to have an Extended RAC setup, I will create a set of raw devices on the storage of both workstations (XPWS1, XPWS2) via VMware. Each set will have three raw devices: one to hold database files, the second for OCR and the third for voting disks. Later, when configuring ASM with normal redundancy, I will mirror these two sets of raw devices (created on the storage of two different PCs). Later I will also show you how to create additional raw disks to move different database files (like redo, backup sets) to their own separate raw disks.

From the RAC1 node, go to the command prompt.
Raw device to hold Oracle database files: you need to use the VMWare GUI to create this one, as I had run out of SCSI limits for the number of virtual devices, so I chose an IDE hard disk as the new virtual disk. Create the raw disk as C:\RACVIRTUAL\ASMDISK\oradata1.vmdk; make sure you pre-allocate the disk space and choose a size of 4GB.
Raw device to hold OCR information:
vmware-vdiskmanager.exe -c -s 300MB -a lsilogic -t 2 C:\RACVIRTUAL\ASMDISK\ocr1.vmdk
Raw device to hold voting disk information:
vmware-vdiskmanager.exe -c -s 200MB -a lsilogic -t 2 C:\RACVIRTUAL\ASMDISK\votingdisk1.vmdk
OCR and voting disks will be explained in the next section.

Repeat the same from the RAC3 node (on XPWS2) as:
vmware-vdiskmanager.exe -c -s 200MB -a lsilogic -t 2 D:\RACVIRTUAL\ASMDISK\ocr2.vmdk
vmware-vdiskmanager.exe -c -s 200MB -a lsilogic -t 2 D:\RACVIRTUAL\ASMDISK\votingdisk2.vmdk
And now create D:\RACVIRTUAL\ASMDISK\oradata2.vmdk via the GUI as an IDE hard disk. On XPWS2 I have used drive D: because I have more free space on drive D: on that PC; however, for the RAC1 node on XPWS1 I have used drive C:. Also note that the names of the raw devices end with the digit 2, because these raw devices will be used together with the raw devices created on XPWS1 and mirrored by ASM.

Now you need to make the new disks available to the VMWare Workstation software by editing the virtual settings as:
· Bring up both nodes RAC1 and RAC3 and perform the following:
1. Go to the command prompt, type Diskpart and then enter the command Automount enable. This is required to make sure the raw devices will be auto-mounted every time the OS starts up.
2. The two RAC nodes must have their clocks synchronized. You can download third-party software which keeps the clocks in sync among different servers in one network; search for Time Sync Server on Windows in Google for that. Or you can use the net time command to configure the time from any time server available on the internet. What I did was basically sync the time of each virtual server to the host OS (which is the XP workstation). From the RAC1 node:
Check the current time server with:
NET TIME /QUERYSNTP
Set the initial time from the time server:
NET TIME \\XPWS1 /SET
Set the current time server (XPWS1) for RAC1:
NET TIME /SETSNTP:XPWS1
Repeat the same for the RAC3 node and make XPWS1 its time server as well. However, you need to run the above commands at every machine start-up, so make these two commands part of a scheduled job that triggers on every system start-up. Then check the times on both servers from one place as:
NET TIME \\RAC1
NET TIME \\RAC3
Alternatively, right click on the VMware Tools icon and select time sync between host and virtual machines, but when you opt for this option, make sure the Windows Time service is disabled.

· For the RAC1 node, you have already registered the raw device that holds the database files (oradata1.vmdk), as you created it via the VMWare GUI. However, for the OCR and voting disks, since you created them via the command prompt, use the VMWare GUI (Add Disks) and this time, instead of creating a new virtual disk, choose to create from an existing disk, browse to the location of these two files and select the vmdk files.
· Since the RAC1 node also needs to access the same disks created on the RAC3 node, you need to repeat the same procedure as above, but this time when you add disks from existing, choose the remote location Z:\.. and add the three raw devices from the RAC3 node.
· Start up the RAC1 node, right click on the My Computer shortcut on the desktop, choose Manage, then select the Storage section and click on Disk Management. You should see a popup window which lists the three new disks you added. Accept the defaults and press Next. You should now see your three disks appear as offline in the Disk Management tab. Click on each of them one by one and perform the following tasks:
1. Right click on the new disk, select New Partition, choose Extended, accept the defaults and proceed to finish.
2. Right click again and choose Create Logical Drive. Here make sure to choose "Do not assign drive letter" and do not format the disk, and continue until completion.
You should now see the disk with Online status. Repeat the same for the remaining two disks (ocr and voting).
· Repeat the whole procedure as described above for RAC3.
· Shut down both RAC1 and RAC3 nodes.
· Bring up both nodes and verify all storage settings.

For simplicity I have copied the contents of the vmx files for both nodes below:

RAC1 Node vmx file (notice the remote raw device links with Z:)
Location: C:\RACVIRTUAL\RAC1\winNetEnterprise.vmx
disk.locking = "FALSE"
diskLib.dataCacheMaxSize = "0"
diskLib.dataCacheMaxReadAheadSize = "0"
diskLib.dataCacheMinReadAheadSize = "0"
diskLib.dataCachePageSize = "4096"
diskLib.maxUnsyncedWrites = "0"
config.version = "7"
virtualHW.version = "3"

scsi0.present = "TRUE"
scsi0.virtualDev = "lsilogic"
memsize = "524"
scsi0:0.present = "TRUE"
scsi0:0.fileName = "Windows Server 2003 Enterprise Edition.vmdk"
ide1:0.present = "TRUE"
ide1:0.fileName = "auto detect"
ide1:0.deviceType = "cdrom-raw"
floppy0.fileName = "A:"
Ethernet0.present = "TRUE"
sound.present = "TRUE"
sound.fileName = "-1"
displayName = "RAC1"
guestOS = "winNetEnterprise"
priority.grabbed = "normal"
priority.ungrabbed = "normal"
powerType.powerOff = "default"
powerType.powerOn = "default"
powerType.suspend = "default"
powerType.reset = "default"
ide1:0.startConnected = "TRUE"
Ethernet0.addressType = "generated"
uuid.location = "56 4d c5 0d df c2 22 83-78 96 e8 44 92 2d 88 e3"
uuid.bios = "56 4d c5 0d df c2 22 83-78 96 e8 44 92 2d 88 e3"
ethernet0.generatedAddress = "00:0c:29:2d:88:e3"
ethernet0.generatedAddressOffset = "0"
tools.syncTime = "FALSE"
scsi0:1.present = "TRUE"
scsi0:1.fileName = "Windows Server 2003 Enterprise Edition (3).vmdk"
sound.virtualDev = "es1371"
scsi1.present = "TRUE"
scsi1.virtualDev = "lsilogic"
scsi1.sharedBus = "VIRTUAL"
Ethernet1.present = "TRUE"
Ethernet1.addressType = "generated"
ethernet1.generatedAddress = "00:0c:29:2d:88:ed"
ethernet1.generatedAddressOffset = "10"
floppy0.present = "FALSE"
redoLogDir = "."
Ethernet0.connectionType = "bridged"
Disk entries (scsi1:0, scsi1:2, scsi1:3, scsi1:4, ide0:0 and ide0:1, each with present = "TRUE" and deviceType = "plainDisk", and mode = "persistent" on the scsi1 bus) reference the local files C:\RACVIRTUAL\ASMDISK\oradata1.vmdk, C:\RACVIRTUAL\ASMDISK\ocr1.vmdk and C:\RACVIRTUAL\ASMDISK\votingdisk1.vmdk, and the remote files Z:\ASMDISK\oradata2.vmdk, Z:\ASMDISK\ocr2.vmdk and Z:\ASMDISK\votingdisk2.vmdk. The entries scsi1:1, ide1:1, scsi0:2, scsi0:3, scsi0:5 and scsi0:6 are marked present = "FALSE".

RAC3 Node vmx file (note the remote links to raw devices on RAC1)
Location: Z:\RAC3\winNetEnterprise.vmx
disk.locking = "FALSE"
diskLib.dataCacheMaxSize = "0"
diskLib.dataCacheMaxReadAheadSize = "0"
diskLib.dataCacheMinReadAheadSize = "0"
diskLib.dataCachePageSize = "4096"
diskLib.maxUnsyncedWrites = "0"
config.version = "7"
virtualHW.version = "3"
scsi0.present = "TRUE"
scsi0.virtualDev = "lsilogic"
memsize = "524"
scsi0:0.present = "TRUE"
scsi0:0.fileName = "Windows Server 2003 Enterprise Edition.vmdk"
ide1:0.present = "TRUE"
ide1:0.fileName = "auto detect"
ide1:0.deviceType = "cdrom-raw"
floppy0.fileName = "A:"
Ethernet0.present = "TRUE"
sound.present = "TRUE"
sound.fileName = "-1"
displayName = "RAC3"
guestOS = "winNetEnterprise"
priority.grabbed = "normal"
priority.ungrabbed = "normal"
powerType.powerOff = "default"
powerType.powerOn = "default"
powerType.suspend = "default"
powerType.reset = "default"
ide1:0.startConnected = "TRUE"
Ethernet0.addressType = "generated"
uuid.location = "56 4d a4 62 67 78 2e 4e-bd 76 ea 69 ed 2d f9 03"
uuid.bios = "56 4d a4 62 67 78 2e 4e-bd 76 ea 69 ed 2d f9 03"
ethernet0.generatedAddress = "00:0c:29:2d:f9:03"
ethernet0.generatedAddressOffset = "0"
tools.syncTime = "FALSE"
scsi0:1.present = "TRUE"
scsi0:1.fileName = "Windows Server 2003 Enterprise Edition (3).vmdk"
sound.virtualDev = "es1371"
scsi1.present = "TRUE"
scsi1.virtualDev = "lsilogic"
scsi1.sharedBus = "VIRTUAL"
Ethernet1.present = "TRUE"
Ethernet1.addressType = "generated"
ethernet1.generatedAddress = "00:0c:29:2d:f9:0d"
ethernet1.generatedAddressOffset = "10"
floppy0.present = "FALSE"
redoLogDir = "."
Ethernet0.connectionType = "bridged"
Disk entries (scsi1:0, scsi1:2, scsi1:3, scsi1:4, ide0:0 and ide0:1, each with present = "TRUE" and deviceType = "plainDisk", and mode = "persistent" on the scsi1 bus) reference the local files D:\RACVIRTUAL\ASMDISK\oradata2.vmdk, D:\RACVIRTUAL\ASMDISK\ocr2.vmdk and D:\RACVIRTUAL\ASMDISK\votingdisk2.vmdk, and the remote files Z:\ASMDISK\oradata1.vmdk, Z:\ASMDISK\ocr1.vmdk and Z:\ASMDISK\votingdisk1.vmdk. The entry scsi1:1 is marked present = "FALSE".
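Both vmx files must carry the same shared-disk settings that were added by hand earlier (disk.locking, the diskLib cache settings and the scsi1 shared bus). As a rough sketch, the check below parses vmx-style key = "value" lines and reports any of those settings that are missing or different. Comparing keys without regard to case is an assumption of this sketch, and the sample text is a shortened illustration rather than a full vmx file.

```python
# Settings this handbook adds to both vmx files for the shared raw devices.
REQUIRED = {
    "disk.locking": "FALSE",
    "diskLib.dataCacheMaxSize": "0",
    "diskLib.dataCacheMaxReadAheadSize": "0",
    "diskLib.dataCacheMinReadAheadSize": "0",
    "diskLib.dataCachePageSize": "4096",
    "diskLib.maxUnsyncedWrites": "0",
    "scsi1.present": "TRUE",
    "scsi1.virtualDev": "lsilogic",
    "scsi1.sharedBus": "VIRTUAL",
}


def parse_vmx(text):
    """Parse key = "value" lines into a dict keyed by lower-cased name."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip().lower()] = value.strip().strip('"')
    return settings


def missing_settings(text):
    """Return the required settings that are absent or have another value."""
    found = parse_vmx(text)
    return [
        f'{key} = "{value}"'
        for key, value in REQUIRED.items()
        if found.get(key.lower()) != value
    ]


# Shortened, illustrative sample: scsi1.sharedBus is left out on purpose.
sample = '''
disk.locking = "FALSE"
diskLib.dataCacheMaxSize = "0"
diskLib.dataCacheMaxReadAheadSize = "0"
diskLib.dataCacheMinReadAheadSize = "0"
diskLib.dataCachePageSize = "4096"
diskLib.maxUnsyncedWrites = "0"
scsi1.present = "TRUE"
scsi1.virtualDev = "lsilogic"
'''

for entry in missing_settings(sample):
    print("missing or different:", entry)
```

To check a real file, read its contents (for example C:\RACVIRTUAL\RAC1\winNetEnterprise.vmx) and pass the text to missing_settings; an empty list means all nine settings are in place.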

Installation of Oracle Clusterware Services
· Before we go on installing Oracle CRS, let's verify that our two nodes fulfill all of the pre-requisites for CRS installation. This is achieved by running a verification utility provided by Oracle called "runcluvfy.bat". On the RAC1 command prompt, navigate to the CD-ROM where you have already inserted the Oracle 10g Enterprise CD:
Cd d:\
Cd D:\clusterware\cluvfy
runcluvfy.bat stage -pre crsinst -n rac1,rac3 -verbose
Verify that the output of the above command shows only the VIP verification failure; all other tests should pass.
· Execute D:\clusterware\Setup.exe, which will launch the Oracle installer for Cluster Services.
· In the Specify Home Details screen, set the Name as oracrs and the location as I:\oracle\product\10.2.0\crs.
· Oracle will then run a check on all of the pre-requisites; make sure all tests pass and then press Next.
· In the Specify Cluster Configuration screen, you have to add the two nodes along with all the Private, Public and VIP names. For example, you will have to add RAC3 as the second node for RAC, with RAC3-priv and RAC3-vip as the names of the Private and VIP interfaces.
· In the next screen, Specify Network Interface Usage, you have to edit and make sure there are two interconnects, one for Public (10.…) and the second for Private (10.…) in my case.
· In the cluster configuration Storage screen, you have to specify the two disks (main and its mirror) for both OCR and voting disks. You should be able to recognize the disks first by their sizes (remember we chose 4GB for data files, and 200, 300 MB for voting and OCR). You can also find the location of these disks by their names, verifying the names from the VMware machine settings for the disks added. You will see that for OCR you have an option like Primary OCR and mirrored OCR locations, but for voting you have to specify its location multiple times. That means for the OCR disks, one RAC node becomes the master (responsible for reading and writing to it and its mirror), and the other nodes communicate with the master node for OCR operations. However, for the voting disk, each node has to write to it separately. Here I should have specified three voting disks, but I chose only two and I shall add the third later. The reason is simply that I would like to show how to change RAC configurations after installation.
· Next Oracle will begin the installation. As you can see in the installation screen, Oracle will go through the following steps:
1. Install successful
2. Setup Successful
3. Remote Operation Pending
4. Configuration Pending
5. Oracle Clusterware configuration
6. Oracle Notification Server configuration
7. Oracle Private inter connect configuration
8. Virtual Private IP configuration
· Except step 8, all steps should complete OK. Ignore the failure and proceed to complete the install and exit.
· At this point we will have the following services configured in the Windows service manager:
1. Oracle Object service
2. Oracle cluster volume service
3. Oracle CR Service
4. Oracle CS service
5. Oracle EVM service
· You now need to run the VIP configuration assistant, as it was the one which failed. So run it from:
i:\oracle\product\10.2.0\crs\bin\vipca
· The assistant will show you a screen where you have to provide the RAC1-vip and RAC3-vip network names, and then proceed to install the VIP services.
· Complete the installation until the end; you should not receive any errors this time.
· Basically this will install the following 3 resources (not as Windows services):
1. VIP application resource
2. GSD Apps resource (Global Services Daemon)
3. ONS Apps resource (Oracle Notification Service)
· At this point your CRS installation is complete and you should verify it by running:
Cluvfy.bat stage -post crsinst -n RAC1,RAC3
· Following are the commands that can also be used to verify cluster health on both nodes:
I:\oracle\product\10.2.0\crs\bin\crs_stat -t
I:\oracle\product\10.2.0\crs\bin\crsctl check crs
I:\oracle\product\10.2.0\crs\bin\ocrcheck
· Recycle both nodes and verify the CRS health again.

Installation of Oracle Software/ASM/DB
· Make sure that the cluster services are up and running on both nodes.
· Run from the CD-ROM: d:\database\setup.exe.
· Select Enterprise Edition.
· Choose the Oracle Home (different from CRS) as oradb and the location as i:\oracle\product\10.2.0\db_1
· Make sure to check mark both nodes for the software installation.
· Make sure all pre-requisite tests are passed.
· In the Select Configuration Option screen, choose Install Database Software only.
· Now launch the Oracle Database Configuration Assistant (DBCA) from the Programs group (not from the CD).
· Choose Oracle RAC Database in the Welcome screen.
· In the next screen select Configure Automatic Storage Management.
· Select both RAC nodes, provide a password for the ASM instance and choose pfile, which means each ASM instance on RAC1 and RAC3 will have its own init.ora located on NTFS.
· DBCA will then create the ASM instance, and then you will see a screen where you have to create disk groups.
· In the create disk group screen, click on Stamp disks and you should see all of your raw devices (from the storage of both nodes). Select the two raw devices of 4GB on the two nodes and accept the defaults.
· Now you should be able to see both disks appear as candidates in the Create Disk Group screen.
· Select Redundancy as Normal and create the disk group as shown in the following picture.
· The main group is DATA and the two sub groups are DATAP and DATAS.
· Press OK and continue to complete.
· The ASM instance setup is now completed and we can begin creating a database. I will create more groups later and show how to distribute REDO, archived logs and the Flash recovery area to their respective ASM groups.
· Launch DBCA from the Programs group and follow the screens as under.
· Choose Create Database, select both nodes, select the general purpose database template and name the database RACDB. The two instances will be RACDB1 and RACDB2.
· Select ASM as the storage for the new database.
· Select DATA as the ASM group for all database files.
· Use Oracle Managed Files; it is always recommended to use OMF with ASM.
· Do not specify the Flash recovery area as it will be done later, but specify the archived log location.
· Accept the defaults for the rest of the screens and continue until completion.

Post RAC Installation Health Check
Now that the RAC is installed, let's perform basic health checks.
--Assignment of Environment Variables
Right click on My Computer, select Properties, go to the Advanced tab and define the following environment variables on both servers:
SET CRS_HOME=I:\oracle\product\10.2.0\crs
SET ORACLE_HOME=I:\oracle\product\10.2.0\db_1
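With CRS_HOME defined, the health-check utilities used in this handbook (crs_stat, crsctl and ocrcheck) all live under its bin folder. The sketch below, under that assumption, builds their full paths from the CRS_HOME environment variable, falling back to the location used in this handbook when the variable is not set. The helper name tool_paths is made up for illustration.

```python
import ntpath
import os

# Location used in this handbook; only a fallback when CRS_HOME is not set.
DEFAULT_CRS_HOME = r"I:\oracle\product\10.2.0\crs"
TOOLS = ("crs_stat", "crsctl", "ocrcheck")


def tool_paths(env=None, tools=TOOLS):
    """Map each tool name to its full Windows path under CRS_HOME\\bin."""
    env = os.environ if env is None else env
    crs_home = env.get("CRS_HOME", DEFAULT_CRS_HOME)
    # ntpath joins with backslashes no matter where the script is run.
    return {name: ntpath.join(crs_home, "bin", name) for name in tools}


for name, path in tool_paths().items():
    print(f"{name}: {path}")
```

This keeps later scripts independent of the drive letter: changing CRS_HOME on a node is enough to point them at a different Clusterware home.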

--Oracle Clusterware Services
Service                               Process
Oracle Object Service                 OracleOBJService.exe
OracleClusterVolumeService            OcfsFindVol.exe
OracleCRService                       crsd.exe
OracleCSService                       ocssd.exe
OracleEVMService                      evmd.exe
--Oracle ASM Services
OracleASMService+ASM1                 oracle.exe
--Oracle Database Services
OracleoradbTNSListenerLISTENER_RAC1   TNSLSNR.EXE
OracleServiceRACDB1                   oracle.exe
OracleDBConsoleRACDB1                 nmesrvc.exe
OracleJobSchedulerRACDB1

--Link between Windows Services and Processes (in Task Manager)
Run command: TASKLIST /SVC (see the process links above)

Here is a short description of each of the CRS daemon processes (taken from Metalink Note 259301.1):
CRSD:
- Engine for HA operation
- Manages 'application resources'
- Starts, stops, and fails 'application resources' over
- Spawns separate 'actions' to start/stop/check application resources
- Maintains configuration profiles in the OCR (Oracle Configuration Repository)
- Stores current known state in the OCR
- Runs as root
- Is restarted automatically on failure
OCSSD:
- OCSSD is part of RAC and Single Instance with ASM
- Provides access to node membership
- Provides group services
- Provides basic cluster locking
- Integrates with existing vendor clusterware, when present
- Can also run without integration to vendor clusterware
- Runs as Oracle
- Failure exit causes machine reboot. This is a feature to prevent data corruption in the event of a split brain.
EVMD:
- Generates events when things happen
- Spawns a permanent child evmlogger
- Evmlogger, on demand, spawns children
- Scans callout directory and invokes callouts
- Runs as Oracle
- Restarted automatically on failure

--CRS_STAT: Check Health of Resources (ASM, Listener, Instance etc.)
Cd %CRS_HOME%
Crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....B1.inst application    ONLINE    ONLINE    rac1
ora....B2.inst application    ONLINE    ONLINE    rac3
ora.RACDB.db   application    ONLINE    ONLINE    rac3
ora....SM1.asm application    ONLINE    ONLINE    rac1
ora....C1.lsnr application    ONLINE    ONLINE    rac1
ora.rac1.gsd   application    ONLINE    ONLINE    rac1
ora.rac1.ons   application    ONLINE    ONLINE    rac1
ora.rac1.vip   application    ONLINE    ONLINE    rac1
ora....SM2.asm application    ONLINE    ONLINE    rac3
ora....C3.lsnr application    ONLINE    ONLINE    rac3
ora.rac3.gsd   application    ONLINE    ONLINE    rac3
ora.rac3.ons   application    ONLINE    ONLINE    rac3
ora.rac3.vip   application    ONLINE    ONLINE    rac3
crs_stat alone will provide a listing with full names.
crs_stat -f will provide detailed information about each of the components.

--Start/stop all Oracle services
Crs_start -all
Crs_stop -all
Start/stop individual services:
crs_start resource_name -c cluster_member
crs_start resource_name
For example:
crs_start ora.RACDB.RACDB1.inst
Please note that you can also use the srvctl command to start or stop services, and it is recommended to use it as it gives more control over each service group.

--CRSCTL: Controls RAC parameters
Checks health of the cluster only:
Crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
Query the voting disk locations:
crsctl query css votedisk
0.     0    \\.\votedsk1
1.     0    \\.\votedsk2
Check the version of Clusterware:
crsctl query crs softwareversion/activeversion
CRS software version on node [rac1] is [10.2.0.1.0]
You can also use the following utility to find out the location of the OCR and voting disks:
Run I:\oracle\product\10.2.0\crs\BIN\GUIOracleOBJManager.exe
Dump the cluster state to crsd.log:
crsctl debug statedump crs
Debug specific components (level 2); see crsd.log:
crsctl debug log "CRSTIMER:2"
Please note that dumping the cluster state is a one-time snapshot, while the other debug commands are modes of tracing with different levels. You should see a document for understanding how to debug a Real Application Clusters environment. See details at http://download-uk.oracle.com/docs/cd/B19306_01/rac.102/b14197/appsupport.htm
See the Appendix at the end of this document for more details.
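When the crs_stat -t listing shown earlier grows to more than a dozen resources, it helps to filter it down to the rows that need attention. The following is a small sketch that scans captured crs_stat -t text and returns every resource whose Target or State column is not ONLINE. It assumes the five-column layout shown in this handbook, and the OFFLINE row in the sample is invented for illustration.

```python
def offline_resources(crs_stat_output):
    """Return (name, target, state, host) for rows not fully ONLINE."""
    problems = []
    for line in crs_stat_output.splitlines():
        fields = line.split()
        # Resource rows start with "ora." and have at least four columns.
        if len(fields) < 4 or not fields[0].startswith("ora."):
            continue
        name, _type, target, state = fields[:4]
        host = fields[4] if len(fields) > 4 else ""
        if target != "ONLINE" or state != "ONLINE":
            problems.append((name, target, state, host))
    return problems


# Illustrative capture; the OFFLINE listener row is made up for the example.
sample = """\
Name           Type           Target    State     Host
------------------------------------------------------------
ora....B1.inst application    ONLINE    ONLINE    rac1
ora....C3.lsnr application    ONLINE    OFFLINE
ora.rac3.vip   application    ONLINE    ONLINE    rac3
"""

for name, target, state, host in offline_resources(sample):
    print(f"{name}: target={target} state={state} host={host or '-'}")
```

The text can come from redirecting crs_stat -t to a file on either node; an empty result means every listed resource is ONLINE in both columns.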

dmp -s online ocrconfig -import ocr. You can use the command ocrconfig -showbackup to see existing backups.Displays health of Oracle Cluster Registry Ocrcheck Status of Oracle Cluster Registry is as follows : Version : 2 Total space (kbytes) : 192652 Used space (kbytes) : 3800 Available space (kbytes) : 188852 ID : 1953799442 Device/File Name : \\. Recycle RAC Environment Step1: Stops agent processes SET ORACLE_SID=RACDB1 emctl status agent emctl status dbconsole emctl stop dbconsole Repeat the same on RAC3 and then run emctl status to verify.dmp oracle performs 4hr backup at cdata folder under CRS_HOME but only on master node. ocrconfig -repair ocr <ocr_location> ocrdump <file-name> (dumps ascii format) Manage Cluster Database srvctl command srvctl <commanD> <OBJECT> [<OPTIONS>] I will explain this with an exmaple.2.\ocrcfg Device/File integrity check succeeded Device/File Name : \\. Step2:Stop database with its instances(all) .0\db_1\log\rac3\client --Export ocr (takes backup and restore and change location) ocrconfig -export ocr. CRS and EVM). Let us suppose we need to stop all rac resources (not the windows services like CSS. Please make sure to keep a copy of the backup files.\ocrmirrorcfg Device/File integrity check succeeded Cluster registry integrity check succeeded Make sure to check the log: I:\oracle\product\10.-. ocrconfig -replace ocrmirror <new location> ocrconfig -restore ocrbackup Ocr backup is automatically taken every 4-hours on the master node.

srvctl stop database -d RACDB
Step3: Stop all asm instances
srvctl stop asm -n RAC1
srvctl stop asm -n RAC3
Step4: Stop VIP, GSD, ONS services
srvctl stop nodeapps -n RAC1
srvctl stop nodeapps -n RAC3
Step5: Start VIP, GSD, ONS and listener services
srvctl start nodeapps -n RAC1
srvctl start nodeapps -n RAC3
Step6: Start asm instances
srvctl start asm -n RAC1
srvctl start asm -n RAC3
Step7: Start db + instances
srvctl start database -d RACDB
Step8: dbconsole and agent startup
set ORACLE_SID=RACDB1
emctl start dbconsole
Here you need to repeat this step on RAC3 as well. Make sure RAC is up with the crs_stat -t command.

--Accessing RAC Environment from EM console
Make sure the agent is up and then open IExplorer: http://RAC1:1158/em where RAC1 is the dbconsole node. Whatever action you need to perform, you can also see the corresponding SQL that will be run. I would highly recommend that DBAs get familiar with EM, as it is an excellent GUI tool to monitor and manage your RAC environment.

Applying Oracle 10.2.0.3 Patch set
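Patching, like any other maintenance, starts from a cleanly stopped cluster, so the recycle order of Steps 1 to 8 above is worth keeping in one place. The following is a minimal Python sketch that only builds the command lists in that order and does not execute anything. The function name recycle_commands is hypothetical, and the mapping of RAC3 to SID RACDB2 is an assumption; the emctl commands still have to be run locally on each node.

```python
def recycle_commands(db, nodes, sid_by_node):
    """Return (stop, start) command lists in the order of Steps 1 to 8."""
    stop = []
    for node in nodes:                                         # Step 1: dbconsole per node
        stop.append(f"set ORACLE_SID={sid_by_node[node]}")
        stop.append("emctl stop dbconsole")
    stop.append(f"srvctl stop database -d {db}")               # Step 2
    stop += [f"srvctl stop asm -n {n}" for n in nodes]         # Step 3
    stop += [f"srvctl stop nodeapps -n {n}" for n in nodes]    # Step 4

    start = [f"srvctl start nodeapps -n {n}" for n in nodes]   # Step 5
    start += [f"srvctl start asm -n {n}" for n in nodes]       # Step 6
    start.append(f"srvctl start database -d {db}")             # Step 7
    for node in nodes:                                         # Step 8: dbconsole per node
        start.append(f"set ORACLE_SID={sid_by_node[node]}")
        start.append("emctl start dbconsole")
    return stop, start


# Assumed node-to-SID mapping for the two-node cluster described here.
stop, start = recycle_commands("RACDB", ["RAC1", "RAC3"],
                               {"RAC1": "RACDB1", "RAC3": "RACDB2"})
for command in stop + start:
    print(command)
```

The start list is the stop list in reverse layer order: node applications first, then ASM, then the database, then dbconsole.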

Now that the RAC is installed on the two nodes, RAC1 and RAC3, and we are ready to add a third node as RAC5, I would recommend patching the two RAC nodes with the latest Oracle patch for both the Cluster layer and the Database (ASM inclusive). Applying an Oracle patch has the prerequisite that all Oracle services are down.
Major steps in applying the patch:
· Unzip and copy the patch contents to the RAC1 node. Create a folder as C:\10203Patch and use that as the patch contents unzip folder.
· On both nodes, make sure all services/components of RAC are down.
· Run c:\10203Patch\Setup.exe
· Select ORACRS as your first home to be patched, which is the clusterware stack.
· On the next screen make sure both RAC nodes are checked and proceed to complete the installation.
· After the installation is over, on each of the two nodes, run the following from command prompt: I:\oracle\product\10.2.0\crs\install\patch102.bat
· At this point, you should notice in Windows service manager that the cluster related services are up and running. You can verify that the Cluster layer is patched with 10.2.0.3 by running the following commands:
crsctl query crs softwareversion
crsctl query crs activeversion
· Stop all cluster services once again. Now run setup.exe again and this time choose the oradb_1 home to patch the Oracle ASM and Oracle database software. Complete the patch installation. During db home patching, you may receive errors like file in use; you should go to the file folder from Windows Explorer, remove that file and then retry the operation.
· After installation is over, you need to run the following on the remote node (RAC3): execute <Oracle Home>\bin\SelectHome.bat on remote nodes to activate the following products:
Oracle Data Provider for .NET
Oracle Provider for OLE DB
Oracle Objects for OLE
Oracle Counters for Windows Performance Monitor
Oracle Administration Assistant
· After this point, make all RAC services on Windows Manual start except two services, Oracle Object Service and OracleClusterVolumeService, and bounce both nodes. Once up start the services in the

following order:
§ OracleCSService
§ OracleEVMService
§ OracleCRService
· Make sure database services are down, otherwise run the following command: srvctl stop database -d RACDB
· Now you are ready to run catupgrd against the data dictionary as the last step in patching the database. Since I am using Automatic SGA memory management, I will increase SGA_TARGET from 150 to 300 MB for the instance; you can revert it back to 150 after the patch is deployed. However, you need to make sure the SGA components (shared pool and java pool) are at least 150 MB each.
· From the RAC1 node, log on to sqlplus after setting ORACLE_SID=RACDB1. From sqlplus run create pfile from spfile. Save the pfile initRACDB1.ora (located at the local node location %ORACLE_HOME%/database) to another file name, then open initRACDB1.ora, change the parameter values and from sqlplus run create spfile='+DATA/RACDB/spfileRACDB.ora' from pfile. Then revert the saved init file back to the original initRACDB1.ora so that it only contains the following line: SPFILE='+DATA/RACDB/spfileRACDB.ora'
· Also remove the local file spfileracdb1.ora from the database folder as it is not required.
· Now start up the instance in mount mode, turn archiving off by running alter database noarchivelog, also run alter system set cluster_database=FALSE scope=spfile, then shut down the instance.
· Start up the instance again as:
1. startup upgrade
2. spool patch10203.log
3. @I:\oracle\product\10.2.0\db_1\RDBMS\ADMIN\catupgrd.sql
· Review the log file for any errors and make sure all database components are shown as updated with the 10.2.0.3 patch set:
· Component Status Version HH:MM:SS
· Oracle Database Server VALID 10.2.0.3.0 00:19:33
· JServer JAVA Virtual Machine VALID 10.2.0.3.0 00:06:23
· Oracle XDK VALID 10.2.0.3.0 00:01:40
· Oracle Database Java Packages VALID 10.2.0.3.0 00:01:03
· Oracle Text VALID 10.2.0.3.0 00:00:33

· Oracle XML Database VALID 10.2.0.3.0 00:01:37
· Oracle Real Application Clusters VALID 10.2.0.3.0 00:00:02
· Oracle Data Mining VALID 10.2.0.3.0 00:00:34
· OLAP Analytic Workspace VALID 10.2.0.3.0 00:00:59
· OLAP Catalog VALID 10.2.0.3.0 00:01:41
· Oracle OLAP API VALID 10.2.0.3.0 00:01:20
· Oracle interMedia VALID 10.2.0.3.0 00:08:31
· Spatial VALID 10.2.0.3.0 00:06:41
· Oracle Expression Filter VALID 10.2.0.3.0 00:00:30
· Oracle Enterprise Manager VALID 10.2.0.3.0 00:02:32
· Oracle Rule Manager VALID 10.2.0.3.0 00:00:13
· Run utlrp.sql to compile all invalid objects (otherwise they are revalidated when first accessed)
· alter system set cluster_database=TRUE scope=spfile;
· SHUTDOWN
· STARTUP MOUNT
· ALTER DATABASE ARCHIVELOG;
· SHUTDOWN
· STARTUP
· srvctl start database -d RACDB
· crs_stat -t should now show the database instances up and running.
At this point you have successfully deployed the Oracle 10.2.0.3 patch set.

Adding a third Node to the Cluster Database
Now that the RAC is installed on the two nodes, RAC1 and RAC3, we are ready to create RAC5 as the third node. RAC5 has already been created as a virtual machine on workstation XPWS3. RAC5 needs to be configured with the following parameters:
· IP addresses assigned, which also need to be replicated to the hosts file of the remaining two nodes, while the IP addresses of the existing two nodes need to be copied to the hosts file of RAC5. RAC5 will have the following IP addresses:
RAC5

dataCacheMaxReadAheadSize = "0" diskLib.fileName = "Windows Server 2003 Enterprise Edition (3).present = "TRUE" sound.suspend = "default" powerType.252 RAC5-VIP · Map network drives on XPWS3 as Y and W to point to ASM folders of RAC1 and RAC3 with full permission.locking = "FALSE" diskLib.42.maxUnsyncedWrites = "0" scsi1.version = "7" virtualHW.10.addressType = "generated" uuid.grabbed = "normal" priority.fileName = "auto detect" ide1:0.dataCacheMaxSize = "0" diskLib. · Following is the excerpt from RAC5 OS winNetEnterprise.10.vmdk" sound.present = "TRUE" .73 RAC5-PRIV 10.fileName = "A:" Ethernet0.present = "FALSE" ide1:0.vmx: disk.present = "TRUE" scsi0:1.fileName = "-1" displayName = "rac5" guestOS = "winNetEnterprise" priority.present = "TRUE" scsi0.present = "TRUE" scsi1.syncTime = "TRUE" scsi0:1.dataCacheMinReadAheadSize = "0" diskLib.powerOn = "default" powerType.generatedAddress = "00:0c:29:6a:0b:18" ethernet0.virtualDev = "lsilogic" scsi1.virtualDev = "es1371" Ethernet1.reset = "default" ide1:0.ungrabbed = "normal" powerType.10.bios = "56 4d 49 04 57 26 bc 40-3f 27 76 e5 1c 6a 0b 18" ethernet0.0.location = "56 4d 49 04 57 26 bc 40-3f 27 76 e5 1c 6a 0b 18" uuid.virtualDev = "lsilogic" memsize = "540" scsi0:0.fileName = "Windows Server 2003 Enterprise Edition.powerOff = "default" powerType.startConnected = "TRUE" Ethernet0.sharedBus = "VIRTUAL" config.generatedAddressOffset = "0" tools.present = "TRUE" scsi0:0.deviceType = "cdrom-raw" floppy0.present = "TRUE" sound.dataCachePageSize = "4096" diskLib.vmdk" ide1:0.version = "3" scsi0.

2.vmdk" scsi1:1.fileName = "W:\ASMDISK\votingdisk2.0\crs\install\ · I:\oracle\product\10.0\crs on RAC5 and also cluster services but will not start cluster sevices (except first 2 obj serv and cluster volume) · cd I:\oracle\product\10.present = "FALSE" ide0:0.generatedAddress = "00:0c:29:6a:0b:22" ethernet1.2.bat · You should recive the following messages and make sure there are no errors even for VIP services. Step 1: checking status of CRS stack Step 2: Configuring basic cluster services Step 3: configuring OCR repository with new nodes clscfg: EXISTING configuration version 3 detected.fileName = "Y:\ASMDISK\oradata1.0\crs\oui\BIN addnode.2. Run the following commands from Existing Node RAC1: · cluvfy comp peer -refnode rac1 -n rac5 (Compare) · Install Clusterware stack software on RAC5 from RAC1 as: cd cd I:\oracle\product\10.generatedAddressOffset = "10" scsi0:2.vmdk" scsi1:0.Ethernet1.fileName = "Y:\ASMDISK\votingdisk1.present = "TRUE" ide0:0.deviceType = "plainDisk" ide0:1.fileName = "Y:\ASMDISK\ocr1.0\crs\install>crssetup.present = "TRUE" scsi1:1.vmdk" floppy0.fileName = "W:\ASMDISK\oradata2.bat · Press Next to the welcome screen and provide public and private IP address of the new new node and complete the installation.vmdk" ide0:1.ad d. · The above proc will install I:\oracle\product\10.vmdk" ide0:0.fileName = "W:\ASMDISK\ocr2.2.present = "TRUE" scsi0:2.present = "TRUE" scsi1:0.present = "TRUE" ide0:1. clscfg: version 3 is 10G Release 2.deviceType = "plainDisk" · Now we are ready to add RAC Node2 as RACDB3 to Server RAC5.present = "TRUE" scsi0:3.vmdk" scsi0:3. .addressType = "generated" ethernet1.

Attempting to add 1 new nodes to the configuration
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 3: rac5 rac5-priv rac5
Creating OCR keys for user 'administrator', privgrp ''..
Operation successful.
Step 4: configuring safe mode for CRS components
Step 5: starting up the CRS stack on new nodes
Step 6: configuring OCR with new node VIP information
Creating VIP application resource on (1) nodes...
Creating GSD application resource on (1) nodes...
Creating ONS application resource on (1) nodes...
Starting VIP application resource on (1) nodes...
Starting GSD application resource on (1) nodes...
Starting ONS application resource on (1) nodes...
· At this point all cluster services on the new node RAC5 should be automatically started, and this marks the end of the Cluster stack installation for the new node.
· If you run crs_stat -t from Node3 (RAC5) you should see the gsd, ons and vip services up and running.
· Now you are ready to install the Oracle software on the new node.
cd I:\oracle\product\10.2.0\db_1\oui\bin
addnode.bat
Complete the install process.
· Go to the RAC5 node, run the network configuration assistant and create the listener with default settings. Make sure listener creation is only for the RAC5 node. Run crs_stat -t and you should see the listener component also appearing besides the ons, vip and gsd services.
Now you can go back to the RAC1 node, run the DBCA GUI tool and follow the screens to add the RACDB3 instance on the RAC5 node (third node). However, I always prefer the manual approach, which is explained below. Perform the following steps from the RAC5 node.
· Creating ASM instance on RAC5 Node:
Create an admin folder under I:\oracle\product\10.2.0
Copy all the contents of the admin folder from RAC1. You now have two subfolders, +ASM and RACDB.
Modify init+asm3.ora to include +ASM3.instance_number=3. Copy +ASM3.instance_number=3 to the init.ora of the other instances as well; this is not required, but it helps in case you later want to create an spfile for ASM.
Go to the command prompt of ORACLE_HOME/database
set ORACLE_SID=+ASM3

orapwd file=PWD+ASM3.ora password=password
oradim -new -ASMSID +ASM3 (creates the Windows service)
Make sure asmtoolg shows the oradata ASM group disks.
Via sqlplus, mount the instance with startup. You should now see the ASM instance running and you can verify by running the command select * from v$asm_diskgroup
However, you need to add the ASM service to the cluster stack, so shut down the ASM instance, go to crshome/bin and run the following commands, after which ASM will be started:
srvctl add asm -n RAC5 -i +ASM3 -o %ORACLE_HOME%
srvctl start asm -n rac5
· Creating Database instance on RAC5 Node:
set ORACLE_SID=RACDB3
orapwd file=PWDRACDB3.ORA password=password
create pfile from spfile, edit the contents of initRACDB3.ora as shown below, and then copy it back to the spfile.
racdb1.__db_cache_size=201326592
racdb2.__db_cache_size=159383552
racdb3.__db_cache_size=159383552
racdb1.__java_pool_size=4194304
racdb2.__java_pool_size=4194304
racdb3.__java_pool_size=4194304
racdb1.__large_pool_size=4194304
racdb2.__large_pool_size=4194304
racdb3.__large_pool_size=4194304
racdb1.__shared_pool_size=121634816
racdb2.__shared_pool_size=96468992
racdb3.__shared_pool_size=96468992
racdb1.__streams_pool_size=0
racdb2.__streams_pool_size=0
racdb3.__streams_pool_size=0
*.audit_file_dest='I:\oracle\product\10.2.0/admin/RACDB/adump'
*.background_dump_dest='I:\oracle\product\10.2.0/admin/RACDB/bdump'
*.cluster_database_instances=3
*.cluster_database=TRUE
*.compatible='10.2.0.1.0'
*.control_files='+DATA/racdb/controlfile/current.260.626900241'
*.core_dump_dest='I:\oracle\product\10.2.0/admin/RACDB/cdump'
*.db_block_size=8192
*.db_create_file_dest='+DATA'
*.db_domain=''
*.db_file_multiblock_read_count=16
*.db_name='RACDB'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=RACDBXDB)'

RACDB3.instance_number=3
RACDB2.instance_number=2
RACDB1.instance_number=1
*.job_queue_processes=10
*.open_cursors=300
*.pga_aggregate_target=16777216
*.processes=150
*.remote_listener='LISTENERS_RACDB'
*.remote_login_passwordfile='exclusive'
*.sga_target=268435456
RACDB3.thread=3
RACDB2.thread=2
RACDB1.thread=1
*.undo_management='AUTO'
RACDB3.undo_tablespace='UNDOTBS3'
RACDB2.undo_tablespace='UNDOTBS2'
RACDB1.undo_tablespace='UNDOTBS1'
*.user_dump_dest='I:\oracle\product\10.2.0/admin/RACDB/udump'
Now create the Oracle db sid:
oradim -NEW -SID RACDB3
startup pfile=initRACDB3.ora nomount
alter database mount;
ORA-01618: redo thread 3 is not enabled - cannot mount
To overcome this message you need to create the redo logs for the new node from RAC1 as:
alter database add logfile thread 3 group 5;
alter database add logfile thread 3 group 6;
alter database enable public thread 3;
Also create the undo tablespace as:
create undo tablespace UNDOTBS3;
Add the new db instance to the node:
srvctl add instance -d RACDB -i RACDB3 -n RAC5
Now shut down the database as:
srvctl stop database -d RACDB
Now start the database as:
srvctl start database -d RACDB
At this point all three instances are up and running. If you encounter issues such as the cluster_database_instances value not being in sync for one of the instances, it means that instance was not started with the proper spfile, which has cluster_database_instances=3.

RAC Concepts Primer
Oracle Clusterware
With 10g, you do not necessarily need third party cluster software for a RAC implementation as

Oracle Clusterware provides the clustering support. Oracle Clusterware software enables RAC nodes to communicate with each other and work as a single logical RAC server.

Oracle Cluster Registry (OCR)
OCR maintains RAC application resources and availability. It is created on shared storage accessible to all nodes. The OCR file contains information for all of the cluster layers. The layers include System, Database, and CRS. The information relating to System includes CSS, CRS, EVM, ORA_CRS_HOME etc. Oracle Clusterware reads ocr.loc (on Unix) or registry values (on Windows) for the location of the OCR file, and then reads the OCR file contents to find out which resources need to be started on the RAC nodes. Each RAC node maintains a copy of the OCR in memory. It is important to note that only one OCR process (designated as the master) in the cluster performs any disk I/O activity. Once information is read by this master OCR process, it is replicated from the local OCR cache to the OCR cache on the other nodes in the cluster.

Cluster Synchronization Services (CSS)
CSS maintains the membership of each RAC node in the cluster through the voting disk, which is also stored in the shared storage subsystem. This is the first process that is started in the Oracle Clusterware stack. The voting disk is required to determine the names/numbers of members in the cluster. CSS performs the following:
1. Oracle Clusterware determines the location of the OCR from the ocr.loc file during system startup.
2. It then reads the OCR file to determine the location of the voting disk.
3. CSS then brings the voting disks online.
4. CSS then establishes a connection to all RAC nodes using the private interconnect.
5. Once the connection is established between the listeners of the various RAC nodes, these nodes are changed to ACTIVE status if the node(s) is able to access the voting disk(s).
6. CSS authorizes the first node that attains the ACTIVE state as the MASTER node unless a MASTER node is already assigned.
7. All of the ACTIVE RAC nodes then register themselves with the MASTER node.
8. Finally, a new incarnation of the cluster

is established.
These services are performed by the Node Membership (NM) and the Group Membership (GM) services.
Node Membership Service (NM) has the following role:
o Check the heartbeat across RAC nodes every second
o Check the heartbeat of the disk by performing a read/write operation every second
o If no heartbeat is received for more than 60 sec, the Master Node will evict the problematic node from the cluster.
o Query the voting disk to determine if any RAC node is not able to write to it.
The GM provides membership services. All clients that perform I/O operations register with the GM (e.g. LMON, DBWR). Reconfiguration of instances (when an instance joins or leaves the cluster) is also handled by GM. When a node fails, the GM sends out messages to the other instances regarding the status.

Oracle Clusterware Stack
The main processes that compose the Oracle Clusterware stack are:
1. Cluster Synchronization Service (CSS)
2. Event Manager Service (EVMD)
3. Cluster Ready Service (CRSD)
4. RACGIMON process
5. PROCD process.

1. Cluster Synchronization Service Daemon:
Cluster Synchronization Service Daemon (CSSD) is responsible for synchronization between the various resources in the cluster. A failure of this process will cause the relevant RAC node to reboot.
2. Event Manager Daemon (EVMD)
The EVMD is an event-forwarding process that sends events through the Oracle Notification Service (ONS). All communications between the CRS and CSS happen via this process, so it also acts as a gateway for messages. If the daemon fails, it is restarted automatically.
3. Cluster Ready Service Daemon (CRSD)
The CRSD process is used to define and manage resources. Resources have profiles that define metadata about them in OCR. The OCR information is cached inside CRS. This process manages the application resources, i.e. start, stop, and manage failover. Moreover this process also

starts and communicates with the RACGIMON process, keeps the heartbeat information between the nodes, and also performs the tasks of starting, stopping, and failing over services. Resources that are managed by the CRS include: Global Service Daemon (GSD), ONS Daemon, Virtual Internet Protocol (VIP), Listeners, Databases, Instances, 3rd party Services.
4. RACGIMON Daemon
RACGIMON is a database health check process monitor. When the node that houses it fails, the RACGIMON process is started on the MASTER node of the surviving nodes by the CRS process.
5. PROCD Process
PROCD is also a process monitor that runs on hardware platforms supporting other third-party cluster managers. It is present only on hardware platforms other than Linux; for example, it is present on AIX OS machines.

Additional Notes:
The voting disk is a shared disk that is accessed by all the nodes and used as a central reference. If any of the nodes is unable to access the voting disk, the cluster immediately recognizes the communication failure and the Master node starts evicting the failed node from the cluster group to prevent data corruption. You should always have three voting disks in different locations to avoid the split brain issue, which can result in corruption.
Virtual IP is required to ensure that applications can be made highly available. Database listeners are configured to listen on the VIP addresses instead of the public ones. When a node goes down, client connections will be rejected by that node; however, its VIP resource fails over to another existing node, so there is no TCP timeout and the clients are connected to the RAC.
Cluster Interconnect is a communication network used by the cluster nodes for the synchronization of resources and is also used to transfer instance-specific data from one

instance to another. Cache fusion uses a high-speed interprocess communication network for cache-to-cache transfer of data blocks between RAC instances. The network layer should be dedicated to the RAC and have high bandwidth with low latency.

RAC Background Processes
A RAC instance has the usual background processes that a single non-RAC instance has, plus additional processes specifically required for the RAC environment.
1. LMS (Lock Manager Service): Global Cache Services Process
LMS is the process used in Cache Fusion. It addresses transaction concurrency between instances. Its functions are:
· Enables consistent copies of blocks to be transferred between instances.
· Rolls back uncommitted transactions for blocks that are being requested for consistent read by the remote instance.
· The number of LMS processes running is driven by the GCS_SERVER_PROCESSES parameter, for example ora_lms0..ora_lms9
2. LMON (Lock Monitor): Global Enqueue Services Monitor
LMON is a monitor process which manages:
· Instance deaths and associated recovery for the failed node
· Cluster/lock reconfiguration when a new instance joins or an existing instance gets evicted from the RAC
· Consistency of GCS memory in case any LMSx process dies.
3. LMD (Lock Manager Daemon): Global Enqueue Services Daemon
It is a process responsible for:
· Managing requests for resources and controlling access to blocks and global enqueues
· Handling global deadlock detection and remote resource requests.
4. LCK: Lock Process
Its primary function is to manage non-cache fusion resource requests such as library cache, row cache, and lock requests that are local to the instance.
5. DIAG: Diagnostic Daemon
Monitors health of the RAC instances

and captures diagnostic data regarding process failures in an instance.
· Note that PMON restarts a new DIAG process to continue its service in case the DIAG process dies.

Additional Notes:
The GCS and GES processes on each RAC node manage the cache synchronization by using the cluster interconnect network layer. In a clustered database environment, there are different scenarios of block sharing, which can be categorized as follows:
· Concurrent Reads on multiple nodes occur when two or more instances need to read the same block of data.
· Concurrent Reads and Writes on different nodes are a combination of I/O operations for a single block of data. A block available on any of the instances is modified by another instance while a different copy of the data is maintained.
· Concurrent Writes on different nodes occur where multiple instances want to change the same data block frequently.

Troubleshooting RAC Environment
Now that you have created a three node RAC with storage extended from RAC1 to RAC3 with normal redundancy, we are ready to create a third voting disk. But before we do that, I would like to share some of the issues faced and the methods to resolve them. It may be that these issues will not arise on a stable environment like AIX/HP over SAN storage, but having the RAC tested over VMWare/Windows has its benefits in terms of troubleshooting.
On a Windows environment, you have the following services for each RAC node. Please start the services in the following order.
Oracle Object service (Keep Auto Start)
Oracle cluster volume service (Keep Auto Start)
Oracle CS service (Keep Auto Start)
Oracle EVM service
Oracle CR Service
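The start order above can be turned into a small helper that lists what is still left to start. This is a minimal Python sketch under stated assumptions: the function name pending_starts is hypothetical, "OracleObjectService" is an assumed spelling of the Oracle Object service name, and the other four names are the ones used elsewhere in this document. It only prints net start commands and does not run them.

```python
# Windows service names in the start order listed above.
# "OracleObjectService" is an assumed spelling; check the name in services.msc.
START_ORDER = [
    "OracleObjectService",
    "OracleClusterVolumeService",
    "OracleCSService",
    "OracleEVMService",
    "OracleCRService",
]


def pending_starts(running):
    """Return `net start` commands for services not yet running, in start order."""
    return [f"net start {name}" for name in START_ORDER if name not in running]


# Example: the two auto-start services are already up after a reboot.
for command in pending_starts({"OracleObjectService", "OracleClusterVolumeService"}):
    print(command)
```

Keeping the order in one list avoids starting OracleCRService before OracleCSService, which is the situation the troubleshooting notes warn about.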

Sometimes you will not be able to start the EVM service; please make sure that the CS service is started on all nodes.
1. Creation of OCR mirror:
ocrconfig -replace ocrmirror \\.\ocrcfg
2. Cluster components/services not starting.
Sometimes you receive a message that the <name> resource is not registered with the cluster, although you are able to see the resource when you type crs_stat -t. I had this problem with the database resources and the instances, so I did the following to resolve it:
srvctl remove database -d RACDB (this will remove the db resource and also all instances registered)
crs_unregister ora.RACDB.db
srvctl add database -d RACDB -o %ORACLE_HOME%
srvctl add instance -d RACDB -i RACDB1 -n RAC1
srvctl add instance -d RACDB -i RACDB2 -n RAC1
srvctl start database -d RACDB
You can also remove a particular instance by running the command:
srvctl remove instance -d RACDB -i RACDB1
3. On Windows you are sometimes not able to start the CRSD service, and in the crsd.log file you will notice network timeout ora-errors. You need to add this parameter in sqlnet.ora to increase the timeout from 10 secs:
sqlnet.inbound_connect_timeout=600
4. Miscellaneous cluster commands:
srvctl start instance -d RACDB -i RACDB2
srvctl status instance -d RACDB -i RACDB2
srvctl status database -d RACDB
crs_stat -t -v
srvctl add asm -n RAC5 -i +ASM3 -o %ORACLE_HOME%
srvctl start asm -n rac5
ocrcheck
--starts specific resource
crs_start <resource> -c <member>
crs_start ora.oradb.RACDB RAC3
5. OCR Corrupted when starting crsd service
After I added the 3rd node, starting the crsd service failed with a message in crsd.log: Incorrect SV stored in OCR, Key [SYSTEM.version.node_numbers.node3] Value []
I used ocrdump ocr.txt, opened the file in a text editor and found out that the value of

SYSTEM.version.node_numbers.node3 should be 10.2.0.3.0 like for the other nodes, and the only way out was to recreate the OCR or re-install the RAC5 node. What I did was export the OCR using ocrconfig -export ocr.dmp, open the ocr.dmp file in a hex editor and add the values. Later I imported it back using ocrconfig -import ocr.dmp and it worked. This method, however, is not supported by Oracle.

6. RAC Logs
While investigating various problems, you should be familiar with the following log files.

RAC Node Alert Log
Location: I:\oracle\product\10.2.0\crs\log\rac3\alertrac3.log
This log basically records status information for the entire cluster. You will not find detailed information about individual cluster components; only summary information is logged here. For example, when you start the cluster services, it will display the status of the voting disks being brought online. You can also find information about all cluster members being active.

Location: I:\oracle\product\10.2.0\crs\log\rac3\client
Here you can find log files like cssn.log, which displays client information for any missing entries in the registry, and OCR information when it is configured for changes, such as when upgrading the OCR. For example, when your mirrored OCR gets corrupted, you will see a log entry about not being able to find the corresponding location. For example:
2007-07-16 10:43:47.784: [ OCROSD][2744]utgdv:11:could not read reg value ocrmirrorconfig_loc os error= The system could not find the environment option that was entered.

Cluster Services Log
Location: I:\oracle\product\10.2.0\crs\log\rac3\crsd
This is the most important log of all (cluster ready services). Depending on the level of debug mode, you should be able to find all relevant information about cluster resources (asm, database, listener, ons etc), including why they failed to start, plus a continuously running log of all failures. When you start the crsd service, you should check this log for any warnings or errors.

Location: I:\oracle\product\10.2.0\crs\log\rac3\css

Cluster stack log for the CSS service, which gets started before the CRSD service.

Location: I:\oracle\product\10.2.0\crs\log\rac3\evmd
Cluster event management log for the EVMD service, which gets started after CSS but before CRSD.

Location: I:\oracle\product\10.2.0\crs\log\rac3\racg
This folder has log files for the VIP service (even for other nodes when they failed over) as well as the main database service log ora.RACDB.db.log, which controls all RAC instances for high availability and monitoring.

Location: I:\oracle\product\10.2.0\crs\BIN\OOBJService.log
This is the log for the first Windows service, Oracle Object Service, which gets started and is responsible for the links to storage management (ocr disk, voting etc). You should look in this log (even if the Windows service starts fine) for any errors relating to accessing the shared storage from a node.

You can control the information being logged with various trace levels. For example, crsctl debug statedump crs will dump the status of crsd.
Suppose you want to debug specific modules for a service. The first thing you should do is find out all of the modules related to that service.
crsctl lsmodules css will list:
CSSD COMMCRS COMMNS
crsctl lsmodules crs will list:
CRSUI CRSCOMM CRSRTI CRSMAIN CRSPLACE CRSAPP CRSRES CRSCOMM CRSOCR CRSTIMER CRSEVT CRSD CLUCLS CSSCLNT COMMCRS

COMMNS
crsctl lsmodules evm will list:
EVMD EVMDMAIN EVMCOMM EVMEVT EVMAPP EVMAGENT CRSOCR CLUCLS CSSCLNT COMMCRS COMMNS
Now suppose you want to debug css modules at level 5 (detailed info):
crsctl debug log css CSSD:5,COMMCRS:5,COMMNS:5
crsctl debug log crs "CRSRTI:1,CRSCOMM:2"
crsctl debug log res ora.RACDB.db:5
crsctl debug log res ora.RACDB.RACDB1.inst:5
ocr logging: uncomment: I:\oracle\product\10.2.0\crs\srvm\admin\ocrlog.ini

Database Services Log
Location: I:\oracle\product\10.2.0\db_1\log\rac3\*
This location holds several logs and it is worthwhile to look here when there is an issue with the cluster database. For example, you find here logs about the OCR not being able to initialize, with file names such as I:\oracle\product\10.2.0\db_1\log\rac3\\client\ocrconfig_1064.log and ocrcheck_600.log, which hold useful information every time you run the ocrcheck utility.

Instance monitor/RACGIMON logs
Location:
I:\oracle\product\10.2.0\db_1\log\rac3\racg\imon.log
I:\oracle\product\10.2.0\db_1\log\rac3\racg\imon_RACDB.racdb2.log

Location: I:\oracle\product\10.2.0\db_1\RDBMS\log\ipcdbg.log
Here you can find information related to the Cache fusion communication channel over the private interface card. Look here when there is an issue between nodes on the private interface channel.

Backup and Recovery - RAC Environment
Here I will discuss about backing up RAC

environment including all of its components, Cluster and Database. As far as the OS and Oracle software layer is concerned, you should have a cold backup as part of the OS system backup, which should include the OS and Oracle software mount points.

Backup and Recovery for clustered database
You can always use an export method to back up the database or a specific schema; however, this does not differ from single instance to RAC, and I would not consider export a replacement for the standard backup procedures. Export is used when your requirements are closer to the application level, for specific objects. Therefore do not consider export as your backup strategy for RAC, or even for a single instance non-RAC database.
RMAN should always be the choice when considering backup strategies for a RAC database environment. There are many articles available on Metalink that talk about RAC backup and recovery procedures/commands. My viewpoint is that a DBA needs to be more aware of backup and recovery concepts in a RAC environment than of the actual command differences. I have always maintained RAC databases in such a way that I did not have to issue different backup/recovery commands for single instance vs. RAC. The key here is that you should always define the archived log location in the shared storage (where the rest of the data files reside). Backup commands have specific switches when your archived logs are backed up locally on each node. For example, take a look at the following test case where archived logs are defined on shared storage accessible to both nodes.
1. Create a folder in shared storage to hold your backup sets.
set ORACLE_SID=+ASM1
set ORACLE_HOME=I:\oracle\product\10.2.0\db_1
asmcmd -p
cd DATA
cd RACDB
mkdir BACKUP
+DATA/RACDB/BACKUP is the shared backup location unless you are using a tape device, in which case you should make sure to use an MML like Veritas and have the backup registered in Veritas as well as in the RMAN repository.

2. Point your archived logs to be created at shared storage.

    alter system set log_archive_dest='+DATA' scope=both;

3. Take full database backup

    run { change archivelog all crosscheck; }
    configure CONTROLFILE AUTOBACKUP on;
    run {
      backup as compressed backupset database format = '+DATA/RACDB/BACKUP/FULLB1%u';
      backup archivelog all format = 'i:\oracle\archbkup%u' delete input;
    }
    run { delete obsolete; }
    RMAN> list backup of database;

4. Create test data

    show parameter db_create_file_dest

    NAME                                 TYPE        VALUE
    ------------------------------------ ----------- --------------------
    db_create_file_dest                  string      +DATA

    drop tablespace tbs_test1 including contents and datafiles;
    drop user user1;
    CREATE TABLESPACE tbs_test1 DATAFILE SIZE 20M;
    create user user1 identified by user1;
    alter user user1 default tablespace tbs_test1;
    create table dept (id number);
    --insert some values

Now from both nodes switch logfiles. Note down the archived logs created.

    SELECT name, thread#, sequence#, completion_time
    FROM gV$ARCHIVED_LOG
    where completion_time > '21-JUL-2007 12:00:31' and name is not null
    ORDER BY SEQUENCE# DESC;

5. Simulate Crash

Shut down the database (only on Windows). From ASMCMD, remove the datafile for the tbs_test1 tablespace. Start up the database in mount state.

    alter database datafile 7 offline;
    alter database open;

    select * from v$recover_file;
    exit

    rman target /
    restore datafile 7;
    recover datafile 7;
    sql 'alter database datafile 7 online';

As you can see in the above example, we did not have to specify any extra commands for the RAC environment because our archived logs are located in the common storage. However, if you decide to put archived logs locally, these are the changes to be aware of:

    run {
      allocate channel d1 type disk connect 'sys/rac@node1';
      allocate channel d2 type disk connect 'sys/rac@node2';
      ...
    }

Or

    CONFIGURE CHANNEL 1 DEVICE TYPE DISK connect 'SYS/rac@node1';
    CONFIGURE CHANNEL 2 DEVICE TYPE DISK connect 'SYS/rac@node2';

Now you run commands as normal and those will have the scope for the relevant node specified above.

There are excellent notes on Metalink that deal with issues relating to recovery scenarios for RAC environments: Note:207059.1 and Note:220970.1. Document 207059.1 is an old one that deals with 9i and Parallel Server, but it does clear up some concepts about raw devices.

Backup and Recovery for OCR & VOTING disks

OCR Backup and Recovery

Reference: Metalink Note: 220970.1

OCR is the Oracle Cluster Registry; it holds all the cluster related information such as instances and services. The OCR file format is binary, and starting with 10.2 it is possible to mirror it. The location of the file(s) is recorded in /etc/oracle/ocr.loc, in the ocrconfig_loc and ocrmirrorconfig_loc variables.

The OCR raw device/file gets backed up every four hours on the master RAC node at the default location: $ORA_CRS_HOME\cdata\"clustername"\

To display backups:

    ocrconfig -showbackup

To restore a backup:

    ocrconfig -restore

The automatic backup mechanism keeps about a week old copy. If you want to take a logical copy of OCR at any time, use:

    ocrconfig -export

and use the -import option to restore the contents back.

Obviously, if you only have one copy of the OCR and it is lost or corrupt, then you must restore
a recent backup; see the ocrconfig utility for details, specifically the -showbackup and -restore flags. Until a valid backup is restored, the Oracle Clusterware will not start up due to the corrupt/missing OCR file.

The interesting discussion is what happens if you have the OCR mirrored and one of the copies gets corrupt. You would expect that everything will continue to work seamlessly. Well, almost. The real answer depends on when the corruption takes place.

If the corruption happens while the Oracle Clusterware stack is up and running, then the corruption will be tolerated and the Oracle Clusterware will continue to function without interruptions. Despite the corrupt copy, the DBA is advised to repair the hardware/software problem that prevents OCR from accessing the device as soon as possible; alternatively, the DBA can replace the failed device with another healthy device using the ocrconfig utility with the -replace flag.

If however the corruption happens while the Oracle Clusterware stack is down, then it will not be possible to start it up until the failed device becomes online again or some administrative action using the ocrconfig utility with the -overwrite flag is taken. When the Clusterware attempts to start you will see messages similar to:

    total id sets (1), 1st set (1669906634,1958222370), 2nd set (0,0) my votes (1), total votes (2)
    2006-07-12 10:53:54.301: [OCRRAW][1210108256]proprioini:disk 0 (/dev/raw/raw1) doesn't have enough votes (1,2)
    2006-07-12 10:53:54.301: [OCRRAW][1210108256]proprseterror: Error in accessing physical storage [26]

This is because the software can't determine which OCR copy is the valid one. In the above example one of the OCR mirrors was lost while the Oracle Clusterware was down. There are 3 ways to fix this failure:

a) Fix whatever problem (hardware/software?) prevents OCR from accessing the device.

b) Issue "ocrconfig -overwrite" on any one of the nodes in the cluster. This command will overwrite the vote check built into OCR when it starts up. Basically, if the OCR device is configured with a mirror, OCR assigns each device one vote. The rule is to have more than 50% of the total votes (quorum) in order to safely make sure the available devices contain the latest data. In 2-way mirroring, the total vote count is 2, so it requires 2 votes to achieve the quorum. In the example above there aren't enough votes to start if only one device with one vote is available. (In the earlier example, where the device went down while OCR was running, OCR assigned 2 votes to the surviving device, and that is why this surviving device, now with two votes, can start after the cluster is down.)

c) This method is not recommended to be performed by customers. It is possible to manually modify ocr.loc to delete the failed device and restart the cluster. OCR won't do the vote check if the mirror is not configured.

OCR locations can be changed with ocrconfig:

    ocrconfig -replace ocr|ocrmirror [<filename>]

How to move the OCR location:

- Stop the CRS stack on all nodes.
- Edit /var/opt/oracle/ocr.loc (or the Windows registry) on all nodes and set ocrconfig_loc=new OCR device.
- Restore from one of the automatic physical backups using ocrconfig -restore.
- Run ocrcheck to verify.
- Reboot to restart the CRS stack.

In short, these are the commands to administer OCR:

    ocrconfig -replace ocr destination_file or disk

Here, do the following to add a mirror file:

    ocrconfig -replace ocrmirror destination_file or disk

To replace the OCR, do the following:

    ocrconfig -replace ocr destination_file or disk

and to replace the OCR mirror:

    ocrconfig -replace ocrmirror destination_file or disk

Repairing the OCR:

    ocrconfig -repair ocrmirror device_name

To remove an OCR, you need to have at least one OCR online:

    ocrconfig -replace ocr

OR

    ocrconfig -replace ocrmirror
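The OCR vote rule described in this section (more than 50% of the total votes are needed before a mirrored OCR will start) can be illustrated with a small Python model. This is only a sketch of the rule as worded in the text, not Oracle's implementation; the function names `has_quorum` and `ocr_can_start` are made up for the illustration.

```python
from typing import Sequence


def has_quorum(available_votes: int, total_votes: int) -> bool:
    """Return True when the available votes exceed 50% of the total votes."""
    if total_votes <= 0:
        raise ValueError("total_votes must be positive")
    if not 0 <= available_votes <= total_votes:
        raise ValueError("available_votes must be between 0 and total_votes")
    # Integer form of: available_votes > total_votes / 2
    return 2 * available_votes > total_votes


def ocr_can_start(surviving_votes: Sequence[int], total_votes: int,
                  mirrored: bool = True) -> bool:
    """Hypothetical model of the startup vote check over reachable devices."""
    if not mirrored:
        # Per the text, no vote check is done when no mirror is configured.
        return True
    return has_quorum(sum(surviving_votes), total_votes)


if __name__ == "__main__":
    # Mirror lost while the Clusterware was down: one device holding one
    # vote out of two, so the stack cannot start.
    print(ocr_can_start([1], total_votes=2))  # False
    # Mirror lost while the Clusterware was running: the surviving device
    # was given both votes, so a later restart succeeds.
    print(ocr_can_start([2], total_votes=2))  # True
```

The two calls correspond to the two corruption scenarios in the text: the first is the case that produces the "doesn't have enough votes (1,2)" message, the second is the case where the surviving device can restart on its own.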

Voting Disk Backup and Recovery

On Unix, you can use the following to back up voting disks:

    dd if=voting_disk_name of=backup_file_name

In Windows environments you can use the ocopy command, along with the crsctl commands, to copy and administer the files.

List existing voting disks:

    crsctl query css votedisk

To delete an existing voting disk:

    crsctl delete css votedisk path

To add another voting disk:

    crsctl add css votedisk path

The above command should be run when CRS is up; however, use the force option if CRS is down:

    crsctl add css votedisk path -force

Test Case: Let's apply what we have learned to the RAC environment we created earlier.

Adding 3rd voting disk:

Our extended RAC environment already has one voting disk for RAC1 & RAC3 nodes. I would like to add a third voting disk on the RAC5 (3rd) node.

From the VMWare settings of the RAC5 node, create a new pre-allocated virtual disk (IDE) of 300MB in size. Now share the D:\RACVIRTUAL\RAC5 folder on XPWS5 to XPWS1 and XPWS2 with full access. On XPWS1 create a logical Y: drive to point to it, and likewise on XPWS2. Then from VMWare add the existing virtual disk from both workstations (XPWS1 and XPWS2) to point to Y:\votedisk3.vmdk.

You should reboot RAC1, RAC3 and RAC5 and then use GUIOracleOBJManager.exe to see that the new disk of 300MB is visible. Run ASMTOOLG before and after adding the disks, so you will be able to identify the raw partition name. Now you need to run GUIOracleOBJManager.exe under CRS HOME/bin to assign the logical name/link for the new candidate disk as VOTEDSK3, as shown in the following figure.

There is a bug, fixed in 10.2.0.4, that crashes the CRS stack if voting disks are added online; therefore you need to use the force option. Now from the RAC1 node, make sure all cluster services are DOWN and then verify this with the command crs_stat -t.

From the node RAC5:

    I:\oracle\product\10.2.0\crs\BIN>crsctl add css votedisk \\.\votedsk3
    Cluster is not in a ready state for online disk addition

    I:\oracle\product\10.2.0\crs\BIN>crsctl add css votedisk \\.\votedsk3 -force
    Now formatting voting disk: \\.\votedsk3
    successful addition of votedisk \\.\votedsk3.

Verify this from all nodes by running crsctl query css votedisk:

    I:\oracle\product\10.2.0\crs\BIN>crsctl query css votedisk
     0.     0    \\.\votedsk1
     1.     0    \\.\votedsk2
     2.     0    \\.\votedsk3

    located 3 votedisk(s).

Now start the cluster node rac1 with all services and it should be up and running.

Taking a backup of voting disk:

Shut down all cluster services across nodes. Use the Oracle-supplied ocopy command to take a backup as shown below.

From RAC5:

    I:\oracle\product\10.2.0\crs\BIN>ocopy
    OCOPY v2.0 - Copyright 1989-1993 Oracle Corp. All rights reserved.
    Usage: ocopy from_file [to_file [a size_1 [size_n]]]
           ocopy -b from_file to_drive
           ocopy -r from_drive to_dir

    I:\oracle\product\10.2.0\crs\BIN>ocopy \\.\votedsk3 votedsk3.bak
    VOTEDSK3.BAK

    I:\oracle\product\10.2.0\crs\BIN>

Restoring a backup of voting disk:

Suppose you lost your voting disk/device; then follow the procedures to create a new raw voting disk/device as described earlier, until the point where you assign the link name. However, suppose you lost all of your voting disks but you had a backup: follow the same procedures as described above to re-create the new voting disk. Then run the restore as shown below:

    I:\oracle\product\10.2.0\crs\BIN>ocopy votedsk3.bak \\.\votedsk3
    \\.\VOTEDSK3

Changing Location of Voting disk:

Use the add method described above.

What to do when OCR/Voting disks are lost and there is no backup:

Reference Metalink ID: 399482.1
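When the voting disk check has to be repeated on every node, the output of crsctl query css votedisk can be checked by a script instead of by eye. The following Python sketch parses a listing in the layout shown in the transcript above; the function name `parse_votedisk_listing` and the sample text are illustrative assumptions, and the layout of the output on other releases or platforms may differ.

```python
import re
from typing import List, Tuple

# One disk per line: "<index>. <number> <path>", as in the transcript above.
_DISK_LINE = re.compile(r"^\s*(\d+)\.\s+\d+\s+(\S+)\s*$")
# Summary line: "located N votedisk(s)."
_COUNT_LINE = re.compile(r"located\s+(\d+)\s+votedisk\(s\)")


def parse_votedisk_listing(text: str) -> List[Tuple[int, str]]:
    """Return (index, path) pairs and check them against the reported count."""
    disks: List[Tuple[int, str]] = []
    reported = None
    for line in text.splitlines():
        disk = _DISK_LINE.match(line)
        if disk:
            disks.append((int(disk.group(1)), disk.group(2)))
            continue
        count = _COUNT_LINE.search(line)
        if count:
            reported = int(count.group(1))
    if reported is not None and reported != len(disks):
        raise ValueError(
            f"listing reports {reported} voting disk(s) but {len(disks)} were parsed"
        )
    return disks


# Sample text copied from the layout of the transcript in this section.
SAMPLE = r"""
 0.     0    \\.\votedsk1
 1.     0    \\.\votedsk2
 2.     0    \\.\votedsk3

located 3 votedisk(s).
"""

if __name__ == "__main__":
    for index, path in parse_votedisk_listing(SAMPLE):
        print(index, path)
```

In practice the text would come from running the crsctl command on each node and capturing its output; the count check guards against a truncated or partly parsed listing.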

Next, I thought of publishing this document as it stands at the moment, and I will create additional articles on Performance Tuning and Failover strategies. For further details please visit the Resource Section.

References

http://www.oracle.com/technology/products/database/clustering/index.html

For Educational Purpose Only

The information contained in this document represents my personal view on the issues discussed as of the date of publication, and I cannot guarantee the accuracy of any information presented after the date of publication. This document is for informational purposes only. I MAKE NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS DOCUMENT.

For further information, please contact Support@OracleFusions.com

Copyright © 2007 www.OracleFusions.com All rights reserved.
