Wednesday, January 27, 2016

CRS-4535: Cannot communicate with Cluster Ready Services

If checking the status of CRS (and other RAC resources) returns this error, CRS is not running on the node.
[grid@dbnode1 ]$ crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.

Log in as the root user and start CRS:
[root@dbnode1 ~]# /u01/oracle/11.2.0.4/grid/bin/crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

Sometimes CRS still does not come up after this command, as you can see below:
[root@dbnode1 ~]# /u01/oracle/11.2.0.4/grid/bin/crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
In this case, start the cluster as follows. Oracle High Availability Services must be online (CRS-4638 in the output above) for this command to succeed.

[root@dbnode1 ~]# /u01/oracle/11.2.0.4/grid/bin/crsctl start cluster -n dbnode1
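
The decision described above can be sketched as a small shell helper. This is only an illustration, not an Oracle-supplied tool: the function name next_crs_action and the three result words are hypothetical, and it only looks for the message codes shown in this post (CRS-4638 and CRS-4535) in whatever text it is given.

```shell
# Hypothetical helper: read `crsctl check crs` output on stdin and print
# the next step suggested in this post.
#   start_crs      - OHAS is not reported online (no CRS-4638 line)
#   start_cluster  - OHAS is online but CRS cannot be reached (CRS-4535)
#   ok             - neither condition was found in the text
next_crs_action() {
    output=$(cat)

    # Without CRS-4638, High Availability Services is not known to be up,
    # so `crsctl start cluster` would not succeed yet.
    case "$output" in
        *CRS-4638*) ;;
        *) echo "start_crs"; return 0 ;;
    esac

    # OHAS is online; check whether CRS itself is unreachable.
    case "$output" in
        *CRS-4535*) echo "start_cluster" ;;
        *) echo "ok" ;;
    esac
}
```

For example, running /u01/oracle/11.2.0.4/grid/bin/crsctl check crs | next_crs_action against the output shown above would print start_cluster. The helper only suggests a step; the actual start commands still have to be run as root.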

Thursday, January 14, 2016

OEM Database Control Configuration gets Stuck

I was trying to configure OEM Database Control (11.2.0.4) on a Windows-based system using a command similar to the following:
emca -config dbcontrol db -repos recreate

But the command always hung for hours until I stopped the configuration. The last line displayed at the command prompt was “INFO: Uploaded configuration data successfully”. The emca log ended with lines similar to the following:
CONFIG: Copying file D:\ORACLE\11204\rdbms\install\install.excl.emca.util.tmp to D:\ORACLE\11204\rdbms\install\install.excl
Oct 30, 2015 4:56:11 PM oracle.sysman.emcp.util.FileUtil addLine
CONFIG: File D:\ORACLE\11204\rdbms\install\install.excl is successfully updated
Oct 30, 2015 4:56:11 PM oracle.sysman.emcp.util.PlatformInterface serviceCommand
CONFIG: cmdType: 2
Oct 30, 2015 4:56:11 PM oracle.sysman.emcp.util.PlatformInterface isPre112Home
CONFIG: oracleHome: D:\ORACLE\11204 isPre112Home: false
Oct 30, 2015 4:56:11 PM oracle.sysman.emcp.util.PlatformInterface serviceCommand
CONFIG: Service does not exist

After several weeks of work (which also involved Oracle Support), I realized that the Windows OS user name I was using to log in to the system started with a number (the name was similar to 0148-DBADMIN). I have seen the same kind of issue with Windows OS users whose names start with a special character ($, #, @, etc.).
I tried a different OS user (of course, the user must be a member of the Administrators and ORA_DBA OS groups), and this time I was able to configure OEM successfully.


Moral of the story: the OS user used to configure Oracle/OEM should not have a name starting with a number or a special character.
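
A quick pre-check along these lines can be sketched in shell (runnable from any POSIX shell, for example Git Bash on Windows). The function name starts_with_letter is hypothetical, and requiring a leading ASCII letter is my own conservative assumption: the problem was only observed with names starting with a digit or with characters such as $, # and @.

```shell
# Hypothetical pre-check: succeed only when the given OS user name starts
# with an ASCII letter. Names starting with a digit or a special character
# (the cases described in this post) make it fail.
starts_with_letter() {
    case "$1" in
        [A-Za-z]*) return 0 ;;
        *) return 1 ;;
    esac
}
```

It could be called with the current user name before running emca, for instance starts_with_letter "$USERNAME" || echo "consider a different OS user".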

Monday, January 4, 2016

ORA-00221: error on write to controlfile

The cause of this error (and probably of an instance crash) is that some other process has locked the controlfile, so the Oracle process cannot acquire a lock on the controlfile before writing to it. The most probable culprit is a third-party backup solution that copies Oracle-related files during its backup run and holds a lock while copying.
ORA-00221: error on write to controlfile
ORA-00206: error in writing (block 3, # blocks 1) of controlfile
ORA-00202: controlfile: 'C:\ORACLE\ORADATA\MYDB\CONTROL02.CTL'
ORA-27072: skgfdisp: I/O error
OSD-04008: WriteFile() failure, unable to write to file
O/S-Error: (OS 33) The process cannot access the file because another process has locked a portion of the file.


To avoid this, exclude Oracle files (control files, datafiles, redo log files) from backups taken by any third-party tool. Back up these files with an Oracle-recommended tool, such as RMAN.

The same can happen if datafiles are locked by a backup tool. For datafiles, you may see ORA-01186, ORA-01122, ORA-01110, and ORA-01208.
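
To tell a lock conflict apart from other controlfile write failures, one option is to check that ORA-00221 appears together with the Windows "OS 33" line from the error stack above. The sketch below does only that text match; the function name controlfile_lock_conflict is hypothetical, and it assumes the relevant alert log excerpt is passed on stdin.

```shell
# Hypothetical check: succeed when the text on stdin contains both
# ORA-00221 and the Windows "(OS 33)" lock message shown in this post.
controlfile_lock_conflict() {
    log=$(cat)

    # No controlfile write error at all: nothing to diagnose.
    case "$log" in
        *ORA-00221*) ;;
        *) return 1 ;;
    esac

    # OS error 33 is the "another process has locked a portion of the
    # file" message; without it the write failure may have another cause.
    case "$log" in
        *"(OS 33)"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A match suggests looking at whatever else touches the Oracle files (typically a backup job) at the time of the error. It does not identify the process holding the lock.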