- StormOS (desktop distribution based on Nexenta Core Platform)
- Less known Solaris features: BART
- Solaris Automated File Integrity Checking: bartlog
- YSlow! to YFast! in 45 minutes
- How to Fix OpenSolaris Keyboard Irregularities with Virtual Box
- OpenSolaris Immutable Service Containers (Updates)
- Automatic and event-based ZFS snapshots
- ZFS auto-backup and auto-snapshot tool
- Oracle SQL Developer Data Modeler
Today we upgraded our e-learning platform CLIX to the latest version and ran into some problems while upgrading the database in our Oracle RAC environment. In an Oracle RAC setup with ASM, archive logs are primarily backed up to the Flash Recovery Area (+FRA). For redundancy we additionally save all archive logs outside of ASM on local disks. You can set this up with the following commands:
sqlplus SYS/password@db-instance AS SYSDBA
ALTER SYSTEM SET log_archive_dest_2='location="/path/on/disk"' SCOPE=BOTH SID='*';
Setting SCOPE to BOTH applies the change in memory and in the server parameter file. Setting SID to '*' applies the change to all instances in a RAC environment; it has no effect in a single-instance database.
In our case the second destination on local disk did not have enough space for all the archive logs created during the database upgrade. Our backup partition ran 100% full and the database upgrade stalled. To get the upgrade going again we mistakenly decided to delete some archive logs manually on disk. The very bad outcome was that our RMAN backup no longer worked afterwards. In the log files I found these lines:
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup plus archivelog command at 01/29/2009 05:30:18
RMAN-06726: could not locate archivelog /oraarch/CLIX/1_12972_630768524.arc
[...]
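In hindsight, a simple free-space check on the local archive destination before starting the upgrade would have warned us in time. The following is only a minimal sketch: the function names, the `ARCH_DEST` and `MIN_FREE_KB` variables and the 1 GiB threshold are made up for this example, and `/tmp` is just a stand-in for the real destination path.

```shell
#!/bin/bash
# Sketch: warn when the local archive-log destination is running out of space.
# ARCH_DEST and MIN_FREE_KB are hypothetical names; adjust them to your setup.

# Print the available space (in KiB) of the filesystem that holds the given path.
free_kb() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Succeed if at least $2 KiB are available below path $1.
has_free_space() {
    [ "$(free_kb "$1")" -ge "$2" ] 2>/dev/null
}

ARCH_DEST="${ARCH_DEST:-/tmp}"          # stand-in for /path/on/disk
MIN_FREE_KB="${MIN_FREE_KB:-1048576}"   # 1 GiB, an arbitrary threshold

if has_free_space "$ARCH_DEST" "$MIN_FREE_KB"; then
    echo "OK: enough free space in $ARCH_DEST"
else
    echo "WARNING: less than $MIN_FREE_KB KiB free in $ARCH_DEST"
fi
```

Run from cron during a long upgrade, a check like this leaves time to move or back up archive logs properly instead of deleting them by hand.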
The first step toward solving the problem was to crosscheck and delete all expired archive logs with the following RMAN commands:
rman target SYS/password nocatalog
crosscheck archivelog all;
delete expired archivelog all;
The backup now seemed to start, but it still aborted with this error message:
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup plus archivelog command at 01/29/2009 12:53:25
RMAN-03009: failure of backup command on c1 channel at 01/29/2009 12:52:55
ORA-19809: limit exceeded for recovery files
ORA-19804: cannot reclaim 127926272 bytes disk space from 5368709120 limit
[...]
After wasting some time searching Google for a solution, I solved the problem by increasing the DB_RECOVERY_FILE_DEST_SIZE parameter with this SQL statement:
ALTER SYSTEM SET DB_RECOVERY_FILE_DEST_SIZE = 10g SCOPE=BOTH SID='*';
Our FRA has about 15G capacity and was just 50% full, so I'm not sure why the RMAN backup works after increasing this value. I will update this article when I get more background information on this error. Please feel free to comment on this post!
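One possible explanation, offered as a reading of the error message rather than a confirmed diagnosis: DB_RECOVERY_FILE_DEST_SIZE is a quota that Oracle enforces on recovery files, independent of the physical capacity of the disk group behind +FRA. Converting the two byte values from the ORA-19804 message shows that the limit in effect was 5 GiB, well below the 15G of the disk group:

```shell
#!/bin/bash
# Convert the byte values from the ORA-19804 message into binary units.
limit_bytes=5368709120     # "limit" from the error message
needed_bytes=127926272     # space RMAN could not reclaim

limit_gib=$(( limit_bytes / 1024 / 1024 / 1024 ))
needed_mib=$(( needed_bytes / 1024 / 1024 ))

echo "limit:  ${limit_gib} GiB"    # prints "limit:  5 GiB"
echo "needed: ${needed_mib} MiB"   # prints "needed: 122 MiB"
```

If that reading is right, the backup failed because the 5 GiB quota was exhausted, not the disk group itself. The view V$RECOVERY_FILE_DEST (columns SPACE_LIMIT, SPACE_USED and SPACE_RECLAIMABLE) should show how much of the quota is in use.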

We decided to upgrade our existing two-node Oracle RAC installation to the latest SLES10 SP2 version. At that time we had a stable SLES10 64-bit production environment. The first problem was registering both nodes with the Novell Customer Center in YaST2, but that is another issue and I will post the solution in a separate article.
The starting point was a SLES10 SP1 64-bit installation. First we had to install all available patches and software updates. Then we did a fast and easy upgrade to SP2 with the following script:
#!/bin/bash
set -x
rug in -y -t patch slesp1-libzypp   # For SLED use sledp1-libzypp
sleep 40 && rug ping -a
rug in -y -t patch move-to-sles10-sp2   # For SLED use move-to-sled10-sp2
rug refresh && rug ping -a
rug up -y -t patch
sleep 240 && rug ping -a
rug up -y --agree-to-third-party-licences -t patch <<EOF
n # For an automatic reboot after the migration, you may change 'n' to 'y'
EOF
The whole upgrade process was really smooth, but after booting the new kernel the CRS daemon of the Oracle cluster did not start. The first problem was found very quickly: the SLES10 upgrade silently overwrites the file /etc/udev/rules.d/50-udev-default.rules. This file contains a very important entry if you are running Oracle RAC with ASM on raw devices. So we modified the config file and replaced the line
KERNEL=="raw[0-9]*", SUBSYSTEM=="raw", NAME="raw/%k", GROUP="disk"
with the following one
KERNEL=="raw*", SUBSYSTEM=="raw", NAME="raw/%k", OWNER="oracle", GROUP="dba", MODE="660"
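Since the upgrade replaces this file without any notice, it may be worth checking for the rule after every service pack or patch run. Here is a minimal sketch of such a check; the function name `raw_rule_ok` and the `RULES_FILE` variable are made up for this example, and the pattern only looks for the owner and group settings shown above.

```shell
#!/bin/bash
# Sketch: check whether a udev rules file still hands raw devices to oracle:dba.
# raw_rule_ok and RULES_FILE are hypothetical names for this example.

# Succeed if file $1 has a non-comment raw-device rule with OWNER="oracle" and GROUP="dba".
raw_rule_ok() {
    grep -v '^[[:space:]]*#' "$1" 2>/dev/null \
        | grep 'KERNEL=="raw' \
        | grep 'OWNER="oracle"' \
        | grep -q 'GROUP="dba"'
}

RULES_FILE="${RULES_FILE:-/etc/udev/rules.d/50-udev-default.rules}"

if raw_rule_ok "$RULES_FILE"; then
    echo "raw device rule for oracle:dba found in $RULES_FILE"
else
    echo "raw device rule for oracle:dba missing in $RULES_FILE"
fi
```

Keeping a backup copy of the modified rules file before the upgrade makes it easy to restore the entry afterwards.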
After restarting the system all raw partitions were accessible by the user oracle, but the CRS daemon still did not come up and the process list showed:
root 6929 1 0 13:56 ? 00:00:00 /bin/sh /etc/init.d/init.cssd fatal
root 6960 6928 0 13:56 ? 00:00:00 /bin/sh /etc/init.d/init.cssd startcheck
root 6963 6929 0 13:56 ? 00:00:00 /bin/sh /etc/init.d/init.cssd startcheck
root 7064 6935 0 13:56 ? 00:00:00 /bin/sh /etc/init.d/init.cssd startcheck
OK, I thought, “time to take a look into the logs”:
bash$# tail /var/log/messages
Nov 20 09:20:37 orac1 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.4822.
Nov 20 09:20:37 orac1 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.4746.
Nov 20 09:20:37 orac1 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.4871.
This sounded very encouraging, but a closer look at these files was frustrating, because they were all empty (zero bytes). My next idea was to check the error message when starting the CRS daemon manually as root with the command:
bash$# sh -x /etc/init.d/init.cssd startcheck
+ /etc/init.d/init.cssd runcheck
+ STATUS=0
+ '[' 0 '!=' 0 ']'
+ '[' -f /opt/oracle/10.2/crs/lib/INVALID_DIRECTORY ']'
+ '[' '!' -r /opt/oracle/10.2/crs/bin/crsctl ']'
+ '[' '' = CSS ']'
+ /bin/su -l oracle -c '/opt/oracle/10.2/crs/bin/crsctl check boot > /tmp/crsctl.18545'
/bin/sh: /opt/oracle/10.2/crs/bin/crsctl: Keine Berechtigung
+ RC=126
+ '[' 126 '!=' 0 ']'
+ /bin/logger -puser.err Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.18545.
+ /bin/sleep 60
Line #9 (“Keine Berechtigung”, German for “Permission denied”) gives an important hint, so I checked the permissions of the file involved:
bash$# ls -la /opt/oracle/10.2/crs/bin/crsctl
-r-xr-x--x 1 root dba 2002 2007-08-08 15:37 /opt/oracle/10.2/crs/bin/crsctl
This looked good, and after wasting some hours I decided to check the group membership of the oracle user:
### after SLES10 SP2 installation ###
bash$# id oracle
uid=102(oracle) gid=103(oinstall) groups=103(oinstall)
### before SLES10 SP2 installation ###
uid=102(oracle) gid=104(dba) Gruppen=104(dba),6(disk),103(oinstall)
I could not believe my eyes, but the SLES10 SP2 upgrade procedure removed the primary group dba from the oracle user. Therefore the user is not allowed to start the CRS daemon. The solution to this problem was quite simple: just execute the following command as superuser and reboot the machine. The CRS daemon should start automatically and all databases should come up without any problems:
bash$# usermod -g dba -G oinstall,disk oracle
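To catch this kind of change right after an upgrade, one option is to save the output of `id oracle` beforehand and compare the primary group afterwards. The sketch below only demonstrates the comparison, using the two `id` lines shown above; the function name `primary_group` is made up for this example.

```shell
#!/bin/bash
# Sketch: extract the primary group name from one line of `id` output,
# e.g. "uid=102(oracle) gid=104(dba) groups=104(dba),6(disk),103(oinstall)".
# primary_group is a hypothetical helper name.
primary_group() {
    printf '%s\n' "$1" | sed -n 's/.*gid=[0-9]*(\([^)]*\)).*/\1/p'
}

# The two lines from this article; in practice they would come from
# `id oracle`, saved before the upgrade and run again after it.
before='uid=102(oracle) gid=104(dba) Gruppen=104(dba),6(disk),103(oinstall)'
after='uid=102(oracle) gid=103(oinstall) groups=103(oinstall)'

if [ "$(primary_group "$before")" != "$(primary_group "$after")" ]; then
    echo "primary group changed: $(primary_group "$before") -> $(primary_group "$after")"
fi
```

Because the helper only looks at the `gid=` field, it does not matter whether the rest of the line is localized (`Gruppen=` versus `groups=`).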
Conclusion:
The upgrade procedure to SLES10 SP2 was really straightforward, but there are some pitfalls which are not documented by Novell and which are very hard to track down. My advice is to do a SLES10 SP2 upgrade on a non-production Oracle RAC environment first.
We have to use SLES10 because we don’t want to lose Oracle support and the support of a third-party application which is certified for Oracle on SLES10. But next time I will propose to my boss that we buy Red Hat Enterprise Linux or use an open-source enterprise Linux distribution like CentOS. The package management of Novell/SUSE is very frustrating if you are a confirmed Ubuntu user. Please feel free to comment on this post or contact me if you have any questions.
References:
- Oracle Meta-Link NOTE:329450.1
- Novell Support: How to update to SLES/SLED 10 SP2