Fixing Oracle RAC Node Problems With Addnode: DB Binaries

From time to time, a DBA faces an Oracle RAC node that needs to be fixed, usually after applying a nasty patch.

Currently, my first approach is to remove the node and then add it back. Other ways of fixing the problem exist, such as opening an SR with Oracle, but they usually take longer.

Even though it sounds complicated, it's not. I will show you how to recover a node from several disaster scenarios.

My configuration
Oracle RAC with 2 nodes: ol8-19-rac1 and ol8-19-rac2
CDB: cdbrac
PDB: pdb1

Grid Version
34318175;TOMCAT RELEASE UPDATE 19.0.0.0.0 (34318175)
34160635;OCW RELEASE UPDATE 19.16.0.0.0 (34160635)
34139601;ACFS RELEASE UPDATE 19.16.0.0.0 (34139601)
34133642;Database Release Update : 19.16.0.0.220719 (34133642)
33575402;DBWLM RELEASE UPDATE 19.0.0.0.0 (33575402)

DB Version
34086870;OJVM RELEASE UPDATE: 19.16.0.0.220719 (34086870)
34160635;OCW RELEASE UPDATE 19.16.0.0.0 (34160635)
34133642;Database Release Update : 19.16.0.0.220719 (34133642)

When you see <dbenv>, load the DB HOME variables.
When you see <gridenv>, load the GRID HOME variables.

When you see <bnode>, execute the command on the broken node.
When you see <anode>, execute the command on any other working node.

I am using an installation where both DB and GRID are installed under the oracle user, and I set the variables to access each environment. The same procedure works even if the installation uses two different users (usually oracle and grid).
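
As an illustration of what "loading the variables" can look like, here is a minimal sketch with two helper functions named after the markers used in this article. The DB home is the path used by root.sh in step 3; the GRID home path and the function names themselves are assumptions for the example, so adjust them to your installation. ORACLE_SID is deliberately left out, since it differs per node.

```shell
# Hypothetical helpers for switching between the DB and GRID environments.
# DB_HOME is the path used by root.sh later in this article.
# GRID_HOME is an assumed example path, NOT taken from this cluster.
DB_HOME=/u01/app/oracle/product/19.0.0/dbhome_1
GRID_HOME=/u01/app/19.0.0/grid

# Remember the PATH as it was before any Oracle home was added,
# so switching environments never stacks several homes in PATH.
BASE_PATH=${BASE_PATH:-$PATH}

# Point ORACLE_HOME at the given home and put its bin directory first in PATH.
set_oracle_home() {
    export ORACLE_HOME=$1
    export PATH=$ORACLE_HOME/bin:$BASE_PATH
}

dbenv()   { set_oracle_home "$DB_HOME"; }
gridenv() { set_oracle_home "$GRID_HOME"; }
```

With these sourced in the shell, running `dbenv` before a command marked <dbenv> (or `gridenv` before one marked <gridenv>) selects the matching home.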

Always validate any procedure before you try it in a production environment.

Scenario A - DB Binaries Corrupted On ol8-19-rac2 After Patch Apply

In this scenario, the grid binaries were not affected, so we don't need to replace them. The database is already down since the binaries are corrupted.



1. Back up the $ORACLE_HOME/network/admin
=========================================

<bnode dbenv>
[oracle@ol8-19-rac2 admin]$ mkdir -p /tmp/oracle; tar cvf /tmp/oracle/db_netadm.tar -C $ORACLE_HOME/network/admin .
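
Since the next step wipes the home, it may be worth confirming that the archive is readable before moving on. The sketch below wraps the same tar call in a small function that re-reads the archive and refuses to report success if it contains no files. The function name `backup_dir` is hypothetical, not an Oracle tool.

```shell
# Archive a directory and confirm the archive is readable and non-empty.
# Usage: backup_dir <source_dir> <archive_path>
backup_dir() {
    local src=$1
    local archive=$2
    mkdir -p "$(dirname "$archive")" || return 1
    tar cf "$archive" -C "$src" . || return 1

    # Re-read the archive; a truncated or unreadable file fails here.
    local listing
    listing=$(tar tf "$archive") || return 1

    # Count entries that are not directories (directory entries end in "/").
    local files
    files=$(printf '%s\n' "$listing" | grep -cv '/$' || true)
    if [ "$files" -eq 0 ]; then
        echo "no files archived from $src" >&2
        return 1
    fi
    echo "archived $files file(s) from $src into $archive"
}
```

For this step the call would be `backup_dir "$ORACLE_HOME/network/admin" /tmp/oracle/db_netadm.tar`.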


2. Deinstall the DB binaries
============================

<bnode dbenv>
[oracle@ol8-19-rac2 admin]$ $ORACLE_HOME/deinstall/deinstall -local

Confirm the database name, and choose yes to proceed with the instance deletion.

It's not unusual to face file deletion error messages; ignore them.


3. Add the node back
====================

If we were adding a brand-new node, we would first run the cluster verification checks to confirm that everything is OK. Since this node was already part of the cluster, let's skip them.

Because of that, you may see the message "[WARNING] [INS-13014] Target environment does not meet some optional requirements."; you can ignore it.

From any other node, add the ex-broken node back. This step takes a while, as it copies the files from one node to another.

<anode dbenv>
[oracle@ol8-19-rac1 ~]$ $ORACLE_HOME/addnode/addnode.sh -silent "CLUSTER_NEW_NODES={ol8-19-rac2}"

As root, execute the following on the ex-broken node.
<bnode>
[root@ol8-19-rac2 scripts]# /u01/app/oracle/product/19.0.0/dbhome_1/root.sh
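
One way to convince yourself that the copied home matches the source node is to compare the patch lists of both homes. The sketch below assumes you saved the output of `$ORACLE_HOME/OPatch/opatch lspatches` from each node into a text file; the function names and file locations are hypothetical. It compares patch numbers only, ignoring order and the trailing status line.

```shell
# Extract the patch numbers from saved `opatch lspatches` output.
# Only lines of the form "<number>;<description>" are considered.
patch_ids() {
    grep -E '^[0-9]+;' "$1" | cut -d';' -f1 | sort -u
}

# Compare two saved patch lists.
# Usage: compare_patches <file_from_node_a> <file_from_node_b>
# Returns 0 if both lists match, 1 if they differ, 2 if a list is empty.
compare_patches() {
    local ids_a
    local ids_b
    ids_a=$(patch_ids "$1")
    ids_b=$(patch_ids "$2")
    if [ -z "$ids_a" ] || [ -z "$ids_b" ]; then
        echo "no patch lines found in one of the files" >&2
        return 2
    fi
    if [ "$ids_a" = "$ids_b" ]; then
        echo "patch lists match"
    else
        echo "patch lists differ"
        return 1
    fi
}
```

For example, after copying both files to one node: `compare_patches /tmp/oracle/patches_rac1.txt /tmp/oracle/patches_rac2.txt`.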


4. Add the instance back
========================

From any other node, as the oracle user, execute the following for each existing database to add the node's instance back (it recreates the UNDO tablespace, redo logs, etc.).

<anode dbenv>
[oracle@ol8-19-rac1 ~]$ dbca -silent -ignorePrereqFailure -addInstance -nodeName ol8-19-rac2 -gdbName cdbrac -instanceName cdbrac2 -sysDBAUserName sys -sysDBAPassword SysPassword1
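
If the cluster hosts several databases, the same call has to be repeated for each one. The following dry-run sketch only prints the commands so they can be reviewed before anything is executed. It uses the dbca options shown above and nothing else; the function names are hypothetical, the password is left as a placeholder on purpose, and the instance naming (database name plus node number, as in cdbrac2) is an assumption that may not hold in your environment.

```shell
# Print the dbca call for one database without running it (dry run).
# Arguments: <global db name> <node to add> <instance name>
print_addinstance_cmd() {
    printf 'dbca -silent -ignorePrereqFailure -addInstance -nodeName %s -gdbName %s -instanceName %s -sysDBAUserName sys -sysDBAPassword <password>\n' "$2" "$1" "$3"
}

# Read database names from stdin, one per line, and print one command each.
# Assumes instances are named <db name><node number>.
# Arguments: <node to add> <node number>
print_all_addinstance_cmds() {
    local node=$1
    local node_number=$2
    local db
    while IFS= read -r db; do
        # Skip blank lines in the input.
        [ -n "$db" ] || continue
        print_addinstance_cmd "$db" "$node" "${db}${node_number}"
    done
}
```

The list of names can come from `srvctl config database`, which prints the configured databases one per line: `srvctl config database | print_all_addinstance_cmds ol8-19-rac2 2`.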


5. If needed, restore the $ORACLE_HOME/network/admin on the ex-broken node
==========================================================================

<bnode dbenv>
[oracle@ol8-19-rac2 admin]$ tar xfv /tmp/oracle/db_netadm.tar -C $ORACLE_HOME/network/admin

[oracle@ol8-19-rac2 ~]$ srvctl status database -db cdbrac
Instance cdbrac1 is running on node ol8-19-rac1
Instance cdbrac2 is running on node ol8-19-rac2
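
If you want to script this final check, a small filter over the srvctl output is enough. The sketch below reads the status text from stdin and fails when the output is empty or when any line reports an instance as "is not running". The function name is hypothetical, and the check relies only on the message wording shown above.

```shell
# Read `srvctl status database` output on stdin.
# Returns 0 only if there is output and no instance is reported as not running.
all_instances_running() {
    local status
    status=$(cat)
    # Echo the status so the operator still sees it.
    printf '%s\n' "$status"
    if [ -z "$status" ]; then
        return 1
    fi
    if printf '%s\n' "$status" | grep -q 'is not running'; then
        return 1
    fi
    return 0
}
```

Usage: `srvctl status database -db cdbrac | all_instances_running && echo "all instances up"`.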

Voilà, that's it. Your database should now be back online on the ex-broken node.

In the next article, I'll show how to recover from corrupted GRID binaries.
