I never thought that mounting a shared filesystem could cause a server reboot!

One node in a three-node RAC database rebooted for no apparent reason, and I was asked to find the possible cause.

I checked the servers, but it was hard to confirm the root cause: no system information had been captured before the reboot, so I had to rely entirely on the system and database log and trace files.

In /var/log/messages on the affected server, I found the following errors:

So the server was restarted at 22:29:58. Before these errors, I also found other errors:

The server had a large shared folder from a Windows server mounted for backup purposes.
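To see which nodes have such a share mounted, the mount table can be checked. Below is a minimal sketch, assuming a Linux host where /proc/mounts uses the usual layout (source, mount point, filesystem type, options); the function name `cifs_mounts` is my own and only for illustration.

```python
def cifs_mounts(mounts_text):
    """Return (source, mount_point) pairs for CIFS entries in /proc/mounts-style text."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Assumed field order: source, mount point, filesystem type, options, ...
        if len(fields) >= 3 and fields[2] == "cifs":
            found.append((fields[0], fields[1]))
    return found
```

On each node, passing the contents of /proc/mounts to this function (for example `cifs_mounts(open("/proc/mounts").read())`) lists any CIFS shares that are currently mounted.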

On the other two nodes, I found related information:

I found a possible cause in the following two notes:

CIFS Mount Soft Lockup While Transferring Large Files (Doc ID 1621616.1)

Hung CIFS processes in D state and stuck in "cifs_reopen_file" Also the logs show "kernel: CIFS VFS: No writable handles for inode" before server getting rebooted (Doc ID 2318108.1)

Yes, a CIFS share can sometimes even cause a server reboot.
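The symptoms named in the second note's title (a "CIFS VFS" kernel message and processes in D state) can be looked for with a short script. This is only a sketch under my own assumptions: the helper names are hypothetical, the log is plain text such as /var/log/messages, and process state is read from /proc/<pid>/stat, where the state letter follows the command name in parentheses.

```python
def cifs_kernel_messages(log_lines):
    """Return the log lines that contain a 'CIFS VFS' kernel message."""
    return [line.rstrip("\n") for line in log_lines if "CIFS VFS" in line]


def process_state(stat_text):
    """Extract (command name, state letter) from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so locate the last ')' instead of splitting
    # the whole line on whitespace.
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    name = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return name, state
```

A state of `D` means uninterruptible sleep, which is the state the note describes for the hung CIFS processes. This check only helps while the server is still up, so it is more useful for monitoring the remaining nodes than for analysing the reboot afterwards.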

According to these notes, the following measures can be taken:

I did not find any useful information in the database alert log:

From the cluster trace files, I could only tell that the system had been behaving abnormally since at least 22:24 and that the node was eventually evicted.

My guess is that the system stopped responding at around 22:24, but I could not confirm this.
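When rebuilding a timeline like this from several files, it helps to pull out only the lines between the first sign of trouble (22:24) and the restart (22:29:58). Here is a minimal sketch, assuming the traditional syslog layout "Mon DD HH:MM:SS host ..." so that the time is the third whitespace-separated field; the function name is hypothetical, the date is ignored, and the window must lie within a single day.

```python
from datetime import datetime, time


def lines_in_window(log_lines, start, end):
    """Yield lines whose syslog-style time of day falls within [start, end]."""
    for line in log_lines:
        fields = line.split()
        # Assumed layout: "Mon DD HH:MM:SS host ...", so the time is field 3.
        if len(fields) < 3:
            continue
        try:
            stamp = datetime.strptime(fields[2], "%H:%M:%S").time()
        except ValueError:
            continue  # no parsable timestamp in the expected position
        if start <= stamp <= end:
            yield line.rstrip("\n")
```

For example, `lines_in_window(open("/var/log/messages"), time(22, 24), time(22, 29, 58))` would yield only the entries from that window. Cluster trace files often use a different timestamp format, so the parsing line would need to be adapted for them.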


