Tonight I installed an Oracle 12c database on SLES 12 SP3 to test the dbss script on a SLES system. It worked well until I rebooted the server.
After the reboot, the database was not open and, worse, I could not even run the sqlplus command:
oracle@linux-ryjt:~> sqlplus "/as sysdba"
SQL*Plus: Release 22.214.171.124.0 Production on Sun Jun 10 18:31:29 2018
Copyright (c) 1982, 2014, Oracle. All rights reserved.
ORA-12547: TNS:lost contact
Enter user-name: ^C
It was very strange, as everything had worked well before the reboot.
I had hit this issue before on Oracle 11g, so I tried the same solution again, but with no luck, so I had to find the root cause.
I traced the sqlplus command, searched and read many notes on the Oracle support portal, and learned that I should check the following:
1. The permissions of $ORACLE_HOME/bin/oracle
They should be 6751, and they were correct.
2. Relink all binaries if needed
Done, but it did not help.
3. Check the resource limits
Checked, and they were correct.
4. Unset the EXTSHM environment variable
Done, but it did not help.
5. nosuid may be set in /etc/fstab
Checked, and no such option was set.
6. Kernel parameters
These were set by a script that had always worked.
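Two of the checks above (items 1 and 5) are easy to script. The following is only a minimal sketch, not what I actually ran: the helper names file_mode and has_nosuid are made up for illustration, and it assumes GNU stat as found on Linux.

```shell
#!/bin/sh
# Hypothetical helpers for checks 1 and 5 in the list above.

# Print the octal permission bits of a file (GNU stat syntax).
file_mode() {
    stat -c '%a' "$1"
}

# Succeed if any non-comment line of an fstab-style file
# contains the nosuid mount option.
has_nosuid() {
    grep -v '^[[:space:]]*#' "$1" | grep -qw 'nosuid'
}
```

For example, file_mode "$ORACLE_HOME/bin/oracle" should print 6751 on a healthy installation, and has_nosuid /etc/fstab should return a non-zero status.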
In the strace output for sqlplus, I found the following error:
12916 times(NULL) = 1718063887
12916 write(4, "ORA-00600: internal error code, "..., 117) = 117
12916 write(4, "\n", 1)
Such an error may be caused by wrong kernel parameters, so I rechecked the generated sysctl config file under /etc/sysctl.d.
linux-ryjt:~ # cat /etc/sysctl.d/oracle.conf
#Added for Oracle installation
fs.aio-max-nr = 4194304
fs.file-max = 6815744
kernel.sem = 250 32000 100 128
kernel.shmmni = 4096
kernel.shmall = 772618
kernel.shmmax = 3090470
kernel.panic_on_oops = 1
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 2097152
net.ipv4.tcp_wmem = 262144 262144 6291456
net.ipv4.tcp_rmem = 4194304 4194304 4194304
The Oracle installation guide mentions that on SUSE systems you may need to set the kernel parameter vm.hugetlb_shm_group, so I added that line and ran 'sysctl -p' to apply it, but when I ran 'sysctl -a|grep hugetlb' I found the parameter had not changed.
Then I recalled that earlier today I had modified the script oraprep.sh so that it would generate a sysctl config file under /etc/sysctl.d, and had added two lines to compute more accurate values for kernel.shmall and kernel.shmmax. After reading the manual page of the sysctl command, I understood why I had this issue.
First, the kernel.shmmax value was wrong: I should have multiplied it by 1024 to get bytes.
Second, 'sysctl -p' without a file argument only reads the file /etc/sysctl.conf and does not take the config files under /etc/sysctl.d into account. In this situation I should run 'sysctl --system'.
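To make the first point concrete, here is the unit conversion as a small sketch. The function name kb_to_bytes is hypothetical and this is not the actual code in oraprep.sh. It only shows that the value written to the config file needs to be multiplied by 1024.

```shell
#!/bin/sh
# Hypothetical helper: convert a size given in kB into bytes,
# which is the unit kernel.shmmax expects.
kb_to_bytes() {
    echo $(( $1 * 1024 ))
}

# The value that ended up in /etc/sysctl.d/oracle.conf was 3090470.
kb_to_bytes 3090470   # prints 3164641280
```

With the original value, the kernel was told that the largest shared memory segment may be only about 3 MB, far too small for the instance.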
So when I installed and created the database, the wrong parameter had no effect on the system, because at that time the script still used 'sysctl -p', and everything ran well.
But after the reboot, the wrong parameter was applied, so the database could not be started, and this also caused the ORA-12547 error.
I corrected both mistakes, and after that I could start the database.
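A check like the following could have caught the mismatch between the config file and the running kernel much earlier. It is only a sketch under my own assumptions: conf_value and check_key are hypothetical helper names, and the comparison is a plain string match, so keys with several values (such as kernel.sem) may be reported as different just because of whitespace.

```shell
#!/bin/sh
# Print the value assigned to a key in a sysctl-style config file.
# If the key appears more than once, the last assignment is printed.
conf_value() {
    # Escape the dots so they match literally in the sed pattern.
    escaped=$(printf '%s' "$1" | sed 's/\./\\./g')
    sed -n "s/^[[:space:]]*${escaped}[[:space:]]*=[[:space:]]*//p" "$2" | tail -n 1
}

# Compare the configured value of a key with the value the kernel uses now.
check_key() {
    want=$(conf_value "$1" "$2")
    have=$(sysctl -n "$1" 2>/dev/null)
    if [ "$want" = "$have" ]; then
        echo "$1: ok ($have)"
    else
        echo "$1: configured '$want', running '$have'"
    fi
}
```

After 'sysctl --system', running check_key kernel.shmmax /etc/sysctl.d/oracle.conf should report ok. Before the reboot in my case it would have shown that the file and the running kernel disagreed.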