Systemd service was terminated abnormally during reboot


To start and stop all the instances on a server, I wrote a script named dbss (database start/stop). It worked well on AIX, RHEL5/6 and SLES11, but it did not work properly as a systemd service on RHEL7/SLES12.

When the server booted, the instances started without any issue, but when I rebooted the system, the instances were terminated abnormally instead of being shut down cleanly. That may not be a big problem for general applications, but for a database it has to be fixed.

Last year I tried to fix this issue and failed. I only found that it might be related to cgroups, but I could not figure out the root cause.

I thought that one day I would learn RHEL7 from the beginning and fix it, but a year passed and I still had not dealt with it.

Last night my bro told me that when he deployed dbss on SLES12, the listener it started could not be accessed. I believed this issue was similar to the one above, so this time I had to face them directly.

I could create a systemd service that ran as the user oracle, and it worked well.
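As a rough sketch of such a service (not necessarily the exact unit used here), the helper below prints a minimal unit file that runs as oracle. The function name print_oradb_unit, the use of Oracle's dbstart and dbshut scripts, and the Type=oneshot with RemainAfterExit=yes combination are assumptions for illustration.

```shell
#!/bin/sh
# Print a minimal unit file for a service that starts the instances as "oracle".
# $1 is the Oracle home directory (the caller supplies the real path).
# Assumption: the instances are started and stopped with dbstart and dbshut.
print_oradb_unit() {
    oracle_home=$1
    cat <<EOF
[Unit]
Description=Oracle database instances (sketch)
After=network.target

[Service]
Type=oneshot
RemainAfterExit=yes
User=oracle
ExecStart=$oracle_home/bin/dbstart $oracle_home
ExecStop=$oracle_home/bin/dbshut $oracle_home

[Install]
WantedBy=multi-user.target
EOF
}
```

The output could be redirected to /etc/systemd/system/oradb.service, followed by systemctl daemon-reload. Because the unit sets User=oracle, no su call is needed inside the service.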

But when I created a systemd service with the dbss script, I got a different result:

If I added the PID file to the config file, I got the following status output:

And I found that the instance had been terminated:

I tested lots of systemd service options, but most of them failed, until I found this webpage:

SAP Instances failed stop on shutdown (PACEMAKER, SYSTEMD, SAP)

The key point was this part:

I had already found that all the processes were under user.slice when I started the systemd dbss service:

For the systemd oradb service, however, they were under system.slice:
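One way to check which slice a process belongs to is to read /proc/<pid>/cgroup. The following is a minimal sketch under that assumption; the helper names slice_from_cgroup and slice_of_pid are made up for illustration, and systemd-cgls shows the same information as a tree.

```shell
#!/bin/sh
# Read cgroup data in the format of /proc/<pid>/cgroup from stdin
# (hierarchy-ID:controllers:path) and print the first path component
# that ends in ".slice", for example user.slice or system.slice.
slice_from_cgroup() {
    awk -F: '
        {
            n = split($3, parts, "/")
            for (i = 1; i <= n; i++) {
                if (parts[i] ~ /\.slice$/) { print parts[i]; exit }
            }
        }'
}

# Print the top-level slice of a running process, given its PID.
slice_of_pid() {
    slice_from_cgroup < "/proc/$1/cgroup"
}
```

For example, slice_of_pid "$(pgrep -f ora_pmon | head -n 1)" would report the slice of the first PMON process found, assuming the usual ora_pmon_<SID> process naming.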

To start all the instances, dbss has to run as root, and it uses su to switch to other users to start the instances, so the systemd dbss service cannot run as a single dedicated user like oracle.

Following the above webpage, I modified the file /etc/pam.d/system-auth-ac:

And created the file /etc/orausers:
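A minimal sketch of this kind of change, assuming the pam_listfile approach: a rule placed directly before the pam_systemd line makes PAM skip pam_systemd for every user listed in /etc/orausers. The exact rule text, the onerr setting, and the helper names pam_skip_rule and add_pam_skip_rule are assumptions for illustration, not necessarily the lines used here.

```shell
#!/bin/sh
# Print the assumed PAM rule. If the user is listed in the file "$1",
# pam_listfile succeeds and "success=1" skips the next module in the stack.
# For any other result the rule is ignored.
pam_skip_rule() {
    printf 'session [success=1 default=ignore] pam_listfile.so item=user sense=allow file=%s onerr=fail\n' "$1"
}

# Print the PAM file "$1" with the skip rule for the user list "$2" inserted
# directly before the first session line that loads pam_systemd.so.
# The input is printed unchanged if it already references the user list.
# Nothing is modified in place; the result goes to stdout.
add_pam_skip_rule() {
    pam_file=$1
    users_file=$2
    rule=$(pam_skip_rule "$users_file")
    if grep -qF "file=$users_file" "$pam_file"; then
        cat "$pam_file"
        return 0
    fi
    awk -v rule="$rule" '
        !inserted && /^-?session/ && /pam_systemd\.so/ { print rule; inserted = 1 }
        { print }
    ' "$pam_file"
}
```

With this approach /etc/orausers holds one user name per line (for example oracle), which is the format pam_listfile expects. Running add_pam_skip_rule /etc/pam.d/system-auth-ac /etc/orausers prints the modified file, so it can be reviewed before replacing the original.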

Then I restarted the systemd dbss service, and all the processes began to run under system.slice:

Then I rebooted the system and checked the alert log of the instance:

I also got a lot of help from the command below:

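When I checked the log at the beginning, I found that the issue was caused by pam_systemd. Each su opens a PAM session, and pam_systemd registers that session with systemd-logind, which puts the processes into a session scope under user.slice instead of the service's own cgroup under system.slice.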

Then I updated the dbss script to make these changes automatically. :)

Great day!
