After a recent update on CentOS 7 and RHEL 7, the Cassandra daemon stopped working. Starting Cassandra through systemd produced the following error, even though similar installations had been working fine until recently.

Mar 20 13:22:34 localhost systemd[1]: New main PID 72596 does not belong to service, and PID file is not owned by root. Refusing.
Mar 20 13:22:34 localhost systemd[1]: New main PID 72596 does not belong to service, and PID file is not owned by root. Refusing.
Mar 20 13:22:34 localhost systemd[1]: Failed to start LSB: distributed storage system for structured data.

Root Cause

Cassandra starts, but systemd cannot control it. The cause is that Cassandra is started through the old SysV init script, which has no way to tell systemd which user and group the service runs as.

A native systemd unit would declare this with the User and Group options:
—————–
[Service]
User=cassandra
Group=cassandra
—————–

But since the PID file is created with the permissions of the cassandra user, and the init script does not specify a user or group, systemd assumes the service runs as root (the default) and refuses to trust a PID file owned by the cassandra user.
——————
systemd[1]: New main PID 2545 does not belong to service, and PID file is not owned by root. Refusing.
——————

More details in CVE-2018-16888 (https://access.redhat.com/security/cve/cve-2018-16888)
——————
It was discovered systemd does not correctly check the content of PIDFile files before using it to kill processes. When a service is run from an unprivileged user (e.g. User field set in the service file), a local attacker who is able to write to the PIDFile of the mentioned service may use this flaw to trick systemd into killing other services and/or privileged processes.
——————
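
To see the mismatch described above on your own machine, you can check who owns the PID file. The following is a minimal sketch, assuming the PID file path /var/run/cassandra/cassandra.pid used by the init script in this post and GNU stat as shipped with CentOS and RHEL; the helper name pid_file_owner is hypothetical and only for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print "owner:group" of a PID file.
# Returns 1 (with a message on stderr) if the file does not exist.
pid_file_owner() {
    local pid_file="$1"
    if [ ! -e "$pid_file" ]; then
        echo "missing: $pid_file" >&2
        return 1
    fi
    # GNU stat: %U is the owner's user name, %G the group name.
    stat -c '%U:%G' "$pid_file"
}
```

Running pid_file_owner /var/run/cassandra/cassandra.pid would be expected to show the cassandra user as owner on an affected system, and root:root once the chown line from the fix below is in place.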

Solution

Update the /etc/rc.d/init.d/cassandra file. Either apply the following patch manually or replace the entire file with the one provided below.

Option 1 – Manual Patch

Open /etc/rc.d/init.d/cassandra and make the modifications marked by the comments in the script below. The snippet is not the complete script; it shows only the portion that needs updating. Do not copy and paste it over the entire file.

case "$1" in
start)
# Cassandra startup
echo -n "Starting Cassandra: "
[ -d `dirname "$pid_file"` ] || \
install -m 755 -o $CASSANDRA_OWNR -g $CASSANDRA_OWNR -d `dirname $pid_file`
# -Commented for fix
#su $CASSANDRA_OWNR -c "$CASSANDRA_PROG -p $pid_file" > $log_file 2>&1
# +Added for fix
runuser -u $CASSANDRA_OWNR -- $CASSANDRA_PROG -p $pid_file > $log_file 2>&1
retval=$?
# +Added new
chown root.root $pid_file
[ $retval -eq 0 ] && touch $lock_file
echo "OK"
;;
stop)
# Cassandra shutdown
echo -n "Shutdown Cassandra: "
# -Commented as per the fix
#su $CASSANDRA_OWNR -c "kill `cat $pid_file`"
# +Added for fixing the issue
runuser -u $CASSANDRA_OWNR -- kill `cat $pid_file`
retval=$?
[ $retval -eq 0 ] && rm -f $lock_file
for t in `seq 40`; do
status -p $pid_file cassandra > /dev/null 2>&1
retval=$?
if [ $retval -eq 3 ]; then
echo "OK"
exit 0
else
sleep 0.5
fi;
done
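
After editing the script by hand, a quick sanity check before restarting the service may save some trouble. The sketch below is only an illustration: the function name check_patched_script and its return codes are my own assumptions, not part of the Cassandra packaging. It uses bash -n to parse the script without running it, then checks that no su line is still active and that a runuser line is present.

```shell
#!/bin/bash
# Hypothetical helper: sanity-check a manually patched init script.
# Return codes (assumed for this sketch):
#   0 = looks patched, 1 = syntax error,
#   2 = no runuser line found, 3 = an old "su ... -c" line is still active.
check_patched_script() {
    local script="$1"
    # Parse only; nothing in the script is executed.
    bash -n "$script" || return 1
    # An uncommented "su ... -c" line means the old command is still in use.
    if grep -Eq '^[[:space:]]*su[[:space:]].*-c' "$script"; then
        return 3
    fi
    # The patched script should call runuser at least once.
    grep -Eq '^[[:space:]]*runuser[[:space:]]+-u[[:space:]]' "$script" || return 2
    return 0
}
```

For example, check_patched_script /etc/rc.d/init.d/cassandra; echo $? should print 0 when both su lines have been commented out and replaced.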

Option 2 – Replace the File

Replace the /etc/rc.d/init.d/cassandra file with the patched version from the Gist linked in the steps below. This patch was made as per the JIRA issue CASSANDRA-15273.

Steps to replace the file are given below.

mv /etc/rc.d/init.d/cassandra /etc/rc.d/init.d/cassandra.old
curl -o /etc/rc.d/init.d/cassandra https://gist.githubusercontent.com/amalgjose/74cf98e0110c27b6124b0adbb698d372/raw/c08ce3481e9cb0601e79e127c78a65bf82080e5f/cassandra
chmod 755 /etc/rc.d/init.d/cassandra
systemctl daemon-reload
systemctl restart cassandra
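
A slightly more defensive variant of the replacement steps is sketched below. It assumes the Gist has already been downloaded to a temporary file (for example with curl -o); the function name install_init_script is hypothetical. It refuses files that do not start with a shebang (curl can save an HTML error page instead of the script), keeps a backup copy of the current script, and installs the new file with the execute bit set.

```shell
#!/bin/bash
# Hypothetical helper: install a downloaded init script safely.
# $1 = downloaded file, $2 = destination (e.g. /etc/rc.d/init.d/cassandra)
install_init_script() {
    local src="$1" dest="$2"
    # Guard against a saved error page: the script must begin with "#!".
    if ! head -n 1 "$src" | grep -q '^#!'; then
        echo "refusing to install $src: no shebang on first line" >&2
        return 1
    fi
    # Keep the current script as a backup, preserving mode and timestamps.
    if [ -e "$dest" ]; then
        cp -p "$dest" "$dest.old" || return 1
    fi
    # Copy into place with mode 755 so the script is executable.
    install -m 755 "$src" "$dest"
}
```

Usage, assuming the download went to /tmp/cassandra.init: install_init_script /tmp/cassandra.init /etc/rc.d/init.d/cassandra, followed by systemctl daemon-reload and systemctl restart cassandra.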

 
The Gist code is pasted below.


#!/bin/bash
#
# /etc/init.d/cassandra
#
# Startup script for Cassandra
#
# chkconfig: 2345 80 20
# description: Starts and stops Cassandra
# pidfile: /var/run/cassandra/cassandra.pid
### BEGIN INIT INFO
# Provides: cassandra
# Required-Start: $remote_fs $network $named $time
# Required-Stop: $remote_fs $network $named $time
# Should-Start: ntp mdadm
# Should-Stop: ntp mdadm
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: distributed storage system for structured data
# Description: Cassandra is a distributed (peer-to-peer) system for
# the management and storage of structured data.
### END INIT INFO
. /etc/rc.d/init.d/functions
export CASSANDRA_HOME=/usr/share/cassandra
export CASSANDRA_CONF=/etc/cassandra/conf
export CASSANDRA_INCLUDE=$CASSANDRA_HOME/cassandra.in.sh
export CASSANDRA_OWNR=cassandra
NAME="cassandra"
log_file=/var/log/cassandra/cassandra.log
pid_file=/var/run/cassandra/cassandra.pid
lock_file=/var/lock/subsys/$NAME
CASSANDRA_PROG=/usr/sbin/cassandra
# The first existing directory is used for JAVA_HOME if needed.
JVM_SEARCH_DIRS="/usr/lib/jvm/jre /usr/lib/jvm/jre-1.7.* /usr/lib/jvm/java-1.7.*/jre"
# Read configuration variable file if it is present
[ -r /etc/default/$NAME ] && . /etc/default/$NAME
# If JAVA_HOME has not been set, try to determine it.
if [ -z "$JAVA_HOME" ]; then
# If java is in PATH, use a JAVA_HOME that corresponds to that. This is
# both consistent with how the upstream startup script works, and with
# the use of alternatives to set a system JVM (as is done on Debian and
# Red Hat derivatives).
java="`/usr/bin/which java 2>/dev/null`"
if [ -n "$java" ]; then
java=`readlink --canonicalize "$java"`
JAVA_HOME=`dirname "\`dirname \$java\`"`
else
# No JAVA_HOME set and no java found in PATH; search for a JVM.
for jdir in $JVM_SEARCH_DIRS; do
if [ -x "$jdir/bin/java" ]; then
JAVA_HOME="$jdir"
break
fi
done
# if JAVA_HOME is still empty here, punt.
fi
fi
JAVA="$JAVA_HOME/bin/java"
export JAVA_HOME JAVA
case "$1" in
start)
# Cassandra startup
echo -n "Starting Cassandra: "
[ -d `dirname "$pid_file"` ] || \
install -m 755 -o $CASSANDRA_OWNR -g $CASSANDRA_OWNR -d `dirname $pid_file`
#Commented as per CVE-2018-16888 fix
#su $CASSANDRA_OWNR -c "$CASSANDRA_PROG -p $pid_file" > $log_file 2>&1
#Added new
runuser -u $CASSANDRA_OWNR -- $CASSANDRA_PROG -p $pid_file > $log_file 2>&1
retval=$?
#Added new
chown root.root $pid_file
[ $retval -eq 0 ] && touch $lock_file
echo "OK"
;;
stop)
# Cassandra shutdown
echo -n "Shutdown Cassandra: "
#Commented and added new line as per CVE-2018-16888 fix
#su $CASSANDRA_OWNR -c "kill `cat $pid_file`"
runuser -u $CASSANDRA_OWNR -- kill `cat $pid_file`
retval=$?
[ $retval -eq 0 ] && rm -f $lock_file
for t in `seq 40`; do
status -p $pid_file cassandra > /dev/null 2>&1
retval=$?
if [ $retval -eq 3 ]; then
echo "OK"
exit 0
else
sleep 0.5
fi;
done
# Adding a sleep here to give jmx time to wind down (CASSANDRA-4483). Not ideal…
# Adam Holmberg suggests this, but that would break if the jmx port is changed
# for t in `seq 40`; do netstat -tnlp | grep "0.0.0.0:7199" > /dev/null 2>&1 && sleep 0.1 || break; done
sleep 5
status -p $pid_file cassandra > /dev/null 2>&1
retval=$?
if [ $retval -eq 3 ]; then
echo "OK"
else
echo "ERROR: could not stop $NAME"
exit 1
fi
;;
reload|restart)
$0 stop
$0 start
;;
status)
status -p $pid_file cassandra
exit $?
;;
*)
echo "Usage: `basename $0` start|stop|status|restart|reload"
exit 1
esac
exit 0


 

This solution worked for me. I hope it helps someone else as well.
