corosync

setup

Set up the corosync authkey

corosync-keygen

We will use the example config included. Copy this file and edit it.

cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf

We need to set bindnetaddr to an address on our local subnet, here the IP address of the node:

bindnetaddr: IPADDRESS OF YOUR NODE
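For IPv4, bindnetaddr can also be given as the network address of the subnet instead of the node's own address. The following is a minimal sketch of how that address could be derived from an IP and a prefix length. The helper name network_address and the sample values are assumptions for illustration, not part of corosync.

```shell
# network_address IP PREFIX
# Prints the IPv4 network address of IP under a PREFIX-bit netmask.
# Hypothetical helper; the body runs in a subshell so its variables stay local.
network_address() (
    prefix=$2
    # Split the dotted quad into its four octets.
    IFS=. read -r a b c d <<EOF
$1
EOF
    # Pack the octets into one 32-bit integer.
    addr=$(( (a << 24) | (b << 16) | (c << 8) | d ))
    # PREFIX one-bits followed by (32 - PREFIX) zero-bits.
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    net=$(( addr & mask ))
    printf '%d.%d.%d.%d\n' \
        $(( (net >> 24) & 255 )) $(( (net >> 16) & 255 )) \
        $(( (net >> 8) & 255 )) $(( net & 255 ))
)

# Example address and prefix length (assumed values).
network_address 192.168.1.57 24    # prints 192.168.1.0
```

The printed value is what would go after "bindnetaddr:" for a node at 192.168.1.57 on a /24 network.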

We also need corosync to start pacemaker. Add the following to the end of the file:

service {
	# Load the Pacemaker Cluster Resource Manager
	name: pacemaker
	ver: 0
}
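When this step is scripted, it helps to append the block only once so that repeated runs do not duplicate it. A minimal sketch, assuming a hypothetical helper named add_pacemaker_service; it is demonstrated on a scratch file rather than on /etc/corosync/corosync.conf.

```shell
# add_pacemaker_service FILE
# Appends the pacemaker service block to FILE unless FILE already has one.
# Hypothetical helper, not shipped with corosync or pacemaker.
add_pacemaker_service() {
    if ! grep -q 'name:[[:space:]]*pacemaker' "$1" 2>/dev/null; then
        cat >> "$1" <<'EOF'

service {
    # Load the Pacemaker Cluster Resource Manager
    name: pacemaker
    ver: 0
}
EOF
    fi
}

# Try it on a scratch file so the real configuration stays untouched.
scratch=$(mktemp)
add_pacemaker_service "$scratch"
add_pacemaker_service "$scratch"        # second call changes nothing
grep -c 'name: pacemaker' "$scratch"    # prints 1
rm -f "$scratch"
```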

using cluster resources

Every cluster resource is managed through a Resource Agent. A Resource Agent must be able to report the complete status and availability of its resource to the cluster at any time. The most important and most used Resource Agent classes are:

  • LSB (Linux Standard Base) – These are the init scripts found in the /etc/init.d directory.
  • OCF (Open Cluster Framework) – These are extended LSB resource agents that usually support additional parameters.

From this we can presume it is generally better to use an OCF Resource Agent (if available) over an LSB one, since OCF agents support additional configuration parameters and are optimized for cluster resources.
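The two classes are also written differently when a resource is defined: an OCF agent is named as ocf:PROVIDER:AGENT, while an LSB agent is just lsb:SCRIPT with no provider part. A minimal sketch of a check for these two forms follows. The function name agent_spec_ok is hypothetical, and other agent classes are deliberately not covered.

```shell
# agent_spec_ok SPEC
# Succeeds if SPEC looks like ocf:PROVIDER:AGENT or lsb:SCRIPT.
# Hypothetical helper; it only checks the shape of the name, not whether
# the agent is actually installed.
agent_spec_ok() {
    case $1 in
        ocf:*:*:*) return 1 ;;   # too many parts
        ocf:?*:?*) return 0 ;;   # ocf:PROVIDER:AGENT
        lsb:*:*)   return 1 ;;   # LSB agents take no provider
        lsb:?*)    return 0 ;;   # lsb:SCRIPT
        *)         return 1 ;;   # anything else is out of scope here
    esac
}

for spec in ocf:heartbeat:IPaddr2 lsb:postfix lsb:heartbeat:postfix; do
    if agent_spec_ok "$spec"; then
        echo "$spec: ok"
    else
        echo "$spec: malformed"
    fi
done
```

Whether an agent of that name exists on the node still has to be confirmed with the crm ra list commands.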

We can check for available Resource Agents by running “crm ra list” followed by the desired resource agent class:

LSB

crm ra list lsb

OCF

crm ra list ocf

parameters

crm ra meta IPaddr2

general

Disable Stonith

crm configure property stonith-enabled=false

Disable failback (there is no need to fail back in this case)

crm configure rsc_defaults resource-stickiness=100

With only 2 nodes, the remaining node cannot attain quorum once the other one fails, so we tell the cluster to ignore loss of quorum

crm configure property no-quorum-policy=ignore

create failover ip

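Replace IPADDRESS with the failover address and NETMASK with the prefix length, without the slash (for example 31 for a /31):

crm configure primitive ClusterIP ocf:heartbeat:IPaddr2 params ip=IPADDRESS cidr_netmask=NETMASK op monitor interval=30s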

services

Automatic Migration

If a resource fails for some reason (for example, postfix crashes and cannot be started again), we want to migrate it to another server. By default the migration-threshold is not defined, which is treated as infinity, so the resource will never be migrated because of failures.

After 3 failures we want the resource to migrate to another node, and we want the recorded failures to expire after 60 seconds so that the resource is allowed to move back to this node automatically. To do this, add migration-threshold=3 and failure-timeout=60s to the resource's meta attributes, using “crm configure edit”.
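The same two meta attributes can also be given when a resource is first created. The sketch below only prints the command so it can be reviewed before it is run. The helper name primitive_with_failover is hypothetical, and the 15s monitor interval is taken from the service examples on this page.

```shell
# primitive_with_failover NAME AGENT [THRESHOLD] [TIMEOUT]
# Prints (does not run) a crm command that defines NAME with a migration
# threshold and a failure timeout. Defaults: 3 failures, 60s.
# Hypothetical helper for illustration.
primitive_with_failover() {
    printf 'crm configure primitive %s %s op monitor interval="15s" meta migration-threshold="%s" failure-timeout="%s"\n' \
        "$1" "$2" "${3:-3}" "${4:-60s}"
}

primitive_with_failover postfix lsb:postfix
# prints: crm configure primitive postfix lsb:postfix op monitor interval="15s" meta migration-threshold="3" failure-timeout="60s"
```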

apache

crm configure primitive APACHE ocf:heartbeat:apache params configfile="APACHECONF-FILE" op start interval="0s" timeout="60s" op monitor interval="5s" timeout="20s" op stop interval="0s" timeout="60s"

All services running on the main server

crm configure colocation C_ALL_IN_ONE_PLACE inf: ClusterIP APACHE

The order of application startup (Apache once the network is up)

crm configure order O_ORDER inf: ClusterIP APACHE
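When several services have to follow the failover IP, the colocation and order constraints come in pairs, one pair per service. A minimal sketch that prints such pairs for review; the helper name constraints_for and the generated constraint IDs are assumptions, and nothing is sent to the cluster.

```shell
# constraints_for IP_RESOURCE SERVICE...
# Prints one colocation and one order constraint per SERVICE, keeping the
# same argument order as the APACHE example (IP resource first).
# Hypothetical helper; the constraint IDs are made up.
constraints_for() (
    ip_rsc=$1
    shift
    for svc in "$@"; do
        printf 'crm configure colocation C_%s_WITH_%s inf: %s %s\n' \
            "$svc" "$ip_rsc" "$ip_rsc" "$svc"
        printf 'crm configure order O_%s_AFTER_%s inf: %s %s\n' \
            "$svc" "$ip_rsc" "$ip_rsc" "$svc"
    done
)

constraints_for ClusterIP APACHE
```

For a single service this reproduces the two commands above, only with different constraint IDs.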

kvm

crm configure primitive KVMHOST ocf:heartbeat:VirtualDomain params config="/etc/libvirt/qemu/KVMHOST.xml" hypervisor="qemu:///system" meta allow-migrate="true" op start interval="0" timeout="120s" op stop interval="0" timeout="120s" op monitor interval="10" timeout="30" depth="0"

conntrackd

crm configure primitive SESSIONS ocf:heartbeat:conntrackd params config="/etc/conntrackd/conntrackd.conf" op monitor interval="20" role="Slave" timeout="20" op monitor interval="10" role="Master" timeout="20"
crm configure ms MS_CONNTRACKD SESSIONS meta notify="true" interleave="true"

postfix

crm configure primitive postfix lsb:postfix op monitor interval="15s" meta target-role="Started" migration-threshold="3" failure-timeout=60s

openvpn

crm configure primitive openvpn lsb:openvpn op monitor interval="15s" meta target-role="Started" migration-threshold="3" failure-timeout=60s

OR

crm configure primitive openvpn ocf:heartbeat:anything params binfile="/usr/sbin/openvpn" cmdline_options="--writepid /var/run/openvpn.lekker-guest-vpn.pid --config /etc/openvpn/lekker-guest-vpn.conf --cd /etc/openvpn --daemon" pidfile=/var/run/openvpn.lekker-guest-vpn.pid op start timeout="20" op stop timeout="30" op monitor interval="20"

shorewall

crm configure primitive FIREWALL lsb:shorewall op monitor interval="15s" meta target-role="Started" migration-threshold="3" failure-timeout=60s

grouping

service groups

crm configure group VPN ClusterIP FIREWALL openvpn meta target-role="Started"

examples

send email alerts

crm_mon --daemonize --mail-to user@example.com --mail-host mail.example.com

show cluster status

crm_mon -1

show configuration

crm configure show

stop resource

crm resource stop RESOURCE

migrate resource

crm resource migrate ClusterIP OTHER-NODE

Install Script

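#!/bin/bash
# Take the first non-loopback IPv4 address (expects the older "inet addr:" ifconfig output)
IP=$(ifconfig | grep -v '127.0.0.1' | sed -n 's/.*inet addr:\([0-9.]\+\)\s.*/\1/p' | head -n 1)
echo -e "${IP}\t $(hostname)" >> /etc/hosts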
 
echo -n "Setting up pacemaker "
apt-get -qq -y -m install corosync pacemaker
sed -i "s/START=no/START=yes/g" /etc/default/corosync
 
if [ ! -f /etc/corosync/authkey ]
then
    echo -n "Generating authkey... this will take a while "  
    corosync-keygen && echo done || echo failed 
fi
 
cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
 
sed -i "s/bindnetaddr: 192.168.1.1/bindnetaddr: ${IP}/g" /etc/corosync/corosync.conf
 
# ver: 1 means corosync does not launch pacemaker itself; it is started separately below
echo -e "service {
# Load the Pacemaker Cluster Resource Manager
name: pacemaker
ver: 1
}" > /etc/corosync/service.d/pcmk
 
mkdir -p /var/log/cluster
chmod 777 /var/log/cluster
 
/etc/init.d/corosync start
/etc/init.d/pacemaker start
 
crm configure property stonith-enabled=false
crm configure property no-quorum-policy=ignore
 
 
crm configure primitive ClusterIP ocf:heartbeat:IPaddr2 params ip=10.0.0.120 cidr_netmask=32 op monitor interval=30s
 
update-rc.d pacemaker defaults 95 00
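
The address detection at the top of the script depends on ifconfig printing "inet addr:", which newer net-tools releases no longer do. A possible alternative, sketched under the assumption that the iproute2 "ip" tool is installed, parses its one-line-per-address output instead. The function name first_ipv4 and the sample addresses are made up for illustration.

```shell
# first_ipv4
# Reads `ip -o -4 addr show` output on stdin and prints the first address
# that does not belong to the loopback interface.
# Hypothetical helper for illustration.
first_ipv4() {
    awk '$2 != "lo" && $3 == "inet" { split($4, parts, "/"); print parts[1]; exit }'
}

# Sample input in the one-line-per-address format (addresses are made up).
first_ipv4 <<'EOF'
1: lo    inet 127.0.0.1/8 scope host lo
2: eth0    inet 10.0.0.120/24 brd 10.0.0.255 scope global eth0
EOF
# prints 10.0.0.120

# On a real node: IP=$(ip -o -4 addr show | first_ipv4)
```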
linux/corosync.txt · Last modified: 2014/11/25 11:08 by michel.pelzer