FreeBSD Manual Pages

home | help
COROSYNC_CONF(5)  Corosync Cluster Engine Programmer's Manual COROSYNC_CONF(5)

NAME
       corosync.conf - corosync	executive configuration	file

SYNOPSIS
       /etc/corosync/corosync.conf

DESCRIPTION
       The  corosync.conf instructs the	corosync executive about various para-
       meters needed to	control	the corosync executive.	 Empty lines and lines
       starting	with # character are ignored.  The configuration file consists
       of bracketed top	level directives.  The possible	directive choices are:

       totem { }
	      This top level directive contains	configuration options for  the
	      totem protocol.

       logging { }
	      This top level directive contains	configuration options for log-
	      ging.

       quorum {	}
	      This top level directive contains	configuration options for quo-
	      rum.

       nodelist	{ }
	      This  top	 level	directive  contains  configuration options for
	      nodes in cluster.

       system {	}
	      This top level directive contains	configuration options  related
	      to system.

       resources { }
	      This  top	level directive	contains configuration options for re-
	      sources.

       nozzle {	}
	      This top level directive contains	configuration  options	for  a
	      libnozzle	device.

       Corosync	 supports  multiple types of network transports	for communica-
       tion between the	nodes in the cluster. There are	three types of	trans-
       ports:

	      1.     KNET.  This is a default and recommended transport	intro-
		     duced in Corosync 3. It provides several advantages  over
		     the  UDP  and  UDPU  transports, including	better perfor-
		     mance, link-level redundancy,  automatic  link  recovery,
		     and native	IP compression and encryption.

	      2.     UDPU.  This  is for unicast communication.	This transport
		     is	deprecated.

	      3.     UDP. This is for multicast	communication. This  transport
		     is	deprecated and highly discouraged to use.

       The  interface  sub-directive  of  totem	 is  optional for UDP and KNET
       transports.

       For KNET, multiple interface subsections	 define	 parameters  for  each
       KNET link on the	system.

       For  UDPU an interface section is not needed and	it is recommended that
       the nodelist is used to define cluster nodes.

       linknumber
	      This specifies the link number for the  interface.   When	 using
	      the  KNET	 protocol, each	interface should specify separate link
	      numbers to uniquely identify to the  membership  protocol	 which
	      interface	 to  use for which link.  The linknumber must start at
	      0. For UDP the only supported linknumber is 0.

       knet_link_priority
	      This specifies the priority for the link when KNET  is  used  in
	      'passive'	mode. (see link_mode below)

       knet_ping_interval
	      This   specifies	 the   interval	  between   KNET  link	pings.
	      knet_ping_interval and knet_ping_timeout are a pair, if  one  is
	      specified	 the other should be too, otherwise one	will be	calcu-
	      lated from the token timeout and one will	be taken from the con-
	      fig file.	 (default is token timeout / (knet_pong_count*2))

       knet_ping_timeout
	      If no ping is received within this time, the KNET	 link  is  de-
	      clared  dead.   knet_ping_interval  and  knet_ping_timeout are a
	      pair, if one is specified	the other should be too, otherwise one
	      will be calculated from the token	timeout	and one	will be	 taken
	      from   the   config   file.    (default	is   token  timeout  /
	      knet_pong_count)

       knet_ping_precision
	      How many values of latency are used  to  calculate  the  average
	      link latency. (default 2048 samples)

       knet_pong_count
	      How  many	 valid ping/pongs before a link	is marked UP. (default
	      2)

       knet_transport
	      Which IP transport KNET should use. valid	values are  "udp"  (or
	      "sctp" if	supported by the installed version of kronosnet). (de-
	      fault: udp) NOTE:	even if	currently included, sctp is deprecated
	      and will be removed in kronoset v2.0

       bindnetaddr (UDP	only)
	      This specifies the network address the corosync executive	should
	      bind to when using UDP transport.

	      bindnetaddr (UDP only) should be an IP address configured	on the
	      system, or a network address.

	      For example, if the local	interface is 192.168.5.92 with netmask
	      255.255.255.0,  you  should  set	bindnetaddr to 192.168.5.92 or
	      192.168.5.0.  If the local interface is 192.168.5.92  with  net-
	      mask   255.255.255.192,	set  bindnetaddr  to  192.168.5.92  or
	      192.168.5.64, and	so forth.

	      This may also be an IPV6 address,	in which case IPV6  networking
	      will be used.  In	this case, the exact address must be specified
	      and  there  is  no  automatic selection of the network interface
	      within a specific	subnet as with IPv4.

	      If IPv6 networking is used, the nodeid field in nodelist must be
	      specified.

       broadcast (UDP only)
	      This is optional and can be set to yes.  If it is	 set  to  yes,
	      the  broadcast  address will be used for communication.  If this
	      option is	set, mcastaddr should not be set.

       mcastaddr (UDP only)
	      This is the multicast address used by corosync  executive.   The
	      default  should work for most networks, but the network adminis-
	      trator should be queried	about  a  multicast  address  to  use.
	      Avoid 224.x.x.x because this is a	"config" multicast address.

	      This  may	 also be an IPV6 multicast address, in which case IPV6
	      networking will be used.	If IPv6	networking is used, the	nodeid
	      field in nodelist	must be	specified.

	      It's not necessary to use	this option if cluster_name option  is
	      used. If both options are	used, mcastaddr	has higher priority.

       mcastport
	      This  specifies  the  UDP	 port number. Exact meaning depends on
	      used transport.

	      For the KNET transport, it is used as both the binding and send-
	      ing port.	For additional links, the default value	is computed as
	      (mcastport + linknumber).

	      For the UDPU transport, mcastport	specifies the bind port. Coro-
	      sync also	allocates and binds a random sending port for each re-
	      mote node.

	      For the UDP transport, it	is used	as a bind port for  mcastaddr.
	      It  is  possible	to use the same	multicast address on a network
	      with the corosync	services configured for	different  UDP	ports.
	      Please note corosync uses	two UDP	ports mcastport	(for mcast re-
	      ceives) and mcastport - 1	(for mcast sends).  If you have	multi-
	      ple clusters on the same network using the same mcastaddr	please
	      configure	the mcastports with a gap.

	      The default is 5405.

       ttl (UDP	only)
	      This  specifies  the Time	To Live	(TTL). If you run your cluster
	      on a routed network then the default of "1" will be  too	small.
	      This option provides a way to increase this up to	255. The valid
	      range is 0..255.

       Within  the  totem  directive, there are	seven configuration options of
       which one is required, five are optional, and one is required when IPV6
       is configured in	the interface subdirective.   The  required  directive
       controls	 the  version of the totem configuration.  The optional	option
       unless using IPV6 directive controls identification of  the  processor.
       The  optional  options  control secrecy and authentication, the network
       mode of operation and maximum network MTU field.

       version
	      This specifies the version of the	configuration file.  Currently
	      the only valid version for this directive	is 2.

       clear_node_high_bit
	      This configuration option	is optional and	is only	relevant  when
	      no  nodeid is specified.	Some corosync clients require a	signed
	      32 bit nodeid that is greater than zero however by default coro-
	      sync uses	all 32 bits of the IPv4	address	space when  generating
	      a	 nodeid.   Set	this option to yes to force the	high bit to be
	      zero and therefore ensure	the nodeid is a	positive signed	32 bit
	      integer.

	      WARNING: Cluster behavior	is undefined if	this option is enabled
	      on only a	subset of the cluster (for example  during  a  rolling
	      upgrade).

       crypto_model
	      This  specifies  which  cryptographic  library should be used by
	      KNET.  Supported values depend on	the libknet build and  on  the
	      installed	cryptography libraries.	Typically nss and openssl will
	      be available but gcrypt and others could also be allowed.

	      The default is nss.

       crypto_hash
	      This  specifies  which HMAC authentication should	be used	to au-
	      thenticate all messages. Valid values are	none  (no  authentica-
	      tion), md5, sha1,	sha256,	sha384 and sha512. Encrypted transmis-
	      sion is only supported for the KNET transport.

	      The default is none.

       crypto_cipher
	      This  specifies  which cipher should be used to encrypt all mes-
	      sages.  Valid values are none (no	 encryption),  aes256,	aes192
	      and  aes128.   Enabling crypto_cipher, requires also enabling of
	      crypto_hash. Encrypted transmission is only  supported  for  the
	      KNET transport.

	      The default is none.

       secauth
	      This implies crypto_cipher=aes256	and crypto_hash=sha256,	unless
	      those options are	explicitly set.	Encrypted transmission is only
	      supported	for the	KNET transport.

	      The default is off.

       keyfile
	      This  specifies  the fully qualified path	to the shared key used
	      to authenticate and encrypt data used within the Totem protocol.

	      The default is /etc/corosync/authkey.

       key    Shared key stored	in configuration instead of authkey file. This
	      option has lower precedence than keyfile	option	so  it's  used
	      only  when  keyfile  is not specified.  Using this option	is not
	      recommended for security reasons.

       link_mode
	      This specifies the Kronosnet mode, which may be passive, active,
	      or rr (round-robin).  passive: the active	link with the  highest
	      priority	(highest  number)  will	 be used. If one or more links
	      share the	same priority the one with the lowest link ID will  be
	      used.   active:  All active links	will be	used simultaneously to
	      send traffic.  link priority is ignored.	rr:  Round-Robin  pol-
	      icy. Each	packet will be sent to the next	active link in order.

	      If  only	one interface directive	is specified, passive is auto-
	      matically	chosen.

	      The maximum number of interface directives that is allowed  with
	      Kronosnet	is 8. For other	transports it is 1.

       netmtu This  specifies  maximum	packet	length	sent by	corosync. It's
	      mainly for the UDPU (and UDP) transport, where it	specifies  the
	      network  maximum	transmit  size,	 but can be used also with the
	      KNET transport, where it defines the maximum length  of  packets
	      passed  to  the  KNET layer. To specify the network MTU manually
	      for KNET,	use the	knet_mtu option.

	      For UDPU (and UDP), setting this value beyond 1500, the  regular
	      frame MTU, requires ethernet devices that	support	large, or also
	      called jumbo, frames.  If	any device in the network doesn't sup-
	      port  large frames, the protocol will not	operate	properly.  The
	      hosts must also have their mtu size set from  1500  to  whatever
	      frame size is specified here.

	      Please  note  while some NICs or switches	claim large frame sup-
	      port, they support 9000 MTU as the maximum frame size  including
	      the  IP  header.	 Setting the netmtu and	host MTUs to 9000 will
	      cause totem to use the full 9000 bytes of	the frame.  Then Linux
	      will add a 18 byte header	moving the full	frame  size  to	 9018.
	      As  a  result  some hardware will	not operate properly with this
	      size of data.  A netmtu of 8982 seems to work for	the few	 large
	      frame  devices  that have	been tested.  Some manufacturers claim
	      large frame support when in fact they  support  frame  sizes  of
	      4500 bytes.

	      When sending multicast traffic, if the network frequently	recon-
	      figures,	chances	 are  that  some device	in the network doesn't
	      support large frames.

	      Choose hardware carefully	if intending to	use large  frame  sup-
	      port.

	      The  default  is	1500 for UDPU (and UDP)	and 65536 for the KNET
	      transport.

       transport
	      This directive controls the transport mechanism used.   The  de-
	      fault is knet (for KNET).	 The transport type can	also be	set to
	      udpu  (for  UDPU)	 or  udp (for UDP). Only KNET allows crypto or
	      multiple interfaces per node.

       cluster_name
	      This specifies the name of cluster and it's used	for  automatic
	      generating of multicast address.

       config_version
	      This  specifies version of config	file. This is converted	to un-
	      signed 64-bit int.  By default it's 0. Option is used to prevent
	      joining old nodes	with not up-to-date configuration. If value is
	      not 0, and node is going for first time (only  for  first	 time,
	      join  after  split  doesn't  follow this rules) from single-node
	      membership to multiple nodes membership, other nodes config_ver-
	      sions are	collected. If current node config_version is not equal
	      to highest of collected versions,	corosync is terminated.

       ip_version
	      This specifies version of	IP to ask DNS resolver for.  The value
	      can be one of ipv4 (look only for	an IPv4	address) , ipv6	(check
	      only IPv6	address) , ipv4-6 (look	for all	address	 families  and
	      use  first  IPv4	address	found in the list if there is such ad-
	      dress, otherwise use first IPv6 address) and  ipv6-4  (look  for
	      all  address  families  and  use first IPv6 address found	in the
	      list if there is such address,  otherwise	 use  first  IPv4  ad-
	      dress).

	      Default  (if unspecified)	is ipv6-4 for KNET and UDPU transports
	      and ipv4 for UDP transport.

	      The KNET transport supports  IPv4	 and  IPv6  addresses  concur-
	      rently, provided they are	consistent on each link.

       ip_dscp
	      This  specifies  the  dscp  (differentiated services code	point)
	      value to be used in the IP header's type of service (TOS)	 field
	      according	 to  RFC  2474.	  Allowed values are 0-63 and symbolic
	      dscp names cs0, cs1, cs2,	cs3, cs4, cs5, cs6, cs7,  af11,	 af12,
	      af13,  af21,  af22, af23,	af31, af32, af33, af41,	af42, af43 and
	      ef.  Value 0 disables the	dscp support. Use of  TOS  field  then
	      depends on the used transport.  If ip_dscp is set	to a value not
	      equal  to	 zero,	and  an	 appropriate socket option (IP_TOS) is
	      available, the value is put into the upper six bits of  the  TOS
	      header field.

       Within  the  totem  directive,  there are several configuration options
       which are used to control the operation of the protocol.	 It is	gener-
       ally not	recommended to change any of these values without proper guid-
       ance  and  sufficient testing.  Some networks may require larger	values
       if suffering from frequent reconfigurations.  Some applications may re-
       quire faster failure detection times which can be achieved by  reducing
       the token timeout.

       token  This  timeout is used directly or	as a base for real token time-
	      out calculation (explained in token_coefficient section).	 Token
	      timeout specifies	in milliseconds	until a	token loss is declared
	      after not	receiving a token.  This is the	time spent detecting a
	      failure  of a processor in the current configuration.  Reforming
	      a	new configuration takes	about 50 milliseconds in  addition  to
	      this timeout.

	      For  real	token timeout used by totem it's possible to read cmap
	      value of runtime.config.totem.token key.

	      Be careful to use	the same timeout values	on each	of  the	 nodes
	      in the cluster or	unpredictable results may occur.

	      The default is 3000 milliseconds.

       token_warning
	      Specifies	 the  interval between warnings	that the token has not
	      been received.  The value	is a percentage	of the	token  timeout
	      and can be set to	0 to disable warnings.

	      The default is 75%.

       token_coefficient
	      This  value  is used only	when nodelist section is specified and
	      contains at least	3 nodes. If so,	real  token  timeout  is  then
	      computed	as  token + (number_of_nodes - 2) * token_coefficient.
	      This allows cluster to scale  without  manually  changing	 token
	      timeout every time new node is added. This value can be set to 0
	      resulting	in effective removal of	this feature.

	      The default is 650 milliseconds.

       token_retransmit
	      This timeout specifies in	milliseconds after how long before re-
	      ceiving  a token the token is retransmitted.  This will be auto-
	      matically	calculated if token is modified.   It  is  not	recom-
	      mended  to  alter	 this value without guidance from the corosync
	      community.

	      The minimum is 30	milliseconds. If not set and error occur, make
	      sure token / (token_retransmits_before_loss_const	+ 0.2) is more
	      than 30.

	      The default is 238 milliseconds for two nodes cluster. Three  or
	      more nodes reference token_coefficient.

       knet_compression_model
	      Type  of	compression used by Kronosnet. Supported values	depend
	      on the libknet build and on the installed	compression libraries.
	      Typically	zlib and lz4 will be available but  bzip2  and	others
	      could also be allowed. The default is 'none'.

       knet_compression_threshold
	      Tells KNET to NOT	compress any packets that are smaller than the
	      value indicated. Default 100 bytes.

	      Set  to  0 to reset to the default.  Set to 1 to compress	every-
	      thing.

       knet_compression_level
	      Many compression libraries allow tuning of  compression  parame-
	      ters.  For  example  0 or	1 ... 9	are commonly used to determine
	      the level	of compression.	This value is passed unmodified	to the
	      compression library so it	is  recommended	 to  consult  the  li-
	      brary's documentation for	more detailed information.

       hold   This timeout specifies in	milliseconds how long the token	should
	      be  held	by  the	 representative	when the protocol is under low
	      utilization.   It	is not recommended to alter this value without
	      guidance from the	corosync community.

	      The default is 180 milliseconds.

       token_retransmits_before_loss_const
	      This value identifies how	many token retransmits should  be  at-
	      tempted  before forming a	new configuration. It is also used for
	      token_retransmit and hold	calculations.

	      The default is 4 retransmissions.

       join   This timeout specifies in	milliseconds how long to wait for join
	      messages in the membership protocol.

	      The default is 50	milliseconds.

       send_join
	      This timeout specifies in	milliseconds an	upper range between  0
	      and  send_join  to wait before sending a join message.  For con-
	      figurations with less than 32 nodes, this	parameter is not  nec-
	      essary.  For larger rings, this parameter	is necessary to	ensure
	      the  NIC	is not overflowed with join messages on	formation of a
	      new ring.	 A reasonable value for	large rings (128 nodes)	 would
	      be 80msec.  Other	timer values must also change if this value is
	      changed.	 Seek  advice from the corosync	mailing	list if	trying
	      to run larger configurations.

	      The default is 0 milliseconds.

       consensus
	      This timeout specifies in	milliseconds how long to wait for con-
	      sensus to	be achieved before starting a new round	of  membership
	      configuration.   The  minimum  value for consensus must be 1.2 *
	      token.  This value will be automatically calculated at 1.2 * to-
	      ken if the user doesn't specify a	consensus value.

	      For two node clusters, a consensus larger	than the join  timeout
	      but less than token is safe.  For	three node or larger clusters,
	      consensus	 should	 be larger than	token.	There is an increasing
	      risk of odd membership changes, which  still  guarantee  virtual
	      synchrony,  as node count	grows if consensus is less than	token.

	      The default is 3600 milliseconds.

       merge  This  timeout  specifies in milliseconds how long	to wait	before
	      checking for a partition when  no	 multicast  traffic  is	 being
	      sent.   If  multicast traffic is being sent, the merge detection
	      happens automatically as a function of the protocol.

	      The default is 200 milliseconds.

       downcheck
	      This timeout specifies in	milliseconds how long to  wait	before
	      checking	that  a	network	interface is back up after it has been
	      downed.

	      The default is 1000 milliseconds.

       fail_recv_const
	      This constant specifies how many rotations of the	token  without
	      receiving	 any  of the messages when messages should be received
	      may occur	before a new configuration is formed.

	      The default is 2500 failures to receive a	message.

       seqno_unchanged_const
	      This constant specifies how many rotations of the	token  without
	      any  multicast  traffic  should  occur  before the hold timer is
	      started.

	      The default is 30	rotations.

       heartbeat_failures_allowed
	      [HeartBeating mechanism] Configures  the	optional  HeartBeating
	      mechanism	for faster failure detection. Keep in mind that	engag-
	      ing this mechanism in lossy networks could cause faulty loss de-
	      claration	 as the	mechanism relies on the	network	for heartbeat-
	      ing.

	      So as a rule of thumb use	this mechanism if you require improved
	      failure in low to	medium utilized	networks.

	      This constant specifies the number  of  heartbeat	 failures  the
	      system should tolerate before declaring heartbeat	failure	e.g 3.
	      Also  if this value is not set or	is 0 then the heartbeat	mecha-
	      nism is not engaged in the system	 and  token  rotation  is  the
	      method of	failure	detection

	      The default is 0 (disabled).

       max_network_delay
	      [HeartBeating mechanism] This constant specifies in milliseconds
	      the  approximate	delay that your	network	takes to transport one
	      packet from one machine to another. This value is	to be  set  by
	      system engineers and please don't	change if not sure as this ef-
	      fects the	failure	detection mechanism using heartbeat.

	      The default is 50	milliseconds.

       window_size
	      This  constant specifies the maximum number of messages that may
	      be sent on  one  token  rotation.	  If  all  processors  perform
	      equally  well,  this value could be large	(300), which would in-
	      troduce higher latency from origination  to  delivery  for  very
	      large  rings.   To  reduce  latency in large rings(16+), the de-
	      faults are a safe	compromise.  If	1 or  more  slow  processor(s)
	      are  present  among  fast	 processors,  window_size should be no
	      larger than 256000 / netmtu to avoid overflow of the kernel  re-
	      ceive buffers.  The user is notified of this by the display of a
	      retransmit  list	in the notification logs.  There is no loss of
	      data, but	performance is reduced when these errors occur.

	      The default is 50	messages.

       max_messages
	      This constant specifies the maximum number of messages that  may
	      be  sent by one processor	on receipt of the token.  The max_mes-
	      sages parameter is limited to 256000 / netmtu to	prevent	 over-
	      flow of the kernel transmit buffers.

	      The default is 17	messages.

       miss_count_const
	      This  constant defines the maximum number	of times on receipt of
	      a	token a	message	is checked for	retransmission	before	a  re-
	      transmission  occurs.   This  parameter  is useful to modify for
	      switches that delay multicast packets compared to	unicast	 pack-
	      ets.   The  default  setting  works  well	 for nearly all	modern
	      switches.

	      The default is 5 messages.

       knet_pmtud_interval
	      How often	the KNET PMTUd runs to look for	network	 MTU  changes.
	      Value in seconds,	default: 30

       knet_mtu
	      Switch  between manual and automatic MTU discovery. A value of 0
	      means automatic, other values set	a manual MTU.  In a setup with
	      multiple interfaces, please specify the lowest MTU  of  the  se-
	      lected interfaces.

	      The default value	is 0.

       block_unlisted_ips
	      Allow  UDPU  and KNET to drop packets from IP addresses that are
	      not known	(nodes which don't exist in the	nodelist) to corosync.
	      Value is yes or no.

	      This feature is mainly to	protect	against	the joining  of	 nodes
	      with outdated configurations after a cluster split.  Another use
	      case is to allow the atomic merge	of two independent clusters.

	      Changing	the  default value is not recommended, the overhead is
	      tiny and an existing cluster may fail if corosync	is started  on
	      an unlisted node with an old configuration.

	      The default value	is yes.

       cancel_token_hold_on_retransmit
	      Allows  Corosync	to  hold token by representative when there is
	      too much retransmit messages. This allows	network	to process in-
	      creased load without overloading it. Used	mechanism is  same  as
	      described	for hold directive.

	      Some  deployments	 may  prefer to	never hold token when there is
	      retransmit messages. If so, option should	be set to yes.

	      The default value	is no.

       Within the logging directive, there are several	configuration  options
       which are all optional.

       The following 3 options are valid only for the top level	logging	direc-
       tive:

       timestamp
	      This  specifies  that a timestamp	is placed on all log messages.
	      It can be	one of off (no timestamp), on (second precision	 time-
	      stamp)  or  hires	 (millisecond  precision timestamp - only when
	      supported	by LibQB).

	      The default is hires (or on if hires is not supported).

       fileline
	      This specifies that file and line	should be printed.

	      The default is off.

       function_name
	      This specifies that the code function name should	be printed.

	      The default is off.

       blackbox
	      This specifies that blackbox functionality should	be enabled.

	      The default is on.

       The following options are valid both for	top  level  logging  directive
       and they	can be overridden in logger_subsys entries.

       to_stderr

       to_logfile

       to_syslog
	      These specify the	destination of logging output. Any combination
	      of these options may be specified. Valid options are yes and no.

	      The default is syslog and	stderr.

	      Please  note, if you are using to_logfile	and want to rotate the
	      file, use	logrotate(8) with the option copytruncate.  eg.
	      /var/log/corosync.log {
		   missingok
		   compress
		   notifempty
		   daily
		   rotate 7
		   copytruncate
	      }

       logfile
	      If the to_logfile	directive is set to yes	, this	option	speci-
	      fies the pathname	of the log file.

	      No default.

       logfile_priority
	      This  specifies the logfile priority for this particular subsys-
	      tem. Ignored if debug is on.  Possible values are: alert,	 crit,
	      debug (same as debug = on), emerg, err, info, notice, warning.

	      The default is: info.

       syslog_facility
	      This  specifies  the  syslog facility type that will be used for
	      any messages sent	to syslog. options are daemon, local0, local1,
	      local2, local3, local4, local5, local6 & local7.

	      The default is daemon.

       syslog_priority
	      This specifies the syslog	level for this	particular  subsystem.
	      Ignored if debug is on.  Possible	values are: alert, crit, debug
	      (same as debug = on), emerg, err,	info, notice, warning.

	      The default is: info.

       debug  This  specifies whether debug output is logged for this particu-
	      lar logger. Also can contain value trace,	what is	highest	 level
	      of debug information.

	      The default is off.

       Within the logging directive, logger_subsys directives are optional.

       Within  the  logger_subsys sub-directive, all of	the above logging con-
       figuration options are valid and	can be used to	override  the  default
       settings.   The subsys entry, described below, is mandatory to identify
       the subsystem.

       subsys This specifies the subsystem identity (name) for	which  logging
	      is  specified.  This  is	the  name  used	 by  a	service	in the
	      log_init() call. E.g. 'CPG'. This	directive is required.

       Within the quorum directive it is possible to specify the  quorum  con-
       figuration options. The following option	is required to activate	quorum
       service:

       provider
	      This  specifies  algorithm  to  use. At the time of writing only
	      corosync_votequorum is supported.	 See votequorum(5) for config-
	      uration options.

       Within the nodelist directive it	is possible to specify specific	infor-
       mation about nodes in cluster. Directive	can contain only node  sub-di-
       rective,	which specifies	every node that	should be a member of the mem-
       bership,	and where non-default options are needed. Every	node must have
       at least	ring0_addr field filled.

       Every node that should be a member of the membership must be specified.

       Possible	options	are:

       ringX_addr
	      This  specifies IP or network hostname address of	the particular
	      node.  X is a link number.

       nodeid This configuration option	is required for	each node for  Kronos-
	      net  mode.   It is a 32 bit value	specifying the node identifier
	      delivered	to the cluster membership service. The node identifier
	      value of zero is reserved	and should not be  used.  If  KNET  is
	      set, this	field must be set.

       name   This option is used mainly with KNET transport to	identify local
	      node.  It's also used by client software (pacemaker).  Algorithm
	      for identifying local node is following:

	      1.     Looks up $HOSTNAME	in the nodelist

	      2.     If	 this  fails  strip the	domain name from $HOSTNAME and
		     looks up that in the nodelist

	      3.     If	this fails look	in the nodelist	for a  fully-qualified
		     name  whose  short	 version  matches the short version of
		     $HOSTNAME

	      4.     If	all this fails then search the interfaces list for  an
		     address that matches a name in the	nodelist

       Within the system directive it is possible to specify system options.

       Possible	options	are:

       qb_ipc_type
	      This  specifies  type  of	 IPC to	use. Can be one	of native (de-
	      fault), shm and socket.  Native means one	of shm or socket,  de-
	      pending  on what is supported by OS. On systems with support for
	      both, SHM	is selected. SHM is generally faster, but need to  al-
	      locate ring buffer file in /dev/shm.

       sched_rr
	      Should  be  set  to  yes (default) if corosync should try	to set
	      round robin realtime scheduling with maximal priority to itself.
	      When setting of scheduler	fails, fallback	to set maximal	prior-
	      ity.

       priority
	      Set  priority  of	 corosync process. Valid only when sched_rr is
	      set to no.  Can be ether numeric value with similar  meaning  as
	      nice(1) or max / min meaning maximal / minimal priority (so min-
	      imal / maximal nice value).

       move_to_root_cgroup
	      Can be one of yes	(Corosync always moves itself to root cgroup),
	      no  (Corosync never tries	to move	itself to root cgroup) or auto
	      (Corosync	first checks if	sched_rr is enabled,  and  if  so,  it
	      tries to set round robin realtime	scheduling with	maximal	prior-
	      ity  to itself.  If setting of priority fails, corosync tries to
	      move itself to root cgroup and retries setting of	priority).

	      This feature is available	only for systems with cgroups v1  with
	      RT  sched	 enabled  (Linux with CONFIG_RT_GROUP_SCHED kernel op-
	      tion) and	cgroups	v2.

	      It's worth noting	that currently (May 3 2021) cgroup2 doesnt yet
	      support control of realtime processes and	the cpu	controller can
	      only be enabled when all RT processes are	 in  the  root	cgroup
	      (applies only for	kernel with CONFIG_RT_GROUP_SCHED enabled). So
	      when  move_to_root_cgroup	 is  disabled, kernel is compiled with
	      CONFIG_RT_GROUP_SCHED and	systemd	is used, it may	be  impossible
	      to  make	systemd	 options like CPUQuota working correctly until
	      corosync is stopped.

	      Also when	moving to root cgroup is enforced  and	used  together
	      with  cgroup2 and	systemd	it makes impossible (most of the time)
	      for journald to add systemd specific metadata (most  importantly
	      _SYSTEMD_UNIT) properly, because corosync	is moved out of	cgroup
	      created  by  systemd.  This  means  it is	not possible to	filter
	      corosync logged messages based on	these  metadata	 (for  example
	      using -u or _SYSTEMD_UNIT=UNIT pattern) and also running system-
	      ctl  status  doesn't  display  (all) corosync log	messages.  The
	      problem is even worse because journald caches pid	for some  time
	      (approx.	5 sec) so initial corosync messages have correct meta-
	      data.

       allow_knet_handle_fallback
	      If KNET handle creation fails using privileged operations, allow
	      fallback to creating KNET	handle using unprivileged  operations.
	      Defaults	to  no,	 meaning  if  privileged  KNET handle creation
	      fails, corosync will refuse to start.

	      The KNET handle will always be created using  privileged	opera-
	      tions  if	 possible, setting this	to yes only allows fallback to
	      unprivileged operations. This fallback may result	in performance
	      issues, but if running in	an unprivileged	environment, e.g. as a
	      normal user or in	unprivileged container,	this may be required.

       state_dir
	      Existing directory where corosync	should	chdir  into.  Corosync
	      stores important state files and blackboxes there.

	      The  default  is the value of the	environment variable STATE_DI-
	      RECTORY or /var/lib/corosync.

       Within the resources directive it is possible to	 specify  options  for
       resources.

       Possible	option is:

       watchdog_device
	      (Valid only if Corosync was compiled with	watchdog support.)
	      Watchdog	device	to  use, for example /dev/watchdog.  If	unset,
	      empty or "off", no watchdog is used.

	      In a cluster with	properly configured power fencing  a  watchdog
	      provides	no additional value.  On the other hand, slow watchdog
	      communication may	incur multi-second delays in the Corosync main
	      loop, potentially	breaking down membership.  IPMI	watchdogs  are
	      particularly   notorious	 in   this  regard:  read  about  kip-
	      mid_max_busy_us in IPMI.txt in the Linux kernel documentation.

       Within the nozzle directive it is possible to  specify  options	for  a
       libnozzle  device. This is a pseudo ethernet device that	routes network
       traffic through a channel on the	corosync KNET network (NOT cpg or  any
       corosync	 internal  service) to other nodes in the cluster. This	allows
       applications to take advantage of KNET features such  as	 multipathing,
       automatic  failover,  link  switching etc. Note that libnozzle is not a
       reliable	transport, but you can tunnel TCP through it for reliable com-
       munications.
       libnozzle also supports optional	interface  up/down  scripts  that  are
       kept under a /etc/corosync/updown.d/ directory. See the KNET documenta-
       tion for	more information.
       Only one	nozzle device is allowed.
       The nozzle stanza takes several options:

       name   The  name	of the network device to be created. On	Linux this may
	      be any name at all, other	platforms  have	 restrictions  on  the
	      name.

       ipaddr The  IP address (IPv6 or IPv4) of	the interface. The bottom part
	      of this address will be replaced by the local node's  nodeid  in
	      conjunction  with	ipprefix. so, eg ipaddr: 192.168.1.0 ipprefix:
	      24  will	make  nodeids  1,2,5  use  IP  addresses  192.168.1.1,
	      192.168.1.2  &  192.168.1.5.   If	 a prefix length of 16 is used
	      then the bottom two bytes	will be	filled in with nodeid numbers.
	      IPv6 addresses must end in '::', the nodeid will be added	 after
	      the  two	colons	to make	the local IP address.  Only one	IP ad-
	      dress is currently supported in the  corosync.conf  file.	 Addi-
	      tional  IP  addresses  can be added in the ifup script if	neces-
	      sary.

       ipprefix
	      specifies	the IP address	prefix	for  the  nozzle  device  (see
	      above)

       macaddr
	      Specifies	 the  MAC address prefix for the nozzle	device.	As for
	      the IP address, the bottom part  of  the	MAC  address  will  be
	      filled  in with the node id. In this case	no prefix applies, the
	      bottom two bytes of the MAC address will always  be  overwritten
	      with  the	 node  id. So specifying macaddr: 54:54:12:24:12:12 on
	      nodeid  1	 will  result  in  it  having	a   MAC	  address   of
	      54:54:12:24:00:01

TO ADD A NEW NODE TO THE CLUSTER
       For  example to add a node with address 10.24.38.108 with nodeid	3. The
       node has	the name NEW (in DNS or	/etc/hosts) and	is not currently  run-
       ning corosync. The current corosync.conf	nodelist looks like this:

	      nodelist {
		  node {
		      nodeid: 1
		      ring0_addr: 10.24.38.101
		      name: node1
		  }
		  node {
		      nodeid: 2
		      ring0_addr: 10.24.38.102
		      name: node2

		  }
	      }

       Add  a  new  entry  for the node	below the existing nodes. Node entries
       don't have to be	in nodeid order, but it	will help keep	you  sane.  So
       the nodelist now	looks like this:

	      nodelist {
		  node {
		      nodeid: 1
		      ring0_addr: 10.24.38.101
		      name: node1
		  }
		  node {
		      nodeid: 2
		      ring0_addr: 10.24.38.102
		      name: node2

		  }
		  node {
		      nodeid: 3
		      ring0_addr: 10.24.38.108
		      name: NEW

		  }
	      }

       This  file must then be copied onto all three nodes -  the existing two
       nodes, and the new one.	On one of the existing	corosync  nodes,  tell
       corosync	to re-read the updated config file into	memory:

	      corosync-cfgtool -R

       This  command  only needs to be run on one node in the cluster. You may
       then start corosync on the NEW node and it should join the cluster.  If
       this doesn't work as expected then check	the communications between all
       three  nodes  is	 working,  and check the syslog	files on all nodes for
       more information. It's important	to note	that the key bit  of  informa-
       tion about a node failing to join might be on a different node than you
       expect.

TO REMOVE A NODE FROM THE CLUSTER
       This  is	the reverse procedure to 'Adding a node' above.	First you need
       to shut down the	node you will be removing from the cluster.

	      corosync-cfgtool -H

       Then delete the nodelist	stanza from corosync.conf and  finally	update
       corosync	on the remaining nodes by running

	      corosync-cfgtool -R

       on one of them.

ADDRESS	RESOLUTION
       corosync	 resolves  ringX_addr  names/IP	 addresses  using  the	getad-
       drinfo(3) call with respect of totem.ip_version setting.

       getaddrinfo() function uses a sophisticated algorithm to	sort node  ad-
       dresses	into  a	 preferred order and corosync always chooses the first
       address in that list of the required family.  As	such it	 is  essential
       that  your DNS or /etc/hosts files are correctly	configured so that all
       addresses for ringX appear on the same network (or are  reachable  with
       minimal	hops)  and  over the same IP protocol. If this is not the case
       then some nodes might not be able to join the cluster. It  is  possible
       to override the search order used by getaddrinfo() using	the configura-
       tion file /etc/gai.conf(5) if necessary,	but this is not	recommended.

       If there	is any doubt about the order of	addresses returned from	getad-
       drinfo()	then it	might be simpler to use	IP addresses (v4 or v6)	in the
       ringX_addr field.

FILES
       /etc/corosync/corosync.conf
	      The corosync executive configuration file.

SEE ALSO
       corosync_overview(7),  votequorum(5), corosync-qdevice(8), logrotate(8)
       getaddrinfo(3) gai.conf(5)

corosync Man Page		  2025-06-12		      COROSYNC_CONF(5)
Want to link to this manual page? Use this URL:
<https://man.freebsd.org/cgi/man.cgi?query=corosync.conf&sektion=5&manpath=FreeBSD+Ports+15.1.quarterly>
home | help
Header And Logo

Peripheral Links

Site Navigation

FreeBSD Manual Pages

Header And Logo

Peripheral Links

Search

Site Navigation

FreeBSD Manual Pages