COROSYNC_CONF(5)  Corosync Cluster Engine Programmer's Manual COROSYNC_CONF(5)

NAME
       corosync.conf - corosync	executive configuration	file

SYNOPSIS
       /etc/corosync/corosync.conf

DESCRIPTION
       The corosync.conf file supplies the parameters that control the
       corosync executive.  Empty lines and lines starting with the #
       character are ignored.  The configuration file consists of bracketed
       top level directives.  The possible directive choices are:

       totem { }
	      This top level directive contains	configuration options for  the
	      totem protocol.

       logging { }
	      This top level directive contains	configuration options for log-
	      ging.

       quorum {	}
	      This top level directive contains	configuration options for quo-
	      rum.

       nodelist	{ }
	      This  top	 level	directive  contains  configuration options for
	      nodes in cluster.

       system {	}
	      This top level directive contains	configuration options  related
	      to system.

       resources { }
	      This  top	level directive	contains configuration options for re-
	      sources.

       nozzle {	}
	      This top level directive contains	configuration  options	for  a
	      libnozzle	device.

       Corosync	 supports  multiple types of network transports	for communica-
       tion between the	nodes in the cluster. There are	three types of	trans-
       ports:

	      1.     KNET. This is the default and recommended transport,
		     introduced in Corosync 3. It provides several advantages
		     over the UDP and UDPU transports, including better
		     performance, link-level redundancy, automatic link
		     recovery, and native IP compression and encryption.

	      2.     UDPU.  This  is for unicast communication.	This transport
		     is	deprecated.

	      3.     UDP. This is for multicast communication. This transport
		     is deprecated and its use is strongly discouraged.

       The  interface  sub-directive  of  totem	 is  optional for UDP and KNET
       transports.

       For KNET, multiple interface subsections	 define	 parameters  for  each
       KNET link on the	system.

       For  UDPU an interface section is not needed and	it is recommended that
       the nodelist is used to define cluster nodes.

       linknumber
	      This specifies the link number for the  interface.   When	 using
	      the  KNET	 protocol, each	interface should specify separate link
	      numbers to uniquely identify to the  membership  protocol	 which
	      interface	 to  use for which link.  The linknumber must start at
	      0. For UDP the only supported linknumber is 0.

       knet_link_priority
	      This specifies the priority for the link when KNET  is  used  in
	      'passive'	mode. (see link_mode below)

       knet_ping_interval
	      This specifies the interval between KNET link pings.
	      knet_ping_interval and knet_ping_timeout are a pair: if one is
	      specified, the other should be too.  Otherwise one will be
	      calculated from the token timeout and the other will be taken
	      from the config file.  (default is token timeout /
	      (knet_pong_count*2))

       knet_ping_timeout
	      If no ping is received within this time, the KNET link is
	      declared dead.  knet_ping_interval and knet_ping_timeout are a
	      pair: if one is specified, the other should be too.  Otherwise
	      one will be calculated from the token timeout and the other will
	      be taken from the config file.  (default is token timeout /
	      knet_pong_count)

       knet_ping_precision
	      How many values of latency are used  to  calculate  the  average
	      link latency. (default 2048 samples)

       knet_pong_count
	      How many valid ping/pong exchanges are required before a link is
	      marked UP. (default 2)
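
	      The two default expressions quoted above for knet_ping_interval
	      and knet_ping_timeout can be worked through as a small sketch.
	      The function name is hypothetical, and whether corosync rounds
	      or truncates the result is not stated here, so plain division is
	      an assumption.

```python
def knet_ping_defaults(token_ms, knet_pong_count=2):
    """Return (knet_ping_interval, knet_ping_timeout) in milliseconds.

    Hypothetical helper: it only evaluates the two default expressions
    quoted in the text and is not corosync's own code.
    """
    if knet_pong_count <= 0:
        raise ValueError("knet_pong_count must be positive")
    interval = token_ms / (knet_pong_count * 2)  # token / (pong_count * 2)
    timeout = token_ms / knet_pong_count         # token / pong_count
    return interval, timeout


if __name__ == "__main__":
    # Token timeout of 3000 ms with the default knet_pong_count of 2.
    print(knet_ping_defaults(3000))
```

	      With a 3000 ms token timeout and knet_pong_count of 2 this
	      gives an interval of 750 ms and a timeout of 1500 ms.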

       knet_transport
	      Which IP transport KNET should use. Valid values are "sctp" or
	      "udp". (default: udp)

       bindnetaddr (UDP	only)
	      This specifies the network address the corosync executive	should
	      bind to when using UDP transport.

	      bindnetaddr should be an IP address configured on the system, or
	      a network address.

	      For example, if the local	interface is 192.168.5.92 with netmask
	      255.255.255.0,  you  should  set	bindnetaddr to 192.168.5.92 or
	      192.168.5.0.  If the local interface is 192.168.5.92  with  net-
	      mask   255.255.255.192,	set  bindnetaddr  to  192.168.5.92  or
	      192.168.5.64, and	so forth.

	      This may also be an IPv6 address, in which case IPv6 networking
	      will be used.  In this case, the exact address must be specified
	      and there is no automatic selection of the network interface
	      within a specific subnet as with IPv4.

	      If IPv6 networking is used, the nodeid field in nodelist must be
	      specified.
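
	      The network addresses in the IPv4 examples above can be derived
	      with Python's standard ipaddress module.  This is only a sketch
	      for checking a candidate bindnetaddr value; the helper name is
	      hypothetical.

```python
import ipaddress


def network_address(ip, netmask):
    """Return the network address for an IPv4 interface address and netmask."""
    iface = ipaddress.ip_interface(f"{ip}/{netmask}")
    return str(iface.network.network_address)


if __name__ == "__main__":
    print(network_address("192.168.5.92", "255.255.255.0"))    # 192.168.5.0
    print(network_address("192.168.5.92", "255.255.255.192"))  # 192.168.5.64
```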

       broadcast (UDP only)
	      This is optional and can be set to yes.  If it is	 set  to  yes,
	      the  broadcast  address will be used for communication.  If this
	      option is	set, mcastaddr should not be set.

       mcastaddr (UDP only)
	      This is the multicast address used by the corosync executive.
	      The default should work for most networks, but consult the
	      network administrator about which multicast address to use.
	      Avoid 224.x.x.x because this is a "config" multicast address.

	      This may also be an IPv6 multicast address, in which case IPv6
	      networking will be used.  If IPv6 networking is used, the nodeid
	      field in nodelist must be specified.

	      It's not necessary to use this option if the cluster_name option
	      is used. If both options are used, mcastaddr has higher
	      priority.

       mcastport
	      This specifies the UDP port number.  It is possible to use the
	      same multicast address on a network with the corosync services
	      configured for different UDP ports.  Please note that corosync
	      uses two UDP ports: mcastport (for mcast receives) and
	      mcastport - 1 (for mcast sends).  If you have multiple clusters
	      on the same network using the same mcastaddr, please configure
	      the mcastports with a gap.

	      The default is 5405.
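
	      Because each cluster occupies both mcastport and mcastport - 1,
	      two clusters on the same mcastaddr collide when those port
	      pairs overlap.  A minimal sketch of that check, with
	      hypothetical helper names:

```python
def udp_ports(mcastport=5405):
    """Ports used by one cluster: mcastport (receives) and mcastport - 1 (sends)."""
    return {mcastport - 1, mcastport}


def ports_conflict(mcastport_a, mcastport_b):
    """True if two clusters sharing one mcastaddr would share a UDP port."""
    return bool(udp_ports(mcastport_a) & udp_ports(mcastport_b))


if __name__ == "__main__":
    print(ports_conflict(5405, 5406))  # adjacent values share port 5405
    print(ports_conflict(5405, 5407))  # a gap of two keeps the pairs apart
```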

       ttl (UDP	only)
	      This specifies the Time To Live (TTL). If	you run	 your  cluster
	      on  a  routed network then the default of	"1" will be too	small.
	      This option provides a way to increase this up to	255. The valid
	      range is 0..255.

       Within the totem directive, there are seven configuration options, of
       which one is required, five are optional, and one is required when IPv6
       is configured in the interface subdirective.  The required option
       controls the version of the totem configuration.  The option that is
       required only with IPv6 controls identification of the processor.
       The optional options control secrecy and authentication, the network
       mode of operation and the maximum network MTU field.

       version
	      This specifies the version of the	configuration file.  Currently
	      the only valid version for this directive	is 2.

       clear_node_high_bit
	      This configuration option is optional and is only relevant when
	      no nodeid is specified.  Some corosync clients require a signed
	      32 bit nodeid that is greater than zero; however, by default
	      corosync uses all 32 bits of the IPv4 address space when
	      generating a nodeid.  Set this option to yes to force the high
	      bit to be zero and therefore ensure the nodeid is a positive
	      signed 32 bit integer.

	      WARNING: Cluster behavior	is undefined if	this option is enabled
	      on  only	a  subset of the cluster (for example during a rolling
	      upgrade).

       crypto_model
	      This specifies which cryptographic library  should  be  used  by
	      KNET.   Supported	 values	depend on the libknet build and	on the
	      installed	cryptography libraries.	Typically nss and openssl will
	      be available but gcrypt and others could also be allowed.

	      The default is nss.

       crypto_hash
	      This specifies which HMAC	authentication should be used  to  au-
	      thenticate  all  messages. Valid values are none (no authentica-
	      tion), md5, sha1,	sha256,	sha384 and sha512. Encrypted transmis-
	      sion is only supported for the KNET transport.

	      The default is none.

       crypto_cipher
	      This specifies which cipher should be used to encrypt all
	      messages.  Valid values are none (no encryption), aes256, aes192
	      and aes128.  Enabling crypto_cipher also requires enabling
	      crypto_hash.  Encrypted transmission is only supported for the
	      KNET transport.

	      The default is none.

       secauth
	      This implies crypto_cipher=aes256	and crypto_hash=sha256,	unless
	      those options are	explicitly set.	Encrypted transmission is only
	      supported	for the	KNET transport.

	      The default is off.

       keyfile
	      This specifies the fully qualified path to the shared  key  used
	      to authenticate and encrypt data used within the Totem protocol.

	      The default is /etc/corosync/authkey.

       key    Shared key stored in the configuration instead of the authkey
	      file.  This option has lower precedence than the keyfile option,
	      so it is used only when keyfile is not specified.  Using this
	      option is not recommended for security reasons.

       link_mode
	      This specifies the Kronosnet mode, which may be passive, active,
	      or rr (round-robin).

	      passive: The active link with the highest priority (highest
	      number) will be used.  If two or more links share the same
	      priority, the one with the lowest link ID will be used.

	      active: All active links will be used simultaneously to send
	      traffic.  Link priority is ignored.

	      rr: Round-robin policy.  Each packet will be sent to the next
	      active link in order.

	      If only one interface directive is specified, passive  is	 auto-
	      matically	chosen.

	      The  maximum number of interface directives that is allowed with
	      Kronosnet	is 8. For other	transports it is 1.
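
	      The passive selection rule (highest priority wins, ties go to
	      the lowest link ID) can be sketched as follows.  The tuple
	      layout and function name are assumptions made for illustration
	      and do not reflect the Kronosnet API.

```python
def pick_passive_link(links):
    """Pick the link used in passive mode.

    links: iterable of (link_id, priority, is_active) tuples (hypothetical
    layout).  Returns the chosen link_id, or None if no link is active.
    """
    active = [(link_id, prio) for link_id, prio, up in links if up]
    if not active:
        return None
    # Highest priority first; among equal priorities the lowest link ID wins.
    best = min(active, key=lambda link: (-link[1], link[0]))
    return best[0]


if __name__ == "__main__":
    links = [(0, 1, True), (1, 5, True), (2, 5, True)]
    print(pick_passive_link(links))  # link 1: priority 5, lower ID than link 2
```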

       netmtu This specifies maximum packet  length  sent  by  corosync.  It's
	      mainly  for the UDPU (and	UDP) transport,	where it specifies the
	      network maximum transmit size, but can be	 used  also  with  the
	      KNET  transport,	where it defines the maximum length of packets
	      passed to	the KNET layer.	To specify the	network	 MTU  manually
	      for KNET,	use the	knet_mtu option.

	      For UDPU (and UDP), setting this value beyond 1500, the regular
	      frame MTU, requires ethernet devices that support large (also
	      called jumbo) frames.  If any device in the network doesn't
	      support large frames, the protocol will not operate properly.
	      The hosts must also have their MTU size raised from 1500 to
	      whatever frame size is specified here.

	      Please note that while some NICs or switches claim large frame
	      support, they support 9000 MTU as the maximum frame size
	      including the IP header.  Setting the netmtu and host MTUs to
	      9000 will cause totem to use the full 9000 bytes of the frame.
	      Linux will then add an 18 byte header, moving the full frame
	      size to 9018.  As a result some hardware will not operate
	      properly with this size of data.  A netmtu of 8982 seems to work
	      for the few large frame devices that have been tested.  Some
	      manufacturers claim large frame support when in fact they
	      support frame sizes of 4500 bytes.

	      When sending multicast traffic, if the network frequently	recon-
	      figures, chances are that	some device  in	 the  network  doesn't
	      support large frames.

	      Choose  hardware	carefully if intending to use large frame sup-
	      port.

	      The default is 1500 for UDPU (and	UDP) and 65536	for  the  KNET
	      transport.

       transport
	      This  directive  controls	the transport mechanism	used.  The de-
	      fault is knet (for KNET).	 The transport type can	also be	set to
	      udpu (for	UDPU) or udp (for UDP).	Only  KNET  allows  crypto  or
	      multiple interfaces per node.

       cluster_name
	      This specifies the name of the cluster.  It is used for automatic
	      generation of the multicast address.

       config_version
	      This specifies the version of the config file.  The value is
	      converted to an unsigned 64-bit integer.  The default is 0.
	      This option is used to prevent nodes with an out-of-date
	      configuration from joining.  If the value is not 0 and the node
	      is going for the first time from single-node membership to
	      multiple-node membership (only the first time; a join after a
	      split does not follow this rule), the config_versions of the
	      other nodes are collected.  If the current node's config_version
	      is not equal to the highest of the collected versions, corosync
	      is terminated.

       ip_version
	      This specifies the version of IP to ask the DNS resolver for.
	      The value can be one of ipv4 (look only for an IPv4 address),
	      ipv6 (look only for an IPv6 address), ipv4-6 (look for all
	      address families and use the first IPv4 address found in the
	      list if there is one, otherwise use the first IPv6 address) and
	      ipv6-4 (look for all address families and use the first IPv6
	      address found in the list if there is one, otherwise use the
	      first IPv4 address).

	      Default (if unspecified) is ipv6-4 for KNET and UDPU  transports
	      and ipv4 for UDP transport.

	      The  KNET	 transport  supports  IPv4  and	IPv6 addresses concur-
	      rently, provided they are	consistent on each link.
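
	      The four ip_version policies amount to a preference order over
	      the addresses returned by the resolver.  The sketch below
	      applies that order to an already resolved list; the function
	      name and the use of plain address strings are assumptions, and
	      no DNS lookup is performed.

```python
import ipaddress


def choose_address(addresses, ip_version="ipv6-4"):
    """Pick one address from a resolved list according to ip_version.

    addresses: address strings in resolver order (hypothetical input).
    Returns the chosen address string, or None if nothing qualifies.
    """
    parsed = [ipaddress.ip_address(a) for a in addresses]
    v4 = [a for a in parsed if a.version == 4]
    v6 = [a for a in parsed if a.version == 6]
    preference = {
        "ipv4": [v4],
        "ipv6": [v6],
        "ipv4-6": [v4, v6],
        "ipv6-4": [v6, v4],
    }[ip_version]
    for family in preference:
        if family:
            return str(family[0])  # first address of the preferred family
    return None


if __name__ == "__main__":
    resolved = ["192.0.2.10", "2001:db8::10"]
    print(choose_address(resolved, "ipv6-4"))
    print(choose_address(resolved, "ipv4-6"))
```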

       Within the totem directive, there are several configuration options
       which are used to control the operation of the protocol.  It is
       generally not recommended to change any of these values without proper
       guidance and sufficient testing.  Some networks may require larger
       values if suffering from frequent reconfigurations.  Some applications
       may require faster failure detection times which can be achieved by
       reducing the token timeout.

       token  This timeout is used directly or as a base for the real token
	      timeout calculation (explained in the token_coefficient
	      section).  The token timeout specifies the time in milliseconds
	      after which a token loss is declared if no token has been
	      received.  This is the time spent detecting a failure of a
	      processor in the current configuration.  Reforming a new
	      configuration takes about 50 milliseconds in addition to this
	      timeout.

	      The real token timeout used by totem can be read from the cmap
	      key runtime.config.totem.token.

	      Be careful to use	the same timeout values	on each	of  the	 nodes
	      in the cluster or	unpredictable results may occur.

	      The default is 3000 milliseconds.

       token_warning
	      Specifies	 the  interval between warnings	that the token has not
	      been received.  The value	is a percentage	of the	token  timeout
	      and can be set to	0 to disable warnings.

	      The default is 75%.

       token_coefficient
	      This value is used only when the nodelist section is specified
	      and contains at least 3 nodes.  If so, the real token timeout is
	      computed as token + (number_of_nodes - 2) * token_coefficient.
	      This allows the cluster to scale without manually changing the
	      token timeout every time a new node is added.  Setting this
	      value to 0 effectively disables this feature.

	      The default is 650 milliseconds.
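
	      The real token timeout formula can be evaluated directly.  The
	      sketch below uses the documented defaults (token 3000,
	      token_coefficient 650); the function and parameter names are
	      hypothetical.

```python
def real_token_timeout(token_ms=3000, token_coefficient_ms=650, nodelist_size=0):
    """Real token timeout in milliseconds.

    The coefficient applies only when a nodelist with at least 3 nodes is
    configured; otherwise the token value is used as is.
    """
    if nodelist_size >= 3:
        return token_ms + (nodelist_size - 2) * token_coefficient_ms
    return token_ms


if __name__ == "__main__":
    for nodes in (2, 3, 5):
        print(nodes, real_token_timeout(nodelist_size=nodes))
```

	      With the defaults, a 3 node cluster gets 3650 ms and a 5 node
	      cluster gets 4950 ms, while a 2 node cluster keeps 3000 ms.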

       token_retransmit
	      This timeout specifies in milliseconds how long to wait without
	      receiving a token before the token is retransmitted.  This will
	      be automatically calculated if token is modified.  It is not
	      recommended to alter this value without guidance from the
	      corosync community.

	      The minimum is 30 milliseconds.  If this option is not set and
	      an error occurs, make sure token /
	      (token_retransmits_before_loss_const + 0.2) is more than 30.

	      The default is 238 milliseconds for a two-node cluster.  For
	      three or more nodes, see token_coefficient.
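
	      The expression quoted above can be checked before starting
	      corosync.  This sketch only evaluates that expression against
	      the 30 ms minimum; the helper names are hypothetical, and it is
	      an assumption that the token value to use is the one in effect
	      for the cluster.

```python
MIN_TOKEN_RETRANSMIT_MS = 30


def retransmit_expression(token_ms, retransmits_before_loss_const=4):
    """Evaluate token / (token_retransmits_before_loss_const + 0.2)."""
    return token_ms / (retransmits_before_loss_const + 0.2)


def passes_minimum(token_ms, retransmits_before_loss_const=4):
    """True if the expression is more than the documented 30 ms minimum."""
    value = retransmit_expression(token_ms, retransmits_before_loss_const)
    return value > MIN_TOKEN_RETRANSMIT_MS


if __name__ == "__main__":
    for token in (100, 1000):
        print(token, round(retransmit_expression(token), 1), passes_minimum(token))
```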

       knet_compression_model
	      Type  of	compression used by Kronosnet. Supported values	depend
	      on the libknet build and on the installed	compression libraries.
	      Typically	zlib and lz4 will be available but  bzip2  and	others
	      could also be allowed. The default is 'none'.

       knet_compression_threshold
	      Tells KNET to NOT	compress any packets that are smaller than the
	      value indicated. Default 100 bytes.

	      Set  to  0 to reset to the default.  Set to 1 to compress	every-
	      thing.
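
	      The threshold rule reads as a simple size comparison.  A sketch
	      under the assumption that a compression model other than none
	      is configured; the names are hypothetical.

```python
DEFAULT_THRESHOLD_BYTES = 100


def effective_threshold(configured):
    """A configured value of 0 resets the threshold to the default."""
    return DEFAULT_THRESHOLD_BYTES if configured == 0 else configured


def should_compress(packet_len, configured_threshold=0):
    """Packets smaller than the threshold are not compressed."""
    return packet_len >= effective_threshold(configured_threshold)


if __name__ == "__main__":
    print(should_compress(99))     # below the default threshold of 100
    print(should_compress(100))    # at the threshold
    print(should_compress(1, 1))   # a threshold of 1 compresses everything
```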

       knet_compression_level
	      Many compression libraries allow tuning of  compression  parame-
	      ters.  For  example  0 or	1 ... 9	are commonly used to determine
	      the level	of compression.	This value is passed unmodified	to the
	      compression library so it	is  recommended	 to  consult  the  li-
	      brary's documentation for	more detailed information.

       hold   This timeout specifies in	milliseconds how long the token	should
	      be  held	by  the	 representative	when the protocol is under low
	      utilization.   It	is not recommended to alter this value without
	      guidance from the	corosync community.

	      The default is 180 milliseconds.

       token_retransmits_before_loss_const
	      This value identifies how	many token retransmits should  be  at-
	      tempted  before forming a	new configuration. It is also used for
	      token_retransmit and hold	calculations.

	      The default is 4 retransmissions.

       join   This timeout specifies in	milliseconds how long to wait for join
	      messages in the membership protocol.

	      The default is 50	milliseconds.

       send_join
	      This timeout specifies in milliseconds the upper bound of the
	      range (0 to send_join) to wait before sending a join message.
	      For configurations with fewer than 32 nodes, this parameter is
	      not necessary.  For larger rings, this parameter is necessary to
	      ensure the NIC is not overflowed with join messages on formation
	      of a new ring.  A reasonable value for large rings (128 nodes)
	      would be 80 msec.  Other timer values must also change if this
	      value is changed.  Seek advice from the corosync mailing list if
	      trying to run larger configurations.

	      The default is 0 milliseconds.

       consensus
	      This timeout specifies in	milliseconds how long to wait for con-
	      sensus to	be achieved before starting a new round	of  membership
	      configuration.   The  minimum  value for consensus must be 1.2 *
	      token.  This value will be automatically calculated at 1.2 * to-
	      ken if the user doesn't specify a	consensus value.

	      For two node clusters, a consensus larger	than the join  timeout
	      but less than token is safe.  For	three node or larger clusters,
	      consensus	 should	 be larger than	token.	There is an increasing
	      risk of odd membership changes, which  still  guarantee  virtual
	      synchrony,  as node count	grows if consensus is less than	token.

	      The default is 3600 milliseconds.
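
	      The automatic consensus value is 1.2 * token.  The sketch below
	      uses integer arithmetic to avoid floating point rounding, which
	      is a choice made for this illustration and not necessarily how
	      corosync computes it; the function name is hypothetical.

```python
def default_consensus(token_ms):
    """Automatic consensus timeout in milliseconds: 1.2 * token."""
    return token_ms * 6 // 5  # 6/5 == 1.2, kept in integers


if __name__ == "__main__":
    print(default_consensus(3000))  # 3600, matching the documented default
```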

       merge  This  timeout  specifies in milliseconds how long	to wait	before
	      checking for a partition when  no	 multicast  traffic  is	 being
	      sent.   If  multicast traffic is being sent, the merge detection
	      happens automatically as a function of the protocol.

	      The default is 200 milliseconds.

       downcheck
	      This timeout specifies in	milliseconds how long to  wait	before
	      checking	that  a	network	interface is back up after it has been
	      downed.

	      The default is 1000 milliseconds.

       fail_recv_const
	      This constant specifies how many rotations of the	token  without
	      receiving	 any  of the messages when messages should be received
	      may occur	before a new configuration is formed.

	      The default is 2500 failures to receive a	message.

       seqno_unchanged_const
	      This constant specifies how many rotations of the	token  without
	      any  multicast  traffic  should  occur  before the hold timer is
	      started.

	      The default is 30	rotations.

       heartbeat_failures_allowed
	      [HeartBeating mechanism] Configures  the	optional  HeartBeating
	      mechanism	for faster failure detection. Keep in mind that	engag-
	      ing this mechanism in lossy networks could cause faulty loss de-
	      claration	 as the	mechanism relies on the	network	for heartbeat-
	      ing.

	      As a rule of thumb, use this mechanism if you require improved
	      failure detection in low to medium utilized networks.

	      This constant specifies the number of heartbeat failures the
	      system should tolerate before declaring heartbeat failure, e.g.
	      3.  If this value is not set or is 0, the heartbeat mechanism is
	      not engaged in the system and token rotation is the method of
	      failure detection.

	      The default is 0 (disabled).

       max_network_delay
	      [HeartBeating mechanism] This constant specifies in milliseconds
	      the approximate delay that your network takes to transport one
	      packet from one machine to another.  This value is to be set by
	      system engineers; do not change it if you are unsure, as it
	      affects the failure detection mechanism using heartbeat.

	      The default is 50	milliseconds.

       window_size
	      This constant specifies the maximum number of messages that may
	      be sent on one token rotation.  If all processors perform
	      equally well, this value could be large (300), which would
	      introduce higher latency from origination to delivery for very
	      large rings.  To reduce latency in large rings (16+), the
	      defaults are a safe compromise.  If one or more slow processors
	      are present among fast processors, window_size should be no
	      larger than 256000 / netmtu to avoid overflow of the kernel
	      receive buffers.  The user is notified of this by the display of
	      a retransmit list in the notification logs.  There is no loss of
	      data, but performance is reduced when these errors occur.

	      The default is 50	messages.

       max_messages
	      This constant specifies the maximum number of messages that  may
	      be  sent by one processor	on receipt of the token.  The max_mes-
	      sages parameter is limited to 256000 / netmtu to	prevent	 over-
	      flow of the kernel transmit buffers.

	      The default is 17	messages.
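
	      Both window_size and max_messages are bounded by 256000 /
	      netmtu.  A sketch of that bound, assuming the result is rounded
	      down to a whole number of messages; the function name is
	      hypothetical.

```python
def message_limit(netmtu=1500):
    """Upper bound on messages derived from 256000 / netmtu, rounded down."""
    if netmtu <= 0:
        raise ValueError("netmtu must be positive")
    return 256000 // netmtu


if __name__ == "__main__":
    print(message_limit(1500))  # 170
    print(message_limit(8982))  # 28
```

	      The bound shrinks as netmtu grows, so raising netmtu for jumbo
	      frames may require lowering these two values.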

       miss_count_const
	      This  constant defines the maximum number	of times on receipt of
	      a	token a	message	is checked for	retransmission	before	a  re-
	      transmission  occurs.   This  parameter  is useful to modify for
	      switches that delay multicast packets compared to	unicast	 pack-
	      ets.   The  default  setting  works  well	 for nearly all	modern
	      switches.

	      The default is 5 messages.

       knet_pmtud_interval
	      How often	the KNET PMTUd runs to look for	network	 MTU  changes.
	      Value in seconds,	default: 30

       knet_mtu
	      Switch  between manual and automatic MTU discovery. A value of 0
	      means automatic, other values set	a manual MTU.  In a setup with
	      multiple interfaces, please specify the lowest MTU  of  the  se-
	      lected interfaces.

	      The default value	is 0.

       block_unlisted_ips
	      Allow  UDPU  and KNET to drop packets from IP addresses that are
	      not known	(nodes which don't exist in the	nodelist) to corosync.
	      Value is yes or no.

	      This feature is mainly to	protect	against	the joining  of	 nodes
	      with outdated configurations after a cluster split.  Another use
	      case is to allow the atomic merge	of two independent clusters.

	      Changing the default value is not recommended: the overhead is
	      tiny and an existing cluster may fail if corosync is started on
	      an unlisted node with an old configuration.

	      The default value	is yes.

       cancel_token_hold_on_retransmit
	      Allows Corosync to hold the token by the representative when
	      there are too many retransmit messages.  This allows the network
	      to process increased load without being overloaded.  The
	      mechanism used is the same as described for the hold directive.

	      Some deployments may prefer to never hold the token when there
	      are retransmit messages.  If so, this option should be set to
	      yes.

	      The default value	is no.

       Within the logging directive, there are several	configuration  options
       which are all optional.

       The following options are valid only for the top level logging
       directive:

       timestamp
	      This  specifies  that a timestamp	is placed on all log messages.
	      It can be	one of off (no timestamp), on (second precision	 time-
	      stamp)  or  hires	 (millisecond  precision timestamp - only when
	      supported	by LibQB).

	      The default is hires (or on if hires is not supported).

       fileline
	      This specifies that file and line	should be printed.

	      The default is off.

       function_name
	      This specifies that the code function name should	be printed.

	      The default is off.

       blackbox
	      This specifies that blackbox functionality should	be enabled.

	      The default is on.

       The following options are valid for the top level logging directive
       and can be overridden in logger_subsys entries.

       to_stderr

       to_logfile

       to_syslog
	      These specify the	destination of logging output. Any combination
	      of these options may be specified. Valid options are yes and no.

	      The default is syslog and	stderr.

	      Please note, if you are using to_logfile and want to rotate the
	      file, use logrotate(8) with the option copytruncate, e.g.:
	      /var/log/corosync.log {
		   missingok
		   compress
		   notifempty
		   daily
		   rotate 7
		   copytruncate
	      }

       logfile
	      If the to_logfile directive is set to yes, this option specifies
	      the pathname of the log file.

	      No default.

       logfile_priority
	      This  specifies the logfile priority for this particular subsys-
	      tem. Ignored if debug is on.  Possible values are: alert,	 crit,
	      debug (same as debug = on), emerg, err, info, notice, warning.

	      The default is: info.

       syslog_facility
	      This specifies the syslog facility type that will be used for
	      any messages sent to syslog.  Options are daemon, local0,
	      local1, local2, local3, local4, local5, local6 and local7.

	      The default is daemon.

       syslog_priority
	      This specifies the syslog	level for this	particular  subsystem.
	      Ignored if debug is on.  Possible	values are: alert, crit, debug
	      (same as debug = on), emerg, err,	info, notice, warning.

	      The default is: info.

       debug  This specifies whether debug output is logged for this
	      particular logger.  It can also be set to trace, which is the
	      highest level of debug information.

	      The default is off.

       Within the logging directive, logger_subsys directives are optional.

       Within  the  logger_subsys sub-directive, all of	the above logging con-
       figuration options are valid and	can be used to	override  the  default
       settings.   The subsys entry, described below, is mandatory to identify
       the subsystem.

       subsys This specifies the subsystem identity (name) for	which  logging
	      is  specified.  This  is	the  name  used	 by  a	service	in the
	      log_init() call. E.g. 'CPG'. This	directive is required.

       Within the quorum directive it is possible to specify the  quorum  con-
       figuration options. The following option	is required to activate	quorum
       service:

       provider
	      This specifies the algorithm to use.  At the time of writing,
	      only corosync_votequorum is supported.  See votequorum(5) for
	      configuration options.

       Within the nodelist directive it is possible to specify specific
       information about the nodes in the cluster.  The directive can contain
       only the node sub-directive, which specifies every node that should be
       a member of the membership, and where non-default options are needed.
       Every node must have at least the ring0_addr field filled.

       Every node that should be a member of the membership must be specified.

       Possible	options	are:

       ringX_addr
	      This specifies the IP address or network hostname of the
	      particular node.  X is a link number.

       nodeid This configuration option	is required for	each node for  Kronos-
	      net  mode.   It is a 32 bit value	specifying the node identifier
	      delivered	to the cluster membership service. The node identifier
	      value of zero is reserved	and should not be  used.  If  KNET  is
	      set, this	field must be set.

       name   This option is used mainly with the KNET transport to identify
	      the local node.  It's also used by client software (pacemaker).
	      The algorithm for identifying the local node is as follows:

	      1.     Look up $HOSTNAME in the nodelist.

	      2.     If this fails, strip the domain name from $HOSTNAME and
		     look that up in the nodelist.

	      3.     If this fails, look in the nodelist for a fully-qualified
		     name whose short version matches the short version of
		     $HOSTNAME.

	      4.     If all this fails, search the interfaces list for an
		     address that matches a name in the nodelist.
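
	      The four lookup steps can be sketched as a single function.
	      This is an illustration only and not corosync's code: the
	      function name and arguments are hypothetical, and step 4 is
	      read here as matching a local interface address against a
	      nodelist name written as a literal address, which is an
	      assumption.

```python
def find_local_node(hostname, node_names, local_addresses=()):
    """Return the nodelist name identifying the local node, or None."""
    # Step 1: exact match on the host name.
    if hostname in node_names:
        return hostname
    # Step 2: strip the domain name and try again.
    short = hostname.split(".", 1)[0]
    if short in node_names:
        return short
    # Step 3: a fully-qualified nodelist name whose short form matches.
    for name in node_names:
        if "." in name and name.split(".", 1)[0] == short:
            return name
    # Step 4: a local interface address that appears as a nodelist name.
    for address in local_addresses:
        if address in node_names:
            return address
    return None


if __name__ == "__main__":
    names = ["node1.example.com", "node2.example.com"]
    print(find_local_node("node1", names))
```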

       Within the system directive it is possible to specify system options.

       Possible	options	are:

       qb_ipc_type
	      This specifies the type of IPC to use.  Can be one of native
	      (default), shm and socket.  Native means one of shm or socket,
	      depending on what is supported by the OS.  On systems with
	      support for both, SHM is selected.  SHM is generally faster, but
	      needs to allocate a ring buffer file in /dev/shm.

       sched_rr
	      Should be set to yes (default) if corosync should try to set
	      round robin realtime scheduling with maximal priority for
	      itself.  If setting the scheduler fails, corosync falls back to
	      setting maximal priority.

       priority
	      Set the priority of the corosync process.  Valid only when
	      sched_rr is set to no.  Can be either a numeric value with a
	      meaning similar to nice(1), or max / min meaning maximal /
	      minimal priority (so minimal / maximal nice value).

       move_to_root_cgroup
	      Can be one of yes	(Corosync always moves itself to root cgroup),
	      no  (Corosync never tries	to move	itself to root cgroup) or auto
	      (Corosync	first checks if	sched_rr is enabled,  and  if  so,  it
	      tries to set round robin realtime	scheduling with	maximal	prior-
	      ity  to itself.  If setting of priority fails, corosync tries to
	      move itself to root cgroup and retries setting of	priority).

	      This feature is available	only for systems with cgroups v1  with
	      RT  sched	 enabled  (Linux with CONFIG_RT_GROUP_SCHED kernel op-
	      tion) and	cgroups	v2.

	      It's worth noting that currently (May 3 2021) cgroup2 doesn't
	      yet support control of realtime processes, and the cpu
	      controller can only be enabled when all RT processes are in the
	      root cgroup (this applies only to kernels with
	      CONFIG_RT_GROUP_SCHED enabled). So when move_to_root_cgroup is
	      disabled, the kernel is compiled with CONFIG_RT_GROUP_SCHED and
	      systemd is used, it may be impossible to make systemd options
	      like CPUQuota work correctly until corosync is stopped.

	      Also, when moving to the root cgroup is enforced and used
	      together with cgroup2 and systemd, it becomes impossible (most
	      of the time) for journald to add systemd specific metadata (most
	      importantly _SYSTEMD_UNIT) properly, because corosync is moved
	      out of the cgroup created by systemd. This means it is not
	      possible to filter corosync log messages based on this metadata
	      (for example using journalctl -u or a _SYSTEMD_UNIT=UNIT
	      pattern), and systemctl status doesn't display (all) corosync
	      log messages. The problem is harder to spot because journald
	      caches the pid for some time (approx. 5 sec), so the initial
	      corosync messages still carry correct metadata.

       allow_knet_handle_fallback
	      If KNET handle creation fails using privileged operations, allow
	      fallback to creating KNET	handle using unprivileged  operations.
	      Defaults	to  no,	 meaning  if  privileged  KNET handle creation
	      fails, corosync will refuse to start.

	      The KNET handle will always be created using privileged
	      operations if possible; setting this to yes only allows fallback
	      to unprivileged operations. This fallback may result in
	      performance issues, but it may be required when running in an
	      unprivileged environment, e.g. as a normal user or in an
	      unprivileged container.

       state_dir
	      Existing directory where corosync	should	chdir  into.  Corosync
	      stores important state files and blackboxes there.

	      The default is /var/lib/corosync.
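
       As an illustration, a system stanza combining several of the options
       above might look like the following sketch. The values are chosen only
       to show the syntax and are not recommendations; state_dir is shown
       with its default.

	      system {
		  # Use shared-memory IPC rather than letting corosync choose
		  qb_ipc_type: shm
		  # Do not request round robin realtime scheduling ...
		  sched_rr: no
		  # ... so that the priority option below takes effect
		  priority: max
		  # Permit an unprivileged KNET handle, e.g. in a container
		  allow_knet_handle_fallback: yes
		  state_dir: /var/lib/corosync
	      }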

       Within  the  resources  directive it is possible	to specify options for
       resources.

       The only possible option is:

       watchdog_device
	      (Valid only if Corosync was compiled with	watchdog support.)
	      Watchdog device to use, for example  /dev/watchdog.   If	unset,
	      empty or "off", no watchdog is used.

	      In  a  cluster with properly configured power fencing a watchdog
	      provides no additional value.  On	the other hand,	slow  watchdog
	      communication may	incur multi-second delays in the Corosync main
	      loop,  potentially breaking down membership.  IPMI watchdogs are
	      particularly  notorious  in  this	 regard:   read	  about	  kip-
	      mid_max_busy_us in IPMI.txt in the Linux kernel documentation.
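
       A minimal sketch of a resources stanza that enables the watchdog,
       assuming Corosync was compiled with watchdog support and that
       /dev/watchdog exists on the node:

	      resources {
		  # Leave unset, empty or "off" to run without a watchdog
		  watchdog_device: /dev/watchdog
	      }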

       Within  the  nozzle  directive  it is possible to specify options for a
       libnozzle device. This is a pseudo ethernet device that routes  network
       traffic	through	a channel on the corosync KNET network (NOT cpg	or any
       corosync	internal service) to other nodes in the	cluster.  This	allows
       applications  to	 take advantage	of KNET	features such as multipathing,
       automatic failover, link	switching etc. Note that libnozzle  is	not  a
       reliable	transport, but you can tunnel TCP through it for reliable com-
       munications.

       libnozzle also supports optional interface up/down scripts that are
       kept under the /etc/corosync/updown.d/ directory. See the KNET
       documentation for more information.

       Only one nozzle device is allowed.

       The nozzle stanza takes several options:

       name   The name of the network device to be created. On Linux this may
	      be any name at all; other platforms have restrictions on the
	      name.

       ipaddr The IP address (IPv6 or IPv4) of the interface. The bottom part
	      of this address will be replaced by the local node's nodeid, in
	      conjunction with ipprefix. For example, ipaddr: 192.168.1.0 with
	      ipprefix: 24 will make nodeids 1, 2 and 5 use IP addresses
	      192.168.1.1, 192.168.1.2 and 192.168.1.5. If a prefix length of
	      16 is used then the bottom two bytes will be filled in with the
	      nodeid. IPv6 addresses must end in '::'; the nodeid will be
	      added after the two colons to make the local IP address. Only
	      one IP address is currently supported in the corosync.conf file.
	      Additional IP addresses can be added in the ifup script if
	      necessary.

       ipprefix
	      Specifies the IP address prefix length for the nozzle device
	      (see ipaddr above).

       macaddr
	      Specifies the MAC address prefix for the nozzle device. As for
	      the IP address, the bottom part of the MAC address will be
	      filled in with the node id. In this case no prefix length
	      applies: the bottom two bytes of the MAC address will always be
	      overwritten with the node id. So specifying macaddr:
	      54:54:12:24:12:12 on nodeid 1 will result in it having a MAC
	      address of 54:54:12:24:00:01.
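
       Putting the options above together, a nozzle stanza might look like
       the following sketch. The device name noz01 is a hypothetical example;
       the address values are the ones used in the descriptions above.

	      nozzle {
		  # Hypothetical device name, chosen for illustration
		  name: noz01
		  # With ipprefix 24, nodeid 1 gets 192.168.1.1
		  ipaddr: 192.168.1.0
		  ipprefix: 24
		  # Bottom two bytes are overwritten with the node id,
		  # so nodeid 1 gets 54:54:12:24:00:01
		  macaddr: 54:54:12:24:12:12
	      }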

TO ADD A NEW NODE TO THE CLUSTER
       For example, suppose you want to add a node with address 10.24.38.108
       and nodeid 3. The node has the name NEW (in DNS or /etc/hosts) and is
       not currently running corosync. The current corosync.conf nodelist
       looks like this:

	      nodelist {
		  node {
		      nodeid: 1
		      ring0_addr: 10.24.38.101
		      name: node1
		  }
		  node {
		      nodeid: 2
		      ring0_addr: 10.24.38.102
		      name: node2
		  }
	      }

       Add a new entry for the node below the  existing	 nodes.	 Node  entries
       don't  have  to	be in nodeid order, but	it will	help keep you sane. So
       the nodelist now	looks like this:

	      nodelist {
		  node {
		      nodeid: 1
		      ring0_addr: 10.24.38.101
		      name: node1
		  }
		  node {
		      nodeid: 2
		      ring0_addr: 10.24.38.102
		      name: node2
		  }
		  node {
		      nodeid: 3
		      ring0_addr: 10.24.38.108
		      name: NEW
		  }
	      }

       This file must then be copied onto all three nodes: the two existing
       nodes and the new one. On one of the existing corosync nodes, tell
       corosync to re-read the updated config file into memory:

	      corosync-cfgtool -R

       This command only needs to be run on one node in the cluster. You may
       then start corosync on the NEW node and it should join the cluster. If
       this doesn't work as expected, check that communication between all
       three nodes is working, and check the syslog files on all nodes for
       more information. It's important to note that the key piece of
       information about a node failing to join might be on a different node
       than you expect.

TO REMOVE A NODE FROM THE CLUSTER
       This is the reverse of the procedure for adding a node described
       above. First you need to shut down corosync on the node you will be
       removing from the cluster:

	      corosync-cfgtool -H

       Then delete that node's entry from the nodelist in corosync.conf and
       finally update corosync on the remaining nodes by running

	      corosync-cfgtool -R

       on one of them.
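
       Continuing the example from the previous section, and assuming the
       node being removed is NEW (nodeid 3), the nodelist on the remaining
       nodes would be edited back to the following before running
       corosync-cfgtool -R:

	      nodelist {
		  node {
		      nodeid: 1
		      ring0_addr: 10.24.38.101
		      name: node1
		  }
		  node {
		      nodeid: 2
		      ring0_addr: 10.24.38.102
		      name: node2
		  }
	      }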

ADDRESS	RESOLUTION
       corosync resolves ringX_addr names/IP addresses using the
       getaddrinfo(3) call, respecting the totem.ip_version setting.

       The getaddrinfo() function uses a sophisticated algorithm to sort node
       addresses into a preferred order, and corosync always chooses the first
       address in that list of the required family. As such it is essential
       that your DNS or /etc/hosts files are correctly configured so that all
       addresses for ringX appear on the same network (or are reachable with
       minimal hops) and over the same IP protocol. If this is not the case
       then some nodes might not be able to join the cluster. It is possible
       to override the search order used by getaddrinfo() using the
       configuration file /etc/gai.conf (see gai.conf(5)) if necessary, but
       this is not recommended.

       If there	is any doubt about the order of	addresses returned from	getad-
       drinfo()	then it	might be simpler to use	IP addresses (v4 or v6)	in the
       ringX_addr field.
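
       The two styles can be compared in the following fragment, which is
       only a sketch: the totem stanza is incomplete and shows nothing but
       ip_version, and the names and addresses are the illustrative ones used
       earlier in this page. The first node relies on name resolution, the
       second uses a literal address.

	      totem {
		  # Restrict resolution to IPv4 addresses
		  ip_version: ipv4
	      }
	      nodelist {
		  node {
		      nodeid: 1
		      # Resolved with getaddrinfo(); the first address of
		      # the required family is used
		      ring0_addr: node1
		      name: node1
		  }
		  node {
		      nodeid: 2
		      # A literal address does not depend on resolver order
		      ring0_addr: 10.24.38.102
		      name: node2
		  }
	      }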

FILES
       /etc/corosync/corosync.conf
	      The corosync executive configuration file.

SEE ALSO
       corosync_overview(7), votequorum(5), corosync-qdevice(8), logrotate(8),
       getaddrinfo(3), gai.conf(5)

corosync Man Page		  2024-07-22		      COROSYNC_CONF(5)