FreeBSD Manual Pages

RNASUBOPT(1)			 User Commands			  RNASUBOPT(1)

NAME
       RNAsubopt - manual page for RNAsubopt 2.7.0

SYNOPSIS
       RNAsubopt [OPTION]...

DESCRIPTION
       RNAsubopt 2.7.0

       calculate suboptimal secondary structures of RNAs

       Reads RNA sequences from stdin and (in the default -e mode) calculates
       all suboptimal secondary structures within a user-defined energy range
       above the minimum free energy (mfe). It prints the suboptimal
       structures in dot-bracket notation, followed by the energy in
       kcal/mol, to stdout. Be careful: the number of structures returned
       grows exponentially with both sequence length and energy range.

       Alternatively, when used	with the -p option, RNAsubopt produces	Boltz-
       mann weighted samples of	secondary structures.
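
       The output described above can be post-processed with a few lines of
       Python. The following is a minimal sketch, assuming each structure
       line has the layout "<dot-bracket> <energy>"; the exact column layout
       and the helper name parse_subopt are assumptions, not part of
       RNAsubopt.

```python
import re

# Assumed line layout: "<dot-bracket> <energy>", e.g. "(((...)))  -1.20".
LINE = re.compile(r"^([().]+)\s+(-?\d+(?:\.\d+)?)\s*$")


def parse_subopt(lines):
    """Return (structure, energy) pairs; lines that do not match are skipped."""
    results = []
    for line in lines:
        m = LINE.match(line.strip())
        if m:
            results.append((m.group(1), float(m.group(2))))
    return results


sample = [
    ">seq1",              # FASTA header, skipped
    "GGGAAACCC",          # sequence line, skipped
    "(((...)))  -1.20",
    ".........   0.00",
]
print(parse_subopt(sample))
```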

       -h, --help
	      Print help and exit

       --detailed-help
	      Print help, including all	details	and hidden options, and	exit

       --full-help
	      Print help, including hidden options, and	exit

       -V, --version
	      Print version and	exit

       -v, --verbose
	      Be verbose.  (default=off)

	      Lower  the  log  level  setting such that	even INFO messages are
	      passed through.

   I/O Options:
	      Command line options for input and output	(pre-)processing

       -i, --infile=filename
	      Read a file instead of reading from stdin.

	      The default behavior of RNAsubopt	is to read input  from	stdin.
	      Using  this  parameter  the  user	can specify an input file name
	      where data is read from.

       -o, --outfile[=filename]
	      Print output to file instead of stdout.

	      This option may be used to write	all  output  to	 output	 files
	      rather than printing to stdout. The default filename is "RNAsub-
	      opt_output.sub"  if no FASTA header precedes the input sequences
	      and the --auto-id	feature	is inactive. Otherwise,	 output	 files
	      with  the	 scheme	"prefix.sub" are generated, where the "prefix"
	      is taken from the	sequence id. The user  may  specify  a	single
	      output  file  name for all data generated	from the input by sup-
	      plying an	optional string	as argument to this parameter. In case
	      a	file with the same filename already exists, any	output of  the
	      program  will be appended	to it. Note: Any special characters in
	      the filename will	be replaced by the filename  delimiter,	 hence
	      there  is	 no  way to pass an entire directory path through this
	      option yet. (See also the	"--filename-delim" parameter)
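
	      The naming rules above can be summarized in a short Python
	      sketch. The function output_filename and its arguments are
	      hypothetical and only mirror the description; they are not part
	      of RNAsubopt.

```python
def output_filename(explicit=None, seq_id=None):
    """Pick the output file name following the rules described for -o.

    explicit: optional string supplied as argument to -o
    seq_id:   ID taken from the FASTA header or generated by --auto-id,
              or None if neither is available
    """
    if explicit:
        return explicit                  # one file for all generated data
    if seq_id is None:
        return "RNAsubopt_output.sub"    # documented default name
    return seq_id + ".sub"               # "prefix.sub" scheme


print(output_filename())
print(output_filename(seq_id="NM_0001"))
print(output_filename(explicit="all.sub", seq_id="NM_0001"))
```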

       --noconv
	      Do not automatically substitute nucleotide "T" with "U".

	      (default=off)

       --auto-id
	      Automatically generate an	ID for each sequence.  (default=off)

	      The default mode of RNAsubopt is to automatically determine an
	      ID from the input sequence data if the input file format allows
	      it. Sequence IDs are usually given in the FASTA header of input
	      sequences. If this flag is active, RNAsubopt ignores any IDs
	      retrieved from the input and automatically generates an ID for
	      each sequence. This ID consists of a prefix and an increasing
	      number. This flag can also be used to add a FASTA header to the
	      output even if the input has none.

       --id-prefix=STRING
	      Prefix  for  automatically generated IDs (as used	in output file
	      names).

	      (default=`sequence')

	      If this parameter is set, each sequence's FASTA ID will be
	      prefixed with the provided string. FASTA IDs then take the form
	      ">prefix_xxxx" where xxxx is the sequence number. Note: Setting
	      this parameter implies --auto-id.

       --id-delim=CHAR
	      Change  the  delimiter  between prefix and increasing number for
	      automatically generated IDs (as used in output file names).

	      (default=`_')

	      This parameter can be used to change the default delimiter "_"
	      between the prefix string and the increasing number for
	      automatically generated IDs.

       --id-digits=INT
	      Specify the number of digits of the counter in automatically
	      generated sequence IDs.

	      (default=`4')

	      When sequence IDs are automatically generated, they receive an
	      increasing number, starting with 1. This number will always be
	      left-padded with leading zeros, such that the number takes up a
	      certain width. Using this parameter, the width can be adjusted
	      to the user's needs. We allow numbers in the range [1:18]. This
	      option implies --auto-id.

       --id-start=LONG
	      Specify the first	number in automatically	generated IDs.

	      (default=`1')

	      When sequence IDs are automatically generated, they receive an
	      increasing number, usually starting with 1. Using this
	      parameter, the first number can be set to the user's
	      requirements. Note: negative numbers are not allowed. Note:
	      Setting this parameter implies ignoring any IDs retrieved from
	      the input data, i.e. it activates the --auto-id flag.
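
	      How --id-prefix, --id-delim, --id-digits and --id-start combine
	      can be illustrated with a small Python sketch. The function
	      auto_ids is hypothetical; it only reproduces the documented
	      defaults and limits.

```python
def auto_ids(count, prefix="sequence", delim="_", digits=4, start=1):
    """Generate IDs of the form <prefix><delim><zero-padded number>."""
    if not 1 <= digits <= 18:
        raise ValueError("digits must be in the range [1:18]")
    if start < 0:
        raise ValueError("negative start numbers are not allowed")
    return [f"{prefix}{delim}{n:0{digits}d}" for n in range(start, start + count)]


print(auto_ids(2))                                  # ['sequence_0001', 'sequence_0002']
print(auto_ids(2, prefix="rna", delim="-", digits=2, start=9))
```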

       --filename-delim=CHAR
	      Change the delimiting character used in sanitized	filenames.

	      (default=`ID-delimiter')

	      This parameter can be used to change the delimiting character
	      used while sanitizing filenames, i.e. replacing invalid
	      characters. Note that the default delimiter is ALWAYS the first
	      character of the "ID delimiter" as supplied through the
	      --id-delim option. If the delimiter is a whitespace character or
	      empty, invalid characters will simply be removed rather than
	      substituted. Currently, we regard the following characters as
	      illegal for use in filenames: backslash '\', slash '/', question
	      mark '?', percent sign '%', asterisk '*', colon ':', pipe symbol
	      '|', double quote '"', angle brackets '<' and '>'.
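
	      The sanitizing rule can be sketched in Python as follows. The
	      function sanitize is hypothetical and covers only the characters
	      listed above; the actual implementation may treat further
	      characters as invalid.

```python
# Characters listed as illegal in filenames.
ILLEGAL = set('\\/?%*:|"<>')


def sanitize(name, delim="_"):
    """Replace illegal characters by the delimiter, or drop them."""
    d = delim[:1]          # only the first character is used
    if d.isspace():
        d = ""             # whitespace or empty delimiter: remove instead
    return "".join(d if ch in ILLEGAL else ch for ch in name)


print(sanitize("NM_0001|Homo:sapiens"))
print(sanitize("a/b", delim=" "))
```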

       --filename-full
	      Use full FASTA header to create filenames.  (default=off)

	      This parameter can be used to deactivate the default behavior of
	      limiting output filenames	to the first word of the sequence  ID.
	      Consider	the  following	example:  An  input  with FASTA	header
	      '>NM_0001	Homo Sapiens some gene'	usually	produces output	 files
	      with  the	prefix "NM_0001" without the additional	data available
	      in the FASTA header, e.g.	"NM_0001.sub". With this flag set,  no
	      truncation  of  the  output  filenames is	performed, i.e.	output
	      filenames receive the full FASTA header data as prefixes. Note,
	      however, that invalid characters (such as whitespace) will be
	      substituted by a delimiting character or simply removed (see
	      also the --filename-delim option).

       --log-level=level
	      Set log level threshold.	(default=`2')

	      By default, log messages are filtered such that only warnings
	      (level 2) or errors (level 3) are printed. This setting allows
	      for specifying the log level threshold, where higher values
	      result in less information. Log level 5 turns off all messages,
	      even errors and other critical information.

       --log-file[=filename]
	      Print log	messages to a file instead of stderr.	(default=`RNA-
	      subopt.log')

       --log-time
	      Include time stamp in log	messages.

	      (default=off)

       --log-call
	      Include file and line of log calling function.

	      (default=off)

   Algorithms:
	      Select  the  algorithms which should be applied to the given RNA
	      sequence(s).

       -e, --deltaEnergy=range
	      Compute suboptimal structures with energy	in a certain range  of
	      the optimum (kcal/mol).

	      Default is calculation of	mfe structure only.

       --deltaEnergyPost=range
	      Only  print structures with energy within	range of the mfe after
	      post reevaluation	of energies.

	      Useful in conjunction with --logML, -d1 or -d3: while the -e
	      option specifies the range before energies are re-evaluated,
	      this option specifies the maximum energy after re-evaluation.

       -s, --sorted
	      Sort the suboptimal structures by	energy and lexicographical or-
	      der.

	      (default=off)

	      Structures are first sorted by energy in ascending order. Within
	      groups of the same energy, structures are then sorted in
	      ascending lexicographical order of their dot-bracket notation.
	      See the --en-only flag to deactivate this second step. Note that
	      sorting is done in memory, so it can easily lead to exhaustion
	      of RAM! This is especially true if the number of structures
	      produced becomes large or the RNA sequence is rather long. In
	      such cases it is better to use an external sort method, such as
	      UNIX "sort".

       --en-only
	      Only sort	structures by free energy.  (default=off)

	      In combination with --sorted, this flag deactivates the second
	      sorting criterion and sorts structures solely by their free
	      energy instead of additionally sorting by lexicographic order in
	      each energy band. This might save some time during the sorting
	      process in situations where lexicographic order is not required.
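
	      The two sorting modes of --sorted and --en-only can be mimicked
	      on already parsed output. This is a minimal sketch, assuming
	      that lexicographic order means plain character-code order of the
	      dot-bracket string; the function sort_structures is
	      hypothetical.

```python
def sort_structures(structs, en_only=False):
    """Sort (structure, energy) pairs.

    Default: by energy, then by dot-bracket string (like --sorted).
    en_only: by energy only (like --sorted --en-only).
    """
    if en_only:
        return sorted(structs, key=lambda s: s[1])
    return sorted(structs, key=lambda s: (s[1], s[0]))


structs = [
    ("..((...))", -0.5),
    ("((...))..", -0.5),
    ("(((...)))", -1.2),
]
for structure, energy in sort_structures(structs):
    print(structure, f"{energy:6.2f}")
```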

       -p, --stochBT=number
	      Randomly draw structures according to their probability  in  the
	      Boltzmann	ensemble.

	      Instead of producing all suboptimals in an energy	range, produce
	      a	 random	sample of suboptimal structures, drawn with probabili-
	      ties equal to their Boltzmann weights via	stochastic  backtrack-
	      ing  in  the partition function. The -e and -p options are mutu-
	      ally exclusive.
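
	      The idea of Boltzmann weighted sampling can be illustrated on a
	      small, explicitly enumerated set of structures. Note that this
	      sketch is NOT stochastic backtracking in the partition function
	      as used by RNAsubopt; it simply draws from a given list with
	      probabilities proportional to exp(-E/kT). The gas constant value
	      and all function names are assumptions made for illustration.

```python
import math
import random

GAS_CONSTANT = 0.0019872  # kcal/(mol*K), assumed value


def boltzmann_probs(energies, temp_c=37.0):
    """Probabilities proportional to exp(-E/kT) for the given energies."""
    kT = GAS_CONSTANT * (temp_c + 273.15)
    e_min = min(energies)                      # shift for numerical stability
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    z = sum(weights)                           # partition function of the list
    return [w / z for w in weights]


def sample(structs, n, seed=None):
    """Draw n structures from (structure, energy) pairs."""
    rng = random.Random(seed)
    probs = boltzmann_probs([e for _, e in structs])
    return rng.choices([s for s, _ in structs], weights=probs, k=n)


structs = [("(((...)))", -1.2), ("((...))..", -0.5), (".........", 0.0)]
print(boltzmann_probs([e for _, e in structs]))
print(sample(structs, 5, seed=42))
```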

       --stochBT_en=number
	      Same as "--stochBT" but also print free energies and  probabili-
	      ties of the backtraced structures.

       --random-seed=INT
	      Set the seed for the random number generator

       --betaScale=DOUBLE
	      Set the scaling of the Boltzmann factors.	 (default=`1.')

	      The argument provided with this option is used to scale the
	      thermodynamic temperature in the Boltzmann factors independently
	      from the temperature of the individual loop energy
	      contributions. The Boltzmann factors then become
	      'exp(-dG/(kT*betaScale))' where 'k' is the Boltzmann constant,
	      'dG' the free energy contribution of the state and 'T' the
	      absolute temperature.
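
	      The effect of the scaling can be checked numerically. The
	      following sketch evaluates the expression above directly; the
	      gas constant value and the function name boltzmann_factor are
	      assumptions.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal/(mol*K), assumed value


def boltzmann_factor(dG, temp_c=37.0, beta_scale=1.0):
    """exp(-dG / (kT * betaScale)) with dG in kcal/mol."""
    kT = GAS_CONSTANT * (temp_c + 273.15)
    return math.exp(-dG / (kT * beta_scale))


# A betaScale above 1 flattens the distribution, below 1 sharpens it.
for scale in (0.5, 1.0, 2.0):
    print(scale, boltzmann_factor(-1.0, beta_scale=scale))
```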

       -N, --nonRedundant
	      Enable non-redundant sampling strategy.

	      (default=off)

       -S, --pfScale=DOUBLE
	      In  the  calculation  of the pf use scale*mfe as an estimate for
	      the ensemble free	energy (used to	avoid overflows).

	      (default=`1.07')

	      The default is 1.07, useful values are 1.0 to 1.2.  Occasionally
	      needed for long sequences.

       -c, --circ
	      Assume a circular	(instead of linear) RNA	molecule.

	      (default=off)

       -D, --dos
	      Compute density of states	instead	of secondary structures.

	      (default=off)

	      This  option  enables  the evaluation of the number of secondary
	      structures in certain energy bands around	the MFE.
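
	      Conceptually, a density of states is a histogram of structure
	      counts over energy bands above the MFE. The sketch below builds
	      such a histogram from a list of energies; the band width of 0.1
	      kcal/mol and the function name are assumptions and need not
	      match what RNAsubopt uses internally.

```python
from collections import Counter


def density_of_states(energies, band_width=0.1):
    """Count structures per energy band, indexed relative to the MFE."""
    mfe = min(energies)
    return Counter(round((e - mfe) / band_width) for e in energies)


dos = density_of_states([-3.4, -3.4, -3.1, -2.9])
for band in sorted(dos):
    print(band, dos[band])
```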

       -z, --zuker
	      Compute Zuker suboptimals	instead	of all	suboptimal  structures
	      within an	energy band around the MFE.

	      (default=off)

       -g, --gquad
	      Incorporate G-Quadruplex formation.  (default=off)

	      (No support of G-quadruplex prediction for stochastic
	      backtracking and Zuker-style suboptimals yet.)

   Structure Constraints:
	      Command line options to interact with the	structure  constraints
	      feature of this program

       --maxBPspan=INT
	      Set the maximum base pair	span.

	      (default=`-1')

       -C, --constraint[=filename]
	      Calculate	structures subject to constraints.  (default=`')

	      The  program  reads first	the sequence, then a string containing
	      constraints on the structure encoded with	the symbols:

	      '.' (no constraint for this base)

	      '|' (the corresponding base has to be paired)

	      'x' (the base is unpaired)

	      '<' (base	i is paired with a base	j>i)

	      '>' (base	i is paired with a base	j<i)

	      and matching brackets '('	')' (base i pairs base j)

	      With the exception of '|', constraints will disallow  all	 pairs
	      conflicting  with	 the constraint. This is usually sufficient to
	      enforce the constraint, but occasionally a  base	may  stay  un-
	      paired  in  spite	of constraints.	PF folding ignores constraints
	      of type '|'.
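
	      Before passing a constraint string to the program, it can be
	      checked for the symbols listed above and for matching brackets.
	      This is a minimal sketch; the function check_constraint is
	      hypothetical, and the rule that a constraint must not be longer
	      than the sequence is an assumption.

```python
ALLOWED = set(".|x<>()")


def check_constraint(sequence, constraint):
    """Validate a constraint string and return the explicit (i, j) pairs."""
    if len(constraint) > len(sequence):
        raise ValueError("constraint is longer than the sequence")
    bad = set(constraint) - ALLOWED
    if bad:
        raise ValueError(f"unknown constraint symbols: {sorted(bad)}")
    stack, pairs = [], []
    for i, ch in enumerate(constraint, start=1):   # 1-based positions
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


print(check_constraint("GGGAAACCC", "(((...)))"))
print(check_constraint("GGGAAACCC", "<<<xxx>>>"))
```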

       --batch
	      Use constraints for multiple sequences.  (default=off)

	      Usually, constraints provided from an input file only apply to a
	      single input sequence. Therefore, RNAsubopt will stop its
	      computation and quit after the first input sequence has been
	      processed. Using this switch, RNAsubopt processes multiple input
	      sequences and applies the same provided constraints to each of
	      them.

       --canonicalBPonly
	      Remove non-canonical base	pairs from the structure constraint.

	      (default=off)

       --enforceConstraint
	      Enforce  base pairs given	by round brackets '(' ')' in structure
	      constraint.

	      (default=off)

       --shape=filename
	      Use SHAPE	reactivity data	to guide structure predictions.

       --shapeMethod=method
	      Select SHAPE reactivity data incorporation strategy.

	      (default=`D')

	      The following methods can	be used	to convert SHAPE  reactivities
	      into pseudo energy contributions.

	      'D': Convert by using the	linear equation	according to Deigan et
	      al 2009.

	      Derived pseudo energy terms will be applied for every nucleotide
	      involved in a stacked pair. This method is recognized by a capi-
	      tal  'D'	in  the	provided parameter, i.e.: --shapeMethod="D" is
	      the default setting. The slope 'm' and the intercept 'b' can  be
	      set  to  a  non-default  value if	necessary, otherwise m=1.8 and
	      b=-0.6. To alter these parameters, e.g. m=1.9 and	b=-0.7,	use  a
	      parameter	 string	like this: --shapeMethod="Dm1.9b-0.7". You may
	      also  provide   only   one   of	the   two   parameters	 like:
	      --shapeMethod="Dm1.9" or --shapeMethod="Db-0.7".
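
	      The parameter string for method 'D' and the resulting pseudo
	      energy can be sketched in Python. The conversion
	      m * ln(reactivity + 1) + b is the form published by Deigan et
	      al 2009; the parser, the function names and the restriction to
	      non-negative reactivities are assumptions made for this
	      illustration.

```python
import math
import re

_NUM = r"(-?\d+(?:\.\d+)?)"
_DEIGAN = re.compile(rf"^D(?:m{_NUM})?(?:b{_NUM})?$")


def parse_deigan(method):
    """Return (slope, intercept) for strings like "D", "Dm1.9", "Dm1.9b-0.7"."""
    m = _DEIGAN.match(method)
    if m is None:
        raise ValueError(f"not a valid 'D' method string: {method!r}")
    slope = float(m.group(1)) if m.group(1) else 1.8        # documented default
    intercept = float(m.group(2)) if m.group(2) else -0.6   # documented default
    return slope, intercept


def pseudo_energy(reactivity, slope=1.8, intercept=-0.6):
    """Pseudo energy in kcal/mol for one non-negative SHAPE reactivity."""
    return slope * math.log(reactivity + 1.0) + intercept


m, b = parse_deigan("Dm1.9b-0.7")
print(m, b, pseudo_energy(0.5, m, b))
```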

	      'Z': Convert SHAPE reactivities to pseudo energies according to
	      Zarringhalam et al 2012.

	      SHAPE reactivities will be converted to pairing probabilities by
	      using linear mapping. Deviation from the observed pairing
	      probabilities will be penalized during the folding recursion.
	      The magnitude of the penalties can be affected by adjusting the
	      factor beta (e.g. --shapeMethod="Zb0.8").

	      'W': Apply a given vector of perturbation energies to unpaired
	      nucleotides according to Washietl et al 2012.

	      Perturbation vectors can be calculated by using RNApvmin.

       --shapeConversion=method
	      Select method for	SHAPE reactivity conversion.

	      (default=`O')

	      This parameter is	useful when dealing with the SHAPE  incorpora-
	      tion  according to Zarringhalam et al. The following methods can
	      be used to convert SHAPE reactivities into the probability for a
	      certain nucleotide to be unpaired.

	      'M': Use linear mapping according to Zarringhalam et al.

	      'C': Use a cutoff-approach to divide into paired and unpaired
	      nucleotides (e.g. "C0.25")

	      'S': Skip the normalizing step since the input data already
	      represents probabilities for being unpaired rather than raw
	      reactivity values

	      'L': Use a linear model to convert the reactivity into a
	      probability for being unpaired (e.g. "Ls0.68i0.2" to use a slope
	      of 0.68 and an intercept of 0.2)

	      'O': Use a linear model to convert the log of the reactivity into
	      a probability for being unpaired (e.g. "Os1.6i-2.29" to use a
	      slope of 1.6 and an intercept of -2.29)

       --commands=filename
	      Read additional commands from file

	      Commands include hard and soft constraints, but also structure
	      motifs in hairpin and internal loops that need to be treated
	      differently. Furthermore, commands can be set for unstructured
	      and structured domains.

   Energy Parameters:
	      Energy parameter sets can	be adapted or  loaded  from  user-pro-
	      vided input files

       -T, --temp=DOUBLE
	      Rescale energy parameters	to a temperature of temp C. Default is
	      37C.

	      (default=`37.0')

       -P, --paramFile=paramfile
	      Read  energy parameters from paramfile, instead of using the de-
	      fault parameter set.

	      Different	sets of	energy parameters for RNA and DNA  should  ac-
	      company your distribution.  See the RNAlib documentation for de-
	      tails on the file	format.	The placeholder	file name 'DNA'	can be
	      used to load DNA parameters without the need to actually specify
	      any input	file.

       -4, --noTetra
	      Do  not include special tabulated	stabilizing energies for tri-,
	      tetra- and hexaloop hairpins.

	      (default=off)

	      Mostly for testing.

       --salt=DOUBLE
	      Set salt concentration in	molar (M). Default is 1.021M.

       -m, --modifications[=STRING]
	      Allow for	modified bases within the RNA sequence string.

	      (default=`7I6P9D')

	      Treat modified bases within the RNA sequence  differently,  i.e.
	      use  corresponding  energy  corrections  and/or  pairing partner
	      rules if available.  For that, the modified bases	in  the	 input
	      sequence	must be	marked by their	corresponding one-letter code.
	      If no additional arguments are supplied, all  available  correc-
	      tions are	performed. Otherwise, the user may limit the modifica-
	      tions  to	a particular subset of modifications, resp. one-letter
	      codes, e.g. -mP6 will only correct  for  pseudouridine  and  m6A
	      bases.

	      Currently	supported one-letter codes and energy corrections are:

	      '7': 7-deaza-adenosine (7DA)

	      'I': Inosine

	      '6': N6-methyladenosine (m6A)

	      'P': Pseudouridine

	      '9': Purine (a.k.a. nebularine)

	      'D': Dihydrouridine
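
	      The one-letter codes can be expanded with a simple lookup. This
	      sketch only mirrors the table above; the function
	      selected_modifications is hypothetical.

```python
MODIFICATIONS = {
    "7": "7-deaza-adenosine (7DA)",
    "I": "Inosine",
    "6": "N6-methyladenosine (m6A)",
    "P": "Pseudouridine",
    "9": "Purine (a.k.a. nebularine)",
    "D": "Dihydrouridine",
}


def selected_modifications(arg=None):
    """Names of the modifications selected by a -m argument such as "P6"."""
    codes = arg if arg else "7I6P9D"      # no argument: all corrections
    unknown = [c for c in codes if c not in MODIFICATIONS]
    if unknown:
        raise ValueError(f"unsupported one-letter codes: {unknown}")
    return [MODIFICATIONS[c] for c in codes]


print(selected_modifications("P6"))
```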

       --mod-file=STRING
	      Use additional modified base data	from JSON file.

   Model Details:
	      Tweak  the energy	model and pairing rules	additionally using the
	      following	parameters

       -d, --dangles=INT
	      How to treat "dangling end" energies for bases adjacent  to  he-
	      lices in free ends and multi-loops.

	      (default=`2')

	      With -d1 only unpaired bases can participate in at most one dan-
	      gling  end.   With  -d2 this check is ignored, dangling energies
	      will be added for	the bases adjacent to a	helix on both sides in
	      any case;	this is	the default for	 mfe  and  partition  function
	      folding  (-p).   The option -d0 ignores dangling ends altogether
	      (mostly for debugging).  With -d3	mfe folding will allow coaxial
	      stacking of adjacent helices in multi-loops. At the  moment  the
	      implementation  will  not	 allow coaxial stacking	of the two en-
	      closed pairs in a	loop of	degree 3 and works only	for mfe	 fold-
	      ing.

	      Note that	with -d1 and -d3 only the MFE computations will	be us-
	      ing this setting while partition function	uses -d2 setting, i.e.
	      dangling ends will be treated differently.

       --noLP Produce structures without lonely	pairs (helices of length 1).

	      (default=off)

	      For  partition  function	folding	this only disallows pairs that
	      can only occur isolated. Other pairs may still occasionally  oc-
	      cur as helices of	length 1.

       --noGU Do not allow GU pairs.

	      (default=off)

       --noClosingGU
	      Do not allow GU pairs at the end of helices.

	      (default=off)

       --logML
	      Recompute	 energies  of  structures  using  a logarithmic	energy
	      function for multi-loops before output.  (default=off)

	      This option does not affect structure generation, only the
	      energies that are printed out. Since logML lowers energies
	      somewhat, some structures may be missing.

       --nsp=STRING
	      Allow other pairs in addition to the usual AU, GC, and GU pairs.

	      Its argument is a	comma separated	list of	 additionally  allowed
	      pairs.  If  the first character is a "-" then AB will imply that
	      AB and BA	are allowed pairs, e.g.	--nsp="-GA"  will allow	GA and
	      AG pairs.	Nonstandard pairs are given 0 stacking energy.
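
	      The argument format can be sketched as a small parser. It is
	      assumed here that a leading "-" applies to the whole list and
	      that each entry consists of exactly two letters; the function
	      parse_nsp is hypothetical.

```python
def parse_nsp(arg):
    """Expand an --nsp argument into the list of allowed extra pairs."""
    symmetric = arg.startswith("-")        # "-" means: AB also allows BA
    if symmetric:
        arg = arg[1:]
    pairs = []
    for item in arg.split(","):
        item = item.strip()
        if len(item) != 2:
            raise ValueError(f"expected a two-letter pair, got {item!r}")
        pairs.append(item)
        if symmetric and item[0] != item[1]:
            pairs.append(item[::-1])
    return pairs


print(parse_nsp("-GA"))      # ['GA', 'AG']
print(parse_nsp("GA,UU"))    # ['GA', 'UU']
```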

       --energyModel=INT
	      Set energy model.

	      Rarely used option to fold sequences from	the artificial ABCD...
	      alphabet,	where A	pairs B, C-D etc.  Use the  energy  parameters
	      for GC (--energyModel 1) or AU (--energyModel 2) pairs.

       --helical-rise=FLOAT
	      Set the helical rise of the helix	in units of Angstrom.

	      (default=`2.8')

	      Use with caution!	This value will	be re-set automatically	to 3.4
	      in  case	DNA  parameters	 are  loaded via -P DNA	and no further
	      value is provided.

       --backbone-length=FLOAT
	      Set the average backbone length for looped regions in  units  of
	      Angstrom.

	      (default=`6.0')

	      Use  with	 caution!  This	 value will be re-set automatically to
	      6.76 in case DNA parameters are loaded via -P DNA	and no further
	      value is provided.

REFERENCES
       If you use this program in your work you	might want to cite:

       R. Lorenz, S.H. Bernhart, C.  Hoener  zu	 Siederdissen,	H.  Tafer,  C.
       Flamm,  P.F. Stadler and	I.L. Hofacker (2011), "ViennaRNA Package 2.0",
       Algorithms for Molecular	Biology: 6:26

       I.L. Hofacker, W. Fontana, P.F. Stadler,	S. Bonhoeffer, M.  Tacker,  P.
       Schuster	 (1994),  "Fast	Folding	and Comparison of RNA Secondary	Struc-
       tures", Monatshefte f. Chemie: 125, pp 167-188

       R. Lorenz, I.L. Hofacker, P.F. Stadler (2016), "RNA folding  with  hard
       and soft	constraints", Algorithms for Molecular Biology 11:1 pp 1-13

       S. Wuchty, W. Fontana, I. L. Hofacker and P. Schuster (1999), "Complete
       Suboptimal  Folding  of RNA and the Stability of	Secondary Structures",
       Biopolymers: 49,	pp 145-165

       M. Zuker	(1989),	"On Finding All	Suboptimal Foldings of	an  RNA	 Mole-
       cule", Science 244.4900,	pp 48-52

       Y.  Ding,  and  C.E. Lawrence (2003), "A	statistical sampling algorithm
       for RNA secondary structure prediction",	Nucleic	Acids Research	31.24,
       pp 7280-7301

       The energy parameters are taken from:

       D.H. Mathews, M.D. Disney, J.L. Childs, S.J. Schroeder, M. Zuker,
       D.H. Turner (2004), "Incorporating chemical modification constraints
       into a dynamic programming algorithm for prediction of RNA secondary
       structure", Proc. Natl. Acad. Sci. USA: 101, pp 7287-7292

       D.H. Turner, D.H. Mathews (2009), "NNDB: The nearest neighbor parameter
       database for predicting stability of nucleic acid secondary structure",
       Nucleic Acids Research: 38, pp 280-282

AUTHOR
       Ivo L Hofacker, Stefan Wuchty, Walter Fontana, Ronny Lorenz

REPORTING BUGS
       If in doubt our program is right, nature	is at fault.  Comments	should
       be sent to rna@tbi.univie.ac.at.

RNAsubopt 2.7.0			 October 2024			  RNASUBOPT(1)