FreeBSD Manual Pages

LOCARNA(1)			 User Commands			    LOCARNA(1)

NAME
       LocARNA - manual	page for LocARNA 2.0.0

DESCRIPTION
       locarna - pairwise (global and local) alignment of RNA.

       USAGE: locarna [options]	<Input 1> <Input 2>

       locarna	is  the	 pairwise alignment tool of the	LocARNA	package, which
       performs	fast simultaneous folding and alignment	based on two  RNA  se-
       quences (or alignments).

   Input
       Input  consists	of two sequences or alignments,	which are specified in
       fasta, clustal, stockholm, or LocARNA pp	format.	 Optionally, structure
       and anchor constraints can be specified in the input files.  If	align-
       ments are given in the input, they are aligned without revising the gap
       structure  within  the  given  alignments.  Unless specified, base pair
       probabilities of	the input sequences or alignments are predicted	 using
       the  ViennaRNA  package.	 Optionally, base pair probability information
       can be passed for one or	both input sequences (or alignments) using the
       input formats LocARNA PP	2.0 or ViennaRNA postscript dotplot format.

   Constraints
       Anchor and structure constraints	can be specified in the	 input	files.
       Anchor  constraints for sequences (alignments) are defined by assigning
       names to sequence positions (alignment columns), respectively. The
       exact semantics is either strict or relaxed (controlled by
       --relaxed-anchors). In strict semantics, anchor names have to be
       sorted lexicographically in the input as well as in the result
       alignment (in the sense that result columns inherit the name from
       one or both input positions, where conflicts are disallowed). In
       relaxed semantics,
       anchors of the same name	are forced into	the same alignment column. The
       actual  syntax of the constraint	specification depends on the file for-
       mat (see	Constraint Examples below).
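
       The two semantics can be illustrated with a minimal Python sketch.
       It is not LocARNA code: the function names and the representation of
       anchors as a dict from position to name are assumptions made for
       illustration, and "sorted" is read here as strictly increasing.

```python
def is_lexicographically_sorted(anchors):
    """Check the strict-semantics precondition for one input.

    anchors: dict mapping a sequence position (or alignment column)
    to its anchor name. Names must increase from left to right.
    """
    names = [anchors[pos] for pos in sorted(anchors)]
    return all(a < b for a, b in zip(names, names[1:]))


def forced_column_pairs(anchors_a, anchors_b):
    """Pairs of positions (one per input) that carry the same anchor name.

    Under either semantics, positions with equal names end up in the
    same alignment column.
    """
    pos_by_name_b = {name: pos for pos, name in anchors_b.items()}
    return sorted(
        (pos_a, pos_by_name_b[name])
        for pos_a, name in anchors_a.items()
        if name in pos_by_name_b
    )


# Hypothetical anchors on two inputs.
anchors_a = {3: "A1", 7: "B1"}
anchors_b = {2: "A1", 9: "C1"}
pairs = forced_column_pairs(anchors_a, anchors_b)
```

       Here only the name A1 occurs in both inputs, so position 3 of the
       first input and position 2 of the second are tied together.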

   Output
       The final pairwise alignment is reported	in standard and/or variants of
       the clustal and stockholm format, as well as LocARNA's own pp format.

OPTIONS
       -h, --help
	      Print this help.

       --galaxy-xml
	      Print galaxy xml wrapper.

       -V, --version
	      Print only version string.

       -v, --verbose
	      Be verbose. Prints input parameters, sequences and size informa-
	      tion.

       -q, --quiet
	      Be quiet.

   Scoring parameters:
       -i, --indel=<score>(-150)
	      Indel score. Score contribution of each single base insertion or
	      deletion.	 Indel opening score and indel score define the	affine
	      scoring of gaps.

       --indel-opening=<score>(-750)
              Indel opening score. Score contribution of opening an insertion
              or deletion, i.e. a score charged once for each consecutive run
              of deletions or insertions. Indel opening score and indel score
              define the affine scoring of gaps.
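
              As a rough illustration of affine gap scoring, the following
              Python sketch computes the contribution of gap runs from the
              two parameters. It assumes that the opening score is added
              once per run on top of the per-base indel score; the function
              names are hypothetical and not part of LocARNA.

```python
def affine_gap_score(run_length, indel=-150, indel_opening=-750):
    """Score of one consecutive run of insertions or deletions.

    Assumption: the opening score is charged once per run and the
    indel score once for every gapped base in the run.
    """
    if run_length < 0:
        raise ValueError("run_length must be non-negative")
    if run_length == 0:
        return 0  # no gap, no contribution
    return indel_opening + run_length * indel


def total_gap_score(gap_runs, indel=-150, indel_opening=-750):
    """Sum the affine scores of several independent gap runs."""
    return sum(affine_gap_score(n, indel, indel_opening) for n in gap_runs)
```

              Under this assumption and the default values, one gap of
              length three is cheaper than three separate gaps of length
              one, which is the purpose of affine scoring.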

       --ribosum-file=<f>(RIBOSUM85_60)
	      File  specifying	the  Ribosum  base and base-pair similarities.
	      [default:	use RIBOSUM85_60 without requiring a Ribosum file.]

       --use-ribosum=<bool>(true)
	      Use ribosum scores  for  scoring	base  matches  and  base  pair
	      matches; note that tau=0 suppresses any effect on	the latter.

       -m, --match=<score>(50)
	      Set score	contribution of	a base match (unless ribosum scoring).

       -M, --mismatch=<score>(0)
	      Set  score contribution of a base	mismatch (unless ribosum scor-
	      ing).

       --unpaired-penalty=<score>(0)
	      Penalty for unpaired bases

       -s, --struct-weight=<score>(200)
	      Maximal weight of	1/2 arc	match.	 Balances  structure  vs.  se-
	      quence score contributions.

       -e, --exp-prob=<prob>
	      Expected	base  pair probability.	Used as	background probability
	      for  base	 pair  scoring	[default:  calculated  from   sequence
	      length].

       -t, --tau=<factor>(50)
	      Tau factor. Factor for contribution of sequence similarity in an
	      arc match	(in percent). tau=0 does not penalize any sequence in-
	      formation	including compensatory mutations at arc	matches, while
	      tau=100 scores sequence similarity at ends of base matches (if a
	      scoring matrix like ribosum is used, this	adds the contributions
	      for base pair match from the matrix). [default tau=0!]

       -E, --exclusion=<score>(0)
	      Score  contribution  per exclusion in structure local alignment.
	      Set to zero for unrestricted structure locality.

       --stacking
	      Use stacking terms (requires stack-probs by RNAfold -p2)

       --new-stacking
	      Use new stacking terms (requires stack-probs by RNAfold -p2)

   Partition function representation (for sequence envelopes):
       --extended-pf
	      Use extended precision  for  the	computation  of	 sequence  en-
	      velopes.	This  enables handling significantly larger instances.
	      [default]

       --quad-pf
	      Use quad precision for partition function	values.	Even more pre-
	      cision than extended pf, but usually much	slower (overrides  ex-
	      tended-pf).

   Locality:
       --struct-local=<bool>(false)
	      Turn  on/off  structure locality.	Allow exclusions in alignments
	      of connected substructures.

       --sequ-local=<bool>(false)
	      Turn on/off sequence locality. Find best alignment of  arbitrary
	      subsequences of the input	sequences.

       --free-endgaps=<spec>(----)
	      Control  where end gaps are allowed for free. String of four +/-
	      symbols, allowing/disallowing free end gaps at the four sequence
	      ends in the order	left end of first sequence, right end of first
	      sequence,	left end of second sequence, right end of  second  se-
	      quence.	For  example,  "+---" allows free end gaps at the left
	      end of the first alignment string; "----"	forbids	free end  gaps
	      [default].
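
              The four-symbol specification can be decoded as in the
              following Python sketch. FreeEndGaps and parse_free_endgaps
              are hypothetical names used for illustration only; locarna
              parses the string internally.

```python
from typing import NamedTuple


class FreeEndGaps(NamedTuple):
    """Which of the four sequence ends allow free end gaps."""
    left_1: bool   # left end of first sequence
    right_1: bool  # right end of first sequence
    left_2: bool   # left end of second sequence
    right_2: bool  # right end of second sequence


def parse_free_endgaps(spec: str) -> FreeEndGaps:
    """Decode a string of four +/- symbols in the documented order."""
    if len(spec) != 4 or any(c not in "+-" for c in spec):
        raise ValueError("spec must consist of exactly four '+' or '-' symbols")
    return FreeEndGaps(*(c == "+" for c in spec))


left_only = parse_free_endgaps("+---")
```

              With "+---", only left_1 is true, matching the example in the
              option description.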

       --normalized=<L>(0)
	      Perform normalized local alignment with parameter	L. This	causes
	      locarna to compute the best local	alignment according to 'Score'
	      /	 (  L  + 'length' ), where length is the sum of	the lengths of
	      the two locally aligned subsequences. Thus, the  larger  L,  the
	      larger  the local	alignment; the size of value L is in the order
	      of local alignment lengths. Verbose yields info on the iterative
	      optimizations.
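
              The effect of L can be seen in a small Python sketch of the
              objective 'Score' / ( L + 'length' ). The candidate scores and
              lengths below are made-up numbers, and the function names are
              hypothetical; locarna optimizes this quantity iteratively
              rather than by enumerating candidates.

```python
def normalized_score(score, len_a, len_b, L):
    """Score / (L + length), where length sums both subsequence lengths."""
    denominator = L + len_a + len_b
    if denominator <= 0:
        raise ValueError("L + length must be positive")
    return score / denominator


def best_local(candidates, L):
    """Pick the (score, len_a, len_b) tuple with the best normalized score."""
    return max(candidates, key=lambda c: normalized_score(c[0], c[1], c[2], L))


# Two hypothetical local alignments: a short dense one and a long one.
candidates = [(1000, 10, 10), (3000, 50, 50)]
short_wins = best_local(candidates, L=0)
long_wins = best_local(candidates, L=100)
```

              In this toy case the short alignment wins for L=0, while
              L=100 favors the longer one, in line with the statement that
              larger L yields larger local alignments.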

       --penalized=<PP>(0)
	      Penalized	local alignment	with penalty PP

   Output:
       -w, --width=<columns>(120)
	      Width of alignment output.

       --clustal=<file>
	      Write alignment in ClustalW (aln)	format to given	file.

       --stockholm=<file>
              Write alignment in Stockholm format to given file.

       --pp=<file>
	      Write alignment in PP format to given file.

       --alifold-consensus-dp
	      Compute consensus	dot plot by alifold (warning:  this  may  fail
	      for long sequences).

       --consensus-structure=<type>(none)
	      Type  of	consensus  structures  written to screen and stockholm
	      output [alifold|mea|none]	(default: none).

       --consensus-gamma=<float>(1.0)
	      Base pair	weight for mea consensus computation.  For  MEA,  base
	      pairs  are  scored  by their pair	probability times 2 gamma; un-
	      paired bases, by their unpaired probability.
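
              A minimal Python sketch of this scoring rule follows. It only
              evaluates a given structure and does not search for the
              optimum; the function name and the data layout (a dict of
              pair probabilities and a list of unpaired probabilities) are
              assumptions made for illustration.

```python
def mea_structure_score(pairs, pair_prob, unpaired_prob, gamma=1.0):
    """Expected-accuracy style score of one candidate structure.

    pairs:         set of (i, j) base pairs, 0-based positions
    pair_prob:     dict mapping (i, j) to the pair probability
    unpaired_prob: list with the unpaired probability of each position
    """
    paired_positions = {pos for pair in pairs for pos in pair}
    # Base pairs contribute their probability times 2 * gamma.
    score = sum(2 * gamma * pair_prob[pair] for pair in pairs)
    # Unpaired bases contribute their unpaired probability.
    score += sum(
        q for pos, q in enumerate(unpaired_prob) if pos not in paired_positions
    )
    return score


# Hypothetical probabilities for a sequence of length 5.
pair_prob = {(0, 4): 0.9}
unpaired_prob = [0.1, 0.8, 0.9, 0.8, 0.1]
with_pair = mea_structure_score({(0, 4)}, pair_prob, unpaired_prob)
without_pair = mea_structure_score(set(), pair_prob, unpaired_prob)
```

              Raising gamma increases the weight of base pairs relative to
              unpaired bases, so structures with more pairs score higher.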

       -L, --local-output
	      Output only local	sub-alignment (to std out).

       --local-file-output
	      Write only local sub-alignment to	output files.

       -P, --pos-output
	      Output only local	sub-alignment positions.

       --write-structure
	      Write guidance structure in output.

       --score-components
	      Output components	of the score (experimental).

       --stopwatch
              Print run time information.

   Heuristics for speed/accuracy trade-off:
       -p, --min-prob=<probability>(0.001)
	      Minimal probability. Only	base pairs of at least this  probabil-
	      ity are taken into account.
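
              Conceptually, the threshold acts as a filter on the predicted
              base pairs, as in this small Python sketch. The function name
              and the dict layout are hypothetical; the probabilities are
              made-up values.

```python
def filter_base_pairs(pair_probs, min_prob=0.001):
    """Keep only base pairs whose probability is at least min_prob."""
    return {pair: p for pair, p in pair_probs.items() if p >= min_prob}


# Hypothetical base pair probabilities keyed by (i, j).
pair_probs = {(0, 20): 0.95, (1, 19): 0.0005, (2, 18): 0.001}
significant = filter_base_pairs(pair_probs)
```

              A higher threshold leaves fewer base pairs to consider, which
              speeds up the alignment at the risk of missing weakly
              predicted structure.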

       --max-bps-length-ratio=<factor>(0.0)
	      Maximal  ratio  of  #base	pairs divided by sequence length. This
	      serves as	a second filter	on the "significant" base pairs.  [de-
	      fault: 0.0 = no effect].

       -D, --max-diff-am=<diff>(-1)
	      Maximal difference for sizes of matched arcs. [-1=off]

       -d, --max-diff=<diff>(-1)
	      Maximal  difference  for	positions  of  alignment  traces  (and
	      aligned bases).  [-1=off]

       --max-diff-at-am=<diff>(-1)
	      Maximal difference for positions	of  alignment  traces  at  arc
	      match ends.  [-1=off]

       --max-diff-aln=<aln file>()
	      Maximal difference relative to given alignment (file in clustalw
	      format)

       --max-diff-pw-aln=<alignment>()
	      Maximal  difference  relative  to	 given	alignment (string, de-
	      lim=AMPERSAND)

       --max-diff-relax
              Relax deviation constraints in multiple alignment

       --min-trace-probability=<probability>(1e-4)
	      Minimal  sequence	 alignment  probability	 of  potential	traces
	      (probability-based sequence alignment envelope) [default=1e-4].

   Special sauce options:
       --kbest=<k>(-1)
	      Enumerate	k-best alignments

       --better=<t>(-1000000)
              Enumerate alignments better than threshold t

   MEA score:
       --mea-alignment
	      Perform  maximum	expected  accuracy alignment (instead of using
	      the default similarity scoring).

       --match-prob-method=<int>(0)
              Select method for computing sequence-based base match
              probabilities (to be used for mea-type alignment scores). Methods:
	      1=probcons-style from HMM, 2=probalign-style  from  PFs,	3=from
	      PFs, local

       --probcons-file=<file>
	      Read parameters for probcons-like	calculation of match probabil-
	      ities from probcons parameter file.

       --temperature-alipf=<int>(300)
	      Temperature  for	the  /sequence	alignment/ partition functions
	      used by the probcons-like	sequence-based match/trace probability
	      computation (this	temperature is different from  the  'physical'
	      temperature of RNA folding!).

       --pf-struct-weight=<weight>(200)
              Structure weight in PF computations (for the computation of
              sequence-based match probabilities from partition functions).

       --mea-gapcost
	      Use gap cost in mea alignment

       --mea-alpha=<weight>(0)
	      Weight alpha for MEA

       --mea-beta=<weight>(200)
	      Weight beta for MEA

       --mea-gamma=<weight>(100)
	      Weight gamma for MEA

       --probability-scale=<scale>(10000)
	      Scale for	probabilities/resolution of mea	score

       --write-match-probs=<file>
	      Write match probs	to file	(don't align!).

       --write-trace-probs=<file>
	      Write trace probs	to file	(don't align!).

       --read-match-probs=<file>
	      Read match probabilities from file.

       --write-arcmatch-scores=<file>
	      Write arcmatch scores (don't align!)

       --read-arcmatch-scores=<file>
	      Read arcmatch scores.

       --read-arcmatch-probs=<file>
	      Read arcmatch probabilities (weighted by factor mea_beta/100)

   Constraints:
       --noLP Disallow lonely pairs in prediction and alignment.

       --maxBPspan=<span>(-1)
	      Limit maximum base pair span [default=off].

       --relaxed-anchors
	      Use relaxed semantics of anchor constraints [default=strict  se-
	      mantics].

   Input files:
	      The tool is called with two input	files <Input 1>	and <Input 2>,
	      which  specify the two input sequences or	input alignments. Dif-
	      ferent input formats (Fasta, Clustal, Stockholm, LocARNA PP, Vi-
	      ennaRNA postscript dotplots) are accepted	and automatically rec-
	      ognized (by file content); the two input files can be in differ-
	      ent formats. Extended variants of	the Clustal and	Stockholm for-
	      mats enable specifying anchor and	structure constraints.

DISCLAIMER
       For many	purposes, it is	more convenient	to use the multiple  alignment
       tool  mlocarna  (even  for  pairwise alignment).	However, certain tasks
       --like aligning two specific alignments-- are  supported	 only  by  the
       pairwise	tool or	can be better controlled. Note that the	performance of
       locarna	(as well as basically all tools	in the LocARNA package)	is of-
       ten significantly improved by the use of	suitable  application-specific
       options,	deviating from the default settings.

REFERENCES
       If you use locarna please cite us:

       Sebastian  Will,	Kristin	Reiche,	Ivo L. Hofacker, Peter F. Stadler, and
       Rolf Backofen.  Inferring non-coding RNA	families and classes by	 means
       of genome-scale structure-based clustering. PLOS	Computational Biology,
       3(4):e65, 2007. doi:10.1371/journal.pcbi.0030065

       Sebastian  Will,	 Tejal	Joshi,	Ivo L. Hofacker, Peter F. Stadler, and
       Rolf Backofen.  LocARNA-P: Accurate boundary  prediction	 and  improved
       detection of structural RNAs. RNA, 18(5):900-14, 2012.
       doi:10.1261/rna.029041.111

AVAILABILITY
       The latest LocARNA package release is available online at GitHub
       https://github.com/s-will/LocARNA and
       http://www.bioinf.uni-freiburg.de/Software/LocARNA/

EXAMPLES
       In the simplest case, the tool is called	with two  sequences  in	 fasta
       format or two alignments	in multiple fasta, clustal or stockholm	format
       like

	 locarna file1.fa file2.fa

       or

	 locarna file1.aln file2.aln

       Note that input formats can be mixed, as in

	 locarna file1.aln file2.stk
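
       When locarna is driven from a script, the command line can be
       assembled as in the following Python sketch. It uses only options
       documented above; build_locarna_command and the output file name
       result.stk are hypothetical, and the sketch builds the argument list
       without running the tool.

```python
import shlex


def build_locarna_command(input_1, input_2, *, clustal_out=None,
                          stockholm_out=None, struct_local=False,
                          sequ_local=False):
    """Assemble 'locarna [options] <Input 1> <Input 2>' as an argument list."""
    cmd = ["locarna"]
    if struct_local:
        cmd.append("--struct-local=true")
    if sequ_local:
        cmd.append("--sequ-local=true")
    if clustal_out is not None:
        cmd.append(f"--clustal={clustal_out}")
    if stockholm_out is not None:
        cmd.append(f"--stockholm={stockholm_out}")
    # Options come first, the two input files last.
    cmd += [input_1, input_2]
    return cmd


# Mixed input formats, alignment written to a hypothetical Stockholm file.
cmd = build_locarna_command("file1.aln", "file2.stk", stockholm_out="result.stk")
command_line = shlex.join(cmd)
```

       The resulting list can be passed to subprocess.run(cmd, check=True)
       on a system where locarna is installed.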

   Constraint Examples
       Anchor  and structure constraints can be	specified in extended versions
       of the Clustal format, in the LocARNA PP	2.0  format,  as  well	as  in
       Stockholm  format. Currently, the pairwise alignment tools of the pack-
       age do not support constraints in fasta-like input. Here	is an  example
       of constraints in Clustal format:

       CLUSTAL W

       vhuU	       AGCUCACAACCGAACCCAUUUGGGAGGUUGUGAGCU
       fruA	       CC-UCGAGGG-GAACCCGAAA-GGGACCCGAGA-GG
       #S	       (<<<<<<<<<......xxxx...............)
       #A1	       .............AAABB..................
       #A2	       .............12312..................

       The syntax (and semantics) of structure constraint strings (prefixed
       by #S) is that of RNAfold of the ViennaRNA package. Moreover, fixed
       structures prefixed by #FS are accepted; fixed structures can contain
       pseudoknots encoded by different bracket symbols.

       Anchors are specified by naming columns, where names can consist of
       several characters. In the example, each name consists of two
       characters, such that the names are A1, A2, A3, B1, B2 for the
       respective columns.

       Constraints in PP format	are specified in the  same  way;  however,  in
       Stockholm format	we use different prefixes, such	that the example would
       look like

       # STOCKHOLM 1.0

       vhuU	       AGCUCACAACCGAACCCAUUUGGGAGGUUGUGAGCU
       fruA	       CC-UCGAGGG-GAACCCGAAA-GGGACCCGAGA-GG
       #=GC cS	       (<<<<<<<<<......xxxx...............)
       #=GC cA1	       .............AAABB..................
       #=GC cA2	       .............12312..................

       The prefix for fixed structures is '#=GC	cFS'.
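
       How the anchor rows of both examples combine into column names can be
       sketched in Python as follows. This is not the parser used by
       locarna: the regular expression, the function name anchor_names and
       the rule that a column is unnamed only if every row holds '.' are
       assumptions made for illustration.

```python
import re

# Matches "#A1 ..." (Clustal/PP style) and "#=GC cA1 ..." (Stockholm style).
ANCHOR_ROW = re.compile(r"^(?:#A|#=GC cA)(\d+)\s+(\S+)\s*$")


def anchor_names(lines, blank="."):
    """Map 0-based column index to anchor name, read top-down over the rows."""
    rows = {}
    for line in lines:
        match = ANCHOR_ROW.match(line.strip())
        if match:
            rows[int(match.group(1))] = match.group(2)
    if not rows:
        return {}
    ordered = [rows[index] for index in sorted(rows)]
    if len({len(row) for row in ordered}) != 1:
        raise ValueError("anchor rows must have equal length")
    names = {}
    for column, chars in enumerate(zip(*ordered)):
        if any(c != blank for c in chars):
            names[column] = "".join(chars)
    return names


# Anchor rows as in the examples above (13 dots, 5 names, 18 dots).
clustal_rows = ["#A1   " + "." * 13 + "AAABB" + "." * 18,
                "#A2   " + "." * 13 + "12312" + "." * 18]
stockholm_rows = ["#=GC cA1   " + "." * 13 + "AAABB" + "." * 18,
                  "#=GC cA2   " + "." * 13 + "12312" + "." * 18]
names = anchor_names(clustal_rows)
```

       Both variants yield the names A1, A2, A3, B1, B2 for five consecutive
       columns; structure rows such as #S or #=GC cS are ignored by this
       sketch.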

AUTHOR
       This  man  page is written and maintained by Sebastian Will. It is part
       of the LocARNA package.

REPORTING BUGS
       Report bugs to <will (at) informatik.uni-freiburg.de>.

COPYRIGHT
       Copyright 2005- Sebastian Will. The LocARNA package is released under
       the GNU General Public License v3.0.

SEE ALSO
       The LocARNA PP 2.0 format is described online at
       http://www.bioinf.uni-freiburg.de/Software/LocARNA/PP/

LocARNA	2.0.0			 November 2022			    LOCARNA(1)