MLOCARNA(1)	      User Contributed Perl Documentation	   MLOCARNA(1)

NAME
       MLocARNA	- multiple alignment of	RNA

SYNOPSIS
       mlocarna	[options] <fasta file>

DESCRIPTION
       MLocarna	computes a multiple sequence-structure alignment of RNA
       sequences. The structure	of these sequences does	not have to be known
       but is inferred based on simultaneous alignment and folding.

       Generally, mlocarna takes multiple sequences as input, given in a fasta
       file. The fasta file can	be extended to specify structure and anchor
       constraints that	respectively control the possible foldings and
       possible	alignments. The	main outcome is	a multiple alignment together
       with a consensus	structure.

       Technically, mlocarna works as a front end to the pairwise alignment
       tools locarna, locarna_p, and sparse (and even carna), which are
       employed to construct the multiple alignment progressively.

       Going beyond the	basic progressive alignment scheme, Mlocarna
       implements probabilistic	consistency transformation and iterative
       alignment, which	are available in probabilistic mode. Moreover, the
       LocARNA package provides	an alternative multiple	alignment tool
       "locarnate", which generates alignments based on	T-Coffee using (non-
       probabilistic) consistency transformation.

OPTIONS
   Load	configurations from file
       --configure=file
	   Load a parameter set from a configuration file of options and
	   option-value pairs. This enables specifying (sets of) default
	   parameters for mlocarna, which can still be modified by other
	   options to mlocarna. Command line arguments always take precedence
	   over this configuration. Each line holds a single entry: either an
	   option name alone or an option-value pair of the form option: value.
	   Whitespace and '#'-prefixed comments are ignored.

   Major alignment modes
       By default, mlocarna performs progressive alignment, where the
       progressive alignment steps are computed by the pairwise aligner
       locarna. The first steps are based on sequences and dot plots
       (RNAfold -p); subsequent steps are based on partial alignments and
       their consensus dot plots.

       --probabilistic
	   In probabilistic  mode,  mlocarna  scores  alignments  using	 match
	   probabilities  that	are  computed by a partition function approach
	   [tech. details:  the	 probability  computation  is  implemented  in
	   locarna_p; the probability-based scoring is performed by locarna in
	   mea	mode].	This  enables  mlocarna	 to  consistency-transform the
	   probabilities  (option  --consistency-transform)  and  to   compute
	   reliabilities.   The	 tool  reliability-profile.pl  is  provided to
	   visualize reliability profiles. Reliabilities can also be used  for
	   iterating  the  alignment  with  reliably  aligned  base  pairs  as
	   structural constraints (option --it-reliable-structure).

       --sparse
	   Apply the sparsified	alignment algorithm SPARSE  for	 all  pairwise
	   alignments  (instead	 of  the  default  pairwise  aligner locarna).
	   SPARSE  supports  stronger  sparsification  for  faster   alignment
	   computation	and  increases	the  structure prediction capabilities
	   over	locarna.

   Controlling Output
       --tgtdir
	   Target directory. All output	files are written to  this  directory.
	   By default, the target directory name is derived from the input
	   file name by replacing the suffix fa with out (or by appending out).
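
	   The naming rule above can be sketched as follows. This is a minimal
	   illustration, not mlocarna's own code: the function name is
	   hypothetical, and the treatment of names that do not end in ".fa"
	   (simply appending ".out") is an assumption based on the wording
	   above.

```python
def default_target_dir(fasta_path: str) -> str:
    """Derive a target directory name from the input file name.

    Assumption: a trailing ".fa" is replaced by ".out"; for any other
    file name, ".out" is appended unchanged.
    """
    suffix = ".fa"
    if fasta_path.endswith(suffix):
        return fasta_path[: -len(suffix)] + ".out"
    return fasta_path + ".out"


print(default_target_dir("example.fa"))       # example.out
print(default_target_dir("sequences.fasta"))  # sequences.fasta.out
```

	   Passing --tgtdir explicitly avoids relying on this rule.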

       -v, --verbose
	   Turn on verbose output. Shows progress of computation of all-2-all
	   pairwise  alignments	for guide tree computation; shows intermediary
	   alignments during the progressive alignment computation.

       --moreverbose
	   Be  even  more  verbose:  additionally  shows  parameters  for  the
	   pairwise  aligner;  moreover,  the calls and	output of the RNA base
	   pair	probability computations  as  well  as	the  pairwise  aligner
	   during progressive alignment.

       -q, --quiet
	   Be quiet.

       --keep-sequence-order
	   Preserve  sequence  order  of  the  input  in  the final alignment.
	   Affects output to stdout and	results/result.aln.

       --stockholm
	   Write STOCKHOLM files of all	final and intermediate alignments  (in
	   addition to CLUSTALW	files).

       --consensus-structure
	   Type of consensus structures written to stockholm output (and
	   screen in verbose modes) [alifold|mea|none] (default: none). This
	   includes intermediate alignments of the progressive multiple
	   alignment. If not explicitly specified otherwise, the option
	   --alifold-consensus-dp implicitly sets this to alifold. Note that
	   the alifold consensus of the final alignment is computed and
	   printed regardless of this option.

       -w, --width=columns (120)
	   Output width	for sequences in clustal-like  and  stockholm  output;
	   note	that the clustalw standard format requires 60 or less.

       --write-structure
	   Write guidance structure in output to stdout. This provides some
	   insight into the influence of structure on the generated pairwise
	   alignments. The guidance structure shows the base pairs 'predicted'
	   by each pairwise locarna (or sparse) alignment. These structures
	   should not be mistaken for predicted consensus structures of
	   multiple alignments. Consensus structures can be more adequately
	   derived from the multiple alignment. For this reason, mlocarna
	   reports the consensus structure by RNAalifold.

   Locality
       --free-endgaps
	   Allow  free	endgaps.  (Corresponds	to  pairwise  locarna	option
	   --free-endgaps "++++".)

       --free-endgaps-3
	   Allow free endgaps 3'.

       --free-endgaps-5
	   Allow free endgaps 5'.

       --sequ-local=bool (false)
	   Turns on/off sequence locality [def=off]. Sequence locality refers
	   to the usual form of local alignment. If on, mlocarna bases all
	   calculations on local pairwise alignments, which determine the best
	   alignments of subsequences (disregarding dissimilar starts and
	   ends). Note that truly local structure alignments as well as local
	   multiple alignments are still a matter of research; so don't expect
	   perfect results in all instances.

       --struct-local=bool (false)
	   Turns on/off structure locality [def=off]. Structural locality
	   enables skipping entire substructures in alignments. In pairwise
	   alignments, this allows excluding one subsequence in each loop,
	   which guarantees that the (structure locally) aligned parts of the
	   sequences are always connected w.r.t. the predicted structure but
	   not necessarily consecutive in the sequence. Structure locality
	   does not imply sequence locality; rather, the two concepts are
	   orthogonal.

       --penalized=score
	   Variant of sequence local alignment (cf. --sequ-local),  where  the
	   specified  penalty  score  is subtracted for	each base in the local
	   alignment. [Experimental]

   Pairwise alignment and scoring
       --indel=score (-150)
	   Score of each single	base insertion or deletion.

       --indel-opening=score (-750)
	   Score of opening an insertion or deletion, i.e. a score that is
	   charged once for each consecutive run of deletions or insertions.
	   Indel opening score and indel score define the affine scoring of
	   gaps.
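
	   As a worked illustration of affine gap scoring, the sketch below
	   assumes the common convention that a run of k gapped bases scores
	   indel-opening + k * indel. The exact formula used inside the
	   pairwise aligners is not stated here, so treat this as an
	   assumption. The function name is hypothetical; the defaults are the
	   documented ones.

```python
def affine_gap_score(length: int, indel: int = -150,
                     indel_opening: int = -750) -> int:
    """Score of one consecutive run of `length` insertions or deletions.

    Assumption: the opening score is charged once per run and the indel
    score once per gapped base.
    """
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0
    return indel_opening + length * indel


# One run of four gapped bases is cheaper than two runs of two,
# because the opening score is paid only once.
print(affine_gap_score(4))      # -1350
print(2 * affine_gap_score(2))  # -2100
```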

       -m, --match=score (50)
	   Score of a base match (unless ribosum-based scoring)

       -M, --mismatch=score (0)
	   Score of a base mismatch (unless ribosum-based scoring)

       --use-ribosum=bool (true)
	   Use	ribosum	scores for scoring base	matches	and base pair matches;
	   note	that tau=0 suppresses any effect on the	latter.

       --ribosum-file=file
	   File	 specifying  the  Ribosum  base	 and  base-pair	 similarities.
	   [default: use RIBOSUM85_60 without requiring	a Ribosum file.]

       -s, --struct-weight=score (200)
	   Maximum weight of one predicted arc, aka base pair. Note that this
	   means that the maximum weight of an arc match is twice as high. The
	   maximum weight is assigned to base pairs with (almost) probability
	   1 in the dot plot; less probable base pairs receive gradually
	   decreasing scores. The struct-weight factor balances the score
	   contribution from structure against the score contribution from
	   base similarity scores (e.g. ribosum scores).

       -e, --exp-prob=prob
	   Expected probability	of a base pair.

       -t, --tau=factor	(0)
	   Tau factor in percent. The tau factor controls the contribution  of
	   sequence-dependent scores to	the score of arc matches.

       -E, --exclusion=<score> (0)
	   Weight of an exclusion, i.e. an omitted subsequence in a loop,
	   which applies only to structural local alignment.

       --stacking
	   Use stacking	terms. In this case, stacked arcs are scored based  on
	   conditional	probabilities (conditioned by their stacked inner arc)
	   rather than unconditioned base pair probabilities. [Experimental]

       --new-stacking
	   Use new stacking terms; cf. --stacking. These terms directly	 award
	   bonuses to stacking.	[Experimental]

   Alignment heuristics
       Several parameters are available to speed up the pairwise alignment
       computations heuristically. Choosing these parameters reasonably is
       necessary to achieve a good trade-off between speed and accuracy,
       especially for large alignment instances.

       -p, --min-prob=probability (0.001)
	   Minimum base pair / arc probability. Arcs with lower probability in
	   the input RNA structure ensembles are ignored.

       -P, --tree-min-prob=probability
	   Minimum base pair probability for constructing the guide tree. This
	   probability can be set separately for the all-2-all comparison that
	   constructs the guide tree and for the progressive/iterative
	   alignment steps.

       --max-bps-length-ratio=factor (0.0)
	   Maximal  ratio  of  the  number  of	base pairs divided by sequence
	   length (default: no effect)

       -D, --max-diff-am=difference
	   Maximal difference for lengths of matched arcs. Two arcs that have
	   a higher difference of their lengths are ignored. This speeds up
	   the alignment, since fewer arc comparisons (i.e. fewer DP matrices)
	   have to be computed. [def: off/-1]
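
	   The filter can be pictured with the following sketch. It is an
	   illustration only: the function name is hypothetical, and the
	   definition of arc length as right end minus left end plus one is an
	   assumption.

```python
def arc_match_allowed(arc_a, arc_b, max_diff_am=-1):
    """Decide whether two arcs may be compared as an arc match.

    Arcs are (left, right) position pairs. A negative max_diff_am
    switches the filter off, mirroring the documented default of -1.
    Assumption: arc length is right - left + 1.
    """
    if max_diff_am < 0:
        return True
    length_a = arc_a[1] - arc_a[0] + 1
    length_b = arc_b[1] - arc_b[0] + 1
    return abs(length_a - length_b) <= max_diff_am


print(arc_match_allowed((1, 30), (5, 20), max_diff_am=25))  # True
print(arc_match_allowed((1, 30), (5, 20), max_diff_am=10))  # False
```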

       -d, --max-diff=difference
	   Maximal difference of the positions of any two bases that are
	   considered to be aligned. Bases with higher difference are
	   generally not aligned. This allows banding of the DP matrices and
	   thus can result in high speed-ups. Note that the semantics change
	   in the context of a reference alignment specified with
	   --max-diff-aln. Then, the difference to the reference alignment is
	   restricted. [def: off/-1]
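
	   To give a feeling for the effect of banding, the sketch below
	   counts the DP cells (i, j) that remain when only positions with
	   |i - j| <= max-diff are considered. This is a simplification and
	   an assumption: it ignores how the aligners treat sequences of
	   different lengths and does not model the reference-alignment
	   variant. The function name is hypothetical.

```python
def banded_cells(len_a: int, len_b: int, max_diff: int = -1) -> int:
    """Count DP cells (i, j) with |i - j| <= max_diff.

    A negative max_diff means no banding, so all cells are counted.
    """
    count = 0
    for i in range(1, len_a + 1):
        for j in range(1, len_b + 1):
            if max_diff < 0 or abs(i - j) <= max_diff:
                count += 1
    return count


full = banded_cells(200, 200)
band = banded_cells(200, 200, max_diff=60)
print(f"{band} of {full} cells remain")
```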

       --max-diff-at-am=difference
	   Same	restriction as max-diff	but only at the	ends of	 arcs  in  arc
	   matches. [def: off/-1]

       --min-trace-probability=probability
	   Minimal   sequence	alignment   probability	 of  potential	traces
	   (probability-based  sequence	 alignment  envelope)	[default=1e-4,
	   moderate filter].

       --max-diff-aln=file
	   Computes  "realignment"  in	the environment	of the given reference
	   alignment (file in clustalw format)	by  constraining  the  maximum
	   difference  to this reference (controlled by	--max-diff). The input
	   sequences (and their	names) have to be identical to these alignment
	   sequences; however  the  alignment  is  allowed  to	contain	 extra
	   sequences, which are	ignored. In combination	with option --realign,
	   the	reference  alignment  is  taken	from the (main)	input file. In
	   this	case, the 'file' argument should be '.', but is	ignored	 (with
	   warning) otherwise.

       --max-diff-relax
	   Relax deviation constraints (cf. --max-diff-aln) in multiple
	   alignment. This option is useful if the default strategy for
	   realignment fails.

       -a, --min-am-prob=probability (0.001)
	   Minimum arc-match probability (filters output of locarna_p)

       -b, --min-bm-prob=probability (0.001)
	   Minimum base-match probability (filters output of locarna_p)

   Low-level selection of pairwise alignment tools and options
       --pw-aligner
	   Utilize   the   given   tool	  for  computing  pairwise  alignments
	   (def=locarna).

       --pw-aligner-p=tool
	   Utilize the given tool for computing	 partition  function  pairwise
	   alignments (def=locarna_p).

       --pw-aligner-options
	   Additional option string for	the pairwise alignment tool (def="").

       --pw-aligner-p-options
	   Additional  option  string  for  the	 partition  function  pairwise
	   alignment tool (def="").

   Controlling the guide tree construction
       --treefile=file
	   File	with guide tree	in NEWICK format. The given tree  is  used  as
	   guide   tree	  for	the  progressive  alignment.  This  saves  the
	   calculation of pairwise all-vs-all similarities and construction of
	   the guide tree.

       --similarity-matrix=file
	   File	with similarity	matrix.	The similarities  in  the  matrix  are
	   used	 to  construct	the  guide tree	for the	progressive alignment.
	   This	saves the calculation of pairwise all-vs-all similarities.

       --score-lists
	   Construct the guide tree from pairwise scores in files  scores*  in
	   the	subdirectory  scores  of the target directory.	The scores are
	   typically  precomputed,  possibly  in  a  distributed  way,	 using
	   --compute-pairwise-scores.

       --compute-pairwise-scores=k/N
	   Compute   only   the	  pairwise   alignments	 for  the  guide  tree
	   construction. Write scores to the file $tgtdir/scores/scores-$k and
	   terminate. By computing only	the k-th  fraction  of	N  parts,  the
	   option  supports  distributing  the	computation of the alignments.
	   Before computing the	pairwise scores, the dot plot files should  be
	   precomputed using --only-dps.  (see also: --score-lists)
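
	   The distributed workflow described above (dot plots first, then the
	   score parts, then the alignment from the collected score lists) can
	   be sketched by generating the command lines. The helper name is
	   hypothetical, and the numbering of parts as k = 1..N is an
	   assumption; only options documented in this page are used. The
	   sketch prints the commands and does not run them.

```python
import shlex


def distributed_score_commands(fasta, tgtdir, parts):
    """Return the command lines of a three-stage run as strings."""
    def command(*options):
        return shlex.join(["mlocarna", "--tgtdir", tgtdir, *options, fasta])

    # Stage 1: precompute the dot plot files only.
    commands = [command("--only-dps")]
    # Stage 2: one job per part; these could run on different machines
    # that share the target directory. Assumption: k runs from 1 to N.
    for k in range(1, parts + 1):
        commands.append(command(f"--compute-pairwise-scores={k}/{parts}"))
    # Stage 3: build the guide tree from the score files and align.
    commands.append(command("--score-lists"))
    return commands


for line in distributed_score_commands("example.fa", "example.out", 3):
    print(line)
```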

       --graphkernel
	   Use the graphkernel for constructing	the guide tree.

       --svmsgdnspdk=program
	   Specify  the	 svmsgdnspdk  program  (potentially  including	path).
	   Default: use	"svmsgdnspdk" in path.

       --fasta2shrep=program
	   Program  "fasta2shrep"  for	generating  graphs  from   the	 input
	   sequences  for  use	with  the  graph  kernel guide tree generation
	   (potentially	including path). Default:  use	"fasta2shrep_gspan.pl"
	   in path.

       --fasta2shrep-options=argument-string
	   Command  line arguments for fasta2shrep. Default: "-wins 200	-shift
	   50 -stack -t	3 -M 3".

   Controlling multiple	alignment construction
       --alifold-consensus-dp
	   Employs RNAalifold -p for generating	consensus dotplot  after  each
	   progressive	alignment  step.  This	replaces the default consensus
	   dotplot computation,	which averages over the	input dot plots.  This
	   method should be used with  care  in	 combination  with  structural
	   constraints,	 since	it  ignores  them  for	all  but  the pairwise
	   alignments of single	sequences. Furthermore,	note that it does  not
	   support --stacking or --new-stacking.

       --max-alignment-size=size
	   Limit  the maximum number of	sequences that are aligned together by
	   progressive	alignment.  This  can  be  used	 to  save  unnecessary
	   computations,  when producing a clustering of the input RNAs	rather
	   than	 constructing  a  single  multiple  alignment.	 [default:  no
	   limit].

       --local-progressive
	   Align  only	the  subalignment  of  locally aligned subsequences in
	   subsequent steps of the progressive multiple	alignment. Note:  this
	   is  only  effective	if  local alignment is turned on. (Default for
	   sequence local alignment; turn off by --global-progressive)

       --global-progressive
	   Use alignments including "locality gaps" in subsequent steps	of the
	   progressive multiple	alignment. Note: this  is  only	 effective  if
	   local alignment is turned on. (Opposite of --local-progressive)

       --consistency-transformation
	   Apply  probabilistic	 consistency  transformation (only possible in
	   probabilistic mode).

       --iterate
	   Refine  iteratively	 after	 progressive   alignment.   Currently,
	   iterative  refinement  optimizes  the  SCI  or RELIABILITY (not the
	   locarna score)! Iterative refinement	 realigns  all	binary	splits
	   along the guide tree.

       --iterations=number
	   Refine  iteratively	for  given  number  of	iterations (or stop at
	   convergence).

       --it-reliable-structure=number
	   Iterate alignment <number> times with reliable structure. This works
	   only in probabilistic mode, when reliabilities can be computed.

   Further options for probabilistic mode
       --pf-only-basematch-probs
	   Use	 only	base   match   probabilities   (no   base  pair	 match
	   probabilities).

       --extended-pf
	   Use extended precision for partition function values. This
	   increases run-time and space (less than 2x), but enables handling
	   significantly larger instances.

       --quad-pf
	   Use quad precision for partition function values. Even more
	   precision than extended pf, but usually much slower (overrides
	   extended-pf).

       --pf-scale=<scale>
	   Scale of partition function;	use for	avoiding  overflow  in	larger
	   instances.

       --fast-mea
	   Compute base	match probabilities using Gotoh	PF-algorithm.

       --mea-alpha
	   Weight of unpaired probabilities in fast mea	mode.

       --mea-beta
	   Weight of base pair match contribution in probabilistic mode.

       --mea-gamma
	   Reserved parameter for fast-mea mode.

       --mea-gapcost
	   Turn	on gap penalties in probabilistic/mea mode (default: off).

       --write-probs / --no-write-probs
	   Write / don't write probabilities (of base matches and arc matches)
	   to the target directory. This setting can be overridden
	   individually by the options --(no-)write-bm-probs and
	   --(no-)write-am-probs. Use this to make the probability files
	   available for post-processing. (default: don't write).

       --write-bm-probs	/ --no-write-bm-probs
	   Write / don't write base match probabilities to files in target dir
	   (default: don't write).

       --write-am-probs	/ --no-write-am-probs
	   Write / don't write arc match probabilities to files in target dir
	   (default: don't write).

   Miscellaneous modes of operation
       --realign
	   Realignment mode. In	this mode, the input must be in	clustal	format
	   and	is  interpreted	 as  alignment	of  the	 input	sequences; the
	   sequences are obtained by removing all gap symbols.	Moreover,  the
	   given  alignment  is	set as reference alignment for --max-diff-aln.
	   Structure and anchor	constraints  can  be  specified	 as  consensus
	   constraints	in  the	input; constraints are specified as 'alignment
	   strings' with names '#A1', '#S', or '#FS' for anchor, structure, or
	   fixed structure constraints,	respectively. Characters in the	 '#A1'
	   anchor  specification  other	than '-' and '.' constrain the aligned
	   residues in the respective column to	 remain	 aligned  (blanks  are
	   disallowed;	 annotations  '#A2',  '#A3',  ...  are	ignored).  The
	   consensus structure constraint is equivalent	to  constraining  each
	   single  sequence  by	 the projection	of the consensus constraint to
	   the sequence	(removing all base pairs  with	at  least  one	gapped
	   end).
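
	   The two operations described above, removing gap symbols and
	   projecting a consensus structure constraint to a single sequence,
	   can be sketched as follows. This is an illustration under stated
	   assumptions, not mlocarna's implementation: the function names are
	   hypothetical, only '-' is treated as a gap symbol, and only
	   '(' / ')' base pairs are handled.

```python
GAP = "-"  # assumption: '-' is the only gap symbol in the alignment rows


def degap(aligned_seq: str) -> str:
    """Recover the input sequence by removing all gap symbols."""
    return aligned_seq.replace(GAP, "")


def project_structure(aligned_seq: str, consensus: str) -> str:
    """Project a consensus dot-bracket string onto one alignment row.

    Base pairs with at least one gapped end are removed; afterwards the
    gapped columns are dropped.
    """
    if len(aligned_seq) != len(consensus):
        raise ValueError("row and constraint must have equal length")
    projected = list(consensus)
    open_columns = []
    for column, symbol in enumerate(consensus):
        if symbol == "(":
            open_columns.append(column)
        elif symbol == ")":
            if not open_columns:
                raise ValueError("unbalanced structure")
            left = open_columns.pop()
            if GAP in (aligned_seq[left], aligned_seq[column]):
                projected[left] = projected[column] = "."
    if open_columns:
        raise ValueError("unbalanced structure")
    return "".join(s for s, base in zip(projected, aligned_seq)
                   if base != GAP)


row = "G-CAUGC"
print(degap(row))                          # GCAUGC
print(project_structure(row, "((...))"))   # (....)
```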

       --dp-cache=directory
	   Use	directory  <dir> as cache for dot plot or pp files (useful for
	   avoiding multiple computation).

       --only-dps
	   Compute only	the pair probability files / dot  plots,  don't	 align
	   (useful for filling the dp-cache).

       --evaluate=file
	   Evaluate the given multiple alignment (clustalw aln format, or use
	   --eval-fasta). This requires that probabilities are already
	   computed (mlocarna --probabilistic) and present in the target
	   directory (--tgtdir).

       --eval-fasta
	   Assume  that	 alignment for evaluation (cf. --evaluate) is in fasta
	   format.

   Constraints
       --anchor-constraints=<file>
	   Read	anchor constraints from	bed format specification.

	   Anchor constraints in four-column bed format	specify	 positions  of
	   named  anchor  regions  per	sequence.  The	'contig' names have to
	   correspond to the fasta input sequence names. Anchor	names must  be
	   unique  per	sequence  and  regions	of the same name for different
	   sequences must have the same	length.	This constrains	the  alignment
	   to align all	regions	of the same name.

	   The	specification  of  anchors  via	this option removes all	anchor
	   definitions that may	be given directly in the fasta input file!
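
	   The two requirements stated above (anchor names are unique per
	   sequence, and regions of the same name have the same length) can be
	   checked before calling mlocarna. The following is a minimal sketch
	   with a hypothetical function name; it assumes whitespace-separated
	   columns and BED-style start/end coordinates, so the region length
	   is end minus start.

```python
def check_anchor_bed(bed_text: str) -> list:
    """Return a list of problems found in four-column anchor BED text."""
    problems = []
    seen = set()   # (sequence name, anchor name) pairs already read
    lengths = {}   # anchor name -> region length of its first occurrence
    for number, line in enumerate(bed_text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue
        if len(fields) != 4:
            problems.append(f"line {number}: expected 4 columns")
            continue
        seq_name, start, end, anchor = fields
        length = int(end) - int(start)
        if (seq_name, anchor) in seen:
            problems.append(
                f"line {number}: anchor {anchor!r} repeated in {seq_name!r}")
        seen.add((seq_name, anchor))
        if lengths.setdefault(anchor, length) != length:
            problems.append(
                f"line {number}: anchor {anchor!r} has length {length}, "
                f"expected {lengths[anchor]}")
    return problems


EXAMPLE_BED = """\
A   10      16      first_box
B   8       14      first_box
A   39      42      ACA-box
B   25      28      ACA-box
"""
print(check_anchor_bed(EXAMPLE_BED))  # []
```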

       --ignore-constraints
	   Ignore all constraints (anchor and structure	constraints)  even  if
	   given.

   RNA folding (RNAfold/RNAplfold)
       --noLP /	--LP
	   Disallow/Allow lonely pairs (default: Disallow).

       --maxBPspan
	   Limit maximum span of base pairs (default off).

       --relaxed-anchors
	   Relax   semantics  of  anchor  constraints  (default	 off,  meaning
	   'strict' semantics).	For lexicographically ordered  anchors,	 where
	   each	 sequence  is  annotated  with	exactly	 the  same names, both
	   semantics are equivalent; thus, in this  common  case,  the	subtle
	   differences	can be ignored.	In strict semantics, anchor names must
	   be ordered lexicographically	and can	only be	aligned	in this	order.
	   In relaxed semantics, the only requirement  is  that	 equal	anchor
	   names  are  matched.	Consequently, anchor names that	don't occur in
	   all sequences could be overwritten (if two names  are  assigned  to
	   the same position) or even introduce	inconsistencies.

       --plfold-span=span
	   Use RNAplfold with span.

       --plfold-winsize=ws
	   Use RNAplfold with window of	size ws	(default=2*span).

       --rnafold-parameter=<file>
	   Parameter file for RNAfold (RNAfold's -P option)

       --rnafold-temperature=<temp>
	   Temperature for RNAfold (RNAfold's -T option)

       --skip-pp
	   Skip computation of pair probs if the probabilities already exist.
	   Non-existing ones are still computed.

       --no-bpp-precomputation
	   Switch off precomputation of base pair probabilities. Overwrite
	   potentially existing input files (compare skip-pp). For use with
	   special pairwise aligners (e.g. locarna_n) that recompute the base
	   pair probabilities at each invocation.

       --in-loop-probabilities
	   Turn on precomputation of in loop probabilities. For use with
	   special pairwise aligners (e.g. locarna_n) that use such
	   probabilities.

   Multithreading
       --threads, --cpus=number
	   Use	the  given  number of threads for computing pair probabilities
	   and all-2-all alignments in parallel	(multicore/processor support).

	   Be aware: mlocarna does not seem to scale well for more than a few
	   threads (often only 2 or 3). Using more threads is often
	   detrimental, since it strongly increases memory consumption due to
	   the current perl threading implementation. This unfortunate
	   behavior seems hard to improve without a major rewrite of the
	   software.

   Getting Help
       --help
	   Brief help message

       --man
	   Full	documentation

       The sequences are given in the input file <file> in mfasta format. All
       results are written to a target directory <tgtdir>. If a tree file is
       given (--treefile), the contained tree (in NEWICK format) is used as
       guide tree for the progressive alignment. The final results are
       collected in <tgtdir>/results. The final multiple alignment is
       <tgtdir>/results/result.aln.

EXAMPLES
   Calling mlocarna
       [Note that the LocARNA distribution provides files of the following and
       other examples in Data/Examples.]

       Sequences are typically given in	plain fasta format like

	   example.fa
	   ----------------------------------------
	   >fruA
	   CCUCGAGGGGAACCCGAAAGGGACCCGAGAGG
	   >fdhA
	   CGCCACCCUGCGAACCCAAUAUAAAAUAAUACAAGGGAGCAGGUGGCG
	   >vhuU
	   AGCUCACAACCGAACCCAUUUGGGAGGUUGUGAGCU
	   ----------------------------------------

       To align	these sequences, simply	call

	 mlocarna example.fa

       Usually,	 it makes sense	to set additional options; this	is either done
       on the command line or via  configuration  files.  A  reasonable	 small
       configuration for global	alignment of large instances would be

	   short-example.cfg
	   ----------------------------------------
	   max-diff-am:	25
	   max-diff:	60
	   min-prob:	0.01
	   plfold-span:	100
	   indel:	-50
	   indel-open:	-750
	   threads:	8   # <- adapt to your hardware
	   alifold-consensus-dp
	   ----------------------------------------

       To use it, call

	   mlocarna --config short-example.cfg example.fa

       which is	equivalent to

	   mlocarna --max-diff-am 25 --max-diff	60 --min-prob 0.01 \
		    --indel -50	--indel-open -750 \
		    --plfold-span 100 --threads	8 --alifold-consensus-dp \
		    example.fa
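
       The correspondence between configuration entries and command line
       options can be made explicit with a small sketch. It is not mlocarna's
       own parser: the function name is hypothetical, and the handling of
       inline comments and of the ':' separator is inferred from the example
       above. Values that themselves contain '#' or ':' are not handled.

```python
def config_to_args(config_text: str) -> list:
    """Translate configuration file text into command line arguments."""
    args = []
    for raw_line in config_text.splitlines():
        # Drop '#' comments (assumed to run to the end of the line).
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        if ":" in line:
            name, value = line.split(":", 1)
            args += ["--" + name.strip(), value.strip()]
        else:
            # An option without a value, e.g. alifold-consensus-dp.
            args.append("--" + line)
    return args


SHORT_EXAMPLE = """\
max-diff-am: 25
max-diff:    60
min-prob:    0.01
plfold-span: 100
indel:       -50
indel-open:  -750
threads:     8   # <- adapt to your hardware
alifold-consensus-dp
"""
print(" ".join(["mlocarna"] + config_to_args(SHORT_EXAMPLE) + ["example.fa"]))
```

       Apart from the order of the options, the printed line matches the
       explicit call shown above.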

       For  probabilistic alignment with consistency transformation, call

	 mlocarna --probabilistic --consistency-transform example.fa

       In  both	 cases,	 mlocarna  writes  the main results to stdout and more
       detailed	results	to  the	 target	 directory  example.out.  The  results
       directory  is  overwritten if it	exists already.	To avoid this, one can
       specify the target directory (--tgtdir).
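
       For post-processing, the final alignment written to
       results/result.aln below the target directory can be read with a few
       lines of code. The sketch below assumes the usual clustal layout
       (a header line starting with CLUSTAL, then blocks of "name sequence"
       rows, with annotation rows starting with whitespace); the function
       name is hypothetical.

```python
def parse_clustal(text: str) -> dict:
    """Collect the aligned rows of a clustal-format alignment by name."""
    sequences = {}
    for line in text.splitlines():
        # Skip blank lines, the header, and indented annotation rows.
        if not line.strip() or line.startswith("CLUSTAL") or line[0].isspace():
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        name, chunk = fields[0], fields[1]
        # Rows of the same name in later blocks are concatenated.
        sequences[name] = sequences.get(name, "") + chunk
    return sequences


EXAMPLE_ALN = """\
CLUSTAL W

fruA  --CCUC
vhuU  AGCUCA

fruA  GAGG
vhuU  CAAC
"""
print(parse_clustal(EXAMPLE_ALN))
```

       For a real run, the text would be read from a file such as
       example.out/results/result.aln.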

   Use of constraints
       Mlocarna	 supports  structure  constraints  for	folding	  and	anchor
       constraints  for	 alignment. Both types of constraints can be specified
       in extension of the  standard  fasta  format  via  'constraint  lines'.
       Fasta-ish input with constraints	looks like this

	   example-w-constraints.fa
	   ----------------------------------------
	   >A
	   GACCCUGGGAACAUUAACUACUCUCGUUGGUGAUAAGGAACA
	   ..((.(....xxxxxx...................))).xxx #S
	   ..........000000.......................111 #1
	   ..........123456.......................123 #2
	   >B
	   ACGGAGGGAAAGCAAGCCUUCUGCGACA
	   .(((....xxxxxx.......))).xxx	#S
	   ........000000...........111	#1
	   ........123456...........123	#2
	   ----------------------------------------

       The same anchor constraints (as given by the lines tagged #1 and #2)
       can alternatively be specified in bed format by the entries

	   example-anchors.bed
	   ----------------------------------------
	   A   10      16      first_box
	   B   8       14      first_box
	   A   39      42      ACA-box
	   B   25      28      ACA-box
	   ----------------------------------------

       where anchor regions (boxes) have  arbitrary  but  matching  names  and
       contig/sequence	 names	 correspond  to	 the  sequence	names  of  the
       fasta(-like) input.
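
       How the anchor lines of the fasta-like input relate to the bed entries
       can be sketched as follows. The sketch treats each run of equal
       non-'.' characters in the first anchor name line (#1) as one region
       and reports it with a zero-based start and an exclusive end, which
       reproduces the coordinates of the example above. Grouping by the
       first name line and the function names are assumptions made for this
       illustration; the box names must be supplied by the user.

```python
from itertools import groupby


def anchor_regions(first_name_line: str) -> list:
    """Return (start, end) for each run of equal non-'.' characters."""
    regions = []
    position = 0
    for symbol, run in groupby(first_name_line):
        length = len(list(run))
        if symbol != ".":
            regions.append((position, position + length))
        position += length
    return regions


def bed_lines(seq_name: str, first_name_line: str, box_names: list) -> list:
    """Format the regions of one sequence as four-column bed entries."""
    regions = anchor_regions(first_name_line)
    if len(regions) != len(box_names):
        raise ValueError("need exactly one name per anchor region")
    return [f"{seq_name}\t{start}\t{end}\t{name}"
            for (start, end), name in zip(regions, box_names)]


# The '#1' lines of sequences A (42 columns) and B (28 columns) from the
# example, written without the trailing tag.
LINE_A = "." * 10 + "0" * 6 + "." * 23 + "1" * 3
LINE_B = "." * 8 + "0" * 6 + "." * 11 + "1" * 3
for entry in (bed_lines("A", LINE_A, ["first_box", "ACA-box"])
              + bed_lines("B", LINE_B, ["first_box", "ACA-box"])):
    print(entry)
```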

       Given, e.g.

	   example-wo-anchors.fa
	   ----------------------------------------
	   >A
	   GACCCUGGGAACAUUAACUACUCUCGUUGGUGAUAAGGAACA
	   ..((.(....xxxxxx...................))).xxx #S
	   >B
	   ACGGAGGGAAAGCAAGCCUUCUGCGACA
	   .(((....xxxxxx.......))).xxx	#S
	   ----------------------------------------

       one calls

	 mlocarna --anchor-constraints example-anchors.bed  example-wo-anchors.fa

   Realignment
       In realignment mode (option --realign),	mlocarna  is  called  with  an
       input alignment in clustal format, e.g.

	 mlocarna --realign example-realign.aln

       This allows defining constraints as 'consensus constraints' in the
       input, e.g.

	   example-realign.aln
	   ----------------------------------------
	   CLUSTAL W

	   fruA		      --CCUCGAGGGGAACCCGAA-------------AGGGACCCGAGAGG--
	   vhuU		      AGCUCACAACCGAACCCAUU-------------UGGGAGGUUGUGAGCU
	   fdhA		      CGCCACCCUGCGAACCCAAUAUAAAAUAAUACAAGGGAGCAG-GUGGCG
	   #A1		      ..*...........CCC.............................5..
	   #S		      ((((((.((((...(((.................))).)))).))))))
	   ----------------------------------------

       Note that anchor names are arbitrary and the consensus structure is
       'projected' to the single sequences. Moreover, the input alignment can
       be used as reference for fast limited realignment. For example, to
       realign within distance 5 of the reference alignment, call

	 mlocarna --realign example-realign.aln	--max-diff 5 --max-diff-aln .

AUTHORS
       Sebastian Will, Christina Otto (ExpaRNA-P, sparsification classes for
       ExpaRNA-P and SPARSE), Milad Miladi (SPARSE)

ONLINE INFORMATION
       For download and online information, see
       <https://github.com/s-will/LocARNA> and
       <http://www.bioinf.uni-freiburg.de/Software/LocARNA>.

       Latest releases are available as source code on Github at
       <https://github.com/s-will/LocARNA/releases>.

REFERENCES
       Sebastian  Will,	Kristin	Reiche,	Ivo L. Hofacker, Peter F. Stadler, and
       Rolf Backofen.  Inferring non-coding RNA	families and classes by	 means
       of   genome-scale   structure-based   clustering.   PLOS	 Computational
       Biology,	3 no. 4	pp. e65, 2007.	doi:10.1371/journal.pcbi.0030065

       Sebastian Will, Tejal Joshi, Ivo	L. Hofacker,  Peter  F.	 Stadler,  and
       Rolf  Backofen.	LocARNA-P:  Accurate  boundary prediction and improved
       detection of  structural	 RNAs.	RNA,  18  no.  5  pp.  900-914,	 2012.
       doi:10.1261/rna.029041.111

       Sebastian Will, Michael Yu, and Bonnie Berger. Structure-based Whole
       Genome Realignment Reveals Many Novel Non-coding RNAs. Genome Research,
       23 no. 6 pp. 1018-1027, 2013. doi:10.1101/gr.137091.111

perl v5.32.1			  2022-11-19			   MLOCARNA(1)
