RNAForester(2.0.1)					    RNAForester(2.0.1)

NAME
       RNAforester - compare RNA secondary structures via forest alignment

SYNOPSIS
       RNAforester [options]
       Options are:
       --help			 shows this help info
       --version		 shows version information
       -d			 calculate distance instead of similarity
       -r			 calculate relative score
       -l			 local similarity
       -so=int			 local suboptimal alignments within int%
       -s			 small-in-large	similarity
       -m			 multiple alignment mode
       -mt=double		 clustering threshold
       -mc=double		 clustering cutoff
       -p			 predict structures from sequences
       -pmin=double              minimum basepair frequency for prediction
       -pm=int			 basepair(bond)	match score
       -pd=int			 basepair bond indel score
       -bm=int			 base match score
       -br=int			 base mismatch score
       -bd=int			 base indel score
       --RIBOSUM		 RIBOSUM85-60 scoring matrix
       -cmin=double              minimum basepair frequency for consensus
                                 structure
       -2d                       generate alignment 2D plots in PostScript
                                 format
       --2d_hidebasenum		 hide base numbers in 2D plot
       --2d_basenuminterval=n	 show every n-th base number
       --2d_grey		 use only grey colors in 2D plots
       --2d_scale=double	 scale factor for the 2d plots
       --score			 compute only scores, no alignment
       --fasta			 generate fasta	output of alignments
       -f=file			 read input from file
       --noscale		 suppress output of scale

DESCRIPTION
       RNAforester  calculates	RNA secondary structure	alignments, both pair-
       wise and	multiple.  The comparison is based on the tree alignment model
       [1,2].

   Model
       The model for pairwise and multiple  alignment  differs	slightly.  The
       pairwise	 model	is  based on the following edit	operations on sequence
       and structure:

       basepair replacement/match: A basepair, INCLUDING the paired bases, is
       substituted by another basepair. The scoring contribution is p_m.

       basepair bond deletion: A basepair bond WITHOUT the paired bases is
       removed. The scoring contribution is p_d.

       sequence edit operations: Base match, base mismatch and base deletion
       give the scoring contributions b_m, b_r and b_d, respectively.

       In the multiple alignment mode (-m), parameter p_m  is  the  score  for
       matching	a basepair bond	WITHOUT	the paired bases.  Thus, the score for
       a  whole	 basepair replacement is p_m+2*b_m. For	more information about
       multiple	alignment refer	to the description of parameter	-m.
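
       As a worked illustration of the difference between the two models, the
       sketch below computes the score of one whole basepair replacement. The
       function name and the parameter values are assumptions chosen only for
       the arithmetic; they are not RNAforester defaults.

```python
def basepair_replacement_score(p_m: int, b_m: int, multiple: bool) -> int:
    """Score for replacing one whole basepair (bond plus both paired bases).

    Pairwise model: p_m already covers the bond and the paired bases.
    Multiple model (-m): p_m covers only the bond, so each of the two
    paired bases adds a base match score b_m.
    """
    if multiple:
        return p_m + 2 * b_m
    return p_m


# Hypothetical parameter values, for illustration only.
P_M, B_M = 10, 1
print(basepair_replacement_score(P_M, B_M, multiple=False))
print(basepair_replacement_score(P_M, B_M, multiple=True))
```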

   Input
       RNAforester reads  RNA  secondary structures from stdin by default.  It
       accepts sequences and structures	in Fasta format, where matching	brack-
       ets symbolize base pairs	and unpaired bases are represented by a	dot. A
       line containing the primary sequence  can  precede  the	RNA  secondary
       structure(s). An	example	is given below:

	 > test
	 accaguuacccauucgggaaccggu   primary structure
	 .((..(((...)))..((..)))).   secondary structure

       All  characters	after a	"blank"	are ignored and	all '-'	characters are
       removed.	 The  program will continue to read  new  structures  until  a
       line consisting of the single character @ or an end of file is  encoun-
       tered. Input lines starting with	> can contain a	structure name.
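
       The reading rules above can be sketched in Python as follows. This is
       an illustration only, not RNAforester's own parser. The function names
       read_structures and base_pairs are hypothetical, and two details are
       assumptions: leading whitespace is stripped before a line is examined,
       and a line made up solely of '(', ')' and '.' is treated as a structure
       line.

```python
STRUCTURE_CHARS = set("().")


def read_structures(lines):
    """Collect (name, sequence, structure) records from input lines."""
    records = []
    name = None
    sequence = None
    for raw in lines:
        stripped = raw.strip()  # assumption: leading indentation is not significant
        if stripped == "@":     # a line consisting of '@' ends the input
            break
        if not stripped:
            continue
        if stripped.startswith(">"):
            name = stripped[1:].strip()
            sequence = None
            continue
        # Ignore everything after the first blank, then remove '-' characters.
        token = stripped.split()[0].replace("-", "")
        if not token:
            continue
        if set(token) <= STRUCTURE_CHARS:
            records.append((name, sequence, token))
        else:
            sequence = token
    return records


def base_pairs(structure):
    """Return sorted (open, close) index pairs for matching brackets."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unmatched ')' at position %d" % i)
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError("unmatched '(' at position %d" % stack[-1])
    return sorted(pairs)


example = [
    "> test",
    "accaguuacccauucgggaaccggu   primary structure",
    ".((..(((...)))..((..)))).   secondary structure",
]
for rec_name, rec_seq, rec_struct in read_structures(example):
    print(rec_name, rec_seq, rec_struct, base_pairs(rec_struct))
```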

       Option -f=filename makes RNAforester read the input from the file
       filename. Result files are then written to files prefixed by filename.

   Output
       Alignments in ASCII format are written to stdout. Option	-2d  generates
       postscript drawings of structure	alignments.

OPTIONS
       -d     Calculate distance instead of similarity. In contrast to
              similarity, scoring contributions are minimized. The scoring
              parameters must not be negative, and equal structures achieve a
              distance of zero. This option cannot be used in conjunction
              with multiple alignment, where relative similarity is computed.

       -r     Calculate relative score, defined by
              sr(a,b) = 2*s(a,b) / (s(a,a) + s(b,b)). Relative scores are
              bounded above by 1, which is the score for equal structures.
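
              The definition can be checked with a few lines of Python. The
              function name and the numbers are illustrative assumptions;
              s_ab, s_aa and s_bb stand for s(a,b), s(a,a) and s(b,b).

```python
def relative_score(s_ab: float, s_aa: float, s_bb: float) -> float:
    """Relative score sr(a,b) = 2*s(a,b) / (s(a,a) + s(b,b))."""
    denominator = s_aa + s_bb
    if denominator == 0:
        raise ValueError("s(a,a) + s(b,b) must not be zero")
    return 2 * s_ab / denominator


# Hypothetical similarity scores, for illustration only.
print(relative_score(30, 30, 30))  # a structure compared with an equal one
print(relative_score(15, 30, 30))
```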

       -l     Calculate locally similar structures. The term local refers to
              subwords of the input sequences and structures. If option -so
              is used, suboptimal solutions are also calculated. These are
              not suboptimal alignments of the same local structures, but
              alignments of different substructures that do not include each
              other.

       -so=int
	      Calculates  suboptimal local alignments within int% of the opti-
	      mum. This	option requires	option -l.

       -s     Calculates small-in-large	similarity, i.e. the best alignment of
	      the first	structure against  all	substructures  of  the	second
	      structure	is computed.

       -m, -mc=double, -mt=double, -cmin=double
              Multiple alignment mode. Multiple alignments of structures are
              calculated in a progressive fashion. First, an all-against-all
              comparison of structures is performed (relative scores);
              afterwards, structural alignments are joined along a guide tree
              that is constructed dynamically. If the best score that a
              single structure or structure alignment can achieve by aligning
              to all others is below the cutoff value -mc, it is not joined
              but is put into the result list as it is. Thus, a multiple
              structure alignment can produce a list of alignments. The main
              purpose of parameter -mc is to identify alternative and wrong
              structures produced by structure predictions. The default value
              for -mc is zero, as this separates similar from dissimilar in a
              similarity scoring model.

              In each step of the multiple alignment calculation, the best
              scoring pair is joined and then the guide tree is adjusted. To
              speed up computation, parameter -mt defines a threshold: if it
              is exceeded, multiple pairs are joined before the guide tree is
              adjusted.
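
              The join-or-stop behaviour governed by the cutoff can be
              pictured with the greedy sketch below. It is a simplified
              stand-in, not RNAforester's algorithm: the names
              progressive_join, toy_score and toy_join are hypothetical, the
              score between two clusters is assumed to be the average of
              made-up pairwise relative scores, joining is plain tuple
              concatenation instead of a structure alignment, and the -mt
              speed-up is left out.

```python
from itertools import combinations


def progressive_join(items, score, join, cutoff=0.0):
    """Repeatedly join the best scoring pair until no pair reaches cutoff.

    Everything still unjoined at that point is returned as its own entry,
    so the result is a list of clusters.
    """
    active = list(items)
    while len(active) > 1:
        best_score, i, j = max(
            ((score(active[a], active[b]), a, b)
             for a, b in combinations(range(len(active)), 2)),
            key=lambda t: t[0],
        )
        if best_score < cutoff:
            break  # no remaining pair is similar enough to be joined
        merged = join(active[i], active[j])
        active = [x for k, x in enumerate(active) if k not in (i, j)]
        active.append(merged)
    return active


# Made-up relative scores between three structures, for illustration only.
PAIR_SCORES = {
    frozenset(("a", "b")): 0.8,
    frozenset(("a", "c")): -0.2,
    frozenset(("b", "c")): -0.1,
}


def toy_score(x, y):
    """Assumed cluster score: average over all cross-cluster pairs."""
    values = [PAIR_SCORES[frozenset((p, q))] for p in x for q in y]
    return sum(values) / len(values)


def toy_join(x, y):
    """Stand-in for aligning two clusters: concatenate their members."""
    return x + y


clusters = progressive_join([("a",), ("b",), ("c",)], toy_score, toy_join)
print(clusters)
```

              With the default cutoff of zero, structures a and b end up in
              one cluster while c stays on its own, which mirrors how a
              dissimilar structure remains a separate entry in the result
              list.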

              Besides the sequence and structure alignment, a consensus
              sequence and structure are computed. The minimum frequency for
              a basepair to appear in the consensus structure is controlled
              by parameter -cmin.

              Part of the console output could look like this:

				  * *  ****
				  * *  ****
				 ** *  ****
				 ** *  ****		     *
				 ** *  ****  ********	  ****
				 ** *  ****  ********	  ****
				 ** *  ****  ********	  ****
		**************** ** * ****************	  ******
		**************** ** ****************************
		**************** ** ****************************
		ggggcuauagcucagcugggggagcuauagcucagcugggagcgggga
		.((((....))))....((.(.(((((..((((........))))...
		************************************************
		**************** ** ****************************
		**************** ** ** *************************
		**************** ** *  ***************	 *******
				 ** *  ****  ********	 *****
				 ** *  ****  ********	 *****
				 ** *  ****   *******	 *** *
				 ** *  ****		     *
				  * *  ****
				  * *  ****

              The number of * above the primary sequence shows the frequency
              of the base. Each * stands for 10% frequency. Accordingly, the
              number of * below the secondary structure shows the frequency
              of the occurrence of a paired or unpaired base.
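
              Reading such a plot amounts to counting the * characters in each
              column. The sketch below does this for a small made-up block of
              rows; the function name is hypothetical, and it assumes that
              tabs in the output have already been expanded to spaces so that
              columns line up.

```python
def column_frequencies(star_rows):
    """Return the frequency in percent for each column of a star plot.

    Each '*' in a column stands for 10% frequency.
    """
    width = max((len(row) for row in star_rows), default=0)
    return [
        10 * sum(1 for row in star_rows if col < len(row) and row[col] == "*")
        for col in range(width)
    ]


# Made-up rows, for illustration only.
rows = [
    "  * ",
    " ** ",
    "*** ",
]
print(column_frequencies(rows))
```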

              The guide tree is written to a file "cluster.dot" in dot format.
              If a filename was specified by parameter -f, the file is named
              "filename_cluster.dot". Refer to
              http://www.research.att.com/sw/tools/graphviz for more details
              about the dot format and tools.

       -p, -pmin=double
              Structures (in fact, a consensus of compatible structures) are
              predicted from the partition function, which is calculated using
              the Vienna RNA library [3]. Structure lines in the input are
              ignored. -pmin is the frequency that a basepair must exceed to
              be considered for the prediction of structures.
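
              The role of -pmin as a strict lower bound can be pictured as a
              simple filter. The sketch below is an illustration only: the
              function name and the frequencies are made up, and it says
              nothing about how the Vienna RNA library computes them.

```python
def frequent_pairs(pair_frequencies, pmin):
    """Keep only basepairs whose frequency exceeds pmin."""
    return {pair: freq for pair, freq in pair_frequencies.items() if freq > pmin}


# Made-up basepair frequencies keyed by (i, j) positions, for illustration only.
frequencies = {(1, 23): 0.95, (2, 22): 0.50, (5, 13): 0.10}
print(frequent_pairs(frequencies, pmin=0.5))
```

              A pair whose frequency is exactly equal to the threshold is
              dropped here, following the wording "must be exceeded".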

       -pm=int, -pd=int, -bm=int, -br=int, -bd=int
              Scoring parameters. Refer to section DESCRIPTION.

       --RIBOSUM
              Uses the RIBOSUM85-60 base and basepair substitution matrix as
              proposed in [4]. Requires the pairwise alignment model.

       -2d    RNAforester provides different types of visualizations for
              pairwise and multiple alignment.

              pairwise alignment: Since bases paired in a structure S1 can be
              aligned to bases unpaired in a structure S2, the presentation of
              a common secondary structure leaves some choice. For an
              alignment of those structures, an RNA secondary structure
              "S2-at-S1" is drawn that highlights the differences as
              deviations of S2 from S1, or vice versa, "S1-at-S2". Both are
              alternative visualizations of the same alignment. Bases printed
              in black show structure elements that occur in both structures
              with the same sequence. Sequence variations are displayed in red
              letters. Bases or base pairs that can only be found in S1 are
              printed in blue, while bases that only occur in S2 are printed
              in green.

              The drawings are written to files "x_n.ps" and "y_n.ps", where n
              is the number of the alignment. n enumerates the suboptimal
              solutions if option -so is used. The regions of local similarity
              are highlighted in the original structures in the drawings
              "x_str.ps" and "y_str.ps".

              multiple alignment: Each cluster of the result list of a
              multiple alignment is visualized in two alternative drawings,
              written to the files "filename_cons_n.ps" and "filename_n.ps" if
              option -f is used. In both plots, the consensus structure is
              shown. The lighter a basepair bond is drawn, the less frequently
              it occurs in the structures. Bases or basepair bonds that have a
              frequency of one hundred percent are drawn in red. In
              "filename_cons_n.ps", the most frequent base at each residue is
              printed, with the base frequency indicated by grey-scale. In
              "filename_n.ps", the frequencies of the bases a,c,g,u are
              proportional to the radii of circles that are arranged clockwise
              on the corners of a square, starting at the upper left corner.
              Additionally, these circles are colored red, green, blue and
              magenta for the bases a,c,g,u, respectively. The frequency of a
              gap is shown by a black circle growing at the center of the
              square.

              Parameters --2d_hidebasenum, --2d_basenuminterval=n, --2d_grey
              and --2d_scale=double affect the drawings of alignments and
              consensus structures as implied by their names.

       --score
              Only the optimal score of an alignment is printed. This option
              is useful when RNAforester is called by another program that
              only needs a similarity or distance value.
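
              A calling program could look roughly like the Python sketch
              below. It uses only options listed in this page (--score and
              -d) and feeds the structures on stdin. The helper names are
              hypothetical, an executable named RNAforester is assumed to be
              on the PATH, and because this page does not specify the exact
              form of the printed score, the output is returned unparsed.

```python
import subprocess


def format_input(structures):
    """Build input text from (name, sequence, structure) tuples.

    The sequence may be None, since the sequence line is optional.
    """
    lines = []
    for name, sequence, structure in structures:
        lines.append("> " + name)
        if sequence is not None:
            lines.append(sequence)
        lines.append(structure)
    return "\n".join(lines) + "\n"


def score_command(distance=False):
    """Command line for a score-only run; -d switches to distance."""
    command = ["RNAforester", "--score"]
    if distance:
        command.append("-d")
    return command


def run_score(structures, distance=False):
    """Run RNAforester and return whatever it prints on stdout."""
    completed = subprocess.run(
        score_command(distance),
        input=format_input(structures),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout
```

              A call such as run_score([("x", None, "((..))"),
              ("y", None, "(....)")]) would then hand two structures to the
              program and return its raw output for the caller to interpret.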

       --fasta
              Alignments are printed in Fasta format.

REFERENCES
       [1] Jiang T, Wang J T L and Zhang K, (1995) Alignment of	Trees -	An Al-
       ternative to Tree Edit, Theoretical Computer Science 143(1), 137-148

       [2] Hoechsmann M, Toeller T, Giegerich R	and Kurtz S, (2003) Local Sim-
       ilarity	of  RNA	Secondary Structures, Proc. of the IEEE	Bioinformatics
       Conference (CSB 2003), 159-168

       [3] Hofacker I L, Fontana W, Stadler P F, Bonhoeffer L S, Tacker M and
       Schuster P, (1994) Fast Folding and Comparison of RNA Secondary
       Structures, Monatsh. Chem. 125, 167-188

       [4] Klein R J and Eddy S R, (2003) RSEARCH: Finding Homologs of Single
       Structured RNA Sequences, BMC Bioinformatics 4(1), 44

VERSION
       This man	page documents version 1.4 of RNAforester.

AUTHORS
       Matthias	Hoechsmann

BUGS
       I hope you won't find any. Comments should be sent to
       mhoechsm@techfak.uni-bielefeld.de

				 November 2017		    RNAForester(2.0.1)
