RSS(1)			    General Commands Manual			RSS(1)

NAME
       prss - test a protein sequence similarity for significance

SYNOPSIS
       prss [-Q -f # -g # -h -O file -s SMATRIX -w #] sequence-file-1
       sequence-file-2 [#-of-shuffles]

       prss [-fghsw] - interactive mode

DESCRIPTION
       prss evaluates the significance of a protein sequence similarity
       score. It first compares the two sequences and calculates optimal
       similarity scores, then repeatedly shuffles the second sequence and
       recalculates optimal similarity scores using the Smith-Waterman
       algorithm. An extreme value distribution is then fit to the
       shuffled-sequence scores. The characteristic parameters of that
       distribution are used to estimate the probability that each of the
       unshuffled sequence scores would be obtained by chance in one
       sequence, or in a number of sequences equal to the number of
       shuffles. This program is derived from rdf2, which was described by
       Pearson and Lipman, PNAS (1988) 85:2444-2448, and Pearson (Meth.
       Enz. 183:63-98). Use of the extreme value distribution for
       estimating the probabilities of similarity scores was described by
       Karlin and Altschul, PNAS (1990) 87:2264-2268. The 'z-values'
       calculated by rdf2 are not as informative as the P-values and
       expectations calculated by prdf. prss calculates optimal scores
       using the same rigorous Smith-Waterman algorithm (Smith and
       Waterman, J. Mol. Biol. (1981) 147:195-197) used by the ssearch
       program.
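
       The shuffle-and-fit procedure described above can be sketched in
       Python. This is an illustration only, not the prss source: the
       function names, the toy match/mismatch scoring, and the linear gap
       penalty are assumptions made to keep the example short (prss scores
       with a matrix such as BLOSUM50), and the extreme value parameters
       are estimated here by the method of moments rather than by the
       curve fitting routines prss uses.

```python
import math
import random
import statistics


def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman, toy scoring, linear gaps)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best


def shuffle_significance(seq1, seq2, shuffles=100, seed=0):
    """Return (observed score, P in one sequence, P in `shuffles` sequences)."""
    rng = random.Random(seed)
    observed = sw_score(seq1, seq2)

    # Score seq1 against repeatedly shuffled copies of seq2.
    residues = list(seq2)
    scores = []
    for _ in range(shuffles):
        rng.shuffle(residues)
        scores.append(sw_score(seq1, residues))

    # Method-of-moments fit of a Gumbel (extreme value) distribution.
    # Assumption: a stand-in for the fitting done by prss itself.
    beta = max(statistics.pstdev(scores) * math.sqrt(6) / math.pi, 1e-9)
    mu = statistics.mean(scores) - 0.5772156649 * beta

    # P(score >= observed) for one random sequence.
    z = min(-(observed - mu) / beta, 700.0)  # clamp to avoid overflow in exp
    p_one = -math.expm1(-math.exp(z))

    # P(at least one such score among `shuffles` random sequences).
    if p_one >= 1.0:
        p_any = 1.0
    else:
        p_any = -math.expm1(shuffles * math.log1p(-p_one))
    return observed, p_one, p_any
```

       A call such as shuffle_significance(seq1, seq2, shuffles=250)
       returns the unshuffled score together with the two probability
       estimates; more shuffles give the fit more data points at the cost
       of more alignments.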

       prss also allows a more sophisticated shuffling method: residues can
       be shuffled within a local window, so that the order of residues
       1-10, 11-20, etc., is destroyed but a residue in the first 10 is
       never swapped with a residue outside the first 10, and so on for
       each local window.
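
       The local-window shuffle can be sketched as follows. The function
       name and the handling of a final, shorter window are assumptions
       for illustration; the text above only specifies that residues never
       leave their own window.

```python
import random


def window_shuffle(seq, window=10, rng=None):
    """Shuffle residues within consecutive windows of `window` residues."""
    rng = rng or random.Random()
    out = []
    for start in range(0, len(seq), window):
        block = list(seq[start:start + window])
        rng.shuffle(block)  # order inside the window is destroyed
        out.extend(block)   # but residues never cross a window boundary
    return "".join(out)
```

       Because each window keeps its own residues, local composition is
       preserved, which a whole-sequence shuffle would average away.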

EXAMPLES
       (1)    prss  -w 10 musplfm.aa lcbo.aa

       Compare	the  amino  acid  sequence in the file musplfm.aa with that in
       lcbo.aa,	then shuffle lcbo.aa 100 times using a local  shuffle  with  a
       window  of  10.	Report the significance	of the unshuffled musplfm/lcbo
       comparison scores with respect to the shuffled scores.

       (2)    prss musplfm.aa lcbo.aa

       Compare the amino acid sequence in the file  musplfm.aa	with  the  se-
       quences in the file lcbo.aa.

       (3)    prss

       Run prss in interactive mode. The program will prompt for the file
       names of the two query sequence files and the number of shuffles to
       be used. 100 shuffles are calculated by default; 250-500 shuffles
       should provide more accurate probability estimates.
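
       When prss is driven from a script, the non-interactive invocations
       above can be assembled as an argument list. The helper below is
       hypothetical (its name and parameters are not part of prss); it
       only uses the -w option and the positional arguments shown in the
       synopsis, and places options before the file names.

```python
def prss_command(seq_file_1, seq_file_2, shuffles=None, window=None):
    """Build an argv list for prss; options go before the file names."""
    argv = ["prss"]
    if window is not None:
        argv += ["-w", str(window)]      # local shuffle window, as in example (1)
    argv += [seq_file_1, seq_file_2]
    if shuffles is not None:
        argv.append(str(shuffles))       # optional number of shuffles
    return argv
```

       The resulting list can be passed to subprocess.run() from the
       Python standard library; for instance,
       prss_command("musplfm.aa", "lcbo.aa", window=10) corresponds to
       example (1).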

OPTIONS
       prss can be directed to change the scoring matrix, gap penalties,
       and shuffle parameters by entering options on the command line
       (preceded by a `-'). All of the options should precede the file
       names and the number of shuffles.

       -f #   Penalty for the first residue in a gap (-12 by default).

       -g #   Penalty for additional residues in a gap (-2 by default).

       -h     Do not display histogram of similarity scores.

       -Q -q  "quiet" -	do not prompt for filename.

       -O filename
	      send copy	of results to "filename."

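       -s str (SMATRIX) the filename of an alternative scoring matrix file.
              For protein sequences, BLOSUM50 is used by default; PAM250
              can be used with the command line option -s 250 (or with
              -s pam250.mat).

       -w #   Window size for the local shuffle; residues are shuffled
              only within each window (see DESCRIPTION).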
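
       As a worked illustration of the -f and -g penalties, the total cost
       of a gap can be computed as below. This reads the option text
       literally (one -f penalty for the first residue, one -g penalty for
       each additional residue); the function name is hypothetical and the
       reading is an assumption, not taken from the prss source.

```python
def gap_cost(length, first=-12, extend=-2):
    """Total penalty for a gap of `length` residues (defaults from -f and -g)."""
    if length <= 0:
        return 0
    return first + (length - 1) * extend
```

       Under this reading a single-residue gap costs -12 and a
       three-residue gap costs -12 + 2 * (-2) = -16.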

SEE ALSO
       ssearch(1), prdf(1), fasta(1), lfasta(1), protcodes(5)

AUTHOR
       Bill Pearson
       wrp@virginia.EDU

       The curve fitting routines in rweibull.c	were provided by  Phil	Green,
       Washington U., St. Louis.

				     local				RSS(1)
