RDF2(1)			    General Commands Manual		       RDF2(1)

NAME
       prdf - test a protein sequence similarity for significance

SYNOPSIS
       prdf [-f # -g # -h -k # -O filename -s SMATRIX -w window-size]
            sequence-file-1 sequence-file-2 [ktup] [#-of-shuffles]

       prdf [-fghks] - interactive mode

DESCRIPTION
       prdf evaluates the significance of a protein sequence similarity
       score.  It compares two sequences and calculates initial and
       optimized similarity scores, then repeatedly shuffles the second
       sequence and recalculates the initial and optimized scores for each
       shuffle.  Extreme value distributions are then fit to each of the
       three distributions of scores.  The characteristic parameters of the
       extreme value distribution are then used to estimate the probability
       that each of the unshuffled sequence scores would be obtained by
       chance in one sequence, or in a number of sequences equal to the
       number of shuffles.  This program is derived from rdf2, which was
       described by Pearson and Lipman, PNAS (1988) 85:2444-2448, and
       Pearson (Meth. Enz. 183:63-98).  Use of the extreme value
       distribution for estimating the probabilities of similarity scores
       was described by Karlin and Altschul, PNAS (1990) 87:2264-2268.
       The 'z-values' calculated by rdf2 are not as informative as the
       P-values and expectations calculated by prdf.
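
       The estimate described above can be illustrated with a minimal
       sketch.  It is not the code used by prdf: it assumes a Gumbel (type
       I) extreme value distribution fitted by the method of moments, and
       it assumes that the expectation is the single-comparison probability
       multiplied by the number of sequences.  The function names are
       hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329


def fit_gumbel(scores):
    """Fit a Gumbel distribution to shuffled scores by the method of moments.

    Assumption: this stands in for the program's own curve fitting.
    Returns (location, scale).
    """
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # needs at least two scores
    if sd == 0:
        raise ValueError("shuffled scores must not all be identical")
    scale = sd * math.sqrt(6.0) / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale


def tail_probability(score, location, scale):
    """P(S >= score) for a single comparison under the fitted distribution."""
    z = (score - location) / scale
    # 1 - exp(-exp(-z)), written with expm1 to stay accurate for small tails.
    return -math.expm1(-math.exp(-z))


def expectation(score, location, scale, n_sequences):
    """Expected count of scores >= score in n_sequences comparisons (assumed N * P)."""
    return n_sequences * tail_probability(score, location, scale)
```

       With the scores from the shuffled comparisons in a list, fit_gumbel()
       gives the two parameters, and tail_probability() and expectation()
       are then evaluated at the unshuffled score.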

       prdf also allows a more sophisticated shuffling method: residues can
       be shuffled within a local window, so that the order of residues
       1-10, 11-20, etc., is destroyed, but a residue in the first 10 is
       never swapped with a residue outside the first 10, and so on for
       each local window.
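
       The local shuffle can be sketched as follows.  This is an
       illustration of the idea only, not the routine in prdf; the function
       name window_shuffle and its rng parameter are hypothetical.

```python
import random


def window_shuffle(sequence, window, rng=None):
    """Shuffle residues only within consecutive windows of the given size.

    Residues never move between windows; the last window may be shorter.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    if rng is None:
        rng = random.Random()
    shuffled = []
    for start in range(0, len(sequence), window):
        block = list(sequence[start:start + window])
        rng.shuffle(block)  # in-place shuffle of this window only
        shuffled.extend(block)
    return "".join(shuffled)
```

       With a window of 10, the composition of residues 1-10, 11-20, and so
       on is preserved in the shuffled sequence, while the order inside
       each window is randomized.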

EXAMPLES
       (1)    prdf -w 10 musplfm.aa lcbo.aa 1 250

       Compare the amino acid sequence in the file musplfm.aa with that in
       lcbo.aa, then shuffle lcbo.aa 250 times using a local shuffle with a
       window of 10 and calculate initial and optimized similarity scores
       using ktup = 1.  Report the significance of the unshuffled
       musplfm/lcbo comparison scores with respect to the shuffled scores.

       (2)    prdf musplfm.aa lcbo.aa 2

       Compare the amino acid sequence in the file musplfm.aa with the
       sequence in the file lcbo.aa using ktup = 2.

       (3)    prdf

       Run prdf in interactive mode.  The program will prompt for the file
       names of the two query sequence files, the ktup, and the number of
       shuffles to be used.  100 shuffles are calculated by default; 250 to
       500 shuffles should provide more accurate probability estimates.

OPTIONS
       prdf can be directed to change the scoring matrix, gap penalties,
       and shuffle parameters by entering options on the command line
       (preceded by a `-').  All of the options should precede the file
       names and the number of shuffles.

       -f #   Penalty for the first residue in a gap (-12 by default).

       -g #   Penalty for additional residues in a gap (-2 by default).

       -h     Do not display histogram of similarity scores.

       -k #   (GAPCUT) Sets the	threshold for joining the initial regions  for
	      calculating the initn score.

       -Q -q  "Quiet": do not prompt for a filename.

       -O filename
              Send a copy of the results to "filename".

       -w window-size
              Shuffle residues within local windows of the given size (see
              DESCRIPTION).

       -s str (SMATRIX) The filename of an alternative scoring matrix file.
              For protein sequences, BLOSUM50 is used by default; PAM250 can
              be used with the command line option -s 250 (or with -s
              pam250.mat).
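
       For scripted use, the ordering rule above (options first, then the
       two file names, then ktup and the number of shuffles) can be
       captured in a small helper.  This is a sketch only: the function
       build_prdf_argv and its parameter names are hypothetical, and the
       values are passed through exactly as given, since this page does not
       say how prdf parses them.

```python
def build_prdf_argv(seq1, seq2, ktup=None, shuffles=None, *,
                    first_gap=None, extend_gap=None, no_histogram=False,
                    gapcut=None, outfile=None, matrix=None, window=None):
    """Build an argument list for prdf with all options before the file names."""
    argv = ["prdf"]
    if first_gap is not None:
        argv += ["-f", str(first_gap)]   # penalty for first residue in a gap
    if extend_gap is not None:
        argv += ["-g", str(extend_gap)]  # penalty for additional residues
    if no_histogram:
        argv.append("-h")                # do not display the histogram
    if gapcut is not None:
        argv += ["-k", str(gapcut)]      # threshold for joining initial regions
    if outfile is not None:
        argv += ["-O", str(outfile)]     # copy of results
    if matrix is not None:
        argv += ["-s", str(matrix)]      # alternative scoring matrix
    if window is not None:
        argv += ["-w", str(window)]      # local shuffle window size
    argv += [str(seq1), str(seq2)]
    if ktup is not None:
        argv.append(str(ktup))
        if shuffles is not None:
            argv.append(str(shuffles))
    elif shuffles is not None:
        # The shuffle count is positional and follows ktup.
        raise ValueError("shuffles requires ktup to be given")
    return argv
```

       The list built for example (1) is build_prdf_argv("musplfm.aa",
       "lcbo.aa", 1, 250, window=10); it could then be handed to
       subprocess.run() on a system where prdf is installed.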

SEE ALSO
       fasta(1), lfasta(1), prss(1), protcodes(5)

AUTHOR
       Bill Pearson
       wrp@virginia.EDU

       The curve fitting routines in rweibull.c	were provided by  Phil	Green,
       Washington U., St. Louis.

				     local			       RDF2(1)
