FreeBSD Manual Pages

LALIGN/PLALIGN(1)	    General Commands Manual	     LALIGN/PLALIGN(1)

NAME
       lalign  - compare two protein or	DNA sequences for local	similarity and
       show the	local sequence alignments

       plalign, flalign - compare two sequences for local similarity and plot
       the local sequence alignments

SYNOPSIS
       lalign [-EKfgiImnNOQqrRswxZ] sequence-file-1 sequence-file-2
       plalign [-EKfgiImnNQqrRsvwxZ] sequence-file-1 sequence-file-2

DESCRIPTION
       The lalign and plalign programs compare two sequences, looking for
       local sequence similarities.  lalign/plalign use code developed by
       X. Huang and W. Miller (Adv. Appl. Math. (1991) 12:337-357) for the
       "sim" program.  (Version 2.1 uses sim2 code.)  While ssearch reports
       only the best alignment between the query sequence and the library
       sequence, lalign and plalign report all the alignments between the two
       sequences with pairwise probabilities < 0.05 (the default, which can be
       changed with -E #).  lalign shows the actual local alignments between
       the two sequences and their scores, while plalign produces a plot of
       the alignments that looks similar to a `dot-matrix' homology plot.  On
       Unix systems, plalign generates PostScript output.  flalign generates
       graphics commands for the GCG "figure" program.

       Probability estimates for the lalign/plalign/flalign programs are based
       on the parameters provided by Altschul and Gish (1996)  Meth.  Enzymol.
       266:460-480.   These  parameters	 are available for BLOSUM50, BLOSUM62,
       and PAM250 scoring matrices with	specific gap penalties,	and  also  for
       DNA  comparison	with  a	gap penalty of -16, -4.	 Probability estimates
       are not available for other scoring matrices and	gap penalties.

       The E(10,000) values reported with the alignments are the pairwise
       alignment probabilities multiplied by 10,000.  These estimates
       approximate the significance from a search of a 10,000 entry database.
       They differ from the -E 0.05 initial threshold by the same factor of
       10,000.  This is an unfortunate inconsistency, but I believe that it is
       helpful to provide the perspective of a database search.
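
       The scaling described above can be sketched in a few lines of Python.
       This is only an illustration of the arithmetic; the function name
       e_value and its database_size parameter are hypothetical and are not
       part of lalign.

```python
def e_value(pairwise_probability, database_size=10_000):
    """Scale a pairwise-alignment probability to an E(database_size) value.

    Illustrative helper only: lalign performs this scaling internally.
    """
    return pairwise_probability * database_size


# The default -E 0.05 pairwise limit corresponds to an E(10,000) of about 500.
default_limit_as_e = e_value(0.05)
```

       Read the other way, an alignment reported with E(10,000) = 1 has a
       pairwise probability of about 0.0001.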

       The lalign/plalign/fasta programs use a standard text format sequence
       file.  Lines beginning with '>' or ';' are considered comments and
       ignored; sequences can be upper or lower case; blanks, tabs, and
       unrecognizable characters are ignored.  lalign/plalign expect sequences
       to use the single letter amino acid codes; see protcodes(5).
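
       A minimal Python sketch of these reading rules follows.  It is not the
       parser used by lalign.  The function name read_sequence is
       hypothetical, and two simplifications are assumed: every ASCII letter
       counts as a recognizable character (the real programs use the code
       tables described in protcodes(5) and dnacodes(5)), and residues are
       folded to upper case.

```python
import string


def read_sequence(lines, alphabet=string.ascii_uppercase):
    """Collect residues from the lines of a text-format sequence file.

    Assumption: any ASCII letter is a valid residue code.
    """
    residues = []
    for line in lines:
        # Lines beginning with '>' or ';' are comments and are skipped.
        if line.startswith((">", ";")):
            continue
        # Case is not significant; blanks, tabs, digits, and any other
        # character outside the alphabet are dropped.
        for ch in line.upper():
            if ch in alphabet:
                residues.append(ch)
    return "".join(residues)


example = [
    ">MCHU - Calmodulin - Human\n",
    "adql TEEQ\t1IAEF\n",
    "; a comment line\n",
]
sequence = read_sequence(example)
```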

OPTIONS
       lalign and the other programs can be directed to change the scoring
       matrix, search parameters, output format, and default search
       directories by entering options on the command line (preceded by a
       `-').  All of the options should precede the file name and ktup
       arguments.  Alternatively, these options can be changed by setting
       environment variables.  The options and environment variables are:

       -E #   Pairwise-probability limit (default -E 0.05).

       -K #   Maximum number of alignments to be shown (default -K 50).

       -f #   Penalty for the first residue in a gap (-14 by default).

       -g #   Penalty for each additional residue in a gap (-4 by default).

       -i     Compare the reverse complement (DNA only).

       -I     Show alignment between identical sequences.  Normally, the iden-
	      tity alignment is	not shown.

       -m #   (MARKX) =1,2,3.  Alternate display of matches and mismatches in
              alignments.  MARKX=1 uses ":", ".", and " " for identities,
              conservative replacements, and non-conservative replacements,
              respectively.  MARKX=2 uses " ", "x", and "X".  MARKX=3 does not
              show the second sequence, but uses the second alignment line to
              display matches with a "." for identity, or with the mismatched
              residue for mismatches.  MARKX=3 is useful for aligning large
              numbers of similar sequences.
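
              The three display modes can be illustrated with a short Python
              sketch.  It is not lalign's output code.  The names match_line,
              toy_pairs, and toy_conservative are hypothetical, and whether a
              replacement is conservative really depends on the scoring
              matrix, so the sketch takes that decision as a caller-supplied
              predicate.  Gap handling is not modelled.

```python
def match_line(seq1, seq2, markx, is_conservative):
    """Build the line shown under seq1 for two equal-length aligned strings.

    markx 1 and 2 return a marker line; markx 3 returns the line that
    replaces the second sequence.
    """
    # (identity, conservative replacement, non-conservative replacement)
    symbols = {1: (":", ".", " "), 2: (" ", "x", "X")}
    out = []
    for a, b in zip(seq1, seq2):
        if markx == 3:
            # "." for an identity, otherwise the mismatched residue itself.
            out.append("." if a == b else b)
            continue
        identity, conservative, other = symbols[markx]
        if a == b:
            out.append(identity)
        elif is_conservative(a, b):
            out.append(conservative)
        else:
            out.append(other)
    return "".join(out)


# Toy predicate for the example only: treat I/L as the sole conservative pair.
toy_pairs = {frozenset("IL")}


def toy_conservative(a, b):
    return frozenset((a, b)) in toy_pairs


markx1 = match_line("AILK", "ALLR", 1, toy_conservative)
markx2 = match_line("AILK", "ALLR", 2, toy_conservative)
markx3 = match_line("AILK", "ALLR", 3, toy_conservative)
```

              With the toy predicate, the three calls give ":.: ", " x X",
              and ".L.R" respectively.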

       -n     Pre-specify a DNA sequence, rather than inferring the type from
              the sequence.

       -N #   Limit the first and second sequences to '#' residues.

       -s str (SMATRIX)	 the  filename	of an alternative scoring matrix file.
	      For protein sequences, BLOSUM50 is used by default;  PAM250  can
	      be  used with the	command	line option -s P250, BLOSUM62 with "-s
	      BL62".

       -v str (LINEVAL)	(plalign only) plalign can use up to 4 different  line
	      styles  to  denote  the  scores of local alignments.  The	scores
	      that correspond to these line styles can be specified  with  the
	      environment  variable  LINEVAL, or with the -v option.  In either
	      case, a string with three	numbers	separated by spaces should  be
	      given.   This  string  must  be  surrounded  by double quotation
	      marks.  For example, LINEVAL="200	100 50"	tells plalign  to  use
	      solid  lines  for	local alignments with scores greater than 200,
	      long dashed lines	for scores between 100 and 200,	 short	dashed
	      lines for	scores between 50 and 100, and dotted lines for	scores
	      less than	50.
		   plalign -v "200 100 50"
	      Normally,	 the  values are 200, 100, and 50 for protein sequence
	      comparisons and 400, 200,	and 100	for DNA	sequence comparisons.
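
              The mapping from score to line style can be sketched in Python
              as below.  This is an illustration, not plalign's plotting
              code.  The function name line_style is hypothetical, and the
              treatment of a score exactly equal to a threshold is an
              assumption, since the description above does not specify it.

```python
def line_style(score, lineval="200 100 50"):
    """Pick a line style for an alignment score from a LINEVAL-style string.

    Assumption: a score equal to a threshold falls into the lower band.
    """
    high, middle, low = (int(value) for value in lineval.split())
    if score > high:
        return "solid"
    if score > middle:
        return "long dashed"
    if score > low:
        return "short dashed"
    return "dotted"


protein_style = line_style(150)                 # protein defaults
dna_style = line_style(150, "400 200 100")      # DNA defaults
```

              A score of 150 is drawn with long dashes under the protein
              defaults but with short dashes under the DNA defaults.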

       -w #   (LINLEN) output line length for sequence alignments.   (normally
	      60, can be set up	to 200).

EXAMPLES
       (1)    lalign mchu.aa mchu.aa

       Compare the amino acid sequence in the file mchu.aa with	itself and re-
       port  the  ten  best  local alignments.	Sequence files should have the
       form:

	    >MCHU - Calmodulin - Human ...
	    ADQLTEEQIAEF ...

       (2)    plalign -K 100 -E	0.01 qrhuld.aa egmsmg.aa

       Display up to 100 local alignments of the LDL receptor (qrhuld.aa) with
       epidermal growth	factor precursor (egmsmg.aa) with pairwise  probabili-
       ties better than	0.01.  Plot the	results	on the screen.

       (3)    lalign

       Run  the	 lalign	 program in interactive	mode.  The program will	prompt
       for the name of two sequence files and  the  number  of	alignments  to
       show.
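
       When the programs are driven from a script, the rule that options
       precede the file names can be kept in one place.  The Python sketch
       below only assembles an argument list for example (2); the function
       name lalign_argv and its parameters are hypothetical, and only the
       -E and -K options documented above are used.

```python
def lalign_argv(file1, file2, program="lalign", e_limit=None,
                max_alignments=None):
    """Assemble a command line with all options before the file names."""
    argv = [program]
    if e_limit is not None:
        argv += ["-E", str(e_limit)]         # pairwise-probability limit
    if max_alignments is not None:
        argv += ["-K", str(max_alignments)]  # maximum alignments shown
    argv += [file1, file2]
    return argv


# Equivalent of example (2); the option order differs but is not significant.
example2 = lalign_argv("qrhuld.aa", "egmsmg.aa", program="plalign",
                       e_limit=0.01, max_alignments=100)
```

       The resulting list can be passed to a process launcher such as
       subprocess.run, assuming the program is installed and on the PATH.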

SEE ALSO
       ssearch(1), prss(1), fasta(1), protcodes(5), dnacodes(5)

AUTHOR
       Bill Pearson
       wrp@virginia.EDU

				     local		     LALIGN/PLALIGN(1)