Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages

  
 
  

home | help
AD2VCF(1)		    General Commands Manual		     AD2VCF(1)

NAME
       AD2VCF -	Extract	allelic	depth from a SAM stream	and add	to VCF files

SYNOPSIS
       ad2vcf file.vcf minimum-MAPQ < file.sam
       samtools	view [flags] file.{bam|cram} | ad2vcf file.vcf minimum-MAPQ

PURPOSE
       ad2vcf  is  designed  to	augment	VCF files lacking allelic depth	(AD in
       FORMAT) with allelic depth extracted from a SAM	stream	for  the  same
       sample used to generate the VCF.

       It  was	specifically created for a TOPMed project to augment dbGaP BCF
       files lacking AD	info in	preparation for	use with haplohseq, which  re-
       quires either NGS data with AD info or micro-array data.

DESCRIPTION
       ad2vcf scans a single-sample VCF	file one call at a time	and simultane-
       ously  scans the	SAM stream for the same	sample,	extracting the alleles
       from the	read sequence for each call position in	the VCF.

       Output is compressed using xz and saved in file-ad.vcf.xz.

       Both files must be sorted by chromosome and  position.	Otherwise,  it
       would  not  be possible to perform this task in a single	pass.  Since a
       single read in the SAM stream could overlap  multiple  VCF  calls,  SAM
       alignments  are	cached until the next call in the VCF stream is	beyond
       the end of the read. Usually this results in no more than a  few	 thou-
       sand  SAM  alignments  being  held  in memory at	once, but in very rare
       cases, we've seen hundreds of thousands for a particular	sample.

       Processing a single-sample human	VCF and	CRAM file from SRA takes about
       15 to 30	minutes	on a typical processor,	with CRAM decoding being a ma-
       jor bottleneck due to  it's  complex  compression.   For	 this  reason,
       ad2vcf was intentionally	designed as a filter, so that it can utilize a
       second  core  while samtools view saturates the first core decoding the
       CRAM file.

       If is advisable to filter out questionable  alignments  before  feeding
       data  to	ad2vcf.	 For example, samtools can remove unmapped, secondary,
       qcfail, dup, and	supplementary  alignments  using  --excl-flags	0xF0C.
       This  will  greatly  reduce  the	 number	of buffered alignments in rare
       cases, preventing runaway memory	use  to	 buffer	 alignments  that  are
       probably	not useful.

SEE ALSO
       vcf-split, haplohseq

BUGS
       Please  report bugs to the author and send patches in unified diff for-
       mat.  (man diff for more	information)

AUTHOR
       J. Bacon

								     AD2VCF(1)

Want to link to this manual page? Use this URL:
<https://man.freebsd.org/cgi/man.cgi?query=ad2vcf&sektion=1&manpath=FreeBSD+Ports+15.0>

home | help