samtools-fasta(1)	     Bioinformatics tools	     samtools-fasta(1)

NAME
       samtools-fasta,	samtools-fastq - converts a SAM/BAM/CRAM file to FASTA
       or FASTQ

SYNOPSIS
       samtools	fastq [options]	in.bam
       samtools	fasta [options]	in.bam

DESCRIPTION
       Converts a SAM, BAM or CRAM file into either FASTQ or FASTA format,
       depending on the command invoked.  Output files are automatically
       compressed if their names have a .gz, .bgz, or .bgzf extension.

       Note that this command attempts to reverse the alignment process.  If
       the aligner took a single input FASTQ record and produced multiple SAM
       records via supplementary and/or secondary alignments, then converting
       back should produce the original single FASTA / FASTQ record.  By
       default, records for supplementary and secondary alignments are not
       output; see the -F option for more details.

       If the input contains read-pairs	which are to be	interleaved or written
       to  separate  files  in	the same order,	then the input should be first
       collated	by name.  Use samtools collate or samtools sort	-n  to	ensure
       this.

       For  each  different QNAME, the input records are categorised according
       to the state of the READ1 and READ2 flag	bits.	The  three  categories
       used are:

       1 : Only	READ1 is set.

       2 : Only	READ2 is set.

       0 : Either both READ1 and READ2 are set;	or neither is set.

       The  exact  meaning of these categories depends on the sequencing tech-
       nology used.  It	is expected that ordinary single  and  paired-end  se-
       quencing	reads will be in categories 1 and 2 (in	the case of paired-end
       reads,  one  read of the	pair will be in	category 1, the	other in cate-
       gory 2).	 Category 0 is essentially a "catch-all" for reads that	do not
       fit into	a simple paired-end sequencing model.
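
       The categorisation above can be sketched in a few lines of Python.
       This is an illustration only, not the samtools implementation; the
       function name read_category is hypothetical, while the bit values 0x40
       (READ1) and 0x80 (READ2) are those defined for the SAM FLAG field.

```python
READ1 = 0x40  # SAM FLAG bit: first segment in the template
READ2 = 0x80  # SAM FLAG bit: last segment in the template


def read_category(flag):
    """Return 1, 2 or 0 according to the READ1/READ2 bits of a SAM FLAG."""
    is_read1 = bool(flag & READ1)
    is_read2 = bool(flag & READ2)
    if is_read1 and not is_read2:
        return 1
    if is_read2 and not is_read1:
        return 2
    return 0  # both bits set, or neither bit set
```

       For example, a typical paired-end FLAG of 99 falls in category 1 and
       its mate's FLAG of 147 in category 2, while an unpaired read with FLAG
       0 falls in category 0.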

       For each	category only one sequence will	be written for a given	QNAME.
       If  more	 than  one record is available for a given QNAME and category,
       the first in input file order that has quality values will be used.  If
       none of the candidate records has quality values, then the first	in in-
       put file	order will be used instead.
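
       A minimal Python sketch of this selection rule follows.  It assumes
       records are given as (qname, flag, seq, qual) tuples in input order
       and that a missing quality string is represented by "*", as in SAM
       text.  The helper names are hypothetical, and flag filtering is
       ignored for brevity.

```python
def category(flag):
    """Category 1, 2 or 0 from the READ1 (0x40) and READ2 (0x80) bits."""
    r1, r2 = bool(flag & 0x40), bool(flag & 0x80)
    return 1 if r1 and not r2 else 2 if r2 and not r1 else 0


def choose_records(records):
    """Pick one record per (QNAME, category).

    The first record with quality values wins; if none has qualities,
    the first record in input order is kept.
    """
    chosen = {}
    for rec in records:
        qname, flag, seq, qual = rec
        key = (qname, category(flag))
        current = chosen.get(key)
        if current is None:
            chosen[key] = rec  # first record seen for this key
        elif current[3] == "*" and qual != "*":
            chosen[key] = rec  # first record with qualities replaces it
    return chosen
```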

       Sequences will be written to standard output unless one of the -1,  -2,
       -o,  or	-0  options is used, in	which case sequences for that category
       will be written to the specified	file.  The same	filename may be	speci-
       fied with multiple options, in which case the sequences will be	multi-
       plexed in order of occurrence.

       If  a  singleton	file is	specified using	the -s option then only	paired
       sequences will be output	for categories 1 and 2;	 paired	 meaning  that
       for  a  given  QNAME there are sequences	for both category 1 and	2.  If
       there is	a sequence for only one	of categories 1	or 2 then it  will  be
       diverted	 into the specified singletons file.  This can be used to pre-
       pare fastq files	for programs that cannot handle	a  mixture  of	paired
       and singleton reads.

       The  -s	option	only affects category 1	and 2 records.	The output for
       category	0 will be the same irrespective	of the use of this option.
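
       The effect of -s on routing can be sketched as below.  This is an
       illustrative model, not samtools code: the function name and the
       destination labels "read1", "read2", "singleton" and "category0" are
       hypothetical stand-ins for the outputs selected by -1, -2, -s and -0.

```python
def route(qname_categories, use_singleton_file):
    """Decide a destination for each (QNAME, category) sequence.

    qname_categories maps each QNAME to the set of categories (0, 1, 2)
    for which a sequence exists.
    """
    routes = []
    for qname, cats in qname_categories.items():
        is_pair = 1 in cats and 2 in cats
        for cat in sorted(cats):
            if cat == 0:
                dest = "category0"  # never affected by -s
            elif use_singleton_file and not is_pair:
                dest = "singleton"  # lone category 1 or 2 sequence
            else:
                dest = "read%d" % cat
            routes.append((qname, cat, dest))
    return routes
```

       Without a singleton file, a lone category 1 sequence stays in the
       READ1 output; with one, it is diverted.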

       The sequence generated will be for the entire sequence recorded in  the
       SAM  record  (and  quality if appropriate).  This means if it has soft-
       clipped CIGAR records then the soft-clipped data	will be	in the	output
       FASTA/FASTQ.   Hard-clipped data	is, by definition, absent from the SAM
       record and hence	will be	absent in any FASTA/FASTQ produced.
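
       As an illustration of the clipping behaviour, the following Python
       sketch computes how many bases of a record appear in the output from
       its CIGAR string.  It relies on the SAM convention that the M, I, S,
       = and X operations consume query bases whereas H does not; the
       function name is hypothetical.

```python
import re

QUERY_CONSUMING = set("MIS=X")  # CIGAR operations that consume query bases


def cigar_query_length(cigar):
    """Length of the stored sequence implied by a CIGAR string."""
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    return sum(int(n) for n, op in ops if op in QUERY_CONSUMING)
```

       A record with CIGAR 5S90M5H therefore yields 95 bases: the five
       soft-clipped bases are included and the five hard-clipped bases are
       not.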

       The order of precedence of the filter options is -d/-D, -f, -F, --rf
       and -G.
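
       The FLAG-based filters (-f, -F, --rf and -G, described under OPTIONS)
       can be modelled as follows.  This Python sketch is not taken from the
       samtools source.  It assumes that a value of 0 disables the --rf and
       -G tests, it does not handle the -d/-D tag filters or flag names, and
       the function names are hypothetical.

```python
def parse_flag_int(text):
    """Parse INT as hex ('0x' prefix), octal (leading '0') or decimal."""
    if text.lower().startswith("0x"):
        return int(text[2:], 16)
    if text.startswith("0") and len(text) > 1:
        return int(text[1:], 8)
    return int(text, 10)


def keep_record(flag, f=0, F=0x900, rf=0, G=0):
    """Return True if a record with this FLAG passes all four filters."""
    if (flag & f) != f:         # -f: every bit in f must be set
        return False
    if flag & F:                # -F: no bit in F may be set
        return False
    if rf and not (flag & rf):  # --rf: at least one bit in rf must be set
        return False
    if G and (flag & G) == G:   # -G: drop only if every bit in G is set
        return False
    return True
```

       With the defaults, keep_record(0x140) is False because the secondary
       bit 0x100 is part of the default -F value 0x900.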

OPTIONS
       -n      By default, either '/1' or '/2' is added	to  the	 end  of  read
	       names  where  the corresponding READ1 or	READ2 FLAG bit is set.
	       Using -n	causes read names to be	left as	they are.

       -N      Always add either '/1' or '/2' to the end of  read  names  even
	       when put	into different files.

       -O      Use quality values from OQ tags in preference to	standard qual-
	       ity string if available.

       -s FILE Write singleton reads to	FILE.

       -t      Copy  RG,  BC and QT tags to the	FASTQ header line, if they ex-
	       ist.

       -T TAGLIST
	       Specify a comma-separated list of tags to  copy	to  the	 FASTQ
	       header line, if they exist.  TAGLIST can	be blank or * to indi-
	       cate  all  tags should be copied	to the output.	If using *, be
	       careful to quote	it to avoid unwanted shell expansion.

       -1 FILE Write reads with the READ1 FLAG set (and READ2 not set) to FILE
	       instead of writing them to standard output.  If the -s option
	       is used, only paired reads will be written to this file.

       -2 FILE Write reads with the READ2 FLAG set (and READ1 not set) to FILE
	       instead of writing them to standard output.  If the -s option
	       is used, only paired reads will be written to this file.

       -o FILE Write reads with either the READ1 or the READ2 FLAG set to FILE
	       instead of writing them to standard output.  This is equivalent
	       to -1 FILE -2 FILE.

       -0 FILE Write reads where the READ1 and READ2 FLAG bits are either both
	       set or both unset to FILE instead of writing them to standard
	       output.

       -f INT  Only output alignments with all bits set	in INT present in  the
	       FLAG field.  INT	can be specified in hex	by beginning with `0x'
	       (i.e.  /^0x[0-9A-F]+/)  or in octal by beginning	with `0' (i.e.
	       /^0[0-7]+/) [0].

       -F INT, --excl-flags INT, --exclude-flags INT
	       Do not output alignments	with any bits set in  INT  present  in
	       the  FLAG field.	 INT can be specified in hex by	beginning with
	       `0x' (i.e. /^0x[0-9A-F]+/) or in	octal by  beginning  with  `0'
	       (i.e. /^0[0-7]+/) [0x900].  This	defaults to 0x900 representing
	       filtering of secondary and supplementary	alignments.

       --rf INT, --incl-flags INT, --include-flags INT
	       Only  output alignments with any	bits set in INT	present	in the
	       FLAG field.  INT	can be specified in hex	by beginning with `0x'
	       (i.e. /^0x[0-9A-F]+/), in octal by  beginning  with  `0'	 (i.e.
	       /^0[0-7]+/), as a decimal number	not beginning with '0' or as a
	       comma-separated list of flag names [0].

       -G INT  Only  EXCLUDE  reads with all of	the bits set in	INT present in
	       the FLAG	field.	INT can	be specified in	hex by beginning  with
	       `0x'  (i.e.  /^0x[0-9A-F]+/)  or	in octal by beginning with `0'
	       (i.e. /^0[0-7]+/) [0].

       -d TAG[:VAL]
	       Only output alignments containing  an  auxiliary	 tag  matching
	       both  TAG  and  VAL.   If  VAL is omitted then any value	is ac-
	       cepted.	The tag	types supported	are i, f, Z, A and H.  "B" ar-
	       rays are	not supported.	This is	comparable to the method  used
	       in  samtools  view  -d.	 The  option may be specified multiple
	       times and is equivalent to using	the -D option.

       -D TAG:FILE
	       Only output alignments containing an auxiliary tag matching TAG
	       and having a value listed in FILE.  The format of the  file  is
	       one line	per value.  This is equivalent to specifying -d	multi-
	       ple times.

       -i      Add Illumina Casava 1.8 format entry to header (e.g.
	       1:N:0:ATCACG).

       -c [0..9]
	       set compression level when writing gz or	bgzf fastq files.

       --i1 FILE
	       write first index reads to FILE

       --i2 FILE
	       write second index reads	to FILE

       --barcode-tag TAG
	       aux tag to find index reads in [default:	BC]

       --quality-tag TAG
	       aux tag to find index quality in	[default: QT]

       -@, --threads INT
	       Number of input/output compression threads to use  in  addition
	       to main thread [0].

       --index-format STR
	       string  to  describe how	to parse the barcode and quality tags.
	       For example:

	       i14i8   the first 14 characters are index 1, the	next 8 charac-
		       ters are	index 2

	       n8i14   ignore the first	8 characters,  and  use	 the  next  14
		       characters for index 1

	       If the tag contains a separator, then the numeric part can be
	       replaced with '*' to mean 'read until the separator or end of
	       tag', for example:

	       n*i*    ignore  the  left  part of the tag until	the separator,
		       then use	the second part
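
       The --index-format strings above can be interpreted as in the
       following Python sketch.  It is a simplified model and not the
       samtools parser: the function name is hypothetical, and treating any
       non-letter character as the separator is an assumption.

```python
import re


def split_index_tag(fmt, tag):
    """Extract index sequences from a barcode tag value.

    fmt is a string such as 'i14i8', 'n8i14' or 'n*i*': 'i' keeps a
    field as an index, 'n' skips it, and '*' reads up to the separator.
    """
    indexes = []
    pos = 0
    for kind, length in re.findall(r"([in])(\*|\d+)", fmt):
        if length == "*":
            end = pos
            while end < len(tag) and tag[end].isalpha():
                end += 1
            chunk = tag[pos:end]
            # Step over the separator character, if one is present.
            pos = end + 1 if end < len(tag) else end
        else:
            end = pos + int(length)
            chunk = tag[pos:end]
            pos = end
        if kind == "i":
            indexes.append(chunk)
    return indexes
```

       Under these assumptions, the format n*i* applied to the tag value
       ACGT-TTGCA skips ACGT and returns TTGCA as the single index.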

EXAMPLES
       Starting	from a coordinate sorted file, output paired reads to separate
       files, discarding singletons, supplementary and secondary  reads.   The
       resulting files can be used with, for example, the bwa aligner.

	   samtools collate -u -O in_pos.bam | \
	   samtools fastq -1 paired1.fq	-2 paired2.fq -0 /dev/null -s /dev/null	-n

       Starting	 with  a name collated file, output paired and singleton reads
       in a single file, discarding supplementary and secondary	reads.	To get
       all of the reads	in a single file, it is	necessary to redirect the out-
       put of samtools fastq.  The output file is suitable for	use  with  bwa
       mem  -p	which  understands  interleaved	 files containing a mixture of
       paired and singleton reads.

	   samtools fastq -0 /dev/null in_name.bam > all_reads.fq

       Output paired reads in a	single file, discarding	supplementary and sec-
       ondary reads.  Save any singletons in a separate	file.  Append  /1  and
       /2  to  read names.  This format	is suitable for	use by NextGenMap when
       using its -p and	-q options.  With this aligner,	paired reads  must  be
       mapped separately to the	singletons.

	   samtools fastq -0 /dev/null -s single.fq -N in_name.bam > paired.fq

BUGS
       o The way of specifying output files is far too complicated and easy to
	 get wrong.

AUTHOR
       Written	by  Heng Li, with modifications	by Martin Pollard and Jennifer
       Liddle, all from	the Sanger Institute.

SEE ALSO
       samtools(1), samtools-faidx(1), samtools-fqidx(1), samtools-import(1)

       Samtools	website: <http://www.htslib.org/>

samtools-1.21		       12 September 2024	     samtools-fasta(1)
