samtools-collate(1)	     Bioinformatics tools	   samtools-collate(1)

NAME
       samtools	collate	- shuffles and groups reads together by	their names

SYNOPSIS
       samtools	collate	[options] in.sam|in.bam|in.cram	[<prefix>]

DESCRIPTION
       Shuffles	 and  groups reads together by their names.  A faster alterna-
       tive to a full query name sort, collate ensures that reads of the  same
       name  are  grouped  together in contiguous groups, but doesn't make any
       guarantees about	the order of read names	between	groups.

       The output from this command should be suitable for any operation  that
       requires	all reads from the same	template to be grouped together.

       Temporary  files	 are written to	<prefix>, specified either as the last
       argument	or with	the -T option.	If prefix is unspecified then one will
       be derived from the output filename (-o option).	 If no output file was
       given then the TMPDIR environment variable will be used,	and finally if
       that is unset then "/tmp" is used.

       Conversely, if prefix is specified but no output filename has been
       given then the output will be written to <prefix>.<fmt>, where <fmt> is
       appropriate to the file format in use (e.g. "bam" or "cram").
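
       The precedence described above can be sketched as follows.  This is
       an illustration only, not the samtools implementation: the function
       names are hypothetical, and because the text does not say how a prefix
       is derived from the output filename, the sketch simply reports the
       output filename as the basis.

```python
import os


def resolve_temp_prefix(prefix=None, output=None, env=None):
    """Return (source, value) for the temporary-file prefix.

    Hypothetical helper that mirrors only the documented precedence:
    explicit prefix, then the output filename, then TMPDIR, then "/tmp".
    """
    env = os.environ if env is None else env
    if prefix is not None:
        return ("prefix", prefix)
    if output is not None:
        # The exact derivation from the output name is not documented here,
        # so the output filename itself is returned as the basis.
        return ("output", output)
    if "TMPDIR" in env:
        return ("TMPDIR", env["TMPDIR"])
    return ("default", "/tmp")


def default_output_name(prefix, fmt):
    """Output name when a prefix is given but no -o: <prefix>.<fmt>."""
    return f"{prefix}.{fmt}"
```

       For instance, resolve_temp_prefix(None, None, {}) falls through to
       ("default", "/tmp").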

       Using -f	for fast mode will output only primary	alignments  that  have
       either  the  READ1 or READ2 flags set (but not both).  Any other	align-
       ment records will be filtered out.  The collation will only  work  cor-
       rectly  if  there  are no more than two reads for any given QNAME after
       filtering.
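
       The filtering rule can be expressed as a predicate on the SAM FLAG
       field.  The sketch below assumes that "primary" means neither the
       secondary (0x100) nor the supplementary (0x800) bit is set, using the
       flag values from the SAM specification; the function name is
       hypothetical and this is not the samtools source.

```python
# SAM FLAG bits, as defined in the SAM specification.
READ1 = 0x40
READ2 = 0x80
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def kept_in_fast_mode(flag):
    """True if a record with this FLAG would pass the -f filter as described."""
    # Assumption: a primary alignment has neither SECONDARY nor SUPPLEMENTARY set.
    is_primary = not (flag & (SECONDARY | SUPPLEMENTARY))
    # Exactly one of READ1 / READ2 must be set (logical exclusive-or).
    exactly_one_end = bool(flag & READ1) != bool(flag & READ2)
    return is_primary and exactly_one_end
```

       Under this reading, a record with both READ1 and READ2 set, or with
       neither set, is dropped even if it is a primary alignment.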

       Fast mode keeps a buffer	of alignments in memory	so that	it  can	 write
       out  most  pairs	 as  soon as they are found instead of storing them in
       temporary files.	 This allows collate to	avoid some work	and so	finish
       more quickly than in standard mode.  The number of alignments held
       can be changed using -r; storing more alignments uses more memory
       but increases the number of pairs that can be written early.
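
       The trade-off can be illustrated with a toy model of a bounded
       buffer.  This is a sketch under stated assumptions, not how samtools
       is implemented: the oldest-first eviction policy, the function name
       and the (name, record) input shape are all hypothetical, and the
       "spilled" list merely stands in for reads that would have to go
       through temporary files.

```python
from collections import OrderedDict


def pair_with_buffer(reads, capacity):
    """Pair reads by name using a buffer holding at most `capacity` records.

    `reads` is an iterable of (name, record) tuples.  Returns (pairs, spilled):
    pairs found while the first mate was still buffered, and records that
    were evicted or never matched.
    """
    pending = OrderedDict()  # name -> first record seen for that name
    pairs, spilled = [], []
    for name, record in reads:
        if name in pending:
            # Mate is still in memory: the pair can be written immediately.
            pairs.append((pending.pop(name), record))
            continue
        if len(pending) >= capacity:
            # Assumed policy: evict the oldest buffered record to make room.
            _, oldest = pending.popitem(last=False)
            spilled.append(oldest)
        pending[name] = record
    spilled.extend(pending.values())
    return pairs, spilled
```

       In this model, a larger capacity lets more mates meet in memory, which
       corresponds to the effect of raising -r.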

       While collate normally randomises the ordering of read pairs, fast mode
       does  not.   Position-dependent biases that would normally be broken up
       can remain in the fast collate output.  It is therefore not a good idea
       to use fast mode	when preparing data for	programs that expect  randomly
       ordered paired reads.  For example, using fast collate instead of the
       standard mode may lead to significantly different results from aligners
       that estimate library insert sizes on batches of reads.

OPTIONS
       -O      Output to stdout.  This option cannot be	used with -o.

       -o FILE Write output to FILE.  This option cannot be used with -O.  If
               unspecified and -O is not set, the temporary file <prefix> is
               used, with the appropriate file-format suffix appended.

       -T PREFIX
	       Use PREFIX for temporary	files.	This is	the same as specifying
	       PREFIX as the last argument on the command line.	  This	option
	       is included for consistency with	samtools sort.

       -u      Write uncompressed BAM output.

       -l INT  Compression level.  [1]

       -n INT  Number of temporary files to use.  [64]

       -f      Fast mode (primary alignments only).

       -r INT  Number of reads to store	in memory (for use with	-f).  [10000]

       --no-PG Do not add a @PG	line to	the header of the output file.

       -@, --threads INT
               Number of input/output compression threads to use in addition
               to the main thread.  [0]
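
       As a usage illustration, the options above can be assembled into an
       argument list from a script.  The helper below is a hypothetical
       sketch: it uses only options listed in this page, enforces the stated
       rule that -O and -o cannot be combined, and does not check that -r is
       accompanied by -f.  The file names in the usage note are placeholders.

```python
def collate_command(input_path, output=None, to_stdout=False, prefix=None,
                    fast=False, reads_in_memory=None, threads=None):
    """Build an argv list for `samtools collate` (hypothetical helper)."""
    if to_stdout and output is not None:
        raise ValueError("-O and -o cannot be used together")
    cmd = ["samtools", "collate"]
    if to_stdout:
        cmd.append("-O")
    if output is not None:
        cmd += ["-o", output]
    if prefix is not None:
        cmd += ["-T", prefix]
    if fast:
        cmd.append("-f")
    if reads_in_memory is not None:
        # Documented as being for use with -f; not enforced in this sketch.
        cmd += ["-r", str(reads_in_memory)]
    if threads is not None:
        cmd += ["-@", str(threads)]
    # Options precede the input file, following the SYNOPSIS.
    cmd.append(input_path)
    return cmd
```

       The resulting list, e.g. collate_command("in.bam", output="out.bam",
       threads=4), could then be passed to subprocess.run(cmd, check=True).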

AUTHOR
       Written by Heng Li from the Sanger Institute  and  extended  by	Andrew
       Whitwham.

SEE ALSO
       samtools(1), samtools-sort(1)

       Samtools	website: <http://www.htslib.org/>

samtools-1.21		       12 September 2024	   samtools-collate(1)
