GENERAND(1)		    General Commands Manual		   GENERAND(1)

NAME
       generand - Generate random genomics data in various formats

SYNOPSIS
       generand fasta sequences sequence-length
       generand fastq sequences sequence-length
       generand sam chromosomes alignments-per-chromosome sequence-length
       generand vcf chromosomes calls-per-chromosome samples

PURPOSE
       generand is a simple program that rapidly generates random genomics
       data streams in common formats: FASTA, FASTQ, SAM, and VCF.

       It can produce very short examples for academic purposes or large
       streams for testing and benchmarking genomics programs.

DESCRIPTION
       generand fast[aq] sequences sequence-length generates a FASTA or FASTQ
       stream containing "sequences" records, each of length
       "sequence-length". Bases are drawn from a uniform distribution, so
       the expected GC content is 50%; long outputs should come very close
       to that value.
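
       The uniform base distribution can be illustrated with a short Python
       sketch. This is not generand's implementation; the function names
       and the ">seq<N>" header format are assumptions made only for the
       example.

```python
import random

BASES = "ACGT"

def random_sequence(length, rng):
    """Return `length` bases, each drawn uniformly from A, C, G, T."""
    return "".join(rng.choice(BASES) for _ in range(length))

def gc_fraction(seq):
    """Fraction of bases in `seq` that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def fasta_records(count, length, rng):
    """Yield `count` FASTA records; the header format is hypothetical."""
    for i in range(1, count + 1):
        yield f">seq{i}\n{random_sequence(length, rng)}"

rng = random.Random(1)  # fixed seed so the example is repeatable
long_seq = random_sequence(100_000, rng)
records = list(fasta_records(3, 20, rng))
```

       For a long sequence such as long_seq, gc_fraction() lands near 0.5.
       For very short sequences it can deviate noticeably, since each base
       is an independent draw.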

       PHRED scores in FASTQ streams are mostly high quality and are generated
       in blocks of equal scores. The last few scores of each read are lower
       and are chosen independently of one another, simulating Illumina
       sequencing, where quality tends to drop near the end of each read.
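
       One way to picture this quality model is the Python sketch below. It
       is an illustration under stated assumptions, not generand's code:
       the block size, tail length, score ranges, and the Phred+33 encoding
       are all chosen for the example, since this page does not specify
       them.

```python
import random

def phred_string(length, rng, block=10, tail=5):
    """Build a quality string of `length` characters.

    Assumed values (not taken from generand): blocks of `block` equal
    scores in the range 30-40, then `tail` independent scores in the
    range 2-20, encoded as Phred+33 characters.
    """
    tail = min(tail, length)
    body = length - tail
    scores = []
    while len(scores) < body:
        q = rng.randint(30, 40)  # one high score shared by the whole block
        scores.extend([q] * min(block, body - len(scores)))
    # Each tail score is drawn separately, so neighbours may differ.
    scores.extend(rng.randint(2, 20) for _ in range(tail))
    return "".join(chr(q + 33) for q in scores)

qual = phred_string(50, random.Random(0))
```

       With the defaults above, the first 45 characters of qual form runs
       of identical high scores and the last 5 are lower and unrelated to
       each other.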

       generand sam chromosomes alignments-per-chromosome sequence-length
       generates a SAM stream with chromosomes * alignments-per-chromosome
       total alignments. It outputs increasing indexes for QNAME and CHROM,
       randomly increasing POS, random QUAL scores, and random sequences and
       PHRED scores as described for FASTQ above.
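
       The alignment count and the increasing coordinates can be sketched
       in Python as follows. This is a hedged illustration only: the step
       size, the restart of POS on each chromosome, and the strictly
       increasing positions are assumptions, and the remaining SAM fields
       are left out.

```python
import random

def alignment_coordinates(chromosomes, per_chromosome, rng, max_step=1000):
    """Yield (qname_index, chrom_index, pos) for each alignment.

    Assumptions for this sketch: POS restarts on every chromosome and
    grows by a random step of 1..max_step, so it is strictly increasing.
    """
    qname = 0
    for chrom in range(1, chromosomes + 1):
        pos = 0
        for _ in range(per_chromosome):
            qname += 1
            pos += rng.randint(1, max_step)
            yield qname, chrom, pos

rows = list(alignment_coordinates(3, 4, random.Random(0)))
```

       Here rows holds 3 * 4 = 12 entries, matching the
       chromosomes * alignments-per-chromosome total described above.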

       generand vcf chromosomes calls-per-chromosome samples generates a VCF
       stream with chromosomes * calls-per-chromosome calls. It outputs
       chromosomes with increasing indexes, randomly increasing POS, uniformly
       random REF and ALT, uniformly random QUAL scores, and random sample
       columns including GT (genotype), AD (allelic depth), and DP (depth).
       In the AD data, the REF count is always >= the ALT count, and
       DP = REF count + ALT count.
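
       The AD and DP constraints can be made concrete with a small Python
       sketch. It is not generand's implementation: the maximum depth, the
       genotype choices, and the GT:AD:DP field order are assumptions for
       illustration.

```python
import random

def sample_column(rng, max_depth=60):
    """Return one sample column as "GT:AD:DP".

    Assumed for this sketch: the REF count is uniform in 0..max_depth and
    the ALT count is uniform in 0..REF count, which guarantees REF >= ALT.
    The genotype is picked independently of the counts.
    """
    ref = rng.randint(0, max_depth)
    alt = rng.randint(0, ref)
    depth = ref + alt  # DP = REF count + ALT count
    genotype = rng.choice(["0/0", "0/1", "1/1"])
    return f"{genotype}:{ref},{alt}:{depth}"

def parse_sample(column):
    """Split a "GT:AD:DP" column into (gt, ref_count, alt_count, dp)."""
    gt, ad, dp = column.split(":")
    ref, alt = (int(x) for x in ad.split(","))
    return gt, ref, alt, int(dp)

rng = random.Random(0)
columns = [sample_column(rng) for _ in range(100)]
```

       Parsing each entry of columns with parse_sample() should show that
       both relationships hold for every sample.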

SEE ALSO
       bcftools, fastqc, samtools, vcftools

AUTHOR
       J. Bacon

								   GENERAND(1)
