Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages

  
 
  

home | help
pgreplay(1)		     PostgreSQL	Utilities		   pgreplay(1)

NAME
       pgreplay	- PostgreSQL log file replayer for performance tests

SYNOPSIS
       pgreplay	[parse options]	[replay	options] [-d level] [infile]
       pgreplay	-f [parse options] [-o outfile]	[-d level] [infile]
       pgreplay	-r [replay options] [-d	level] [infile]

DESCRIPTION
       pgreplay	reads a	PostgreSQL log file (not a WAL file), extracts the SQL
       statements  and	executes  them	in  the	 same  order and relative time
       against a PostgreSQL database cluster.  A final report gives you	a use-
       ful statistical analysis	of your	workload and its execution.

       In the first form, the log file infile is replayed at the  time	it  is
       read.

       With the	-f option, pgreplay will not execute the statements, but write
       them  to	 a  `replay  file' outfile that	can be replayed	with the third
       form.

       With the	-r option, pgreplay will execute the statements	in the	replay
       file infile that	was created by the second form.

       If  the	execution of statements	gets behind schedule, warning messages
       are issued that indicate	that the server	cannot handle the  load	 in  a
       timely  fashion.	  The idea is to replay	a real-world database workload
       as exactly as possible.

       To create a log file that can be	parsed by pgreplay, you	 need  to  set
       the following parameters	in postgresql.conf:

	      log_min_messages=error (or more)
	      log_min_error_statement=log (or more)
	      log_connections=on
	      log_disconnections=on
	      log_line_prefix='%m|%u|%d|%c|' (if you don't use CSV logging)
	      log_statement='all'
	      lc_messages must be set to English (encoding does	not matter)
	      bytea_output=escape  (from  version  9.0 on, only	if you want to
	      replay the log on	8.4 or earlier)

       The database cluster against which you replay the SQL  statements  must
       be  a  clone  of	 the database cluster that generated the logs from the
       time immediately	before the logs	were generated.

       pgreplay	is useful for performance tests, particularly in the following
       situations:

       *   You want to compare the performance of your PostgreSQL  application
	   on different	hardware or different operating	systems.

       *   You	want  to  upgrade your database	and want to make sure that the
	   new database	version	does not suffer	from  performance  regressions
	   that	affect you.

       Moreover, pgreplay can give you some feeling as to how your application
       might  scale  by	allowing you to	try to replay the workload at a	higher
       speed.  Be warned, though, that 500 users working at  double  speed  is
       not really the same as 1000 users working at normal speed.

OPTIONS
   Parse options:
       -c     Specifies	that the log file is in	'csvlog' format	(highly	recom-
	      mended) and not in 'stderr' format.

       -b timestamp
	      Only  log	 entries  greater  or  equal to	that timestamp will be
	      parsed.  The format is YYYY-MM-DD	HH:MM:SS.FFF like in  the  log
	      file.  An	optional time zone part	will be	ignored.

       -e timestamp
	      Only log entries less or equal to	that timestamp will be parsed.
	      The  format is YYYY-MM-DD	HH:MM:SS.FFF like in the log file.  An
	      optional time zone part will be ignored.

       -q     Specifies	that a backslash in a simple string literal  will  es-
	      cape  the	following single quote.	 This depends on configuration
	      options like standard_conforming_strings and is the default  for
	      server version 9.0 and less.

       -D database
	      Only  log	 entries  related  to  the  specified database will be
	      parsed (this option can be specified  multiple  times  for  more
	      than one database).

       -U username
	      Only  log	 entries  related  to  the  specified username will be
	      parsed (this option can be specified  multiple  times  for  more
	      than one user).

   Replay options:
       -h hostname
	      Host  name  where	the target database cluster is running (or di-
	      rectory where the	UNIX socket can	be found). Defaults  to	 local
	      connections.
	      This works just like the -h option of psql.

       -p port
	      TCP port where the target	database cluster can be	reached.

       -W password
	      By default, pgreplay assumes that	the target database cluster is
	      configured  for trust authentication. With the -W	option you can
	      specify a	password that will be used for all users in the	 clus-
	      ter.

       -s factor
	      Speed  factor  for  replay,  by default 1. This can be any valid
	      positive floating	point number. A	factor less than 1 will	replay
	      the workload in `slow motion', while a  factor  greater  than  1
	      means `fast forward'.

       -E encoding
	      Specifies	 the  encoding	of the log file, which will be used as
	      client encoding during replay. If	it is  omitted,	 your  default
	      client encoding will be used.

       -j     If  all connections are idle, jump ahead to the next request in-
	      stead of sleeping. This will speed up replay.  Execution	delays
	      will still be reported correctly,	but replay statistics will not
	      contain the idle time.

       -X options
	      Extra  connection	options	for replay connections.	 These must be
	      libpq connection options specified in the	 format	 `option=value
	      [...]'.

   Output options:
       -o outfile
	      specifies	 the  replay file where	the statements will be written
	      for later	replay.

   Debug options:
       -d level
	      Specifies	the trace level	(between 1 and 3).  Increasing	levels
	      will  produce  more  detailed information	about what pgreplay is
	      doing.

       -v     Prints the program version and exits.

ENVIRONMENT
       PGHOST Specifies	the default value for the -h option.

       PGPORT Specifies	the default value for the -p option.

       PGCLIENTENCODING
	      Specifies	the default value for the -E option.

LIMITATIONS
       pgreplay	can only replay	what is	logged by PostgreSQL.  This  leads  to
       some limitations:

       *   COPY	statements will	not be replayed, because the copy data are not
	   logged.

       *   Fast-path  API  function  calls  are	not logged and will not	be re-
	   played.  Unfortunately, this	includes the Large Object API.

       *   Since the log file is always	in the server encoding (which you  can
	   specify  with  the  -E switch of pgreplay), all SET client_encoding
	   statements will be ignored.

       *   Since the preparation time of prepared  statements  is  not	logged
	   (unless  log_min_messages is	debug2 or more), these statements will
	   be prepared immediately before they are first executed  during  re-
	   play.

       *   Because  the	 log file contains only	text, query parameters and re-
	   turn	values will always be in text and never	in binary  format.  If
	   you	use  binary mode to, say, transfer large binary	data, pgreplay
	   can cause significantly more	network	traffic	than the original run.

       *   Sometimes, if a connection takes longer to complete,	the session ID
	   unexpectedly	changes	in the PostgreSQL log file. This causes	 pgre-
	   play	 to  treat  the	session	as two different ones, resulting in an
	   additional connection. This is arguably a bug in PostgreSQL.

AUTHOR
       Written by Laurenz Albe <laurenz.albe@wien.gv.at>.

Jun 2011							   pgreplay(1)

Want to link to this manual page? Use this URL:
<https://man.freebsd.org/cgi/man.cgi?query=pgreplay&sektion=1&manpath=FreeBSD+Ports+15.0>

home | help