goaccess(1)			 User Manuals			   goaccess(1)

NAME
       goaccess	- fast web log analyzer	and interactive	viewer.

SYNOPSIS
       goaccess	[filename] [options...]	[-c][-M][-H][-q][-d][...]

DESCRIPTION
       GoAccess is an open source real-time web log analyzer and interactive
       viewer that runs in a terminal on *nix systems or through your
       browser.

       It provides fast	and valuable HTTP statistics for system	administrators
       that require a visual server report on the fly.

       GoAccess	 parses	the specified web log file and outputs the data	to the
       X terminal. Features include:

       General Statistics:
	      This panel gives a summary of several metrics, such as the number
	      of valid and invalid requests, time taken to analyze the dataset,
	      unique visitors, requested files, static files (CSS, ICO, JPG,
	      etc.), HTTP referrers, 404s, size of the parsed log file and
	      bandwidth consumption.

       Unique visitors
	      This panel shows metrics such as hits, unique visitors and cumu-
	      lative bandwidth per date. HTTP requests containing the same IP,
	      the  same	 date, and the same user agent are considered a	unique
	      visitor. By default, it includes web crawlers/spiders.

	      Optionally, date specificity can be set to the hour level	 using
	      --date-spec=hr  which will display dates such as 05/Jun/2016:16,
	      or to the	minute	level  producing  05/Jun/2016:16:59.  This  is
	      great  if	 you  want  to track your daily	traffic	at the hour or
	      minute level.
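
	      The visitor rule above can be sketched in a few lines of Python.
	      This is only an illustration of the stated definition (same IP,
	      same date, same user agent), not GoAccess's own code; the sample
	      requests and the function name are hypothetical.

```python
from datetime import datetime

# Hypothetical, already-parsed requests: (client IP, timestamp, user agent).
requests = [
    ("203.0.113.7", datetime(2016, 6, 5, 16, 59), "Mozilla/5.0"),
    ("203.0.113.7", datetime(2016, 6, 5, 17, 10), "Mozilla/5.0"),
    ("203.0.113.7", datetime(2016, 6, 5, 17, 12), "curl/8.0"),
    ("198.51.100.2", datetime(2016, 6, 6, 9, 0), "Mozilla/5.0"),
]


def unique_visitors_per_date(reqs):
    """Count distinct (IP, user agent) pairs for each calendar date."""
    seen = {}
    for ip, ts, agent in reqs:
        seen.setdefault(ts.strftime("%d/%b/%Y"), set()).add((ip, agent))
    return {date: len(pairs) for date, pairs in seen.items()}


print(unique_visitors_per_date(requests))
# {'05/Jun/2016': 2, '06/Jun/2016': 1}
```

	      The two requests from the same IP and agent on 05/Jun/2016 count
	      once; the third request uses a different agent and counts as a
	      second visitor.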

       Requested files
	      This panel displays the most  requested  (non-static)  files  on
	      your  web	 server.  It shows hits, unique	visitors, and percent-
	      age, along with the cumulative bandwidth,	protocol, and the  re-
	      quest method used.

       Requested static	files
	      Lists the most frequently requested static files, such as JPG,
	      CSS, SWF, JS, GIF, and PNG file types, along with the same
	      metrics as the last panel. Additional static files can be added
	      to the configuration file.

       404 or Not Found
	      Displays the same metrics as the previous request panels.
	      However, its data contains all pages that were not found on the
	      server, commonly known as the 404 status code.

       Hosts  This  panel  has	detailed  information on the hosts themselves.
	      This is great for	spotting aggressive crawlers  and  identifying
	      who's eating your	bandwidth.

	      Expanding	 the panel can display more information	such as	host's
	      reverse DNS lookup result, country of origin and city. If	the -a
	      argument is enabled, a list of user agents can be	 displayed  by
	      selecting	the desired IP address,	and then pressing ENTER.

       Operating Systems
	      This panel will report which operating system the	host used when
	      it hit the server. It attempts to	provide	the most specific ver-
	      sion of each operating system.

       Browsers
	      This  panel  will	report which browser the host used when	it hit
	      the server. It attempts to provide the most specific version  of
	      each browser.

       Visit Times
	      This  panel  will	display	an hourly report. This option displays
	      24 data points, one for each hour	of the day.

	      Optionally, hour specificity can be set to the tenth of an hour
	      level using --hour-spec=min, which will display hours as 16:4.
	      This is great if you want to spot peaks of traffic on your
	      server.

       Virtual Hosts
	      This  panel  will	display	all the	different virtual hosts	parsed
	      from the access log. This	panel  is  displayed  if  %v  is  used
	      within the log-format string.

       Referrers URLs
	      If the host in question accessed the site via another resource,
	      or was linked/diverted to you from another host, the URL they
	      were referred from will be provided in this panel. See
	      `--ignore-panel` in your configuration file to enable it.
	      Disabled by default.

       Referring Sites
	      This panel will display only the host part, not the whole URL,
	      of the address where the request came from.

       Keyphrases
	      It reports keyphrases used on Google search, Google cache, and
	      Google translate that have led to your web server. At present,
	      it only supports Google search queries via HTTP. See
	      `--ignore-panel` in your configuration file to enable it.
	      Disabled by default.

       Geo Location
	      Determines  where	 an IP address is geographically located. Sta-
	      tistics are broken down by continent and country.	It needs to be
	      compiled with GeoLocation	support.

       HTTP Status Codes
	      The values of the numeric status codes returned for HTTP
	      requests.

       ASN    This panel displays ASN (Autonomous  System  Numbers)  data  for
	      GeoIP2 and legacy	databases. Great for detecting malicious traf-
	      fic and blocking accordingly.

       Remote User (HTTP authentication)
	      This  is the userid of the person	requesting the document	as de-
	      termined by HTTP authentication. If the document is not password
	      protected, this part will	be "-" just  like  the	previous  one.
	      This panel is not	enabled	unless %e is given within the log-for-
	      mat variable.

       Cache Status
	      If you are using caching on your server, you may be at the point
	      where  you  want	to  know  if  your request is being cached and
	      served from the cache. This panel	shows the cache	status of  the
	      object the server	served.	This panel is not enabled unless %C is
	      given within the log-format variable. The status can be either
	      `MISS`, `BYPASS`, `EXPIRED`, `STALE`, `UPDATING`, `REVALIDATED`
	      or `HIT`.

       MIME Types
	      This  panel specifies Media Types	(formerly known	as MIME	types)
	      and Media	Subtypes which will be assigned	and listed underneath.
	      This panel is not	enabled	unless %M is given within the log-for-
	      mat   variable.	See    https://www.iana.org/assignments/media-
	      types/media-types.xhtml for more details.

       Encryption Settings
	      This panel shows the SSL/TLS protocol used along with the Cipher
	      Suites. This panel is not enabled unless %K is given within the
	      log-format variable.

       NOTE:  Optionally and if	configured, all	panels can display the average
       time taken to serve the request.

STORAGE
       There are three storage options that can	be used	with GoAccess.	Choos-
       ing one will depend on your environment and needs.

       Default Hash Tables
	      In-memory	 storage  provides  better  performance	at the cost of
	      limiting the dataset size	to the amount  of  available  physical
	      memory.  GoAccess	 uses  in-memory hash tables. It has very good
	      memory usage and pretty good performance.	This storage has  sup-
	      port for on-disk persistence.

CONFIGURATION
       Multiple options can be used to configure GoAccess. For a complete,
       up-to-date list of configure options, run ./configure --help.

       --enable-debug
	      Compile with debugging symbols and turn off  compiler  optimiza-
	      tions.

       --enable-utf8
	      Compile with wide	character support. Ncursesw is required.

       --enable-geoip=<legacy|mmdb>
	      Compile  with  GeoLocation support. MaxMind's GeoIP is required.
	      legacy will utilize the original	GeoIP  databases.   mmdb  will
	      utilize the enhanced GeoIP2 databases.

       --with-getline
	      Dynamically  expands line	buffer in order	to parse full line re-
	      quests instead of	using a	fixed size buffer of 4096.

       --with-openssl
	      Compile GoAccess with OpenSSL support for	its WebSocket server.

OPTIONS
       The following options can be supplied to	the command  or	 specified  in
       the  configuration  file.  If specified in the configuration file, long
       options need to be used without prepending --  and  without  using  the
       equal sign =.
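
       As a rough illustration of that rule, the following Python sketch
       rewrites a long command-line option that carries a value into its
       configuration-file form. The helper name is hypothetical, and options
       that take no value are deliberately left out because the text above
       does not describe how they are written in the file.

```python
def to_config_line(option):
    """Turn '--name=value' into the 'name value' form used in the config file."""
    if not option.startswith("--") or "=" not in option:
        raise ValueError("expected a long option of the form --name=value")
    # Drop the leading '--' and split on the first '=' only, so values that
    # themselves contain '=' are kept intact.
    name, value = option[2:].split("=", 1)
    return f"{name} {value}"


print(to_config_line("--date-spec=hr"))          # date-spec hr
print(to_config_line("--exclude-ip=127.0.0.1"))  # exclude-ip 127.0.0.1
```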

   LOG/DATE/TIME FORMAT
       --time-format=<timeformat>
	      The  time-format variable	followed by a space, specifies the log
	      format time containing either a name of a	predefined format (see
	      options below) or	any combination	of regular characters and spe-
	      cial format specifiers.

	      They all begin with a percentage (%) sign. See `man strftime`.
	      e.g., %T or %H:%M:%S.

	      Note  that  if  a	timestamp is given in microseconds, %f must be
	      used as time-format.  If the timestamp is	given in  milliseconds
	      %* must be used as time-format.

       --date-format=<dateformat>
	      The date-format variable followed by a space, specifies the log
	      format date containing either a name of a predefined format (see
	      options below) or any combination of regular characters and
	      special format specifiers.

	      They all begin with a percentage (%) sign. See `man strftime`.
	      e.g., %Y-%m-%d.

	      Note  that  if  a	timestamp is given in microseconds, %f must be
	      used as date-format.  If the timestamp is	given in  milliseconds
	      %* must be used as date-format.

       --datetime-format=<date_time_format>
	      The  date	and time format	combines the two variables into	a sin-
	      gle option. This gives the ability to get	the  timezone  from  a
	      request  and  convert  it	 to  another  timezone for output. See
	      --tz=<timezone>

	      They all begin with a percentage (%) sign. See  `man  strftime`.
	      e.g., %d/%b/%Y:%H:%M:%S %z.

	      Note that	if --datetime-format is	used, %x must be passed	in the
	      log-format variable to represent the date	and time field.
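
	      To see what such a specifier string describes, the sketch below
	      parses a sample field with Python's datetime.strptime, whose
	      directives resemble, but are not guaranteed to be identical to,
	      the strftime specifiers GoAccess accepts. The sample timestamp
	      and the function name are made up for illustration.

```python
from datetime import datetime, timezone

DATETIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"  # example format from the text above


def parse_timestamp(field):
    """Parse a combined date/time field into a timezone-aware datetime."""
    return datetime.strptime(field, DATETIME_FORMAT)


ts = parse_timestamp("05/Jun/2016:16:59:00 +0200")
print(ts.isoformat())                           # 2016-06-05T16:59:00+02:00
print(ts.astimezone(timezone.utc).isoformat())  # 2016-06-05T14:59:00+00:00
```

	      Because the %z offset is part of the parsed value, the same
	      instant can be re-expressed in another timezone for output.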

       --log-format=<logformat>
	      The log-format variable followed by a space or \t	for tab-delim-
	      ited, specifies the log format string.

	      Note  that  if  there  are  spaces within	the format, the	string
	      needs to be enclosed in single/double quotes. Inner quotes  need
	      to be escaped.

	      In  addition  to	specifying  the	raw log/date/time formats, for
	      simplicity, any of the following predefined log format names can
	      be supplied to the log/date/time-format variables. GoAccess  can
	      also handle one predefined name in one variable and another pre-
	      defined name in another variable.

		COMBINED     - Combined	Log Format,
		VCOMBINED    - Combined	Log Format with	Virtual	Host,
		COMMON	     - Common Log Format,
		VCOMMON	     - Common Log Format with Virtual Host,
		W3C	     - W3C Extended Log	File Format,
		SQUID	     - Native Squid Log	Format,
		CLOUDFRONT   - Amazon CloudFront Web Distribution,
		CLOUDSTORAGE - Google Cloud Storage,
		AWSELB	     - Amazon Elastic Load Balancing,
		AWSS3	     - Amazon Simple Storage Service (S3)
		AWSALB	     - Amazon Application Load Balancer
		CADDY	     - Caddy's JSON Structured format (local/info for-
	      mat)
		TRAEFIKCLF   - Traefik's CLF flavor

	      Note:  Generally,	 you  need  quotes  around values that include
	      white spaces, commas,  pipes,  quotes,  and/or  brackets.	 Inner
	      quotes must be escaped.

	      Note:  Piping  data  into	 GoAccess won't	prompt a log/date/time
	      configuration dialog, you	will need to previously	define	it  in
	      your configuration file or in the	command	line.

	      Note:  The default GoAccess format for CADDY is the 'local/info'
	      format. Nevertheless, if needed, you have	the option to  utilize
	      a	custom GoAccess	log format to match your particular configura-
	      tion.

   USER	INTERFACE OPTIONS
       -c --config-dialog
	      Prompt log/time/date configuration window	on program start. Only
	      when curses is initialized.

       -i --hl-header
	      Color highlight active terminal panel.

       -m --with-mouse
	      Enable mouse support on main terminal dashboard.

       --color=<fg:bg[attrs, PANEL]>
	      Specify custom colors for	the terminal output.

	      Color Syntax
		DEFINITION space/tab colorFG#:colorBG# [attributes,PANEL]

	       FG# = foreground	color [-1...255] (-1 = default term color)
	       BG# = background	color [-1...255] (-1 = default term color)

	      Optionally,  it  is possible to apply color attributes (multiple
	      attributes are comma separated), such as:	bold, underline,  nor-
	      mal, reverse, blink

	      If  desired,  it	is  possible to	apply custom colors per	panel,
	      that is, a metric	in the REQUESTS	panel can be of	color A, while
	      the same metric in the BROWSERS panel can	be of color B.

	      Available	color definitions:
		COLOR_MTRC_HITS
		COLOR_MTRC_VISITORS
		COLOR_MTRC_DATA
		COLOR_MTRC_BW
		COLOR_MTRC_AVGTS
		COLOR_MTRC_CUMTS
		COLOR_MTRC_MAXTS
		COLOR_MTRC_PROT
		COLOR_MTRC_MTHD
		COLOR_MTRC_HITS_PERC
		COLOR_MTRC_HITS_PERC_MAX
		COLOR_MTRC_VISITORS_PERC
		COLOR_MTRC_VISITORS_PERC_MAX
		COLOR_PANEL_COLS
		COLOR_BARS
		COLOR_ERROR
		COLOR_SELECTED
		COLOR_PANEL_ACTIVE
		COLOR_PANEL_HEADER
		COLOR_PANEL_DESC
		COLOR_OVERALL_LBLS
		COLOR_OVERALL_VALS
		COLOR_OVERALL_PATH
		COLOR_ACTIVE_LABEL
		COLOR_BG
		COLOR_DEFAULT
		COLOR_PROGRESS

	      See configuration	file for a sample color	scheme.
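
	      The foreground/background part of the syntax can be checked with
	      a small Python sketch. It assumes the token is written literally
	      as color<N>:color<N> (one reading of "colorFG#:colorBG#" above)
	      and covers neither attributes nor the panel name; the function
	      name is hypothetical.

```python
import re

COLOR_PAIR = re.compile(r"^color(-?\d+):color(-?\d+)$")


def parse_color_pair(token):
    """Return (fg, bg) from a 'colorFG#:colorBG#' token, checking -1..255."""
    match = COLOR_PAIR.match(token)
    if match is None:
        raise ValueError(f"not a colorFG#:colorBG# token: {token!r}")
    fg, bg = (int(group) for group in match.groups())
    for value in (fg, bg):
        if not -1 <= value <= 255:
            raise ValueError(f"color {value} is outside -1..255")
    return fg, bg


print(parse_color_pair("color110:color-1"))  # (110, -1)
```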

       --color-scheme=<1|2|3>
	      Choose among color schemes.  1 for the default grey  scheme.   2
	      for  the	green scheme.  3 for the Monokai scheme	(shown only if
	      terminal supports	256 colors).

       --crawlers-only
	      Parse and	display	only crawlers (bots).

       --html-custom-css=<path/custom.css>
	      Specifies	a custom CSS file path to load in the HTML report.

       --html-custom-js=<path/custom.js>
	      Specifies	a custom JS file path to load in the HTML report.

       --html-report-title=<title>
	      Set HTML report page title and header.

       --html-refresh=<secs>
	      Refresh the HTML report every X seconds. The value has to	be be-
	      tween 1 and 60 seconds. The default is set to refresh  the  HTML
	      report every 1 second.

       --html-prefs=<JSON>
	      Set  HTML	report default preferences. Supply a valid JSON	object
	      containing the HTML preferences. It allows the ability  to  cus-
	      tomize each panel	plot. See example below.

	      Note: The	JSON object passed needs to be a one line JSON string.
	      For instance,

	      --html-prefs='{"theme":"bright","perPage":5,"layout":"horizontal","showTables":true,"visitors":{"plot":{"chartType":"bar"}}}'
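
	      If the preferences are generated by a script, one way to
	      guarantee a one-line JSON string is to serialize with compact
	      separators and shell-quote the result. The Python sketch below
	      reuses the keys from the example above and makes no claim about
	      which other keys are accepted.

```python
import json
import shlex

prefs = {
    "theme": "bright",
    "perPage": 5,
    "layout": "horizontal",
    "showTables": True,
    "visitors": {"plot": {"chartType": "bar"}},
}

# Compact separators keep the JSON on a single line with no extra spaces.
one_line = json.dumps(prefs, separators=(",", ":"))

# shlex.quote wraps the value so a POSIX shell passes it through unchanged.
argument = "--html-prefs=" + shlex.quote(one_line)
print(argument)
```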

       --json-pretty-print
	      Format JSON output using tabs and	newlines.

	      Note: This is not recommended when outputting a real-time HTML
	      report since the WebSocket payload will be much larger.

       --max-items=<number>
	      The maximum number of items to display per  panel.  The  maximum
	      can be a number between 1	and n.

	      Note:  Only  the	CSV  and  JSON	output	allow a	maximum	number
	      greater than the default value of	366 (or	50  in	the  real-time
	      HTML output) items per panel.

       --no-color
	      Turn off colored output. This is the default output on terminals
	      that do not support colors.

       --no-column-names
	      Don't  write column names	in the terminal	output.	By default, it
	      displays column names for	each available metric in every panel.

       --no-csv-summary
	      Disable summary metrics on the CSV output.

       --no-progress
	      Disable progress metrics [total requests/requests	per second].

       --no-tab-scroll
	      Disable scrolling	through	panels when TAB	is pressed or  when  a
	      panel is selected	using a	numeric	key.

       --no-html-last-updated
	      Do  not show the last updated field displayed in the HTML	gener-
	      ated report.

       --no-parsing-spinner
	      Do not show the progress metrics and parsing spinner.

       --tz=<timezone>
	      Outputs the report date/time data in the given timezone. Note
	      that it uses the canonical timezone name, e.g., Europe/Berlin,
	      America/Chicago or Africa/Cairo. If an invalid timezone name is
	      given, the output will be in GMT. See --datetime-format in order
	      to properly specify a timezone in the date/time format.

   SERVER OPTIONS
       Note: This is just a WebSocket server to provide the raw real-time
       data. It is not a web server itself. To access your report's HTML file,
       you will still need your own HTTP server: place the generated report in
       its document root directory and open the HTML file in your browser. The
       browser will then open another WebSocket connection to the WebSocket
       server you may set up here, to keep the dashboard up to date.

       --addr Specify IP address to bind the server to.	Otherwise it binds  to
	      0.0.0.0.

	      Usually  there is	no need	to specify the address,	unless you in-
	      tentionally would	like to	bind the server	to a different address
	      within your server.

       --daemonize
	      Run GoAccess as daemon (only if --real-time-html enabled).

	      Note: It's important to make use of absolute paths across	 GoAc-
	      cess' configuration.

       --user-name=<username>
	      Run GoAccess as the specified user.

	      Note: It's important to ensure the user or the user's group can
	      access the input and output files as well as any other files
	      needed. Other groups the user belongs to will be ignored. As
	      such it's advised to run GoAccess behind an SSL proxy as it's
	      unlikely this user can access the SSL certificates.

       --origin=<url>
	      Ensure clients send the specified	origin header  upon  the  Web-
	      Socket handshake.

       --pid-file=<path/goaccess.pid>
	      Write the daemon PID to a file when used along with the
	      --daemonize option.

       --port=<port>
	      Specify the port to use. By default GoAccess'  WebSocket	server
	      listens on port 7890.

       --real-time-html
	      Enable real-time HTML output.

	      GoAccess uses its own WebSocket server to push the data from the
	      server to the client. See http://gwsocket.io for more details on
	      how the WebSocket server works.

       --ws-url=<[scheme://]url[:port]>
	      URL to which the WebSocket server	responds. This is the URL sup-
	      plied to the WebSocket constructor on the	client side.

	      Optionally, it is	possible to specify the	WebSocket URI  scheme,
	      such  as	ws://  or wss:// for unencrypted and encrypted connec-
	      tions. e.g., wss://goaccess.io

	      If GoAccess is running behind a proxy, you could set the	client
	      side  to connect to a different port by specifying the host fol-
	      lowed by a colon and the port.  e.g., goaccess.io:9999

	      By default, it will attempt to connect to	the generated report's
	      hostname.	If GoAccess is running on a remote server, the host of
	      the remote server	should be specified here. Also,	make  sure  it
	      is a valid host and NOT an http address.

       --ping-interval=<secs>
	      Enable  WebSocket	 ping with specified interval in seconds. This
	      helps prevent idle connections getting disconnected.

       --fifo-in=<path/file>
	      Creates a named pipe (FIFO) that reads from the given path/file.

       --fifo-out=<path/file>
	      Creates a	named pipe (FIFO) that writes to the given path/file.

       --ssl-cert=<cert.crt>
	      Path to TLS/SSL certificate. In order to enable TLS/SSL support,
	      GoAccess requires	that --ssl-cert	and --ssl-key are used.

	      Only if configured using --with-openssl

       --ssl-key=<priv.key>
	      Path to TLS/SSL private key. In order to enable TLS/SSL support,
	      GoAccess requires	that --ssl-cert	and --ssl-key are used.

	      Only if configured using --with-openssl

   FILE	OPTIONS
       -      The log file to parse is read from stdin.

       -f --log-file=<logfile>
	      Specify  the  path  to  the input	log file. If set in the	config
	      file, it will take priority over -f from the command line.

       -S --log-size=<bytes>
	      Specify the log size in bytes. This is  useful  when  piping  in
	      logs for processing in which the log size	can be explicitly set.

       -l --debug-file=<debugfile>
	      Send all debug messages to the specified file.

       -p --config-file=<configfile>
	      Specify a	custom configuration file to use. If set, it will take
	      priority over the	global configuration file (if any).

       --external-assets
	      Output  HTML  assets  to external	JS/CSS files. Great if you are
	      setting up Content Security Policy (CSP).	This will  create  two
	      separate	files,	goaccess.js and	goaccess.css , in the same di-
	      rectory as your report.html file.

       --invalid-requests=<filename>
	      Log invalid requests to the specified file.

       --unknowns-log=<filename>
	      Log unknown browsers and OSs to the specified file.

       --no-global-config
	      Do not load the global configuration file. This directory	should
	      normally	  be	/usr/local/etc,	   unless    specified	  with
	      --sysconfdir=/dir.   See	--dcf  option  for finding the default
	      configuration file.

   PARSE OPTIONS
       -a --agent-list
	      Enable a list of user-agents by host. For	faster parsing,	do not
	      enable this flag.

       -d --with-output-resolver
	      Enable IP	resolver on HTML|JSON output.

       -e --exclude-ip=<IP|IP-range>
	      Exclude an IPv4 or IPv6 from being  counted.  Applicable	solely
	      during access log	data processing, it does not exclude persisted
	      data.   Ranges  can  be included as well using a dash in between
	      the IPs (start-end).

	      Examples:
		exclude-ip 127.0.0.1
		exclude-ip 192.168.0.1-192.168.0.100
		exclude-ip ::1
		exclude-ip 0:0:0:0:0:ffff:808:804-0:0:0:0:0:ffff:808:808
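
	      The start-end range semantics can be illustrated with Python's
	      ipaddress module. This is a sketch of an inclusive range check
	      under the syntax shown above, not the matching code GoAccess
	      uses; the function name is hypothetical.

```python
import ipaddress


def is_excluded(ip, spec):
    """Check ip against a single address or an inclusive 'start-end' range."""
    addr = ipaddress.ip_address(ip)
    start, _, end = spec.partition("-")
    low = ipaddress.ip_address(start)
    high = ipaddress.ip_address(end) if end else low
    # IPv4 and IPv6 addresses are not comparable with each other.
    if addr.version != low.version or low.version != high.version:
        return False
    return low <= addr <= high


print(is_excluded("192.168.0.50", "192.168.0.1-192.168.0.100"))   # True
print(is_excluded("192.168.0.101", "192.168.0.1-192.168.0.100"))  # False
print(is_excluded("::1", "::1"))                                  # True
```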

       -j --jobs=<1-6>
	      This specifies the number	of parallel processing threads	to  be
	      used  during the execution of the	program. It determines the de-
	      gree of concurrency when analyzing log data, allowing for	paral-
	      lel processing of	multiple tasks simultaneously. It defaults  to
	      1	 thread.  It's	common	to set the number of jobs based	on the
	      available	hardware resources, such as the	number of CPU cores.

       -H --http-protocol=<yes|no>
	      Set/unset	HTTP request protocol. This will create	a request  key
	      containing the request protocol +	the actual request.

       -M --http-method=<yes|no>
	      Set/unset	 HTTP  request	method.	This will create a request key
	      containing the request method + the actual request.

       -o --output=<path/file.[json|csv|html]>
	      Write output to stdout given one of the following	files and  the
	      corresponding extension for the output format:

		/path/file.csv - Comma-separated values	(CSV)
		/path/file.json	- JSON (JavaScript Object Notation)
		/path/file.html	- HTML

       -q --no-query-string
	      Ignore	    request's	     query	  string.	 i.e.,
	      www.google.com/page.htm?query => www.google.com/page.htm.

	      Note: Removing the query string can greatly decrease memory con-
	      sumption,	especially on timestamped requests.

       -r --no-term-resolver
	      Disable IP resolver on terminal output.

       --444-as-404
	      Treat non-standard status	code 444 as 404.

       --4xx-to-unique-count
	      Add 4xx client errors to the unique visitors count.

       --anonymize-ip
	      Anonymize the client IP address. The IP anonymization option
	      sets the last octet of IPv4 user IP addresses and the last 80
	      bits of IPv6 addresses to zeros, e.g.,
		192.168.20.100 => 192.168.20.0
		2a03:2880:2110:df07:face:b00c::1 => 2a03:2880:2110:df07::

	      Note: This deactivates -a.

       --chunk-size=<256-32768>
	      This determines the number of lines that form a chunk. This  pa-
	      rameter  influences  the size of the data	processed concurrently
	      by each thread, allowing for parallelization of the file reading
	      and processing tasks. The	value of chunk-size affects the	 effi-
	      ciency  of  the parallel processing and can be adjusted based on
	      factors such as system resources and the characteristics of  the
	      input data.

	      Low Values: If chunk-size	is set too low,	it might result	in in-
	      efficient	 processing.  For  instance,  if each chunk contains a
	      very small number	of lines, the overhead of managing and coordi-
	      nating parallel processing might outweigh	the benefits.

	      Large Values: Conversely,	if chunk-size  is  set	too  high,  it
	      could  lead to resource exhaustion. Each chunk represents	a por-
	      tion of data that	a thread processes in parallel.	Setting	chunk-
	      size to an excessively large value might	cause  memory  issues,
	      particularly if there are	many parallel threads running simulta-
	      neously.
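
	      The relationship between chunks and threads can be sketched in
	      Python as follows. The per-chunk work here is a stand-in (it
	      only counts non-empty lines) and the names are hypothetical; it
	      shows how input is divided and handed to a pool of workers, not
	      how GoAccess schedules its threads.

```python
from concurrent.futures import ThreadPoolExecutor


def chunks(lines, chunk_size):
    """Yield consecutive lists of at most chunk_size lines."""
    for start in range(0, len(lines), chunk_size):
        yield lines[start:start + chunk_size]


def count_hits(chunk):
    """Stand-in for per-chunk parsing: count non-empty lines."""
    return sum(1 for line in chunk if line.strip())


def parse_parallel(lines, jobs=1, chunk_size=256):
    """Process each chunk on a pool of `jobs` threads and sum the results."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return sum(pool.map(count_hits, chunks(lines, chunk_size)))


log = [f"GET /page{i}" for i in range(1000)]
print([len(c) for c in chunks(log, 256)])              # [256, 256, 256, 232]
print(parse_parallel(log, jobs=4, chunk_size=256))     # 1000
```

	      With a very small chunk size the pool handles many tiny tasks,
	      and with a very large one each task holds more lines in memory
	      at once, which mirrors the trade-off described above.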

       --anonymize-level
	      Specifies	the anonymization levels: 1 => default,	2 => strong, 3
	      => pedantic.
	      +-------------+---------+---------+---------+
	      |	Bits-hidden | Level 1 |	Level 2	| Level	3 |
	      +-------------+---------+---------+---------+
	      |	IPv4	    | 8	      |	16	| 24	  |
	      +-------------+---------+---------+---------+
	      |	IPv6	    | 64      |	80	| 96	  |
	      +-------------+---------+---------+---------+
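
	      Read as "number of low-order bits set to zero", the table can be
	      expressed as a short Python sketch. The bit counts are copied
	      from the table; the function name is hypothetical and the code
	      is an illustration rather than GoAccess's implementation.

```python
import ipaddress

# Bits hidden per address family and level, copied from the table above.
BITS_HIDDEN = {4: {1: 8, 2: 16, 3: 24}, 6: {1: 64, 2: 80, 3: 96}}


def anonymize(ip, level=1):
    """Zero the low-order bits of ip according to the anonymization level."""
    addr = ipaddress.ip_address(ip)
    hidden = BITS_HIDDEN[addr.version][level]
    masked = (int(addr) >> hidden) << hidden
    # Rebuild an address of the same family from the masked integer.
    return str(type(addr)(masked))


print(anonymize("192.168.20.100"))                    # 192.168.20.0
print(anonymize("192.168.20.100", level=2))           # 192.168.0.0
print(anonymize("2a03:2880:2110:df07:face:b00c::1"))  # 2a03:2880:2110:df07::
```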

       --all-static-files
	      Include	static	files  that  contain  a	 query	string.	 e.g.,
	      /fonts/fontawesome-webfont.woff?v=4.0.3

       --browsers-file=<path>
	      By default GoAccess parses an "essential/basic" curated list  of
	      browsers & crawlers. If you need to add additional browsers, use
	      this   option.	Include	  an   additional  delimited  list  of
	      browsers/crawlers/feeds etc.  See	 config/browsers.list  for  an
	      example	 or   https://raw.githubusercontent.com/allinurl/goac-
	      cess/master/config/browsers.list

       --date-spec=<date|hr|min>
	      Set the date specificity to either date (default), hr to display
	      hours or min to display minutes appended to the date.

	      This is used in the visitors panel.  It's	 useful	 for  tracking
	      visitors	at  the	 hour level. For instance, an hour specificity
	      would yield to  display  traffic	as  18/Dec/2010:19  or	minute
	      specificity 18/Dec/2010:19:59.
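
	      The three levels differ only in how much of the timestamp is
	      kept in the key. A minimal Python sketch, with a hypothetical
	      function name and the key layouts taken from the examples above:

```python
from datetime import datetime

SPEC_FORMATS = {
    "date": "%d/%b/%Y",
    "hr": "%d/%b/%Y:%H",
    "min": "%d/%b/%Y:%H:%M",
}


def visitor_key(ts, spec="date"):
    """Format a timestamp at the requested date specificity."""
    return ts.strftime(SPEC_FORMATS[spec])


ts = datetime(2010, 12, 18, 19, 59, 30)
print(visitor_key(ts))         # 18/Dec/2010
print(visitor_key(ts, "hr"))   # 18/Dec/2010:19
print(visitor_key(ts, "min"))  # 18/Dec/2010:19:59
```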

       --double-decode
	      Decode double-encoded values. This includes the user-agent,
	      request, and referrer.

       --enable-panel=<PANEL>
	      Enable parsing and displaying the	given panel.

	      Available	panels:
		VISITORS
		REQUESTS
		REQUESTS_STATIC
		NOT_FOUND
		HOSTS
		OS
		BROWSERS
		VISIT_TIMES
		VIRTUAL_HOSTS
		REFERRERS
		REFERRING_SITES
		KEYPHRASES
		STATUS_CODES
		REMOTE_USER
		CACHE_STATUS
		GEO_LOCATION
		MIME_TYPE
		TLS_TYPE

       --fname-as-vhost=<regex>
	      Use log filename(s) as virtual host(s). POSIX regex is passed to
	      extract the virtual host from the	 filename.  e.g.,  --fname-as-
	      vhost='[a-z]*.[a-z]*'  can be used to extract awesome.com.log =>
	      awesome.com.
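
	      The example pattern can be tried out in Python, with the caveat
	      that Python's re module is not a POSIX regex engine and may
	      differ on other patterns. The sketch assumes the whole match is
	      used as the virtual host and that only the base file name is
	      examined; the function name is hypothetical.

```python
import re


def vhost_from_filename(filename, pattern):
    """Return the first match of pattern in filename, or None."""
    match = re.search(pattern, filename)
    return match.group(0) if match else None


print(vhost_from_filename("awesome.com.log", r"[a-z]*.[a-z]*"))
# awesome.com
```

	      Note that the unescaped dot matches any character; a stricter
	      pattern would escape it as \. to match a literal dot only.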

       --hide-referrer=<NEEDLE>
	      Hide a referrer but still	count it. Wild cards  are  allowed  in
	      the needle. i.e.,	*.bing.com.

       --hour-spec=<hr|min>
	      Set the time specificity to either hour (default)	or min to dis-
	      play the tenth of	an hour	appended to the	hour.

	      This  is	used  in  the time distribution	panel. It's useful for
	      tracking peaks of	traffic	on your	server at specific times.
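
	      A Python sketch of the two key layouts follows. It assumes that
	      the digit after the colon (as in 16:4) is the tens digit of the
	      minute, i.e. a 10-minute bucket; that reading is an
	      interpretation of the example, and the function name is
	      hypothetical.

```python
from datetime import datetime


def visit_time_key(ts, spec="hr"):
    """Bucket a timestamp by hour, or by hour plus the minute's tens digit."""
    if spec == "min":
        # Assumption: 16:40 through 16:49 all fall into the bucket "16:4".
        return f"{ts:%H}:{ts.minute // 10}"
    return f"{ts:%H}"


ts = datetime(2016, 6, 5, 16, 47)
print(visit_time_key(ts))         # 16
print(visit_time_key(ts, "min"))  # 16:4
```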

       --ignore-crawlers
	      Ignore crawlers from being counted.

       --unknowns-as-crawlers
	      Classify unknown OS and browsers as crawlers.

       --ignore-panel=<PANEL>
	      Ignore parsing and displaying the	given panel.

	      Available	panels:
		VISITORS
		REQUESTS
		REQUESTS_STATIC
		NOT_FOUND
		HOSTS
		OS
		BROWSERS
		VISIT_TIMES
		VIRTUAL_HOSTS
		REFERRERS
		REFERRING_SITES
		KEYPHRASES
		STATUS_CODES
		REMOTE_USER
		CACHE_STATUS
		GEO_LOCATION
		MIME_TYPE
		TLS_TYPE

       --ignore-referrer=<referrer>
	      Ignore referrers from being counted.  Wildcards  allowed.	 e.g.,
	      *.domain.com ww?.domain.*
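
	      Shell-style wildcard matching of this kind can be approximated
	      with Python's fnmatch module. This is an approximation only:
	      fnmatch also gives special meaning to square brackets, and the
	      sketch assumes the pattern is compared against the referrer's
	      host name. The function name is hypothetical.

```python
from fnmatch import fnmatchcase

IGNORED = ["*.domain.com", "ww?.domain.*"]  # patterns from the example above


def is_ignored(referrer_host):
    """Return True if the host matches any ignore pattern."""
    return any(fnmatchcase(referrer_host, pattern) for pattern in IGNORED)


print(is_ignored("blog.domain.com"))  # True  (matches *.domain.com)
print(is_ignored("www.domain.org"))   # True  (matches ww?.domain.*)
print(is_ignored("example.org"))      # False
```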

       --ignore-statics=<req|panel>
	      Ignore static file requests.

	      req
		Only ignore request from valid requests

	      panel
		Ignore request from panels.

	      Note that it will count them towards the total number of
	      requests.

       --ignore-status=<CODE>
	      Ignore parsing and displaying one	or  multiple  status  code(s).
	      For multiple status codes, use this option multiple times.

       --keep-last=<num_days>
	      Keep the last specified number of	days in	storage. This will re-
	      cycle  the  storage  tables.  e.g.,  keep	& show only the	last 7
	      days.

       --no-ip-validation
	      Disable client IP validation. Useful if IP addresses have been
	      obfuscated before being logged. The log still needs to contain
	      a placeholder for %h; usually it's a resolved IP, e.g.,
	      ord37s19-in-f14.1e100.net.

       --no-strict-status
	      Disable  HTTP  status code validation. Some servers would	record
	      this value only if a connection was established  to  the	target
	      and the target sent a response.  Otherwise, it could be recorded
	      as -.

       --num-tests=<number>
	      Number of	lines from the access log to test against the provided
	      log/date/time  format.  By default, the parser is	set to test 10
	      lines. If	set to 0, the parser won't test	 any  lines  and  will
	      parse  the  whole	 access	 log.  If  a  line  matches  the given
	      log/date/time format before it reaches <number>, the parser will
	      consider the log to be valid,  otherwise	GoAccess  will	return
	      EXIT_FAILURE and display the relevant error messages.
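
	      The decision described above can be sketched in Python. The
	      regular expression is a hypothetical stand-in for a real log
	      format ("<ip> <status>" per line), and the function name is made
	      up; only the control flow follows the text.

```python
import re

# Hypothetical stand-in for a log format: "<ip> <status>" on each line.
LINE_FORMAT = re.compile(r"^\d{1,3}(\.\d{1,3}){3} \d{3}$")


def format_looks_valid(lines, num_tests=10):
    """Return True if any of the first num_tests lines matches the format."""
    if num_tests == 0:
        return True  # testing disabled: go straight to parsing the whole log
    return any(LINE_FORMAT.match(line) for line in lines[:num_tests])


print(format_looks_valid(["garbage", "10.0.0.1 200"]))         # True
print(format_looks_valid(["garbage"] * 10 + ["10.0.0.1 200"]))  # False
```

	      In the second call the only matching line is the eleventh, so
	      it falls outside the default window of 10 tested lines.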

       --process-and-exit
	      Parse  log  and  exit  without outputting	data. Useful if	we are
	      looking to only add new data to  the  on-disk  database  without
	      outputting to a file or a	terminal.

       --real-os
	      Display real OS names, e.g., Windows XP, Snow Leopard.

       --sort-panel=<PANEL,FIELD,ORDER>
	      Sort panel on initial load. Sort options are separated by	comma.
	      Options are in the form: PANEL,METRIC,ORDER

	      Available	metrics:
		BY_HITS	    - Sort by hits
		BY_VISITORS - Sort by unique visitors
		BY_DATA	    - Sort by data
		BY_BW	    - Sort by bandwidth
		BY_AVGTS    - Sort by average time served
		BY_CUMTS    - Sort by cumulative time served
		BY_MAXTS    - Sort by maximum time served
		BY_PROT	    - Sort by http protocol
		BY_MTHD	    - Sort by http method

	      Available	orders:
		ASC
		DESC
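
	      A value for this option can be validated before it is passed on,
	      as in the Python sketch below. The panel names are taken from
	      the --enable-panel list and the metrics and orders from the
	      lists above; the function name is hypothetical, and whether
	      every metric applies to every panel is not checked.

```python
PANELS = {
    "VISITORS", "REQUESTS", "REQUESTS_STATIC", "NOT_FOUND", "HOSTS", "OS",
    "BROWSERS", "VISIT_TIMES", "VIRTUAL_HOSTS", "REFERRERS",
    "REFERRING_SITES", "KEYPHRASES", "STATUS_CODES", "REMOTE_USER",
    "CACHE_STATUS", "GEO_LOCATION", "MIME_TYPE", "TLS_TYPE",
}
METRICS = {
    "BY_HITS", "BY_VISITORS", "BY_DATA", "BY_BW", "BY_AVGTS",
    "BY_CUMTS", "BY_MAXTS", "BY_PROT", "BY_MTHD",
}
ORDERS = {"ASC", "DESC"}


def parse_sort_panel(value):
    """Split 'PANEL,METRIC,ORDER' and check each part against known names."""
    parts = value.split(",")
    if len(parts) != 3:
        raise ValueError("expected PANEL,METRIC,ORDER")
    panel, metric, order = parts
    if panel not in PANELS or metric not in METRICS or order not in ORDERS:
        raise ValueError(f"unknown panel, metric or order in {value!r}")
    return panel, metric, order


print(parse_sort_panel("VISITORS,BY_HITS,ASC"))
# ('VISITORS', 'BY_HITS', 'ASC')
```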

       --static-file=<extension>
	      Add static file extension, e.g., .mp3. Extensions are case
	      sensitive.

   GEOLOCATION OPTIONS
       -g --std-geoip
	      Standard GeoIP database for less memory usage.

       --geoip-database=<geofile>
	      Specify path to GeoIP database file. i.e., GeoLiteCity.dat.

	      If using GeoIP2, you will need to download the GeoLite2 City or
	      Country database from MaxMind.com and use the option
	      --geoip-database to specify the database. You can also get
	      updated database files for GeoIP legacy; these are available as
	      GeoLite Legacy Databases from MaxMind.com. IPv4 and IPv6 files
	      are supported as well. For updated DB URLs, please see the
	      default GoAccess configuration file.

	      Note: --geoip-city-data is an alias of --geoip-database.

   OTHER OPTIONS
       -h --help
	      The help.

       -s --storage
	      Display current storage method. i.e., B+ Tree, Hash.

       -V --version
	      Display version information and exit.

       --dcf  Display the path of the default config file  when	 `-p`  is  not
	      used.

   PERSISTENCE STORAGE OPTIONS
       --persist
	      Persist parsed data to disk. If database files exist, they will
	      be overwritten. This should be set when parsing the first
	      dataset. See examples below.

       --restore
	      Load previously stored data from disk. If	reading	persisted data
	      only,  the database files	need to	exist. See --persist and exam-
	      ples below.

       --db-path=<dir>
	      Path where the on-disk database files are	 stored.  The  default
	      value is the /tmp	directory.

CUSTOM LOG/DATE	FORMAT
       GoAccess	can parse virtually any	web log	format.

       Predefined options include Common Log Format (CLF), Combined Log Format
       (XLF/ELF), including virtual host, Amazon CloudFront (Download
       Distribution), Google Cloud Storage and W3C format (IIS).

       GoAccess	allows any custom format string	as well.

       There are two ways to configure the log format.	The easiest is to  run
       GoAccess	with -c	to prompt a configuration window. Otherwise, it	can be
       configured under	~/.goaccessrc or the %sysconfdir%.

       time-format
	      The time-format variable, followed by a space, specifies the log
	      format time containing any combination of regular characters and
	      special format specifiers. They all begin with a percent (%)
	      sign. See `man strftime`. e.g., %T or %H:%M:%S.

	      Note: If a timestamp is given in microseconds, %f	must  be  used
	      as time-format or	%* if the timestamp is given in	milliseconds.

       date-format
	      The date-format variable, followed by a space, specifies the log
	      format date containing any combination of regular characters and
	      special format specifiers. They all begin with a percent (%)
	      sign. See `man strftime`. e.g., %Y-%m-%d.

	      Note:  If	 a timestamp is	given in microseconds, %f must be used
	      as date-format or	%* if the timestamp is given in	milliseconds.

       log-format
	      The log-format variable, followed by a space or \t, specifies
	      the log format string.

       %x     A date and time field matching the time-format and date-format
	      variables. This is used when given a timestamp or when the date
	      and time are concatenated as a single string (e.g., 1501647332
	      or 20170801235000) instead of being in two separate variables.

       %t     time field matching the time-format variable.

       %d     date field matching the date-format variable.

       %v     The  canonical  Server  Name  of	the server serving the request
	      (Virtual Host).

       %e     This is the userid of the	person requesting the document as  de-
	      termined by HTTP authentication.

       %C     The cache	status of the object the server	served.

       %h     host (the	client IP address, either IPv4 or IPv6)

       %r     The request line from the client. This requires specific
	      delimiters around the request (single quotes, double quotes, or
	      anything else) to be parsable. If not, we have to use a
	      combination of special format specifiers such as %m %U %H.

       %q     The query	string.

       %m     The request method.

       %U     The URL path requested.

	      Note: If the query string is in %U, there is no need to use %q.
	      However, if the URL path does not include any query string, you
	      may use %q and the query string will be appended to the request.

       %H     The request protocol.

       %s     The status code that the server sends back to the	client.

       %b     The size of the object returned to the client.

       %R     The "Referrer" HTTP request header.

       %u     The user-agent HTTP request header.

       %K     The TLS protocol version chosen for the connection. (In Apache
	      LogFormat: %{SSL_PROTOCOL}x)

       %k     The TLS cipher chosen for the connection. (In Apache LogFormat:
	      %{SSL_CIPHER}x)

       %M     The MIME-type of the requested resource. (In  Apache  LogFormat:
	      %{Content-Type}o)

       %D     The  time	taken to serve the request, in microseconds as a deci-
	      mal number.

       %T     The time taken to	serve the request, in seconds  with  millisec-
	      onds resolution.

       %L     The  time	taken to serve the request, in milliseconds as a deci-
	      mal number.

       %n     The time taken to	serve the request, in nanoseconds.

       %^     Ignore this field.

       %~     Move forward through the log string until	a non-space (!isspace)
	      char is found.

       ~h     The host (the client IP address, either IPv4 or IPv6)  in	 a  X-
	      Forwarded-For (XFF) field.

	      It uses a special specifier which consists of a tilde before the
	      host specifier, followed by the character(s) that delimit the
	      XFF field, enclosed in curly braces, i.e., "~h{, }".

	      For example, "~h{, }" is used to parse the field "11.25.11.53,
	      17.68.33.17", which is delimited by a comma and a space and
	      enclosed in double quotes.

	      +--------------------------------------+-----------+
	      | XFF field                            | specifier |
	      +--------------------------------------+-----------+
	      | "192.1.2.3, 192.68.33.17, 192.1.1.2" | "~h{, }"  |
	      +--------------------------------------+-----------+
	      | "192.1.2.12", "192.68.33.17"         | ~h{", }   |
	      +--------------------------------------+-----------+
	      | 192.1.2.12, 192.68.33.17             | ~h{, }    |
	      +--------------------------------------+-----------+
	      | 192.1.2.14 192.68.33.17 192.1.1.2    | ~h{ }     |
	      +--------------------------------------+-----------+
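
	      As a rough illustration of how the delimiter set inside the
	      curly braces works, the sketch below splits an XFF field on
	      every character of the set and prints the first non-empty token.
	      This is not GoAccess's parser: the function name first_xff_host
	      is hypothetical, and taking the first token is an assumption
	      made for the example.

```shell
#!/bin/sh
# first_xff_host FIELD DELIMS
# Hypothetical helper, not part of GoAccess: treat every character in
# DELIMS as a separator and print the first non-empty token of FIELD.
first_xff_host() {
    printf '%s\n' "$1" | tr "$2" '\n' | awk 'NF { print; exit }'
}

# Two fields with the delimiter sets that the ~h{...} specifier would name.
first_xff_host '192.1.2.12, 192.68.33.17' ', '
first_xff_host '"192.1.2.12", "192.68.33.17"' '", '
```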

       Note:  In  order	to get the average, cumulative and maximum time	served
       in GoAccess, you	will need to start logging response times in your  web
       server. In Nginx	you can	add $request_time to your log format, or %D in
       Apache.

       Important:  If  multiple	 time  served  specifiers are used at the same
       time, the first option specified	in the format string will take	prior-
       ity over	the other specifiers.

       GoAccess	requires the following fields:

	      %h a valid IPv4/IPv6 address

	      %d a valid date

	      %r the request
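
       To show how time-format, date-format and log-format fit together, the
       sketch below writes a small configuration file for logs in the
       Combined Log Format. The file location and name are assumptions chosen
       for the example; the three values describe a line such as
       10.131.40.139 - - [02/Mar/2016:08:14:04 -0600] "GET / HTTP/1.1" 200
       6715 "-" "curl".

```shell
#!/bin/sh
# Hypothetical location for the example; any path works when it is
# passed to GoAccess as the config file.
conf="${TMPDIR:-/tmp}/goaccess-combined.conf"

# %d and %t are matched against date-format and time-format;
# %^ skips the fields that are not needed.
cat > "$conf" <<'EOF'
time-format %H:%M:%S
date-format %d/%b/%Y
log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u"
EOF

# With GoAccess installed, the file could then be used as:
#   goaccess access.log -p "$conf"
```

       The log-format line contains all three required fields: %h, %d and %r.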

INTERACTIVE MENU
       F1 or h
	      Main help.

       F5     Redraw main window.

       q      Quit the program,	current	window or collapse active module

       o or ENTER
	      Expand selected module or	open window

       0-9 and Shift + 0
	      Set selected module to active

       j      Scroll down within expanded module

       k      Scroll up	within expanded	module

       c      Set or change scheme color.

       TAB    Forward iteration	of modules. Starts from	current	active module.

       SHIFT + TAB
	      Backward	iteration  of modules. Starts from current active mod-
	      ule.

       ^f     Scroll forward one screen	within an active module.

       ^b     Scroll backward one screen within	an active module.

       s      Sort options for active module

       /      Search across all	modules	(regex allowed)

       n      Find the position	of the next occurrence across all modules.

       g      Move to the first	item or	top of screen.

       G      Move to the last item or bottom of screen.

EXAMPLES
       Note: Piping data into GoAccess won't prompt a log/date/time
       configuration dialog; you will need to define the format beforehand in
       your configuration file or on the command line.

   DIFFERENT OUTPUTS
       To output to a terminal and generate an interactive report:

	      #	goaccess access.log

       To generate an HTML report:

	      #	goaccess access.log -a -o report.html

       To generate a JSON report:

	      #	goaccess access.log -a -d -o report.json

       To generate a CSV file:

	      #	goaccess access.log --no-csv-summary -o	report.csv

       GoAccess	 also  allows  great  flexibility  for real-time filtering and
       parsing.	For instance, to quickly diagnose issues  by  monitoring  logs
       since goaccess was started:

	      #	tail -f	access.log | goaccess -

       Even better, to filter while keeping a pipe open to preserve real-time
       analysis, we can use tail -f together with a pattern-matching tool
       such as grep, awk, sed, etc.:

	      # tail -f access.log | grep -i --line-buffered 'firefox' | \
	          goaccess --log-format=COMBINED -

       or to parse from the beginning of the file while keeping the pipe open
       and applying a filter:

	      #	tail -f	-n +0 access.log | grep	-i --line-buffered 'firefox' |
	      goaccess --log-format=COMBINED -o	report.html --real-time-html -

       or to convert the log date timezone to a different timezone, e.g.,
       Europe/Berlin:

	      # goaccess access.log \
	          --log-format='%h %^[%x] "%r" %s %b "%R" "%u"' \
	          --datetime-format='%d/%b/%Y:%H:%M:%S %z' \
	          --tz=Europe/Berlin --date-spec=min

   MULTIPLE LOG	FILES
       There  are  several ways	to parse multiple logs with GoAccess. The sim-
       plest is	to pass	multiple log files to the command line:

	      #	goaccess access.log access.log.1

       It's even possible to parse files from a	 pipe  while  reading  regular
       files:

	      #	cat access.log.2 | goaccess access.log access.log.1 -

       Note  that the single dash is appended to the command line to let GoAc-
       cess know that it should	read from the pipe.

       Now if we want to add more flexibility to GoAccess, we can do a	series
       of  pipes. For instance,	if we would like to process all	compressed log
       files access.log.*.gz in	addition to the	current	log file, we can do:

	      #	zcat access.log.*.gz | goaccess	access.log -

       Note: On	Mac OS X, use gunzip -c	instead	of zcat.

   REAL	TIME HTML OUTPUT
       GoAccess has the ability to output real-time data in the HTML report.
       You can even email the HTML file, since it is a single file with no
       external file dependencies. How neat is that!

       The process of generating a real-time HTML report is  very  similar  to
       the  process  of	 creating  a  static  report. Only --real-time-html is
       needed to make it real-time.

	      #	goaccess access.log -o	/usr/share/nginx/html/site/report.html
	      --real-time-html

       By default, GoAccess will use the host name of the generated report.
       Optionally, you can specify the URL to which the client's browser will
       connect. See https://goaccess.io/faq for a more detailed example.

	      #	 goaccess  access.log  -o  report.html	--real-time-html --ws-
	      url=goaccess.io

       By default, GoAccess listens on port 7890. To use a different port,
       specify it as follows (make sure the port is open):

	      #	   goaccess   access.log   -o	report.html   --real-time-html
	      --port=9870

       And to bind the WebSocket server to an address other than 0.0.0.0, you
       can specify it as:

	      #	   goaccess   access.log   -o	report.html   --real-time-html
	      --addr=127.0.0.1

       Note: To	output real time data over a TLS/SSL connection, you  need  to
       use --ssl-cert=<cert.crt> and --ssl-key=<priv.key>.

   WORKING WITH	DATES
       Another useful pipe would be filtering dates out of the web log.

       The following will get all HTTP requests starting on 05/Dec/2010 until
       the end of the file.

	      # sed -n '/05\/Dec\/2010/,$ p' access.log | goaccess -a -

       or using relative dates such as yesterday or one week ago:

	      # sed -n '/'$(date '+%d\/%b\/%Y' -d '1 week ago')'/,$ p' \
	          access.log | goaccess -a -

       Note: -d '1 week ago' is GNU date syntax; with BSD date, -v-1w gives
       the same relative date.

       If we want to parse only	a certain time-frame from DATE a to DATE b, we
       can do:

	      # sed -n '/5\/Nov\/2010/,/5\/Dec\/2010/ p' access.log | \
	          goaccess -a -
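
       Dates in the log contain slashes, which collide with sed's pattern
       delimiter and therefore have to be escaped. As a minimal alternative
       sketch, assuming the Common/Combined timestamp layout
       [dd/Mon/yyyy:HH:MM:SS, the date can be matched as a fixed string with
       awk instead. The function name since_date is hypothetical.

```shell
#!/bin/sh
# since_date DATE
# Hypothetical helper: copy stdin to stdout starting at the first line
# whose timestamp begins with DATE (dd/Mon/yyyy), through end of input.
since_date() {
    awk -v d="$1" 'index($0, "[" d ":") { on = 1 } on'
}

# With GoAccess installed, it could be used as:
#   since_date 05/Dec/2010 < access.log | goaccess -a -
```

       Because index() compares plain strings, no character in the date needs
       escaping.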

       If we want to preserve only a certain amount of data and recycle
       storage, we can keep only a certain number of days. For instance, to
       keep and show the last 5 days:

	      #	goaccess access.log --keep-last=5

   VIRTUAL HOSTS
       Assume your log contains the virtual host (server blocks) field, for
       instance:

	      vhost.com:80 10.131.40.139 - - [02/Mar/2016:08:14:04 -0600] "GET
	      /shop/bag-p-20  HTTP/1.1"	 200  6715 "-" "Apache (internal dummy
	      connection)"

       And you would like to append the virtual host to the request in order
       to see which virtual host the top URLs belong to:

	      # awk '$8=$1$8' access.log | goaccess -a -

       To exclude a list of virtual hosts you can do the following:

	      #	 grep  -v  "`cat  exclude_vhost_list_file`" vhost_access.log |
	      goaccess -

   FILES & STATUS CODES
       To parse	specific pages,	e.g., page views, html,	htm, php, etc.	within
       a request:

	      # awk '$7~/\.html|\.htm|\.php/' access.log | goaccess -

       Note: $7 is the request field for the common and combined log formats
       (without Virtual Host). If your log includes Virtual Host, then you
       probably want to use $8 instead. It's best to check which field you are
       shooting for, e.g.:

	      #	tail -10 access.log | awk '{print $8}'

       Or to parse a specific status code, e.g., 500 (Internal Server Error):

	      #	awk '$9~/500/' access.log | goaccess -

   SERVER
       Also, it	is worth pointing out that if we want to run GoAccess at lower
       priority, we can	run it as:

	      #	nice -n	19 goaccess -f access.log -a

       and  if	you don't want to install it on	your server, you can still run
       it from your local machine:

	      #	ssh -n root@server  'tail  -f  /var/log/apache2/access.log'  |
	      goaccess -

       Note:  SSH requires -n so GoAccess can read from	stdin. Also, make sure
       to use SSH keys for authentication as it	won't work if a	passphrase  is
       required.

   INCREMENTAL LOG PROCESSING
       GoAccess	 has the ability to process logs incrementally through its in-
       ternal storage and dump its data	to disk. It  works  in	the  following
       way:

       1  A dataset must be persisted first with --persist, then the same
	  dataset can be loaded with --restore.

       2  If new data is passed (piped or through a log file), it will be
	  appended to the original dataset.

       NOTES

       GoAccess keeps track of the inodes of all the files processed (assuming
       files will stay on the same partition). In addition, it extracts a
       snippet of data from the log along with the last line parsed of each
       file and the timestamp of the last line parsed, e.g.,
       inode:29627417|line:20012|ts:20171231235059

       First, it checks whether the snippet matches the log being parsed. If
       it does, it assumes the log hasn't changed dramatically, e.g., hasn't
       been truncated. If the inode does not match the current file, it parses
       all lines. If the current file matches the inode, it then reads the
       remaining lines and updates the count of lines parsed and the
       timestamp. As an extra precaution, it won't parse log lines with a
       timestamp less than or equal to the one stored.

       Piped data works based off the timestamp of the last line read. For
       instance, it will parse and discard all incoming entries until it finds
       a timestamp >= the one stored.

       For instance:

	      // last month access log
	      #	goaccess access.log.1 --persist

       then, load it with

	      // append	this month access log, and preserve new	data
	      #	goaccess access.log --restore --persist

       To read persisted data only (without parsing new	data)

	      #	goaccess --restore
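
       The rule for piped data described in the notes above can be sketched
       in a few lines. This is an illustration only, not GoAccess's code: it
       assumes each input line starts with a numeric timestamp in the same
       layout as the ts: value shown earlier (e.g., 20171231235059), and the
       function name resume_from is hypothetical.

```shell
#!/bin/sh
# resume_from STORED_TS
# Hypothetical helper: discard lines from stdin until one carries a
# timestamp >= STORED_TS, then pass that line and everything after it.
resume_from() {
    awk -v ts="$1" 'found || ($1 + 0 >= ts + 0) { found = 1; print }'
}

# Only the second and third entries are passed through.
printf '%s\n' \
    '20171231235058 old entry' \
    '20171231235059 stored entry' \
    '20180101000001 new entry' |
    resume_from 20171231235059
```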

NOTES
       Each active panel has a total of 366 items or 50 in the real-time HTML
       report. The number of items is customizable using max-items. Note that
       HTML, CSV and JSON output allow a maximum number greater than the
       default value of 366 items per panel.

       A hit is	a request (line	in the access log), e.g.,  10  requests	 =  10
       hits.  HTTP requests with the same IP, date, and	user agent are consid-
       ered a unique visit.
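
       The difference between hits and unique visitors can be illustrated
       with a short awk sketch. It assumes Combined Log Format input and is
       not how GoAccess itself counts; the function name
       count_hits_and_visitors is hypothetical.

```shell
#!/bin/sh
# count_hits_and_visitors
# Hypothetical helper: read Combined Log Format lines on stdin and print
# the number of hits (lines) and of unique IP + date + user agent triples.
count_hits_and_visitors() {
    awk -F'"' '
        {
            split($1, head, " ")           # head[1] is the IP, head[4] the bracketed timestamp
            day = substr(head[4], 2, 11)   # dd/Mon/yyyy without the leading bracket
            key = head[1] "|" day "|" $6   # field 6 is the user agent
            hits++
            if (!(key in seen)) { seen[key] = 1; visitors++ }
        }
        END { printf "hits=%d visitors=%d\n", hits, visitors }
    '
}

# Example use:
#   count_hits_and_visitors < access.log
```

       Three requests from the same IP on the same day, two of them with one
       user agent and one with another, would count as 3 hits and 2 unique
       visitors.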

       If you want to enable dual-stack	support, please	use --addr=::  instead
       of the default --addr=0.0.0.0.

       The  generated report will attempt to reconnect to the WebSocket	server
       after 1 second with exponential backoff.	It will	attempt	to connect  20
       times.

BUGS
       If you think you have found a bug, please send me an email to
       goaccess@prosoftcorp.com or use the issue tracker at
       https://github.com/allinurl/goaccess/issues

AUTHOR
       Gerardo Orellana <hello@goaccess.io>. For more details about it, or new
       releases, please visit https://goaccess.io

GNU+Linux			   MAY 2024			   goaccess(1)
