KHTTP_PARSE(3)		    Library Functions Manual		KHTTP_PARSE(3)

NAME
       khttp_parse, khttp_parsex -- parse a CGI	instance for kcgi

LIBRARY
       library "libkcgi"

SYNOPSIS
       #include	<sys/types.h>
       #include	<stdarg.h>
       #include	<stdint.h>
       #include	<kcgi.h>

       enum kcgi_err
       khttp_parse(struct kreq *req,  const struct kvalid *keys, size_t	keysz,
	   const char *const *pages, size_t pagesz, size_t defpage);

       enum kcgi_err
       khttp_parsex(struct kreq	*req,	      const struct kmimemap *suffixes,
	   const char *const *mimes, size_t mimesz, const struct kvalid	*keys,
	   size_t keysz,	const char *const *pages,	size_t pagesz,
	   size_t defmime,	       size_t defpage,		    void *arg,
	   void	(*argfree)(void	*arg),		       unsigned	int debugging,
	   const struct	kopts *opts);

       extern const char *const	kmimetypes[KMIME__MAX];
       extern const char *const	khttps[KHTTP__MAX];
       extern const char *const	kschemes[KSCHEME__MAX];
       extern const char *const	kmethods[KMETHOD__MAX];
       extern const struct kmimemap ksuffixmap[];
       extern const char *const	ksuffixes[KMIME__MAX];

DESCRIPTION
       The khttp_parse() and khttp_parsex() functions parse and	validate input
       and the HTTP environment	(compression, paths, MIME types, and  so  on).
       They are	the central functions in the kcgi(3) library, parsing and val-
       idating	key-value  form	 (query	string,	message	body, cookie) data and
       opaque message bodies.

       They must be matched by khttp_free(3) if	and only if the	 return	 value
       is KCGI_OK.  Otherwise, resources are internally	freed.

       The collective arguments	are as follows:

       arg     A  pointer  to private application data.	 It is not touched un-
	       less argfree is provided.

       argfree
	       Function	invoked	with arg by  the  child	 process  starting  to
	       parse untrusted network data.  This makes sure that no unneces-
	       sary data is leaked into	the child.

       debugging
	       This bit-field enables debugging	of the underlying parse	and/or
	       write  routines.	  It  may have KREQ_DEBUG_WRITE	for writes and
	       KREQ_DEBUG_READ_BODY for	the pre-parsed body.   Debugging  mes-
	       sages  to  kutil_info(3)	 consist of the	process	ID followed by
	       "-tx" or	"-rx" for writing or reading, a	colon and space,  then
	       the logged data.  A newline will flush the existing line, as
	       will reaching 80 characters.  If flushed at 80 characters and
	       not  a  newline,	 an  ellipsis will follow the line.  The total
	       logged bytes will be emitted at the end of all reads or writes.

       defmime
	       If no MIME type is specified (that is, there's no suffix	to the
	       page request), use this index in	the mimes array.

       defpage
	       If no page was specified	(e.g., the default landing page), this
	       is provided as the requested page index.

       keys    An optional array of input and validation fields	or NULL.

       keysz   The number of elements in keys.

       mimesz  The number of elements in mimes.	 Also the MIME index  used  if
	       no  MIME	type was matched.  This	differs	from defmime, which is
	       used if there is	no MIME	suffix at all.

       mimes   An array	of MIME	types (e.g., "text/html"), mapped into a  MIME
	       index during MIME body parsing.	This relates both to pages and
	       input  fields  with  a  body type.  Any array should include at
	       least text/plain, as this is the	default	content	type for  MIME
	       documents.

       opts    Tunable	options	 regarding  socket buffer sizes	and so on.  If
	       set to NULL, meaningful defaults	are used.

       pages   An array	of recognised pathnames.  When pathnames  are  parsed,
	       they're matched to indices in this array.

       pagesz  The  number of pages in pages.  Also used if the	requested page
	       was not in pages.

       req     This structure is cleared and filled with input fields and HTTP
	       context parsed from the CGI environment.	 It is the main	struc-
	       ture carried around in a	kcgi(3)	application.

       suffixes
	       Define the MIME type (suffix) mapping.
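
       The interplay of suffixes, defmime, and mimesz described above can
       be sketched as follows.  This is a minimal illustration, not the
       library's code: struct suffix_map and resolve_mime() are hypothetical
       stand-ins, the map is assumed to end with a NULL name, and suffixes
       are assumed to be compared case-sensitively.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one suffix-to-MIME-index mapping entry. */
struct suffix_map {
	const char *name;	/* suffix without the dot, NULL ends the map */
	size_t mime;		/* index into the mimes array */
};

/*
 * Resolve a request suffix to a MIME index:
 *   - empty suffix: defmime
 *   - suffix found in the map: the mapped index
 *   - suffix not found: mimesz
 */
static size_t
resolve_mime(const char *suffix, const struct suffix_map *map,
    size_t mimesz, size_t defmime)
{
	assert(suffix != NULL && map != NULL);

	if (*suffix == '\0')
		return defmime;
	for (; map->name != NULL; map++)
		if (strcmp(map->name, suffix) == 0)
			return map->mime;
	return mimesz;
}
```

       With a two-element mimes array, a result of 2 (that is, mimesz)
       signals a suffix that was present but not recognised, which an
       application would typically answer with an error page.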

       The first form, khttp_parse(), is for applications  using  the  system-
       recognised  MIME	types.	This should work well enough for most applica-
       tions.  It is equivalent	to invoking the	second	form,  khttp_parsex(),
       as follows:

	     khttp_parsex(req, ksuffixmap,
	       kmimetypes, KMIME__MAX, keys, keysz,
	       pages, pagesz, KMIME_TEXT_HTML,
	       defpage,	NULL, NULL, 0, NULL);

   Types
       A  struct kreq object is	filled in by khttp_parse() and khttp_parsex().
       It consists of the following fields:

       void *arg
	       Private application data.  This is set during khttp_parse().

       enum kauth auth
	       Type of "managed"  HTTP	authorisation  performed  by  the  web
	       server  according  to  the  AUTH_TYPE  header variable, if any.
	       This is KAUTH_DIGEST for	the AUTH_TYPE of "digest", KAUTH_BASIC
	       for "basic", KAUTH_BEARER for "bearer", KAUTH_UNKNOWN for other
	       values of AUTH_TYPE, or KAUTH_NONE if  AUTH_TYPE	 is  not  set.
	       See  the	 rawauth field for raw (i.e., not processed by the web
	       server) authorisation requests.

       struct kpair **cookiemap
	       An array	of keysz  singly  linked  lists	 of  elements  of  the
	       cookies	array.	 If cookie->key	is equal to one	of the entries
	       of keys and cookie->state is  KPAIR_VALID  or  KPAIR_UNCHECKED,
	       the  cookie  is	added  to  the list cookiemap[cookie->keypos].
	       Empty lists are NULL.  If a list	contains more than one cookie,
	       cookie->next points to the next cookie.	For the	last cookie in
	       a list, cookie->next is NULL.

       struct kpair **cookienmap
	       Similar to cookiemap, except that it contains the cookies where
	       cookie->state is	KPAIR_INVALID.

       struct kpair *cookies
	       Key-value  pairs	 read  from  request  cookies  found  in   the
	       HTTP_COOKIE  header  variable,  or  NULL	if cookiesz is 0.  See
	       fields for key-value pairs from the  request  query  string  or
	       message body.

       size_t cookiesz
	       The size	of the cookies array.

       struct kpair **fieldmap
	       Similar to cookiemap, except that the lists contain elements of
	       the fields array.

       struct kpair **fieldnmap
	       Similar	to  fieldmap, except that it contains the fields where
	       field->state is KPAIR_INVALID.

       struct kpair *fields
	       Key-value pairs read from the QUERY_STRING header variable  and
	       from  the  message  body, or NULL if fieldsz is 0.  See cookies
	       for key-value pairs from	request	cookies.

       size_t fieldsz
	       The number of elements in the fields array.

       char *fullpath
	       The full	requested path as contained in	the  PATH_INFO	header
	       variable.	    For		  example,	    requesting
	       "https://bsd.lv/app.cgi/dir/file.html?q=v", where "app.cgi"  is
	       the CGI program,	this value would be /dir/file.html.  It	is not
	       guaranteed to start with	a slash	and it may be an empty string.

       char *host
	       The  host name received in the HTTP_HOST	header variable.  When
	       using name-based	virtual	hosting, this is typically the virtual
	       host name specified by the client in the	HTTP request,  and  it
	       should  not be confused with the	canonical DNS name of the host
	       running	the  web  server.    For   example,   a	  request   to
	       "https://bsd.lv/app.cgi/file"  would  have  a host of "bsd.lv".
	       If HTTP_HOST is not defined, host is set	to "localhost".

       struct kdata *kdata
	       Internal	data.  Should not be touched.

       const struct kvalid *keys
	       Value passed to khttp_parse().

       size_t keysz
	       Value passed to khttp_parse().

       enum kmethod method
	       The KMETHOD_ACL,	KMETHOD_CONNECT, KMETHOD_COPY, KMETHOD_DELETE,
	       KMETHOD_GET,  KMETHOD_HEAD,  KMETHOD_LOCK,  KMETHOD_MKCALENDAR,
	       KMETHOD_MKCOL,	KMETHOD_MOVE,  KMETHOD_OPTIONS,	 KMETHOD_POST,
	       KMETHOD_PROPFIND,	KMETHOD_PROPPATCH,	  KMETHOD_PUT,
	       KMETHOD_REPORT,	KMETHOD_TRACE,	or  KMETHOD_UNLOCK  submission
	       method obtained from the	REQUEST_METHOD header variable.	 If an
	       unknown method was requested,  KMETHOD__MAX  is	used.	If  no
	       method was specified, the default is KMETHOD_GET.

	       Applications   will   usually   accept	only  KMETHOD_GET  and
	       KMETHOD_POST, so	be sure	to emit	a KHTTP_405 status  for	 unde-
	       sired methods.

       size_t mime
	       The MIME type of the requested file as determined by its suffix
	       matched to the suffixes map passed to khttp_parsex() or the
	       default ksuffixmap if using khttp_parse().  If no suffix is
	       specified, this is the defmime value passed to khttp_parsex()
	       or KMIME_TEXT_HTML if using khttp_parse().  If the suffix is
	       specified but not known, it is the mimesz value passed to
	       khttp_parsex() or KMIME__MAX if using khttp_parse().

       size_t page
	       The page	index found by looking up pagename in the pages	array.
	       If pagename is not found	in pages, pagesz is used; if  pagename
	       is empty, defpage is used.

       char *pagename
	       The  first component of fullpath	or an empty string if there is
	       none.  It is compared to	the elements of	the pages array	to de-
	       termine which page it  corresponds  to.	 For  example,	for  a
	       fullpath	of "/dir/file.html" this component corresponds to dir.
	       For "/file.html", it's file.

       char *path
	       The  middle  part of fullpath, after stripping pagename/	at the
	       beginning and .suffix at	the end, or an empty string  if	 there
	       is  none.   For	example, if the	fullpath is bar/baz.html, this
	       component is baz.

       char *pname
	       The script name received	in the	SCRIPT_NAME  header  variable.
	       For    example,	  for	 a    request	to   a	 CGI   program
	       /var/www/cgi-bin/app.cgi	 mapped	 by  the   web	 server	  from
	       "https://bsd.lv/app.cgi/file", this would be app.cgi.  This may
	       not reflect a file system entity	and it may be an empty string.

       uint16_t	port
	       The  server's  receiving	 TCP port according to the SERVER_PORT
	       header variable,	or 80 if that is not  defined  or  an  invalid
	       number.

       struct khttpauth	rawauth
	       The raw authorisation request according to the
	       HTTP_AUTHORIZATION header variable passed by  the  web  server.
	       This  is	 only set if the web server is not managing authorisa-
	       tion itself.

       char *remote
	       The string form of the client's IPv4 or IPv6 address taken from
	       the REMOTE_ADDR header variable,	or "127.0.0.1" if that is  not
	       defined.	 The address format of the string is not checked.

       struct khead *reqmap[KREQU__MAX]
	       Map  of	enum krequ enumeration values to pairs in reqs.	 If an
	       enumerated request was not specified, it	is NULL.

       struct khead *reqs
	       List of all HTTP	request	headers	or NULL	if reqsz  is  0.   The
	       request	headers	 are in	HTTP syntax with all lowercase and un-
	       derscores    as	  hyphens,    e.g.,    the    CGI     variable
	       HTTP_ACCEPT_LANGUAGE  is	 stored	 as http-accept-language.  See
	       envs for	unmodified environment variables.

       size_t reqsz
	       Number of request headers in reqs.

       struct khead *envs
	       List of all environment variables in the	CGI or FastCGI context
	       or NULL if envsz	is 0.  See reqs	for parsed and formatted  HTTP
	       headers,	  defined   as	environment  variables	starting  with
	       "HTTP_".

       size_t envsz
	       Number of environment variables in envs.

       enum kscheme scheme
	       The access scheme according to the HTTPS	header	variable,  ei-
	       ther KSCHEME_HTTPS if HTTPS is set and equal to the string "on"
	       or KSCHEME_HTTP otherwise.

       char *suffix
	       The  suffix  part of the	last component of fullpath or an empty
	       string if there is none.	  For  example,	 if  the  fullpath  is
	       /bar/baz.html,  this component is html.	See the	mime field for
	       the MIME	type parsed from the suffix.
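
       The relationship between the fullpath, pagename, path, suffix, and
       page fields can be sketched as follows.  This is an illustration
       derived only from the field descriptions above, not the library's
       parser: struct path_parts and split_fullpath() are hypothetical
       names, and the fixed buffer sizes are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the path-related fields of struct kreq. */
struct path_parts {
	char pagename[64];
	char path[256];
	char suffix[32];
	size_t page;
};

/*
 * Split fullpath into page name, middle path, and suffix, then look the
 * page name up in pages.  Returns 0 on success or -1 if a component does
 * not fit its buffer.
 */
static int
split_fullpath(const char *fullpath, const char *const *pages,
    size_t pagesz, size_t defpage, struct path_parts *out)
{
	const char *start, *end, *dot, *slash;
	size_t i, len;

	assert(fullpath != NULL && out != NULL);
	memset(out, 0, sizeof(*out));

	/* Skip one optional leading slash. */
	start = (*fullpath == '/') ? fullpath + 1 : fullpath;
	end = start + strlen(start);

	/* The suffix follows the last dot of the last component. */
	dot = strrchr(start, '.');
	if (dot != NULL && strchr(dot, '/') == NULL) {
		if (strlen(dot + 1) >= sizeof(out->suffix))
			return -1;
		strcpy(out->suffix, dot + 1);
		end = dot;
	}

	/* The page name is the first component; the rest is the path. */
	slash = memchr(start, '/', (size_t)(end - start));
	len = (slash != NULL) ?
	    (size_t)(slash - start) : (size_t)(end - start);
	if (len >= sizeof(out->pagename))
		return -1;
	memcpy(out->pagename, start, len);
	if (slash != NULL) {
		len = (size_t)(end - (slash + 1));
		if (len >= sizeof(out->path))
			return -1;
		memcpy(out->path, slash + 1, len);
	}

	/* Empty page name: defpage.  Unknown page name: pagesz. */
	out->page = pagesz;
	if (out->pagename[0] == '\0')
		out->page = defpage;
	else
		for (i = 0; i < pagesz; i++)
			if (strcmp(pages[i], out->pagename) == 0) {
				out->page = i;
				break;
			}
	return 0;
}
```

       For a pages array of "index" and "dir", the input "/dir/file.html"
       would yield a pagename of dir, a path of file, a suffix of html, and
       a page of 1, while "/file.html" would yield a page of 2 (pagesz)
       because file is not a recognised page.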

       The application may optionally define keys  provided  to	 khttp_parse()
       and  khttp_parsex()  as	an  array of struct kvalid.  This structure is
       central to the validation of input data.	 It consists of	the  following
       fields:

       const char *name
	       The  field  name,  i.e.,	 how it	appears	in the HTML form input
	       name.  This cannot be NULL.  If the  field  name	 is  an	 empty
	       string and the HTTP message consists of an opaque body (and not
	       key-value  pairs), then that field will be used to validate the
	       HTTP message body.  This	is useful for  KMETHOD_PUT  style  re-
	       quests.

       int (*)(struct kpair *) valid
	       A validation function returning non-zero	if parsing and valida-
	       tion succeed or 0 otherwise.  If	it is NULL, then no validation
	       is  performed, the data is considered as	valid, and it is buck-
	       eted into cookiemap or fieldmap as such.

	       User-defined valid functions usually set	the  type  and	parsed
	       fields in the key-value pair.  When working with	binary data or
	       with a key that can take	different data types, it is acceptable
	       for a validation	function to set	the type to KPAIR__MAX and for
	       the application to ignore the parsed field and to work directly
	       with val	and valsz.

	       The  validation	function is allowed to allocate	new memory for
	       val: if the val pointer changes during validation,  the	memory
	       pointed	to  after  validation will be freed with free(3) after
	       the data	is passed out of the sandbox.

	       These functions are invoked from	within a system-specific sand-
	       box that	may not	allow some system calls, for  example  opening
	       files  or sockets.  In other words, validation functions	should
	       only do pure computation.
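
       A user-defined validation function along the lines described above
       might look like the following sketch.  To keep the example free of
       library dependencies, struct pair and enum pairtype are hypothetical
       stand-ins that mirror only the documented val, valsz, type, and
       parsed fields of struct kpair; valid_port() is likewise an invented
       name.  The function does pure computation only, as required inside
       the sandbox.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins, reduced to the documented fields. */
enum pairtype { PAIR_INTEGER, PAIR_STRING, PAIR_DOUBLE, PAIR__MAX };

struct pair {
	char *val;		/* input bytes; val[valsz] is NUL */
	size_t valsz;		/* length of val in bytes */
	enum pairtype type;	/* type of the parsed value */
	union { int64_t i; const char *s; double d; } parsed;
};

/*
 * Accept a decimal integer in [1, 65535], allowing trailing whitespace.
 * Returns non-zero on success and 0 on failure.
 */
static int
valid_port(struct pair *p)
{
	char *ep;
	long long v;
	size_t sz;

	assert(p != NULL && p->val != NULL);

	/* Reject embedded NUL bytes: valsz is not a string length. */
	if (p->valsz == 0 || memchr(p->val, '\0', p->valsz) != NULL)
		return 0;

	/* Replace trailing whitespace with NUL bytes, in place. */
	for (sz = p->valsz;
	    sz > 0 && isspace((unsigned char)p->val[sz - 1]); sz--)
		p->val[sz - 1] = '\0';
	if (sz == 0)
		return 0;

	errno = 0;
	v = strtoll(p->val, &ep, 10);
	if (errno != 0 || ep == p->val || *ep != '\0')
		return 0;
	if (v < 1 || v > 65535)
		return 0;

	/* Record the type and the parsed value for the application. */
	p->type = PAIR_INTEGER;
	p->parsed.i = v;
	return 1;
}
```

       In a real application the same logic would take a struct kpair
       pointer and be referenced from the valid member of a struct kvalid
       entry.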

       The struct kpair	structure presents the user with  fields  parsed  from
       input   and   (possibly)	  matched  to  the  keys  variable  passed  to
       khttp_parse() and khttp_parsex().  It is	also passed to the  validation
       function	 to  be	 filled	in.  In	this case, the MIME-related fields are
       already filled in and may be examined to	determine the method of	 vali-
       dation.	This is	useful when validating opaque message bodies.

       char *ctype
	       The  value's  MIME content type (e.g., image/jpeg), or an empty
	       string if not defined.

       size_t ctypepos
	       If ctype is not NULL, it is looked up in the mimes parameter
	       passed to khttp_parsex() or kmimetypes if using khttp_parse().
	       If found, it is set to the appropriate index.  Otherwise, it's
	       mimesz.

       char *file
	       The  value's MIME source	filename or an empty string if not de-
	       fined.

       char *key
	       The NUL-terminated key (input) name.  If	the HTTP message  body
	       is  opaque  (e.g.,  KMETHOD_PUT),  then	an empty-string	key is
	       cooked up.  The key may contain an arbitrary sequence  of  non-
	       NUL  bytes, even	non-ASCII bytes, control characters, and shell
	       metacharacters.

       size_t keypos
	       If found	in the keys array passed to khttp_parse(),  the	 index
	       of the matching key.  Otherwise keysz.

       struct kpair *next
	       In  a  cookie or	field map, next	points to the next parsed key-
	       value pair with the same	key name.  This	occurs most  often  in
	       HTML checkbox forms, where many fields may have the same	name.

       union parsed parsed
	       The parsed, validated value.  This may be an integer i, for a
	       64-bit signed integer; a string s, for a NUL-terminated
	       character string; or a double d, for a double-precision
	       floating-point number.  This is intentionally basic because
	       the resulting data must be reliably passed from the parsing
	       context back into the web application.

       enum kpairstate state
	       The  validation state: KPAIR_VALID if the pair was successfully
	       validated by a validation function, KPAIR_INVALID if a  valida-
	       tion  function was invoked but failed, or KPAIR_UNCHECKED if no
	       validation function is defined for this key.

       enum kpairtype type
	       If parsed, the type of data in parsed, otherwise KPAIR__MAX.

       char *val
	       The (input) value, which	may contain an arbitrary  sequence  of
	       bytes, even NUL bytes, non-ASCII	bytes, control characters, and
	       shell metacharacters.  The byte following the end of the	array,
	       val[valsz],  is	always	guaranteed  to be NUL.	The validation
	       function may modify the contents.  For example, for integer
	       numbers and e-mail addresses, trailing whitespace may be
	       replaced with NUL bytes.

       size_t valsz
	       The length of the val buffer in bytes.	It  is	not  a	string
	       length.

       char *xcode
	       The  value's  MIME content transfer encoding (e.g., base64), or
	       an empty	string if not defined.
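
       The way keypos, state, and next tie parsed pairs into the cookiemap,
       cookienmap, fieldmap, and fieldnmap arrays can be sketched as
       follows.  This is an illustration of the documented bucketing rule
       only: struct pair, bucket_pairs(), and bucket_len() are hypothetical
       stand-ins, and the assumption that each list keeps input order is
       not taken from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins, reduced to the documented fields. */
enum pairstate { PAIR_UNCHECKED, PAIR_VALID, PAIR_INVALID };

struct pair {
	const char *key;
	const char *val;
	size_t keypos;		/* index into keys, or keysz if unmatched */
	enum pairstate state;
	struct pair *next;	/* next pair with the same key, or NULL */
};

/*
 * Bucket pairs into map (valid or unchecked) and nmap (invalid), each an
 * array of keysz list heads.  Pairs matching no key are skipped.
 */
static void
bucket_pairs(struct pair *pairs, size_t pairsz,
    struct pair **map, struct pair **nmap, size_t keysz)
{
	struct pair **head;
	size_t i;

	assert(pairs != NULL || pairsz == 0);

	/* Empty lists are NULL. */
	for (i = 0; i < keysz; i++)
		map[i] = nmap[i] = NULL;

	for (i = 0; i < pairsz; i++) {
		pairs[i].next = NULL;
		if (pairs[i].keypos >= keysz)
			continue;
		head = (pairs[i].state == PAIR_INVALID) ?
		    &nmap[pairs[i].keypos] : &map[pairs[i].keypos];
		/* Append at the tail (assumed ordering). */
		while (*head != NULL)
			head = &(*head)->next;
		*head = &pairs[i];
	}
}

/* Count the pairs in one bucket by following next. */
static size_t
bucket_len(const struct pair *p)
{
	size_t n = 0;

	for (; p != NULL; p = p->next)
		n++;
	return n;
}
```

       An application reading a multi-valued checkbox field would walk one
       bucket in the same way bucket_len() does, starting from the list
       head for that key and following next until it is NULL.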

       The struct khttpauth structure holds authorisation data	if  passed  by
       the server.  The	specific fields	are as follows.

       enum kauth type
	       If  no  data  was  passed  by  the  server,  the	 type value is
	       KAUTH_NONE.   Otherwise	it's  KAUTH_BASIC,  KAUTH_BEARER,   or
	       KAUTH_DIGEST.   KAUTH_UNKNOWN  signals  that  the authorisation
	       type was	not recognised.

       int authorised
	       For KAUTH_BASIC,	KAUTH_BEARER, or  KAUTH_DIGEST	authorisation,
	       this field indicates whether all	required values	were specified
	       for the application to perform authorisation.

       char *digest
	       An MD5 digest of	REQUEST_METHOD,	SCRIPT_NAME, PATH_INFO,	header
	       variables  and  the  request  body.  It is not a	NUL-terminated
	       string, but an array of exactly MD5_DIGEST_LENGTH bytes.	  Only
	       filled in when HTTP_AUTHORIZATION is "digest" and authorised is
	       non-zero.     Otherwise,	   it	 remains    NULL.    Used   in
	       khttpdigest_validatehash(3).

       d       An anonymous union containing parsed fields  per	 type:	struct
	       khttpbasic  basic  for  KAUTH_BASIC  or KAUTH_BEARER, or	struct
	       khttpdigest digest for KAUTH_DIGEST.

       If the field for	 an  HTTP  authorisation  request  is  KAUTH_BASIC  or
       KAUTH_BEARER,  it will consist of the following for its parsed entities
       in its struct khttpbasic	structure:

       response
	       The hashed and encoded response string for KAUTH_BASIC,	or  an
	       opaque string for KAUTH_BEARER.

       If the field for	an HTTP	authorisation request is KAUTH_DIGEST, it will
       consist of the following	in its struct khttpdigest structure:

       alg     The  encoding  algorithm,  parsed  from	the  possible  MD5  or
	       MD5-Sess	values.

       qop     The quality of protection algorithm, which may be unspecified,
	       Auth or Auth-Int.

       user    The user	coordinating the request.

       uri     The  URI	for which the request is designated.  (This must match
	       the request URI).

       realm   The request realm.

       nonce   The server-generated nonce value.

       cnonce  The (optional) client-generated nonce value.

       response
	       The hashed and encoded response string, which entangles fields
	       depending on algorithm and quality of protection.

       count   The (optional) cnonce counter.

       opaque  The (optional) opaque string requested by the server.

       The  struct  kopts  structure  consists of tunables for network perfor-
       mance.  You probably don't want to use these  unless  you  really  know
       what you're doing!

       sndbufsz
	       The size	of the output buffer.  The output buffer is a heap-al-
	       located	region	into  which  writes  (via  khttp_write(3)  and
	       khttp_head(3)) are buffered instead of being  flushed  directly
	       to  the	wire.  The buffer is flushed when it is	full, when the
	       HTTP headers are	flushed, and when  khttp_free(3)  is  invoked.
	       If  the	buffer size is zero, writes are	flushed	immediately to
	       the wire.  If the buffer	size is	less than zero,	it  is	filled
	       with a meaningful default.

       Lastly, the struct khead	structure holds	parsed HTTP headers.

       key     Holds the HTTP header name.  This is not the CGI header name
	       (e.g., HTTP_COOKIE), but the reconstituted HTTP name (e.g.,
	       Cookie).

       val     The opaque header value,	which may be an	empty string.

   Variables
       A number of variables are defined in <kcgi.h> to simplify invocations of
       the khttp_parse() family.  Applications are strongly suggested  to  use
       these  variables	(and associated	enumerations) in khttp_parse() instead
       of overriding them with hand-rolled sets	in khttp_parsex().

       kmimetypes
	       Indexed list of common MIME types, for example, "text/html" and
	       "application/json".  Corresponds to enum kmime.

       khttps  Indexed list of HTTP status code	and identifier,	 for  example,
	       "200 OK".  Corresponds to enum khttp.

       kschemes
	       Indexed	list  of  URL  schemes,	for example, "https" or	"ftp".
	       Corresponds to enum kscheme.

       kmethods
	       Indexed list of HTTP methods, for example,  "GET"  and  "POST".
	       Corresponds to enum kmethod.

       ksuffixmap
	       Map  of	MIME types defined in enum kmime to possible suffixes.
	       This array is terminated	with a MIME  type  of  KMIME__MAX  and
	       name NULL.

       ksuffixes
	       Indexed list of canonical suffixes for MIME types corresponding
	       to enum kmime.  This may be a NULL pointer for types that have
	       no canonical suffix, for example, "application/octet-stream".

RETURN VALUES
       khttp_parse() and khttp_parsex()	return an error	code:

       KCGI_OK
	    Success (not an error).

       KCGI_ENOMEM
	    Memory failure.  This can occur in many places: spawning a	child,
	    allocating memory, creating	sockets, etc.

       KCGI_ENFILE
	    Could not allocate file descriptors.

       KCGI_EAGAIN
	    Could not spawn a child.

       KCGI_FORM
	    Malformed data between parent and child whilst parsing an HTTP re-
	    quest.  (Internal system error.)

       KCGI_SYSTEM
	    Opaque operating system error.

       On  failure, the	calling	application should terminate as	soon as	possi-
       ble.  Applications should not try to write an HTTP 505 error  or	 simi-
       lar,  but  allow	the web	server to handle the empty CGI response	on its
       own.

SEE ALSO
       kcgi(3),	khttp_free(3)

AUTHORS
       The khttp_parse() and khttp_parsex() functions were written by Kristaps
       Dzonsons	<kristaps@bsd.lv>.

FreeBSD	ports 15.quarterly	  $Mdocdate$			KHTTP_PARSE(3)
