KHTTP_PARSE(3)		    Library Functions Manual		KHTTP_PARSE(3)

NAME
       khttp_parse, khttp_parsex -- parse a CGI	instance for kcgi

LIBRARY
       library "libkcgi"

SYNOPSIS
       #include	<sys/types.h>
       #include	<stdarg.h>
       #include	<stdint.h>
       #include	<kcgi.h>

       enum kcgi_err
       khttp_parse(struct kreq *req,  const struct kvalid *keys, size_t	keysz,
	   const char *const *pages, size_t pagesz, size_t defpage);

       enum kcgi_err
       khttp_parsex(struct kreq	*req,	      const struct kmimemap *suffixes,
	   const char *const *mimes, size_t mimesz, const struct kvalid	*keys,
	   size_t keysz,	const char *const *pages,	size_t pagesz,
	   size_t defmime,	       size_t defpage,		    void *arg,
	   void	(*argfree)(void	*arg),		       unsigned	int debugging,
	   const struct	kopts *opts);

       extern const char *const	kmimetypes[KMIME__MAX];
       extern const char *const	khttps[KHTTP__MAX];
       extern const char *const	kschemes[KSCHEME__MAX];
       extern const char *const	kmethods[KMETHOD__MAX];
       extern const struct kmimemap ksuffixmap[];
       extern const char *const	ksuffixes[KMIME__MAX];

DESCRIPTION
       The khttp_parse() and khttp_parsex() functions parse and	validate input
       and the HTTP environment	(compression, paths, MIME types, and  so  on).
       They are	the central functions in the kcgi(3) library, parsing and val-
       idating	key-value  form	 (query	string,	message	body, cookie) data and
       opaque message bodies.

       They must be matched by khttp_free(3) if	and only if the	 return	 value
       is KCGI_OK.  Otherwise, resources are internally	freed.

       The collective arguments	are as follows:

       arg     A  pointer  to private application data.	 It is not touched un-
	       less argfree is provided.

       argfree
	       Function	invoked	with arg by  the  child	 process  starting  to
	       parse untrusted network data.  This makes sure that no unneces-
	       sary data is leaked into	the child.

       debugging
	       This bit-field enables debugging	of the underlying parse	and/or
	       write  routines.	  It  may have KREQ_DEBUG_WRITE	for writes and
	       KREQ_DEBUG_READ_BODY for	the pre-parsed body.   Debugging  mes-
	       sages  to  kutil_info(3)	 consist of the	process	ID followed by
	       "-tx" or	"-rx" for writing or reading, a	colon and space,  then
	       the logged data.  A newline will flush the existing line, as
	       will reaching 80 characters.  If flushed at 80 characters and
	       not  a  newline,	 an  ellipsis will follow the line.  The total
	       logged bytes will be emitted at the end of all reads or writes.

       defmime
	       If no MIME type is specified (that is, there's no suffix	to the
	       page request), use this index in	the mimes array.

       defpage
	       If no page was specified	(e.g., the default landing page), this
	       is provided as the requested page index.

       keys    An optional array of input and validation fields	or NULL.

       keysz   The number of elements in keys.

       mimesz  The number of elements in mimes.	 Also the MIME index  used  if
	       no  MIME	type was matched.  This	differs	from defmime, which is
	       used if there is	no MIME	suffix at all.

       mimes   An array	of MIME	types (e.g., "text/html"), mapped into a  MIME
	       index during MIME body parsing.	This relates both to pages and
	       input  fields  with  a  body type.  Any array should include at
	       least text/plain, as this is the	default	content	type for  MIME
	       documents.

       opts    Tunable	options	 regarding  socket buffer sizes	and so on.  If
	       set to NULL, meaningful defaults	are used.

       pages   An array	of recognised pathnames.  When pathnames  are  parsed,
	       they're matched to indices in this array.

       pagesz  The number of elements in pages.  Also the page index used if
	       the requested page was not in pages.

       req     This structure is cleared and filled with input fields and HTTP
	       context parsed from the CGI environment.	 It is the main	struc-
	       ture carried around in a	kcgi(3)	application.

       suffixes
	       Define the MIME type (suffix) mapping.

       The first form, khttp_parse(), is for applications  using  the  system-
       recognised  MIME	types.	This should work well enough for most applica-
       tions.  It is equivalent	to invoking the	second	form,  khttp_parsex(),
       as follows:

	     khttp_parsex(req, ksuffixmap,
	       kmimetypes, KMIME__MAX, keys, keysz,
	       pages, pagesz, KMIME_TEXT_HTML,
	       defpage,	NULL, NULL, 0, NULL);
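
       The difference between defmime and mimesz can be modelled with a short,
       self-contained sketch.  It is not the library's implementation: struct
       suffix_entry and resolve_mime() are hypothetical names, and the exact,
       case-sensitive suffix comparison is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one suffix-to-MIME mapping entry. */
struct suffix_entry {
	const char	*name;	/* suffix without the dot; NULL terminates */
	size_t		 mime;	/* index into the mimes array */
};

/*
 * Model of the documented rule: an empty suffix yields defmime,
 * a known suffix yields its mapped index, and an unknown suffix
 * yields mimesz.
 */
static size_t
resolve_mime(const char *suffix, const struct suffix_entry *map,
    size_t defmime, size_t mimesz)
{
	size_t	 i;

	assert(suffix != NULL && map != NULL);
	if (suffix[0] == '\0')
		return defmime;
	for (i = 0; map[i].name != NULL; i++)
		if (strcmp(map[i].name, suffix) == 0)
			return map[i].mime;
	return mimesz;
}
```

       Under this model, an application comparing the resulting index against
       mimesz can tell "unknown suffix" apart from "no suffix".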

   Types
       A  struct kreq object is	filled in by khttp_parse() and khttp_parsex().
       It consists of the following fields:

       void *arg
	       Private application data.  This is set during khttp_parse().

       enum kauth auth
	       Type of "managed"  HTTP	authorisation  performed  by  the  web
	       server  according  to  the  AUTH_TYPE  header variable, if any.
	       This is KAUTH_DIGEST for	the AUTH_TYPE of "digest", KAUTH_BASIC
	       for "basic", KAUTH_BEARER for "bearer", KAUTH_UNKNOWN for other
	       values of AUTH_TYPE, or KAUTH_NONE if  AUTH_TYPE	 is  not  set.
	       See  the	 rawauth field for raw (i.e., not processed by the web
	       server) authorisation requests.

       struct kpair **cookiemap
	       An array	of keysz  singly  linked  lists	 of  elements  of  the
	       cookies	array.	 If cookie->key	is equal to one	of the entries
	       of keys and cookie->state is  KPAIR_VALID  or  KPAIR_UNCHECKED,
	       the  cookie  is	added  to  the list cookiemap[cookie->keypos].
	       Empty lists are NULL.  If a list	contains more than one cookie,
	       cookie->next points to the next cookie.	For the	last cookie in
	       a list, cookie->next is NULL.

       struct kpair **cookienmap
	       Similar to cookiemap, except that it contains the cookies where
	       cookie->state is	KPAIR_INVALID.

       struct kpair *cookies
	       Key-value  pairs	 read  from  request  cookies  found  in   the
	       HTTP_COOKIE  header  variable,  or  NULL	if cookiesz is 0.  See
	       fields for key-value pairs from the  request  query  string  or
	       message body.

       size_t cookiesz
	       The size	of the cookies array.

       struct kpair **fieldmap
	       Similar to cookiemap, except that the lists contain elements of
	       the fields array.

       struct kpair **fieldnmap
	       Similar	to  fieldmap, except that it contains the fields where
	       field->state is KPAIR_INVALID.

       struct kpair *fields
	       Key-value pairs read from the QUERY_STRING header variable  and
	       from  the  message  body, or NULL if fieldsz is 0.  See cookies
	       for key-value pairs from	request	cookies.

       size_t fieldsz
	       The number of elements in the fields array.

       char *fullpath
	       The full	requested path as contained in	the  PATH_INFO	header
	       variable.	    For		  example,	    requesting
	       "https://bsd.lv/app.cgi/dir/file.html?q=v", where "app.cgi"  is
	       the CGI program,	this value would be /dir/file.html.  It	is not
	       guaranteed to start with	a slash	and it may be an empty string.

       char *host
	       The  host name received in the HTTP_HOST	header variable.  When
	       using name-based	virtual	hosting, this is typically the virtual
	       host name specified by the client in the	HTTP request,  and  it
	       should  not be confused with the	canonical DNS name of the host
	       running	the  web  server.    For   example,   a	  request   to
	       "https://bsd.lv/app.cgi/file"  would  have  a host of "bsd.lv".
	       If HTTP_HOST is not defined, host is set	to "localhost".

       struct kdata *kdata
	       Internal	data.  Should not be touched.

       const struct kvalid *keys
	       Value passed to khttp_parse().

       size_t keysz
	       Value passed to khttp_parse().

       enum kmethod method
	       The KMETHOD_ACL,	KMETHOD_CONNECT, KMETHOD_COPY, KMETHOD_DELETE,
	       KMETHOD_GET,  KMETHOD_HEAD,  KMETHOD_LOCK,  KMETHOD_MKCALENDAR,
	       KMETHOD_MKCOL,	KMETHOD_MOVE,  KMETHOD_OPTIONS,	 KMETHOD_POST,
	       KMETHOD_PROPFIND,	KMETHOD_PROPPATCH,	  KMETHOD_PUT,
	       KMETHOD_REPORT,	KMETHOD_TRACE,	or  KMETHOD_UNLOCK  submission
	       method obtained from the	REQUEST_METHOD header variable.	 If an
	       unknown method was requested,  KMETHOD__MAX  is	used.	If  no
	       method was specified, the default is KMETHOD_GET.

	       Applications   will   usually   accept	only  KMETHOD_GET  and
	       KMETHOD_POST, so	be sure	to emit	a KHTTP_405 status  for	 unde-
	       sired methods.

       size_t mime
	       The MIME type of the requested file as determined by its suffix
	       matched to the suffixes map passed to khttp_parsex() or the
	       default ksuffixmap if using khttp_parse().  If no suffix is
	       specified, this is the defmime value passed to khttp_parsex()
	       or KMIME_TEXT_HTML if using khttp_parse().  If the suffix is
	       specified but not known, this is the mimesz value passed to
	       khttp_parsex() or KMIME__MAX if using khttp_parse().

       size_t page
	       The page	index found by looking up pagename in the pages	array.
	       If pagename is not found	in pages, pagesz is used; if  pagename
	       is empty, defpage is used.

       char *pagename
	       The  first component of fullpath	or an empty string if there is
	       none.  It is compared to	the elements of	the pages array	to de-
	       termine which page it  corresponds  to.	 For  example,	for  a
	       fullpath	of "/dir/file.html" this component corresponds to dir.
	       For "/file.html", it's file.

       char *path
	       The  middle  part of fullpath, after stripping pagename/	at the
	       beginning and .suffix at	the end, or an empty string  if	 there
	       is  none.   For	example, if the	fullpath is bar/baz.html, this
	       component is baz.

       char *pname
	       The script name received	in the	SCRIPT_NAME  header  variable.
	       For    example,	  for	 a    request	to   a	 CGI   program
	       /var/www/cgi-bin/app.cgi	 mapped	 by  the   web	 server	  from
	       "https://bsd.lv/app.cgi/file", this would be app.cgi.  This may
	       not reflect a file system entity	and it may be an empty string.

       uint16_t	port
	       The  server's  receiving	 TCP port according to the SERVER_PORT
	       header variable,	or 80 if that is not  defined  or  an  invalid
	       number.

       struct khttpauth	rawauth
	       The    raw    authorization    request	 according    to   the
	       HTTP_AUTHORIZATION header variable passed by  the  web  server.
	       This  is	 only set if the web server is not managing authorisa-
	       tion itself.

       char *remote
	       The string form of the client's IPv4 or IPv6 address taken from
	       the REMOTE_ADDR header variable,	or "127.0.0.1" if that is  not
	       defined.	 The address format of the string is not checked.

       struct khead *reqmap[KREQU__MAX]
	       Mapping	of  enum  krequ	enumeration values to reqs parsed from
	       the input stream.

       struct khead *reqs
	       List of all HTTP	request	headers, known via enum	krequ and  not
	       known, parsed from the input stream, or NULL if reqsz is	0.

       size_t reqsz
	       Number of request headers in reqs.

       enum kscheme scheme
	       The  access  scheme according to	the HTTPS header variable, ei-
	       ther KSCHEME_HTTPS if HTTPS is set and equal to the string "on"
	       or KSCHEME_HTTP otherwise.

       char *suffix
	       The suffix part of the last component of	fullpath or  an	 empty
	       string  if  there  is  none.   For  example, if the fullpath is
	       /bar/baz.html, this component is	html.  See the mime field  for
	       the MIME	type parsed from the suffix.
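
       The relationship between fullpath, pagename, path, suffix, and page can
       be illustrated with a self-contained sketch.  It only models the
       behaviour described above and is not the library's code: struct
       split_path, split_fullpath(), and lookup_page() are hypothetical names,
       the fixed buffer sizes are arbitrary, and exact string comparison of
       page names is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for the three path components. */
struct split_path {
	char	 pagename[64];
	char	 path[256];
	char	 suffix[32];
};

/* Copy len bytes into dst, truncating to fit, and NUL-terminate. */
static void
copy_span(char *dst, size_t dstsz, const char *src, size_t len)
{
	if (len >= dstsz)
		len = dstsz - 1;
	memcpy(dst, src, len);
	dst[len] = '\0';
}

static void
split_fullpath(const char *fullpath, struct split_path *out)
{
	const char	*start, *end, *slash, *dot;

	assert(fullpath != NULL && out != NULL);
	/* A leading slash is optional and not part of any component. */
	start = fullpath[0] == '/' ? fullpath + 1 : fullpath;
	end = start + strlen(start);

	/* The suffix follows the last dot of the last component. */
	slash = strrchr(start, '/');
	dot = strrchr(slash != NULL ? slash + 1 : start, '.');
	if (dot != NULL) {
		copy_span(out->suffix, sizeof(out->suffix), dot + 1,
		    (size_t)(end - dot - 1));
		end = dot;
	} else
		out->suffix[0] = '\0';

	/* The page name runs to the first slash; the rest is the path. */
	slash = memchr(start, '/', (size_t)(end - start));
	if (slash != NULL) {
		copy_span(out->pagename, sizeof(out->pagename), start,
		    (size_t)(slash - start));
		copy_span(out->path, sizeof(out->path), slash + 1,
		    (size_t)(end - slash - 1));
	} else {
		copy_span(out->pagename, sizeof(out->pagename), start,
		    (size_t)(end - start));
		out->path[0] = '\0';
	}
}

/* Empty name: defpage.  Unknown name: pagesz.  Otherwise the index. */
static size_t
lookup_page(const char *pagename, const char *const *pages,
    size_t pagesz, size_t defpage)
{
	size_t	 i;

	assert(pagename != NULL);
	if (pagename[0] == '\0')
		return defpage;
	for (i = 0; i < pagesz; i++)
		if (strcmp(pages[i], pagename) == 0)
			return i;
	return pagesz;
}
```

       With this model, "/dir/file.html" splits into the page name dir, the
       path file, and the suffix html, matching the examples given for the
       individual fields.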

       The  application	 may  optionally define	keys provided to khttp_parse()
       and khttp_parsex() as an	array of struct	 kvalid.   This	 structure  is
       central	to the validation of input data.  It consists of the following
       fields:

       const char *name
	       The field name, i.e., how it appears in	the  HTML  form	 input
	       name.   This  cannot  be	 NULL.	 If the	field name is an empty
	       string and the HTTP message consists of an opaque body (and not
	       key-value pairs), then that field will be used to validate  the
	       HTTP  message  body.   This is useful for KMETHOD_PUT style re-
	       quests.

       int (*)(struct kpair *) valid
	       A validation function returning non-zero	if parsing and valida-
	       tion succeed or 0 otherwise.  If	it is NULL, then no validation
	       is performed, the data is considered as valid, and it is	 buck-
	       eted into cookiemap or fieldmap as such.

	       User-defined  valid  functions  usually set the type and	parsed
	       fields in the key-value pair.  When working with	binary data or
	       with a key that can take	different data types, it is acceptable
	       for a validation	function to set	the type to KPAIR__MAX and for
	       the application to ignore the parsed field and to work directly
	       with val	and valsz.

	       The validation function is allowed to allocate new  memory  for
	       val:  if	 the val pointer changes during	validation, the	memory
	       pointed to after	validation will	be freed  with	free(3)	 after
	       the data	is passed out of the sandbox.

	       These functions are invoked from	within a system-specific sand-
	       box  that  may not allow	some system calls, for example opening
	       files or	sockets.  In other words, validation functions	should
	       only do pure computation.
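
       The contract of a validation function can be sketched as follows.  So
       that the example compiles without the library, struct demo_pair and
       enum demo_type are hypothetical stand-ins that mirror only the
       documented val, valsz, type, and parsed fields; valid_port() is
       likewise an illustrative name and not part of kcgi(3).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins mirroring the documented pair fields. */
enum demo_type { DEMO_INTEGER, DEMO_STRING, DEMO_DOUBLE, DEMO__MAX };

struct demo_pair {
	char		*val;	/* input bytes; val[valsz] is NUL */
	size_t		 valsz;	/* length of val in bytes */
	enum demo_type	 type;	/* set by the validator on success */
	union {
		int64_t		 i;
		const char	*s;
		double		 d;
	} parsed;
};

/*
 * Accept a decimal integer in [1, 65535].
 * Returns non-zero on success and 0 on failure.
 * Pure computation only: no files, sockets, or other system calls.
 */
static int
valid_port(struct demo_pair *p)
{
	char		*ep;
	long long	 v;

	assert(p != NULL && p->val != NULL);
	/* val may hold embedded NUL bytes, so check it is a C string. */
	if (p->valsz == 0 || strlen(p->val) != p->valsz)
		return 0;
	errno = 0;
	v = strtoll(p->val, &ep, 10);
	if (errno != 0 || ep == p->val || *ep != '\0')
		return 0;
	if (v < 1 || v > 65535)
		return 0;
	/* Record both the type and the parsed value. */
	p->type = DEMO_INTEGER;
	p->parsed.i = v;
	return 1;
}
```

       A real validator would take a struct kpair pointer instead and be
       referenced from the valid member of the matching struct kvalid entry.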

       The  struct  kpair  structure presents the user with fields parsed from
       input  and  (possibly)  matched	to  the	 keys	variable   passed   to
       khttp_parse()  and khttp_parsex().  It is also passed to	the validation
       function	to be filled in.  In this case,	the  MIME-related  fields  are
       already	filled in and may be examined to determine the method of vali-
       dation.	This is	useful when validating opaque message bodies.

       char *ctype
	       The value's MIME	content	type (e.g., image/jpeg), or  an	 empty
	       string if not defined.

       size_t ctypepos
	       If ctype is not NULL, it is looked up in the mimes parameter
	       passed to khttp_parsex() or kmimetypes if using khttp_parse().
	       If found, it is set to the appropriate index.  Otherwise, it's
	       mimesz.

       char *file
	       The value's MIME	source filename	or an empty string if not  de-
	       fined.

       char *key
	       The  NUL-terminated key (input) name.  If the HTTP message body
	       is opaque (e.g.,	KMETHOD_PUT),  then  an	 empty-string  key  is
	       cooked  up.   The key may contain an arbitrary sequence of non-
	       NUL bytes, even non-ASCII bytes,	control	characters, and	 shell
	       metacharacters.

       size_t keypos
	       If  found  in the keys array passed to khttp_parse(), the index
	       of the matching key.  Otherwise keysz.

       struct kpair *next
	       In a cookie or field map, next points to	the next  parsed  key-
	       value  pair  with the same key name.  This occurs most often in
	       HTML checkbox forms, where many fields may have the same	name.

       union parsed parsed
	       The parsed, validated value.  This may be an integer i, for a
	       64-bit signed integer; a string s, for a NUL-terminated
	       character string; or a double d, for a double-precision
	       floating-point number.  This is intentionally basic because the
	       resulting data must be reliably passed from the parsing context
	       back into the web application.

       enum kpairstate state
	       The validation state: KPAIR_VALID if the	pair was  successfully
	       validated  by a validation function, KPAIR_INVALID if a valida-
	       tion function was invoked but failed, or	KPAIR_UNCHECKED	if  no
	       validation function is defined for this key.

       enum kpairtype type
	       If parsed, the type of data in parsed, otherwise KPAIR__MAX.

       char *val
	       The  (input)  value, which may contain an arbitrary sequence of
	       bytes, even NUL bytes, non-ASCII	bytes, control characters, and
	       shell metacharacters.  The byte following the end of the	array,
	       val[valsz], is always guaranteed	to  be	NUL.   The  validation
	       function	 may  modify  the  contents.  For example, for integer
	       numbers and e-mail addresses, trailing whitespace may be
	       replaced with NUL bytes.

       size_t valsz
	       The  length  of	the  val  buffer in bytes.  It is not a	string
	       length.

       char *xcode
	       The value's MIME	content	transfer encoding (e.g.,  base64),  or
	       an empty	string if not defined.
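
       Because several pairs may share one key, a map bucket is walked through
       the next pointer rather than read as a single element.  The sketch
       below shows the traversal on a hypothetical stand-in, struct
       demo_field, which mirrors only the documented key, val, state, and next
       fields; count_pairs() is an illustrative helper and not part of
       kcgi(3).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins mirroring the documented pair fields. */
enum demo_state { DEMO_VALID, DEMO_INVALID, DEMO_UNCHECKED };

struct demo_field {
	const char		*key;
	const char		*val;
	enum demo_state		 state;
	struct demo_field	*next;	/* next pair with the same key */
};

/*
 * Count the pairs bucketed under one key index.
 * An empty bucket is NULL; the last pair in a bucket has a NULL next.
 */
static size_t
count_pairs(struct demo_field *const *map, size_t keypos)
{
	const struct demo_field	*p;
	size_t			 n = 0;

	assert(map != NULL);
	for (p = map[keypos]; p != NULL; p = p->next)
		n++;
	return n;
}
```

       The same loop shape applies to cookiemap, cookienmap, fieldmap, and
       fieldnmap, each indexed by a position in the keys array.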

       The  struct  khttpauth  structure holds authorisation data if passed by
       the server.  The	specific fields	are as follows.

       enum kauth type
	       If no data  was	passed	by  the	 server,  the  type  value  is
	       KAUTH_NONE.    Otherwise	 it's  KAUTH_BASIC,  KAUTH_BEARER,  or
	       KAUTH_DIGEST.  KAUTH_UNKNOWN  signals  that  the	 authorisation
	       type was	not recognised.

       int authorised
	       For  KAUTH_BASIC,  KAUTH_BEARER,	or KAUTH_DIGEST	authorisation,
	       this field indicates whether all	required values	were specified
	       for the application to perform authorisation.

       char *digest
	       An MD5 digest of	REQUEST_METHOD,	SCRIPT_NAME, PATH_INFO,	header
	       variables and the request body.	It  is	not  a	NUL-terminated
	       string,	but an array of	exactly	MD5_DIGEST_LENGTH bytes.  Only
	       filled in when HTTP_AUTHORIZATION is "digest" and authorised is
	       non-zero.    Otherwise,	 it    remains	  NULL.	    Used    in
	       khttpdigest_validatehash(3).

       d       An  anonymous  union  containing	parsed fields per type:	struct
	       khttpbasic basic	for KAUTH_BASIC	 or  KAUTH_BEARER,  or	struct
	       khttpdigest digest for KAUTH_DIGEST.

       If  the	field  for  an	HTTP  authorisation  request is	KAUTH_BASIC or
       KAUTH_BEARER, it	will consist of	the following for its parsed  entities
       in its struct khttpbasic	structure:

       response
	       The  hashed  and	encoded	response string	for KAUTH_BASIC, or an
	       opaque string for KAUTH_BEARER.

       If the field for	an HTTP	authorisation request is KAUTH_DIGEST, it will
       consist of the following	in its struct khttpdigest structure:

       alg     The  encoding  algorithm,  parsed  from	the  possible  MD5  or
	       MD5-Sess	values.

       qop     The  quality of protection algorithm, which may be unspecified,
	       Auth or Auth-Int.

       user    The user	coordinating the request.

       uri     The URI for which the request is	designated.  (This must	 match
	       the request URI).

       realm   The request realm.

       nonce   The server-generated nonce value.

       cnonce  The (optional) client-generated nonce value.

       response
	       The hashed and encoded response string, which entangles fields
	       depending on algorithm and quality of protection.

       count   The (optional) cnonce counter.

       opaque  The (optional) opaque string requested by the server.

       The struct kopts	structure consists of  tunables	 for  network  perfor-
       mance.	You  probably  don't  want to use these	unless you really know
       what you're doing!

       sndbufsz
	       The size	of the output buffer.  The output buffer is a heap-al-
	       located	region	into  which  writes  (via  khttp_write(3)  and
	       khttp_head(3))  are  buffered instead of	being flushed directly
	       to the wire.  The buffer	is flushed when	it is full,  when  the
	       HTTP  headers  are  flushed, and	when khttp_free(3) is invoked.
	       If the buffer size is zero, writes are flushed  immediately  to
	       the  wire.   If the buffer size is less than zero, it is	filled
	       with a meaningful default.

       Lastly, the struct khead	structure holds	parsed HTTP headers.

       key     Holds the HTTP header name.  This is not	the  CGI  header  name
	       (e.g.,  HTTP_COOKIE),  but  the	reconstituted HTTP name	(e.g.,
	       Cookie).

       val     The opaque header value,	which may be an	empty string.

   Variables
       A number of variables are defined in <kcgi.h> to simplify invocations of
       the  khttp_parse()  family.  Applications are strongly suggested	to use
       these variables (and associated enumerations) in	khttp_parse()  instead
       of overriding them with hand-rolled sets	in khttp_parsex().

       kmimetypes
	       Indexed list of common MIME types, for example, "text/html" and
	       "application/json".  Corresponds to enum kmime.

       khttps  Indexed list of HTTP status codes and identifiers, for example,
	       "200 OK".  Corresponds to enum khttp.

       kschemes
	       Indexed list of URL schemes, for	 example,  "https"  or	"ftp".
	       Corresponds to enum kscheme.

       kmethods
	       Indexed	list  of  HTTP methods,	for example, "GET" and "POST".
	       Corresponds to enum kmethod.

       ksuffixmap
	       Map of MIME types defined in enum kmime to  possible  suffixes.
	       This  array  is	terminated  with a MIME	type of	KMIME__MAX and
	       name NULL.

       ksuffixes
	       Indexed list of canonical suffixes for MIME types corresponding
	       to enum kmime.  This may	be a NULL pointer for types that  have
	       no canonical suffix, for example, "application/octet-stream".

RETURN VALUES
       khttp_parse() and khttp_parsex()	return an error	code:

       KCGI_OK
	    Success (not an error).

       KCGI_ENOMEM
	    Memory  failure.  This can occur in	many places: spawning a	child,
	    allocating memory, creating	sockets, etc.

       KCGI_ENFILE
	    Could not allocate file descriptors.

       KCGI_EAGAIN
	    Could not spawn a child.

       KCGI_FORM
	    Malformed data between parent and child whilst parsing an HTTP re-
	    quest.  (Internal system error.)

       KCGI_SYSTEM
	    Opaque operating system error.

       On failure, the calling application should terminate as soon as
       possible.  Applications should not try to write an HTTP 500 error or
       similar, but allow the web server to handle the empty CGI response on
       its own.

SEE ALSO
       kcgi(3),	khttp_free(3)

AUTHORS
       The khttp_parse() and khttp_parsex() functions were written by Kristaps
       Dzonsons	<kristaps@bsd.lv>.

FreeBSD	Ports 14.quarterly	  $Mdocdate$			KHTTP_PARSE(3)
