gruvi.http – HTTP Client and Server

The gruvi.http module implements an HTTP client and server.

The client and server are relatively complete implementations of the HTTP protocol. Some of the supported features are keepalive, pipelining, chunked transfers and trailers.

This implementation supports both HTTP/1.0 and HTTP/1.1. The default for the client is 1.1, and the server will respond with the same version as the client.

Connections are kept alive by default. This means that you need to make sure you close connections when they are no longer needed, by calling the appropriate close() method.

It is important to clarify how the API exposed by this module uses text and binary data. Data that is read from or written to the HTTP header, such as the version string, method, and headers, is represented as text strings (str on Python 3, str or unicode on Python 2). However, if the string type is unicode aware (str on Python 3, unicode on Python 2), you must make sure that it only contains code points that are part of ISO-8859-1, which is the default encoding specified in RFC 2616. Data that is read from or written to HTTP bodies is always binary. This matches the WSGI spec, which requires bodies to be binary.
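The text/binary split described above can be sketched with the standard library alone. This is only an illustration: is_latin1 is a hypothetical helper and not part of gruvi.http, and the header name is made up for the example.

```python
def is_latin1(text):
    """Return True if every code point in text can be encoded as ISO-8859-1."""
    try:
        text.encode('iso-8859-1')
    except UnicodeEncodeError:
        return False
    return True

# Header data is text, restricted to ISO-8859-1 code points.
header = ('X-Greeting', 'caf\u00e9')

# Body data is always binary, so text must be encoded explicitly first.
body = 'caf\u00e9 \u2713'.encode('utf-8')
```

Checking header values this way before passing them to the API is one option for catching characters such as U+2713 early, which UTF-8 bodies can carry but ISO-8859-1 headers cannot.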

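This module provides a number of APIs. Client-side there is one:

  • The HttpClient API. Connect to a server with connect(), then use request() and getresponse() to interact with it.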

The following server-side APIs are available:

  • A gruvi.Server based API. Incoming HTTP messages are passed to a message handler that needs to take care of all aspects of HTTP other than parsing.
  • A WSGI API, as described in PEP 333.

The server-side API is selected through the adapter argument to the HttpServer constructor. The default adapter is WsgiAdapter, which implements the WSGI protocol. To use the raw server interface, pass the identity function (lambda x: x).

REQUEST

Constant indicating an HTTP request.

RESPONSE

Constant indicating an HTTP response.

exception HttpError

Exception that is raised in case of HTTP protocol errors.

class ParsedUrl

A namedtuple() with the following fields: scheme, host, port, path, query, fragment and userinfo.

In addition to the tuple fields the following properties are defined:

addr

Address tuple that can be used with create_connection().

ssl

Whether the scheme requires SSL/TLS.

target

The “target” i.e. local part of the URL, consisting of the path and query.

parse_url(url, default_scheme='http', is_connect=False)

Parse a URL and return its components.

The default_scheme argument specifies the scheme to use when url is an otherwise valid absolute URL that is missing a scheme.

The is_connect argument must be set to True if the URL was requested with the HTTP CONNECT method. These URLs have a different form and need to be parsed differently.

The result is a ParsedUrl containing the URL components.
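To make the ParsedUrl fields and properties concrete, here is a rough standard-library approximation. It is a sketch under assumptions, not gruvi's implementation: SketchUrl, sketch_parse_url and DEFAULT_PORTS are hypothetical names, the default-port fallback is assumed, and the CONNECT form (is_connect) is not handled.

```python
from collections import namedtuple
from urllib.parse import urlsplit

_Fields = namedtuple('_Fields', 'scheme host port path query fragment userinfo')

# Assumed fallback ports for URLs that do not name one explicitly.
DEFAULT_PORTS = {'http': 80, 'https': 443}


class SketchUrl(_Fields):
    """Tuple of URL components plus a few derived properties."""
    __slots__ = ()

    @property
    def addr(self):
        # (host, port) pair suitable for opening a connection.
        return (self.host, self.port)

    @property
    def ssl(self):
        # Only the https scheme is treated as requiring SSL/TLS here.
        return self.scheme == 'https'

    @property
    def target(self):
        # Path plus query, i.e. what goes on the HTTP request line.
        target = self.path or '/'
        if self.query:
            target += '?' + self.query
        return target


def sketch_parse_url(url, default_scheme='http'):
    """Split url into components, adding default_scheme if none is given."""
    if '://' not in url:
        url = default_scheme + '://' + url
    parts = urlsplit(url)
    userinfo = parts.netloc.rpartition('@')[0]
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return SketchUrl(parts.scheme, parts.hostname or '', port,
                     parts.path, parts.query, parts.fragment, userinfo)


url = sketch_parse_url('https://user@example.com:8443/a/b?x=1#top')
```

With this sketch, url.addr is the (host, port) pair, url.ssl reflects the scheme, and url.target joins the path and query.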

class HttpMessage

HTTP message.

Instances of this class are returned by HttpClient.getresponse() and passed as an argument to HttpServer message handlers.

message_type

The message type, either REQUEST or RESPONSE.

version

The HTTP version as a string, either '1.0' or '1.1'.

status_code

The HTTP status code as an integer. Only for response messages.

method

The HTTP method as a string. Only for request messages.

url

The URL as a string. Only for request messages.

parsed_url

The parsed URL as a ParsedUrl instance.

headers

The headers as a list of (name, value) tuples.

charset

The character set as parsed from the “Content-Type” header, if available.

body

The message body, as a Stream instance.

get_header(headers, name, default=None)

A shorthand for get_header(headers, ...).
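Because headers are exposed as a list of (name, value) tuples, looking one up means scanning the list. The sketch below shows one way to do that and to derive a character set from “Content-Type”. It uses only the standard library; find_header and charset_of are hypothetical helpers and may differ from what gruvi does internally.

```python
from email.message import Message


def find_header(headers, name, default=None):
    """Return the first value for header name, compared case-insensitively."""
    name = name.lower()
    for key, value in headers:
        if key.lower() == name:
            return value
    return default


def charset_of(headers):
    """Return the lower-cased charset parameter of Content-Type, or None."""
    content_type = find_header(headers, 'Content-Type')
    if content_type is None:
        return None
    msg = Message()
    msg['Content-Type'] = content_type
    return msg.get_content_charset()


headers = [('content-type', 'text/plain; charset=UTF-8'),
           ('Content-Length', '4')]
length = find_header(headers, 'CONTENT-LENGTH')
charset = charset_of(headers)
```

The comparison is case-insensitive because HTTP header names are case-insensitive.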

class HttpRequest(protocol)

HTTP client request.

Usually you do not instantiate this class directly, but use the instance returned by HttpProtocol.request(). You can however start a new request yourself by instantiating this class and passing it a protocol instance.

switchpoint start_request(method, url, headers=None, bodylen=None)

Start a new HTTP request.

The optional headers argument contains the headers to send. It must be a sequence of (name, value) tuples.

The optional bodylen parameter is a hint that specifies the length of the body that will follow. A length of -1 indicates no body, 0 means an empty body, and a positive number indicates the body size in bytes. This parameter helps determine whether to use the chunked transfer encoding. Normally when the body size is known chunked encoding is not used.

switchpoint write(buf)

Write buf to the request body.

switchpoint end_request()

End the request body.
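The bodylen hint accepted by start_request() can be read as a small decision table. The sketch below spells it out. framing_headers is a hypothetical function, and treating None as "size unknown, use chunked" is an assumption based on the description of the parameter, not a confirmed detail of the implementation.

```python
def framing_headers(bodylen=None):
    """Return the body framing headers implied by a bodylen hint."""
    if bodylen is None:
        # Assumed: no hint means the size is unknown, so fall back to chunked.
        return [('Transfer-Encoding', 'chunked')]
    if bodylen < 0:
        # -1 indicates that no body follows at all.
        return []
    # 0 is an empty body and a positive number is the body size in bytes.
    return [('Content-Length', str(bodylen))]
```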

class WsgiAdapter(application)

WSGI Adapter

This class adapts the WSGI callable application so that instances of this class can be used as a message handler in HttpProtocol.
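For reference, this is the shape of a WSGI callable that such an adapter wraps. The sketch drives the application by hand with wsgiref from the standard library, so it runs without a server. call_app is a hypothetical test helper and does not reflect how WsgiAdapter itself invokes the application.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Minimal WSGI application that reports the configured server name."""
    body = 'served by {}'.format(environ['SERVER_NAME']).encode('utf-8')
    headers = [('Content-Type', 'text/plain; charset=UTF-8'),
               ('Content-Length', str(len(body)))]
    start_response('200 OK', headers)
    return [body]  # WSGI bodies are iterables of bytes


def call_app(app, server_name):
    """Invoke app once with a synthetic environ and collect the result."""
    environ = {'SERVER_NAME': server_name}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body


status, headers, body = call_app(application, 'example.test')
```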

class HttpProtocol(handler=None, server_side=False, server_name=None, version=None, timeout=None)

HTTP protocol implementation.

The handler argument specifies a message handler to handle incoming HTTP requests. It must be a callable with the signature handler(message, transport, protocol).

The server_side argument specifies whether this is a client or server side protocol.

For client-side protocols, the server_name argument specifies the remote server name, which is used as the Host: header. It is normally not required to set this as it is taken from the unresolved hostname passed to connect().

For server-side protocols, server_name will be used as the value for the SERVER_NAME WSGI environment variable.

default_version = '1.1'

Default HTTP version.

max_header_size = 65536

Max header size. The parser keeps the header in memory during parsing.

max_buffer_size = 65536

Max number of body bytes to buffer. Bodies larger than this will cause the transport to be paused until the buffer is below the threshold again.
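The pause and resume behavior described for max_buffer_size can be modeled with a small counter. This is only a conceptual sketch: BufferGate is a hypothetical class, and the exact comparison used at the threshold is an assumption.

```python
class BufferGate:
    """Track buffered body bytes and report when reading should pause."""

    def __init__(self, max_buffer_size=65536):
        self.max_buffer_size = max_buffer_size
        self.buffered = 0
        self.paused = False

    def feed(self, nbytes):
        """Account for nbytes received from the transport."""
        self.buffered += nbytes
        if self.buffered > self.max_buffer_size:
            self.paused = True  # a real protocol would pause the transport

    def consume(self, nbytes):
        """Account for nbytes read by the application."""
        self.buffered -= min(nbytes, self.buffered)
        if self.paused and self.buffered < self.max_buffer_size:
            self.paused = False  # and resume it once below the threshold


gate = BufferGate(max_buffer_size=10)
gate.feed(8)    # under the limit, keep reading
gate.feed(5)    # 13 bytes buffered, reading pauses
```

A slow reader therefore bounds memory use: the peer is throttled by the paused transport rather than by unbounded buffering.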

max_pipeline_size = 10

Max number of pipelined requests to keep before pausing the transport.

writer

A Stream instance for writing directly to the underlying transport.

switchpoint request(method, url, headers=None, body=None)

Make a new HTTP request.

The method argument is the HTTP method as a string, for example 'GET' or 'POST'. The url argument specifies the URL.

The optional headers argument specifies extra HTTP headers to use in the request. It must be a sequence of (name, value) tuples.

The optional body argument may be used to include a body in the request. It must be a bytes instance, a file-like object opened in binary mode, or an iterable producing bytes instances. To send potentially large bodies, use the file or iterator interfaces. This has the benefit that only a single chunk is kept in memory at a time.
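The three accepted body forms can all be reduced to a stream of bytes chunks, which is why only one chunk needs to be in memory at a time. The sketch below illustrates the idea. iter_body is a hypothetical helper, and the chunk size is an arbitrary choice for the example.

```python
import io


def iter_body(body, chunk_size=4096):
    """Yield bytes chunks from a bytes object, a binary file or an iterable."""
    if isinstance(body, bytes):
        if body:
            yield body
    elif hasattr(body, 'read'):
        # File-like object: read fixed-size chunks until EOF.
        while True:
            chunk = body.read(chunk_size)
            if not chunk:
                break
            yield chunk
    else:
        # Any other iterable is assumed to produce bytes instances already.
        for chunk in body:
            yield chunk


chunks = list(iter_body(io.BytesIO(b'abcdef'), chunk_size=4))
```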

The response to the request can be obtained by calling the getresponse() method. You may make multiple requests before reading a response. For every request that you make, however, you must call getresponse() exactly once. The remote HTTP implementation will send the responses in the same order as the requests.

This method will use the “chunked” transfer encoding if there is a body and the body size is unknown ahead of time. This happens when the file or iterator interface is used in the absence of a “Content-Length” header.
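For readers unfamiliar with the “chunked” transfer encoding, the following sketch shows the HTTP/1.1 wire format it produces for an iterable of bytes. encode_chunked is a hypothetical helper written for illustration. It does not emit trailers and is not taken from gruvi.

```python
def encode_chunked(chunks):
    """Frame an iterable of bytes using the HTTP/1.1 chunked encoding."""
    for chunk in chunks:
        if chunk:
            # Each chunk is prefixed by its size in hexadecimal.
            yield b'%x\r\n' % len(chunk) + chunk + b'\r\n'
    # A zero-sized chunk followed by an empty line terminates the body.
    yield b'0\r\n\r\n'


wire = b''.join(encode_chunked([b'hello', b'', b' world!']))
```

Empty chunks are skipped because a zero-sized chunk would end the body prematurely.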

switchpoint getresponse()

Wait for and return an HTTP response.

The return value will be an HttpMessage. When this method returns, only the response header has been read. The response body can be read using read() and similar methods on the message body.

Note that if you use persistent connections (the default), it is required that you read the entire body of each response. If you don’t, deadlocks may occur.
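One way to make sure a body is fully consumed, even when its content is not needed, is to drain it in fixed-size reads. The sketch below uses io.BytesIO as a stand-in for the message body. drain is a hypothetical helper, and it assumes that read(n) returns an empty bytes object at end of body, as file objects do.

```python
import io


def drain(stream, chunk_size=4096):
    """Read stream to the end in chunks and return the number of bytes read."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return total
        total += len(chunk)


# Stand-in for a response body; a real body would come from getresponse().
discarded = drain(io.BytesIO(b'x' * 10000))
```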

class HttpClient(version=None, timeout=None)

HTTP client.

The optional version argument specifies the HTTP version to use. The default is HttpProtocol.default_version.

The optional timeout argument specifies the timeout for various network and protocol operations.

protocol

Return the protocol, or None if not connected.

switchpoint getresponse()

A shorthand for self.protocol.getresponse().

switchpoint request(method, url, headers=None, body=None)

A shorthand for self.protocol.request().

class HttpServer(application, server_name=None, adapter=None, timeout=None)

HTTP server.

The application argument is the web application to expose on this server. The application is wrapped in adapter to create a message handler as required by HttpProtocol. By default the adapter in default_adapter is used.

The optional server_name argument specifies the server name. The server name is made available to WSGI applications as the SERVER_NAME WSGI environment variable.

The optional timeout argument specifies the timeout for various network and protocol operations.

default_adapter

The default adapter to use.

alias of WsgiAdapter

Example

# HTTP client and server example.

from gruvi.http import HttpClient, HttpServer

def handler(env, start_response):
    headers = [('Content-Type', 'text/plain; charset=UTF-8')]
    status = '200 OK'
    body = 'pong'
    start_response(status, headers)
    yield body.encode('utf-8')

server = HttpServer(handler)
server.listen(('localhost', 0))
addr = server.addresses[0]

client = HttpClient()
client.connect(addr)
client.request('GET', '/ping')

resp = client.getresponse()
assert resp.status_code == 200

body = resp.body.read()
print('result = {}'.format(body.decode(resp.charset)))

# Connections are kept alive by default, so close them explicitly.
client.close()
server.close()