Fibers and the Hub

Fibers are lightweight execution contexts that are cooperatively scheduled. They form the basis for concurrency in Gruvi. Comparing fibers to threads:

  • Like threads, fibers represent an independent flow of control in a program.
  • Fibers are more lightweight than threads, meaning you can run more of them.
  • Unlike threads, there can only ever be one active fiber (per thread, see below). This means that fibers give you concurrency but not parallelism.
  • Unlike threads, fibers are cooperatively scheduled. They are never preempted and have to yield control to other fibers explicitly (called switching).
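
The last two points can be illustrated without Gruvi at all. The sketch below uses plain Python generators as stand-ins for fibers. The names worker and run_round_robin are made up for this illustration and are not part of Gruvi or python-fibers. Each worker runs until it explicitly yields, so only one is ever active and the interleaving is fully determined by where the yields are.

```python
from collections import deque

def worker(name, steps, log):
    """A stand-in for a fiber: runs until it explicitly yields."""
    for i in range(steps):
        log.append('{}:{}'.format(name, i))
        yield  # explicit switch; nothing can preempt us before this line

def run_round_robin(workers):
    """Toy scheduler: resume each worker in turn until all have finished."""
    queue = deque(workers)
    while queue:
        current = queue.popleft()
        try:
            next(current)          # run until the next yield
        except StopIteration:
            continue               # worker finished, drop it
        queue.append(current)      # not finished, schedule it again

log = []
run_round_robin([worker('a', 2, log), worker('b', 3, log)])
print(log)  # ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```

Real fibers differ from generators in that they have their own stack and can switch from inside nested function calls, but the scheduling model is the same.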

Gruvi uses the python-fibers package as its fiber library.

Fibers logically exist within threads. Each fiber has its own Python and C-level stack. When a Python program starts, there will be one thread called 'Main', and this thread will have one fiber called 'Root'. Both are created automatically; the main thread by Python and the root fiber by the python-fibers package.

Fibers are organized in a tree. Each fiber except the root fiber has a parent. When a fiber exits, either because its main function exits or because of an uncaught exception, control is passed to its parent. If the root fiber exits, the thread will exit.

The hub and fiber scheduling

The Gruvi framework maintains a special fiber called the Hub. The hub runs an event loop, and acts as a central fiber scheduler. The event loop in Gruvi is provided by libuv through the pyuv bindings.

At a high level, the flow of a Gruvi program is as follows:

  1. Run the current fiber (initially the root fiber).
  2. When the fiber needs to wait for something, for example network data, the event to wait for is registered with the event loop, together with a switch-back callback.
  3. The fiber switches to the hub, which runs the event loop.
  4. When the event loop detects an event for which a fiber registered interest, it will call the callback. This causes a switch back to the fiber that installed the event.
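
The four steps above can be sketched as a toy model. This is not how Gruvi is implemented: generators stand in for fibers, a heap of timers stands in for the libuv event loop, and the clock is virtual so that the example is deterministic. ToyHub, call_later and waiter are hypothetical names used only here.

```python
import heapq

class ToyHub:
    """Toy model of a hub: a timer-only event loop that resumes waiting fibers."""

    def __init__(self):
        self.now = 0.0       # virtual clock, so no real sleeping is needed
        self._timers = []    # heap of (deadline, sequence, callback)
        self._seq = 0        # tie-breaker so callbacks are never compared

    def call_later(self, delay, callback):
        """Register interest in a timer event together with a callback."""
        heapq.heappush(self._timers, (self.now + delay, self._seq, callback))
        self._seq += 1

    def start(self, fiber):
        """Step 1: run a generator-based fiber until it waits or finishes."""
        self._resume(fiber)

    def _resume(self, fiber):
        try:
            delay = next(fiber)   # the fiber runs until it needs to wait
        except StopIteration:
            return                # the fiber finished
        # Step 2: register the event plus a switch-back callback.
        self.call_later(delay, lambda: self._resume(fiber))
        # Step 3: returning from here hands control back to the loop.

    def run(self):
        """Step 4: dispatch events in deadline order until none remain."""
        while self._timers:
            deadline, _, callback = heapq.heappop(self._timers)
            self.now = deadline
            callback()

def waiter(name, delay, log, hub):
    log.append((hub.now, name, 'waiting'))
    yield delay                   # "wait for an event", here a timer
    log.append((hub.now, name, 'resumed'))

hub = ToyHub()
log = []
hub.start(waiter('slow', 2.0, log, hub))
hub.start(waiter('fast', 1.0, log, hub))
hub.run()
print(log)
```

Both waiters register their timers before the loop runs, and the loop then resumes them in the order in which their events fire, not the order in which they were started.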

The hub instance is automatically created when it is first needed. It can be retrieved using the following function:

get_hub()

Return the instance of the hub.

The event loop is available as the .loop property:

Hub.loop

The event loop used by this hub instance. This is an instance of pyuv.Loop.

Creating fibers and switching

To create a new fiber, instantiate the Fiber class and pass it a main function. The Fiber class is a thin wrapper on top of fibers.Fiber, to which it adds a few behaviors:

  • The fiber is created as a child of the hub.
  • Only the hub is allowed to switch to this fiber. This prevents complex interactions where any fiber can switch to any other fiber. In other words, the hub is a real hub.

class Fiber(target, args=(), kwargs={}, name=None, hub=None)

A cooperatively scheduled execution context aka green thread aka co-routine.

The target argument is the main function of the fiber. It must be a Python callable. The args and kwargs specify its arguments and keyword arguments, respectively.

The name argument specifies the fiber name. This is purely a diagnostic tool and is used e.g. in log messages.

The hub argument can be used to override the hub that will be used to schedule this fiber. This argument is used by the unit tests and should normally not be needed.

name

The fiber’s name.

alive

Whether the fiber is alive.

start()

Schedule the fiber to be started in the next iteration of the event loop.

cancel(message=None)

Schedule the fiber to be cancelled in the next iteration of the event loop.

Cancellation works by throwing a Cancelled exception into the fiber. If message is provided, it will be set as the value of the exception.

switchpoint join(timeout=None)

Wait until the fiber completes.

The only two fibers in a Gruvi program that use fibers.Fiber directly are the hub and the root fiber. All other fibers should be created as instances of Fiber.

When a fiber is created it doesn’t run yet. To switch to the fiber, call its start() method and call a function that will switch to the hub:

from gruvi import Fiber, sleep

def say_hello():
    print('Hello, there!')

fiber = Fiber(say_hello)
fiber.start()

print('Starting fiber')
sleep(0)
print('Back in root fiber')

The output of this will be:

Starting fiber
Hello, there!
Back in root fiber

Working with the event loop

To register interest in a certain event, you need to create the appropriate pyuv.Handle instance and add it to the loop. The callback to the handle should cause a switch back to the current fiber. You also want to make sure you implement a timeout on the event, and that you clean up the handle in case it times out. Because this logic can be relatively tricky to get right, Gruvi provides the switch_back context manager for this:

class switch_back(timeout=None, hub=None, lock=None)

A context manager to facilitate switching back to the current fiber.

Instances of this class are callable, and are intended to be used as the callback argument for an asynchronous operation. When called, the switchback object causes Hub.switch() to return in the origin fiber (the fiber that created the switchback object). The return value in the origin fiber will be an (args, kwargs) tuple containing positional and keyword arguments passed to the callback.

When the context manager exits, it is deactivated and its cleanup callbacks are run. If the switchback object is called after that, no switch will happen.

In the example below, a switchback object is used to wait for at most 10 seconds for a SIGHUP signal:

import signal
import pyuv
from gruvi import get_hub, switch_back

hub = get_hub()
with switch_back(timeout=10) as switcher:
    sigh = pyuv.Signal(hub.loop)
    sigh.start(switcher, signal.SIGHUP)
    switcher.add_cleanup(sigh.close)
    hub.switch()

The timeout argument can be used to force a timeout after this many seconds. It can be an int or a float. If a timeout happens, Hub.switch() will raise a Timeout exception in the origin fiber. The default is None, meaning there is no timeout.

The hub argument can be used to specify an alternate hub to use. This argument is used by the unit tests and should normally not be needed.

fiber

The origin fiber.

timeout

The Timeout exception if a timeout has occurred. Otherwise the timeout parameter provided to the constructor.

active

Whether the switchback object is currently active.

switch(value=None)

Switch back to the origin fiber. The fiber is switched in the next time the event loop runs.

throw(typ, val=None, tb=None)

Throw an exception into the origin fiber. The exception is thrown the next time the event loop runs.

add_cleanup(callback, *args)

Add a cleanup action. The callback is run with the provided positional arguments when the context manager exits.

Lockless operation and switchpoints

Functions that may cause a switch to happen are called switchpoints. In Gruvi these functions are marked by a special decorator:

switchpoint(func)

Mark func as a switchpoint.

In Gruvi, all methods and functions that call Hub.switch() directly, and all public APIs that can cause an indirect switch, are marked as switchpoints. It is recommended that you mark your own methods and functions in the same way. Example:

@switchpoint
def myfunc():
    # may call Hub.switch() here
    pass

Knowing where switches may happen is important if you need to modify global state in a non-atomic way. If you can be sure that no switchpoints are called during the modification, then you don’t need any locks. This lockless operation is one of the main benefits of green threads. Gruvi offers a context manager that can help with this:

class assert_no_switchpoints(hub=None)

Context manager that defines a block in which no switches may happen, and in which no switchpoints may be called.

Use it as follows:

with assert_no_switchpoints():
    do_something()
    do_something_else()

If the context manager detects a switch or a call into a switchpoint it raises an AssertionError.

The assert_no_switchpoints context manager should not be overused. Instead it is recommended to try to confine non-atomic changes to global state to single functions.
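
To see why the position of switches matters for shared state, consider the following toy model. It does not use Gruvi: generators stand in for fibers and a bare yield stands in for calling a switchpoint. All names are hypothetical. The first update switches between reading and writing the counter, the second does not.

```python
def unsafe_increment(state):
    """Read, switch, then write: other fibers may run in between."""
    value = state['counter']
    yield                          # stand-in for calling a switchpoint
    state['counter'] = value + 1   # writes back a possibly stale value

def safe_increment(state):
    """Read and write with no switch in between, then switch."""
    state['counter'] = state['counter'] + 1
    yield

def run_all(fibers):
    """Resume each fiber in turn until all of them have finished."""
    pending = list(fibers)
    while pending:
        still_running = []
        for fiber in pending:
            try:
                next(fiber)
            except StopIteration:
                continue
            still_running.append(fiber)
        pending = still_running

unsafe = {'counter': 0}
run_all([unsafe_increment(unsafe) for _ in range(3)])

safe = {'counter': 0}
run_all([safe_increment(safe) for _ in range(3)])

print(unsafe['counter'], safe['counter'])  # 1 3
```

In the unsafe variant all three fibers read 0 before any of them writes, so two updates are lost. In the safe variant no lock is needed because nothing else can run between the read and the write.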

Utility functions

current_fiber()

Return the current fiber.

Note: The root and hub fibers are “bare” fibers.Fiber instances. Calling this function there returns the bare instance, not a gruvi.Fiber instance.

spawn(func, *args, **kwargs)

Spawn a new fiber.

A new Fiber is created with main function func and positional arguments args. The keyword arguments are passed to the Fiber constructor, not to the main function. The fiber is then scheduled to start by calling its start() method.

The fiber instance is returned.
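
The way arguments are routed is easy to get wrong, so here is a sketch of it. ToyFiber and toy_spawn are hypothetical stand-ins that only mimic the signatures documented above. They do not schedule anything and are not Gruvi's implementation.

```python
class ToyFiber:
    """Hypothetical stand-in for Fiber that only records how it was built."""

    def __init__(self, target, args=(), kwargs=None, name=None):
        self.target = target
        self.args = args
        self.kwargs = kwargs or {}
        self.name = name
        self.started = False

    def start(self):
        self.started = True   # the real class schedules the fiber on the hub

    def run(self):
        return self.target(*self.args, **self.kwargs)

def toy_spawn(func, *args, **kwargs):
    """Positional arguments go to func, keyword arguments to the constructor."""
    fiber = ToyFiber(func, args, **kwargs)
    fiber.start()
    return fiber

def greet(who, punctuation='.'):
    return 'Hello, ' + who + punctuation

named = toy_spawn(greet, 'there', name='greeter')
excited = toy_spawn(greet, 'there', kwargs={'punctuation': '!'})
print(named.name, named.run())      # greeter Hello, there.
print(excited.name, excited.run())  # None Hello, there!
```

Under this reading of the signatures, keyword arguments for the main function have to be wrapped in the constructor's kwargs parameter. In the toy model, passing punctuation='!' directly to toy_spawn raises a TypeError, because the constructor has no such parameter.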

switchpoint sleep(secs)

Sleep for secs seconds. The secs argument can be an int or a float.

Fiber local data

class local

Fiber-local data.

To manage fiber-local data, instantiate this class and store attributes on it:

mydata = local()
mydata.x = 10

Attributes have a value or are unset independently for each fiber.
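
This description matches that of threading.local in the standard library, with fibers taking the place of threads. The sketch below shows the thread-based analogue only. It assumes, based on the sentence above, that local behaves the same way per fiber; it does not exercise Gruvi itself.

```python
import threading

mydata = threading.local()   # thread-local analogue of a fiber-local object
mydata.x = 10                # set in the main thread only

seen = {}

def worker():
    # The attribute set by the main thread is unset in this thread.
    seen['before'] = hasattr(mydata, 'x')
    mydata.x = 20            # affects this thread's value only
    seen['inside'] = mydata.x

thread = threading.Thread(target=worker)
thread.start()
thread.join()

print(seen, mydata.x)  # {'before': False, 'inside': 20} 10
```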

Hub reference

class Hub

The central fiber scheduler and event loop manager.

ignore_interrupt = False

By default the hub will raise a KeyboardInterrupt in the root fiber when a SIGINT (CTRL-C) is received. Set this to True to ignore SIGINT instead.

name

The name of the Hub, which is 'Hub'.

loop

The event loop used by this hub instance. This is an instance of pyuv.Loop.

data

A per-hub dictionary that can be used by applications to store data.

Keys starting with 'gruvi:' are reserved for internal use.

poll

A centrally managed poller that can be used to install callbacks for file descriptor readiness events.

switchpoint close()

Close the hub and wait for it to be closed.

This may only be called in the root fiber. After this call returns, Gruvi cannot be used anymore in the current thread. The main use case for calling this method is to clean up resources in a multi-threaded program where you want to exit a thread but not yet the entire process.

switch()

Switch to the hub.

This method pauses the current fiber and runs the event loop. The caller should ensure that it has set up appropriate callbacks so that it will get scheduled again, preferably using switch_back. In this case the return value of this method will be an (args, kwargs) tuple containing the arguments passed to the switch_back instance.

If this method is called from the root fiber then there are two additional cases. If the hub exited due to a call to close(), then this method returns None. If the hub exited due to an exception, that exception is re-raised here.

run_callback(callback, *args)

Queue a callback.

The callback will be called with positional arguments args in the next iteration of the event loop. If you add multiple callbacks, they will be called in the order that you added them. The callback will run in the Hub’s fiber.

This method is thread-safe: it is allowed to queue a callback from a different thread than the one running the Hub.
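
The two guarantees described here, first-in first-out ordering and safe queuing from another thread, can be modelled with a lock-protected queue. This is one common way to get those guarantees and is not taken from Gruvi's source. ToyCallbackQueue and run_pending are hypothetical names, and the sketch omits waking up a sleeping event loop, which a real implementation also needs.

```python
import threading
from collections import deque

class ToyCallbackQueue:
    """Toy model of a thread-safe, ordered callback queue."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = deque()

    def run_callback(self, callback, *args):
        """Queue a callback; may be called from any thread."""
        with self._lock:
            self._pending.append((callback, args))

    def run_pending(self):
        """Run queued callbacks in the order they were added."""
        with self._lock:
            batch, self._pending = self._pending, deque()
        for callback, args in batch:   # run outside the lock
            callback(*args)

queue = ToyCallbackQueue()
calls = []

def produce():
    for i in range(3):
        queue.run_callback(calls.append, i)

producer = threading.Thread(target=produce)
producer.start()
producer.join()

queue.run_pending()   # stands in for the next loop iteration
print(calls)          # [0, 1, 2]
```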

Mixing threads and fibers

There are two common situations where you might want to mix threads and fibers:

  • When running CPU intensive code. In this case, you should run the code in the CPU thread pool.
  • When running third party code that performs blocking IO. In this case, run the code in the IO thread pool.

In both cases, running the code in a thread pool allows the hub to continue servicing IO for fibers. All other cases of mixing threads and fibers are generally a bad idea.
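
The general pattern of handing blocking work to a pool can be sketched with the standard library. Note that this uses concurrent.futures.ThreadPoolExecutor, not Gruvi's own thread pools, and blocking_call is a hypothetical stand-in for third party code. How a fiber should wait for the result is not covered by this sketch.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def blocking_call(seconds):
    """Stand-in for third-party code that blocks the calling thread."""
    time.sleep(seconds)
    return seconds

with ThreadPoolExecutor(max_workers=2) as pool:
    # Both calls run in pool threads, so they overlap with each other
    # and the submitting thread stays free until it asks for a result.
    futures = [pool.submit(blocking_call, 0.2) for _ in range(2)]
    results = [future.result() for future in futures]

print(results)  # [0.2, 0.2]
```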

The following code in Gruvi is thread safe:

All other code is not thread safe. If, for whatever reason, you must access this code from multiple threads, use locks to mediate access.