Fibers and the Hub
Fibers are lightweight execution contexts that are cooperatively scheduled. They form the basis for concurrency in Gruvi. Comparing fibers to threads:
- Like threads, fibers represent an independent flow of control in a program.
- Fibers are more lightweight than threads, meaning you can run more of them.
- Unlike threads, there can only ever be one active fiber (per thread, see below). This means that fibers give you concurrency but not parallelism.
- Unlike threads, fibers are cooperatively scheduled. They are never preempted and have to yield control to other fibers explicitly (called switching).
Gruvi uses the python-fibers package as its fiber library.
Fibers logically exist within threads. Each fiber has its own Python and C-level stack. When a Python program starts, there will be one thread called 'Main', and this thread will have one fiber called 'Root'. Both are created automatically; the main thread by Python and the root fiber by the python-fibers package.
Fibers are organized in a tree. Each fiber except the root fiber has a parent. When a fiber exits, either because its main function exits or because of an uncaught exception, control is passed to its parent. If the root fiber exits, the thread will exit.
The hub and fiber scheduling
The Gruvi framework maintains a special fiber called the Hub. The hub runs an event loop, and acts as a central fiber scheduler. The event loop in Gruvi is provided by libuv through the pyuv bindings.
At a high level, the flow of a Gruvi program is as follows:
- Run the current fiber (initially the root fiber).
- When the fiber needs to wait for something, for example network data, the event to wait for is registered with the event loop, together with a switch-back callback.
- The fiber switches to the hub, which runs the event loop.
- When the event loop detects an event for which a fiber registered interest, it calls the callback. This causes a switch back to the fiber that registered the event.
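The flow above can be sketched with a toy model. This is not Gruvi's implementation: it uses plain Python generators in place of real fibers, and the names ToyHub, fake_event, worker and demo are hypothetical, introduced only for illustration. It shows a "fiber" registering a switch-back callback, yielding control to the hub, and being resumed when the callback fires.

```python
from collections import deque


class ToyHub:
    """Toy stand-in for a hub: runs queued callbacks in FIFO order."""

    def __init__(self):
        self._callbacks = deque()

    def run_callback(self, callback, *args):
        # Queue a callback for a later iteration of the loop.
        self._callbacks.append((callback, args))

    def start(self, gen):
        # Schedule a generator-based "fiber" to take its first step.
        self.run_callback(self._step, gen)

    def _step(self, gen, value=None):
        try:
            # Resume the fiber; it runs until its next `yield`.
            register = gen.send(value)
        except StopIteration:
            return  # the fiber's main function returned
        # The fiber yielded a function that registers the event it waits
        # for. Hand it a switch-back callback that resumes this fiber.
        register(lambda result=None: self.run_callback(self._step, gen, result))

    def run(self):
        # The "event loop": run callbacks until nothing is pending.
        while self._callbacks:
            callback, args = self._callbacks.popleft()
            callback(*args)


def demo():
    hub = ToyHub()
    log = []

    def fake_event(payload):
        # Pretend event source that "fires" on a later loop iteration.
        return lambda switch_back: hub.run_callback(switch_back, payload)

    def worker(name):
        log.append(name + ' waiting')
        data = yield fake_event(name + ' data')  # "switch to the hub"
        log.append(name + ' got ' + data)

    hub.start(worker('a'))
    hub.start(worker('b'))
    hub.run()
    return log
```

Calling demo() returns ['a waiting', 'b waiting', 'a got a data', 'b got b data']: both workers reach their wait point before either one is resumed, which is the interleaving that cooperative scheduling through a central hub produces.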
The hub instance is automatically created when it is first needed. It can be retrieved using the following function:
get_hub()
    Return the instance of the hub.
The event loop is available as the hub’s .loop property.
Creating fibers and switching
To create a new fiber, instantiate the Fiber class and pass it a main function. The Fiber class is a thin wrapper on top of fibers.Fiber, to which it adds a few behaviors:
- The fiber is created as a child of the hub.
- Only the hub is allowed to switch to this fiber. This prevents complex interactions where any fiber can switch to any other fiber. In other words, the hub is a real hub.
class Fiber(target, args=(), kwargs={}, name=None, hub=None)
    A cooperatively scheduled execution context, also known as a green thread or coroutine.

    The target argument is the main function of the fiber. It must be a Python callable. The args and kwargs specify its arguments and keyword arguments, respectively.

    The name argument specifies the fiber name. This is purely a diagnostic tool and is used, for example, in log messages.

    The hub argument can be used to override the hub that will be used to schedule this fiber. This argument is used by the unit tests and should not normally be needed.

    name
        The fiber’s name.

    alive
        Whether the fiber is alive.

    start()
        Schedule the fiber to be started in the next iteration of the event loop.

    cancel(message=None)
        Schedule the fiber to be cancelled in the next iteration of the event loop.

        Cancellation works by throwing a Cancelled exception into the fiber. If message is provided, it will be set as the value of the exception.

    @switchpoint
    join(timeout=None)
        Wait until the fiber completes.
The only two fibers in a Gruvi program that use fibers.Fiber directly are the hub and the root fiber. All other fibers should be created as instances of Fiber.
When a fiber is created it doesn’t run yet. To switch to the fiber, call its start() method and then call a function that will switch to the hub:
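    from gruvi import Fiber, sleep

    def say_hello():
        print('Hello, there!')

    fiber = Fiber(say_hello)
    fiber.start()            # schedule the fiber; it does not run yet
    print('Starting fiber')
    sleep(0)                 # switch to the hub, which runs the fiber
    print('Back in root fiber')
The output of this will be:
    Starting fiber
    Hello, there!
    Back in root fiber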
Working with the event loop
To register interest in a certain event, you need to create the appropriate pyuv.Handle instance and add it to the loop. The callback to the handle should cause a switch back to the current fiber. You also want to make sure you implement a timeout on the event, and that you clean up the handle in case it times out. Because this logic can be relatively tricky to get right, Gruvi provides the switch_back context manager for this:
class switch_back(timeout=None, hub=None, lock=None)
    A context manager to facilitate switching back to the current fiber.

    Instances of this class are callable, and are intended to be used as the callback argument for an asynchronous operation. When called, the switchback object causes Hub.switch() to return in the origin fiber (the fiber that created the switchback object). The return value in the origin fiber will be an (args, kwargs) tuple containing the positional and keyword arguments passed to the callback.

    When the context manager exits it will be deactivated. If it is called after that, no switch will happen. The cleanup callbacks are also run when the context manager exits.

    In the example below, a switchback object is used to wait for at most 10 seconds for a SIGHUP signal:

        import signal
        import pyuv
        from gruvi import get_hub, switch_back

        hub = get_hub()
        with switch_back(timeout=10) as switcher:
            sigh = pyuv.Signal(hub.loop)
            sigh.start(switcher, signal.SIGHUP)
            switcher.add_cleanup(sigh.close)
            hub.switch()

    The timeout argument can be used to force a timeout after this many seconds. It can be an int or a float. If a timeout happens, Hub.switch() will raise a Timeout exception in the origin fiber. The default is None, meaning there is no timeout.

    The hub argument can be used to specify an alternate hub to use. This argument is used by the unit tests and should normally not be needed.

    fiber
        The origin fiber.

    timeout
        The Timeout exception if a timeout has occurred. Otherwise the timeout parameter provided to the constructor.

    active
        Whether the switchback object is currently active.

    switch(value=None)
        Switch back to the origin fiber. The fiber is switched in the next time the event loop runs.

    throw(typ, val=None, tb=None)
        Throw an exception into the origin fiber. The exception is thrown the next time the event loop runs.

    add_cleanup(callback, *args)
        Add a cleanup action. The callback is run with the provided positional arguments when the context manager exits.
Lockless operation and switchpoints
Functions that may cause a switch to happen are called switchpoints. In Gruvi these functions are marked by a special decorator:
switchpoint(func)
    Mark func as a switchpoint.

    In Gruvi, all methods and functions that call Hub.switch() directly, and all public APIs that can cause an indirect switch, are marked as a switchpoint. It is recommended that you mark your own methods and functions in the same way. Example:

        @switchpoint
        def myfunc():
            # may call Hub.switch() here
            pass

Knowing where switches may happen is important if you need to modify global state in a non-atomic way. If you can be sure that no switchpoints are called during the modification, then you don’t need any locks. This lockless operation is one of the main benefits of green threads. Gruvi offers a context manager that can help with this:
class assert_no_switchpoints(hub=None)
    Context manager that defines a block in which no switches may happen, and in which no switchpoints may be called.

    Use it as follows:

        with assert_no_switchpoints():
            do_something()
            do_something_else()

    If the context manager detects a switch or a call into a switchpoint it raises an AssertionError.

The assert_no_switchpoints context manager should not be overused. Instead it is recommended to try and confine non-atomic changes to global state to single functions.
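To make the relationship between the decorator and the context manager concrete, here is a minimal standalone sketch. It is not Gruvi's implementation and does not import Gruvi: the names toy_switchpoint and toy_assert_no_switchpoints are hypothetical, the state is a plain module-level counter, and the sketch only detects calls into marked functions, not actual fiber switches.

```python
import functools
from contextlib import contextmanager

# Greater than zero while inside a toy_assert_no_switchpoints block.
_no_switch_depth = 0


def toy_switchpoint(func):
    """Mark func as a switchpoint and guard every call to it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if _no_switch_depth:
            raise AssertionError(
                'switchpoint {}() called in a no-switchpoints block'
                .format(func.__name__))
        return func(*args, **kwargs)
    wrapper.switchpoint = True  # lets tools discover marked functions
    return wrapper


@contextmanager
def toy_assert_no_switchpoints():
    global _no_switch_depth
    _no_switch_depth += 1
    try:
        yield
    finally:
        _no_switch_depth -= 1


@toy_switchpoint
def wait_for_data():
    # Real code would switch to the hub here and wait for an event.
    return 'data'


counter = {'value': 0}


def increment():
    # Read-modify-write with no switchpoint in between: no lock needed.
    with toy_assert_no_switchpoints():
        current = counter['value']
        counter['value'] = current + 1


def broken_increment():
    # A switchpoint between the read and the write makes the update
    # non-atomic; the guard raises AssertionError before the write.
    with toy_assert_no_switchpoints():
        current = counter['value']
        wait_for_data()
        counter['value'] = current + 1
```

In this sketch increment() completes normally, while broken_increment() raises AssertionError and leaves the counter unchanged, because another fiber could otherwise have modified it during the wait.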
Utility functions
current_fiber()
    Return the current fiber.

    Note: The root and hub fiber are “bare” fibers.Fiber instances. Calling this method there returns the bare instance, not a gruvi.Fiber instance.

spawn(func, *args, **kwargs)
    Spawn a new fiber.

    A new Fiber is created with main function func and positional arguments args. The keyword arguments are passed to the Fiber constructor, not to the main function. The fiber is then scheduled to start by calling its start() method.

    The fiber instance is returned.

@switchpoint
sleep(secs)
    Sleep for secs seconds. The secs argument can be an int or a float.
Fiber local data
class local
    Fiber-local data.

    To manage fiber-local data, instantiate this class and store attributes on it:

        mydata = local()
        mydata.x = 10

    Attributes have a value or are unset independently for each fiber.
Hub reference
class Hub
    The central fiber scheduler and event loop manager.

    loop
        The event loop used by this hub instance. This is an instance of pyuv.Loop.

    data
        A per-hub dictionary that can be used by applications to store data.

        Keys starting with 'gruvi:' are reserved for internal use.

    poll
        A centrally managed poller that can be used to install callbacks for file descriptor readiness events.

    close()
        Close the hub.

        This sets a flag that will cause the event loop to exit when it next runs. The hub fiber will then exit and control is transferred back to the root fiber.

    switch()
        Switch to the hub.

        This method pauses the current fiber and runs the event loop. The caller should ensure that it has set up appropriate callbacks so that it will get scheduled again, preferably using switch_back. In that case the return value of this method will be an (args, kwargs) tuple containing the arguments passed to the switch back instance.

        If this method is called from the root fiber then there are two additional cases. If the hub exited due to a call to close(), then this method returns None. If the hub exited due to an exception, that exception is re-raised here.

    run_callback(callback, *args)
        Queue a callback.

        The callback will be called with positional arguments args in the next iteration of the event loop. If you add multiple callbacks, they will be called in the order that you added them. The callback will run in the Hub’s fiber.

        This method is thread-safe: it is allowed to queue a callback from a different thread than the one running the Hub.
Mixing threads and fibers
There are two common situations where you might want to mix threads and fibers:
- When running CPU intensive code. In this case, you should run the code in the CPU thread pool.
- When running third party code that performs blocking IO. In this case, run the code in the IO thread pool.
In both cases, running the code in a thread pool allows the hub to continue servicing IO for fibers. All other cases of mixing threads and fibers are generally a bad idea.
The following code in Gruvi is thread safe:
- The Hub.run_callback() method.
- All synchronization primitives (see Synchronization primitives).
All other code is not thread safe. If, for whatever reason, you must access this code from multiple threads, use locks to mediate access.
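The pattern described in this section, offloading blocking work to a thread pool and handing the result back through a thread-safe callback queue, can be sketched with the standard library alone. This is an analogy, not Gruvi code: ToyLoop, blocking_work and demo are hypothetical names, and concurrent.futures.ThreadPoolExecutor stands in for Gruvi's own thread pools.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


class ToyLoop:
    """Toy single-threaded loop with a thread-safe callback queue."""

    def __init__(self):
        # queue.Queue is safe to use from multiple threads.
        self._callbacks = queue.Queue()

    def run_callback(self, callback, *args):
        # May be called from any thread; the callback itself will run
        # in the thread that calls run_until().
        self._callbacks.put((callback, args))

    def run_until(self, done):
        # Run queued callbacks in this thread until done() is true.
        while not done():
            callback, args = self._callbacks.get()  # blocks until work arrives
            callback(*args)


def blocking_work(n):
    # Stand-in for CPU-heavy or blocking third-party code.
    return sum(range(n))


def demo():
    loop = ToyLoop()
    results = []  # only ever touched from the loop thread
    with ThreadPoolExecutor(max_workers=2) as pool:
        for n in (10, 100):
            future = pool.submit(blocking_work, n)
            # The done-callback may run in a worker thread, so it only
            # queues work for the loop thread instead of touching results.
            future.add_done_callback(
                lambda f: loop.run_callback(results.append, f.result()))
        loop.run_until(lambda: len(results) == 2)
    return sorted(results)
```

Here demo() returns [45, 4950]. The design point is that worker threads never modify shared state directly; they only enqueue a callback, so all state changes happen in the single thread that runs the loop and no further locking is required.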