Weak pointers in Seastar

17 November 2020

In a previous post I discussed Seastar’s shared pointer type for managing a shared resource using reference counting. In this post I am going to discuss another smart pointer type provided by Seastar called a weak pointer. A weak pointer is similar to a shared pointer, but the twist is that the pointer does not have any influence over the lifetime of the resource it references, and as a result, may be invalidated at any moment!

A classic use case of weak pointers is managing references that point into a shared cache. For example, consider a global pool of objects managed according to some expiration policy. This might be something like a page cache with a least recently used eviction policy. In this scenario many independent sub-systems may each hold references to cached objects. What should happen when memory pressure necessitates evicting objects from the cache, and in particular, how is access synchronized between the cache and the various sub-systems holding references to objects that may be evicted? Weak pointers can be used to build a solution to this problem.

It’s tempting to reach for a shared pointer, which would ensure that referenced objects stay alive, but then some communication mechanism would be needed to tell sub-systems to release their references when memory pressure rises. Weak pointers avoid the need for such an out-of-band mechanism: they track references internally and expose simple semantics that provide the required synchronization. Continuing with the caching example, assume each sub-system holds a weak pointer. When an object is evicted from the cache its memory is freed immediately, and sub-systems are required to check the validity of each weak pointer into the cache before dereferencing it. An invalid pointer can then be handled in various ways, such as by performing a read-through cache operation to reload the object.

You may be familiar with std::weak_ptr<T>, which works in conjunction with std::shared_ptr<T>. As we’ll see shortly, Seastar’s weak pointer offers the same basic functionality. But unlike the implementation in the standard library, seastar::weak_ptr<T> can be used without any other smart pointer such as seastar::shared_ptr<T>, making it attractive in situations where storing objects on the heap would introduce unnecessary overhead or otherwise be inconvenient.
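
To make the caching scenario concrete, here is a minimal sketch written against the standard library pair mentioned above rather than Seastar's type. Everything in it (Page, PageCache, get, evict, read_page) is a hypothetical name invented for illustration, and the "load" step simply fabricates a string.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical cached object.
struct Page {
    std::string contents;
};

// Hypothetical cache: it holds the only long-lived owning reference.
class PageCache {
    std::unordered_map<int, std::shared_ptr<Page>> _pages;

public:
    // Read-through lookup: load the page if it is absent, then hand
    // out a weak (non-owning) reference to it.
    std::weak_ptr<Page> get(int id) {
        auto it = _pages.find(id);
        if (it == _pages.end()) {
            auto page = std::make_shared<Page>(Page{"page " + std::to_string(id)});
            it = _pages.emplace(id, std::move(page)).first;
        }
        return it->second;
    }

    // Drop the cache's owning reference. The page is destroyed unless
    // a caller happens to hold a locked shared_ptr at that moment.
    void evict(int id) { _pages.erase(id); }
};

// A sub-system checks its reference before use and refreshes it from
// the cache when the page has been evicted.
std::string read_page(PageCache& cache, int id, std::weak_ptr<Page>& ref) {
    auto page = ref.lock();
    if (!page) {
        ref = cache.get(id);
        page = ref.lock();
    }
    return page->contents;
}
```

A sub-system keeps the std::weak_ptr<Page> returned by get() and passes it to read_page() on each access. After evict() the stored reference reports expired(), and the next read_page() call reloads the page and refreshes the reference.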

Usage

The first step in using Seastar weak pointers to manage references to a type Data is to have Data inherit from seastar::weakly_referencable<Data>, which injects functionality into the class for tracking back references.

class Data : public seastar::weakly_referencable<Data> {
 public:
  void foo();
  // ...
};

We’ll take a look shortly at what is injected into the class, but from a practical standpoint, Data now exposes a single new method called weak_from_this() which returns a seastar::weak_ptr<Data>. Here is how you might build an instance of Data and acquire weak pointers to the instance.

auto d = std::make_unique<Data>(...);
auto w0 = d->weak_from_this();
auto w1 = d->weak_from_this();

The weak pointers w0 and w1 can be used just like the unique pointer d, so w0->foo() will behave as expected. However, safe usage requires that a weak pointer’s validity be checked with operator bool() before the pointer is dereferenced. In the example below, when d is reset and the Data instance is destroyed, the references maintained behind the scenes are used to invalidate w0 and w1 without explicit synchronization. Once a weak pointer is found to be invalid it can be discarded, but the way you handle an invalid pointer is left to your application.

assert(w0);
assert(w1);

d = nullptr; // invalidates w0 and w1

assert(!w0);
assert(!w1);

That covers the essentials of how to use Seastar weak pointers. One doesn’t often need to reach for a weak pointer, but when they are needed, it’s very nice to not have to invent a one-off solution.

Internals

As you might have guessed, this functionality is implemented by tracking all the weak pointers that have been created so that they can be invalidated. The illustration below shows the data structures in the seastar::weakly_referencable<T> base class used to track each weak pointer.

[Illustration: a weakly_referencable<T> object and its list of back refs, one entry per weak pointer]

When inheriting from seastar::weakly_referencable<T> a few important bits are injected: a Boost intrusive list for tracking weak pointers (shown above as back refs), a factory method (weak_from_this()) for building new weak pointers, and a destructor that invalidates all active weak pointers. Each of these three elements can be seen below in the simplified snippet showing the base class implementation.

template<typename T>
class weakly_referencable {
    // holds references to all weak pointers
    boost::intrusive::list<weak_ptr<T>, ...> _ptr_list;

public:
    // build new weak pointer
    weak_ptr<T> weak_from_this() {
        weak_ptr<T> ptr(static_cast<T*>(this));
        _ptr_list.push_back(ptr);
        return ptr;
    }

    // invalidate all weak pointer references
    ~weakly_referencable() {
        _ptr_list.clear_and_dispose([] (weak_ptr<T>* wp) {
            wp->_ptr = nullptr;
        });
    }
};

Below is a simplified version of weak_ptr<T> itself. I’ve removed all the boilerplate to highlight only the essentials. It has an intrusive list hook, _hook, which is used to attach the weak pointer to the referenced object, and a private pointer, T* _ptr, which is used by operator->() to proxy access to the managed object. Notice above that the weakly_referencable<T> destructor sets this pointer to nullptr to invalidate each reference, and the weak pointer itself compares it against nullptr to check for validity in operator bool().

template<typename T>
class weak_ptr {
    intrusive_hook_type _hook;
    T* _ptr = nullptr;

public:
    explicit operator bool() const { return _ptr != nullptr; }
    T* operator->() const noexcept { return _ptr; }
};
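
To see the two pieces working together without pulling in Seastar or Boost, here is a toy re-implementation of the same idea. It is a sketch under my own assumptions, not Seastar's code: the toy namespace, the link struct, and the hand-rolled doubly linked list are stand-ins for the Boost intrusive list, and copying of weak pointers is left out for brevity (only moves are supported).

```cpp
#include <cassert>
#include <memory>
#include <utility>

namespace toy {

template <typename T> class weakly_referencable;

// Hand-rolled list node standing in for the Boost intrusive hook.
// A node that points at itself is not on any list.
struct link {
    link* prev = this;
    link* next = this;

    bool linked() const { return next != this; }
    void insert_after(link& pos) {
        prev = &pos;
        next = pos.next;
        pos.next->prev = this;
        pos.next = this;
    }
    void unlink() {
        prev->next = next;
        next->prev = prev;
        prev = next = this;
    }
};

template <typename T>
class weak_ptr : private link {
    friend class weakly_referencable<T>;
    T* _ptr = nullptr;

    // Only weakly_referencable<T> may create a valid weak pointer.
    weak_ptr(T* p, link& head) : _ptr(p) { insert_after(head); }

public:
    weak_ptr() = default;
    // Moving takes over the source's slot in the back-reference list.
    weak_ptr(weak_ptr&& o) noexcept : _ptr(o._ptr) {
        if (o.linked()) {
            insert_after(o);
            o.unlink();
        }
        o._ptr = nullptr;
    }
    ~weak_ptr() { unlink(); }  // harmless if not on a list

    explicit operator bool() const { return _ptr != nullptr; }
    T* operator->() const noexcept { return _ptr; }
};

template <typename T>
class weakly_referencable {
    link _head;  // sentinel of the back-reference list

public:
    weakly_referencable() = default;
    weakly_referencable(const weakly_referencable&) = delete;
    weakly_referencable& operator=(const weakly_referencable&) = delete;

    // Build a new weak pointer and register it on the list.
    weak_ptr<T> weak_from_this() {
        return weak_ptr<T>(static_cast<T*>(this), _head);
    }

    // Invalidate every weak pointer that is still registered.
    ~weakly_referencable() {
        while (_head.linked()) {
            auto* wp = static_cast<weak_ptr<T>*>(_head.next);
            wp->unlink();
            wp->_ptr = nullptr;
        }
    }
};

// Hypothetical payload type used to exercise the toy classes.
struct Data : weakly_referencable<Data> {
    int value = 42;
};

}  // namespace toy
```

With these definitions, the usage from earlier carries over: create a toy::Data with std::make_unique, take weak pointers with weak_from_this(), and they all test false once the owner is reset. Note that there is no heap allocation or reference count anywhere in the tracking machinery; each weak pointer is itself the list node.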

Comparison with std::weak_ptr<T>

Finally, I think that it is interesting to compare Seastar’s weak pointer to the weak pointer found in the standard library, which only works in conjunction with std::shared_ptr<T>. The primary differences arise around considerations about concurrency control. For example, the weak pointers from both Seastar and the standard library are intended to be checked for validity before dereferencing. But what prevents the pointer from being invalidated immediately after being checked?

This fundamental issue is addressed differently in the two cases, driven by different assumptions about threading. Seastar performs no explicit synchronization and contains no locking in its weak pointer implementation. Like Seastar’s shared pointer, the design assumes a single thread of execution, so nothing else can run between the check and the dereference unless the code itself yields control in between, which makes managing concurrency trivial. The standard library instead requires calling lock() on the weak pointer, which atomically attempts to create a new std::shared_ptr<T> (whose reference counting is already thread-safe). All access to the object then goes through that shared pointer, which keeps the object alive for as long as it is held. This all adds overhead that can be avoided in Seastar.
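
For comparison, here is a small sketch of the check-then-use pattern with the standard library. Data and read_or are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical payload type.
struct Data {
    int value = 7;
};

// lock() returns either an owning shared_ptr or an empty one, so the
// validity check and the acquisition of ownership happen together.
int read_or(const std::weak_ptr<Data>& w, int fallback) {
    if (auto pinned = w.lock()) {
        // While `pinned` is in scope the object cannot be destroyed,
        // even if every other owner releases it concurrently.
        return pinned->value;
    }
    return fallback;
}
```

The price of this guarantee is that a reader can extend the object's lifetime: if the last owner resets its std::shared_ptr<Data> while a locked pointer is still held, the object is destroyed only once that locked pointer goes away. Seastar's weak pointer never extends the lifetime of the object it refers to.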