Seastar's temporary buffer

23 August 2020

In the systems software space it’s fairly common to see projects develop their own abstractions for working with memory. Some examples include Ceph’s ceph::bufferlist and Folly’s folly::IOBuf. Such abstractions are custom-built because the standard library provides no generic mechanism to deal with the various ways in which memory moves around in high-performance applications. Although they are application-specific, these memory management utilities are all built up from the same fundamental abstraction: a contiguous region of memory.

Seastar provides a variety of memory management abstractions, such as the memory workhorse seastar::temporary_buffer<T>, a primitive object used to build up higher level memory management abstractions. In the rest of this post we’ll examine how this utility works and when it should be used. Seastar requires that it be specialized on a 1-byte char type, so for brevity, we’ll use temporary_buffer (or buffer for short) in this post as an alias for seastar::temporary_buffer<char>.

Here’s the gist of temporary_buffer: it represents a contiguous region of memory, and can own the memory it manages, or share ownership with another buffer. Many Seastar interfaces that deal with memory pass around temporary_buffer instances. It’s common to use a temporary_buffer in cases where one might reach for a unique_ptr<char[]> or shared_ptr<char[]>, and it easily handles more complex scenarios like generating many different views (think string_view) of a larger memory region that all share ownership.

Constructing

A new temporary_buffer is created using the temporary_buffer(size_t) constructor. The resulting buffer has capacity for the specified size, its memory is uninitialized (for performance), and the underlying memory allocation will be released as soon as the instance is destroyed.

namespace ss = seastar;

{
  auto buf = ss::temporary_buffer<char>(1024);
} // 1K of memory is freed

Once we have a buffer, we can write some data into it by first using get_write() to obtain a non-const pointer to the underlying memory buffer, and then modifying the memory directly.

auto buf = ss::temporary_buffer<char>(1024);
char *p = buf.get_write();
std::memset(p, 0, buf.size());

This common pattern of creating a buffer and immediately filling it with data is simplified through the use of the temporary_buffer(const char* src, size_t size) constructor. Internally this constructor builds an uninitialized buffer of the requested size, and then copies size bytes into it from the specified source pointer.

temporary_buffer(const CharType* src, size_t size)
  : temporary_buffer(size)
{
  std::copy_n(src, size, _buffer); // _buffer is a pointer to the backing memory
}

Under the hood a temporary_buffer is a lightweight object containing only three members: a pointer to a memory region, the size of the region, and an ss::deleter to manage the lifetime of the allocated memory.

template <typename CharType>
class temporary_buffer {
  CharType* _buffer;
  size_t _size;
  deleter _deleter;
  ...
};

Let’s look closer at how a temporary_buffer is created. Below is the constructor used to build an uninitialized instance. A raw memory allocation is made, and the resulting pointer and the size are saved as class members. Finally, the _deleter member is initialized with a deleter that manages the newly allocated raw memory.

explicit temporary_buffer(size_t size)
    : _buffer(static_cast<CharType*>(malloc(size * sizeof(CharType))))
    , _size(size)
    , _deleter(make_free_deleter(_buffer)) {
    if (size && !_buffer) {
        throw std::bad_alloc();
    }
}

Note that temporary_buffer has a default destructor: it doesn’t explicitly free the memory allocated in the constructor. Instead, the temporary_buffer relies entirely on the Seastar deleter utility to manage the lifetime of the underlying memory. In the case of the constructor above, the deleter will free the allocated memory when its own destructor is called, which happens when the buffer that holds it is destroyed.
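
The idea of delegating cleanup to a member can be sketched in standard C++ without Seastar. In the minimal model below, owned_buffer, free_deleter, and fill_and_drop are hypothetical names, and a std::unique_ptr with a custom deleter stands in for ss::deleter. It is not Seastar’s implementation, only an illustration of why no hand-written destructor is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical deleter: releases memory obtained from std::malloc.
struct free_deleter {
    void operator()(void* p) const { std::free(p); }
};

// Hypothetical stand-in for temporary_buffer<char>; not Seastar's code.
class owned_buffer {
    std::unique_ptr<char, free_deleter> _mem; // plays the role of the deleter member
    std::size_t _size;
public:
    explicit owned_buffer(std::size_t size)
        : _mem(static_cast<char*>(std::malloc(size)))
        , _size(size) {
        if (size && !_mem) {
            throw std::bad_alloc();
        }
    }
    // No user-written destructor: destroying _mem calls free_deleter.
    char* get_write() { return _mem.get(); }
    std::size_t size() const { return _size; }
};

// Allocates a buffer, fills it, and reports the size that was filled.
inline std::size_t fill_and_drop(std::size_t n) {
    owned_buffer buf(n);
    for (std::size_t i = 0; i < buf.size(); ++i) {
        buf.get_write()[i] = 'x';
    }
    return buf.size();
} // buf goes out of scope here and its member frees the allocation
```

The design choice is the same one described above: the class that exposes the memory does not know how to release it, which leaves room for swapping in a different release strategy, such as reference counting.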

Recall from the previous discussion about the seastar::deleter utility that it provides reference counting and other features for sharing resources. Next we’ll see how temporary_buffer uses the deleter utility to provide interfaces for efficiently interacting with memory.

Sharing

We can see how the deleter is used to implement more advanced scenarios by examining the process of sharing a temporary_buffer. In the snippet below buf2 is a temporary_buffer created by sharing buf1. Both of these buffers point to the same underlying memory so that changes to one are visible through the other. Importantly, no data is copied.

{
  auto buf1 = ss::temporary_buffer<char>(1024);
  buf1.get_write()[0] = 'a';

  {
    auto buf2 = buf1.share();
    buf2.get_write()[0] = 'b';
  } // buf2 is destroyed

  assert(buf1.get_write()[0] == 'b');
} // memory is released

A shared temporary_buffer holds its own reference to the underlying memory, independent of the buffer it was shared from. For example, in the next snippet a buffer is obtained by sharing from a temporary. The temporary is destroyed at the end of the statement, but buf holds a reference that keeps the underlying memory alive, and buf may even be used to create additional shared buffers.

{
  auto buf = ss::temporary_buffer<char>(1024).share();
  buf.get_write()[0] = 'a';
} // memory is released

Like most things in Seastar, a temporary_buffer is not safe to share across cores. For a deeper explanation of this topic, see the previous posts on how the seastar::deleter and seastar::shared_ptr utilities implement reference counting.
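
The reference counting behind sharing can be observed with a small model built on std::shared_ptr. The names shared_buf, owners, counts, and share_demo are hypothetical and exist only for this sketch. Note also that std::shared_ptr’s counter is thread-safe, so the model does not reflect the core-local restriction mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical model of a shareable buffer; not Seastar's implementation.
struct shared_buf {
    std::shared_ptr<char> mem; // reference-counted owner of the bytes
    std::size_t size;

    explicit shared_buf(std::size_t n)
        : mem(new char[n](), std::default_delete<char[]>()), size(n) {}

    // Copies the handle, not the bytes.
    shared_buf share() const { return *this; }
    char* get_write() { return mem.get(); }
    long owners() const { return mem.use_count(); }
};

struct counts {
    long during; // owners while the shared buffer is alive
    long after;  // owners once it has been destroyed
    char first;  // first byte as seen through the original buffer
};

inline counts share_demo() {
    shared_buf buf1(1024);
    buf1.get_write()[0] = 'a';
    counts c{};
    {
        shared_buf buf2 = buf1.share();
        buf2.get_write()[0] = 'b'; // visible through buf1
        c.during = buf1.owners();
    } // buf2 is destroyed, the count drops
    c.after = buf1.owners();
    c.first = buf1.get_write()[0];
    return c;
}
```

In this model share_demo() reports two owners while both buffers are alive and one afterwards, and a buffer created by sharing from a temporary ends up as the sole owner.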

Views

The share interface described above also accepts an offset and length so that the new buffer provides a view into the original buffer. Since sharing involves no data copying, this is useful in high-performance situations such as chopping up a large buffer that arrives over the network along application-specific boundaries, such as individual messages.

auto buf = ss::temporary_buffer<char>(1024);
auto view1 = buf.share(0, 20);  // offset 0, length 20: bytes [0, 20)
auto view2 = buf.share(20, 40); // offset 20, length 40: bytes [20, 60)
...
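
To make the message-splitting use case concrete, here is a sketch in standard C++. It assumes a made-up wire format in which each message is preceded by a 1-byte length. The view_buf type, make_buf, and split_messages are hypothetical stand-ins, and only the (offset, length) form of share mirrors the interface described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <vector>

// Hypothetical model of a shared view: an owner handle plus a window into it.
struct view_buf {
    std::shared_ptr<char> mem;
    std::size_t off;
    std::size_t len;

    const char* get() const { return mem.get() + off; }
    std::size_t size() const { return len; }
    // New view of n bytes starting pos bytes into this one; no bytes are copied.
    view_buf share(std::size_t pos, std::size_t n) const {
        return view_buf{mem, off + pos, n};
    }
};

// Builds a buffer holding a copy of the given bytes.
inline view_buf make_buf(const std::string& bytes) {
    std::shared_ptr<char> mem(new char[bytes.size()], std::default_delete<char[]>());
    std::memcpy(mem.get(), bytes.data(), bytes.size());
    return view_buf{mem, 0, bytes.size()};
}

// Splits buf into one view per message (assumed framing: 1-byte length prefix).
inline std::vector<view_buf> split_messages(const view_buf& buf) {
    std::vector<view_buf> out;
    std::size_t pos = 0;
    while (pos < buf.size()) {
        std::size_t n = static_cast<unsigned char>(buf.get()[pos]);
        if (pos + 1 + n > buf.size()) {
            break; // incomplete trailing message, leave it for the caller
        }
        out.push_back(buf.share(pos + 1, n)); // payload only, prefix skipped
        pos += 1 + n;
    }
    return out;
}
```

Every view returned by split_messages keeps the whole allocation alive, which is convenient for parsing but is also the source of the memory retention issue discussed next.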

One issue that can arise when sharing buffers is that a view may point to a small memory region while holding a reference to a large amount of memory. For example, a 16-byte view into a 1 MB network buffer keeps the entire 1 MB allocated for as long as the view exists. This can lead to hard-to-debug memory pressure scenarios. So the general rule of thumb is that if you don’t know where the data is coming from, it’s best to avoid holding on to a temporary buffer for an extended period of time.

Alternatively, a copy of a temporary buffer can be made using the clone method, after which the source buffer can be released. The new buffer is a full copy with its own ownership. This also works as expected for views, with the resulting buffer allocating only enough memory for the view region.
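
The effect of cloning a small view can be sketched with the same kind of standard C++ model. Here view_buf, allocate, and clone_releases_source are hypothetical names, and a std::weak_ptr is used purely as a way to observe when the large allocation goes away.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical model of a shared view; not Seastar's implementation.
struct view_buf {
    std::shared_ptr<char> mem;
    std::size_t off;
    std::size_t len;

    static view_buf allocate(std::size_t n) {
        return view_buf{
            std::shared_ptr<char>(new char[n](), std::default_delete<char[]>()), 0, n};
    }
    view_buf share(std::size_t pos, std::size_t n) const {
        return view_buf{mem, off + pos, n};
    }
    // Deep copy of only the viewed bytes into a fresh allocation of len bytes.
    view_buf clone() const {
        view_buf out = allocate(len);
        std::memcpy(out.mem.get(), mem.get() + off, len);
        return out;
    }
};

// Returns true if the large allocation is freed while the clone stays usable.
inline bool clone_releases_source() {
    view_buf big = view_buf::allocate(1 << 20); // 1 MiB source buffer
    std::weak_ptr<char> watch = big.mem;        // observes the large allocation
    view_buf small = big.share(100, 16);        // 16-byte view pins the full 1 MiB
    view_buf copy = small.clone();              // separate 16-byte allocation
    big = view_buf{};                           // drop the original buffer
    small = view_buf{};                         // drop the view
    return watch.expired() && copy.len == 16;
}
```

The trade-off is one extra allocation and a copy of the view’s bytes in exchange for letting the large region be reclaimed.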

These are the basics of the temporary buffer. There are a few other helper interfaces like aligned for allocating a buffer with a particular memory alignment, as well as some interfaces like trim and trim_front which are useful for consuming chunks of memory from a larger buffer.
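
As a rough illustration of consuming data from the front of a buffer, the sketch below models only the trimming arithmetic. It assumes that trim_front(n) drops the first n bytes and that trim(n) keeps only the first n bytes, so check the Seastar headers for the exact contracts. The window type and the take helper are hypothetical, and the window does not own its memory.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical window over bytes owned elsewhere; only trimming is modelled.
struct window {
    const char* data;
    std::size_t size;

    void trim_front(std::size_t n) { data += n; size -= n; } // assumed: drop first n bytes
    void trim(std::size_t n) { size = n; }                   // assumed: keep first n bytes
};

// Consumes an n-byte chunk from the front of w and returns a copy of it.
// The caller must ensure n <= w.size.
inline std::string take(window& w, std::size_t n) {
    window chunk = w; // same starting position as w
    chunk.trim(n);    // chunk now covers only the first n bytes
    w.trim_front(n);  // w now starts just past the chunk
    return std::string(chunk.data, chunk.size);
}
```

With a real temporary_buffer the chunk would more likely be obtained with share(0, n) before trimming the front, so that no bytes are copied.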