Alonso's blog

How to actually use arenas (and program in C pain free)

It was hard for me to believe the first time I heard Ryan Fleury say he didn't need to think about memory management in C. Part of me thought that probably it must be true, but I couldn't understand how. It took me a long time to understand how to use arenas in practice. So, if you are like me, maybe this post can be helpful.

My first approaches involved reading Untangling Lifetimes: The Arena Allocator and watching Enter The Arena: Simplifying Memory Management (2023) by Ryan Fleury. From these, I understood the problem. It resonated with me. But I couldn't understand how arenas are implemented, and how to structure a whole project around them. Also, I didn't fully understand the different kind of arenas you could use.

Later in my journey, I kept hearing the same idea. The important thing was to group your allocations.

And also, I found resources on how to implement arenas:

But I still didn't understand how to use them across a project without having to think about memory management. At this point, I understood how to implement an arena, but not how they could make memory management trivial.

Also, at the time I thought, well yeah, arenas are cool, but, maybe, there are better ways to group allocations? For example, instead of making all allocations in a single buffer, you could store all allocations made in a lifetime together. Then, when freeing a lifetime iterate over all allocations. I tried this and it worked pretty well, up to a point.

This worked well for lifetimes that corresponded with sections of the program. For example, parsing in a compiler. It also worked well for data structures that can be a lifetime of their own, like Abstract Syntax Trees. (Sorry, I'm working on a programming language)

But this approach didn't scale well to arbitrary functions and types. Also, creating new lifetimes or doing small allocations (cheaply) wasn't easy.

I was missing how to use scratch arenas alongside normal arenas, and how a function can use a combination of both types of arenas to satisfy all its requirements.

It was thanks to Arena allocator tips and tricks by Chris Wellons that arenas finally clicked for me. In his post, he has the following snippet:

strlist *getlines(str path, arena *perm, arena scratch) {
	...
}

Thanks to that snippet, arenas finally made sense to me. Need dynamic memory? Just pass an arena by value. Need to return something dynamically allocated? Just pass and arena by reference. The coolest part is you don't even have to worry about freeing, everything allocated is freed automatically at the end the scope, as the scratch arena is passed by value.

The other cool thing is that you only need two arenas per thread, as you can interleave them when passing them as parameters:

DynamicArrayInt foo(Arena* arena, ..., Arena scratch) {
    ...
}

DynamicArrayInt bar(Arena* arena, ..., Arena scratch) {
    DynamicArrayInt intermediateResult = foo(&scratch, ..., *arena);
    return intermediateResult;
}

If you use virtual memory and allocate 4GB of address space per arena, this makes memory management almost trivial.

Simplest possible arena

typedef struct {
    uint8_t* buffer;
    size_t offset;
    size_t capacity;
} Arena;

An array of bytes, how much we have used of the buffer and the total length of the buffer. The minimal operations required would be:

Arena arena_new();
void* arena_alloc(Arena* arena, size_t size, size_t alignment, size_t count);
void arena_free(Arena* arena);

Simplest possible dynamically growing arena

Using virtual memory makes memory management almost trivial. But in some cases, you can't use it. For example, in WASM. I haven't tried this yet, but by using indirection it should be possible to have the same API as before while allowing the arena’s buffer to grow dynamically:

typedef struct {
    uint8_t** chunks;
    size_t chunkCount;
} BackingStorage;

typedef struct {
    BackingStorage* storage;
    size_t offset;
} Arena;

Using a chunked/segmented array for storage and accessing it via indirection, the arena can grow without changing the API (the chunked/segmented part is only needed if you want to preserve pointer stability).

Other way to do scratch arenas

Ryan Fleury uses another approach for scratch arenas in his post. Instead of passing it as a parameter you retrieve it from some thread-local global state. He provides the following snippet for its API:

ArenaTemp GetScratch(Arena **conflicts, U64 conflict_count); // grabs a thread-local scratch arena
#define ReleaseScratch(t) ArenaTempEnd(t)

When asking for a scratch arena is required to pass arenas currently used for persistent allocations. Also, it is required to call ReleaseScratch at the end of the scope where you are using the scratch arena.

I like about this approach you don't have to change a lot of functions if you realize dynamic memory is needed deep in the call stack. Also, it gives you a lot of freedom in the arena implementation. For example, it makes easy to add a free-list if you think you need one.

I don't like having to remember to pass persistent arenas to GetScratch and to call ReleaseScratch at the end of the scope. The latter also, means one must be careful with early returns.

Conclusion

I've really enjoyed using arenas for memory management. Finally, I feel like I'm no longer fighting manual memory management in C. I also like the control arenas give you. You can choose when to ask the OS for memory. On the other hand, I don't like that arenas makes harder to use contiguous arrays. You have to choose between wasting space, which can add up quickly depending on the problem, using chunked/segmented arrays, or using linked lists. Having said that, for me, the benefits greatly outweigh the drawbacks.