Improved custom memory allocator

[Image: memory allocator benchmark performance]

After quite some work I have finally converted the #memoryallocator written for #cachegrand to be #lockless and #waitfree in the hot path and "just" lock-less in the slow path!

The memory allocator (which I called #slaballocator because of its similarities with slab allocators) built for cachegrand targets long-running threads that tend not to share memory. cachegrand uses 1 thread per core (by default) and really aims to silo the allocated memory as much as possible, opening the door to #optimizations that would never work, or even be possible, with the standard memory allocators.
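To give an idea of why siloing memory per thread helps, here is a minimal sketch in C of a thread-local slab with a free list. It is not the cachegrand implementation: the names (`thread_slab_t`, `thread_slab_alloc`, ...) and the sizes are hypothetical, and it only covers the case where a block is allocated and freed by the same thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SLOT_SIZE   64      /* hypothetical object size served by this slab */
#define SLOT_COUNT  1024    /* hypothetical number of slots per slab */

/* A free slot stores the pointer to the next free slot in its own memory. */
typedef struct free_slot {
    struct free_slot *next;
} free_slot_t;

typedef struct {
    unsigned char *memory;   /* backing region of SLOT_SIZE * SLOT_COUNT bytes */
    free_slot_t *free_list;  /* LIFO list of available slots */
} thread_slab_t;

/* One slab per thread: no other thread touches it, so no locks or atomics. */
static _Thread_local thread_slab_t local_slab;

/* Carve the backing region into slots and push them all on the free list. */
static int thread_slab_init(void) {
    local_slab.memory = malloc((size_t)SLOT_SIZE * SLOT_COUNT);
    if (local_slab.memory == NULL) {
        return -1;
    }
    local_slab.free_list = NULL;
    for (size_t i = 0; i < SLOT_COUNT; i++) {
        free_slot_t *slot = (free_slot_t *)(local_slab.memory + i * SLOT_SIZE);
        slot->next = local_slab.free_list;
        local_slab.free_list = slot;
    }
    return 0;
}

/* Pop a slot: a couple of loads and stores, with no loops and no waiting. */
static void *thread_slab_alloc(void) {
    free_slot_t *slot = local_slab.free_list;
    if (slot == NULL) {
        return NULL;  /* a real allocator would fall back to a slow path here */
    }
    local_slab.free_list = slot->next;
    return slot;
}

/* Push the slot back on the calling thread's free list. */
static void thread_slab_free(void *ptr) {
    assert(ptr != NULL);
    free_slot_t *slot = ptr;
    slot->next = local_slab.free_list;
    local_slab.free_list = slot;
}
```

Because the free list is only ever touched by its owning thread, both operations run in a fixed number of steps, which is the property a general-purpose allocator shared between threads cannot assume.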

On top of the improved performance, the new iteration of the slab allocator now starts to offer memory leak tracking in addition to double free detection. Currently it's simple enough not to impact performance (#Valgrind is great, but stress testing under it is impossible). The next iteration will get better memory tracking, with stack trace recording, to make it easy to determine where a leak occurred and not just how many leaks have occurred and of which size.
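As an illustration of how cheap this kind of tracking can be, the sketch below keeps one "in use" flag per slot and two counters. It is only an assumption about how such checks could look, not the code used in cachegrand; the pool size, the names and the linear scan in `pool_alloc` are all hypothetical and chosen for readability.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SLOTS 8    /* hypothetical number of slots */
#define SLOT_BYTES 32   /* hypothetical slot size */

/* The pool owns its memory for the whole run, so the flags stay valid after a free. */
static _Alignas(max_align_t) unsigned char pool[POOL_SLOTS][SLOT_BYTES];
static bool slot_in_use[POOL_SLOTS];

static size_t live_slots = 0;    /* slots allocated and not yet freed */
static size_t double_frees = 0;  /* frees of a slot that was already free */

/* Return the first free slot (linear scan for clarity, not for speed). */
static void *pool_alloc(void) {
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        if (!slot_in_use[i]) {
            slot_in_use[i] = true;
            live_slots++;
            return pool[i];
        }
    }
    return NULL;
}

/* Free a slot; returns false and counts the event if it was already free. */
static bool pool_free(void *ptr) {
    assert(ptr != NULL);
    size_t index = (size_t)((unsigned char *)ptr - &pool[0][0]) / SLOT_BYTES;
    assert(index < POOL_SLOTS);
    if (!slot_in_use[index]) {
        double_frees++;
        return false;
    }
    slot_in_use[index] = false;
    live_slots--;
    return true;
}

/* Whatever is still live at shutdown is reported as leaked. */
static size_t pool_leaked_slots(void) {
    return live_slots;
}
```

This only tells how many slots leaked, which matches the current limitation described above: knowing where a leak comes from needs extra information, such as a recorded stack trace per allocation.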

Because I love #data, I ran multiple benchmarks that allocate 16384 blocks of memory, with block sizes going from 16 bytes to 64 KB. Below is just a zoom on the sizes up to 1024 bytes, because beyond that the OS malloc (#ubuntu2204) and #tcmalloc get too slow and make the charts unreadable. The full set of charts, together with a link to the spreadsheet with all the data, is available at the bottom of the PR.
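For reference, a benchmark of this shape can be sketched in a few lines of C. This is not the benchmark code from the PR: it times the system `malloc`/`free` as a stand-in, the function names are hypothetical, and doubling the block size at each step is an assumption, since only the 16 bytes to 64 KB range and the 16384 blocks are stated above.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Wall-clock time in seconds, using the C11 timespec_get. */
static double now_seconds(void) {
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Allocate block_count blocks of block_size bytes, then free them all.
 * Returns 0 and stores the elapsed time on success, -1 if an allocation failed.
 */
static int bench_alloc_free(size_t block_size, size_t block_count,
                            double *elapsed_seconds) {
    void **blocks = malloc(block_count * sizeof(*blocks));
    if (blocks == NULL) {
        return -1;
    }

    double start = now_seconds();
    size_t allocated = 0;
    for (; allocated < block_count; allocated++) {
        blocks[allocated] = malloc(block_size);
        if (blocks[allocated] == NULL) {
            break;
        }
    }
    for (size_t i = 0; i < allocated; i++) {
        free(blocks[i]);
    }
    *elapsed_seconds = now_seconds() - start;

    free(blocks);
    return allocated == block_count ? 0 : -1;
}

/* Sweep from 16 bytes to 64 KB; doubling the size each step is an assumption. */
static void bench_sweep(void) {
    for (size_t size = 16; size <= 64 * 1024; size *= 2) {
        double elapsed;
        if (bench_alloc_free(size, 16384, &elapsed) == 0) {
            printf("%zu bytes: %.6f s\n", size, elapsed);
        }
    }
}
```

To compare allocators, the two `malloc`/`free` calls inside the timed section would be swapped for the allocator under test, and the sweep repeated from several threads at once.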

Looking at the charts, it's possible to see how the time needed by my slab allocator is now mostly constant, with some minimal variations in the 64-thread benchmarks (probably caused by the OS using the CPU cores). The previous iteration (red) was fast, but this new one (blue) gets as close to zero as possible (although it's not really 0) :)

Technically I could swap this memory allocator into Redis and see what happens; that would be an interesting benchmark to run.

#caching #redis #memorymanagement #performance #performancetesting

Daniele Salvatore Albano
