Nuclex Signal/Slot Library: Benchmarks

When you’re writing some code that needs to notify code in other parts of the program, your weapon of choice is the "signal / slot concept". A signal is a connection point where any interested party can register a callback function to be invoked when the signal emits/fires.
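To make the concept concrete, here is a minimal sketch of a signal built on std::function. The names `SimpleSignal`, `Subscribe` and `Emit` are made up for illustration; this is not the Nuclex implementation and it ignores unsubscription, return values and performance.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical minimal signal: a list of callbacks that are all invoked on Emit().
template<typename... TArgs>
class SimpleSignal {
public:
  // Registers a callback (slot) that will be invoked whenever the signal fires
  void Subscribe(std::function<void(TArgs...)> callback) {
    callbacks.push_back(std::move(callback));
  }

  // Fires the signal, invoking all callbacks in subscription order
  void Emit(TArgs... args) const {
    for (const auto &callback : callbacks) {
      callback(args...);
    }
  }

private:
  std::vector<std::function<void(TArgs...)>> callbacks;
};

// Usage: two interested parties register, then the signal fires once
inline int demoSimpleSignal() {
  int total = 0;
  SimpleSignal<int> valueChanged;
  valueChanged.Subscribe([&total](int value) { total += value; });
  valueChanged.Subscribe([&total](int value) { total += value * 10; });
  valueChanged.Emit(3);
  assert(total == 33); // 3 from the first callback, 30 from the second
  return total;
}
```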

There’s already an ocean of libraries out there providing this functionality to C++, but as you will see in this article, they’re all suffering from performance issues in one way or another. Plus, most don’t compile without warnings, have inconvenient syntax or lack unit tests.

So here’s the signal/slot "library" (it’s just three headers) I wrote to fix those issues for me, together with a summary of my design goals and a comprehensive benchmark on different compilers and CPUs.

Goals / Mandatory Use Cases

I started with a short laundry list of my use cases to guide the design:

  • Optimize granular usage (many small individual signals rather than a big multi-purpose one)

    • minimal memory footprint embedded in classes
    • fast construction / destruction
  • Support GCC, clang and MSVC (with maximum warning levels)
  • Performance should be near a vanilla virtual method call
  • Able to collect return values from subscribers / slots

    • without allocating memory
  • Binary (executable) size should stay small
  • Callbacks must be able to unregister themselves while being called back
  • Callbacks must be able to subscribe other callbacks while being called back
  • Reliable (unit tests for every valid use and for every error case)
  • Unsubscribe with function/method pointer + instance pointer pair, no "connection" objects or ids.

I then set out to find the leanest, fastest implementation that can cover these requirements ignoring everything else.
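The last requirement in the list (unsubscribing via a method pointer + instance pointer pair) can be sketched as follows. This is only one possible way to do it, assuming C++17; `PairKeyedEvent`, `Counter` and the template-based `Subscribe<Class, &Class::Method>()` syntax are hypothetical names for this example and not necessarily how the Nuclex headers spell it.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical event that identifies subscribers by (instance, method) pair.
template<typename TArg>
class PairKeyedEvent {
  using Thunk = void (*)(void *instance, TArg argument);
  struct Subscriber { void *Instance; Thunk Call; };

  // One thunk is generated per method; its address doubles as the method's identity
  template<typename TClass, void (TClass::*Method)(TArg)>
  static void invoke(void *instance, TArg argument) {
    (static_cast<TClass *>(instance)->*Method)(argument);
  }

public:
  template<typename TClass, void (TClass::*Method)(TArg)>
  void Subscribe(TClass *instance) {
    subscribers.push_back({ instance, &invoke<TClass, Method> });
  }

  // Removes the first subscription matching the pair; no connection object needed
  template<typename TClass, void (TClass::*Method)(TArg)>
  bool Unsubscribe(TClass *instance) {
    Thunk thunk = &invoke<TClass, Method>;
    auto it = std::find_if(
      subscribers.begin(), subscribers.end(),
      [instance, thunk](const Subscriber &subscriber) {
        return (subscriber.Instance == instance) && (subscriber.Call == thunk);
      }
    );
    if (it == subscribers.end()) { return false; }
    subscribers.erase(it);
    return true;
  }

  void Emit(TArg argument) const {
    for (const Subscriber &subscriber : subscribers) {
      subscriber.Call(subscriber.Instance, argument);
    }
  }

private:
  std::vector<Subscriber> subscribers;
};

struct Counter {
  int Total = 0;
  void Add(int amount) { Total += amount; }
};

inline int demoPairKeyedEvent() {
  Counter counter;
  PairKeyedEvent<int> event;
  event.Subscribe<Counter, &Counter::Add>(&counter);
  event.Emit(5); // counter.Total becomes 5
  bool removed = event.Unsubscribe<Counter, &Counter::Add>(&counter);
  event.Emit(7); // nobody is subscribed anymore
  assert(removed && counter.Total == 5);
  return counter.Total;
}
```

The caller only has to remember what it already knows (the object and the method it subscribed), so nothing extra needs to be stored on the subscriber's side.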

Benchmarks

Compiler options: fastest possible code that runs on generic x86-64 (amd64) CPU.

MSVC: /TP /GF /utf-8 /W4 /GS- /fp:fast /EHsc /std:c++17 /GR /O2 /Oy /Oi /Gy /GL /MD /Gw
GCC: -fvisibility=hidden -fvisibility-inlines-hidden -Wpedantic -Wall -Wextra -Wno-unknown-pragmas -shared-libgcc -fpic -funsafe-math-optimizations -std=c++17 -fpermissive -O3 -flto -fpie
clang: -fvisibility=hidden -fvisibility-inlines-hidden -Wpedantic -Wall -Wextra -Wno-unknown-pragmas -fpic -funsafe-math-optimizations -std=c++17 -fpermissive -O3 -flto -fpie

Scores are CPU cycles per action. Each benchmark run repeats an action between 10,000,000 and 500,000,000 times and measures the total time in seconds, then cycles_per_action = (cpu_speed_ghz x 1,000,000,000) / (number_of_repeats / total_time). The Overall tab shows the average of all data.
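As a rough sketch of the arithmetic: one action takes total_time / number_of_repeats seconds, and the CPU executes cpu_speed_ghz x 1,000,000,000 cycles per second, so multiplying the two yields cycles per action. The helper names below are hypothetical and this is not the actual benchmark harness; it also assumes the CPU runs at a fixed, known clock speed.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Converts a measured duration into CPU cycles per action.
// Assumes a constant clock frequency (no turbo boost or throttling).
inline double cyclesPerAction(
  std::chrono::duration<double> totalTime, std::size_t repeatCount, double cpuSpeedGhz
) {
  assert(repeatCount > 0);
  double secondsPerAction = totalTime.count() / static_cast<double>(repeatCount);
  return secondsPerAction * cpuSpeedGhz * 1.0e9;
}

// Times 'repeatCount' invocations of an action and reports cycles per action
template<typename TAction>
double measureCyclesPerAction(
  TAction &&action, std::size_t repeatCount, double cpuSpeedGhz
) {
  auto start = std::chrono::steady_clock::now();
  for (std::size_t index = 0; index < repeatCount; ++index) {
    action();
  }
  std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
  return cyclesPerAction(elapsed, repeatCount, cpuSpeedGhz);
}
```

For example, 100,000,000 repeats taking 1 second in total on a 4 GHz CPU work out to roughly 40 cycles per action.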

cpu cycles per action (lower = better)

Remarks: Nuclex: Unsubscribe() optimizes for removing the oldest or newest callback. My benchmark removes callbacks in reverse order of subscription. If the removal order is randomized, the result is a lot worse (but at 50 callbacks still beats any competitors)
Nano: I included it because I wanted to compare to one of the fastest libraries around. However, it doesn’t support callbacks unsubscribing themselves while being called back and therefore doesn’t actually meet my requirements.
libsigc++ is interesting in terms of construction time. I assume it doesn’t initialize a thing until the first subscription, so the price is paid later.
Boost.Signals2 is known to be slow, but the results are just ridiculous.

Fairness notice: many of the libraries tested are thread-safe and thus handicap their performance with mutexes. This is silly, imho, since adding cheap mutex-based thread safety takes me one minute to do with a wrapper class around an event. I’m working on a lock-free, thread-safe event, but this will take time.
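A wrapper of the kind mentioned above might look roughly like this. `PlainEvent` is a hypothetical stand-in for any single-threaded event and `LockedEvent` is the mutex-based wrapper; neither is part of the Nuclex headers. Note the limitation of this simple approach: callbacks run while the lock is held, so they must not call back into the same `LockedEvent`.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Stand-in for a single-threaded event (hypothetical, for illustration only)
class PlainEvent {
public:
  void Subscribe(std::function<void(int)> callback) {
    callbacks.push_back(std::move(callback));
  }
  void Emit(int value) const {
    for (const auto &callback : callbacks) {
      callback(value);
    }
  }
private:
  std::vector<std::function<void(int)>> callbacks;
};

// Serializes every access to the wrapped event through one mutex
class LockedEvent {
public:
  void Subscribe(std::function<void(int)> callback) {
    std::lock_guard<std::mutex> lock(accessMutex);
    event.Subscribe(std::move(callback));
  }
  void Emit(int value) const {
    std::lock_guard<std::mutex> lock(accessMutex);
    event.Emit(value); // callbacks run with the lock held
  }
private:
  mutable std::mutex accessMutex;
  PlainEvent event;
};

inline int demoLockedEvent() {
  int total = 0;
  LockedEvent event;
  event.Subscribe([&total](int value) { total += value; });
  event.Emit(2);
  event.Emit(40);
  assert(total == 42);
  return total;
}
```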

Stack size

How much larger does a class become per event it embeds?

Note: The Nuclex implementation has a built-in buffer for 2 subscribed callbacks. It can be configured with a buffer for only 1 callback, which reduces its size to 32 bytes.

Links

Boost.Signals2: https://theboostcpplibraries.com/boost.signals2
Libsigc++: https://libsigcplusplus.github.io/libsigcplusplus
LSignal: https://github.com/cpp11nullptr/lsignal
Nano Signals 11: https://github.com/FrankHB/nano-signal-slot
Nano Signals 17: https://github.com/NoAvailableAlias/nano-signal-slot
Nuclex: https://devel.nuclex.org/framework/browse/listing.php?repname=Framework&path=/Nuclex.Support.Native/trunk/Include/Nuclex/Support/Events/
Sigs: https://github.com/netromdk/sigs

6 thoughts to “Nuclex Signal/Slot Library: Benchmarks”

  1. Developer of nano-signal-slot here and I love benchmarks. You’ve inspired me to revisit the rework branch of nano and get back to keeping the benchmark project up to date. Cheers!

  2. Nice!
    I really appreciate that you share your requirements. You came up with something fast, complete and easy to understand. While others boast with numbers, you deliver with usability and performance! ;)

    Do you intend to tighten the screws on whether callbacks will be called when they are added while an event is fired?
    In EventTests.cpp, you write:
    331 // Can be this or this + 1, even (sic) may or may not invoke subscribers that
    332 // are added during event firing in the same firing cycle.

    For my uses, never calling new subscribers while an event is being delivered simplified the application logic downstream.

    Also, what do you think of a way to automatically disconnect all events when an event receiver instance dies? (e.g. Nano::Observer)

  3. I looked into Event.h and wonder if the algorithm doesn’t skip notifications if:
    1. Event emission starts, it’s at delegate n/2 of n
    2. Event subscriptions 1,2 (< n/2) are removed
    3. emission continues

    Due to the way the swapping works, I'd think that 2 subscribers are moved from the back to places 1, 2, meaning that they won't receive the event this time. :o

  4. @Vine Yes, it’s exactly as you describe. If an event unsubscribes *other* receivers from inside the callback, that can result in notifications being skipped for unrelated receivers.

    I actually thought about this while designing the event, but considered it such an edge case that I rather just amended my requirements to “Callbacks must be able to unregister themselves while being called back” :)
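The skipped notifications discussed in the two comments above can be reproduced with a small model. This is a simplified sketch, not the code from Event.h: subscribers are plain integers, removal overwrites the removed entry with the last one (swap-and-pop), and the hypothetical `fireAndRemove()` simulates one subscriber unsubscribing others while the event is firing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Removes a subscriber by overwriting it with the last one (order is not preserved)
inline void swapRemove(std::vector<int> &subscribers, int id) {
  auto it = std::find(subscribers.begin(), subscribers.end(), id);
  if (it != subscribers.end()) {
    *it = subscribers.back();
    subscribers.pop_back();
  }
}

// Fires an event with subscribers 0..count-1 using an index-based loop.
// While subscriber 'remover' is being notified, it unsubscribes all 'victims'.
// Returns the ids that were actually notified, in order.
inline std::vector<int> fireAndRemove(
  std::size_t count, int remover, const std::vector<int> &victims
) {
  std::vector<int> subscribers(count);
  std::iota(subscribers.begin(), subscribers.end(), 0);

  std::vector<int> notified;
  for (std::size_t index = 0; index < subscribers.size(); ++index) {
    int current = subscribers[index];
    notified.push_back(current);
    if (current == remover) {
      for (int victim : victims) {
        swapRemove(subscribers, victim);
      }
    }
  }
  return notified;
}

inline std::vector<int> demoSkippedNotifications() {
  // Subscriber 2 removes subscribers 0 and 1, which were already notified.
  // Subscribers 5 and 4 get moved into their slots, behind the firing index.
  std::vector<int> notified = fireAndRemove(6, 2, {0, 1});
  assert((notified == std::vector<int>{0, 1, 2, 3})); // 4 and 5 are skipped
  return notified;
}
```

This naive loop does not handle a subscriber removing itself either; it only illustrates why removing other subscribers from inside a callback is the problematic case.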

  5. @V1ne:
    > For my uses, never calling new subscribers while an event is being
    > delivered simplified the application logic downstream.
    >
    So far I haven’t had any cases myself where I would want to register subscribers from inside an event’s callback.

    It would add a bit of complexity to ensure no delivery for new subscribers until the event is over. The firing loop has to re-check the subscriber count after each callback in case an unsubscription happens, so it takes more effort than simply stopping at the initial subscriber count.
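One conceivable way to get that behavior is sketched below. It is an assumption about how it could be done, not what the Nuclex Event class does: each subscriber is stamped with a generation number, the firing loop skips anything subscribed after the firing started, and after each callback it re-checks whether the current slot still holds the same subscriber. `DeferredEvent` and the integer ids are hypothetical, and the extra field and per-call copy would cost memory and cycles that a lean implementation may not want to spend.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

class DeferredEvent {
public:
  using Callback = std::function<void(DeferredEvent &)>;

  void Subscribe(int id, Callback callback) {
    subscribers.push_back({ id, nextGeneration++, std::move(callback) });
  }

  // Swap-and-pop removal; ids are assumed to be unique
  void Unsubscribe(int id) {
    auto it = std::find_if(
      subscribers.begin(), subscribers.end(),
      [id](const Subscriber &subscriber) { return subscriber.Id == id; }
    );
    if (it != subscribers.end()) {
      if (&*it != &subscribers.back()) { *it = std::move(subscribers.back()); }
      subscribers.pop_back();
    }
  }

  void Emit() {
    const std::size_t cutoff = nextGeneration; // anything newer joined mid-firing
    std::size_t index = 0;
    while (index < subscribers.size()) {
      if (subscribers[index].Generation >= cutoff) { ++index; continue; }
      Subscriber current = subscribers[index]; // copy keeps the callback alive
      current.Call(*this);
      // Advance only if the slot still holds the subscriber that was just called;
      // otherwise it unsubscribed itself and another one was moved into its place
      if ((index < subscribers.size()) && (subscribers[index].Id == current.Id)) {
        ++index;
      }
    }
  }

private:
  struct Subscriber { int Id; std::size_t Generation; Callback Call; };
  std::vector<Subscriber> subscribers;
  std::size_t nextGeneration = 0;
};

inline std::vector<int> demoDeferredEvent() {
  std::vector<int> log;
  DeferredEvent event;
  event.Subscribe(1, [&log](DeferredEvent &sender) {
    log.push_back(1);
    sender.Unsubscribe(1); // subscriber 3 is moved into this slot
    sender.Subscribe(4, [&log](DeferredEvent &) { log.push_back(4); });
  });
  event.Subscribe(2, [&log](DeferredEvent &) { log.push_back(2); });
  event.Subscribe(3, [&log](DeferredEvent &) { log.push_back(3); });
  event.Emit(); // notifies 1, 3, 2; subscriber 4 has to wait
  event.Emit(); // notifies 3, 2, 4
  assert((log == std::vector<int>{1, 3, 2, 3, 2, 4}));
  return log;
}
```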

    > Also, what do you think of a way to automatically disconnect all events
    > when an event receiver instance dies? (e.g. Nano::Observer)
    >
    I’ve thought about that. If I can find a way to make it external to ‘Event’ class, like a base class that provides a method wrapping Event::Subscribe(), i.e.

    SubscribeEvent(myButton.Pressed, this, &Me::onButtonPressed);

    Then why not? My goal for the ‘Event’ class is to remain as light and simple as possible because my use cases usually see 1 or 2 subscriptions and individual events per notification even if generated in the same class.
