Qt Signals & Slots vs. QEvent: Cross-Thread Asynchronous Performance Tests and Selection Advice

Related code reference: https://gitcode.net/coloreaglestdio/qtcpp_demo/-/tree/master/qt_event_signal

1. Origin of the problem

While reworking taskBus for low latency, avoiding the overuse of signals and slots paid off. In the previous article, I described how to improve the throughput of the software-defined radio (SDR) platform by avoiding broadcast signals and frequent new and delete. More recently, reasoning that events (QEvent) might be better suited to point-to-point calls in cross-thread asynchronous operations, I converted the main data flow of taskBus to events, which yielded an improvement of about 5-10 ms.

This improvement fell short of my expectations, because my impression was that signals and slots are very slow. So the question is: across threads, which is faster, signals and slots or events? Why is it faster, and by how much?

2. Revisiting the source of the impression that “signals and slots are slow”

The classic Qt documentation includes a note on the performance of signals and slots:

Compared to callbacks, signals and slots are slightly slower because of the
increased flexibility they provide, although the difference for real applications
is insignificant. In general, emitting a signal that is connected to some slots,
is approximately ten times slower than calling the receivers directly, with
non-virtual function calls.

“Signals and slots are slightly slower than callbacks because of the greater flexibility they offer, although the difference is insignificant in real applications. Typically, emitting a signal that is connected to some slots is about ten times slower than calling the receivers directly with non-virtual function calls.”

This is where the impression comes from. However, rereading the passage carefully with the above question in mind, I found that the documentation goes on to explain:

This is the overhead required to locate the connection object, to safely
iterate over all connections (i.e. checking that subsequent receivers have not
been destroyed during the emission), and to marshall any parameters in a
generic fashion. While ten non-virtual function calls may sound like a lot,
it's much less overhead than any new or delete operation, for example. As soon
as you perform a string, vector or list operation that behind the scene
requires new or delete, the signals and slots overhead is only responsible for
a very small proportion of the complete function call costs. The same is true
whenever you do a system call in a slot; or indirectly call more than ten
functions. The simplicity and flexibility of the signals and slots mechanism
is well worth the overhead, which your users won't even notice.

“This is the overhead required to locate the connection object, to safely iterate over all connections (that is, to check that subsequent receivers were not destroyed during the emission), and to marshal any arguments in a generic way. Ten non-virtual function calls may sound like a lot, but the overhead is much smaller than that of any new or delete operation. As soon as you perform a string, vector, or list operation that requires new or delete behind the scenes, the signal and slot overhead accounts for only a small fraction of the total cost of the function call. The same is true whenever you make a system call in a slot or indirectly call more than ten functions. The simplicity and flexibility of the signal and slot mechanism are well worth the overhead, which your users won't even notice.”
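
To make the three costs named in the quotation more concrete, here is a toy model in plain C++. It is only a sketch, not Qt's implementation: ToySignal, Receiver, sumViaToySignal and sumViaDirectCall are hypothetical names invented for illustration. Emitting packs the argument behind untyped pointers, walks the connection list, and each slot unpacks the argument again, whereas the direct version is a single non-virtual call.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Toy stand-in for a signal. NOT Qt's implementation; all names are hypothetical.
class ToySignal {
public:
    // A slot receives its arguments as an untyped array, one pointer per argument.
    using Slot = std::function<void(void **)>;

    void connect(Slot slot) { slots_.push_back(std::move(slot)); }

    void emitSignal(int value) {
        // "Marshal in a generic fashion": hide the typed argument behind void*.
        void *args[] = { &value };
        // Iterate by index and re-read the size on every pass, so a slot that
        // connects another slot during emission cannot invalidate the loop.
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            Slot slot = slots_[i];  // copy, so growth of slots_ cannot dangle
            slot(args);
        }
    }

private:
    std::vector<Slot> slots_;
};

// The baseline from the documentation: a direct, non-virtual member call.
struct Receiver {
    long total = 0;
    void onValue(int v) { total += v; }
};

// Sums 1..n by emitting the toy signal once per value.
long sumViaToySignal(int n) {
    Receiver r;
    ToySignal sig;
    // The slot unpacks the untyped argument before calling the receiver.
    sig.connect([&r](void **a) { r.onValue(*static_cast<int *>(a[0])); });
    for (int i = 1; i <= n; ++i) sig.emitSignal(i);
    return r.total;
}

// Sums 1..n by calling the receiver directly.
long sumViaDirectCall(int n) {
    Receiver r;
    for (int i = 1; i <= n; ++i) r.onValue(i);
    return r.total;
}
```

Both functions return the same sum; the difference lies only in the bookkeeping done per call. The toy covers slots being added during emission, which is loosely analogous to, but much simpler than, the receiver-lifetime check the documentation describes.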

Could it be that the key factor in the previous optimization was not the switch from signals and slots to direct function calls, but the replacement of frequent new and delete with static memory? I tested this immediately and found that it was indeed the case.

  • The overhead of new and delete is much greater than the overhead of signals and slots
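
The two allocation patterns being compared can be sketched as follows. This is a minimal illustration in plain C++, not the taskBus code: handleWithNewDelete and ReusingHandler are hypothetical names, the checksum is just placeholder work, and a reused member buffer stands in for the static memory mentioned above.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Placeholder work done on each message: a byte-wise checksum.
static std::size_t checksum(const char *buf, std::size_t len) {
    std::size_t sum = 0;
    for (std::size_t i = 0; i < len; ++i) sum += static_cast<unsigned char>(buf[i]);
    return sum;
}

// Variant 1 (hypothetical): a fresh heap buffer for every message.
std::size_t handleWithNewDelete(const char *data, std::size_t len) {
    if (len == 0) return 0;
    char *buf = new char[len];       // heap allocation on every call
    std::memcpy(buf, data, len);
    const std::size_t sum = checksum(buf, len);
    delete[] buf;                    // and a matching release on every call
    return sum;
}

// Variant 2 (hypothetical): one buffer kept alive and reused across messages.
class ReusingHandler {
public:
    std::size_t handle(const char *data, std::size_t len) {
        if (len == 0) return 0;
        if (buf_.size() < len) buf_.resize(len);  // grows only for a larger message
        std::memcpy(buf_.data(), data, len);
        return checksum(buf_.data(), len);
    }

    std::size_t bufferSize() const { return buf_.size(); }

private:
    std::vector<char> buf_;          // allocated once, then reused
};
```

Both variants compute the same result. The first pays for one new and one delete per message, while the second allocates only when a message is larger than any seen before. Timing the two in a loop with std::chrono::steady_clock is one way to check the difference on a given machine.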

3. Writing a dedicated test program

We write a dedicated Qt test for the two ways of passing messages across threads: signals and slots, and events. The test scenario has two objects ping-ponging messages to each other as fast as possible, as shown in the following figure: