Thread Pool

Some of today’s plug-ins can only run with multithreaded processing, because technologies such as circuit simulation, machine learning and neural networks, or physical modelling (e.g. piano, orchestral synthesis) place high demands on the CPU. CPUs are still getting faster, but the current hardware trend leans towards increased parallelisation: the number of CPU cores is rising more quickly than clock speed. Meanwhile, audio processing is moving towards more detailed, complex models that demand ever more CPU performance.

This is where CLAP offers a way forward.

CLAP addresses the question “will this work in 40 years?”. We estimate that in 5 to 10 years the average laptop computer will have between 16 and 32 physical CPU cores. Plug-ins like Diva, which offer 8-16 voices today, will allow 64-128 voices, and there will be more simulations of pianos and orchestral instruments with a similarly high voice count. If such a plug-in occasionally requires 40% or more of a CPU core, traditional load balancing in the DAW cannot achieve optimal performance. These instruments therefore need multithreading, i.e. distributing voice processing across the available cores. Effect plug-ins can make similar CPU demands that can be parallelised by channel or feature, e.g. in a 16-channel surround processor. Before CLAP, these plug-ins had to implement their own real-time thread management.

A single instance of such a plug-in runs fine on today’s machines. Add more instances, however, and their threads start competing with each other as well as with the host. If a typical session loads multiple high-demand instruments, the number of competing threads can vastly exceed the number of available CPU cores. The result is a series of CPU spikes and audio dropouts, even if overall CPU use is only marginally above 50%. The spikes and dropouts are due to stalling, cache misses and context-switch overhead. Even worse, some operating systems limit the availability of real-time priority threads: for example, the maximum number of real-time MMCSS threads in Windows is 32.
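
To make the arithmetic concrete, here is a minimal sketch that counts the real-time threads in a session where every plug-in instance spawns its own workers. The struct, the function names and all figures in the usage note are illustrative assumptions, not measurements.

```cpp
#include <cassert>

// Hypothetical description of a session in which each plug-in instance
// manages its own real-time worker threads (the pre-CLAP situation).
struct SessionLoad {
    unsigned instances;            // number of heavy plug-in instances
    unsigned threads_per_instance; // worker threads each instance creates
    unsigned host_threads;         // audio threads the host runs itself
};

// Total number of real-time threads competing for CPU time.
constexpr unsigned total_realtime_threads(const SessionLoad& s) {
    return s.instances * s.threads_per_instance + s.host_threads;
}

// True when there are more competing threads than CPU cores, which is
// the situation where stalls and context switches start to pile up.
constexpr bool oversubscribed(const SessionLoad& s, unsigned cores) {
    return total_realtime_threads(s) > cores;
}
```

With these assumed figures, six instances with eight threads each plus eight host threads add up to 56 real-time threads, well beyond a 16-core machine and also beyond the MMCSS limit of 32 mentioned above.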

Asymmetric CPU designs with separate Efficiency and Performance cores can also cause CPU spikes unless latency-critical tasks are carefully kept off the Efficiency cores.

CLAP’s thread pool is a simple solution that already shows its benefits. Instead of plug-ins vying with each other for attention, the host manages a small number of real-time threads for all plug-ins via an easy-to-use interface. The number of context switches is thereby kept to a minimum, and because the host knows the deadline for each track, it can prioritise the plug-ins that need to finish early, e.g. on a low-latency track.
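
The division of labour can be sketched as follows. This is a simplified model under stated assumptions, not the CLAP headers: `TaskFn`, `request_exec` and `render_voices` are hypothetical stand-ins for the plug-in's task callback and the host's blocking "run these tasks" call, and the sketch spawns threads per call where a real host would reuse a few long-lived real-time threads.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for the callback a plug-in exposes to the host:
// "process the task with this index" (e.g. render one voice).
using TaskFn = std::function<void(std::uint32_t task_index)>;

// Hypothetical host-side call: run num_tasks tasks on at most max_workers
// threads and block until all of them are done. Returns false if the host
// declines, in which case the plug-in has to process the tasks itself.
bool request_exec(const TaskFn& exec, std::uint32_t num_tasks, unsigned max_workers) {
    if (max_workers == 0) return false; // host has no pool to offer

    std::atomic<std::uint32_t> next{0};
    auto worker = [&] {
        // Each worker pulls the next unclaimed task index until none remain.
        for (;;) {
            const std::uint32_t i = next.fetch_add(1);
            if (i >= num_tasks) break;
            exec(i);
        }
    };

    // Never use more threads than there are tasks.
    const unsigned n = std::min<unsigned>(max_workers, num_tasks);
    std::vector<std::thread> threads;
    for (unsigned w = 1; w < n; ++w) threads.emplace_back(worker);
    worker(); // the calling (audio) thread takes part as well
    for (auto& t : threads) t.join();
    return true;
}

// Plug-in side: one task per voice, each writing only to its own slot.
std::vector<float> render_voices(std::uint32_t num_voices, unsigned max_workers) {
    std::vector<float> out(num_voices, 0.0f);
    const TaskFn exec = [&out](std::uint32_t v) {
        out[v] = 0.5f * static_cast<float>(v + 1); // placeholder for real DSP
    };
    if (!request_exec(exec, num_voices, max_workers)) {
        // Fallback: the host rejected the request, so run the tasks inline.
        for (std::uint32_t v = 0; v < num_voices; ++v) exec(v);
    }
    return out;
}
```

The plug-in only states how many independent tasks it has and what each one does; how many threads run them, and with which priority, stays entirely with the host. Calling `render_voices(8, 4)` and `render_voices(8, 0)` yields the same output, the second via the inline fallback.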

The results are visible today. In our tests, sessions ran up to twice as many instances before audio dropouts occurred, and at least 20% to 25% more in the other cases. Overall there are measurably fewer CPU spikes, and load balancing is much more even.

Implementing multithreading in plug-ins also poses its own challenges, e.g. when creating and invoking real-time threads. By moving this responsibility to the host, the plug-in developer can avoid a lot of common pitfalls.