moodycamel::ConcurrentQueue: A Lock-Free Queue That Doesn't Mess Around
Intro
You know that feeling when you're building a multi-threaded app, and you just need a simple, fast queue to pass data between threads? You reach for boost::lockfree::queue, maybe tbb::concurrent_queue, and you hope it's good enough.
But sometimes "good enough" isn't, especially when the queue becomes the bottleneck and your throughput numbers start to sag. That's where moodycamel::ConcurrentQueue comes in. It's a lock-free, multi-producer/multi-consumer queue whose own benchmarks show it outperforming Boost and TBB in many scenarios, and it's designed with real-world usage in mind.
What It Does
moodycamel::ConcurrentQueue is a header-only, lock-free concurrent queue for C++11. Multiple threads can push and pop items at the same time without blocking each other. It's specifically designed for high-throughput scenarios where you need to move a lot of data between threads and want to minimize contention and memory allocation.
The repo is at github.com/cameron314/concurrentqueue, and it's been around for a while, with a solid track record in performance-sensitive projects.
Why It's Cool
Let's be real. Lock-free queues are hard to get right, and many open source ones work fine for small loads but fall apart under contention. moodycamel::ConcurrentQueue is different. Here's what makes it stand out.
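It's actually faster. The benchmarks in the repo show it beating Boost and TBB in many common scenarios. The trick is a design that avoids global contention: instead of having all producers fight over a single atomic pointer, the queue gives each producer its own internal sub-queue, and consumers pull from whichever sub-queues have items. That cuts down on cache line bouncing. One consequence worth knowing: items from a single producer come out in order, but there's no ordering guarantee between different producers.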
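To make the "don't fight over one atomic" idea concrete, here's a minimal sketch. It is not the library's code. It assumes the simplest possible setup: every producer owns a small single-producer/single-consumer ring, and one consumer polls all the rings. The names SpscRing and sum_through_subqueues are made up for this example.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Illustrative only, NOT moodycamel's implementation: a bounded ring with
// exactly one producer and one consumer. The producer is the only writer of
// tail_ and the consumer is the only writer of head_, so no compare-and-swap
// is needed. (A real implementation would also pad these to cache lines.)
template <typename T, std::size_t Capacity>
class SpscRing {
    static_assert((Capacity & (Capacity - 1)) == 0, "Capacity must be a power of two");
public:
    bool try_push(const T& value) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail - head_.load(std::memory_order_acquire) == Capacity) return false;  // full
        slots_[tail & (Capacity - 1)] = value;
        tail_.store(tail + 1, std::memory_order_release);  // publish the slot
        return true;
    }
    bool try_pop(T& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) return false;  // empty
        out = slots_[head & (Capacity - 1)];
        head_.store(head + 1, std::memory_order_release);  // free the slot
        return true;
    }
private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};
    std::atomic<std::size_t> tail_{0};
};

// Hypothetical demo: each producer pushes 1..items_per_producer into its own
// ring while the calling thread acts as the single consumer and sums everything.
long long sum_through_subqueues(int producers, int items_per_producer) {
    assert(producers > 0 && items_per_producer > 0);
    std::vector<SpscRing<int, 64>> rings(producers);
    std::vector<std::thread> threads;
    for (int p = 0; p < producers; ++p) {
        threads.emplace_back([&rings, p, items_per_producer] {
            for (int i = 1; i <= items_per_producer; ++i) {
                while (!rings[p].try_push(i)) std::this_thread::yield();  // ring full, retry
            }
        });
    }
    const long long expected = 1LL * producers * items_per_producer;
    long long received = 0, sum = 0;
    while (received < expected) {
        bool got_any = false;
        for (auto& ring : rings) {  // scan every sub-queue
            int value;
            if (ring.try_pop(value)) { sum += value; ++received; got_any = true; }
        }
        if (!got_any) std::this_thread::yield();
    }
    for (auto& t : threads) t.join();
    return sum;
}
```

Calling sum_through_subqueues(4, 1000) runs four producers that never touch each other's indices. The real queue is far more involved (multiple consumers, unbounded growth), so treat this purely as a picture of the contention-avoidance idea.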
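It's lock-free. That's a weaker guarantee than "wait-free" in the academic sense, since an individual thread can still be forced to retry. But no thread ever holds a lock, so a stalled or preempted thread can't block the others, and you can't deadlock on the queue. This is important for real-time or low-latency systems.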
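If "lock-free" feels abstract, the usual building block is a compare-and-swap retry loop. The sketch below is a generic illustration, not something taken from the library: several threads race to record the largest value they've seen, with no mutex anywhere. update_max and max_across_threads are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Lock-free "keep the maximum". Nobody holds a lock. Apart from spurious
// failures, a failed compare-exchange means some other thread's update went
// through, so the system as a whole always makes progress.
void update_max(std::atomic<int>& current_max, int candidate) {
    int observed = current_max.load(std::memory_order_relaxed);
    while (observed < candidate &&
           !current_max.compare_exchange_weak(observed, candidate,
                                              std::memory_order_relaxed)) {
        // On failure, compare_exchange_weak refreshed `observed`;
        // the loop condition re-checks whether we still need to write.
    }
}

// Hypothetical demo: thread t offers the values
// t * values_per_thread ... t * values_per_thread + values_per_thread - 1.
int max_across_threads(int thread_count, int values_per_thread) {
    assert(thread_count > 0 && values_per_thread > 0);
    std::atomic<int> current_max{0};
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        threads.emplace_back([&current_max, t, values_per_thread] {
            for (int i = 0; i < values_per_thread; ++i) {
                update_max(current_max, t * values_per_thread + i);
            }
        });
    }
    for (auto& th : threads) th.join();
    return current_max.load();
}
```

Notice that a single thread may loop several times under contention. That's exactly the gap between lock-free and wait-free.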
It's lightweight. The core queue is a single header file. No dependencies, no build system, just include it and go. (The blocking variant adds two more headers, blockingconcurrentqueue.h and lightweightsemaphore.h.) It's C++11 compatible, so it works with most modern compilers out of the box.
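It handles blocking and non-blocking. With the plain ConcurrentQueue you poll for items using try_dequeue, which returns false right away if nothing is available. If you'd rather have the consumer sleep until something arrives, use BlockingConcurrentQueue from blockingconcurrentqueue.h and call wait_dequeue. That header relies on the lightweight semaphore in lightweightsemaphore.h, so you need both files. It's flexible.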
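Roughly speaking, a blocking dequeue can be layered on a non-blocking one by counting available items with a semaphore. Here's a sketch of that shape under loud assumptions: Semaphore, TryQueue and BlockingWrapper are invented for this example, and the stand-in queue uses a mutex purely to keep the code short, which the real library does not do.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>

// Minimal counting semaphore (the C++11 standard library doesn't ship one).
class Semaphore {
public:
    void signal() {
        { std::lock_guard<std::mutex> lock(m_); ++count_; }
        cv_.notify_one();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_ = 0;
};

// Stand-in for any thread-safe queue with a non-blocking try_dequeue.
// Assumption: mutex-based here for brevity, unlike the lock-free original.
class TryQueue {
public:
    void enqueue(int v) { std::lock_guard<std::mutex> lock(m_); items_.push_back(v); }
    bool try_dequeue(int& out) {
        std::lock_guard<std::mutex> lock(m_);
        if (items_.empty()) return false;
        out = items_.front();
        items_.pop_front();
        return true;
    }
private:
    std::mutex m_;
    std::deque<int> items_;
};

// Blocking layer: the semaphore count tracks how many items can be taken.
class BlockingWrapper {
public:
    void enqueue(int v) { queue_.enqueue(v); available_.signal(); }
    void wait_dequeue(int& out) {
        available_.wait();                     // sleep until an item has been enqueued
        while (!queue_.try_dequeue(out)) {}    // then take it without blocking
    }
private:
    TryQueue queue_;
    Semaphore available_;
};

// Hypothetical demo: the consumer may start first and simply sleeps until data shows up.
int blocking_demo_sum(int n) {
    assert(n > 0);
    BlockingWrapper q;
    int sum = 0;
    std::thread consumer([&] {
        for (int i = 0; i < n; ++i) { int v; q.wait_dequeue(v); sum += v; }
    });
    std::thread producer([&] { for (int i = 1; i <= n; ++i) q.enqueue(i); });
    producer.join();
    consumer.join();
    return sum;
}
```

The upside of this layering is that the consumer burns no CPU while the queue is empty. The cost is a semaphore operation on every enqueue and dequeue, which is why polling with try_dequeue can still make sense in a tight loop.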
Memory management is done for you. The queue manages its own memory internally, storing elements in blocks rather than allocating each one separately. So you don't have to worry about managing a pool of nodes yourself, and it avoids the overhead of per-element allocations.
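To see why block storage helps, here's a tiny single-threaded sketch that counts how many times it asks for element storage. It is an illustration only, not the queue's allocator: BlockStore and allocations_for are hypothetical, and the block size of 32 is simply a value picked for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Elements live in fixed-size blocks, so element storage is requested once
// per block instead of once per element. Single-threaded on purpose.
template <typename T, std::size_t BlockSize>
class BlockStore {
public:
    void push(const T& value) {
        if (size_ == blocks_.size() * BlockSize) {      // every block is full
            blocks_.emplace_back(new T[BlockSize]());   // one allocation, BlockSize slots
            ++allocations_;
        }
        blocks_[size_ / BlockSize][size_ % BlockSize] = value;
        ++size_;
    }
    const T& at(std::size_t i) const {
        assert(i < size_);
        return blocks_[i / BlockSize][i % BlockSize];
    }
    std::size_t size() const { return size_; }
    std::size_t allocations() const { return allocations_; }
private:
    std::vector<std::unique_ptr<T[]>> blocks_;
    std::size_t size_ = 0;
    std::size_t allocations_ = 0;
};

// Hypothetical helper: number of block allocations needed for element_count pushes.
std::size_t allocations_for(std::size_t element_count) {
    BlockStore<int, 32> store;
    for (std::size_t i = 0; i < element_count; ++i) store.push(static_cast<int>(i));
    return store.allocations();
}
```

With 32-element blocks, 100 pushes need 4 block allocations where a node-per-element design would need 100.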
How to Try It
Getting started is stupid simple. Since it's a single header, you just drop concurrentqueue.h into your project and include it.
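#include "concurrentqueue.h"
#include <thread>
#include <iostream>

int main() {
    const int kItemCount = 100;
    moodycamel::ConcurrentQueue<int> q;

    // Producer: pushes 0..99
    std::thread producer([&]() {
        for (int i = 0; i < kItemCount; ++i) {
            q.enqueue(i);
        }
    });

    // Consumer: keeps polling until every item has arrived. Stopping at the
    // first failed try_dequeue could exit before the producer has pushed anything.
    std::thread consumer([&]() {
        int item;
        int received = 0;
        while (received < kItemCount) {
            if (q.try_dequeue(item)) {
                std::cout << "Got: " << item << '\n';
                ++received;
            } else {
                std::this_thread::yield();  // queue empty right now, try again
            }
        }
    });

    producer.join();
    consumer.join();
    return 0;
}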
Compile with -std=c++11 -pthread and you're done. There's also a blockingconcurrentqueue.h for blocking operations (it needs lightweightsemaphore.h next to it), and the repo has a bunch of benchmarks and tests you can build with CMake if you want to see how it performs on your machine.
Final Thoughts
If you're building any kind of multi-threaded system in C++ and you need a queue that's both fast and easy to use, give moodycamel::ConcurrentQueue a serious look. It's one of those rare libraries that just works out of the box, and it's been battle-tested in production for years. It's not going to solve every problem (no queue will), but for most producer-consumer patterns it's a very strong contender against the alternatives.
And hey, it's header only. No DLLs, no linking drama. That alone makes it worth a try.
Found this on @githubprojects
Repository: https://github.com/cameron314/concurrentqueue