Benchmarking resizable arrays versus linked lists in C++

The second half of Herb Sutter’s recent talk, “Modern C++: What You Need to Know,” explores the question of how many items you need before it’s better to use a std::list instead of a std::vector for an insertion sort. The conventional wisdom is that, eventually, the cost of shifting half the vector (on average) to make room for each new element will overtake the overhead of traversing the linked list. However, his testing showed the answer is never, underscoring the importance of memory locality on modern computers, where the prefetcher keeps contiguous data streaming through the cache.
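For reference, the experiment looks roughly like this. It’s a minimal sketch, not Sutter’s actual code: it assumes a linear search for the insertion point in both containers, so the list pays for pointer-chasing traversal while the vector pays for the shift inside insert().

```cpp
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <list>
#include <random>
#include <vector>

// Insert n random integers into a container, keeping it sorted throughout.
// Returns elapsed milliseconds. Works for both std::vector and std::list.
template <typename Container>
long long timed_sorted_insert(int n) {
    std::mt19937 rng(42);  // fixed seed so both containers see the same data
    std::uniform_int_distribution<int> dist(0, 1000000);
    Container c;
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < n; ++i) {
        int value = dist(rng);
        // Linear search for the insertion point in both cases.
        auto pos = std::find_if(c.begin(), c.end(),
                                [&](int x) { return x >= value; });
        c.insert(pos, value);
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start)
        .count();
}

int main() {
    const int sizes[] = {1000, 10000, 100000};
    for (int n : sizes) {
        std::printf("n=%6d  vector: %5lld ms  list: %5lld ms\n", n,
                    timed_sorted_insert<std::vector<int>>(n),
                    timed_sorted_insert<std::list<int>>(n));
    }
}
```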

I decided to try it out on my own machine, but before I got too far with the code I looked around to see whether someone already had code I could use. Soon enough, I found C++ benchmark – std::vector VS std::list VS std::deque on Baptiste Wicht’s blog. He ran lots of tests (not just insertion and removal) on lots of data types (not just integers). Like Herb Sutter, Baptiste found arrays to be faster than linked lists for small objects, but lists were better for large objects. The meaning of “large” varied with the algorithm: for sorting, large was 16 bytes; for insertion, it was more like 64 bytes. He didn’t explore arrays of pointers to large objects.
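Varying the element size in a benchmark like this is straightforward with a padded struct. The Padded template below is my own illustration, not Baptiste’s actual type: the filler array inflates sizeof(T), which is what makes copies, and therefore vector insertions and swaps, progressively more expensive.

```cpp
#include <cstddef>

// An element of exactly Size bytes (Size must exceed sizeof(int)).
// Comparisons use only the key; the payload is dead weight that
// must be copied whenever the element moves.
template <std::size_t Size>
struct Padded {
    int key;
    char payload[Size - sizeof(int)];

    bool operator<(const Padded& other) const { return key < other.key; }
};

static_assert(sizeof(Padded<16>) == 16, "a 'small' element for sorting");
static_assert(sizeof(Padded<64>) == 64, "a 'large' element for insertion");
```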

In trying out his year-old code on my Mac, I encountered a couple of compile errors and one logic error. Since the code was on GitHub, I thought it would be a good chance to try out the GitHub workflow. I set up a GitHub account, forked his code into my account, fixed the issues, and made a pull request for my changes to be merged back into Baptiste’s original code. A couple of days later, he accepted the changes. The internet works!

Here are some results from my runs. I went up to n = 1,000,000 for two operations and two data sizes.

[Chart: benchmark timings for vector, list, and deque]

Looks like for my PowerMac, “large” is somewhere between 8 and 32 bytes. The real hero is deque, which is a segmented vector that can easily grow at both ends.
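Deque manages that by allocating fixed-size blocks rather than one contiguous buffer, so growth at either end never shifts existing elements. A small illustration (my own, not from the benchmark code):

```cpp
#include <deque>
#include <iostream>

int main() {
    std::deque<int> d{2, 3};
    d.push_front(1);  // amortized O(1): a block is extended at the front,
    d.push_back(4);   // and at the back; no existing elements are moved
    for (int x : d) std::cout << x << ' ';  // prints: 1 2 3 4
    std::cout << '\n';
}
```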