Using a Single Threaded Functor in Multiple Threads with Futures in C++
Multithreaded programming requires a paradigm shift in how functions hand back their return values. C++11 provides `std::async <http://en.cppreference.com/w/cpp/thread/async>`__ to run functions asynchronously, but it is not available in older versions of the language.
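For readers who do have a C++11 compiler, a minimal sketch of std::async might look like the following. The names count_chars and count_chars_async are made up for illustration and do not come from the project described here.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <string>

// Hypothetical stand-in for an expensive function.
std::size_t count_chars(const std::string& text)
{
    return text.size();
}

std::size_t count_chars_async(const std::string& text)
{
    // std::launch::async asks for a separate thread instead of lazy evaluation.
    std::future<std::size_t> pending =
        std::async(std::launch::async, count_chars, text);
    assert(pending.valid());
    // Other work could happen here while count_chars runs.
    return pending.get();  // blocks until the result is ready
}
```

Calling count_chars_async("word") starts the work on another thread and then blocks in get() until the value is available.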
My current project on word spotting in historical documents is fairly complete in functionality, but I decided that searching for word images in page images concurrently was necessary to speed it up. I already use Boost for much of the functionality, and rather than depend on the not-yet-mature C++11 support in various compilers, I decided to use boost::thread.
Suppose we have a functor like
class Search_t
{
public:
    Search_t(Document d) { ... }
    SearchResult operator()(SearchItem i) { ... }
};
and we want to use this functor in multiple threads. We can't simply do
std::vector<SearchResult> results;
Search_t search(document);
// search_items is vector<SearchItem> and si is an iterator on this.
for (si = search_items.begin(); si != search_items.end(); ++si)
{
    boost::thread task(boost::bind(search, *si));
    results.push_back(task); //ERROR!
}
because task is a boost::thread object, not a SearchResult: a thread does not hand back the return value of the function it runs.
Instead we need to store results within the object and retrieve them after they are generated.
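The idea of parking a result until it is ready can be sketched in a few lines. This sketch uses the C++11 standard library (std::packaged_task, std::future, std::thread) only so that it compiles without Boost; Boost.Thread provides boost::packaged_task and boost::thread, with boost::unique_future playing the role of std::future. The names slow_square and square_on_worker are hypothetical.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <utility>

// Hypothetical stand-in for the real search call.
int slow_square(int x)
{
    return x * x;
}

int square_on_worker(int x)
{
    // The packaged task captures the return value instead of discarding it.
    std::packaged_task<int(int)> job(slow_square);
    // The future is the handle through which the value is fetched later.
    std::future<int> pending = job.get_future();
    assert(pending.valid());
    // A packaged task is move-only, so it is moved into the thread.
    std::thread worker(std::move(job), x);
    int value = pending.get();  // blocks until the worker has stored a value
    worker.join();
    return value;
}
```

The thread itself still returns nothing; the value travels through the future that was taken from the task before the thread started.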
I didn't want to change the interface of Search_t, because multithreading should be optional and other parts of the program may depend on this interface. Instead, a wrapper class that runs these threads behind a similar interface looked like a better solution.
class SearchMT_t
{
    Document document_;
    boost::shared_ptr<std::vector<boost::unique_future<SearchResult> > > futures_;
public:
    SearchMT_t(Document d) :
        /* The most important assumption here is that Search_t does not alter
           the Document object's state in any way. Otherwise, access to
           document_ must be serialized with a mutex so that only one thread
           touches it at a time. */
        document_(d),
        futures_(new std::vector<boost::unique_future<SearchResult> >)
    {}
    void operator()(SearchItem si)
    {
        /* If you are sure that there won't be any race conditions
           between Search_t threads in search, you can move the following line
           to the constructor and use a single object for all searches. */
        Search_t search(document_);
        /* boost::bind keeps the code free of C++11; the explicit return type
           is needed because Search_t has no nested result_type. */
        boost::packaged_task<SearchResult> search_task(
            boost::bind<SearchResult>(search, si));
        futures_->push_back(search_task.get_future());
        boost::thread task(boost::move(search_task));
        /* Let the thread run on its own; the future is our handle to its result. */
        task.detach();
    }
    std::vector<SearchResult> results()
    {
        std::vector<SearchResult> results;
        /* Wait for all threads to complete their work. */
        boost::wait_for_all(futures_->begin(), futures_->end());
        for (std::size_t i = 0; i < futures_->size(); ++i)
        {
            results.push_back((*futures_)[i].get());
        }
        return results;
    }
};
This way, it becomes much more straightforward to use multithreading in a loop:
std::vector<SearchResult> results;
SearchMT_t search(document);
// search_items is vector<SearchItem> and si is an iterator on this.
for (si = search_items.begin(); si != search_items.end(); ++si)
{
    search(*si);
}
results = search.results();
We kept the Search_t class intact, and the calling loop is nearly as simple as the single-threaded version.
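For comparison, once C++11 support is available, the same wrapper idea could be sketched with std::async and std::future instead of Boost. Everything below is an assumption made for illustration: the typedefs stand in for the real Document, SearchItem, and SearchResult types, this Search_t is a toy substring search, and SearchAsync_t is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <string>
#include <vector>

// Hypothetical stand-ins for the article's types, only to make the sketch compile.
typedef std::string Document;
typedef std::string SearchItem;
typedef std::size_t SearchResult;

// Toy functor: returns the position of the first match, or std::string::npos.
class Search_t
{
    Document document_;
public:
    explicit Search_t(const Document& d) : document_(d) {}
    SearchResult operator()(const SearchItem& item) const
    {
        return document_.find(item);
    }
};

class SearchAsync_t
{
    Document document_;
    std::vector<std::future<SearchResult>> futures_;
public:
    explicit SearchAsync_t(const Document& d) : document_(d) {}

    void operator()(const SearchItem& item)
    {
        // Each task gets its own Search_t, so no state is shared between threads.
        futures_.push_back(
            std::async(std::launch::async, Search_t(document_), item));
        assert(futures_.back().valid());
    }

    std::vector<SearchResult> results()
    {
        std::vector<SearchResult> out;
        for (std::future<SearchResult>& pending : futures_)
        {
            out.push_back(pending.get());  // blocks until that task is done
        }
        futures_.clear();  // a future can only be read once
        return out;
    }
};
```

The calling loop stays the same as with SearchMT_t: call the wrapper once per item, then call results(). Results come back in submission order because the futures are read in the order they were stored.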