Should I use ThreadPools or the Task Parallel Library for IO-bound operations?

In one of my projects, which is a kind of aggregator, I parse feeds, podcasts and so on from the web.

If I use a sequential approach, then given the large number of resources it takes quite some time to process all of them (because of network issues and similar stuff):

foreach (var feed in feeds)
{
    read_from_web(feed); // blocks until the download finishes
    parse(feed);
}

So I want to implement concurrency, but I couldn't decide whether I should use ThreadPools directly to process with worker threads or just rely on TPL to get it sorted.

ThreadPools will for sure handle the job for me with worker threads and I'll get what I expect (and in multi-core CPU environments, the other cores will also be utilized).


But I still want to consider TPL too, as it's the recommended method, though I'm a bit concerned about it. First of all, I know that TPL uses ThreadPools but adds an additional layer of decision making. I'm mostly concerned about the case of a single-core environment. If I'm not wrong, TPL starts with a number of worker threads equal to the number of available CPU cores at the very beginning. I fear TPL will produce results similar to the sequential approach for my IO-bound case.

So for IO-bound operations (in my case, reading resources from the web), is it best to use ThreadPools and control things myself, or better to just rely on TPL? Can TPL also be used in IO-bound scenarios?

Update: My main concern is this: in a single-core CPU environment, will TPL just behave like the sequential approach or will it still offer concurrency? I'm already reading the book Parallel Programming with Microsoft .NET but couldn't find an exact answer for this.

Note: this is a re-phrasing of my previous question [ Is it possible to use thread-concurrency and parallelism together? ], which was phrased quite badly.


I do fear of TPL producing similar results to sequential approach for my IO-bound case.

I think it will. What is the bottleneck? Is it parsing or downloading? Multithreading will not help you much with downloading from the web.

I would use the Task Parallel Library for cropping, applying masks or effects to downloaded images, cutting a sample from a podcast, etc. It's more scalable.

But it will not be an order-of-magnitude speedup. Spend your resources on implementing features and on testing instead.

PS. "Wow, my function executes in 0.7 s instead of 0.9" ;)

You can assign your own task scheduler to a TPL task. The default work-stealing one is quite clever, though.

You are right that the TPL removes some of the control you have when you create your own thread pool, but that only holds if you do not want to dig deeper. The TPL allows you to create long-running tasks that are not part of the TPL thread pool and could serve your purpose well. The book Parallel Programming with Microsoft .NET, which is free to read, will give you much more insight into how the TPL is meant to be used. You always have the option of giving Parallel.For and tasks explicit parameters that limit how many threads should be used. Besides this, you can replace the TPL scheduler with your own if you want full control.
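To make the two knobs mentioned above a bit more concrete, here is a minimal sketch and not a definitive implementation. FetchFeed is a hypothetical stand-in that just sleeps to simulate network latency, and the class and method names are made up for illustration. ParallelOptions.MaxDegreeOfParallelism is an upper bound on concurrent iterations, and TaskCreationOptions.LongRunning is a hint that the default scheduler normally honors by using a dedicated thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelismControlSketch
{
    // Hypothetical stand-in for "download and parse one feed".
    // The sleep simulates a blocking network call.
    static string FetchFeed(string feed)
    {
        Thread.Sleep(100);
        return feed.ToUpperInvariant();
    }

    // Runs Parallel.ForEach with at most maxWorkers iterations in flight.
    public static string[] FetchAll(string[] feeds, int maxWorkers)
    {
        var results = new ConcurrentBag<string>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxWorkers };
        Parallel.ForEach(feeds, options, feed => results.Add(FetchFeed(feed)));
        // Sort so the output order does not depend on scheduling.
        return results.OrderBy(r => r, StringComparer.Ordinal).ToArray();
    }

    // Hints that this work is long running, so it should not occupy a pool worker.
    public static Task<string> FetchLongRunning(string feed)
    {
        return Task.Factory.StartNew(() => FetchFeed(feed), TaskCreationOptions.LongRunning);
    }

    public static void Main()
    {
        var feeds = new[] { "feed-a", "feed-b", "feed-c", "feed-d" };
        Console.WriteLine(string.Join(", ", FetchAll(feeds, maxWorkers: 4)));
        Console.WriteLine(FetchLongRunning("feed-e").Result);
    }
}
```

For blocking IO it can make sense to set the limit higher than the core count, since the workers spend most of their time waiting rather than computing.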

So I decided instead to write tests for this and see how it behaves on practical data.

Test Legend

  • Itr: Iteration
  • Seq: Sequential Approach.
  • PrlEx: Parallel Extensions - Parallel.ForEach
  • TPL: Task Parallel Library
  • TPool: ThreadPool

Test Results

Single-Core CPU [Win7-32] -- runs under VMWare --

Test Environment: 1 physical cpus, 1 cores, 1 logical cpus.
Will be parsing a total of 10 feeds.
________________________________________________________________________________


Itr.    Seq.    PrlEx   TPL     TPool
________________________________________________________________________________


#1      10.82s  04.05s  02.69s  02.60s
#2      07.48s  03.18s  03.17s  02.91s
#3      07.66s  03.21s  01.90s  01.68s
#4      07.43s  01.65s  01.70s  01.76s
#5      07.81s  02.20s  01.75s  01.71s
#6      07.67s  03.25s  01.97s  01.63s
#7      08.14s  01.77s  01.72s  02.66s
#8      08.04s  03.01s  02.03s  01.75s
#9      08.80s  01.71s  01.67s  01.75s
#10     10.19s  02.23s  01.62s  01.74s
________________________________________________________________________________


Avg.    08.40s  02.63s  02.02s  02.02s
________________________________________________________________________________

Single-Core CPU [WinXP] -- runs under VMWare --

Test Environment: 1 physical cpus, NotSupported cores, NotSupported logical cpus.
Will be parsing a total of 10 feeds.
________________________________________________________________________________


Itr.    Seq.    PrlEx   TPL     TPool
________________________________________________________________________________


#1      10.79s  04.05s  02.75s  02.13s
#2      07.53s  02.84s  02.08s  02.07s
#3      07.79s  03.74s  02.04s  02.07s
#4      08.28s  02.88s  02.73s  03.43s
#5      07.55s  02.59s  03.99s  03.19s
#6      07.50s  02.90s  02.83s  02.29s
#7      07.80s  04.32s  02.78s  02.67s
#8      07.65s  03.10s  02.07s  02.53s
#9      10.70s  02.61s  02.04s  02.10s
#10     08.98s  02.88s  02.09s  02.16s
________________________________________________________________________________


Avg.    08.46s  03.19s  02.54s  02.46s
________________________________________________________________________________

Dual-Core CPU [Win7-64]

Test Environment: 1 physical cpus, 2 cores, 2 logical cpus.
Will be parsing a total of 10 feeds.
________________________________________________________________________________


Itr.    Seq.    PrlEx   TPL     TPool
________________________________________________________________________________


#1      07.09s  02.28s  02.64s  01.79s
#2      06.04s  02.53s  01.96s  01.94s
#3      05.84s  02.18s  02.08s  02.34s
#4      06.00s  01.43s  01.69s  01.43s
#5      05.74s  01.61s  01.36s  01.49s
#6      05.92s  01.59s  01.73s  01.50s
#7      06.09s  01.44s  02.14s  02.37s
#8      06.37s  01.34s  01.46s  01.36s
#9      06.57s  01.30s  01.58s  01.67s
#10     06.06s  01.95s  02.88s  01.62s
________________________________________________________________________________


Avg.    06.17s  01.76s  01.95s  01.75s
________________________________________________________________________________

Quad-Core CPU [Win7-64] -- HyperThreading Supported --

Test Environment: 1 physical cpus, 4 cores, 8 logical cpus.
Will be parsing a total of 10 feeds.
________________________________________________________________________________


Itr.    Seq.    PrlEx   TPL     TPool
________________________________________________________________________________


#1      10.56s  02.03s  01.71s  01.69s
#2      07.42s  01.63s  01.71s  01.69s
#3      11.66s  01.69s  01.73s  01.61s
#4      07.52s  01.77s  01.63s  01.65s
#5      07.69s  02.32s  01.67s  01.62s
#6      07.31s  01.64s  01.53s  02.17s
#7      07.44s  02.56s  02.35s  02.31s
#8      08.36s  01.93s  01.73s  01.66s
#9      07.92s  02.15s  01.72s  01.65s
#10     07.60s  02.14s  01.68s  01.68s
________________________________________________________________________________


Avg.    08.35s  01.99s  01.75s  01.77s
________________________________________________________________________________

Summary

  • Whether you run in a single-core environment or a multi-core one, Parallel Extensions, TPL and ThreadPool behave much the same and give similar results. In these runs all three finished roughly 2.5 to 5 times faster than the sequential approach, including on the single-core machines.
  • Still, TPL has advantages like easy exception handling, cancellation support and the ability to easily return Task results. Parallel Extensions is another viable alternative, though.
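As a rough illustration of the TPL advantages listed above (results, exception handling, cancellation), here is a small sketch. It is not the code from my tests: ParseFeed and ParseAll are hypothetical names, and the "parser" just returns the string length and throws on empty input to simulate a bad feed.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class TplFeaturesSketch
{
    // Hypothetical parser: returns the feed length, throws on an empty feed.
    static int ParseFeed(string feed, CancellationToken token)
    {
        token.ThrowIfCancellationRequested();
        if (feed.Length == 0) throw new FormatException("empty feed");
        return feed.Length;
    }

    // Returns (number of feeds parsed successfully, number of feeds that faulted).
    public static Tuple<int, int> ParseAll(string[] feeds, CancellationToken token)
    {
        Task<int>[] tasks = feeds
            .Select(f => Task.Factory.StartNew(() => ParseFeed(f, token), token))
            .ToArray();
        try
        {
            Task.WaitAll((Task[])tasks);
        }
        catch (AggregateException)
        {
            // WaitAll collects every failure into one AggregateException.
            // Each task is inspected individually below, so nothing more to do here.
        }
        int succeeded = tasks.Count(t => t.Status == TaskStatus.RanToCompletion);
        int faulted = tasks.Count(t => t.IsFaulted);
        return Tuple.Create(succeeded, faulted);
    }

    public static void Main()
    {
        var outcome = ParseAll(new[] { "abc", "", "de" }, CancellationToken.None);
        Console.WriteLine(outcome.Item1 + " parsed, " + outcome.Item2 + " failed");
    }
}
```

With a raw ThreadPool.QueueUserWorkItem call you would have to carry the return value, the exception and the cancellation flag around yourself.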

Running tests on your own

You can download the source here and run the tests on your own. If you post the results, I'll add them too.

Update: Fixed the source link.

If you're trying to maximize throughput for IO-bound tasks, you absolutely must combine the traditional Asynchronous Programming Model (APM) APIs with your TPL-based work. Asynchronous IO APIs such as these are what release the CPU thread while the IO callback is pending; a blocking call inside a task still ties up a thread. The TPL provides the TaskFactory.FromAsync helper method to assist in combining APM and TPL code.

Check out this section of the .NET SDK on MSDN entitled TPL and Traditional .NET Asynchronous Programming for more information on how to combine these two programming models to achieve async nirvana.
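A minimal sketch of the FromAsync idea, under some assumptions: a MemoryStream stands in for the network stream so the example runs without a connection, and ReadAsTask is a name made up for illustration. For a real web request the Begin/End pair would be something like WebRequest.BeginGetResponse and EndGetResponse instead.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class FromAsyncSketch
{
    // Wraps the APM pair Stream.BeginRead/EndRead in a Task<int>
    // that completes with the number of bytes read.
    public static Task<int> ReadAsTask(Stream stream, byte[] buffer)
    {
        return Task<int>.Factory.FromAsync(
            stream.BeginRead, stream.EndRead,
            buffer, 0, buffer.Length,
            null); // state object, unused here
    }

    public static void Main()
    {
        // Assumption: in-memory bytes stand in for a downloaded feed.
        var source = new MemoryStream(Encoding.UTF8.GetBytes("<rss>demo</rss>"));
        var buffer = new byte[64];

        // The continuation plays the role of the "parse" step.
        Task<string> parsed = ReadAsTask(source, buffer)
            .ContinueWith(t => Encoding.UTF8.GetString(buffer, 0, t.Result));

        Console.WriteLine(parsed.Result);
    }
}
```

The point of the pattern is that the task represents the pending IO itself, so you can compose it with ContinueWith or WaitAll like any other task without dedicating a thread to the wait.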

If you parallelize your calls to the URLs, I think it will improve your application even if you have only one core. Take a look at this code:

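using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

var client = new HttpClient();
var urls = new[] { "a", "url", "to", "find" }; // placeholders, not real URLs

// GetAsync follows the Task-based Asynchronous Pattern (TAP), so the requests run concurrently.
// ToList() starts every request right away instead of deferring the query.
var tasks = urls.Select(c => client.GetAsync(c)).ToList();

// AnalyzeThisWords receives the array of responses once all requests have completed.
var result = Task.WhenAll(tasks).ContinueWith(a => AnalyzeThisWords(a.Result));
result.Wait(); // only needed if the caller must block until everything is done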

The difference between multithreading and asynchrony in this case is how the callback/completion is done.

With the task-based pattern (TAP), the number of tasks is not related to the number of threads.

As you're relying on the GetAsync task, the HTTP client uses a network stream (socket, TCP client or whatever) and asks it to signal when the BeginRead/EndRead operation is done. So no threads are involved while the request is pending.

After the completion is signaled, a new thread may be created, but it's up to the TaskScheduler (the one used in the GetAsync/ContinueWith calls) whether to create a new thread, use an existing thread, or inline the task on the calling thread.

If AnalyzeThisWords blocks for too long, you start to get bottlenecks, as the "callback" in ContinueWith runs on a thread pool worker.
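To illustrate the "tasks are not threads" point, here is a small sketch under a loud assumption: Task.Delay stands in for a pending network request (FakeDownloadAsync is a hypothetical replacement for client.GetAsync). The delay is timer-based, so no thread sits blocked while it is pending, much like real asynchronous IO.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class TasksVersusThreadsSketch
{
    // Hypothetical stand-in for client.GetAsync(url).
    static async Task<int> FakeDownloadAsync(int id)
    {
        await Task.Delay(200); // simulated network latency, no thread is blocked
        return id;
    }

    // Starts `count` simulated downloads at once and measures the total time.
    public static async Task<TimeSpan> RunAsync(int count)
    {
        var watch = Stopwatch.StartNew();
        await Task.WhenAll(Enumerable.Range(0, count).Select(FakeDownloadAsync));
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        TimeSpan elapsed = RunAsync(500).GetAwaiter().GetResult();
        Console.WriteLine("500 simulated downloads took " + elapsed.TotalMilliseconds + " ms");
    }
}
```

Run one after another, 500 waits of 200 ms would take about 100 seconds. Started together they should finish in a small fraction of that, without anything close to 500 threads, and that holds on a single core as well.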