Simple description of worker and I/O threads in .NET

It is hard to find a detailed yet simple description of worker and I/O threads in .NET.

What is clear to me about this topic (though perhaps not technically precise):

  • Worker threads are threads that are supposed to do their work using the CPU;
  • I/O threads (also called "completion port threads") are supposed to work with device drivers and essentially "do nothing", only monitoring the completion of non-CPU operations.

What is not clear:

  • Although the method ThreadPool.GetAvailableThreads returns the number of available threads of both types, there seems to be no public API to schedule work to an I/O thread. Can you only manually create worker threads in .NET? (See the GetAvailableThreads sketch just after this list.)
  • It seems that a single I/O thread can monitor multiple I/O operations. Is that true? If so, why does the ThreadPool have so many available I/O threads by default?
  • In some texts I read that the callback triggered after an I/O operation completes is executed by an I/O thread. Is that true? Given that the callback is a CPU operation, isn't that a job for a worker thread?
  • More specifically: do ASP.NET asynchronous pages use I/O threads? What exactly is the performance benefit of switching I/O work to a separate thread instead of simply increasing the maximum number of worker threads? Is it because a single I/O thread monitors multiple operations? Or does Windows do context switching more efficiently when using I/O threads?
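
For reference, a minimal console sketch of the two counts GetAvailableThreads reports (class and variable names here are just for illustration):

    using System;
    using System.Threading;

    class AvailableThreadsDemo
    {
        static void Main()
        {
            int workerThreads, completionPortThreads;

            // The two counts come from two separate pools maintained by the CLR:
            // CPU-bound worker threads and I/O completion port ("I/O") threads.
            ThreadPool.GetAvailableThreads(out workerThreads, out completionPortThreads);

            Console.WriteLine("Available worker threads:         " + workerThreads);
            Console.WriteLine("Available I/O completion threads: " + completionPortThreads);
        }
    }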

Simply put, a worker thread is meant to perform a short period of work and will delete itself when it has completed that work. A callback may be used to notify the parent process that it has completed, or to pass back data.

An I/O thread will perform the same operation or series of operations continuously until stopped by the parent process. It is so called because device drivers typically run continuously, monitoring the device port. An I/O thread will typically create Events whenever it wishes to communicate with other threads.

All processes run as threads. Your application runs as a thread. Any thread may spawn worker threads or I/O threads (as you call them).

There is always a fine balance between performance and the number or type of threads used. Too many callbacks or Events handled by a process will severely degrade its performance due to the number of interruptions to its main process loop as it handles them.

For example, a worker thread might add data to a database after a user interaction, perform a long mathematical calculation, or write data to a file. By using a worker thread you free up the main application thread, which is very useful for a GUI, because it does not freeze while the task is being performed.
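
As a rough sketch of that idea (the calculation below is just a stand-in for any long-running job):

    using System;
    using System.Threading;

    class WorkerThreadDemo
    {
        static void Main()
        {
            // Spawn a background thread for the long-running job so the
            // main (e.g. GUI) thread stays responsive.
            var worker = new Thread(() =>
            {
                double sum = 0;
                for (int i = 1; i <= 50000000; i++)
                    sum += Math.Sqrt(i);            // stand-in for a long calculation
                Console.WriteLine("Worker finished: " + sum);
            });
            worker.IsBackground = true;
            worker.Start();

            Console.WriteLine("Main thread is free to do other work.");
            worker.Join();                          // only so this console demo waits for the result
        }
    }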

Someone with more skills than me is going to jump in here to help out.

Worker threads carry a lot of state, are scheduled onto the processor by the operating system, and you control everything they do.

I/O Completion Ports are provided by the operating system for very specific tasks involving little shared state, and thus are faster to use. A good example in .NET is the WCF framework. Every "call" to a WCF service is actually dispatched through an I/O completion port, because these are the fastest to launch and the OS looks after them for you.

I'll begin with a description of how asynchronous I/O is used by programs in NT.

You may be familiar with the Win32 API function ReadFile (as an example), which is a wrapper around the Native API function NtReadFile. This function allows you to do two things with asynchronous I/O:

  • You can create an event object and pass it to NtReadFile. This event will then be signaled when the read operation completes.
  • You can pass an asynchronous procedure call (APC) function to NtReadFile. Essentially what this means is that when the read operation completes, the function will be queued to the thread which initiated the operation and it will be executed when the thread performs an alertable wait.

There is however a third way of being notified when an I/O operation completes. You can create an I/O completion port object and associate file handles with it. Whenever an operation is completed on a file which is associated with the I/O completion port, the results of the operation (like the I/O status) are queued to the I/O completion port. You can then set up a dedicated thread to remove results from the queue and perform the appropriate tasks like calling callback functions. This is essentially what an "I/O worker thread" is.

A normal "worker thread" is very similar; instead of removing I/O results from a queue, it removes work items from a queue. You can queue work items (QueueUserWorkItem) and have the worker threads execute them. This prevents you from having to spawn a thread every single time you want to perform a task asynchronously.

The term 'worker thread' in .net/CLR typically just refers to any thread other than the Main thread that does some 'work' on behalf of the application that spawned the thread. 'Work' could really mean anything, including waiting for some I/O to complete. The ThreadPool keeps a cache of worker threads because threads are expensive to create.

The term 'I/O thread' in .net/CLR refers to the threads the ThreadPool reserves in order to dispatch NativeOverlapped callbacks from "overlapped" win32 calls (also known as "completion port I/O"). The CLR maintains its own I/O completion port, and can bind any handle to it (via the ThreadPool.BindHandle API). Example here: http://blogs.msdn.com/junfeng/archive/2008/12/01/threadpool-bindhandle.aspx. Many .net APIs use this mechanism internally to receive NativeOverlapped callbacks, though the typical .net developer won't ever use it directly.
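
As a rough illustration of one API that uses this mechanism (a sketch, with "data.bin" as a placeholder file name): a FileStream opened with FileOptions.Asynchronous performs overlapped I/O, and its completion callback is dispatched on one of the pool's I/O completion port threads.

    using System;
    using System.IO;
    using System.Threading;

    class OverlappedReadDemo
    {
        static void Main()
        {
            // FileOptions.Asynchronous requests overlapped ("completion port") I/O;
            // the CLR binds the handle to its I/O completion port internally.
            var fs = new FileStream("data.bin", FileMode.Open, FileAccess.Read,
                                    FileShare.Read, 4096, FileOptions.Asynchronous);
            var buffer = new byte[4096];

            fs.BeginRead(buffer, 0, buffer.Length, ar =>
            {
                // This callback runs on an I/O completion port thread.
                int bytesRead = fs.EndRead(ar);
                Console.WriteLine("Read " + bytesRead + " bytes on thread " +
                                  Thread.CurrentThread.ManagedThreadId);
                fs.Dispose();
            }, null);

            Console.ReadLine();   // keep the process alive until the callback fires
        }
    }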

There is really no technical difference between 'worker thread' and 'I/O thread' -- they are both just normal threads. But the CLR ThreadPool keeps separate pools of each simply to avoid a situation where high demand on worker threads exhausts all the threads available to dispatch native I/O callbacks, potentially leading to deadlock. (Imagine an application using all 250 worker threads, where each one is waiting for some I/O to complete).

The developer does need to take some care when handling an I/O callback in order to ensure that the I/O thread is returned to the ThreadPool -- that is, I/O callback code should do the minimum work required to service the callback and then return control of the thread to the CLR threadpool. If more work is required, that work should be scheduled on a worker thread. Otherwise, the application risks 'hijacking' the CLR's pool of reserved I/O completion threads for use as normal worker threads, leading to the deadlock situation described above.
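
A sketch of that pattern, again with "data.bin" as a placeholder: the completion callback does only the minimal bookkeeping and immediately hands the CPU-heavy part to a worker thread.

    using System;
    using System.IO;
    using System.Threading;

    class HandOffDemo
    {
        static void Main()
        {
            var fs = new FileStream("data.bin", FileMode.Open, FileAccess.Read,
                                    FileShare.Read, 4096, FileOptions.Asynchronous);
            var buffer = new byte[4096];

            fs.BeginRead(buffer, 0, buffer.Length, ar =>
            {
                // I/O completion port thread: do the bare minimum here...
                int count = fs.EndRead(ar);
                fs.Dispose();

                // ...then push the expensive CPU-bound processing onto a worker
                // thread so the I/O thread returns to the pool right away.
                ThreadPool.QueueUserWorkItem(_ => ProcessBuffer(buffer, count));
            }, null);

            Console.ReadLine();
        }

        static void ProcessBuffer(byte[] buffer, int count)
        {
            // Stand-in for expensive work (parsing, hashing, etc.).
            Console.WriteLine("Processing " + count + " bytes on worker thread " +
                              Thread.CurrentThread.ManagedThreadId);
        }
    }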

Some good references for further reading:

  • Win32 I/O completion ports: http://msdn.microsoft.com/en-us/library/aa365198(VS.85).aspx
  • Managed thread pool: http://msdn.microsoft.com/en-us/library/0ka9477y.aspx
  • Example of ThreadPool.BindHandle: http://blogs.msdn.com/junfeng/archive/2008/12/01/threadpool-bindhandle.aspx