6/10/2009

Winsock I/O Model - Part I : Concept

The basic steps of Windows socket programming are simple and straightforward:

Server Side

  1. Initialize Winsock.
  2. Create a socket.
  3. Bind the socket.
  4. Listen on the socket for a client.
  5. Accept a connection from a client.
  6. Receive and Send data.
  7. Disconnect.

Client Side

  1. Initialize Winsock.
  2. Create a socket.
  3. Connect to the server.
  4. Send and Receive data.
  5. Disconnect.

For more detailed information, please refer to the Winsock Programming Guide @ MSDN [2].

But the hard problem is how to write a high-performance and highly scalable network application, which means:
- Serving as many concurrent connections as possible
- Responding to each request as soon as possible

The various solutions to this problem are called network I/O models. I have written a general article on I/O models [1]; this post focuses on the same problem in the network area. Using the same taxonomy as [1], we can categorize the various network I/O models as:

1. Multitasking

2. I/O Multiplexing
 - BSD Select
 - Window Message Select
 - Kernel Event Select

3. Overlapped (Async) I/O
 - Polling using GetOverlappedResult()
 - Waiting on Kernel Event
 - Callback using Completion Routine
 - Queuing using I/O Completion Port

Let's look at each model in detail, one by one:

1. Multitasking

Usually, you use one thread to serve one client connection in this model. Since each thread consumes resources (such as memory for its stack), and context switching among a large number of threads is expensive, this model is not considered scalable.

2. I/O Multiplexing

- 2.1 BSD Select

In this model, you issue synchronous network I/O requests only when you know they won't block. You call the BSD-style select() function to check which operations on which sockets won't block (i.e., you use select() to get network event notifications).

The advantage of this model is that you can multiplex connections and I/O operations in a single thread, which greatly reduces OS resource consumption. The drawbacks are that the number of socket handles one select() call supports is limited (FD_SETSIZE, which defaults to 64), that the socket sets must be rebuilt and handed to the system on every call, and that the cost of each call grows with the number of sockets being watched.

- 2.2 Window Message Select

In this model, network events, such as an incoming connection or data being ready, are sent to a window procedure as standard window messages, just like a keystroke or mouse movement event.

To use this model, you call WSAAsyncSelect() to associate a socket handle with a window handle. For which situations trigger which window messages, please see the WSAAsyncSelect documentation at MSDN.

The advantage is the same as BSD Select. The drawback is that it requires a window handle and a window procedure, which is awkward for some applications (such as services or console applications).

- 2.3 Kernel Event Select

This model is similar to BSD Select. You call WSAEventSelect() to associate a socket with a WSAEVENT, and call WSAWaitForMultipleEvents() to wait for network event notifications. When the kernel event is signaled, you call WSAEnumNetworkEvents() to determine which network event(s) occurred on which socket handle.

Compared with the Window Message model, this model doesn't need a window object or a message queue. But since it can wait on at most 64 WSAEVENTs (WSA_MAXIMUM_WAIT_EVENTS) in one call, multiple threads are needed to handle more concurrent socket connections, so it is not considered a very scalable model either.

3. Overlapped I/O

Overlapped I/O can be performed only on sockets created through the WSASocket() function with the WSA_FLAG_OVERLAPPED flag set, or on sockets created through the socket() function (which sets the overlapped attribute by default). When working in the overlapped model, each time you make a network call, you pass a WSAOVERLAPPED structure as a parameter. The call returns immediately, whether or not the operation has completed.

Overlapped network I/O only works with the following APIs:
WSASend
WSASendTo
WSARecv
WSARecvFrom
WSARecvMsg
AcceptEx
ConnectEx
DisconnectEx
TransmitFile
TransmitPackets
WSAIoctl

Completion Notification - Overlapped socket calls return immediately after being issued. There are 3 ways to get the completion notification:
- 3.1 Callback: a completion routine, specified as an optional parameter to some socket calls, gets called when the calling thread is in an alertable state.
- 3.2 Event: WSAOVERLAPPED contains a WSAEVENT handle, which is signaled when the related I/O operation completes. You must set a valid WSAEVENT handle in the overlapped structure that is passed to the overlapped socket call.
- 3.3 Polling: after issuing overlapped socket calls, you can call WSAGetOverlappedResult() to poll the completion status of the previous overlapped calls.

4. Overlapped I/O + IOCP

In the IOCP model, you associate a socket handle with an I/O completion port, and any further overlapped socket operation will use this IOCP for completion notification. Compared with the previous models, this model is the most complicated one. You must fully understand overlapped I/O and the I/O completion port mechanism to understand this model well.

Typically, when an overlapped I/O call is made, a pointer to an OVERLAPPED structure is passed as a parameter to that I/O call. GetQueuedCompletionStatus() will return the same pointer when the operation completes. With this structure alone, however, an application can't tell which operation just completed. In order to keep track of the operations that have completed, it's useful to define your own structure that embeds the OVERLAPPED structure together with any extra information about each operation queued to the completion port. This is the so-called "per-I/O data".

The CompletionKey parameter used when associating a socket with an IOCP (using CreateIoCompletionPort()) can be used to store "per-handle data". For each completion event related to a socket handle, the corresponding CompletionKey can be retrieved as the 3rd parameter of GetQueuedCompletionStatus().

With IOCP, which is in fact a kernel queue object, there is no fixed limit on how many concurrent connections you can manage in one thread. IOCP also provides system support for optimizing thread scheduling to reduce context-switch cost. So this model is the preferred scalable server network model.

This article focuses only on theory and ideas; in the next article on this topic, I will show real code using these ideas.

NOTES:
1. WSAAsyncSelect/WSAEventSelect will change the associated sockets from blocking mode to non-blocking mode, but BSD Select doesn't have this side effect.
2. In the I/O Multiplexing model (using BSD/Message/Event Select), a robust application must be prepared for the possibility that a network event arrives but the corresponding Winsock call can't complete immediately.
3. "Socket overlapped I/O versus blocking/nonblocking mode" [5] is a great article about Sync/Async vs. Blocking/Non-Blocking.
4. When using the I/O Multiplexing model, the socket is usually set to non-blocking mode (manually when using BSD Select, automatically when using Message/Event Select).

[Reference]
1. High Performance I/O Models
2. Winsock Programming Guide @ MSDN
3. Windows Sockets: A Quick And Dirty Primer
4. BSD Socket Introduction
5. Socket overlapped I/O versus blocking/nonblocking mode
6. Book : Network Programming for Microsoft Windows
