6/04/2009

Inside Scalable I/O Model - In Sync & Async Way

  The so called "I/O Model" describes how you write code to accomplish I/O tasks, such as connecting to a server, writing data to disk, receiving data from network etc. Programmatically accomplishing I/O tasks is far more complicated than just calling a single related OS API, for example WriteFile() or ReadFile().

  In this article, I try to summarize and explain all the options you have when programming I/O tasks.

I. What's Sync & Async I/O

  The concepts - "Asynchronous and Synchronous" come from the telecommunication domain. They mainly deal with timing - "Sync communication needs an external clock signal to coordinate the rhythm of sending/receiving sides, while in Async model, any timing required to recover data from the communication symbol is encoded within the symbols."[1]

  Sync/Async are introduced into computer science for a different meaning: "Async operations are those occurring independently of the issuing program flow, while in Sync model, requesting program flow must wait for the operations to acknowledge completion."[1]

  Particularly, "Asynchronous I/O is a form of I/O processing that permits other processing to continue before the transmission has finished"[2]; in Synchrounous I/O, the operation sender "must wait until the hardware has completed the physical I/O, so that it can be informed of the success or failure of the operation"[5].

  MSDN also has a great article about the Sync/Async concepts - Synchronous and Asynchronous I/O.

  NOTE: in the following sections, we use Async/Non-Blocking, Sync/Blocking alternatively and don't distinguish them. This convention is conformed to wikipedia and MSDN, but some popular textbooks(such as APUE ) differentiate these terms explicitly.

II. Why Invent Two I/O Models?

  The Sync I/O model is simple to understand and straightforward to use, but you should wait while the I/O is performing, thus the ultimate performance is limited.

  The basic motivation for Async I/O is to boost performance. In fact, there are two assumptions about this problem:
  1. I/O operations are time consuming, there IS a lot of time for CPU to execute other instructions while the I/O ops are going on.
  2. There are large amount of available instructions to execute when the results from I/O ops are not available.

   If these two assumptions are not correct, the performance of your code may not be improved even if you adopted the powerful Async I/O programming model.

III. How to do Scalable I/O Operations?

The challenges for scalable I/O model are:
1. What code to execute after I/O operations are issued (maybe in async/sync way)?
2. How to detect and What to do when I/O operations are completed?

Scalable I/O operations can be accomplished in either Sync or Async way:

- Scalable Synchronous I/O Models

  In Sync I/O Model, the API calls return only when the requested operation completes. But this doesn't mean that these calls must block. For example, suppose you want to read data from socket, if there are some data in the socket buffer before you issued the read request, the read() call will succeed without block.

  In order to avoid CPU idle while i/o operations are blocked, you may execute other tasks that is not related to the pending i/o, or avoiding i/o blocking by means of only issuing i/o operations at proper time. That is to say:

1. Multitasking - Code in other tasks get executed after I/O request is issued and code follows the I/O operations get executed when they are completed. (Task may take the form of thread or process)

2. I/O Multiplexing - Use one thread to handle multiple sync I/O device handles.
  The key idea of this model is that - you call sync i/o functions only when you know that call will NOT block the calling thread (Because the data/status is ready even if you haven't issued any corresponding i/o operations, for example: connect(), read() on socket).
  You use some sync calls (such as select, poll, epoll) to check multiple device handles' status to know what i/o operations on them will not block.
  Essentially, I/O multiplexing provides a way to check the status of multiple device handles in one sync call. It only returns when some events happened on some handles. Thus you can handle multiple Sync I/O operations in one thread.

- Scalable Asynchronous I/O Models

  In Async Model, I/O operation call returns immediately after being issued and code that follows will get executed. The various models differs only on how to get I/O completion notification:

1. Polling - Use various polling function (such as HasOverlappedIoCompleted() on Windows) to poll whether some pending Async I/O is completed

2. Callback - Kernel will call some user provided call back functions when some i/o operations are completed.

3. Waiting - The OS kernel will signal some kernel event objects when i/o operations are completed. Application code just waits on these events to get completion notification.

4. Queuing - Kernel will place some information packet into a Queue (such as I/O Completion Port on Windows) when I/O operations completed. Client code should check that Queue for completion information.

Another advantage of Async I/O is that it reduces memory copy cost. Other I/O models need copy data(for send/receive) in system buffer from/to application buffer. In Async I/O, system knows the receiving/sending buffer when the io request is issued, so I can use it directly, no need to do memory copy between system space and user space.

IV. I/O Models on Windows Platform

- Multi-threading is supported through thread library
- I/O multiplexing is supported by select(), WSAAsyncSelect(), WSAEventSelect()
- Async Polling/Async Waiting is supported by means of Overlapped I/O
- Async Callback model is supported by means of Alertable I/O
- Async Queuing is supported by means of I/O Completion Port

NOTE:

1. To understand Alertable I/O, you must get familiar with the Asynchronous Procedure Call mechanism first, which is a great infrastructure for inter-thread/inter-process communication facility: Win32 API QueueUserAPC() can be used to queue APC entry to other thread's APC queue even in other process.
- The most famous usage of this facility is Gracefully Terminating Thread - if a thread is in alertable waiting state and you want to terminate it, you can post a dummy APC to it. That thread will then resume from wait status, execute this dummy APC. You can do graceful termination actions after that. [more detail can be found in chapter 10, "Windows via C/C++"]

2. Alertable I/O and Overlapped I/O are old techniques, use I/O Completion Port as much as you can, it's easy to use and also very powerful.

V. I/O Models on Linux Platform

- Multi-threading is supported
- I/O multiplexing is supported by select(), poll(), epoll_xxx()
- Async Callback is supported by Linux AIO using system signal or notify function

NOTE:
- poll() improves select() in that there is no limitation on handle counts, and epoll_xxx() improves poll() in that data transfer between kernel and user space is reduced greatly and the complexity of the corresponding kernel logic is also reduced from O(N^3) to O(N).[17]

VI. I/O Models on .Net

- Async Callback is supported in the form of Event-based Asynchronous Pattern
- Async Polling/Async Waiting/Async Callback is supported in the form of IAsyncResult Async Pattern

.Net also provides a mechanism for developers to call Synchronous method in Asynchronous way using Asynchronous Programming Using Delegates.

More information can be found @ Asynchronous Programming Design Patterns

VII. I/O Models on Java

-
I/O multiplexing is supported in Readiness Selection pattern in Java NIO library. It works similarly as BSD Select mechanism
- Async Polling/Async Waiting/Async Callback will be supported in JDK7 AIO as Java Future style and Callback Style[26]

The difference between Java NIO and AIO can be described as "With non-blocking I/O, you're getting events through a selector when the channel is ready to do I/O. The asynchronous I/O gives you a notification when the I/O is completed".[27]

Currently, scalable I/O library on Java is very raw & preliminary, many IO libraries for Java exit in the community, such as Apahce MINA, Jboss Netty, Grizzly from GlassFish and Naga.

Update@12/23/2009:
- The I/O Multiplexing mechanism is essentially a Reactor design pattern, while Asynchronous I/O is in fact a Proactor design pattern. [31]

[Reference]

- General
1. Wikipedia explanation, Async vs Sync, Async
2. Asynchronous IO, http://en.wikipedia.org/wiki/Asynchronous_I/O
3. C10K Problem, http://www.kegel.com/c10k.html
4. MPI Async I/O, http://beige.ucs.indiana.edu/I590/node109.html
5. Sync/Async by Oracle GURU, http://www.ixora.com.au/notes/asynchronous_io.htm

- Windows
6. Mulithreaded Async IO V.S. IO Completion Port
7. IO Completion Port (IOCP Introduction, Inside IOCP)
8. Callback supports by Windows Kernel
9. Tips for Async I/O and IOCP
10. Async vs Sync I/O on Windows

- Linux
11. Async IO on Linux from Lighttpd
12. Ideas for Async-IO model from Squid
13. Kernel AIO support for Linux
14. Linux AIO Programming Introduction
15. Linux AIO Design Notes
16. epoll performance analysis
17. select, poll, epoll comparison
18. I/O Event Handling Under Linux

- .Net
19. .NET Asynchronous Programming Design Patterns
20. .Net Asynchronous File I/O
21. Call Sync Method in Async Way on .Net

- Java
22. Java I/O Tutorial
23. Java Async I/O Library from IBM
24. Java NIO (Tutorial, JDK Doc, JSR51, Intr1, Intr2)
25. Java NIO.2/AIO (JSR203 OpenJdk NIO Page, Design Notes)
26. Java One 2009(NIO in JDK7, AIO in JDK7)
27. Interview on Java NIO.2
28. Java Concurrency Tutorial

- Talk
29. http://www.slideshare.net/Arbow/asynchronous-io-programming
[This Presentation describes many reasoning and insights behind Async IO]
30. http://www.slideshare.net/directi/async-io-and-multithreading-explained
[Interesting and Intuitve explanation on Threading and Async IO]

31. Great Article on Comparing High Performance I/O Design Patterns
32. Asynchronous I/O and Asynchronous Disk I/O Explorer
32. I/O Multiplexing & Scalable Socket Servers

No comments: