How to Read Source Code

Part I - General Steps and Principles

1. Define a Clear Goal
- what is your purpose in reading: to learn how it works, to take ownership of a component, or to modify and extend it?
- be results driven: what outcomes do you want at the end?
- focus only on what you need to reach that goal

2. Know It as a User
- read the user manual
- get the overall big picture
- know what it can and cannot do
- know what it is suitable for and what it is not
- try the software, or write a small application on top of the library

3. Think Before Reading
- how would you design the whole system yourself?
- what are the core challenges that are unclear to you?
- write down your questions and concerns
- read with questions

4. Know the Architecture/Components
- know the overall architecture first
- divide the whole system into small components
- identify what to focus on and what to ignore
- use the build files to identify component dependencies
- try building it
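
As an illustration of using a build file to find component dependencies, here is a minimal sketch in Python. It assumes a plain Makefile with simple "target: prerequisites" rules; the Makefile fragment and the helper names (parse_dependencies, shared_headers) are made up for this example, and real build files (variables, pattern rules, includes, CMake, etc.) need a proper tool.

```python
import re
from collections import defaultdict

# Hypothetical Makefile fragment; a real project's build file will differ.
SAMPLE_MAKEFILE = """\
CC = gcc
server: net.o store.o main.o
net.o: net.c net.h
store.o: store.c store.h
main.o: main.c net.h store.h
"""

# Matches "target: prerequisites" but not variable assignments ("=" or ":=").
RULE = re.compile(r"^([^\s:#=]+)\s*:(?!=)\s*(.*)$")


def parse_dependencies(text):
    """Map each make target to the prerequisites listed on its rule line."""
    deps = defaultdict(list)
    for line in text.splitlines():
        if line.startswith("\t"):  # recipe lines are not rules
            continue
        match = RULE.match(line)
        if match:
            deps[match.group(1)].extend(match.group(2).split())
    return dict(deps)


def shared_headers(deps):
    """Headers used by more than one target hint at component interfaces."""
    users = defaultdict(set)
    for target, prereqs in deps.items():
        for prereq in prereqs:
            if prereq.endswith(".h"):
                users[prereq].add(target)
    return {h: sorted(t) for h, t in users.items() if len(t) > 1}


deps = parse_dependencies(SAMPLE_MAKEFILE)
shared = shared_headers(deps)
for target, prereqs in deps.items():
    print(target, "<-", ", ".join(prereqs))
```

In this made-up example, net.h and store.h are each used by two object files, which suggests they are the interfaces worth reading first.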

5. Read Specific Part in Detail
- make a SMART (Specific, Measurable, Achievable, Results-based, Time-specific) plan
- focus on core parts and ignore trivial ones
- identify the entry point: the main/wmain function
- identify the main loop (for a server application)
- identify thread creation/termination
- identify the core data structures
- identify the operations on the core data structures
- use a typical scenario to figure out how the system really works
- take notes, write documentation, and draw diagrams while reading
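
A rough way to locate the entry point, main loop, and thread creation/termination in an unfamiliar C code base is to scan for well-known landmarks. The sketch below is only a text search, not a parser: the patterns and the sample C source are assumptions chosen for illustration, and it will miss anything written differently (for example, an event loop hidden inside a library call).

```python
import re

# Assumed landmark patterns for C code; extend them for the project at hand.
LANDMARKS = {
    "entry point": re.compile(r"\b(?:w?main)\s*\("),
    "main loop": re.compile(
        r"\bwhile\s*\(\s*(?:1|true)\s*\)|\bfor\s*\(\s*;\s*;\s*\)"
    ),
    "thread creation": re.compile(
        r"\b(?:pthread_create|CreateThread|_beginthreadex)\s*\("
    ),
    "thread termination": re.compile(
        r"\b(?:pthread_join|pthread_exit|ExitThread)\s*\("
    ),
}

# Hypothetical source file used only to demonstrate the scan.
SAMPLE_C = """\
#include <pthread.h>
static void *worker(void *arg) { return arg; }
int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);
    for (;;) {
        /* accept and dispatch requests */
        break;
    }
    pthread_join(t, NULL);
    return 0;
}
"""


def find_landmarks(source):
    """Return (line number, landmark kind) pairs in source order."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for kind, pattern in LANDMARKS.items():
            if pattern.search(line):
                hits.append((lineno, kind))
    return hits


landmarks = find_landmarks(SAMPLE_C)
for lineno, kind in landmarks:
    print(f"line {lineno}: {kind}")
```

The hits give a short list of places to start reading in detail; the index tools in Part II do the same job more reliably.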

6. Produce Results
- big picture from the user's perspective
- big picture from the developer's perspective
- architecture/logic of each individual component
- summary of the core data structures
- practice: build/deploy/use/debug/modify/hack the system
- comments on the implementation (what's good, what's bad, what you learned)

7. Misc Tips
- read the docs (user manual, design doc) before the code
- get documentation of the core data structures first, if possible
- read the code both statically and dynamically (in a debugger)
- debug/step through the code using a specific execution scenario
- read code iteratively; don't dive into detail at the beginning
- use interfaces/contracts to separate concerns
- go from overall to detail, but go into detail only in small areas
- leverage code comprehension tools to get static information
- print out the core code and read it on paper
- try to write unit tests/use cases for the software
- consider refactoring the code (a kind of active reading) if unit tests are available
- if it's really hard to read, consider rewriting it!
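
To illustrate dynamic reading with a specific execution scenario, here is a minimal Python sketch that records which functions get called, in order, while one scenario runs. It uses the standard sys.settrace hook; the functions under study (parse, total, handle_request) and the helper trace_calls are hypothetical stand-ins, and for C/C++ code a debugger or profiler plays the same role.

```python
import sys


def trace_calls(func, *args, **kwargs):
    """Run func and return (result, names of Python functions called, in order)."""
    calls = []

    def tracer(frame, event, arg):
        if event == "call":
            calls.append(frame.f_code.co_name)
        return None  # per-line tracing is not needed here

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore the earlier trace hook
    return result, calls


# Hypothetical code under study.
def parse(text):
    return list(map(int, text.split(",")))


def total(values):
    return sum(values)


def handle_request(text):
    return total(parse(text))


# One concrete scenario: a request carrying three numbers.
result, calls = trace_calls(handle_request, "1,2,3")
print(result, calls)
```

The recorded call order is a reading list for that scenario: read handle_request first, then the functions it reaches, and ignore everything that never shows up.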

Part II - Tools

One of the most frequent activities when reading code is navigating among source files, so tools that help with navigation do a lot to improve reading efficiency. Here are some popular tools for this purpose:

1. Source Code Index Generator
cscope http://cscope.sourceforge.net/
ctags http://ctags.sourceforge.net/
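
To show what such an index contains, here is a minimal Python sketch that looks up symbols in a ctags-style tags file (typically generated with something like "ctags -R ." at the project root). It assumes the common tab-separated layout of name, file, and search address followed by ;" and optional fields, and it assumes no tab characters inside the search pattern. The sample entries and the helper parse_tags are made up for illustration; editors with tags support do this lookup for you.

```python
# Hypothetical tags file content; a real one is produced by running ctags.
SAMPLE_TAGS = (
    "!_TAG_FILE_FORMAT\t2\t/extended format/\n"
    "accept_loop\tnet.c\t/^static void accept_loop(int fd)$/;\"\tf\tfile:\n"
    "main\tmain.c\t/^int main(int argc, char **argv)$/;\"\tf\n"
    "store_t\tstore.h\t/^typedef struct store store_t;$/;\"\tt\n"
)


def parse_tags(text):
    """Build {symbol: [(file, search address, kind), ...]} from tags text."""
    index = {}
    for line in text.splitlines():
        if not line or line.startswith("!_TAG_"):
            continue  # skip blank lines and metadata pseudo-tags
        name, path, rest = line.split("\t", 2)
        address, _, fields = rest.partition(';"')
        fields = fields.strip()
        kind = fields.split("\t")[0] if fields else ""
        index.setdefault(name, []).append((path, address, kind))
    return index


index = parse_tags(SAMPLE_TAGS)
for path, address, kind in index["main"]:
    print(f"main is defined in {path} (kind {kind}): {address}")
```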

2. GUI Frontend for Index Generator
kscope http://kscope.sourceforge.net/
cbrowser http://cbrowser.sourceforge.net

3. Code Index Generating and Navigating
Source Navigator http://sourcenav.sourceforge.net/
Source Insight http://www.sourceinsight.com/
LXR http://lxr.linux.no/

4. For C Language ONLY
CXRef http://www.gedanken.demon.co.uk/cxref/
cflow http://www.gnu.org/software/cflow/

Part III - Further Reading

1. Code Comprehension Tool List - http://www.cs.ubc.ca/~murphy/cs319/index.html
2. Code Doc Generating Tool List - http://www.stack.nl/~dimitri/doxygen/links.html
3. A Survey on Code Comprehension Tools - http://www.grok2.com/code_comprehension.html
4. Tips for Code Reading - http://c2.com/cgi/wiki?TipsForReadingCode
5. Reading vs. Rewriting - http://www.joelonsoftware.com/articles/fog0000000069.html
