Memory Management in Native Code

Memory management is a core task in native code. Careless use of dynamic memory may cause the following problems:

- 1. Heap fragmentation: it hurts performance because it breaks data locality
- 2. Memory leaks: they are a program correctness problem and a serious defect in long-running software

Here I summarize some tips on the two issues: Mem Optimization & Mem Correctness

Part I - Mem Optimization

General Principles
1. Keep the mem layout cohesive (aka. improve data locality)
2. Avoid frequent alloc/free (aka. batch mem ops, prefer a few bulk mem ops over a large number of small ones)

How to implement them?
- Redesign your data structures so that they live in large contiguous blocks
- Put unrelated data structures in different regions
- Use a memory pool to manage mem
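
As a rough illustration of the memory pool idea, here is a minimal sketch of a fixed-size block pool. It assumes a C++11 compiler; the class name FixedPool and its acquire()/release() methods are hypothetical names made up for this example, not part of any library. It makes one bulk allocation up front and then hands out blocks from a free list, so the blocks stay next to each other in memory.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical fixed-size block pool: one bulk allocation up front,
// then O(1) acquire/release through a free list stored inside the blocks.
class FixedPool {
public:
    FixedPool(std::size_t block_size, std::size_t block_count)
        : block_size_(round_up(block_size)),
          storage_(block_size_ * block_count),
          free_list_(nullptr) {
        // Push the blocks last-to-first so that the first acquire()
        // returns the lowest address and neighbours stay adjacent.
        for (std::size_t i = block_count; i-- > 0;) {
            release(&storage_[i * block_size_]);
        }
    }

    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // Returns nullptr when the pool is exhausted (no fallback to the heap).
    void* acquire() {
        if (!free_list_) return nullptr;
        Node* n = free_list_;
        free_list_ = n->next;
        return n;
    }

    // Puts a block obtained from acquire() back on the free list.
    void release(void* p) {
        Node* n = ::new (p) Node;
        n->next = free_list_;
        free_list_ = n;
    }

private:
    struct Node { Node* next; };

    // Each block must be able to hold a Node and stay aligned.
    // Assumption: the vector's buffer is aligned for std::max_align_t.
    static std::size_t round_up(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        if (n < sizeof(Node)) n = sizeof(Node);
        return (n + a - 1) / a * a;
    }

    std::size_t block_size_;
    std::vector<unsigned char> storage_;
    Node* free_list_;
};
```

A released block is reused by the next acquire(), so a hot loop that allocates and frees objects of one size never touches the general-purpose heap.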

Part II - Mem Correctness

One of the challenges of native code development is avoiding memory leaks. It is easy to forget to release a memory block/object that was allocated explicitly, and hard to be sure that every one of them is released.

Types of Mem Leaks
1. Constant Leak: allocated mem is never released on any code path
2. Casual Leak: allocated mem is not released under some conditions
3. One-Time Leak: allocated mem is not released, but that line of code only gets executed once (for example, mem allocated in the ctor of a singleton object)
4. Implicit Leak: mem blocks are held too long (released too late in the application life cycle). This kind of mem leak happens even in GC-enabled languages such as Java/.Net, for example when unused objects are still reachable from the Root Set of the GC

To deal with the Mem Leak problem, you have two choices:
- Avoid it
- Detect & Fix it

Sec. I - How to Avoid Mem Leak?

1. Adopt Resource Acquisition Is Initialization (RAII) Mechanism in C++

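If you want to ensure that your object/mem gets released whenever control leaves some scope (for example, with multiple exit paths or potential exceptions), RAII solves the problem. std::auto_ptr is a good choice if RAII semantics are right for your problem.

But auto_ptr transfers ownership when it is copied, so it can't be put in STL containers safely, and passing or returning it by value silently takes ownership away from the source. (It was deprecated in C++11 in favor of std::unique_ptr and removed in C++17.)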
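
To make the RAII idea concrete, here is a small sketch. It assumes a C++11 compiler and uses std::unique_ptr, the standard replacement for std::auto_ptr; the Buffer type and the process()/make_buffer() functions are hypothetical names invented for the example. The live counter only exists so that a leak would be observable.

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical resource that counts live instances.
struct Buffer {
    static int live;
    Buffer() { ++live; }
    ~Buffer() { --live; }
};
int Buffer::live = 0;

// Three exit paths: early return, exception, normal return.
// The unique_ptr destructor releases the Buffer on every one of them.
int process(int mode) {
    std::unique_ptr<Buffer> buf(new Buffer);
    if (mode == 0) return -1;                               // early return
    if (mode == 1) throw std::runtime_error("bad input");  // exception
    return Buffer::live;                                    // normal path
}

// Ownership transfer is explicit: returning by value moves the
// pointer to the caller, who then becomes the single owner.
std::unique_ptr<Buffer> make_buffer() {
    return std::unique_ptr<Buffer>(new Buffer);
}
```

No path in process() contains a delete, yet Buffer::live is back to zero after each call, including the one that throws.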

2. Stack-based allocation

_alloca() allocates mem from the stack rather than the heap. The mem returned is released automatically when the calling function returns.

But a large request may cause a stack overflow, since the stack is far smaller than the heap.
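
One possible way to use stack allocation while guarding against the overflow risk is sketched below. The STACK_ALLOC macro, the kMaxStackBytes cap and the sum_of_squares() function are hypothetical names for this example. The sketch assumes _alloca() from <malloc.h> on MSVC and alloca() from <alloca.h> on other compilers (available on glibc, BSD and macOS, but not part of standard C++).

```cpp
#include <cstddef>
#include <vector>
#if defined(_MSC_VER)
#include <malloc.h>               // _alloca
#define STACK_ALLOC(n) _alloca(n)
#else
#include <alloca.h>               // alloca (assumed available)
#define STACK_ALLOC(n) alloca(n)
#endif

// Hypothetical cap: larger requests go to the heap to protect the stack.
const std::size_t kMaxStackBytes = 4096;

long sum_of_squares(std::size_t n) {
    if (n == 0) return 0;
    const std::size_t bytes = n * sizeof(long);
    std::vector<long> heap_fallback;
    long* scratch;
    if (bytes <= kMaxStackBytes) {
        // Stack memory: released automatically when this function returns.
        scratch = static_cast<long*>(STACK_ALLOC(bytes));
    } else {
        // Too big for the stack: let the vector own heap memory instead.
        heap_fallback.resize(n);
        scratch = heap_fallback.data();
    }
    for (std::size_t i = 0; i < n; ++i) {
        scratch[i] = static_cast<long>(i * i);
    }
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += scratch[i];
    }
    return total;
}
```

Neither branch needs an explicit free: the stack block disappears with the function's frame, and the vector releases the heap block in its destructor.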

3. Reference Counting (aka. shared/smart pointer)

Use some data structure to track how many owners reference the mem block or object. When the reference count drops to zero, the mem/object is released.

std::tr1::shared_ptr and boost::shared_ptr are both based on the Reference Counting and RAII concepts. They solve the problems noted for auto_ptr: they can be put in containers and passed as parameters and return values.

But if your objects have cyclic references, this mechanism doesn't work. The fundamental problem is that "useless" should be defined as "Can't Be Reached", not as "No One References".
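
The cyclic reference problem can be reproduced in a few lines. The sketch below assumes C++11 std::shared_ptr/std::weak_ptr (the tr1 and boost versions behave the same way as far as I know); the Node type and both functions are hypothetical names for the example. A non-owning weak_ptr on one side of the cycle is the usual way to break it.

```cpp
#include <memory>

struct Node {
    static int live;
    std::shared_ptr<Node> strong_peer;  // owning link: can form a cycle
    std::weak_ptr<Node> weak_peer;      // non-owning link: breaks the cycle
    Node() { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Two nodes own each other. After the local shared_ptrs are gone each
// count is still 1, so no destructor runs. Returns the number of nodes
// that stayed alive, then breaks the cycle by hand to clean up the demo.
int leak_with_cycle() {
    const int before = Node::live;
    std::weak_ptr<Node> watch;
    {
        std::shared_ptr<Node> a = std::make_shared<Node>();
        std::shared_ptr<Node> b = std::make_shared<Node>();
        a->strong_peer = b;
        b->strong_peer = a;
        watch = a;
    }
    const int leaked = Node::live - before;
    if (std::shared_ptr<Node> a = watch.lock()) {
        a->strong_peer.reset();  // manual cleanup, only for this demo
    }
    return leaked;
}

// Same shape, but the back link is weak: both nodes are destroyed
// as soon as the local shared_ptrs go out of scope.
int no_leak_with_weak() {
    const int before = Node::live;
    {
        std::shared_ptr<Node> a = std::make_shared<Node>();
        std::shared_ptr<Node> b = std::make_shared<Node>();
        a->strong_peer = b;
        b->weak_peer = a;
    }
    return Node::live - before;
}
```

The design question this raises is which side of the relationship is the owner; the weak link goes on the other side.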

Another drawback of smart-pointer-style reference counting is that it can't handle pointers that must be put into a union, because a union (before C++11) can't contain any member that has a user-defined/non-trivial default ctor, dtor, copy ctor etc.

4. Garbage Collection (yes, GC for C/C++)

Many GCs are built on the Mark & Sweep algorithm. The idea is that the GC keeps a list of pointers to all heap-allocated objects and a Root Set list of object pointers. When garbage collection is triggered, it traverses from the root set to find all reachable objects and marks them. The unmarked objects are garbage and can be deleted. GC for C/C++ is a huge topic; [7] is a very good reference doc.
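
The mark & sweep idea described above can be sketched as a toy collector. This is only an illustration under simplifying assumptions (objects register their outgoing references by hand, roots are added explicitly, C++11 compiler); the Object and ToyGC names are hypothetical and it is not how a production collector such as the one in [7] is implemented.

```cpp
#include <cstddef>
#include <vector>

// Toy heap object: outgoing references plus the GC mark bit.
struct Object {
    std::vector<Object*> refs;
    bool marked;
    Object() : marked(false) {}
};

class ToyGC {
public:
    ToyGC() = default;
    ToyGC(const ToyGC&) = delete;
    ToyGC& operator=(const ToyGC&) = delete;
    ~ToyGC() {
        for (Object* o : heap_) delete o;
    }

    // Every object is recorded in the GC's list of heap objects.
    Object* allocate() {
        Object* o = new Object;
        heap_.push_back(o);
        return o;
    }

    void add_root(Object* o) { roots_.push_back(o); }

    // Runs one collection and returns the number of objects freed.
    std::size_t collect() {
        // Mark: walk everything reachable from the root set.
        std::vector<Object*> pending(roots_);
        while (!pending.empty()) {
            Object* o = pending.back();
            pending.pop_back();
            if (o->marked) continue;
            o->marked = true;
            pending.insert(pending.end(), o->refs.begin(), o->refs.end());
        }
        // Sweep: delete unmarked objects, reset marks on survivors.
        std::vector<Object*> survivors;
        std::size_t freed = 0;
        for (Object* o : heap_) {
            if (o->marked) {
                o->marked = false;
                survivors.push_back(o);
            } else {
                delete o;
                ++freed;
            }
        }
        heap_.swap(survivors);
        return freed;
    }

    std::size_t live() const { return heap_.size(); }

private:
    std::vector<Object*> heap_;   // all heap-allocated objects
    std::vector<Object*> roots_;  // the root set
};
```

Because garbage is defined by reachability, two objects that only reference each other are collected here, which is exactly the case reference counting can't handle.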

General-purpose GC for C/C++ is difficult, but for the requirements of a specific application it may not be that challenging. In my own experience implementing a GC, the most difficult part is defining your object ownership policy.

GC is great, but it still can't handle "semantic garbage objects". That is to say, if you hold references to objects that you will in fact never use again, the GC won't collect the mem and other resources owned by these objects.

Essentially, Memory Management is all about Consistency of Ownership.
- Each mem block should have an owner
- Each mem block should have only 1 owner
- Only the owner of the mem block is responsible for its life cycle

So, the most important design principle of C/C++ memory management is: think carefully about the ownership of each object/mem block, i.e. who is responsible for releasing it and when.

Sec. II - How to Detect Mem Leak?

1. Use the Debug Version of the C RunTime library

1.1 Use _CrtDumpMemoryLeaks()

Step 1. Add the following directives to each cpp source file
#include <stdlib.h>
#include <crtdbg.h>

Step 2. Call _CrtDumpMemoryLeaks() at the point where you want to check for memory leaks.

This method has a drawback: mem objects that are released after the _CrtDumpMemoryLeaks() call are reported as leaked. (This happens when mem is released in a global object's dtor.) It's a false positive.

1.2 Use _CrtSetDbgFlag()

Add the following code at the entry point of your application

int nFlag = _CrtSetDbgFlag( _CRTDBG_REPORT_FLAG );  // read the current flags
nFlag |= _CRTDBG_LEAK_CHECK_DF;                     // dump leaks automatically at program exit
_CrtSetDbgFlag( nFlag );

This method doesn't have the drawback of 1.1, but you have no control over when the leak check is performed.

1.3 Use _CrtMemState

_CrtMemState cms1, cms2, cms3;

_CrtMemCheckpoint(&cms1);   // heap snapshot before
/* code to check */
_CrtMemCheckpoint(&cms2);   // heap snapshot after

if(_CrtMemDifference(&cms3, &cms1, &cms2))
    _CrtMemDumpStatistics(&cms3);   // the two snapshots differ

This code dumps heap statistics about the changes that happened in the "code to check".

_CrtSetReportMode() can be used to control where this diagnostic information is output.

_crtBreakAlloc / {,,msvcrtd.dll}_crtBreakAlloc / _CrtSetBreakAlloc() can be used to break into the debugger at a specific allocation number.
For more info on the CRT mem debug routines, please see references [1] and [2].

2. Monitor Process Working Set

The Win32 API GetProcessMemoryInfo() can query the process working set size. You can use this API to check whether the working set size has changed after calling some suspicious functions.

It can't tell you where a mem leak happens, but it is a good way to write unit tests that track mem leak problems.
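
A possible shape for such a check is sketched below. GetProcessMemoryInfo(), GetCurrentProcess() and PROCESS_MEMORY_COUNTERS are the real Win32 names; the working_set_bytes() and working_set_growth() helpers are hypothetical names for this example. The code is guarded so that it still compiles on other platforms, where it simply reports 0. On Windows it needs to be linked against psapi.lib.

```cpp
#include <cstddef>
#if defined(_WIN32)
#include <windows.h>
#include <psapi.h>
#if defined(_MSC_VER)
#pragma comment(lib, "psapi.lib")
#endif
#endif

// Current working set size in bytes; 0 if the query fails or the
// platform is not Windows (assumption made for portability of the sketch).
std::size_t working_set_bytes() {
#if defined(_WIN32)
    PROCESS_MEMORY_COUNTERS pmc;
    if (GetProcessMemoryInfo(GetCurrentProcess(), &pmc, sizeof(pmc))) {
        return static_cast<std::size_t>(pmc.WorkingSetSize);
    }
#endif
    return 0;
}

// Hypothetical helper: calls the suspicious function repeatedly and
// returns how much the working set grew, in bytes.
template <typename Fn>
long long working_set_growth(Fn suspect, int iterations) {
    suspect();  // warm-up, so one-time allocations are not counted
    const std::size_t before = working_set_bytes();
    for (int i = 0; i < iterations; ++i) {
        suspect();
    }
    const std::size_t after = working_set_bytes();
    return static_cast<long long>(after) - static_cast<long long>(before);
}
```

The working set is a noisy number, so a unit test would probably compare the growth against a generous threshold over many iterations rather than expect exactly zero.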

3. Use Professional Tools

IBM Purify
Windows Leaks Detector
Visual Leak Detector
User Mode Dump Heap

Part III - Other Tips

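1. Use "#define SAFE_DELETE(ptr) do { delete (ptr); (ptr) = NULL; } while (0)" to avoid deleting the same object twice. (delete on a NULL pointer is a no-op, so no NULL check is needed, and the do/while wrapper keeps the macro safe inside if/else.)
2. Remember to delete the objects pointed to by the elements of a container that holds pointers.
3. Pair new/delete, new[]/delete[] and malloc/free correctly.
4. Use "new (std::nothrow)" to get a NULL pointer instead of a std::bad_alloc exception in low-mem situations, and check the result.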
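
Tips 2 and 4 can be illustrated together with a short sketch. It assumes a C++11 compiler; Widget, add_widget() and delete_all() are hypothetical names for the example, and the live counter is only there to make leaks visible.

```cpp
#include <cstddef>
#include <new>
#include <vector>

struct Widget {
    static int live;
    Widget() { ++live; }
    ~Widget() { --live; }
};
int Widget::live = 0;

// Tip 4: new (std::nothrow) yields a null pointer instead of throwing
// std::bad_alloc, so the result must be checked before it is used.
bool add_widget(std::vector<Widget*>& widgets) {
    Widget* w = new (std::nothrow) Widget;
    if (w == nullptr) return false;
    widgets.push_back(w);
    return true;
}

// Tip 2: the vector owns only the pointers. clear() alone would leak
// every Widget, so each pointee is deleted first.
void delete_all(std::vector<Widget*>& widgets) {
    for (std::size_t i = 0; i < widgets.size(); ++i) {
        delete widgets[i];
    }
    widgets.clear();
}
```

A container of smart pointers would make delete_all() unnecessary, at the cost of the reference counting issues discussed earlier.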

NOTE: (Lessons learned from investigating this topic)

When solving hard problems, Be Sure To:
1. Use well-known idioms and well-understood mechanisms
2. Keep things as simple as possible

Memory Management:
1. It's another subsystem/component of your whole system
2. Design this component with care
3. Avoid using new/delete directly

The techniques discussed here apply not only to memory blocks, but also to any type of "resource" that needs explicit acquisition/release.


Mem Leak
1. Mem Leak Detection http://www.ddj.com/cpp/204800654
2. Microsoft CRT Debug Routines http://msdn.microsoft.com/en-us/library/1666sb98(VS.71).aspx
3. Microsoft CRT Debug Tech http://msdn.microsoft.com/en-us/library/zh712wwf.aspx
4. Mem Debugger List http://en.wikipedia.org/wiki/Memory_debugger
5. Purify from IBM - Use Purify for C code
6. Mem Leak in Java/.Net http://www.agiledeveloper.com/articles/MemoryLeak092002.pdf

Garbage Collection
7. C/C++ GC from HP http://www.hpl.hp.com/personal/Hans_Boehm/gc/
8. GC for C/C++ http://blog.codingnow.com/2008/06/gc_for_c.html
9. How .Net GC works, GC in OO Language, Auto GC

Understanding Mem Management
10. C++ Mem Management http://www.slideshare.net/reachanil/c-memory-management
11. Inside Mem Management http://www.ibm.com/developerworks/linux/library/l-memory/
12. C++ Memory Management: From Fear to Triumph (Part 1, Part 2, Part 3)
13. http://www.cantrip.org/wave12.html
14. Mem Optimization http://www.codingnow.com/2008/memory_management.ppt
15. Mem Mgmt 4 Sys Coder http://www.enderunix.org/simsek/articles/memory.pdf

16. C++ smart pointers http://www.onlamp.com/lpt/a/6559
17. Mem Leak Definition
18. Is Mem Leak Ever OK?
