1/20/2010

Database Technologies for Decision Support System

Database technologies can be applied in two types of scenarios:
- Transaction Processing (OLTP)
- Analytic Processing, using statistical methods (OLAP) or machine/computational learning methods (Data Mining)

OLTP, which is based on E.F. Codd's relational model, is the traditional (and probably most popular) application type of DBMS, and most people are very familiar with it. This post tries to summarize related technologies in analytic processing, which is widely adopted in decision support systems.

Part I - What Data to Analyze?

In a Decision Support (or Business Intelligence) system, the data to be analyzed usually comes from operational systems, i.e., OLTP relational databases. These relational databases are often located at different departments/sites, may use different DBMS vendors' products and different data schemas, and typically contain only data within a relatively short time span.

To make good business decisions, it's strongly desirable to hold historical data, view it in a uniform way, and avoid disturbing the daily operational environment. Thus comes the Data Warehouse, which is a repository of an organization's electronically stored data, designed to facilitate reporting and analysis.

Operational data (in the OLTP system) is extracted, transformed (and cleaned), and loaded (by the ETL subsystem) into the data warehouse for further analysis. OLAP and DM systems read these data, analyze them, produce useful reports and present them to end (business) users. (See diagram below)


Decision Support System Life Cycle, based on MS SQL Tech Doc
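The ETL step described above can be sketched in a few lines of Python; the rows and cleaning rules below are hypothetical, just to show the extract/transform/load flow:

```python
# Toy ETL pass: extract rows from an "operational" source, transform
# (clean + conform) them, and load them into the warehouse store.
source = [
    {"customer": " alice ", "amount": "10.5", "dept": "A"},
    {"customer": "BOB",     "amount": "n/a",  "dept": "B"},   # dirty row
    {"customer": "carol",   "amount": "7.25", "dept": "A"},
]
warehouse = []

for row in source:                       # extract
    try:
        amount = float(row["amount"])    # transform: parse + validate
    except ValueError:
        continue                         # clean: drop unparsable rows
    warehouse.append({
        "customer": row["customer"].strip().title(),  # conform names
        "amount": amount,
        "dept": row["dept"],
    })

print(len(warehouse), warehouse[0]["customer"])  # 2 Alice
```

Real ETL subsystems add surrogate keys, slowly changing dimension handling and incremental loads, but the shape of the pipeline is the same.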

Part II - How to Analyze?

There are two ways to analyze data in data warehouse:

1. Data Mining

Data Mining is the extraction of hidden predictive information from large databases. It is also defined as the knowledge discovery process (using machine learning algorithms) in databases.

Technical highlights:
- A typical Data Mining enabled information system processes data in terms of Records (or Cases).
- Such a system also provides some Language Extension to facilitate composing data mining related queries. (For example, MS SQL Server provides DMX - Data Mining Extensions.)

The challenging part of data mining is the various mining algorithms. Here is a list of the data mining algorithms available in MS SQL Server Analysis Services.
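To give a flavor of what one of these algorithms looks like, here is a minimal, illustrative sketch of k-means clustering (a classic clustering algorithm of the kind listed there); it is a toy 1-D version, not any product's implementation:

```python
# Minimal k-means sketch (1-D for brevity): assign each point to the
# nearest centroid, recompute centroids as cluster means, and repeat.
def kmeans(points, centroids, iterations=10):
    for _ in range(iterations):
        clusters = {c: [] for c in range(len(centroids))}
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: abs(p - centroids[c]))
            clusters[nearest].append(p)
        centroids = [sum(ps) / len(ps) if ps else centroids[c]
                     for c, ps in clusters.items()]
    return centroids

data = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]       # two clear clusters
print(sorted(round(c, 1) for c in kmeans(data, [0.0, 5.0])))  # [1.0, 9.5]
```

The hard part in practice is choosing k, the distance measure and the initialization, which is where the various algorithm families differ.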

See - A Basic Data Mining Tutorial using MS SQL Server

2. OnLine Analytical Processing(OLAP)

OLAP is the processing of large scale multidimensional data using statistics-based methods. A typical OLAP system provides:
- Multidimensional Model
- Analytical Query Language (for example, MS SQL Server provides MDX - Multidimensional Expressions)
- Analysis Server (or Engine) that executes analytical queries

See - A Basic Tutorial for OLAP in MS SQL Server

2.1 Multidimensional Model

The multidimensional model views data as cubes that generalize spreadsheets to any number of dimensions. It categorizes data either as numerical values (a.k.a. measures) associated with some facts, or as textual values (a.k.a. dimensions) that characterize the facts.

Facts represent the subject - the interesting events in an enterprise that need to be analyzed.
Dimensions represent context information for facts, perspectives to view facts.
Measures represent those numeric properties of facts that decision makers want to analyze.

For example, in a shoe shop, shoe purchasing events are the facts; the selling price is a measure attribute, and the color, the size, the manufacturer and the brand are all dimension attributes.
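The model can be made concrete with a toy cube in Python (the shop data below is made up): each fact carries dimension attributes and a measure, and a roll-up aggregates the measure along one dimension:

```python
# Toy fact table for the shoe shop: each fact (a purchase) has dimension
# attributes (brand, color, size) and a measure (price).
facts = [
    {"brand": "Alpha", "color": "red",   "size": 42, "price": 60.0},
    {"brand": "Alpha", "color": "black", "size": 40, "price": 55.0},
    {"brand": "Beta",  "color": "red",   "size": 42, "price": 80.0},
]

def rollup(facts, dimension, measure):
    """Aggregate a measure along one dimension (a roll-up)."""
    totals = {}
    for f in facts:
        totals[f[dimension]] = totals.get(f[dimension], 0.0) + f[measure]
    return totals

print(rollup(facts, "brand", "price"))  # {'Alpha': 115.0, 'Beta': 80.0}
print(rollup(facts, "color", "price"))  # {'red': 140.0, 'black': 55.0}
```

An OLAP engine essentially precomputes and indexes many such roll-ups so that slicing the cube by any dimension is fast.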

More tutorial explanations of this model can be found in [7][8].

2.2 OLAP Server Architecture

There are three ways to implement multidimensional model:

- ROLAP (Relational OLAP)
Fact data is stored in a relational-model based storage system, and some special indexing techniques may be adopted. In this architecture, measures are derived from the records in the fact table, and dimensions are derived from the dimension tables.

This architecture can be further divided into two types:
 Star Schema
 Snowflake Schema

Both of them contain a fact table and dimension tables, but in a star schema there is only one table per dimension, while in a snowflake schema there are usually multiple (normalized) tables for one particular dimension.
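A star schema can be sketched as plain Python tables (all names and rows below are hypothetical): one denormalized table per dimension, and a fact table holding foreign keys plus the measures:

```python
# Star schema sketch: the fact table stores foreign keys into one
# (denormalized) table per dimension, plus the measures.
dim_product = {1: {"brand": "Alpha", "category": "running"},
               2: {"brand": "Beta",  "category": "casual"}}
dim_date    = {10: {"year": 2010, "month": 1}}

fact_sales = [  # (product_key, date_key, amount)
    (1, 10, 60.0),
    (1, 10, 55.0),
    (2, 10, 80.0),
]

# "Sales by brand": join facts to the product dimension, then group.
sales_by_brand = {}
for product_key, date_key, amount in fact_sales:
    brand = dim_product[product_key]["brand"]
    sales_by_brand[brand] = sales_by_brand.get(brand, 0.0) + amount

print(sales_by_brand)  # {'Alpha': 115.0, 'Beta': 80.0}
```

In a snowflake schema, dim_product itself would be normalized further (e.g., a separate brand table referenced by key), trading join cost for less redundancy.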

- MOLAP (Multidimensional OLAP)
Fact data is stored in an optimized multidimensional array storage, i.e., the server supports the multidimensional model directly.
It is usually regarded as faster but less scalable than ROLAP.

- HOLAP (Hybrid OLAP)
It is a combination of ROLAP and MOLAP. HOLAP allows storing part of the data in a MOLAP store and another part of the data in a ROLAP store.

Part III - Other technologies

Other related technologies include data visualization, metadata management, analytical query parallelizing, etc.

Analytical database technology is very promising but also complicated; more survey papers can be found in [5][7][8].

[Reference]

General

[01] Course on DM & OLAP
[02] OLAP & Data Mining Links
[03] OLAP & Data Warehouse Bibliography
[04] A Brief Tutorial on Data Mining and OLAP

[05] An Overview of Data Warehousing and OLAP technology
[06] Providing OLAP to User-Analysts: An IT Mandate

[07] An Overview of Data Warehouse, OLAP and Data Mining Technology
[08] Multidimensional Database Technology

Data Warehousing

[21] Data Warehouse Architecture
[22] Oracle Data Warehousing Guide

Data Mining

[31] Data Mining FAQ
[32] Data Mining Introduction
[33] Data Mining Tech Summary

[34] Oracle Data Mining Concept
[35] Microsoft SQL Server Analysis Service - Data Mining

OLAP

[41] OLAP Introduction
[42] OLAP Overview
[43] OLAP Council
[44] OLAP Wiki

[45] Oracle OLAP User Guide
[46] Microsoft SQL Server Analysis Service - OLAP

1/12/2010

Hardware Multithreading Primer

A multithreading processor is able to pursue two or more threads of control in parallel within the processor pipeline. The contexts of two or more threads are often stored in separate on-chip register sets.

Formally speaking, CMT (Chip Multi-Threading) is a processor technology that allows multiple hardware threads of execution (also known as strands) on the same chip, through multiple cores per chip, multiple threads per core, or a combination of both.

Let's look at the various techniques that enable hardware multithreading:

1. Multiple Cores per Chip

CMP (Chip Multi-Processing, a.k.a. Multicore) is a processor technology that combines multiple processors (a.k.a. cores) on the same chip. (see Figure 2 (b))

The idea is very similar to SMP, but implemented within a single chip. [10] is the most famous paper about this technology.

2. Multiple Threads per Core

2.1 Vertical Multithreading - Instructions can be issued only from a single thread in any given CPU cycle.

- Interleaved Multithreading (a.k.a. Fine-Grained Multithreading): instructions from a different thread are fetched and fed into the execution pipeline(s) at each processor cycle, i.e., the context switches every CPU cycle. (see Figure 1 (b))

- Blocked Multithreading (a.k.a. Coarse-Grained Multithreading): instructions of the current thread are executed successively until an event occurs that may cause latency (e.g., a cache miss). This delay event induces a context switch. (see Figure 1 (c))

2.2 Horizontal Multithreading - Instructions can be issued from multiple threads in any given cycle.

This is the so-called Simultaneous Multithreading (SMT): instructions are simultaneously issued from multiple threads to the execution units of a superscalar processor. Thus, wide superscalar instruction issue is combined with the multiple-context approach. (see Figure 2 (a))
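The difference between vertical multithreading and SMT can be illustrated with a toy issue-slot model; the two-wide issue width and the 'I'/'S' (instruction/stall) encoding are simplifying assumptions, not a real pipeline:

```python
# Toy issue-slot model: each thread is a stream of tokens, 'I' = an
# instruction ready to issue, 'S' = a stall cycle (e.g. a cache miss).
# Vertical multithreading issues from only one thread per cycle;
# SMT may fill the issue slots from several threads in the same cycle.
WIDTH = 2  # issue width

def run(thread_streams, simultaneous):
    threads = [list(t) for t in thread_streams]
    issued = cycles = 0
    while any(threads):
        cycles += 1
        ready = [t for t in threads if t and t[0] == 'I']
        stalled = [t for t in threads if t and t[0] == 'S']
        sources = ready[:WIDTH] if simultaneous else ready[:1]
        slots = WIDTH
        for t in sources:
            while slots and t and t[0] == 'I':
                t.pop(0)
                issued += 1
                slots -= 1
        for t in stalled:   # stall cycles of all threads elapse in parallel
            t.pop(0)
    return issued, cycles

streams = ["ISIS", "ISIS"]  # two threads that stall after every instruction
print(run(streams, simultaneous=False))  # (4, 5) -> vertical needs 5 cycles
print(run(streams, simultaneous=True))   # (4, 4) -> SMT overlaps the stalls
```

With the same work, SMT finishes in fewer cycles because a slot left empty by one stalled thread is filled by another thread's ready instruction.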

Figure 1 - Single Thread Multiple Issue (from [3])

Figure 2 - Multiple Thread Multiple Issue (from [3])

In summary[3]:

- Unused instruction slots, which arise from latencies during the pipelined execution of single-threaded programs by a contemporary microprocessor, are filled by instructions of other threads within a multithreaded processor. The execution units are multiplexed among those thread contexts that are loaded in the register sets.

- Underutilization of a superscalar processor due to missing instruction-level parallelism can be overcome by simultaneous multithreading, where a processor can issue multiple instructions from multiple threads in each cycle. Simultaneous multithreaded processors combine the multithreading technique with a wide-issue superscalar processor to utilize a larger part of the issue bandwidth by issuing instructions from different threads simultaneously.

Notes:

Superpipelining - an extreme pipelining technique, where the instruction pipeline is divided into a large number (usually 8+) of stages.

Superscalar (a.k.a. multiple issue) - a processor technology where multiple instructions can be issued to the execution units in a single cycle.

[Reference]

General
[1] CMT vs CMP vs SMT
[2] Chip Multithreading: Opportunities and Challenges
[3] A Survey of Processors with Explicit Multithreading

SMT
[5] Simultaneous Multithreading - Maximizing On-Chip Parallelism
[6] Converting TLP to ILP via Simultaneous Multithreading
[7] Simultaneous Multithreading - A Platform for Next-Generation Processors

CMP
[10] The Case for a Single-Chip Multiprocessor

Case Studies
[11] Niagara: A 32-Way Multithreaded SPARC Processor
[12] Niagara2: A Highly Threaded Server-on-a-Chip

SMT Research Group
[15] http://www.cs.washington.edu/research/smt/

Multicore Computing Course
[20] http://www.cs.rice.edu/~johnmc/comp522/

1/09/2010

Parallel Programming - Using PVM

PVM is now an inactive direction in the HPC community, but there are many lessons to be learned from its programming model, its architecture design/implementation, and how/why it failed to become the dominant system.

Part I - What's PVM?


PVM (Parallel Virtual Machine) is a software framework for heterogeneous parallel computing in networked environments, based on the message passing model. Its main focus is a uniform parallel computing framework on interconnected heterogeneous computers of varied architectures (from Unix to Windows; from PCs and workstations to MPPs).

(diagram captured from [1])

The PVM system is composed of two parts:
- The first part is a Daemon, called pvmd3 and sometimes abbreviated pvmd, that resides on all the computers making up the virtual machine.
- The second part of the system is a Library of PVM interface routines, which contains a functionally complete set of primitives needed for cooperation between the tasks of an application.

Part II - PVM Programming Model

A PVM application consists of a collection of cooperating tasks, each of which is responsible for some part of the workload of a big problem. Tasks can be created/terminated across the network, and they can also communicate and synchronize with other tasks.

Sometimes an application is parallelized along its functions; that is, each task performs a different function, for example, input, problem setup, solution, output, and display. This approach is often called functional parallelism.

A more common method of parallelizing an application is called data parallelism. In this method all the tasks are the same, but each one only knows and solves a small part of the data.

(diagram captured from [1])

PVM supports either of these methods, or a mixture of the two.

Parallel applications can be viewed in another perspective, based on the organization of the computing tasks. Roughly speaking, there are three types:

- Crowd/Star Computing: a collection of closely related tasks performs computations on different portions of the workload, usually involving the periodic exchange of intermediate results. A star-like parent-child relationship exists among the tasks.

- Tree Computing, tasks are spawned, usually dynamically as the computation progresses, in a tree-like manner, thereby establishing a tree-like, parent-child relationship.

Tree Computation Using Parallel Merge Sort (Detail Info)

- Hybrid Model: combines the two models above.

Crowd Computation typically involves three phases:
- Initialization
- Computation
- Result aggregation

This model can be further divided into two types:

- Master/Worker: the master is responsible for process spawning, initialization, collection and display of results, and perhaps timing of functions. The worker programs perform the actual computation involved; their workloads are typically assigned by the master (statically or dynamically).

- Work Crew: multiple instances of a single program execute, with one task (typically the one initiated manually) taking over the non-computational responsibilities in addition to contributing to the computation itself.
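The Master/Worker pattern can be sketched in a few lines of Python, using a thread pool as a stand-in for PVM task spawning (the workload below is made up):

```python
# Master/worker sketch: the "master" splits the workload into chunks,
# hands them to workers, and aggregates the results
# (here: summing the squares of 1..100).
from concurrent.futures import ThreadPoolExecutor

def worker(chunk):
    """Worker: performs the actual computation on its assigned chunk."""
    return sum(x * x for x in chunk)

def master(data, n_workers=4):
    """Master: spawns workers, assigns chunks statically, collects results."""
    chunks = [data[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = pool.map(worker, chunks)
    return sum(partials)

print(master(list(range(1, 101))))  # 338350
```

In PVM the master would pvm_spawn the worker program on remote hosts and exchange the chunks and partial results as messages, but the division of responsibilities is the same.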

Part III - PVM Features/Interface

- Process/Task Management
- Resource Management/Configuration
- Message Passing

Detailed interface usage documentation can be found at: PVM User Interface
Here is some PVM example application source code.

Part IV - How PVM works

Geist's book has a great chapter on the internals of PVM's design & implementation.

The core of the design lies in the host table and message routing:

1. The host table describes all hosts in a virtual machine. It is issued by the master pvmd and kept synchronized across the virtual machine.

Pvm Host Table (from[1])

2. Some host manipulation operations involve several hosts (for example, host addition), so a 2-phase/3-phase commit protocol is applied (the master is the coordinator).

3. Message routing is accomplished through pvmds. A message contains the target task ID, which in turn encodes a pvmd ID. The PVM daemon uses the host table to identify the target PVM daemon and puts the message into the corresponding send queue.
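The routing idea can be sketched as follows; note that the task-ID bit layout and host names below are illustrative assumptions, not PVM's actual encoding:

```python
# Message-routing sketch: a task ID encodes the host (pvmd) number in its
# high bits; the daemon looks the host up in the host table to pick the
# right send queue. The 12-bit host field is illustrative only.
HOST_BITS = 12

host_table = {1: "node-a.example.org", 2: "node-b.example.org"}
send_queues = {h: [] for h in host_table}

def make_tid(host_id, local_id):
    """Pack a host number and a host-local task number into one 32-bit ID."""
    return (host_id << (32 - HOST_BITS)) | local_id

def route(message, target_tid):
    """Extract the host from the task ID and enqueue for that host's pvmd."""
    host_id = target_tid >> (32 - HOST_BITS)
    send_queues[host_id].append((host_table[host_id], message))

route(b"work unit 7", make_tid(2, 5))
print(send_queues[2])  # [('node-b.example.org', b'work unit 7')]
```

Because every daemon holds the same host table, any pvmd can resolve any task ID to a destination without a central lookup service.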

Part V - Developing PVM Applications

PVM app development cycle:
1. A user writes one or more sequential programs in C, C++, or Fortran 77 that contain embedded calls to the PVM library.
2. These programs are compiled for each architecture in the host pool, and the resulting object files are placed at a location accessible from machines in the host pool.
3. To execute an application, a user typically starts one copy of one task (usually the "master" task) by hand from a machine within the host pool. This process subsequently starts other PVM tasks, eventually resulting in a collection of active tasks that then compute locally and exchange messages with each other to solve the problem.

Notes for PVM on Windows:
1. Code/Bin can be found at http://www.netlib.org/pvm3/win32/
2. Don't use the InstallShield version; it contains many bugs. Use the manual install version.
3. If the "[pvmd pid4604] mksocs() somepath\pvmd.xxx failed. You are required to run on NTFS" error message shows up, check the file, delete it, and restart PVM.
4. A C/C++ PVM application needs to include "pvm3.h" and link libpvm3.lib and ws2_32.lib (also link libgpvm3.lib if group communication functions are called).
5. The built-in PVM library uses an old C runtime library; you should ignore "libc.lib" in the VC++ Linker/Input settings.
6. The built-in PVM library only works with the static C/C++ runtime library, so change the VC++ project setting: Property->C/C++->Code Generation->Runtime Library, and choose "Multi-Threaded (/MT)".
7. To run your application, executable files should be put into the $(PVM_ROOT)\Bin\$(PVM_ARCH) directory.

I wrote some PVM applications; building and running them is a good way to start your PVM journey.

Part VI - MPI vs PVM

MPI is a pure standard or specification, while PVM can be regarded as both a standard and an implementation. The MPI standard is the work of the MPI Forum, which is formed by over 40 organizations, companies and individuals, while PVM's standard/implementation is maintained by a single project group.

Let's focus on the technical differences between the two systems.

Programming Model

Both are based on the message passing model and can support the SPMD (Single Program Multiple Data) / MPMD (Multiple Program Multiple Data) patterns.

But MPI is treated as a static model, where process communication happens in a static manner: tasks are assumed to be created statically, with no failures.

PVM, in contrast, is a dynamic model: system resources can be reconfigured and tasks can be created/destroyed at runtime. Node failure is also taken into consideration.

What's more, PVM provides the conceptual view of a virtual machine consisting of various nodes, where each node is a physical machine such as a PC workstation, an SMP, a cluster, or even an entire MPP system. MPI focuses more on communication and has no such concept.

Message Passing

Both systems are based on the message passing model, so passing messages is the core feature of both. But MPI provides more options and richer semantics, and its various implementations are considered to be faster than PVM's.

Process Control

The initial MPI standard didn't contain specifications on how to create/destroy processes/tasks; later revisions (MPI-2) added related APIs.
PVM considered these dynamic features from the beginning, so task spawn/kill interfaces were included in the initial release of the system.

Failure Handling

PVM considered failure scenarios from the beginning: it provides a (failure) event notification mechanism to let application developers write fault tolerant programs, although PVM itself is not fault tolerant (for example, the master PVM daemon is a single point of failure).

MPI didn't specify how to deal with failures at the beginning, and added a PVM-like event notification feature in a later version, but it is still very limited (its main purpose is to locate the root cause of a failure, not to serve as a helper mechanism for writing fault tolerant applications).
MPI-3 is considering checkpoint-based failure handling features.

Summary

PVM is designed for heterogeneous and dynamic environments; it provides a uniform conceptual view of a virtual machine.

MPI, on the other hand, is mainly designed for high performance and source-code-level portability.

References [21][24][27] are very good material on this topic.

Part VII - Why PVM Failed

From today's standpoint, we can easily tell that MPI beat PVM as the standard message passing mechanism. But why, given that PVM's features seem more powerful and its concepts and architecture are elegant?

Here are some of my thoughts:

1. Do One Thing and Do it Well

MPI initially focused only on communication, with very limited dynamic mechanisms. But at that time, performance/portability was critical. Since node scale was relatively small and the systems were special purpose supercomputers, dynamic behavior and failure handling were not that important.

PVM involves many great features, but its practical performance is not so good, since performance optimization is not its main focus.

2. Ecosystem is Important

MPI started as an industry/academia cooperative effort. Various HPC hardware vendors had the motivation to adopt the standard (to win more customers). HPC application developers like to use it because there are no more porting pains. Both sides are very happy.

PVM started as a pure research project: only one team defined the spec and wrote the implementation. Although it could quickly respond to end user requirements, the lack of industrial vendor support was very dangerous for its long term survival.

PVM's main focus is heterogeneous system integration. But HPC hardware systems are very expensive; how many users will have the requirement to integrate many such systems? Industrial vendors are very reluctant to develop a system that will talk to competitors' similar products.

3. Vision Driven vs. Problem Driven

PVM is fancy and elegant, but MPI solves the biggest problems of HPC application developers. PVM is a very good research project, but it was born for research purposes only.

[Reference]

Tutorials
01. The PVM Book
02. An Introduction to PVM Programming
03. Advanced PVM tutorial
04. PVM Beginner Guide
05. Parallel Processing using PVM
06. PVM programming hands-on tutorial
07. PVM User Manual

General
11. PVM Wiki
12. PVM official Home
13. PVM3 Home
14. PVM++
15. PVM/MPI Conference
16. Java over PVM (Paper, Home)
17. PVM Papers

MPI vs PVM
21. PVM and MPI: a Comparison of Features

22. PVM MPI are completely Different
23. Why are PVM/MPI so different?
24. Goals Guiding Design: MPI/PVM

25. PVM vs MPI Article
26. Performance Comparison PVM, MPI and DSM (ppt)
27. PVM/MPI Comparison using Physical Applications
28. PVM/MPI comparison for Java HPC

1/06/2010

Parallel Programming - Using MPI

MPI is a message passing programming model standard [2]; it defines the various terms/concepts, data structures and function signatures that are used to pass messages among computer processes.

1. Terms and Concepts

An MPI application consists of multiple Processes (Tasks), each of which has a unique identifier called its Rank. A Process belongs to some Groups and some Communicators. Processes don't share memory or state; the only way to communicate is by sending/receiving messages.

Programming Patterns - usually, multiple Processes are needed to accomplish a task. There are some popular patterns for how processes cooperate:
- Peer to Peer (a.k.a. Work Crew) Model, each process behaves equally
- Master/Slave (a.k.a. Master/Worker) Model, one process acts as coordinator, others act as common labor force

2. MPI APIs

MPI APIs can be categorized as:
- Environment Management
- Data Type
- Point-to-Point Communication
- Collective Communication
- One Sided Communication
- Process (Topology, Creation and Management)
- Parallel I/O

Point-to-Point communication occurs between two Processes, while Collective communication involves all Processes in a communicator, and One-Sided communication only needs one Process's participation.

3. Collective Communication Patterns

Typical collective communication semantics are somewhat hard to describe in words. Some diagrams (mainly from Rusty Lusk) are listed below to illustrate the semantics of each primitive:
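The same semantics can also be stated in a few lines of plain Python, modeling a communicator as a list with one entry per rank (this is an illustration of meaning, not an MPI binding):

```python
# Pure-Python sketch of the core collective semantics, modeled over plain
# lists with one entry per rank.
import functools

def broadcast(values, root):
    """Bcast: every rank ends up with the root's value."""
    return [values[root]] * len(values)

def scatter(data, nranks):
    """Scatter: the root's buffer is split into equal pieces, one per rank."""
    n = len(data) // nranks
    return [data[i * n:(i + 1) * n] for i in range(nranks)]

def gather(pieces):
    """Gather: each rank's piece is concatenated at the root (scatter's inverse)."""
    return [x for piece in pieces for x in piece]

def reduce_(values, op):
    """Reduce: all ranks' values are combined with op at the root."""
    return functools.reduce(op, values)

per_rank = [10, 20, 30, 40]                   # one value per rank
print(broadcast(per_rank, root=2))            # [30, 30, 30, 30]
print(scatter([1, 2, 3, 4, 5, 6, 7, 8], 4))   # [[1, 2], [3, 4], [5, 6], [7, 8]]
print(reduce_(per_rank, lambda a, b: a + b))  # 100
```

The real MPI primitives (MPI_Bcast, MPI_Scatter, MPI_Gather, MPI_Reduce) implement exactly these data movements, but with each rank holding only its own entry.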


4. Typical MPI Program Structure

#include "mpi.h"

int main(int argc, char** argv)
{
    int nProc;      /* total number of processes in the job */
    int nThisRank;  /* rank (identifier) of this process */

    MPI_Init(&argc, &argv);                   /* set up the MPI environment */
    MPI_Comm_size(MPI_COMM_WORLD, &nProc);
    MPI_Comm_rank(MPI_COMM_WORLD, &nThisRank);

    if (nThisRank == 0)
    {
        /* master (rank 0) logic */
    }
    else
    {
        /* worker logic */
    }

    MPI_Finalize();                           /* tear down the MPI environment */
    return 0;
}

I have written some non-trivial MPI applications:
- Demo various collective communication semantic
- Parallel Numeric PI Calculation
- Map-Reduce-Merge over MPI

You can try the code to get hands on experiences.

5. Combine Message Passing with Threading

Typical MPI applications run one process per core on a machine node. Since all cores on a node share the same physical memory, we can use a combined model with one process per node and multiple threads per process (OpenMP, Pthreads, etc.) on each node, i.e.:
- Threading within one node
- Message Passing cross nodes

Notes on MPI programming:

1. MPI's network facilities are message oriented, not connection/stream oriented (as in TCP socket programming), so before receiving a message, you'd better know how large it will be; otherwise, the message may be truncated.
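The contrast with stream sockets can be sketched in plain Python: over a raw byte stream the receiver must frame messages itself, e.g. with a length prefix (in real MPI code, MPI_Probe plus MPI_Get_count plays this size-discovery role):

```python
# Framing sketch: a byte stream has no message boundaries, so the sender
# prefixes each message with its length and the receiver reads exactly
# that many bytes.
import struct
from io import BytesIO

def send(stream, payload):
    stream.write(struct.pack("!I", len(payload)))  # 4-byte length prefix
    stream.write(payload)

def recv(stream):
    (length,) = struct.unpack("!I", stream.read(4))
    return stream.read(length)                     # exactly one message

wire = BytesIO()                                   # stand-in for a socket
send(wire, b"hello")
send(wire, b"mpi world")
wire.seek(0)
print(recv(wire), recv(wire))  # b'hello' b'mpi world'
```

MPI's receive calls already operate on whole messages, which is why the concern shifts from framing to sizing the receive buffer correctly.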

2. MPI can handle communication with both fixed length and variable length data types. For variable length data types, use MPI_Pack/MPI_Unpack and the MPI_xxxv functions (for instance, MPI_Scatterv, MPI_Gatherv, MPI_Allgatherv, MPI_Alltoallv).
(MPI.NET's source contains code that leverages these *v functions; it is a good example to learn MPI programming from.)

[Reference]

0. MPI Official Site
1. MPI V2.2 Official Standard
2. Message Passing Interface on wikipedia

3. MPI on Multicore (PPT)
4. MPI and OpenMP Hybrid Model
5. Combining MPI and OpenMP

Tutorials

11 Parallel Computing Introduction
12 MPI Tutorials @ LLNL (Exercises)
13 Tutorial on MPI by William Gropp (Exercises)
14 C++ MPI Exercises by John Burkardt
15 MPI Hands-On Tutorial by OSC
16 Tutorial on OpenMP and MPI

17 Book online: MPI The Complete Reference
18 Book: Parallel Programming With MPI

19 Clear Message Passing Concept Explanation

MS-MPI/MPI.NET

20 Windows HPC Server 2008 - Using MS-MPI whitepaper
21 Using Microsoft MPI
22 MPI.NET Tutorial

Implementations

- Native
MPICH2 - a MPI implementation
Open MPI (formerly Lam/MPI)
MPI Implementation List (List 1, List 2)

- .Net
Pure MPI.NET (implemented totally using .Net)
MPI.NET (an .Net wrapper around native MPI library)

- Java
Java-MPI binding list
mpiJava
MPJ Express

- Python
PyMPI
MPI4Py

- Matlab
MatlabMPI