7/29/2009

How DNS Works

I - What's DNS & Why DNS?

In the beginning, people used numerical identifiers (IP addresses) to represent network devices. But humans are better at remembering meaningful names than numbers, so host names were introduced. In the early days, a single global file stored the name-to-IP mapping; it is known as the hosts file.

As the number of devices on the Internet grew, a single hosts file could no longer handle the mapping. So people invented DNS.
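
To make the hosts-file idea concrete, here is a minimal Python sketch of how such a file maps names to addresses. The sample entries are hypothetical and use documentation-only addresses (192.0.2.x); the function name parse_hosts is mine, not part of any standard tool.

```python
def parse_hosts(text):
    """Parse hosts-file text into a {name: ip} dictionary."""
    table = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # First field is the address, the rest are names for it.
        ip, *names = line.split()
        for name in names:
            # Host names are case-insensitive; the first entry wins.
            table.setdefault(name.lower(), ip)
    return table


# Hypothetical entries, for illustration only.
SAMPLE_HOSTS = """\
# name/ip mapping
127.0.0.1    localhost
192.0.2.10   printer.example.com printer
192.0.2.20   Fileserver.example.com
"""

hosts = parse_hosts(SAMPLE_HOSTS)
print(hosts["printer"])                   # 192.0.2.10
print(hosts.get("unknown.example.com"))   # None
```

Every machine needs its own up-to-date copy of this table, which is exactly what stops scaling as the network grows.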

The Domain Name System (DNS) is a hierarchical naming system for computers, services, or any resource participating in the Internet. It associates various information with the domain names assigned to each of the participants. Most importantly, it translates domain names meaningful to humans into the numerical (binary) identifiers associated with networking equipment for the purpose of locating and addressing that equipment.

DNS essentially functions as a distributed database using a client/server relationship between clients that need name resolution (mapping host names to IP addresses) and the servers that maintain the DNS data.

II - Related Concepts

1. Host & Host Name

Each device on the Internet is called a Host. Whether it is a computer, a printer, a router, or something else, as long as it has a unique IP address, it’s a host. Just as the IP address identifies the host uniquely, so does the Host Name.

2. Zone, Domain & Delegation

A Zone is a portion of the DNS database that contains the resource records with the owner names belonging to a contiguous portion of the DNS namespace.

A Zone starts as a storage database for a single DNS domain name. If other domains are added below the domain used to create the zone, these domains can either be part of the same zone or belong to another zone. Once a subdomain is added, it can then either be:
  • Managed and included as part of the original zone records, or
  • Delegated away to another zone created to support the subdomain

A DNS database can be partitioned into multiple Zones. A DNS server is considered authoritative for a domain name if it loads the Zone file containing that name.

Delegation is the process of assigning responsibility for a portion of the DNS namespace to a DNS server owned by a separate entity.
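
The relationship between domains, zones and delegation can be sketched as a longest-suffix match: the zone responsible for a name is the most specific zone whose name is a suffix of it. The following Python sketch assumes a hypothetical layout in which sales.example.com has been delegated to its own zone while dev.example.com stays inside the example.com zone; the helper names are mine.

```python
def labels(name):
    """Split a domain name into lower-case labels, ignoring a trailing dot."""
    return name.lower().rstrip(".").split(".")


def find_zone(name, zones):
    """Return the most specific zone whose name is a suffix of `name`."""
    name_labels = labels(name)
    best = None
    for zone in zones:
        zone_labels = labels(zone)
        n = len(zone_labels)
        # The zone matches if its labels are the trailing labels of the name.
        if n <= len(name_labels) and name_labels[-n:] == zone_labels:
            if best is None or n > len(labels(best)):
                best = zone
    return best


# Hypothetical zones: the subdomain "sales" is delegated, "dev" is not.
ZONES = ["example.com", "sales.example.com"]

print(find_zone("www.sales.example.com", ZONES))  # sales.example.com
print(find_zone("www.dev.example.com", ZONES))    # example.com
print(find_zone("www.example.org", ZONES))        # None
```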

3. DNS Database Replication

There could be multiple zones representing the same portion of the namespace. Among these zones there are three types:

  • Primary
  • Secondary
  • Stub

Primary is a zone to which all updates for the records that belong to that zone are made. A secondary zone is a read-only copy of the primary zone. A stub zone is a read-only copy of the primary zone that contains only the resource records that identify the DNS servers that are authoritative for a DNS domain name.

Any changes made to the primary zone file are replicated to the secondary zone file. DNS servers hosting a primary, secondary or stub zone are said to be authoritative for the DNS names in the zone. A DNS server hosting a primary zone is said to be the primary DNS server for that zone.

4. Resource Record

A DNS database consists of resource records (RRs). Each RR identifies a particular resource within the database. There are various types of RRs in DNS. The common RR types are: Start of Authority (SOA), Name Server (NS), Mail Exchanger (MX), Host (A), and Alias (CNAME). See [3] for a detailed description of each RR type.
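
As a rough illustration, a zone can be pictured as a list of typed records that is filtered by owner name and type. The Python sketch below is a simplification: the record contents are hypothetical (example.com names, 192.0.2.x addresses), the SOA data is abbreviated, and the class and function names are mine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRecord:
    name: str   # owner name the record belongs to
    rtype: str  # record type: SOA, NS, MX, A, CNAME, ...
    ttl: int    # how long (seconds) the record may be cached
    data: str   # type-specific data, kept as plain text here


# Hypothetical zone contents, for illustration only.
ZONE_DATA = [
    ResourceRecord("example.com", "SOA", 3600, "ns1.example.com admin.example.com"),
    ResourceRecord("example.com", "NS", 3600, "ns1.example.com"),
    ResourceRecord("example.com", "MX", 3600, "10 mail.example.com"),
    ResourceRecord("ns1.example.com", "A", 3600, "192.0.2.1"),
    ResourceRecord("mail.example.com", "A", 3600, "192.0.2.25"),
    ResourceRecord("www.example.com", "CNAME", 3600, "web.example.com"),
    ResourceRecord("web.example.com", "A", 3600, "192.0.2.80"),
]


def lookup(records, name, rtype):
    """Return the data of every record matching the owner name and type."""
    return [r.data for r in records if r.name == name.lower() and r.rtype == rtype]


print(lookup(ZONE_DATA, "example.com", "MX"))        # ['10 mail.example.com']
print(lookup(ZONE_DATA, "www.example.com", "CNAME"))  # ['web.example.com']
```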

III - How DNS Works

DNS is essentially a distributed client/server system, in which communication consists mainly of sending and receiving DNS queries.

DNS queries can be sent from a DNS client (resolver) to a DNS server, or between two DNS servers. A DNS query is merely a request for DNS resource records of a specified type with a specified DNS name. For example, a DNS query can request all resource records of type A (host) with DNS name "abc.com".
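
For a feel of what such a query looks like on the wire, here is a minimal Python sketch that builds a query message following the layout in RFC 1035 (a 12-byte header followed by one question). The query ID 0x1234 is an arbitrary value chosen for the example, and the `recursive` argument sets the "recursion desired" header bit, which corresponds to the recursive versus iterative distinction discussed in this section. The sketch only builds the bytes; it does not send anything.

```python
import struct

QTYPE_A = 1        # host address record
QCLASS_IN = 1      # Internet class
FLAG_RD = 0x0100   # "recursion desired" bit in the header flags


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, qtype=QTYPE_A, query_id=0x1234, recursive=True):
    """Build a DNS query message with a single question."""
    flags = FLAG_RD if recursive else 0
    # Header fields: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, QCLASS_IN)
    return header + question


packet = build_query("abc.com")
print(len(packet))   # 25
print(packet.hex())
```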

There are two types of DNS queries that may be sent to a DNS server:

  • Recursive
  • Iterative

A recursive query forces a DNS server to respond to a request with either a failure or a successful response. With a recursive query, the DNS server must contact any other DNS servers it needs to resolve the request. When it receives a successful response from the other DNS server(s), it then sends a response to the DNS client.

An iterative query is one in which the DNS server is expected to respond with the best local information it has, based on what the DNS server knows from local zone files or from caching, without contacting other DNS servers. If a DNS server does not have any local information that can answer the query, it simply sends a negative response.

When iteration is used, a DNS server answers a client based on its own specific knowledge about the namespace with regard to the names data being queried. For example, if a DNS server on your intranet receives a query from a local client for “www.microsoft.com”, it might return an answer from its names cache. If the queried name is not currently stored in the names cache of the server, the server might respond by providing a referral - that is, a list of NS and A resource records for other DNS servers that are closer to the name queried by the client.

The following example shows the queries used to determine the IP address for www.whitehouse.gov. The resolver asks its local DNS server, which in turn queries other DNS servers. The query sequence is described below:

  1. Recursive query for www.whitehouse.gov (A resource record), sent by the resolver to the local DNS server
  2. Iterative query for www.whitehouse.gov (A resource record), sent by the local DNS server to a root name server
  3. Referral to the .gov name server (NS resource records, for .gov); for simplicity, the iterative A queries that the local DNS server sends to resolve the IP addresses of the host names of the name servers returned by other DNS servers have been omitted.
  4. Iterative query for www.whitehouse.gov (A resource record)
  5. Referral to the whitehouse.gov name server (NS resource record, for whitehouse.gov)
  6. Iterative query for www.whitehouse.gov (A resource record)
  7. Answer to the iterative query from the whitehouse.gov server (www.whitehouse.gov’s IP address)
  8. Answer to the original recursive query from the local DNS server to the resolver (www.whitehouse.gov’s IP address)
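
The sequence above can be sketched as a small simulation. This is a toy model, not a real resolver: the three "servers" are in-memory dictionaries, the server names are made up, and the address 192.0.2.7 is a documentation-only placeholder rather than the real address of www.whitehouse.gov.

```python
# Toy name servers: each one either knows the answer or refers
# the caller to a server that is closer to the queried name.
SERVERS = {
    "root": {"referrals": {"gov": "gov-server"}},
    "gov-server": {"referrals": {"whitehouse.gov": "whitehouse-server"}},
    "whitehouse-server": {"answers": {"www.whitehouse.gov": "192.0.2.7"}},
}


def iterative_query(server, name):
    """Answer from local data only: an answer, a referral, or a failure."""
    data = SERVERS[server]
    if name in data.get("answers", {}):
        return ("answer", data["answers"][name])
    # Refer to the zone with the longest suffix matching the name.
    matches = [z for z in data.get("referrals", {})
               if name == z or name.endswith("." + z)]
    if matches:
        return ("referral", data["referrals"][max(matches, key=len)])
    return ("fail", None)


def recursive_query(name, start="root", max_hops=10):
    """What the local DNS server does for the resolver: follow referrals."""
    server, trace = start, []
    for _ in range(max_hops):
        kind, value = iterative_query(server, name)
        trace.append((server, kind, value))
        if kind == "answer":
            return value, trace
        if kind == "fail":
            return None, trace
        server = value  # ask the server named in the referral next
    return None, trace


ip, trace = recursive_query("www.whitehouse.gov")
for step in trace:
    print(step)
print(ip)  # 192.0.2.7
```

The resolver sees only the final result of recursive_query (steps 1 and 8); the referral chasing in between (steps 2 to 7) happens entirely on the local DNS server.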

IV - RFCs about DNS
  • RFC 1034 -- Domain Names — Concepts and Facilities
  • RFC 1035 -- Domain Names — Implementation and Specification
  • RFC 1123 -- Requirements for Internet Hosts — Application and Support
  • RFC 1886 -- DNS Extensions to Support IP Version 6
  • RFC 1995 -- Incremental Zone Transfer in DNS
  • RFC 1996 -- A Mechanism for Prompt Notification of Zone Changes (DNS NOTIFY)
  • RFC 2136 -- Dynamic Updates in the Domain Name System (DNS UPDATE)
  • RFC 2181 -- Clarifications to the DNS Specification
  • RFC 2308 -- Negative Caching of DNS Queries (DNS NCACHE)
  • RFC 2535 -- Domain Name System Security Extensions (DNSSEC)
  • RFC 2671 -- Extension Mechanisms for DNS (EDNS0)
  • RFC 2782 -- A DNS RR for specifying the location of services (DNS SRV)

[Reference]
1. Wiki On Domain Name System
2. HowStuffWorks on Domain Name System
3. MS TechNet on How DNS Works
4. Understanding Domain Name System (Part I, Part II)

Papers on Designing/Implementing Internet
1. Rethinking the Design of the Internet
2. End to End Argument In System Design
3. End to End Principle

7/21/2009

Version Control Using SVN + Apache

  Version control is one of the three cornerstones of modern software development (the other two are unit testing and project automation). SVN is an open source version control system that is widely used in the open source community and by many companies.

  To secure your SVN environment, you should configure its authentication and authorization options. For networked SVN, there are two ways to talk to the SVN server: the svnserve protocol, or HTTP(S) through Apache:
- The svnserve protocol can be tunneled over SSH to authenticate SVN users, so you can use Unix user accounts to access the SVN repository. But SSH is popular mainly in the *nix world and is not well supported in the Windows world.
- Apache has a Windows Authentication module (mod_auth_sspi) that can be used to access the SVN repository securely. Windows Authentication is an AD-based system and is very popular in Windows-based enterprise environments.

  Here I will show how to configure the Windows Authentication mechanism in an Apache-based SVN environment. You are expected to be familiar with SVN concepts, architecture, and common commands.

1. Install Software Component

- SVN + Apache
1) Download and install the package above.
2) Suppose you install it to $SvnServRoot; Apache Httpd is then located at $SvnServRoot\Httpd.
3) Both SvnServe and Httpd will run as Windows services after installation.
4) Use cmd:"svnadmin create repo" to create an SVN repository called "repo" under SVN's root directory. (The root directory is specified with the "-r" option when starting SvnServe.)
5) Use cmd:"svn import HelloWorld.txt svn://server_name/repo/HelloWorld.txt" to add a sample text file to SVN.

- SSPI Apache Module
1) Download the SSPI zip file and unzip it
2) Copy bin\mod_auth_sspi.so to $SvnServRoot\Httpd\Modules

2. Configure SVN DAV

1). Load svn related modules.
In $SvnServRoot\Httpd\Conf\httpd.conf, ensure the following two lines are added:
# Subversion modules
LoadModule dav_svn_module modules/mod_dav_svn.so
LoadModule authz_svn_module modules/mod_authz_svn.so
2). Set URI -> SVN Repository mapping.
Suppose you want people to access your svn by the uri - http://server_name/svn.

If you have just one repository, located at $svnroot\your_repo, add the following to $SvnServRoot\Httpd\Conf\httpd.conf:
<Location /svn>
DAV svn
SVNPath $svnroot\your_repo
</Location>
If you have multiple repositories that are all located under $svnroot\repo_root, add the following to $SvnServRoot\Httpd\Conf\httpd.conf:
<Location /svn>
DAV svn
SVNListParentPath on
SVNParentPath $svnroot\repo_root
</Location>
3) Test
Now restart the Apache Windows service and try browsing http://server_name/svn. If all is well, you will see HelloWorld.txt listed in the browser.

You can also run the following command to check whether Apache Httpd based SVN WebDAV works (on Windows, put the message in double quotes):
svn mkdir http://server_name/svn/sandbox -m "message text"

3. Configure SSPI

1). In $SvnServRoot\Httpd\Conf\httpd.conf, ensure the following line is added:
# Windows Authentication module
LoadModule sspi_auth_module   modules/mod_auth_sspi.so
Make sure this directive comes before the ones that load the SVN WebDAV modules.

2) In the Location section of httpd.conf, specify SSPI parameters as follows:
<Location /svn>
# SSPI auth module parameters
# (Apache does not allow a comment on the same line as a directive,
# so each comment sits on its own line)
AuthName "Subversion Authentication"
AuthType SSPI
SSPIAuth On
SSPIAuthoritative On
# set the domain to authorize against
SSPIDomain DOMAIN
# On strips the domain prefix from the user name; Off keeps DOMAIN\user
SSPIOmitDomain On
# let non-IE clients authenticate
SSPIOfferBasic On
# whether basic authentication should have higher priority
SSPIBasicPreferred Off
# convert the user name to lower case
SSPIUsernameCase lower

# require the SVN Users group
Require group "DOMAIN\Subversion Users"
Require user "YOUR_DOMAIN\your_name"
</Location>
NOTE:
- If no Require directive is specified, any user can access the SVN repository (the same effect as no authentication at all).
- The "Require valid-user" directive grants access to any valid user, that is, anyone who logs in to their machine with an AD-controlled account.
- You can use the AuthzSVNAccessFile directive in the Location section to specify an authorization rule file.

3). Test
- Restart the Apache Httpd Windows service
- Try accessing http://server_name/svn/ with both your AD-controlled Windows account and a local machine account
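
If authentication does not kick in, one thing worth checking is the load order required in step 1 of this section. The following Python sketch reads the text of httpd.conf and reports whether the SSPI module is loaded before the two SVN modules. It is only a rough helper of my own: it looks at LoadModule lines in a single file and ignores Include directives.

```python
def module_order(conf_text):
    """Return module names in the order their LoadModule lines appear."""
    order = []
    for line in conf_text.splitlines():
        parts = line.split()
        # Blank lines, comments and commented-out LoadModule lines are skipped
        # because their first token is not exactly "LoadModule".
        if len(parts) >= 3 and parts[0].lower() == "loadmodule":
            order.append(parts[1])
    return order


def sspi_loaded_first(conf_text):
    """True if sspi_auth_module is loaded before both SVN modules."""
    order = module_order(conf_text)
    needed = ["sspi_auth_module", "dav_svn_module", "authz_svn_module"]
    if any(m not in order for m in needed):
        return False
    sspi = order.index("sspi_auth_module")
    return all(sspi < order.index(m) for m in needed[1:])


GOOD_CONF = """\
# Windows Authentication module
LoadModule sspi_auth_module   modules/mod_auth_sspi.so
# Subversion modules
LoadModule dav_svn_module modules/mod_dav_svn.so
LoadModule authz_svn_module modules/mod_authz_svn.so
"""

BAD_CONF = """\
LoadModule dav_svn_module modules/mod_dav_svn.so
LoadModule authz_svn_module modules/mod_authz_svn.so
LoadModule sspi_auth_module   modules/mod_auth_sspi.so
"""

print(sspi_loaded_first(GOOD_CONF))  # True
print(sspi_loaded_first(BAD_CONF))   # False
```

To check a real installation, pass in the contents of $SvnServRoot\Httpd\Conf\httpd.conf instead of the sample strings.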

4. Configure Apache SSL

1) In $SvnServRoot\Httpd\Conf\httpd.conf, uncomment the following two lines:
#LoadModule ssl_module modules/mod_ssl.so
#Include conf/extra/httpd-ssl.conf

2). Create SSL Certificates
Run the following commands in the directory $SvnServRoot\Httpd\Conf\:
..\bin\openssl.exe req -config openssl.cnf -new -out my-server.csr
..\bin\openssl.exe rsa -in privkey.pem -out server.key
..\bin\openssl.exe req -new -key server.key -config openssl.cnf -out server.csr
..\bin\openssl.exe x509 -in server.csr -out server.crt -req -signkey server.key -days 10000

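Then make sure the following files exist (my-server.csr from the first command will be there as well; server.der.crt is not produced by the four commands above and only appears if you additionally export the certificate in DER format):
privkey.pem
server.crt
server.csr
server.der.crt
server.key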

3). Modify $SvnServRoot\Httpd\Conf\Extra\httpd-ssl.conf
There are some hardcoded file paths in this configuration file; replace them with the location where the SSL certificate files are stored ("$SvnServRoot\Httpd\Conf\" in the example above).

4). Test
Restart Apache Httpd windows service
Try browsing https://server_name/svn
Try cmd: "svn mkdir https://server_name/svn/sandbox/trunk"

NOTE:
- IE may not be able to connect to https://server_name/svn because it uses the "AES128-SHA" cipher, which is very weak. Firefox 3.5 worked well in my test.
- Since the SSL certificate is self-signed (not issued by a certificate authority), you must accept it explicitly the first time you access the site over https.
- Http and https expose the same interface, but with basic authentication http sends your user name and password to the server essentially in plain text. Https is more secure, especially in WAN/Internet environments.
- If you want to allow anonymous users to read but only authenticated users to write (as most open source project hosting sites do), you can add
<LimitExcept GET PROPFIND OPTIONS REPORT>
  Require valid-user
</LimitExcept>
to the Location section in the httpd.conf file.
- You can add the "SSLRequireSSL" directive to the Location section in httpd.conf to deny non-https access to the SVN repository.
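
The effect of the LimitExcept block in the note above can be mimicked in a few lines of Python: the four listed methods are the read-only ones and stay anonymous, and every other method requires a valid user. This is only a model of the rule, not of how Apache evaluates it, and the function name is mine.

```python
# Methods listed in <LimitExcept ...>: they are exempt from "Require valid-user".
ANONYMOUS_METHODS = {"GET", "PROPFIND", "OPTIONS", "REPORT"}


def requires_login(method):
    """True if a request with this HTTP/WebDAV method needs a valid user."""
    return method not in ANONYMOUS_METHODS


# Read-style requests, as used when browsing or checking out.
for m in ["OPTIONS", "PROPFIND", "REPORT", "GET"]:
    print(m, requires_login(m))   # all False

# Write-style requests, such as those sent by "svn mkdir" or a commit.
for m in ["MKCOL", "PUT", "MKACTIVITY", "MERGE"]:
    print(m, requires_login(m))   # all True
```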

[Reference]
1. Windows Authentication with Subversion on Windows
2. Apache SSL Step by Step Guide
3. SVN + Apache Configuration
4. http://httpd.apache.org/docs/2.0/ssl/ssl_intro.html
5. SVN for Windows
6. Apache Based SVN Server
7. SVN Quick Guide
8. Configuring Windows Authentication with Apache 2.2.x and Subversion

7/10/2009

Agile Project Practices for Internet Product

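The most valuable article in the May 2009 issue (No. 05) of Programmer magazine (《程序员》) is "The Road to Agile Development Practice on the Internet" (互联网敏捷开发实践之路) by Wang Suyu (王速瑜), R&D director at Tencent. Below are some summaries and takeaways:

Part I - Characteristics of Internet Products

1. High uncertainty - users are numerous, widely distributed, and hugely varied, so it is hard to pin down the product's features at the outset
2. Adopt an exploratory, adaptive, iterative approach to design and development: keep delivering features, keep gathering feedback, keep adjusting and improving
3. Releases are cheap, and user feedback arrives fairly quickly
4. The application has to drive user behavior, and product design must uncover people's latent needs, so innovation is very important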

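Part II - Agile Development Practices for Internet Products

A. Boost Morale
1. Core members should show genuine passion for the project
2. Look for positive feedback from users and the market so that team members feel a sense of accomplishment
3. Tolerate mistakes; encourage members to dare to try, and to learn and grow from their mistakes
4. Provide an equal, open communication environment to reduce communication barriers
5. Blur responsibilities to foster a sense of ownership: any member can voice opinions and contribute, so the team's collective wisdom is put to use
6. Win the attention and support of leadership and other employees so that the team feels encouraged (Ship Party, Monkey, Team Activity)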

B. Increase Team Transparency (Everyone should know - Where Are We?)
1. User Story Wall - Know the user scenarios
2. Burn Down Chart - Be aware of the overall progress of the project
3. Retrospective - Identify what to improve and how

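C. Iterate Continuously
Core idea: rapid feedback and adjustment - start hands-on work as early as possible, keep the design simple, reach a deliverable quickly, collect user feedback, keep improving
1. Define Iteration/Milestone
2. Short-term plans should be detailed; long-term plans can be rough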

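D. Small Release
Core idea: release gradually, from a small audience to a large one
1. Quality control: growing the audience from small to large keeps the impact of defects well contained, and user trials serve as testing
2. Marketing: it piques people's curiosity, which helps promote the product
3. On-site customer: a channel for collecting customer feedback quickly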

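E. User Study
1. Direct observation: go to the customer's real environment to observe and record
2. Case study: suited to reworking an existing product
3. Questionnaire survey: list the questions of interest and run a survey
4. Personas: define the traits of typical customers, then analyze their concerns and ways of thinking
5. Focus group: hold group discussions on specific questions (setting a clear goal is important)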

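Part III - Summary

Team management philosophy: people first, self-management, self-motivation, continuous improvement.
Project execution principles: small quick steps, simple design, embrace change, continuous integration.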