12/24/2007

Security Mechanism on Microsoft Windows

  When we talk about security here, we refer only to access control. System security has many other aspects (for example, encryption, decryption, authentication and authorization); access control is just one part of authorization.

   In general, access control defines which actions someone can perform on something. Windows has a concept for each part of that statement: something - Objects, someone - Subjects (also referred to as Principals), actions - Operations (Access Mask). There are many more related concepts: ACL, ACE, SACL, DACL, Empty DACL, NULL DACL, Access Token, SD and SID.

  First, let's explain those concepts:

  Object - also called Securable Object. A securable object is an object that can have a security descriptor. All named Windows objects are securable. Some unnamed objects, such as process and thread objects, can have security descriptors too.

  Security Descriptor - (SD) contains the security information associated with a securable object. A security descriptor can include the following security information:
  • Security identifiers (SIDs) for the owner and primary group of an object.
  • A DACL that specifies the access rights allowed or denied to particular users or groups.
  • A SACL that specifies the types of access attempts that generate audit records for the object.
  • A set of control bits that qualify the meaning of a security descriptor or its individual members.

  Principal/Subject - refers to the entity that takes actions on securable objects.

  Security Identifier - (SID) is a unique value of variable length used to identify a Principal. Each Windows user account has a unique SID issued by an authority, such as a Windows domain controller, and stored in a security database.

  Access Control List - (ACL) is a list of access control entries (ACEs). Each ACE in an ACL identifies a Principal (called a trustee in the Windows documentation) and specifies the access rights allowed, denied, or audited for that Principal. The security descriptor for a securable object can contain two types of ACLs: a DACL and a SACL.

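  Discretionary Access Control List - (DACL) identifies the trustees that are allowed or denied access to a securable object. A NULL DACL (the object has no DACL at all) grants full access to everyone, while an Empty DACL (a DACL that contains no ACEs) grants access to no one.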

  System Access Control List - (SACL) enables administrators to log attempts to access a secured object.

  Access Control Entry - (ACE) Each ACE controls or monitors access to an object by a specified Principal. All types of ACEs contain the following access control information:
  • A security identifier (SID) that identifies the Principal to which the ACE applies.

  • An access mask that specifies the access rights controlled by the ACE.

  • A flag that indicates the type of ACE.

  • A set of bit flags that determine whether child containers or objects can inherit the ACE from the primary object to which the ACL is attached.

  Access Token - An access token is an object that describes the security context of a process or thread. The information in a token includes the identity and privileges of the user account associated with the process or thread. Access tokens contain the following information:
  • The security identifier (SID) for the user's account

  • SIDs for the groups of which the user is a member

  • A logon SID that identifies the current logon session

  • A list of the privileges held by either the user or the user's groups

  • An owner SID

  • The SID for the primary group

  • The default DACL that the system uses when the user creates a securable object without specifying a security descriptor

  • The source of the access token

  • Whether the token is a primary or impersonation token

  • An optional list of restricting SIDs

  • Current impersonation levels

  • Other statistics

  One thing worth mentioning is that the ACEs in a DACL are ORDERED. That is to say, the order of the ACEs in a DACL matters a great deal to the final result of an access check.


  When you (actually, a process or thread) perform an operation on a securable object (for example, reading or writing a text file), what happens on the access control side? How does Windows perform the access check? Here are the steps Windows takes to enforce the access control semantics:
  1. Open the thread's access token (the process token is used if the thread is not impersonating);
  2. Retrieve your SID and your groups' SIDs from the access token obtained in step 1;
  3. Obtain the SD of the securable object;
  4. Get the DACL from the SD;
  5. For each ACE in the DACL, in order, look up the SID associated with that ACE;
  6. Look this SID up in the list of SIDs obtained in step 2; if it is not there, skip the ACE;
  7. If found and the ACE is a deny ACE that covers any of the access rights you are requesting, the access check fails;
  8. If found and the ACE is an allow ACE, compare the ACCESS_MASK in the ACE with your desired ACCESS_MASK and clear the bits that match;
  9. If all bits in your desired ACCESS_MASK are cleared, the access check succeeds and you are allowed to perform the operation;
  10. If some bits are still set after you have traversed all ACEs, the access check fails.

  The Win32 security API AccessCheck() will do most of the tasks above for you.
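
  The ordered walk over the DACL can be sketched in portable C++. This is only a toy model, not how Windows implements it: the Ace struct, the string SIDs and the access_check function are hypothetical names, a deny ACE is tested against the rights that are still outstanding, and special cases such as a NULL DACL, owner rights and privileges are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the real Windows structures.
enum class AceType { Allow, Deny };

struct Ace {
    AceType type;
    std::string sid;     // SID of the principal the ACE applies to
    std::uint32_t mask;  // access rights controlled by the ACE
};

// Returns true when every bit in `desired` is granted by the ordered DACL.
bool access_check(const std::vector<std::string>& token_sids,
                  const std::vector<Ace>& dacl,
                  std::uint32_t desired) {
    std::uint32_t remaining = desired;
    for (const Ace& ace : dacl) {
        if (remaining == 0) break;  // everything already granted
        bool applies = false;
        for (const std::string& sid : token_sids) {
            if (sid == ace.sid) { applies = true; break; }
        }
        if (!applies) continue;     // ACE is for someone else
        if (ace.type == AceType::Deny) {
            if (ace.mask & remaining) return false;  // explicit deny
        } else {
            remaining &= ~ace.mask; // clear the bits this ACE grants
        }
    }
    return remaining == 0;          // leftover bits mean implicit deny
}
```

  With a token holding a user SID and a group SID, a DACL of "deny write to the group, then allow read/write to the user" refuses a write request, while the same two ACEs in the opposite order allow it. That is why the order of ACEs matters.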

  All the concepts above have corresponding entities in the operating system, and all of them are in binary form. To facilitate the management of Windows security, Microsoft defined a text format for these entities: the Security Descriptor Definition Language (SDDL). SDDL defines the string format that the ConvertSecurityDescriptorToStringSecurityDescriptor and ConvertStringSecurityDescriptorToSecurityDescriptor functions use to describe a security descriptor as a text string. The language also defines string elements for describing information in the components of a security descriptor:
  SD String http://msdn.microsoft.com/en-us/library/aa379570(VS.85).aspx
  ACE String http://msdn.microsoft.com/en-us/library/aa374928(VS.85).aspx
  SID String http://msdn.microsoft.com/en-us/library/aa379602(VS.85).aspx
  The string format SD is a null-terminated string with tokens to indicate each of the four main components of a security descriptor: owner (O:), primary group (G:), DACL (D:), and SACL (S:).
O:owner_sidG:group_sidD:dacl_flags(string_ace1)(string_ace2)... (string_acen)S:sacl_flags(string_ace1)(string_ace2)... (string_acen)
  The string format ACE is in the form of: (ace_type;ace_flags;rights;object_guid;inherit_object_guid;account_sid)
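
  To make the ACE string layout concrete, here is a minimal sketch that splits one ACE string into its six fields. The function name split_ace_string is made up for this example, and the sketch assumes the plain six-field form shown above; it does not validate the field contents.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits one SDDL ACE string such as "(A;;FA;;;SY)" into its six fields:
// ace_type, ace_flags, rights, object_guid, inherit_object_guid, account_sid.
// Returns an empty vector when the input is not shaped like an ACE string.
std::vector<std::string> split_ace_string(const std::string& ace) {
    if (ace.size() < 2 || ace.front() != '(' || ace.back() != ')') return {};
    std::vector<std::string> fields;
    std::string current;
    for (std::size_t i = 1; i + 1 < ace.size(); ++i) {
        if (ace[i] == ';') {
            fields.push_back(current);  // a field may be empty
            current.clear();
        } else {
            current += ace[i];
        }
    }
    fields.push_back(current);          // last field: account_sid
    if (fields.size() != 6) return {};
    return fields;
}
```

  For "(A;;FA;;;SY)" this yields "A" (access allowed), an empty flags field, "FA" (file all access), two empty GUID fields and "SY" (local system).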
  Windows ships two useful command line tools for displaying and setting the ACL of securable objects: icacls and cacls. For example, to see the DACL of a file, d:\myfile.ext, as an SDDL string, just run "cacls d:\myfile.ext /s". For more information about these two commands, run each of them with /? for help.
  An Excellent Tutorial about Windows Security Mechanism:
  Part 1 - Background and Core Concepts. The Access Control Structures
  Part 2 - Basic Access Control programming
  Part 3 - Access Control programming with .NET v2.0
  Part 4 - The Windows 2000-style Access Control editor

  This introduction is useful when you design or implement security editing features in applications deployed to customer desktops, or when you check the access rights of client users on securable objects in enterprise online services. Otherwise, you are not likely to use these dinosaur APIs.
  Things to learn from the design point of view:
  1. Data structures (such as SECURITY_DESCRIPTOR, ACL, ACE, SID) are meant to be opaque to developers, although they are defined as C/C++ structs in Windows header files. Developers are expected to access the members of those structures only through the Authorization APIs. It's a kind of encapsulation implemented in C.
  2. Binary data is hard for humans to work with, while a text format is human readable, which greatly improves manageability and reduces the chance of errors. Since security involves a lot of human activity (such as security administration), a text representation is definitely helpful.
  3. Compared with the traditional Unix access control mechanism, the Windows access control facility is more powerful, since you can control each individual principal's access rights. But it's also definitely ugly and complicated. Could we implement powerful functionality (like the Windows one) in a simple and elegant way (as in the Unix world)?

11/25/2007

Check File Existence on Windows Platform

There are several ways to accomplish this:

1. Use _access() in CRT.
#include <io.h>
if (_access(filename, 0) == 0)
{
  // file exists
}
2. Use GetFileAttributes() in Win32 or stat() in POSIX(Win32 has _stat()).
if (INVALID_FILE_ATTRIBUTES != GetFileAttributes(fileName))
{
   // file exists
}

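3. Just create it and check the returned error code, using CreateDirectory() or CreateFile() in Win32.
// CREATE_NEW fails with ERROR_FILE_EXISTS when the file is already there
// (CreateDirectory() reports ERROR_ALREADY_EXISTS instead).
HANDLE hFile = CreateFile(fileName, 0, 0, NULL, CREATE_NEW, FILE_FLAG_DELETE_ON_CLOSE, NULL);
if (hFile == INVALID_HANDLE_VALUE)
{
  if (GetLastError() == ERROR_FILE_EXISTS)
  {
    // file exists
  }
}
else
{
  // file did not exist; the probe file is deleted when the handle closes
  CloseHandle(hFile);
}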

4. Use FindFirstFile() in Win32.
WIN32_FIND_DATA wfd;
HANDLE hFind = FindFirstFile(fileName, &wfd);
if (hFind != INVALID_HANDLE_VALUE)
{
  FindClose(hFind); // search handles are closed with FindClose(), not CloseHandle()
  // file exists
}

5. Use other C++ file stream/file system library
  5.1 try open() on an ifstream in the C++ standard library; if the file does not exist, it will fail.
  5.2 try Boost library: boost::filesystem::exists(fileName)

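But be careful about atomicity and race conditions: with most of these methods, someone else can create or delete the file between your check and your next operation. If you want to check the file's existence and then do some operation on that file (such as opening it for read/write), solution No. 3 is best for you, because the check and the creation happen in a single call.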
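
The same "create it and look at the error" idea can be sketched in portable C++, as a rough analog of CREATE_NEW rather than a replacement for the Win32 calls. The sketch assumes the C runtime supports the C11 exclusive mode "x" for fopen and reports EEXIST; the names CreateResult and create_exclusive are invented for the example.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <string>

enum class CreateResult { Created, AlreadyExists, OtherError };

// Tries to create `path` exclusively. Mode "wx" makes the existence check
// and the creation one atomic step, so nobody can slip in between them.
CreateResult create_exclusive(const std::string& path) {
    std::FILE* f = std::fopen(path.c_str(), "wx");
    if (f != nullptr) {
        std::fclose(f);
        return CreateResult::Created;
    }
    // Distinguish "already there" from permission problems and the like.
    return errno == EEXIST ? CreateResult::AlreadyExists
                           : CreateResult::OtherError;
}
```

Unlike the FILE_FLAG_DELETE_ON_CLOSE variant, this sketch leaves the newly created file on disk, which is usually what you want when the next step is to write to it.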

11/20/2007

Create Multiple Level Directories

1. SHCreateDirectoryEx()
  This function is included in Windows Shell and Control component: header is shlobj.h, library is shell32.lib and DLL is shell32.dll.
  It will create all intermediate directories in the path parameter. Both ANSI and UNICODE version path characters are supported.

2. MakeSureDirectoryPathExists()
  This function is included in Windows DbgHelp component: header is Dbghelp.h, Library is Dbghelp.lib, DLL is Dbghelp.dll.
  It will create all directories from the root, but it only provides an ANSI version, no UNICODE version.

3. Use CreateDirectory() to build your own one(recursively or iteratively).
  The following code has the same semantics as CreateDirectoryW() (same return value and same GetLastError() value).
#include <windows.h>
#include <string>

// Creates dirPath together with any missing parent directories.
BOOL CreateFullPathDirectoryW(const wchar_t * dirPath,
    LPSECURITY_ATTRIBUTES secAttr)
{
    std::wstring curDir(dirPath);
    // strip trailing backslashes; the empty() test keeps L"" and L"\\" safe
    while (!curDir.empty() && curDir.back() == L'\\')
    {
        curDir.pop_back();
    }

    if (CreateDirectoryW(curDir.c_str(), secAttr))
    {
        return TRUE;
    }

    DWORD dwErr = GetLastError();
    if (ERROR_PATH_NOT_FOUND != dwErr)
    // already exists or other error
    {
        SetLastError(dwErr);
        return FALSE;
    }

    std::wstring::size_type pos = curDir.rfind(L'\\');
    if (pos == std::wstring::npos || pos == 0)
    // one level dir, no parent left to create
    {
        SetLastError(dwErr);
        return FALSE;
    }

    if (!CreateFullPathDirectoryW(curDir.substr(0, pos).c_str(), secAttr))
    // we have correct Win32 LastError here
    {
        return FALSE;
    }

    // create parent OK, now create the directory itself
    return CreateDirectoryW(curDir.c_str(), secAttr);
}
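
  Where a C++17 compiler is available, the standard library offers a portable alternative. The sketch below is not one of the Win32 options above and its semantics differ from CreateDirectoryW(): an already existing directory counts as success. The helper name ensure_directory is hypothetical.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Creates `dir` and any missing parents. Returns true when the directory
// exists afterwards, whether or not this call had to create it.
bool ensure_directory(const fs::path& dir) {
    std::error_code ec;
    fs::create_directories(dir, ec);  // non-throwing overload
    if (ec) return false;
    return fs::is_directory(dir, ec);
}
```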

9/14/2007

Web Security - XSS and SQL Injection

============================================================
XSS/CSS(Cross Site Scripting)

XSS arises because a web application can receive user input and send it to client browsers to render. Since web browsers execute JavaScript, a user of the web application can, in theory, write code that gets executed on other users' client machines - this is the root cause of all XSS attacks.

How can user-supplied script code get to run in other users' browsers?
1. js url protocol - use "javascript:your_script_code_here" as link destination
2. script tag - use plain script blocks: <script>your_script_code_here</script>
3. element event - use code like "onload=your_script_here" in html docs

What are the potential harms of XSS?
1. Cookie theft, the XSS code can read "document.cookie"
2. Session hijacking, using the user's cookie data to access the original site's services as a legitimate, logged-in user
3. Phishing, leading the user's browser to unintended web urls

How to avoid XSS attacks?
1. Restrict the data format of user input values and validate them on the server side (client side checks alone can be bypassed)
2. Do HTML encoding/escaping/filtering before writing strings to the http response as web page content for the client browser
3. Bind the session to the user's IP
4. Disable script execution, safe but it will limit the functionality of the web application
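
As an illustration of point 2, HTML escaping can be as small as the sketch below. The function name html_escape is invented for the example, and it only covers text placed in element content or quoted attribute values; script, style and URL contexts need their own encoding rules.

```cpp
#include <cassert>
#include <string>

// Replaces the characters that are special in HTML with entity references,
// so user input is rendered as text instead of being parsed as markup.
std::string html_escape(const std::string& input) {
    std::string out;
    out.reserve(input.size());
    for (char c : input) {
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&#39;";  break;
            default:   out += c;
        }
    }
    return out;
}
```

After escaping, an injected script block reaches the browser as harmless text such as &lt;script&gt;, which is displayed rather than executed.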

[Reference]
XSS FAQ and a chs version

XSS cheat sheet

Perl&XSS

XSS online tool

Case Studies (1, 2)

============================================================

Sql Injection

Sql Injection is a web attack technique in which attackers write sql code into user input values, which the application logic (most likely web based) then uses to execute sql statements in the database. The result is that attackers can write code to be executed by your DBMS. This may change your application logic (bypass authentication), harm your data (drop tables) and expose critical information (dump server files to the client).

The basic practices to solve this problem are:
1. Check/validate user input data, and reject or strip dangerous content
2. Prefer sql parameters and stored procedures over composing sql statements dynamically from string literals
3. Grant minimal permissions to the web application's account in the DBMS
4. Avoid disclosing database error information
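
To see why practice 2 matters, the sketch below contrasts literal composition with a parameterized form. It only builds strings and talks to no database; the table name users, the column name and the @name parameter are hypothetical, and in real code the parameter would be bound through your data access library.

```cpp
#include <cassert>
#include <string>

// UNSAFE on purpose: pastes user input straight into the statement text.
std::string build_login_query_unsafe(const std::string& user_name) {
    return "SELECT * FROM users WHERE name = '" + user_name + "'";
}

// Parameterized form: the statement text is fixed and the value travels
// separately, so the DBMS never parses the value as sql.
struct ParameterizedQuery {
    std::string text;
    std::string name_param;
};

ParameterizedQuery build_login_query_safe(const std::string& user_name) {
    return {"SELECT * FROM users WHERE name = @name", user_name};
}
```

Feeding the input ' OR '1'='1 to the unsafe builder yields a WHERE clause that is always true, while the parameterized form keeps the statement text unchanged and carries the odd input as plain data.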

SQL Server Book Online - SQL Injection

ASP.Net Best Practices - Protect from SQL Injection

Practices to Avoid SQL Injection

Web App Security - best practices from Microsoft

Case Studies (1, 2, 3)

SQL Injection Cheat Sheet

============================================================

General Web Security Reference

http://www.webhackingexposed.com/tools.html

http://www.cgisecurity.com

Sql Security

============================================================

9/01/2007

Internet Explorer and Its HTTP Connection Limits

  Many web developers have discussed the IE client connection problem. The Microsoft IE team blog posted a related article titled "Internet Explorer and Connection Limits" to explain the considerations behind this design. OpenAjax also has an article titled "HTTP connection limitation on AJAX" that discusses this limitation's impact on Ajax applications.

  According to the IE team's blog, the main reasoning behind it is:
  "It turns out that this is a case where IE strictly follows the standards-- in this case, RFC2616, which covers HTTP1.1. As noted in the RFC:

  Clients that use persistent connections SHOULD limit the number of simultaneous connections that they maintain to a given server. A single-user client SHOULD NOT maintain more than 2 connections with any server or proxy."


  To improve web application performance, you can do something on both the server side and the client side.

  On the client side, you can change this limit by editing the system registry:
  under the registry key "HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings", add/change the following two DWORD values:

  MaxConnectionsPerServer REG_DWORD (Default 2)
  Sets the number of simultaneous requests to a single HTTP 1.1 Server

  MaxConnectionsPer1_0Server REG_DWORD (Default 4)
  Sets the number of simultaneous requests to a single HTTP 1.0 Server

  In IE5 or later, it is also possible to change the connection limit programmatically by calling the InternetSetOption function on a NULL handle with the following options (note that this changes the connection limit for the whole process):
INTERNET_OPTION_MAX_CONNS_PER_SERVER and INTERNET_OPTION_MAX_CONNS_PER_1_0_SERVER.

  On the server side, IE treats the hostname, not the IP address, as the server, so a.yourname.com and b.yourname.com are different servers from IE's perspective. You can therefore use sub-domains to relieve the 2-connection limit even when users browse your web site with IE.
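
  One possible way to spread resources over sub-domains is to derive the hostname from a hash of the resource path, so the same resource always comes from the same host and stays cacheable. This is only a sketch: the img0/img1 host names and the shard_host function are hypothetical, and FNV-1a is used simply because it is tiny and gives the same value on every run.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// FNV-1a hash (32-bit): small, deterministic, good enough for sharding.
std::uint32_t fnv1a(const std::string& s) {
    std::uint32_t h = 2166136261u;
    for (unsigned char c : s) {
        h ^= c;
        h *= 16777619u;
    }
    return h;
}

// Maps a resource path to one of `host_count` hypothetical sub-domains.
std::string shard_host(const std::string& path, unsigned host_count) {
    if (host_count == 0) host_count = 1;  // avoid dividing by zero
    unsigned index = fnv1a(path) % host_count;
    return "img" + std::to_string(index) + ".yourname.com";
}
```

  A page generator would then emit image URLs such as "http://" + shard_host(path, 2) + path, letting the browser open connections to both hosts in parallel.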

  Here are some useful server side programming tips regarding the browser connection limits:
  1. Circumventing browser connection limits for fun and profit
  2. Improving web performance by distributing images among hostnames

  But for IE8, there is some good news for developers doing server side performance tuning:
  1. IE8: The Performance Implications
  2. Testing IE8’s Connection Parallelism
  3. Connectivity Enhancements in Internet Explorer 8
  4. Internet Explorer 8 and Maximum Concurrent Connections

8/02/2007

Convert C++ Code to HTML

  As a blogger focusing on computer technologies, from time to time I need to paste programming code (most often C/C++) to the web. But very few blog service providers support important features for code presentation, such as syntax highlighting and smart indentation. So we have to convert the code to html ourselves.

  Here I have collected some tips to help you do the conversion:

  1. Use online converter.
  http://www.bedaux.net/cpp2html/ is a great online cpp->html converter. It's also highly customizable - you can choose the color scheme of the final html code just by changing some variable values. But it uses CSS, so you must embed those CSS statements into your own blog template to make the final html code work in your blog pages.

  2. Use offline editor.
  Some powerful editors have built-in features to convert code into html format, for example the vim editor. To do the conversion, first copy your code into VIM or open your code file with VIM, make sure syntax highlighting is enabled (vim command: syntax enable), then run this command: "source $VIMRUNTIME/syntax/2html.vim". The html version of your original code will be opened in a new vim window. It's pure html code, no CSS used, so it's very handy to copy and paste into your blog articles.

  If you don't like the geek style editor, you can use a UI friendly editor: notepad++. It also has an extension to convert the code being edited into html format. While editing a source code file, choose "export as html" from the menu "plugin->export".

  For other languages such as Java and C#, there are many similar tools such as java2html and csharp2html. I also found a good add-on for Visual Studio 2008 called CopySourceAsHtml (CSAH); it can convert code into html format easily.

  UPDATE @ 11/28/2008
1. You can use "TOhtml" (be careful about the case) as a shortcut for "source $VIMRUNTIME/syntax/2html.vim" in VIM;
2. 2html only works well in the VIM GUI version (gvim); the console version will produce the wrong final color scheme.
3. For this tool, you'd better configure it with "let html_no_pre = 1" and "let html_number_lines = 1" (more info: help tohtml);
4. There is another great tool called HighLight, which can convert syntax highlighted code to html.

7/25/2007

Server Side Paging Using T-Sql

It's a common task to sort and page query results on the server side when doing database development. Here are some useful tips for doing this on Microsoft SQL Server using T-Sql.
To illustrate with real T-Sql code, suppose we have the following table definition and want to support sorting/paging on the birthday column in DESC order, with the following parameter definitions.

CREATE TABLE user_info (
    user_guid BIGINT IDENTITY(1, 1),
    first_name NVARCHAR(64),
    last_name NVARCHAR(64),
    birthday DATETIME
)

DECLARE @StartIndex BIGINT
DECLARE @PageSize INT

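1. Use TOP N function:
This method also works on older Sql Server versions (Sql Server 2000 accepts only a constant after TOP, so there the statement has to be built dynamically), but performance is poor on big tables. Here @StartIndex is the 1-based row number of the first row on the page.
    SELECT *
    FROM (SELECT TOP(@StartIndex + @PageSize - 1) *
      FROM user_info
      ORDER BY birthday DESC
    ) AS low_bd
    WHERE user_guid NOT IN (
      SELECT TOP(@StartIndex - 1) user_guid
      FROM user_info
      ORDER BY birthday DESC
    )
    ORDER BY birthday DESC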
2. Use ROW_NUMBER() and Common Table Expression:
This method is only supported in Sql Server 2005 and later, but has very low performance cost. Here @StartIndex is the 1-based row number of the first row on the page.

WITH u_cte AS (
  SELECT user_guid, ROW_NUMBER() OVER (ORDER BY birthday DESC) AS row_num
  FROM user_info
)
SELECT user_info.*, u_cte.row_num
FROM u_cte
JOIN user_info ON u_cte.user_guid = user_info.user_guid
WHERE u_cte.row_num BETWEEN @StartIndex AND (@StartIndex + @PageSize - 1)
ORDER BY u_cte.row_num ASC
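
If the calling code thinks in page numbers rather than row numbers, the inclusive row range for a page of ROW_NUMBER() values can be computed as in this small sketch. RowRange and page_to_rows are hypothetical names, and both page numbers and row numbers are assumed to be 1-based.

```cpp
#include <cassert>
#include <cstdint>

struct RowRange {
    std::int64_t first;  // row number of the first row on the page
    std::int64_t last;   // row number of the last row on the page
};

// Page numbers and row numbers are 1-based, matching ROW_NUMBER().
RowRange page_to_rows(std::int64_t page_number, std::int64_t page_size) {
    std::int64_t first = (page_number - 1) * page_size + 1;
    return {first, first + page_size - 1};
}
```

For a page size of 10, page 1 covers rows 1 to 10 and page 3 covers rows 21 to 30, so each page holds exactly page_size rows.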

7/11/2007

Hash Index in Microsoft SQL Server

In fact, there is no real Hash Index (an index implemented using a hash mechanism ONLY) in MS Sql Server; all indices are b-tree based.

But when you use a relatively long string field in a search condition or join condition, there is a way to improve the query performance. This technique uses both hash and index concepts, so it is named Hash Index in SQL Server Books Online.

This technique involves two steps:
1. Add a numeric column to the existing table, which holds the hash value of the string field

CREATE TABLE website(
ip nvarchar(16),
site_uri nvarchar(1024) NOT NULL,
uri_hash AS (checksum([site_uri]))
);

2. Build a non-clustered, non-unique index on this numeric field

CREATE INDEX idx_uri_hash ON
website(uri_hash) INCLUDE (site_uri);

Later, when you do an equality comparison on this field, you can compose your query like:
SELECT ip
FROM website
WHERE uri_hash = CHECKSUM(N'http://www.live.com')
AND site_uri = N'http://www.live.com'

Sql Server will use the hash index first, and only do the string comparison on the records returned from that step. The string comparison is still needed because different strings can share the same CHECKSUM value.
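
The two-step lookup can be imitated in a few lines of C++. This is a toy model, not how the storage engine works: toy_checksum is a made-up stand-in for CHECKSUM(), and an unordered_multimap plays the role of the index on the hash column.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

struct Website {
    std::string ip;
    std::string site_uri;
};

// Stand-in for CHECKSUM(): any cheap hash works; collisions are expected.
std::uint32_t toy_checksum(const std::string& s) {
    std::uint32_t h = 0;
    for (unsigned char c : s) h = h * 31u + c;
    return h;
}

class HashIndexedTable {
public:
    void insert(const Website& row) {
        index_.emplace(toy_checksum(row.site_uri), rows_.size());
        rows_.push_back(row);
    }

    // Step 1: narrow by hash. Step 2: confirm with the full string compare.
    std::vector<std::string> find_ips(const std::string& uri) const {
        std::vector<std::string> ips;
        auto range = index_.equal_range(toy_checksum(uri));
        for (auto it = range.first; it != range.second; ++it) {
            const Website& row = rows_[it->second];
            if (row.site_uri == uri) ips.push_back(row.ip);
        }
        return ips;
    }

private:
    std::vector<Website> rows_;
    std::unordered_multimap<std::uint32_t, std::size_t> index_;
};
```

The strings "Aa" and "BB" produce the same toy_checksum value, so a lookup for "Aa" visits both rows by hash and returns only the one whose full string matches.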

3/24/2007

Jeffrey Snover, you did it!

Here is an interview with the architect of Windows PowerShell (code name: Monad). Interview With Monad Architect

In the last question, Jeffrey said:
" As far as the learning curve goes, I expect it will be similar to what happened when I learned VMS’s DCL. I didn’t like it for the first 20 minutes because it was different than what I already knew. I then invested another 20 minutes in learning how to use it and then I was hooked. I was extremely productive because it was so consistent. I could guess about what to type and it would be right. Monad will be the same."

For this answer, I would say: "Yes, Jeffrey, You Did It!"

At first sight, the syntax of PowerShell looked so ugly to me that I could hardly believe it came from Microsoft! Later, while doing some project management scripting, I found cmd batch script too simple to meet my requirements, so I had to turn to PowerShell. I calmed down and spent some time digging into the RTW version of PowerShell.

After learning what PowerShell really is, I am ashamed of how shallow I was when I thought PS was garbage. PS is really an excellent shell and scripting language. It brings together OO, the .NET Framework, uniform storage/data access, a utility framework, and an open and extensible architecture, and it can greatly improve your productivity. From now on, you can control everything in Microsoft Windows through the command line interface! You can do many things in a simpler and more elegant way than the *nix world does!

Microsoft has made a commitment that future GUI configuration software will be built on top of it. Currently the Exchange Server 2007 management system and some other management systems already work that way. I think Microsoft has noticed the importance of manageability besides usability and a cool UI. This is really important for Windows Servers and for system administrators.

Looking forward to the brilliant future of PowerShell and Microsoft!