Yongji Wang's Blog: HTTP The Definitive Guide (Web Servers)

Web Servers

General-Purpose Software Web Servers
Web Server Appliances
Embedded Web Servers

What Real Web Servers Do
1. Set up connection—accept a client connection, or close if the client is unwanted.
2. Receive request—read an HTTP request message from the network.
3. Process request—interpret the request message and take action.
4. Access resource—access the resource specified in the message.
5. Construct response—create the HTTP response message with the right headers.
6. Send response—send the response back to the client.
7. Log transaction—place notes about the completed transaction in a log file.
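The seven steps can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any production server is built: `RESOURCES`, `handle_request`, and `serve` are hypothetical names, the "docroot" is an in-memory dictionary, and the request is read with a single `recv` call.

```python
import socket

# Hypothetical in-memory "docroot": URL path -> (MIME type, body bytes).
RESOURCES = {"/index.html": ("text/html", b"<h1>Hello</h1>")}


def handle_request(raw: bytes, client_ip: str, log: list) -> bytes:
    """Steps 3-5 and 7 for one request; the caller does steps 1, 2, and 6."""
    request_line = raw.split(b"\r\n", 1)[0].decode("ascii", "replace")

    # Step 3: process request (parse method, target, and version).
    parts = request_line.split()
    if len(parts) != 3:
        status, reason, ctype, body = 400, "Bad Request", "text/plain", b"bad request\n"
    else:
        method, target, _version = parts
        # Step 4: access resource (look the target up in the toy docroot).
        if method != "GET":
            status, reason, ctype, body = 405, "Method Not Allowed", "text/plain", b"GET only\n"
        elif target in RESOURCES:
            ctype, body = RESOURCES[target]
            status, reason = 200, "OK"
        else:
            status, reason, ctype, body = 404, "Not Found", "text/plain", b"not found\n"

    # Step 5: construct response (status line, headers, blank line, body).
    head = (f"HTTP/1.1 {status} {reason}\r\n"
            f"Content-Type: {ctype}\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Connection: close\r\n\r\n")

    # Step 7: log transaction (a list stands in for a log file).
    log.append(f'{client_ip} "{request_line}" {status} {len(body)}')
    return head.encode("ascii") + body


def serve(host="127.0.0.1", port=8080, blocked=frozenset()):
    """Accept loop: one connection at a time, one request per connection."""
    log = []
    with socket.create_server((host, port)) as server:
        while True:
            conn, (client_ip, _port) = server.accept()   # Step 1: set up connection
            with conn:
                if client_ip in blocked:
                    continue          # unwanted client: leaving the block closes it
                raw = conn.recv(65536)                    # Step 2: receive request
                conn.sendall(handle_request(raw, client_ip, log))  # Step 6: send
                print(log[-1])
```

Calling `serve()` and visiting http://127.0.0.1:8080/index.html should return the greeting. A real server would also loop on `recv` until the full message arrives, which this sketch skips.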

Step 1: Accepting Client Connections

Handling New Connections
Client Hostname Identification - Most web servers can be configured to convert client IP addresses into client hostnames, using “reverse DNS.”
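A reverse lookup might look like the sketch below. `client_hostname` is a hypothetical helper, and the `resolver` parameter exists only so the lookup can be swapped out for testing; the default is the standard library's `socket.gethostbyaddr`.

```python
import socket


def client_hostname(ip: str, resolver=socket.gethostbyaddr) -> str:
    """Return the hostname for `ip` via reverse DNS, or the IP itself on failure."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except OSError:  # socket.herror and socket.gaierror are both OSError subclasses
        return ip
```

Falling back to the bare IP address keeps a failed lookup from blocking the request. Reverse lookups can be slow, which is one reason a server may leave them turned off.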
Determining the Client User Through ident - Some web servers also support the IETF ident protocol. The ident protocol lets servers find out what username initiated an HTTP connection. This information is particularly useful for web server logging: the second field of the popular Common Log Format contains the ident username of each HTTP request.

If a client supports the ident protocol, the client listens on TCP port 113 for ident requests.

ident can work inside organizations, but it does not work well across the public Internet for many reasons, including:

• Many client PCs don’t run the identd Identification Protocol daemon software.
• The ident protocol significantly delays HTTP transactions.
• Many firewalls won’t permit incoming ident traffic.
• The ident protocol is insecure and easy to fabricate.
• The ident protocol doesn’t support virtual IP addresses well.
• There are privacy concerns about exposing client usernames.
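For reference, an ident exchange (RFC 1413) is a single line each way: the web server connects to port 113 on the client machine, sends the two port numbers of the HTTP connection, and reads back one reply line. The sketch below is a simplified illustration; `ident_query`, `parse_ident_reply`, and `lookup_ident` are hypothetical helper names, and the parser strips whitespace more freely than the RFC strictly allows.

```python
import socket
from typing import Optional


def ident_query(client_port: int, server_port: int) -> bytes:
    """Query line: the port on the client machine first, then the web server's port."""
    return f"{client_port}, {server_port}\r\n".encode("ascii")


def parse_ident_reply(reply: bytes) -> Optional[str]:
    """Return the username from a USERID reply, or None for ERROR or garbage."""
    text = reply.decode("ascii", "replace").strip()
    fields = [field.strip() for field in text.split(":", 3)]
    if len(fields) == 4 and fields[1] == "USERID":
        return fields[3]
    return None


def lookup_ident(client_ip: str, client_port: int, server_port: int,
                 timeout: float = 2.0) -> Optional[str]:
    """Ask the client's ident daemon who owns the connection; None on any failure."""
    try:
        with socket.create_connection((client_ip, 113), timeout=timeout) as sock:
            sock.sendall(ident_query(client_port, server_port))
            return parse_ident_reply(sock.recv(1024))
    except OSError:  # refused, filtered by a firewall, or timed out
        return None
```

Given the problems listed above, `lookup_ident` will return `None` for most clients on the public Internet, and the short timeout limits how long a failed lookup can delay the transaction.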

Step 2: Receiving Request Messages

Internal Representations of Messages

Connection Input/Output Processing Architectures

Single-threaded web servers
Multiprocess and multithreaded web servers
Multiplexed I/O servers
Multiplexed multithreaded web servers

Step 3: Processing Requests
Step 4: Mapping and Accessing Resources
Docroots

Virtually hosted docroots

User home directory docroots
Another common use of docroots gives people private web sites on a web server. A
typical convention maps URIs whose paths begin with a slash and tilde (/~) followed
by a username to a private document root for that user.
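The /~username convention can be sketched as a small path-mapping function. Everything named here is an assumption for illustration: the main docroot `/var/www/html`, the home base `/home`, and the per-user directory `public_html` are common defaults but are not specified in these notes, and `map_uri` is a hypothetical helper.

```python
import posixpath
import re
from typing import Optional
from urllib.parse import unquote

DOCROOT = "/var/www/html"   # assumed main document root
HOME_BASE = "/home"         # assumed location of user home directories
USER_DIR = "public_html"    # assumed per-user web directory name


def map_uri(path: str) -> Optional[str]:
    """Map a request path to a filesystem path, or None if the user name is invalid."""
    # Decode %XX escapes, then collapse "." and ".." so the result cannot climb
    # above the root of the URL space.
    clean = posixpath.normpath("/" + unquote(path).lstrip("/"))

    if clean.startswith("/~"):
        user, _, rest = clean[2:].partition("/")
        # Accept only plain user names; this rejects "", ".", and "..".
        if not re.fullmatch(r"[A-Za-z0-9_-]+", user):
            return None
        return posixpath.join(HOME_BASE, user, USER_DIR, rest)

    return posixpath.join(DOCROOT, clean.lstrip("/"))
```

For example, `map_uri("/~bob/index.html")` yields `/home/bob/public_html/index.html`, while a path without the tilde prefix stays under the main docroot. Normalizing before mapping is the design choice that matters: a request such as `/~bob/../../etc/passwd` is reduced to `/etc/passwd` first and therefore resolves inside the docroot.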

Dynamic Content Resource Mapping

Server-Side Includes (SSI)

Access Controls

Step 5: Building Responses
Response Entities
If there was a body, the response message usually contains:
• A Content-Type header, describing the MIME type of the response body
• A Content-Length header, describing the size of the response body
• The actual message body content
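As a small illustration of those three parts, the sketch below assembles a response by hand. `build_response` is a hypothetical helper; the reason phrase comes from the standard library's `http.HTTPStatus`, and the entity headers are added only when a body is present.

```python
from http import HTTPStatus


def build_response(status: int, body: bytes = b"",
                   content_type: str = "application/octet-stream") -> bytes:
    """Assemble a status line, entity headers (if there is a body), and the body."""
    lines = [f"HTTP/1.1 {status} {HTTPStatus(status).phrase}"]
    if body:
        lines.append(f"Content-Type: {content_type}")
        # Content-Length counts bytes on the wire, not characters.
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body
```

Note that the body is passed as bytes: the text "héllo" encoded as UTF-8 is six bytes long, so its Content-Length is 6, not 5.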

MIME Typing

mime.types
Magic typing
Explicit typing
Type negotiation
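Three of these approaches can be combined in one lookup, sketched below under some assumptions. `guess_mime` and the tiny `MAGIC` table are hypothetical; the extension lookup uses Python's `mimetypes` module, which plays the role of a mime.types file, and the order in which the methods are tried is a choice made for this sketch. Type negotiation is not covered.

```python
import mimetypes

# A few well-known file signatures ("magic numbers"); real tables are much longer.
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
]


def guess_mime(filename: str, head: bytes = b"", explicit: str = "",
               default: str = "application/octet-stream") -> str:
    """Pick a MIME type: explicit setting, then file extension, then magic bytes."""
    if explicit:                       # explicit typing: configuration wins
        return explicit
    by_extension, _encoding = mimetypes.guess_type(filename)
    if by_extension:                   # mime.types-style extension lookup
        return by_extension
    for signature, mime in MAGIC:      # magic typing: inspect the first bytes
        if head.startswith(signature):
            return mime
    return default
```

Here `head` is expected to hold the first few bytes of the file, which is enough for the signatures in the table.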

Redirection
Redirects are useful for:

Permanently moved resources
Temporarily moved resources
URL augmentation
Load balancing
Server affinity
Canonicalizing directory names
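The last case is easy to show in code: when a client asks for a directory without the trailing slash, the server can redirect it to the canonical name. This is a minimal sketch; `redirect` and `canonicalize_directory` are hypothetical helpers, and `base_url` stands for whatever scheme and host the server answers on.

```python
from typing import Optional


def redirect(location: str, permanent: bool = False) -> bytes:
    """Build a bodiless redirect: 301 for permanent moves, 302 for temporary ones."""
    status = "301 Moved Permanently" if permanent else "302 Found"
    head = (f"HTTP/1.1 {status}\r\n"
            f"Location: {location}\r\n"
            "Content-Length: 0\r\n\r\n")
    return head.encode("ascii")


def canonicalize_directory(base_url: str, path: str,
                           is_directory: bool) -> Optional[bytes]:
    """Redirect a directory requested without a trailing slash, else return None."""
    if is_directory and not path.endswith("/"):
        return redirect(f"{base_url}{path}/", permanent=True)
    return None
```

The trailing slash matters because relative links inside the directory's index page resolve against the request URL, so the slash-less form would make them resolve one level too high.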

Step 6: Sending Responses

Step 7: Logging
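
The Common Log Format mentioned under Step 1 is a natural thing to write at this point. The sketch below formats one entry; `common_log_line` is a hypothetical helper, and unknown ident or authenticated-user values are written as "-", which is the usual placeholder.

```python
from datetime import datetime, timezone


def common_log_line(host: str, ident: str, user: str, when: datetime,
                    request_line: str, status: int, size: int) -> str:
    """Format: host ident authuser [date] "request" status bytes."""
    stamp = when.strftime("%d/%b/%Y:%H:%M:%S %z")   # `when` must be timezone-aware
    return (f'{host} {ident or "-"} {user or "-"} [{stamp}] '
            f'"{request_line}" {status} {size}')
```

For a request with no ident or login information, the second and third fields are both "-".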

Tuesday, May 6, 2014