Tuesday, May 6, 2014

HTTP: The Definitive Guide (Web Servers)

Web Servers


  • General-Purpose Software Web Servers
  • Web Server Appliances
  • Embedded Web Servers

What Real Web Servers Do
1. Set up connection—accept a client connection, or close if the client is unwanted.
2. Receive request—read an HTTP request message from the network.
3. Process request—interpret the request message and take action.
4. Access resource—access the resource specified in the message.
5. Construct response—create the HTTP response message with the right headers.
6. Send response—send the response back to the client.
7. Log transaction—place notes about the completed transaction in a log file.
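
To make the seven steps concrete, here is a minimal sketch in Python. It is not how any production server is written. The `RESOURCES` table, the function names, and the one-line log format are assumptions made up for illustration, and the sketch only understands simple GET requests.

```python
import socket

# Hypothetical in-memory resources; a real server would read from a docroot.
RESOURCES = {"/": b"<h1>hello</h1>"}

def handle_connection(conn, client_ip, log):
    """Run steps 2-7 for one already-accepted connection."""
    with conn:
        # Step 2: receive request (this sketch reads the header block only).
        data = b""
        while b"\r\n\r\n" not in data:
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
        request_line = data.split(b"\r\n", 1)[0].decode("latin-1")

        # Step 3: process request (pull out the method and the path).
        parts = request_line.split()
        method = parts[0] if parts else ""
        path = parts[1] if len(parts) > 1 else ""

        # Step 4: access resource.
        if method != "GET":
            status, body = "501 Not Implemented", b"unsupported method"
        elif path not in RESOURCES:
            status, body = "404 Not Found", b"not found"
        else:
            status, body = "200 OK", RESOURCES[path]

        # Step 5: construct response with the right headers.
        head = (
            f"HTTP/1.1 {status}\r\n"
            "Content-Type: text/html\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Connection: close\r\n"
            "\r\n"
        ).encode("ascii")

        # Step 6: send response.
        conn.sendall(head + body)

    # Step 7: log transaction (an ad hoc format, not Common Log Format).
    log.append(f'{client_ip} "{request_line}" {status.split()[0]} {len(body)}')

def serve_once(listener, log):
    # Step 1: set up connection by accepting one client.
    conn, addr = listener.accept()
    handle_connection(conn, addr[0], log)
```

Calling `serve_once` in a loop on a listening socket gives a single-threaded server that handles one transaction at a time.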


Step 1: Accepting Client Connections

  • Handling New Connections
  • Client Hostname Identification - Most web servers can be configured to convert client IP addresses into client hostnames, using “reverse DNS.”
  • Determining the Client User Through ident - Some web servers also support the IETF ident protocol. The ident protocol lets servers find out what username initiated an HTTP connection. This information is particularly useful for web server logging: the second field of the popular Common Log Format contains the ident username of each HTTP request. If a client supports the ident protocol, the client listens on TCP port 113 for ident requests.
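
The reverse DNS lookup from the hostname bullet above can be sketched with Python's standard `socket.gethostbyaddr`. The function name `client_hostname` and the injectable `resolver` parameter are assumptions added here so the fallback path is easy to exercise. Lookups can be slow, which is one reason servers often make this optional.

```python
import socket

def client_hostname(ip, resolver=socket.gethostbyaddr):
    """Return the hostname for ip via reverse DNS, or the IP itself on failure."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return ip
```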


ident can work inside organizations, but it does not work well across the public Internet
for many reasons, including:

  • Many client PCs don’t run the identd Identification Protocol daemon software.
  • The ident protocol significantly delays HTTP transactions.
  • Many firewalls won’t permit incoming ident traffic.
  • The ident protocol is insecure and easy to fabricate.
  • The ident protocol doesn’t support virtual IP addresses well.
  • There are privacy concerns about exposing client usernames.
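
For reference, the ident exchange itself is a single line in each direction. The sketch below follows my reading of RFC 1413, where the query names the port on the machine running identd first and the port on the querying server second. The helper names are hypothetical, and stripping whitespace around the username is a simplification.

```python
def ident_query(client_port, server_port):
    """Build the query a web server would send to port 113 on the client machine.

    client_port: the TCP port the HTTP connection uses on the client machine.
    server_port: the TCP port the HTTP connection uses on the web server.
    """
    return f"{client_port}, {server_port}\r\n".encode("ascii")

def parse_ident_reply(line):
    """Return the username from a USERID reply, or None for ERROR or malformed replies."""
    fields = line.strip().split(":", 3)
    if len(fields) == 4 and fields[1].strip() == "USERID":
        # Simplification: surrounding whitespace is dropped from the username.
        return fields[3].strip()
    return None
```

A reply such as `6193, 80 : USERID : UNIX : stjohns` yields `stjohns`, while an `ERROR : NO-USER` reply yields `None`. A server would then typically log `-` in the ident field.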

Step 2: Receiving Request Messages

Internal Representations of Messages
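
One way to picture an internal representation is a small record holding the parsed start line and headers. The `Request` class and `parse_request_head` function below are assumptions for illustration, not the data structures of any particular server. The sketch ignores the body, repeated headers, and malformed input.

```python
from dataclasses import dataclass, field

@dataclass
class Request:
    method: str
    target: str
    version: str
    headers: dict = field(default_factory=dict)

def parse_request_head(raw: bytes) -> Request:
    """Parse the request line and headers that precede the blank line."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return Request(method, target, version, headers)
```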

Connection Input/Output Processing Architectures



  • Single-threaded web servers
  • Multiprocess and multithreaded web servers
  • Multiplexed I/O servers
  • Multiplexed multithreaded web servers
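
As a rough illustration of the multiplexed I/O style, the sketch below watches several connections from one thread using Python's standard `selectors` module. The function name and the `respond` callback are assumptions. A real server would also register the listening socket and buffer partial writes instead of calling `sendall` on a non-blocking socket.

```python
import selectors

def serve_until_closed(connections, respond):
    """Serve several already-accepted connections from a single thread."""
    sel = selectors.DefaultSelector()
    for conn in connections:
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ)
    open_count = len(connections)
    while open_count:
        # Block until at least one connection has something to read.
        for key, _events in sel.select():
            conn = key.fileobj
            data = conn.recv(4096)
            if data:
                # Fine for small replies; large ones need write buffering.
                conn.sendall(respond(data))
            else:
                # An empty read means the peer has finished sending.
                sel.unregister(conn)
                conn.close()
                open_count -= 1
    sel.close()
```

The single-threaded model would instead finish one connection before touching the next, and the multithreaded model would hand each connection to its own thread.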

Step 3: Processing Requests

Step 4: Mapping and Accessing Resources
Docroots

Virtually hosted docroots

User home directory docroots
Another common use of docroots gives people private web sites on a web server. A typical convention maps URIs whose paths begin with a slash and tilde (/~) followed by a username to a private document root for that user.
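
A minimal sketch of this mapping follows. The paths `/var/www/html` and `/home/{user}/public_html` are assumed conventions, not something the notes specify, and the username check is deliberately strict. Normalizing the path first keeps `..` segments from escaping the document root.

```python
import posixpath
from urllib.parse import unquote, urlsplit

DOCROOT = "/var/www/html"               # hypothetical main document root
USER_DIR = "/home/{user}/public_html"   # hypothetical per-user document root

def map_uri_to_path(uri):
    """Map a request URI to a filesystem path, or None if the user part is invalid."""
    path = unquote(urlsplit(uri).path)
    root = DOCROOT
    if path.startswith("/~"):
        user, _, rest = path[2:].partition("/")
        if not user.isalnum():
            return None  # rejects empty names and names like ".."
        root = USER_DIR.format(user=user)
        path = "/" + rest
    # normpath on an absolute path collapses "." and ".." without going above "/".
    normalized = posixpath.normpath("/" + path.lstrip("/"))
    return root + normalized
```

Virtually hosted docroots would add one more lookup before this step, choosing the root from the Host header or the destination IP address.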

Dynamic Content Resource Mapping

Server-Side Includes (SSI)

Access Controls


Step 5: Building Responses
Response Entities
If there was a body, the response message usually contains:

  • A Content-Type header, describing the MIME type of the response body
  • A Content-Length header, describing the size of the response body
  • The actual message body content

MIME Typing

  • mime.types
  • Magic typing
  • Explicit typing
  • Type negotiation
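
The first two approaches in the list can be sketched together: an extension lookup (Python's standard `mimetypes` module plays the role of a mime.types table here) with a fallback that inspects the leading bytes of the file. The `MAGIC` table is a tiny assumed sample, and explicit typing and type negotiation are not covered.

```python
import mimetypes

# A few well-known file signatures; a real magic table is much longer.
MAGIC = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]

def content_type_for(filename, first_bytes=b"", default="application/octet-stream"):
    """Pick a MIME type from the file extension, then from the file's leading bytes."""
    guessed, _encoding = mimetypes.guess_type(filename)
    if guessed:
        return guessed
    for prefix, mime in MAGIC:
        if first_bytes.startswith(prefix):
            return mime
    return default
```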

Redirection
Redirects are useful for:

  • Permanently moved resources
  • Temporarily moved resources
  • URL augmentation
  • Load balancing
  • Server affinity
  • Canonicalizing directory names
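
As one example from the list, canonicalizing a directory name usually means redirecting a request for a directory without a trailing slash to the same path with the slash added. The helper below is a hypothetical sketch. It assumes the path is already percent-encoded ASCII and that the caller knows whether the path names a directory.

```python
def directory_redirect(path, is_directory):
    """Return a 301 response for a slashless directory path, else None."""
    if is_directory and not path.endswith("/"):
        return (
            "HTTP/1.1 301 Moved Permanently\r\n"
            f"Location: {path}/\r\n"
            "Content-Length: 0\r\n"
            "\r\n"
        ).encode("ascii")
    return None
```

A temporarily moved resource would use the same shape with a 302 or 307 status instead of 301.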

Step 6: Sending Responses


Step 7: Logging
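
Since the Common Log Format came up under ident, here is a sketch of one log line in that format as I understand it: host, ident username, authenticated user, timestamp, request line, status, and size, with `-` standing in for missing values. The function name is an assumption. Note that `%b` follows the process locale, so the month name assumes the default C locale.

```python
from datetime import datetime, timezone

def common_log_line(host, ident, authuser, when, request_line, status, size):
    """Format one transaction in Common Log Format."""
    timestamp = when.strftime("%d/%b/%Y:%H:%M:%S %z")
    return (
        f'{host} {ident or "-"} {authuser or "-"} [{timestamp}] '
        f'"{request_line}" {status} {size if size else "-"}'
    )
```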
