Sunday, June 8, 2014

HTTP The Definitive Guide (Redirection and Load Balancing)

Redirection and Load Balancing
In this chapter, we’ll take a look at the following redirection techniques, how they
work, and what their load-balancing capabilities are (if any):

  • HTTP redirection
  • DNS redirection
  • Anycast routing
  • Policy routing
  • IP MAC forwarding
  • IP address forwarding
  • The Web Cache Coordination Protocol (WCCP)
  • The Internet Cache Protocol (ICP)
  • The Hyper Text Caching Protocol (HTCP)
  • The Network Element Control Protocol (NECP)
  • The Cache Array Routing Protocol (CARP)
  • The Web Proxy Autodiscovery Protocol (WPAD)
Where to Redirect

Servers, proxies, caches, and gateways all appear to clients as servers, in the sense that a client sends them an HTTP request, and they process it. Many redirection techniques work for servers, proxies, caches, and gateways because of their common, server-like traits.
Web servers handle requests on a per-IP basis.
Proxies tend to handle requests on a per-protocol basis.


Overview of Redirection Protocols
The direction that an HTTP message takes on its way through the Internet is affected by the HTTP applications and routing devices it passes from, through, and toward. For example:

  • The browser application that creates the client’s message could be configured to send it to a proxy server.
  • DNS resolvers choose the IP address that is used for addressing the message. This IP address can be different for different clients in different geographical locations.
  • As the message passes through networks, it is divided into addressed packets; switches and routers examine the TCP/IP addressing on the packets and make decisions about routing the packets on that basis.
  • Web servers can bounce requests back to different web servers with HTTP redirects.
Table 20-1 summarizes the redirection methods used to redirect messages to servers, each of which is discussed later in this chapter.

Table 20-2 summarizes the redirection methods used to redirect messages to proxy servers.

General Redirection Methods

HTTP Redirection
DNS Redirection
DNS allows several IP addresses to be associated with a single domain, and DNS resolvers can be configured or programmed to return varying IP addresses.

DNS round robin
DNS round robin uses a feature of DNS hostname resolution to balance load across a farm of web servers. It is a pure load-balancing strategy, and it does not take into account any factors about the location of the client relative to the server or the current stress on the server.
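
The rotation can be illustrated with a toy resolver. This is only a sketch: the class name, hostname, and addresses are made up for illustration, and a real DNS server does this inside its own answer logic.

```python
from collections import deque


class RoundRobinResolver:
    """Toy resolver that rotates the address list on every lookup."""

    def __init__(self, records):
        # records maps a hostname to its list of IP addresses (hypothetical data).
        self._records = {name: deque(addrs) for name, addrs in records.items()}

    def resolve(self, hostname):
        addrs = self._records[hostname]
        answer = list(addrs)  # snapshot of the current order
        addrs.rotate(-1)      # move the first address to the end for the next lookup
        return answer


resolver = RoundRobinResolver(
    {"www.example.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"]}
)
first = resolver.resolve("www.example.com")
second = resolver.resolve("www.example.com")
```

Most clients use the first address in the answer, so successive lookups are spread across the farm without any knowledge of client location or server load.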

Multiple addresses and round-robin address rotation
DNS round robin for load balancing

The impact of DNS caching

Other DNS-based redirection algorithms

  • Load-balancing algorithms - Some DNS servers keep track of the load on the web servers and place the least-loaded web servers at the front of the list.
  • Proximity-routing algorithms - DNS servers can attempt to direct users to nearby web servers, when the farm of web servers is geographically dispersed.
  • Fault-masking algorithms - DNS servers can monitor the health of the network and route requests away from service interruptions or other faults.
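
As a rough sketch of how the load-balancing and fault-masking ideas could combine when ordering a DNS answer (the function name, the load figures, and the health set are all assumptions for illustration, not any particular DNS server's behavior):

```python
def order_answers(addresses, load, healthy):
    """Drop unhealthy servers, then put the least-loaded server first.

    addresses: list of server IPs for the hostname
    load:      dict of IP -> current load (any comparable number)
    healthy:   set of IPs that passed the most recent health check
    """
    candidates = [addr for addr in addresses if addr in healthy]
    return sorted(candidates, key=lambda addr: load[addr])


farm = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
load = {"192.0.2.1": 0.9, "192.0.2.2": 0.2, "192.0.2.3": 0.5}
healthy = {"192.0.2.1", "192.0.2.2"}

answer = order_answers(farm, load, healthy)
```

A proximity-routing variant would sort on an estimated distance to the client instead of on load.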

Anycast Addressing
In anycast addressing, several geographically dispersed web servers have the exact same IP address and rely on the “shortest-path” routing capabilities of backbone routers to send client requests to the server nearest to the client.
IP MAC Forwarding

Because MAC address forwarding is point-to-point only, the server or proxy has to be located one hop away from the switch.

IP Address Forwarding
In IP address forwarding, a switch or other layer 4–aware device examines TCP/IP addressing on incoming packets and routes packets accordingly by changing the destination IP address, instead of the destination MAC address.
This type of forwarding also is called Network Address Translation (NAT).

Two ways to control the return path of the response are:
  • Change the source IP address of the packet to the IP address of the switch. - This is called full NAT, where the IP forwarding device translates both destination and source IP addresses.
  • If the source IP address remains the client’s IP address, make sure (from a hardware perspective) that no routes exist directly from server to client (bypassing the switch). - This sometimes is called half NAT.
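
The difference between the two approaches comes down to which address fields the forwarding device rewrites. A minimal sketch, assuming a simplified packet with only source and destination addresses (the `Packet` type and function names are hypothetical):

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str


def half_nat(packet, server_ip):
    """Rewrite only the destination; the server still sees the client's address."""
    return replace(packet, dst_ip=server_ip)


def full_nat(packet, server_ip, switch_ip):
    """Rewrite destination and source, so the reply must come back via the switch."""
    return replace(packet, src_ip=switch_ip, dst_ip=server_ip)


incoming = Packet(src_ip="198.51.100.7", dst_ip="203.0.113.10")
half = half_nat(incoming, server_ip="10.0.0.5")
full = full_nat(incoming, server_ip="10.0.0.5", switch_ip="10.0.0.1")
```

With full NAT the server loses sight of the client's real address, which is one reason half NAT is sometimes preferred when the network topology can guarantee the return path.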

Network Element Control Protocol
The Network Element Control Protocol (NECP) allows network elements (NEs), devices such as routers and switches that forward IP packets, to talk with server elements (SEs), devices such as web servers and proxy caches that serve application layer requests.

Messages
Proxy Redirection Methods
Explicit Browser Configuration
Proxy Auto-configuration

Web Proxy Autodiscovery Protocol
PAC file autodiscovery
An HTTP client that implements the WPAD protocol:

  • Uses WPAD to find the PAC file's configuration URL (CURL)
  • Fetches the PAC file (a.k.a. configuration file, or CFILE) corresponding to the CURL
  • Executes the PAC file to determine the proxy server
  • Sends HTTP requests to the proxy server returned by the PAC file


WPAD algorithm
The current WPAD specification defines the following techniques, in order:

  • DHCP (Dynamic Host Configuration Protocol)
  • SLP (Service Location Protocol)
  • DNS well-known hostnames
  • DNS SRV records
  • DNS service URLs in TXT records
Of these five mechanisms, only the DHCP and DNS well-known hostname techniques are required for WPAD clients.

Consider a client with hostname johns-desktop.development.foo.com. This is the
sequence of discovery attempts a complete WPAD client would perform:

  • DHCP
  • SLP
  • DNS A lookup on “QNAME=wpad.development.foo.com”
  • DNS SRV lookup on “QNAME=wpad.development.foo.com”
  • DNS TXT lookup on “QNAME=wpad.development.foo.com”
  • DNS A lookup on “QNAME=wpad.foo.com”
  • DNS SRV lookup on “QNAME=wpad.foo.com”
  • DNS TXT lookup on “QNAME=wpad.foo.com”
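
The sequence above can be generated mechanically from the client's hostname. The sketch below is an illustration only: the function name is made up, and the rule of stopping while at least two labels remain (so that "wpad.com" is never queried) is a simplifying assumption that matches this example rather than a full statement of the specification.

```python
def wpad_discovery_sequence(hostname):
    """List the discovery attempts a complete WPAD client would make, in order."""
    steps = ["DHCP", "SLP"]
    labels = hostname.split(".")[1:]  # drop the host's own label
    # Assumption: stop once only the top-level label is left.
    while len(labels) >= 2:
        qname = "wpad." + ".".join(labels)
        for record_type in ("A", "SRV", "TXT"):
            steps.append(f"DNS {record_type} lookup on QNAME={qname}")
        labels = labels[1:]  # strip the leftmost label and try the parent domain
    return steps


sequence = wpad_discovery_sequence("johns-desktop.development.foo.com")
for step in sequence:
    print(step)
```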
CURL discovery using DHCP
DNS A record lookup
Retrieving the PAC file
Once a candidate CURL is created, the WPAD client usually makes a GET request to the CURL. When making requests, WPAD clients are required to send Accept headers with appropriate CFILE format information that they are capable of handling.
For example:
Accept: application/x-ns-proxy-autoconfig
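
A minimal sketch of building such a request with Python's standard library. The CURL shown is a hypothetical example, and the request is only constructed here, not sent.

```python
import urllib.request


def build_pac_request(curl):
    """Build (but do not send) a GET request for the PAC file at the given CURL."""
    return urllib.request.Request(
        curl,
        headers={"Accept": "application/x-ns-proxy-autoconfig"},
        method="GET",
    )


# Hypothetical CURL, as might result from the DNS well-known hostname method.
request = build_pac_request("http://wpad.example.com/wpad.dat")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual fetch.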

When to execute WPAD
The web proxy autodiscovery process is required to occur at least as frequently as one of the following:

  • Upon startup of the web client—WPAD is performed only for the start of the first instance. Subsequent instances inherit the settings.
  • Whenever there is an indication from the networking stack that the IP address of the client host has changed.
WPAD spoofing
Timeouts

Administrator considerations
Administrators should configure at least one of the DHCP or DNS A record lookup methods in their environments, as those are the only two that all compatible clients are required to implement.

Cache Redirection Methods
WCCP Redirection
Cisco Systems developed the Web Cache Coordination Protocol (WCCP) to enable routers to redirect web traffic to proxy caches.

How WCCP redirection works

Start with a network containing WCCP-enabled routers and caches that can communicate with one another.

  • A set of routers and their target caches form a WCCP service group. The configuration of the service group specifies what traffic is sent where, how traffic is sent, and how load should be balanced among the caches in the service group.
  • If the service group is configured to redirect HTTP traffic, routers in the service group send HTTP requests to caches in the service group.
  • When an HTTP request arrives at a router in the service group, the router chooses one of the caches in the service group to serve the request (based on either a hash on the request’s IP address or a mask/value set pairing scheme).
  • The router sends the request packets to the cache, either by encapsulating the packets with the cache’s IP address or by IP MAC forwarding.
  • If the cache cannot serve the request, the packets are returned to the router for normal forwarding.
  • The members of the service group exchange heartbeat messages with one another, continually verifying one another’s availability.
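
The hash-based cache selection in the steps above can be sketched as follows. This is not the real WCCP hash function: the XOR-of-bytes hash, the 256-bucket table, and the even bucket assignment are assumptions chosen to keep the example small.

```python
import ipaddress

NUM_BUCKETS = 256  # assumed table size for this sketch


def bucket_for(ip):
    """Fold the address bytes into a bucket number (stand-in hash, not WCCP's)."""
    bucket = 0
    for byte in ipaddress.ip_address(ip).packed:
        bucket ^= byte
    return bucket  # always in the range 0..255


def assign_buckets(caches):
    """Spread the buckets evenly across the caches in the service group."""
    return [caches[i % len(caches)] for i in range(NUM_BUCKETS)]


def choose_cache(ip, table):
    return table[bucket_for(ip)]


caches = ["cache-a", "cache-b", "cache-c"]  # hypothetical cache names
table = assign_buckets(caches)
chosen = choose_cache("192.0.2.10", table)
```

Because the choice depends only on the address and the bucket table, every request for the same address lands on the same cache until the table is reassigned.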
WCCP2 messages
Message components
Each WCCP2 message consists of a header and components. The WCCP header information contains the message type (Here I Am, I See You, Assignment, or Removal Query), WCCP version, and message length (not including the length of the header).
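
To make the header description concrete, here is a sketch that packs and parses such a header. The field widths (32-bit type, 16-bit version, 16-bit length), the numeric type codes, and the version value are assumptions that should be checked against the WCCP2 specification before being relied on.

```python
import struct

# Assumed layout: 32-bit type, 16-bit version, 16-bit length, network byte order.
HEADER = struct.Struct("!IHH")

# Assumed numeric codes for the four message types named in the text.
MESSAGE_TYPES = {
    "HERE_I_AM": 10,
    "I_SEE_YOU": 11,
    "ASSIGNMENT": 12,
    "REMOVAL_QUERY": 13,
}
WCCP2_VERSION = 0x0200  # assumed version value


def build_message(type_name, components):
    """Prefix the component bytes with a header; length excludes the header itself."""
    header = HEADER.pack(MESSAGE_TYPES[type_name], WCCP2_VERSION, len(components))
    return header + components


def parse_header(message):
    """Return (type name, version, component length) from the start of a message."""
    msg_type, version, length = HEADER.unpack_from(message)
    names = {code: name for name, code in MESSAGE_TYPES.items()}
    return names[msg_type], version, length


message = build_message("HERE_I_AM", b"\x00" * 12)  # 12 placeholder component bytes
parsed = parse_header(message)
```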


Service groups
A service group consists of a set of WCCP-enabled routers and caches that exchange WCCP messages.

GRE packet encapsulation
Routers that support WCCP redirect HTTP packets to a particular server by encapsulating them with the server’s IP address. The packet encapsulation also contains an IP header proto field that indicates Generic Routing Encapsulation (GRE).
WCCP load balancing

Internet Cache Protocol
The Internet Cache Protocol (ICP) allows caches to look for content hits in sibling caches.
ICP can be thought of as a cache clustering protocol.
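
The sibling lookup idea can be sketched with in-memory stand-ins for the caches. Real ICP exchanges query and reply messages over the network; the function name, cache names, and dictionary-based caches here are assumptions made purely for illustration.

```python
def find_sibling_hit(url, siblings):
    """Ask each sibling in turn whether it holds the URL; return the first that does."""
    for name, cached_urls in siblings.items():
        if url in cached_urls:  # stands in for a hit reply from that sibling
            return name
    return None  # every sibling missed; the caller goes to a parent or the origin


siblings = {
    "cache-a": {"http://www.example.com/index.html"},
    "cache-b": {"http://www.example.com/logo.png"},
}
hit = find_sibling_hit("http://www.example.com/logo.png", siblings)
miss = find_sibling_hit("http://www.example.com/missing.html", siblings)
```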



Cache Array Routing Protocol
The Cache Array Routing Protocol (CARP) is a standard proposed by Microsoft Corporation and Netscape Communications Corporation to administer a collection of proxy servers such that an array of proxy servers appears to clients as one logical cache.
In contrast to a group of ICP caches, a collection of servers connected using CARP operates as a single, large server, with each component server containing only a fraction of the total cached documents.
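
One way to picture how each URL ends up on exactly one member of the array is a highest-score hash: combine the URL with each proxy's name, and send the request to the proxy with the largest result. The sketch below uses SHA-256 as a stand-in; CARP defines its own hash function, and the proxy names are hypothetical.

```python
import hashlib


def score(url, proxy):
    """Combined score for a (proxy, URL) pair; SHA-256 is a stand-in hash."""
    digest = hashlib.sha256(f"{proxy}|{url}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_proxy(url, proxies):
    """Route the URL to the proxy with the highest combined score."""
    return max(proxies, key=lambda proxy: score(url, proxy))


proxies = ["proxy1.example.com", "proxy2.example.com", "proxy3.example.com"]
url = "http://www.example.com/index.html"
owner = pick_proxy(url, proxies)
```

A useful property of this scheme is that removing a proxy only remaps the URLs that proxy owned; every other URL keeps its existing owner.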
Hyper Text Caching Protocol
The difference between an ICP and an HTCP transaction is in the level of detail in the requests and responses.

HTCP Authentication
Setting Caching Policies