Friday, May 23, 2014

HTTP: The Definitive Guide (Client Identification and Cookies)

Client Identification and Cookies
The Personal Touch

  • Personal greetings
  • Targeted recommendations
  • Administrative information on file
  • Session tracking
HTTP Headers
Client IP Address
User Login

Fat URLs
URLs modified to include user state information are called fat URLs.

Fat URLs can be used to identify users as they browse a site, but the technique has several serious problems, including:

Ugly URLs
    The fat URLs displayed in the browser are confusing for new users.
Can’t share URLs
    The fat URLs contain state information about a particular user and session. If you mail that URL to someone else, you may inadvertently be sharing your accumulated personal information.
Breaks caching
    Generating user-specific versions of each URL means that there are no longer commonly accessed URLs to cache.
Extra server load
    The server needs to rewrite HTML pages to fatten the URLs.
Escape hatches
    It is too easy for a user to accidentally “escape” from the fat URL session by jumping to another site or by requesting a particular URL. Fat URLs work only if the user strictly follows the premodified links. If the user escapes, he may lose his progress (perhaps a filled shopping cart) and will have to start again.
Not persistent across sessions
    All information is lost when the user logs out, unless he bookmarks the particular fat URL.
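
To make the "extra server load" point concrete, here is a minimal sketch of how a server might fatten the links in a page. The function names, the regular expression, and the choice of appending the session ID as a trailing path segment are assumptions for illustration, not the scheme any particular site uses.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Hypothetical pattern: only handles double-quoted href attributes.
HREF = re.compile(r'href="([^"]+)"')


def fatten_url(url: str, session_id: str) -> str:
    """Append the session ID to the URL path as an extra path segment."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") + "/" + session_id
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def fatten_links(html: str, session_id: str) -> str:
    """Rewrite every href in the page so it carries the session ID."""
    return HREF.sub(
        lambda m: 'href="%s"' % fatten_url(m.group(1), session_id), html
    )


page = '<a href="/tools/index.html">Tools</a>'
print(fatten_links(page, "002-1145265-8016838"))
```

Every page served has to pass through a rewrite like this, and any link the server did not rewrite becomes an escape hatch.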

Cookies
Types of Cookies
    You can classify cookies broadly into two types: session cookies and persistent cookies.
The only difference between session cookies and persistent cookies is when they expire. As we will see later, a cookie is a session cookie if its Discard parameter is set, or if there is no Expires or Max-Age parameter indicating an extended expiration time.
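
The rule above can be sketched as a small classifier. This is a simplified illustration: the function name is made up, and splitting on ";" assumes no semicolons appear inside quoted values.

```python
def cookie_type(set_cookie_value: str) -> str:
    """Classify a Set-Cookie / Set-Cookie2 value as 'session' or 'persistent'."""
    # Collect attribute names that follow the leading name=value pair.
    attrs = set()
    for part in set_cookie_value.split(";")[1:]:
        attrs.add(part.split("=", 1)[0].strip().lower())

    if "discard" in attrs:
        return "session"      # Discard forces a session cookie
    if "expires" in attrs or "max-age" in attrs:
        return "persistent"   # an extended expiration time was given
    return "session"          # no expiration information at all


print(cookie_type("id=29046"))
print(cookie_type("id=29046; Max-Age=3600"))
```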

How Cookies Work

Cookie Jar: Client-Side State
Because the browser is responsible for storing the cookie information, this system is called client-side state. The official name for the cookie specification is the HTTP State Management Mechanism.

Netscape Navigator cookies

domain
    The domain of the cookie
allh
    Whether all hosts in a domain get the cookie, or only the specific host named
path
    The path prefix in the domain associated with the cookie
secure
    Whether we should send this cookie only if we have an SSL connection
expiration
    The cookie expiration date in seconds since Jan 1, 1970 00:00:00 GMT
name
    The name of the cookie variable
value
    The value of the cookie variable
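
A rough sketch of reading one line of such a cookie file into these seven fields. It assumes the fields are tab-separated and that the boolean fields are written as TRUE/FALSE; the class and function names and the sample line are made up for illustration.

```python
from typing import NamedTuple


class NetscapeCookie(NamedTuple):
    domain: str
    allh: bool        # all hosts in the domain, or only the named host
    path: str
    secure: bool      # send only over an SSL connection
    expiration: int   # seconds since Jan 1, 1970 00:00:00 GMT
    name: str
    value: str


def parse_cookie_line(line: str) -> NetscapeCookie:
    """Split one tab-separated cookie file line into its seven fields."""
    domain, allh, path, secure, expiration, name, value = line.rstrip("\n").split("\t")
    return NetscapeCookie(
        domain, allh == "TRUE", path, secure == "TRUE", int(expiration), name, value
    )


sample = ".example.com\tTRUE\t/\tFALSE\t1007884800\tsession-id\t002-1145265-8016838"
print(parse_cookie_line(sample))
```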

Microsoft Internet Explorer cookies

Different Cookies for Different Sites
Cookie Domain attribute
    Set-Cookie: user="mary17"; domain="airtravelbargains.com"
Cookie Path attribute
    Set-Cookie: pref=compact; domain="airtravelbargains.com"; path=/autos/
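
A minimal sketch of how a client might decide which of the two cookies above to send. The matching rules are simplified (suffix match on the domain, prefix match on the path), and the function names, the in-memory jar, and the default path of "/" are assumptions.

```python
def domain_matches(host: str, cookie_domain: str) -> bool:
    """True if host is the cookie domain or a host inside it."""
    host = host.lower()
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


def path_matches(request_path: str, cookie_path: str) -> bool:
    """True if the cookie path is a prefix of the request path."""
    return request_path.startswith(cookie_path)


def cookies_to_send(host: str, path: str, jar: list) -> list:
    """Names of the cookies in the jar that apply to this request."""
    return [
        c["name"]
        for c in jar
        if domain_matches(host, c["domain"])
        and path_matches(path, c.get("path", "/"))  # assumed default path
    ]


JAR = [
    {"name": "user", "value": "mary17", "domain": "airtravelbargains.com"},
    {"name": "pref", "value": "compact", "domain": "airtravelbargains.com", "path": "/autos/"},
]

print(cookies_to_send("www.airtravelbargains.com", "/specials.html", JAR))
print(cookies_to_send("www.airtravelbargains.com", "/autos/cheapo/index.html", JAR))
```

A request outside /autos/ gets only the user cookie; a request under /autos/ gets both.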
 
Cookie Ingredients
    There are two different versions of cookie specifications in use: Version 0 cookies (sometimes called “Netscape cookies”), and Version 1 (“RFC 2965”) cookies.
Version 0 (Netscape) Cookies
    Set-Cookie: name=value [; expires=date] [; path=path] [; domain=domain] [; secure]
    Cookie: name1=value1 [; name2=value2] ...

When a client sends requests, it includes all the unexpired cookies that match the domain, path, and secure filters to the site. All the cookies are combined into a Cookie header:
Cookie: session-id=002-1145265-8016838; session-id-time=1007884800
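
As a small illustration, Python's standard http.cookies module can parse the Set-Cookie values and the pairs can then be joined into one Cookie header. The two Set-Cookie values below (including path=/) are assumed inputs chosen to reproduce the header above; domain, path, and expiry filtering is left out of this sketch.

```python
from http.cookies import SimpleCookie

# Hypothetical Set-Cookie values received earlier from the site.
received = [
    "session-id=002-1145265-8016838; path=/",
    "session-id-time=1007884800; path=/",
]

jar = SimpleCookie()
for value in received:
    jar.load(value)  # parses name=value plus attributes such as path


def cookie_header(jar: SimpleCookie) -> str:
    """Combine every stored cookie into a single Version 0 Cookie header."""
    pairs = "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())
    return "Cookie: " + pairs


print(cookie_header(jar))
```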

Version 1 (RFC 2965) Cookies

Version 1 Set-Cookie2 header


Version 1 Cookie header
Each matching cookie must include any Domain, Port, or Path attributes from the corresponding Set-Cookie2 headers.
For example, assume the client has received these five Set-Cookie2 responses in the past from the www.joes-hardware.com web site:
    Set-Cookie2: ID="29046"; Domain=".joes-hardware.com"
    Set-Cookie2: color=blue
    Set-Cookie2: support-pref="L2"; Domain="customer-care.joes-hardware.com"
    Set-Cookie2: Coupon="hammer027"; Version="1"; Path="/tools"
    Set-Cookie2: Coupon="handvac103"; Version="1"; Path="/tools/cordless"
If the client makes another request for path /tools/cordless/specials.html, it will pass along a long Cookie header like this:
    Cookie: $Version="1";
                 ID="29046"; $Domain=".joes-hardware.com";
                 color="blue";
                 Coupon="hammer027"; $Path="/tools";
                 Coupon="handvac103"; $Path="/tools/cordless"
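
The selection above can be reproduced with a short sketch. It assumes the request goes to www.joes-hardware.com, that a cookie without a Domain applies to that host, and that a cookie without a Path applies to every path; the function names and the dictionary layout are made up. Cookies are emitted in the order they were stored, as in the example.

```python
def domain_matches(host: str, cookie_domain: str) -> bool:
    """True if host is the cookie domain or a host inside it."""
    domain = cookie_domain.lower().lstrip(".")
    return host.lower() == domain or host.lower().endswith("." + domain)


def cookie_header_v1(host: str, path: str, cookies: list) -> str:
    """Build a Version 1 Cookie header, echoing $Domain and $Path filters."""
    parts = ['$Version="1"']
    for c in cookies:
        domain, cpath = c.get("domain"), c.get("path")
        if domain is not None and not domain_matches(host, domain):
            continue  # cookie belongs to a different site
        if cpath is not None and not path.startswith(cpath):
            continue  # cookie path is not a prefix of the request path
        parts.append(f'{c["name"]}="{c["value"]}"')
        if domain is not None:
            parts.append(f'$Domain="{domain}"')
        if cpath is not None:
            parts.append(f'$Path="{cpath}"')
    return "Cookie: " + "; ".join(parts)


STORED = [
    {"name": "ID", "value": "29046", "domain": ".joes-hardware.com"},
    {"name": "color", "value": "blue"},
    {"name": "support-pref", "value": "L2", "domain": "customer-care.joes-hardware.com"},
    {"name": "Coupon", "value": "hammer027", "path": "/tools"},
    {"name": "Coupon", "value": "handvac103", "path": "/tools/cordless"},
]

print(cookie_header_v1("www.joes-hardware.com", "/tools/cordless/specials.html", STORED))
```

The support-pref cookie is dropped because www.joes-hardware.com is not inside customer-care.joes-hardware.com.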

Version 1 Cookie2 header and version negotiation
    The Cookie2 request header is used to negotiate interoperability between clients and servers that understand different versions of the cookie specification. The Cookie2 header advises the server that the user agent understands new-style cookies and provides the version of the cookie standard supported (it would have made more sense to call it Cookie-Version):
        Cookie2: $Version="1"
    If the server understands new-style cookies, it recognizes the Cookie2 header and should send Set-Cookie2 (rather than Set-Cookie) response headers. If a client gets both a Set-Cookie and a Set-Cookie2 header for the same cookie, it ignores the old Set-Cookie header.
    If a client supports both Version 0 and Version 1 cookies but gets a Version 0 Set-Cookie header from the server, it should send cookies with the Version 0 Cookie header. However, the client also should send Cookie2: $Version="1" to give the server indication that it can upgrade.
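
A rough sketch of both sides of this negotiation. The function names and the plain-dict representation of headers are assumptions; real header handling has more cases than this.

```python
def choose_set_cookie_header(request_headers: dict, server_supports_v1: bool) -> str:
    """Server side: pick the response header name based on the Cookie2 advisory."""
    client_supports_v1 = "cookie2" in {k.lower() for k in request_headers}
    if server_supports_v1 and client_supports_v1:
        return "Set-Cookie2"
    return "Set-Cookie"


def client_request_headers(cookie_pairs: list) -> dict:
    """Client side: send Version 0 cookies but advertise Version 1 support."""
    return {
        "Cookie": "; ".join(f"{name}={value}" for name, value in cookie_pairs),
        "Cookie2": '$Version="1"',
    }


headers = client_request_headers([("color", "blue")])
print(headers)
print(choose_set_cookie_header(headers, server_supports_v1=True))
print(choose_set_cookie_header({"Cookie": "color=blue"}, server_supports_v1=True))
```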

Cookies and Session Tracking

  • Figure 11-5a—Browser requests Amazon.com root page for the first time.
  • Figure 11-5b—Server redirects the client to a URL for the e-commerce software.
  • Figure 11-5c—Client makes a request to the redirected URL.
  • Figure 11-5d—Server slaps two session cookies on the response and redirects the user to another URL, so the client will request again with these cookies attached. This new URL is a fat URL, meaning that some state is embedded into the URL. If the client has cookies disabled, some basic identification can still be done as long as the user follows the Amazon.com-generated fat URL links and doesn’t leave the site.
  • Figure 11-5e—Client requests the new URL, but now passes the two attached cookies.
  • Figure 11-5f—Server redirects to the home.html page and attaches two more cookies.
  • Figure 11-5g—Client fetches the home.html page and passes all four cookies.
  • Figure 11-5h—Server serves back the content.
Cookies and Caching
The rules for cookies and caching are not well established. Here are some guiding principles for dealing with caches:
  • Mark documents uncacheable if they should not be cached
  • Be cautious about caching Set-Cookie headers
  • Be cautious about requests with Cookie headers
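
One conservative way a cache could follow the last two principles is sketched below. The policy (never store Set-Cookie headers, never answer cookie-bearing requests from the cache) and the function names are assumptions, not rules taken from the text.

```python
COOKIE_RESPONSE_HEADERS = ("set-cookie", "set-cookie2")


def headers_for_cache(response_headers: dict) -> dict:
    """Copy of the response headers with Set-Cookie headers removed before storing."""
    return {
        name: value
        for name, value in response_headers.items()
        if name.lower() not in COOKIE_RESPONSE_HEADERS
    }


def may_serve_from_cache(request_headers: dict) -> bool:
    """Bypass the cache when the request carries a Cookie header."""
    return not any(name.lower() == "cookie" for name in request_headers)


response = {"Content-Type": "text/html", "Set-Cookie": "id=29046"}
print(headers_for_cache(response))
print(may_serve_from_cache({"Cookie": "id=29046"}))
```

Stripping the header keeps one user's cookie from being handed to the next user who hits the cached copy.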
Cookies, Security, and Privacy