
11.9 Accessing Netscape Cookie Information

Credit: Mark Nenadov

11.9.1 Problem

You need to access cookie information, which Netscape stores in a cookies.txt file, in an easily usable way, optionally transforming it into XML or SQL.

11.9.2 Solution

Classes are always good candidates for grouping data with the code that handles it:

class Cookie:
    "Models a single cookie"
    def __init__(self, cookieInfo):
        self.allInfo = tuple(cookieInfo)

    def getUrl(self):   return self.allInfo[0]
    def getInfo(self, n): return self.allInfo[n]

    def generateSQL(self):
        sql = "INSERT INTO Cookie(Url,Data1,Data2,Data3,Data4,Data5) "
        sql += "VALUES('%s','%s','%s','%s','%s','%s');" % self.allInfo
        return sql

    def generateXML(self):
        xml = "<cookie url='%s' data1='%s' data2='%s' data3='%s'" \
              " data4='%s' data5='%s' />" % self.allInfo
        return xml

class CookieInfo:
    "Models all cookies from a cookies.txt file"
    cookieSeparator = "\t"   # fields in cookies.txt are tab-separated

    def __init__(self, cookiePathName):
        cookieFile = open(cookiePathName, "r")
        self.rawCookieContent = cookieFile.readlines()
        cookieFile.close()

        self.cookies = []
        for line in self.rawCookieContent:
            # skip comment lines and blank lines
            if line[:1] == '#': pass
            elif line[:1] == '\n': pass
            else: self.cookies.append(
                Cookie(line.rstrip('\n').split(self.cookieSeparator)))

    def count(self):
        return len(self.cookies)
    __len__ = count

    # Find a cookie by URL and return a Cookie object, or None if not found
    def findCookieByURL(self, url):
        for cookie in self.cookies:
            if cookie.getUrl() == url: return cookie
        return None

    # Return list of Cookie objects containing the given string
    def findCookiesByString(self, text):
        results = []
        for c in self.cookies:
            if " ".join(c.allInfo).find(text) != -1:
                results.append(c)
        return results

    # Return SQL for all the cookies
    def returnAllCookiesInSQL(self):
        return '\n'.join([c.generateSQL() for c in self.cookies]) + '\n'

    # Return XML for all the cookies
    def returnAllCookiesInXML(self):
        return "<?xml version='1.0' ?>\n\n<cookies>\n" + \
            '\n'.join([c.generateXML() for c in self.cookies]) + \
            "\n\n</cookies>"

11.9.3 Discussion

The CookieInfo and Cookie classes provide developers with a read-only interface to the cookies.txt file in which Netscape stores cookies received from web servers. The CookieInfo class represents the whole set of cookies from the file, and the Cookie class represents one of the cookies. CookieInfo provides methods to search for cookies and to operate on all cookies. Cookie provides methods to output XML and SQL equivalent to the cookie.
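To make the file layout concrete, here is a minimal sketch of how a single line of the file splits into fields. The sample line, the field names, and the `parse_cookie_line` helper are assumptions for illustration: they follow the commonly documented seven-field Netscape layout and are not part of the recipe.

```python
# Hypothetical sample line; real files separate the fields with tab characters.
sample_line = ".example.com\tTRUE\t/\tFALSE\t946684799\tsession\tabc123\n"

# Field names assumed from the commonly documented Netscape layout.
FIELD_NAMES = ("domain", "flag", "path", "secure", "expiration", "name", "value")

def parse_cookie_line(line):
    """Return a dict of field name -> value, or None for comments and blanks."""
    stripped = line.rstrip("\r\n")
    if not stripped or stripped.startswith("#"):
        return None
    fields = stripped.split("\t")
    return dict(zip(FIELD_NAMES, fields))

record = parse_cookie_line(sample_line)
```

If your file follows this seven-field layout, note that the six-placeholder format strings in generateSQL and generateXML would raise a TypeError when given seven values, so they would need a seventh placeholder or a slice of allInfo.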

Here is some test/sample code for this recipe, which you can modify to fit your specific cookies file:

if __name__ == '__main__':
    c = CookieInfo("cookies.txt")
    print "You have:", len(c), "cookies"

    # prints third data element from www.chapters.ca's cookie
    cookie = c.findCookieByURL("www.chapters.ca")

    if cookie is not None:
        print "3rd piece of data from the cookie from www.chapters.ca:", \
            cookie.getInfo(3)
    else:
        print "No cookie from www.chapters.ca is in your cookies file"

    # prints the URLs of all cookies with "mail" in them
    print "URLs of all cookies with 'mail' somewhere in their content:"
    for cookie in c.findCookiesByString("mail"):
        print cookie.getUrl()

    # prints the SQL and XML for the www.chapters.ca cookie
    cookie = c.findCookieByURL("www.chapters.ca")
    if cookie is not None:
        print "SQL for the www.chapters.ca cookie:", cookie.generateSQL()
        print "XML for the www.chapters.ca cookie:", cookie.generateXML()

These classes let you forget about parsing cookies that your browser has received from various web servers so you can start using them as objects. The Cookie class's generateSQL and generateXML methods may have to be modified, depending on your preferences and data schema.
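One modification worth considering is escaping. The recipe interpolates raw cookie values into its SQL and XML strings, so a value containing a quote or an angle bracket would produce malformed output. The following is only a sketch of one possible approach, under stated assumptions: it uses the standard library's `xml.sax.saxutils.quoteattr` for attribute values and `sqlite3` placeholders for the insert (the recipe names no particular database), and the function names `cookie_to_xml` and `insert_cookie` are hypothetical.

```python
import sqlite3
from xml.sax.saxutils import quoteattr

def cookie_to_xml(fields):
    """Build a <cookie .../> element with properly quoted attribute values."""
    # Attribute names mirror the recipe: url, then data1..dataN.
    names = ["url"] + ["data%d" % i for i in range(1, len(fields))]
    attrs = " ".join("%s=%s" % (n, quoteattr(v)) for n, v in zip(names, fields))
    return "<cookie %s />" % attrs

def insert_cookie(conn, fields):
    """Insert one six-field cookie using placeholders, not string formatting."""
    conn.execute(
        "INSERT INTO Cookie(Url,Data1,Data2,Data3,Data4,Data5) "
        "VALUES(?,?,?,?,?,?)",
        tuple(fields))

# Hypothetical cookie whose last field contains characters that need escaping.
fields = ("www.example.com", "FALSE", "/", "FALSE", "946684799", "it's <odd>")
xml = cookie_to_xml(fields)

# An in-memory database stands in for whatever schema you actually use.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Cookie(Url,Data1,Data2,Data3,Data4,Data5)")
insert_cookie(conn, fields)
stored = conn.execute("SELECT Data5 FROM Cookie").fetchone()[0]
```

Executing the insert with placeholders hands quoting to the database driver, while `quoteattr` picks the quote character and escapes markup characters in each attribute value.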

A large potential benefit of this recipe's approach is that you can write classes with a similar interface to model cookies, and sets of cookies, in other browsers, and then use their instances polymorphically (interchangeably). System-administration scripts that need to handle cookies (e.g., to exchange them between browsers or machines, or to remove some of them) then do not need to depend directly on the details of how a given browser stores cookies.
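The idea can be sketched as follows. Everything here is illustrative: `TabCookieSet` is a cut-down stand-in for the recipe's CookieInfo that takes lines rather than a file name, `SemicolonCookieSet` models a hypothetical second browser whose storage format is invented for the example, and `urls_missing_from` is a hypothetical script function that relies only on the shared findCookieByURL method.

```python
class Cookie:
    """Stand-in for the recipe's Cookie class: one cookie as a tuple of fields."""
    def __init__(self, cookieInfo):
        self.allInfo = tuple(cookieInfo)

    def getUrl(self):
        return self.allInfo[0]

class TabCookieSet:
    """Cookie set parsed from tab-separated lines, like the recipe's CookieInfo."""
    def __init__(self, lines):
        self.cookies = [Cookie(l.rstrip("\n").split("\t"))
                        for l in lines if l.strip() and not l.startswith("#")]

    def __len__(self):
        return len(self.cookies)

    def findCookieByURL(self, url):
        for cookie in self.cookies:
            if cookie.getUrl() == url:
                return cookie
        return None

class SemicolonCookieSet(TabCookieSet):
    """Hypothetical second browser that separates fields with '; '."""
    def __init__(self, lines):
        self.cookies = [Cookie(l.rstrip("\n").split("; "))
                        for l in lines if l.strip()]

def urls_missing_from(source, target, urls):
    """Return the URLs that source has a cookie for but target does not."""
    return [u for u in urls
            if source.findCookieByURL(u) is not None
            and target.findCookieByURL(u) is None]

netscape = TabCookieSet(["# comment\n",
                         "a.example\tTRUE\t/\n",
                         "b.example\tTRUE\t/\n"])
other = SemicolonCookieSet(["a.example; TRUE; /\n"])
missing = urls_missing_from(netscape, other, ["a.example", "b.example"])
```

Because `urls_missing_from` calls only findCookieByURL, it works with either class, and a further browser-specific class could be added without changing it. Inheritance is used here only to avoid repeating the shared methods; unrelated classes with the same method names would work just as well.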

11.9.4 See Also

Recipe 11.10; the Unofficial Cookie FAQ (http://www.cookiecentral.com/faq/) is chock-full of information on cookies.
