programming:python:cpc2ical

What is it?

The other day my wife was lamenting the fact that the calendar for City Pages (http://www.citypages.com/), here in Minneapolis, thoroughly sucks from any kind of usability standpoint. Indeed, it does. It's impossible to just glance at what is coming up in the next month or so. Instead, you have to go through the listings one day at a time, scrolling down a page of full listings for each day. This isn't so bad if you are interested in what is happening on a particular day, but it's horrible for just seeing what shows are coming to town.

I decided to take it upon myself to bang out a quick script that would parse the daily RSS feeds and convert their entries to iCal VEVENTs in a VCALENDAR. Once you have that, you can import it into any calendar app that understands the iCal format (I tested with Google Calendar). Then you get a simple at-a-glance listing of all the events taking place each day, and you can click on an event's summary for the description and a link to the main page.

Do note that this is a simple script that I banged out in a short amount of time to accomplish a simple task. It works well for that task, but I'm sure this could be taken to another level in terms of automation and such. I leave that as an exercise for others.

The Script

You can download the script by clicking here.

Here is the full source:

#!/usr/bin/env python
 
# $Id: cpc2ical.py 312 2010-09-30 16:52:54Z jay $
 
# Copyright Jason Deiman 2010
 
#
# This script's purpose is to convert the crappy City Pages "calendar" to
# an ical calendar file for a given month.  This was tested against a Google
# calendar.
#
# This script requires the following non-standard libraries:
#
#     python-feedparser: http://www.feedparser.org/ (apt-get install 
#                        python-feedparser)
#
 
from datetime import datetime , timedelta , tzinfo
from optparse import OptionParser
from hashlib import sha1
from uuid import uuid4
import time , sys , re
try:
    import feedparser 
except ImportError:
    print >> sys.stderr , 'Could not import feedparser.  You can download ' \
        'it from http://www.feedparser.org/ or if you are on a debian based ' \
        'machine, just "sudo apt-get install python-feedparser"'
    sys.exit(1)
CPBASEURL = 'http://www.citypages.com/syndication/events/date:'
RE_YM = re.compile(r'^\d{4}-\d{2}$')
 
class UTC(tzinfo):
    """
    UTC timezone for the datetime stuff
    """
    def utcoffset(self , dt):
        return timedelta(0)
 
    def tzname(self , dt):
        return 'UTC'
 
    def dst(self , dt):
        return timedelta(0)
 
class Vcal(object):
    """
    This is just a structure to hold all the boiler-plate vcal stuff.  I'm
    not using the icalendar stuff for this since it's stuff that's not going
    to change much
    """
    def __init__(self):
        self.items = [
            ('BEGIN' , 'VCALENDAR') ,
            ('PRODID' , '-//Splitstreams//City Pages Calendar//EN') ,
            ('VERSION' , '2.0') ,
            ('CALSCALE' , 'GREGORIAN') ,
            ('METHOD' , 'PUBLISH') ,
            ('X-WR-CALNAME' , 'City Pages') ,
            ('X-WR-TIMEZONE' , 'America/Chicago') ,
            ('BEGIN' , 'VTIMEZONE') ,
            ('TZID' , 'America/Chicago') ,
            ('X-LIC-LOCATION' , 'America/Chicago') ,
            ('BEGIN' , 'DAYLIGHT') ,
            ('TZOFFSETFROM' , '-0600') ,
            ('TZOFFSETTO' , '-0500') ,
            ('TZNAME' , 'CDT') ,
            ('DTSTART' , '19700308T020000') ,
            ('RRULE' , 'FREQ=YEARLY;BYMONTH=3;BYDAY=2SU') , 
            ('END' , 'DAYLIGHT') ,
            ('BEGIN' , 'STANDARD') ,
            ('TZOFFSETFROM' , '-0500') ,
            ('TZOFFSETTO' , '-0600') ,
            ('TZNAME' , 'CST') ,
            ('DTSTART' , '19701101T020000') ,
            ('RRULE' , 'FREQ=YEARLY;BYMONTH=11;BYDAY=1SU') ,
            ('END' , 'STANDARD') ,
            ('END' , 'VTIMEZONE') ,
        ]
 
    def setCalName(self , name):
        for i , item in enumerate(self.items):
            if item[0] == 'X-WR-CALNAME':
                l = list(item)
                l[1] = name
                self.items[i] = tuple(l)
    def getCalName(self):
        for k , v in self.items:
            if k == 'X-WR-CALNAME':
                return v
    CalName = property(getCalName , setCalName)
 
    def getStart(self):
        ret = ''
        for k , v in self.items:
            ret += '%s:%s\r\n' % (k , v)
        return ret
    Start = property(getStart)
 
    def getEnd(self):
        return 'END:VCALENDAR\r\n'
    End = property(getEnd)
 
 
class RssToIcal(object):
    dateTpl = '%Y%m%d'
    cpDateTpl = '%Y-%m-%d'
    datetimeTpl = '%Y%m%dT%H%M%SZ'
 
    def getUid(self , domain='splitstreams.com'):
        return '%s@%s' % (sha1(uuid4().bytes).hexdigest() , domain)       
 
    def rssEntry2Vevent(self , entry , dt):
        end = dt + timedelta(days=1)
        now = datetime.utcnow()
        desc = entry.description.replace('\n' , '\\n').replace('\r' , '')
        ret = 'BEGIN:VEVENT\r\n'
        ret += 'DTSTART;VALUE=DATE:%s\r\n' % dt.strftime(self.dateTpl)
        ret += 'DTEND;VALUE=DATE:%s\r\n' % end.strftime(self.dateTpl)
        ret += 'DTSTAMP:%s\r\n' % now.strftime(self.datetimeTpl)
        ret += 'UID:%s\r\n' % self.getUid()
        ret += 'CREATED:%s\r\n' % now.strftime(self.datetimeTpl)
        ret += 'DESCRIPTION:%s\\n\\n%s\r\n' % (entry.link , desc)
        ret += 'LOCATION:\r\n'
        ret += 'SEQUENCE:0\r\n'
        ret += 'STATUS:CONFIRMED\r\n'
        ret += 'SUMMARY:%s\r\n' % entry.title
        ret += 'TRANSP:TRANSPARENT\r\n'
        ret += 'END:VEVENT\r\n'
        return ret
 
    def convert(self , year , month):
        """
        This takes a numeric year and month and finds all the city pages
        events for that month and yields a list of Vevent strings
        """
        year , month = (int(year) , int(month))
        delta = timedelta(days=1)
        cpdt = datetime(year , month , 1 , tzinfo=UTC())
        while cpdt.month == month:
            url = '%s%s' % (CPBASEURL , cpdt.strftime(self.cpDateTpl))
            dfd = feedparser.parse(url)
            for e in dfd.entries:
                yield self.rssEntry2Vevent(e , cpdt)
            cpdt += delta
 
def getOpts():
    usage = 'Usage: %prog [options] YYYY-MM [YYYY-MM [YYYY-MM ...]]'
    p = OptionParser(usage=usage)
    p.add_option('-o' , '--output-file' , dest='outFile' , metavar='FILE' ,
        default='-' ,
        help='Send the calendar output to FILE instead of stdout '
        '[default: STDOUT]')
    p.add_option('-n' , '--cal-name' , dest='calName' , metavar='CALNAME' ,
        default='' ,
        help='Set the calendar name to a name of your choosing. '
        '[default: City Pages]')
    opts , args = p.parse_args()
    return (opts , args)
 
def getYearMonth(dateArg):
    if not RE_YM.match(dateArg):
        print >> sys.stderr , 'Invalid month, must be in YYYY-MM ' \
            'format: %s' % dateArg
        sys.exit(1)
    year , month = [int(i) for i in dateArg.split('-')]
    curYear = time.localtime().tm_year
    curMonth = time.localtime().tm_mon
    if year < curYear or year > curYear + 1:
        print >> sys.stderr , 'Invalid year.  The year must be this, or ' \
            'next, year only: %d' % year
        sys.exit(2)
    if month < 1 or month > 12:
        print >> sys.stderr , 'Invalid month.  It must be 1 to 12: %d' % month
        sys.exit(3)
    if month < curMonth and year == curYear:
        print >> sys.stderr , 'You can\'t get a calendar in the past'
        sys.exit(4)
    return (year , month)
 
def main():
    opts , args = getOpts()
    outfh = None
    if opts.outFile == '-':
        outfh = sys.stdout
    else:
        outfh = open(opts.outFile , 'w')
    vcal = Vcal()
    if opts.calName:
        vcal.CalName = opts.calName
    r2i = RssToIcal()
    ym = [getYearMonth(a) for a in args]
    outfh.write(vcal.Start)
    for year , month in ym:
        for vev in r2i.convert(year , month):
            outfh.write(vev)
    outfh.write(vcal.End)
    if outfh != sys.stdout:
        outfh.close()
 
if __name__ == '__main__':
    main()
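
One thing to be aware of: the script only escapes newlines in the description, and it writes the title into SUMMARY as-is. RFC 5545 also expects backslashes, semicolons and commas to be escaped in TEXT values, so an event title containing a comma may not import cleanly in stricter calendar apps. Here is a minimal sketch of a helper that could be applied to both fields. The name escape_ical_text is made up for this example and is not part of the script above.

```python
def escape_ical_text(value):
    """Escape a string for use as an iCalendar TEXT value (RFC 5545)."""
    # Escape backslashes first so the escapes added below are not doubled.
    value = value.replace('\\', '\\\\')
    value = value.replace(';', '\\;').replace(',', '\\,')
    # Normalize line endings, then encode each newline as the two
    # characters backslash + n.
    value = value.replace('\r\n', '\n').replace('\r', '\n')
    return value.replace('\n', '\\n')


# Hypothetical event title and description, for illustration only.
print('SUMMARY:%s' % escape_ical_text('Rock, Paper; Scissors'))
print('DESCRIPTION:%s' % escape_ical_text('Line one\r\nLine two'))
```

In rssEntry2Vevent this would replace the two chained replace() calls on the description and wrap entry.title as well.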

Usage

The script is quite simple to use. If you use the --help or -h option, it lists the few options that you can specify on the command line.

$ ./cpc2ical.py -h
Usage: cpc2ical.py [options] YYYY-MM [YYYY-MM [YYYY-MM ...]]

Options:
  -h, --help            show this help message and exit
  -o FILE, --output-file=FILE
                        Send the calendar output to FILE instead of stdout
                        [default: STDOUT]
  -n CALNAME, --cal-name=CALNAME
                        Set the calendar name to a name of your choosing.
                        [default: City Pages]

Essentially, you just specify a year and month, or a number of them, to generate. In this example, I'm generating an iCal for the months of October and November in 2010 and I'm outputting everything into a file called city_pages.ical:

$ ./cpc2ical.py -o city_pages.ical 2010-10 2010-11

That's about it. NOTE: The script fetches a separate RSS feed for every day of each month you ask for, so it will take a while to run. Don't CTRL-C out of it prematurely! Once you have your file, just import it into a calendar app.
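
To get a feel for how much work a run involves, here is a small sketch that lists the per-day feed URLs the script would request. It reuses the CPBASEURL value and the date format from the source above. The function name daily_feed_urls is made up for this example, and nothing is actually downloaded.

```python
from datetime import date, timedelta

# Base URL copied from the script's CPBASEURL constant.
CPBASEURL = 'http://www.citypages.com/syndication/events/date:'


def daily_feed_urls(year, month):
    """Return one feed URL per day of the given month."""
    urls = []
    day = date(year, month, 1)
    while day.month == month:
        urls.append(CPBASEURL + day.strftime('%Y-%m-%d'))
        day += timedelta(days=1)
    return urls


# The example run above (2010-10 and 2010-11) needs one request per day.
urls = daily_feed_urls(2010, 10) + daily_feed_urls(2010, 11)
print(len(urls))  # 61
print(urls[0])
```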

Importing into Google Calendar

This will be a very short description of how to import this info into a Google Calendar (as of September of 2010).

  1. Point your browser at http://google.com/calendar and log in
  2. In the top-right corner, click on Settings, then Calendar settings
  3. Now click on the Calendars tab at the top left, under the Calendar Settings heading (the General tab is selected by default)
  4. Click the Import calendar link in the middle of the page
  5. A JavaScript window should pop up allowing you to select your calendar file (city_pages.ical if you are using the example in the previous section)
  6. Select the calendar you wish to import this into. Personally, I created a separate calendar just for the City Pages stuff.
  7. Click the Import button and wait for a bit and you should get a message telling you the number of events that were imported.
  8. Go back to your calendar(s) and you should see listings for all events.
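
If you want to cross-check the number Google reports in step 7, you can count the events in the generated file yourself. This is a minimal sketch. The function name count_vevents is made up for this example, and the sample text stands in for the contents of a real city_pages.ical.

```python
def count_vevents(ical_text):
    """Count VEVENT blocks by counting their BEGIN:VEVENT lines."""
    return sum(1 for line in ical_text.splitlines()
               if line.rstrip() == 'BEGIN:VEVENT')


# Tiny stand-in for a generated calendar file.
sample = (
    'BEGIN:VCALENDAR\r\n'
    'BEGIN:VEVENT\r\nSUMMARY:One\r\nEND:VEVENT\r\n'
    'BEGIN:VEVENT\r\nSUMMARY:Two\r\nEND:VEVENT\r\n'
    'END:VCALENDAR\r\n'
)
print(count_vevents(sample))  # 2
```

For a real file, read it first, for example with open('city_pages.ical').read(), and pass the text to count_vevents.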

Future TODO

Here are some things that I may do in the future, but that would also make good exercises for others.

  • Parse the description and create better VEVENTS for things which are ongoing (spanning many days)
  • Parse the description for times and actually create VEVENTS that are not just simply “full day” events, but instead exist within the time period where the event actually takes place.
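
As a starting point for the second item, here is a rough sketch of pulling a start time out of a description and emitting timed DTSTART/DTEND lines instead of all-day ones. Several things here are assumptions: that descriptions contain times written like "8 p.m." or "7:30 pm", that an event lasts two hours when no end time is known, and the helper names find_start_time and timed_lines. The TZID matches the America/Chicago VTIMEZONE the script already writes.

```python
import re
from datetime import datetime, timedelta

# Assumed time formats: "8 p.m.", "7:30 pm", "11am" (12-hour clock only).
RE_TIME = re.compile(
    r'\b(1[0-2]|0?[1-9])(?::([0-5]\d))?\s*([ap])\.?m\b\.?', re.IGNORECASE)


def find_start_time(description):
    """Return (hour, minute) on a 24-hour clock for the first time found."""
    m = RE_TIME.search(description)
    if not m:
        return None
    hour = int(m.group(1)) % 12          # 12 a.m. becomes 0, 12 p.m. becomes 12
    minute = int(m.group(2) or 0)
    if m.group(3).lower() == 'p':
        hour += 12
    return (hour, minute)


def timed_lines(day, description, hours=2):
    """Build DTSTART/DTEND lines, falling back to an all-day event."""
    start_time = find_start_time(description)
    if start_time is None:
        end = day + timedelta(days=1)
        return ('DTSTART;VALUE=DATE:%s\r\n' % day.strftime('%Y%m%d') +
                'DTEND;VALUE=DATE:%s\r\n' % end.strftime('%Y%m%d'))
    start = day.replace(hour=start_time[0], minute=start_time[1])
    end = start + timedelta(hours=hours)  # assumed duration
    tpl = '%Y%m%dT%H%M%S'
    return ('DTSTART;TZID=America/Chicago:%s\r\n' % start.strftime(tpl) +
            'DTEND;TZID=America/Chicago:%s\r\n' % end.strftime(tpl))


print(find_start_time('Doors at 7:30 p.m., $10'))  # (19, 30)
print(timed_lines(datetime(2010, 10, 15), 'Show at 8 pm'))
```

Taking only the first time found is a guess. Listings that mention several times (doors, opener, headliner) would need smarter handling.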
programming/python/cpc2ical.txt · Last modified: 2010/09/30 17:12 by jay