programming:python:cpc2ical

programming:python:cpc2ical [2010/09/30 17:12] (current)
jay created
 +====== What is it? ======
 +The other day my wife was lamenting that the calendar for the City Pages, http://www.citypages.com/, (here in Minneapolis) thoroughly sucks from any kind of usability standpoint. Indeed, it does. It's impossible to just glance at what is coming up in the next month or so. Instead, you have to slog through each day's listings, scrolling down a page of full listings just to see what's happening on that one day. This isn't so bad if you are interested in a particular day, but it's horrible for just seeing what shows are coming to town.
 +
 +I decided to take it upon myself to bang out a quick script that parses the daily RSS feeds and converts each entry to an iCal VEVENT inside a single VCALENDAR. Once you have that, you can import it into any calendar app that understands the iCal format (I tested with Google Calendar). Then you get a simple at-a-glance daily listing of all the events taking place that day, and you can click on an event's summary for the description and a link to the main page.
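
To make the target format concrete, here is a minimal sketch of one all-day VEVENT wrapped in a VCALENDAR. It is written for Python 3 (the script itself is Python 2) and is not taken from the script: the function names, the ''example.invalid'' UID domain, the PRODID and the sample entry are all made up for illustration.

```python
from datetime import date, datetime, timedelta, timezone
from uuid import uuid4


def all_day_vevent(title, link, day, stamp):
    """Return one all-day VEVENT as a string with CRLF line endings."""
    end = day + timedelta(days=1)  # DTEND is exclusive for all-day events
    lines = [
        'BEGIN:VEVENT',
        'DTSTART;VALUE=DATE:%s' % day.strftime('%Y%m%d'),
        'DTEND;VALUE=DATE:%s' % end.strftime('%Y%m%d'),
        'DTSTAMP:%s' % stamp.strftime('%Y%m%dT%H%M%SZ'),
        'UID:%s@example.invalid' % uuid4().hex,  # placeholder domain
        'SUMMARY:%s' % title,
        'DESCRIPTION:%s' % link,
        'END:VEVENT',
    ]
    return '\r\n'.join(lines) + '\r\n'


def wrap_calendar(vevents):
    """Wrap VEVENT strings in a bare-bones VCALENDAR."""
    head = ('BEGIN:VCALENDAR\r\n'
            'VERSION:2.0\r\n'
            'PRODID:-//Example//Sketch//EN\r\n')  # hypothetical PRODID
    return head + ''.join(vevents) + 'END:VCALENDAR\r\n'


# Made-up entry standing in for one item of a daily feed
entry = {'title': 'Example Band at Example Venue',
         'link': 'http://www.example.com/events/1'}
stamp = datetime(2010, 9, 30, 17, 0, 0, tzinfo=timezone.utc)
ical = wrap_calendar([all_day_vevent(entry['title'], entry['link'],
                                     date(2010, 10, 15), stamp)])
print(ical)
```

The real script adds more boilerplate (a VTIMEZONE block, calendar name and so on), but the shape of each event is the same: a date-only DTSTART and a DTEND one day later.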
 +
 +Do note that this is a simple script that I banged out in a short amount of time to accomplish a simple task.  It works well for that task, but I'm sure this could be taken to another level in terms of automation and such.  I leave that as an exercise for others.
 +
 +====== The Script ======
 +You can download the script by clicking {{:​programming:​python:​cpc2ical.py.gz|here}}.
 +
 +Here is the full source:
 +<code python>
 +#​!/​usr/​bin/​env python
 +
 +# $Id: cpc2ical.py 312 2010-09-30 16:52:54Z jay $
 +
 +# Copyright Jason Deiman 2010
 +
 +#
 +# This script'​s purpose is to convert the crappy City Pages "​calendar"​ to
 +# an ical calendar file for a given month. ​ This was tested against a Google
 +# calendar.
 +#
 +# This script requires the following non-standard libraries:
 +#
 +#     ​python-feedparser:​ http://​www.feedparser.org/​ (apt-get install ​
 +#                        python-feedparser)
 +#
 +
 +from datetime import datetime , timedelta , tzinfo
 +from optparse import OptionParser
 +from hashlib import sha1
 +from uuid import uuid4
 +import time , sys , re
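 +try:
 +    import feedparser
 +except ImportError:
 +    # feedparser is the only third-party dependency, so fail with a hint
 +    print >> sys.stderr , 'Could not import feedparser.  You can download ' \
 +        'it from http://www.feedparser.org/ or if you are on a debian based ' \
 +        'machine, just "sudo apt-get install python-feedparser"'
 +    sys.exit(1)
 +
 +# Base URL of the per-day City Pages RSS feed; a YYYY-MM-DD date is appended
 +CPBASEURL = 'http://www.citypages.com/syndication/events/date:'
 +# Command line month arguments must look like YYYY-MM
 +RE_YM = re.compile(r'^\d{4}-\d{2}$')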
 +
 +class UTC(tzinfo):​
 +    """​
 +    UTC timezone for the datetime stuff
 +    """​
 +    def utcoffset(self , dt):
 +        return timedelta(0)
 +
 +    def tzname(self , dt):
 +        return '​UTC'​
 +
 +    def dst(self , dt):
 +        return timedelta(0)
 +
 +class Vcal(object):​
 +    """​
 +    This is just a structure to hold all the boiler-plate vcal stuff. ​ I'm
 +    not using the icalendar stuff for this since it's stuff that's not going
 +    to change much
 +    """​
 +    def __init__(self):​
 +        self.items = [
 +            ('​BEGIN'​ , '​VCALENDAR'​) ,
 +            ('​PRODID'​ , '​-//​Splitstreams//​City Pages Calendar//​EN'​) ,
 +            ('​VERSION'​ , '​2.0'​) ,
 +            ('​CALSCALE'​ , '​GREGORIAN'​) ,
 +            ('​METHOD'​ , '​PUBLISH'​) ,
 +            ('​X-WR-CALNAME'​ , 'City Pages'​) ,
 +            ('​X-WR-TIMEZONE'​ , '​America/​Chicago'​) ,
 +            ('​BEGIN'​ , '​VTIMEZONE'​) ,
 +            ('​TZID'​ , '​America/​Chicago'​) ,
 +            ('​X-LIC-LOCATION'​ , '​America/​Chicago'​) ,
 +            ('​BEGIN'​ , '​DAYLIGHT'​) ,
 +            ('​TZOFFSETFROM'​ , '​-0600'​) ,
 +            ('​TZOFFSETTO'​ , '​-0500'​) ,
 +            ('​TZNAME'​ , '​CDT'​) ,
 +            ('​DTSTART'​ , '​19700308T020000'​) ,
 +            ('​RRULE'​ , '​FREQ=YEARLY;​BYMONTH=3;​BYDAY=2SU'​) , 
 +            ('​END'​ , '​DAYLIGHT'​) ,
 +            ('​BEGIN'​ , '​STANDARD'​) ,
 +            ('​TZOFFSETFROM'​ , '​-0500'​) ,
 +            ('​TZOFFSETTO'​ , '​-0600'​) ,
 +            ('​TZNAME'​ , '​CST'​) ,
 +            ('​DTSTART'​ , '​19701101T020000'​) ,
 +            ('​RRULE'​ , '​FREQ=YEARLY;​BYMONTH=11;​BYDAY=1SU'​) ,
 +            ('​END'​ , '​STANDARD'​) ,
 +            ('​END'​ , '​VTIMEZONE'​) ,
 +        ]
 +
 +    def setCalName(self , name):
 +        for i , item in enumerate(self.items):​
 +            if item[0] == '​X-WR-CALNAME':​
 +                l = list(item)
 +                l[1] = name
 +                self.items[i] = tuple(l)
 +    def getCalName(self):​
 +        for k , v in self.items:
 +            if k == '​X-WR-CALNAME':​
 +                return v
 +    CalName = property(getCalName , setCalName)
 +        ​
 +    def getStart(self):​
 +        ret = ''​
 +        for k , v in self.items:
 +            ret += '​%s:​%s\r\n'​ % (k , v)
 +        return ret
 +    Start = property(getStart)
 +
 +    def getEnd(self):​
 +        return '​END:​VCALENDAR\r\n'​
 +    End = property(getEnd)
 +
 +
 +class RssToIcal(object):​
 +    dateTpl = '​%Y%m%d'​
 +    cpDateTpl = '​%Y-%m-%d'​
 +    datetimeTpl = '​%Y%m%dT%H%M%SZ'​
 +
 +    def getUid(self , domain='​splitstreams.com'​):​
 +        return '​%s@%s'​ % (sha1(uuid4().bytes).hexdigest() , domain) ​      
 +
 +    def rssEntry2Vevent(self , entry , dt):
 +        end = dt + timedelta(days=1)
 +        now = datetime.utcnow()
 +        desc = entry.description.replace('​\n'​ , '​\\n'​).replace('​\r'​ , ''​)
 +        ret = '​BEGIN:​VEVENT\r\n'​
 +        ret += '​DTSTART;​VALUE=DATE:​%s\r\n'​ % dt.strftime(self.dateTpl)
 +        ret += '​DTEND;​VALUE=DATE:​%s\r\n'​ % end.strftime(self.dateTpl)
 +        ret += '​DTSTAMP:​%s\r\n'​ % now.strftime(self.datetimeTpl)
 +        ret += '​UID:​%s\r\n'​ % self.getUid()
 +        ret += '​CREATED:​%s\r\n'​ % now.strftime(self.datetimeTpl)
 +        ret += '​DESCRIPTION:​%s\\n\\n%s\r\n'​ % (entry.link , desc)
 +        ret += '​LOCATION:​\r\n'​
 +        ret += '​SEQUENCE:​0\r\n'​
 +        ret += '​STATUS:​CONFIRMED\r\n'​
 +        ret += '​SUMMARY:​%s\r\n'​ % entry.title
 +        ret += '​TRANSP:​TRANSPARENT\r\n'​
 +        ret += '​END:​VEVENT\r\n'​
 +        return ret
 +
 +    def convert(self , year , month):
 +        """​
 +        This takes a numeric year and month and finds all the city pages
 +        events for that month and yields a list of Vevent strings
 +        """​
 +        year , month = (int(year) , int(month))
 +        delta = timedelta(days=1)
 +        cpdt = datetime(year , month , 1 , tzinfo=UTC())
 +        while cpdt.month == month:
 +            url = '​%s%s'​ % (CPBASEURL , cpdt.strftime(self.cpDateTpl))
 +            dfd = feedparser.parse(url)
 +            for e in dfd.entries:​
 +                yield self.rssEntry2Vevent(e , cpdt)
 +            cpdt += delta
 +
 +def getOpts():
 +    usage = '​Usage:​ %prog [options] YYYY-MM [YYYY-MM [YYYY-MM ...]]'
 +    p = OptionParser(usage=usage)
 +    p.add_option('​-o'​ , '​--output-file'​ , dest='​outFile'​ , metavar='​FILE'​ ,
 +        default='​-'​ ,
 +        help='​Send the calendar output to FILE instead of stdout '
 +        '​[default:​ STDOUT]'​)
 +    p.add_option('​-n'​ , '​--cal-name'​ , dest='​calName'​ , metavar='​CALNAME'​ ,
 +        default=''​ ,
 +        help='​Set the calendar name to a name of your choosing. '
 +        '​[default:​ City Pages]'​)
 +    opts , args = p.parse_args()
 +    return (opts , args)
 +
 +def getYearMonth(dateArg):
 +    if not RE_YM.match(dateArg):
 +        # Report the offending argument itself (dateArg)
 +        print >> sys.stderr , 'Invalid month, must be in YYYY-MM ' \
 +            'format: %s' % dateArg
 +        sys.exit(1)
 +    year , month = [int(i) for i in dateArg.split('​-'​)]
 +    curYear = time.localtime().tm_year
 +    curMonth = time.localtime().tm_mon
 +    if year < curYear or year > curYear + 1:
 +        print >> sys.stderr , '​Invalid year.  The year must be this, or ' \
 +            'next, year only: %d' % year
 +        sys.exit(2)
 +    if month < 1 or month > 12:
 +        print >> sys.stderr , '​Invalid month. ​ It must be 1 to 12: %d' % month
 +        sys.exit(3)
 +    if month < curMonth and year == curYear:
 +        print >> sys.stderr , 'You can\'t get a calendar in the past'
 +        sys.exit(4)
 +    return (year , month)
 +
 +def main():
 +    opts , args = getOpts()
 +    outfh = None
 +    if opts.outFile == '​-':​
 +        outfh = sys.stdout
 +    else:
 +        outfh = open(opts.outFile , '​w'​)
 +    vcal = Vcal()
 +    if opts.calName:​
 +        vcal.CalName = opts.calName
 +    r2i = RssToIcal()
 +    events = []
 +    ym = []
 +    for a in args:
 +        ym.append(getYearMonth(a))
 +    outfh.write(vcal.Start)
 +    for year , month in ym:
 +        for vev in r2i.convert(year , month):
 +            outfh.write(vev)
 +    outfh.write(vcal.End)
 +    if outfh != sys.stdout:
 +        outfh.close()
 +
 +if __name__ == '​__main__':​
 +    main()
 +</​code>​
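
One thing the script above does not do is escape the feed text: ''rssEntry2Vevent'' writes ''entry.title'' straight into SUMMARY and only converts newlines in the description. The iCalendar specification (RFC 5545) asks for backslashes, commas and semicolons in text values to be escaped, and for long lines to be folded at 75 octets. The following is a rough Python 3 sketch of both steps; ''escape_text'' and ''fold_line'' are hypothetical helpers, not part of the script.

```python
def escape_text(value):
    """Escape a string for use as an iCalendar TEXT value."""
    value = value.replace('\\', '\\\\')  # backslash first, so later escapes survive
    value = value.replace(';', '\\;').replace(',', '\\,')
    value = value.replace('\r\n', '\n').replace('\r', '')
    return value.replace('\n', '\\n')


def fold_line(line, limit=75):
    """Fold one content line so no physical line exceeds `limit` octets."""
    data = line.encode('utf-8')
    chunks = []
    while len(data) > limit:
        cut = limit
        # Back off so a multi-byte UTF-8 sequence is never split
        while cut > 0 and (data[cut] & 0xC0) == 0x80:
            cut -= 1
        chunks.append(data[:cut])
        data = b' ' + data[cut:]  # continuation lines start with one space
    chunks.append(data)
    return b'\r\n'.join(chunks).decode('utf-8') + '\r\n'


# Example: a made-up title with characters that need escaping
print(fold_line('SUMMARY:' + escape_text('Rock, Pop; Jazz')), end='')
```

Google Calendar accepted the unescaped output when I tested it, so treat this as a hardening step for stricter importers rather than a required fix.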
 +====== Usage ======
 +The script is quite simple to use. Running it with the ''--help'' or ''-h'' option prints the short list of options you can specify on the command line.
 +<​code>​
 +$ ./​cpc2ical.py -h
 +Usage: cpc2ical.py [options] YYYY-MM [YYYY-MM [YYYY-MM ...]]
 +
 +Options:
 +  -h, --help ​           show this help message and exit
 +  -o FILE, --output-file=FILE
 +                        Send the calendar output to FILE instead of stdout
 +                        [default: STDOUT]
 +  -n CALNAME, --cal-name=CALNAME
 +                        Set the calendar name to a name of your choosing.
 +                        [default: City Pages]
 +</​code>​
 +Essentially, you just specify one or more months in ''YYYY-MM'' form. The script only accepts months from the current month through December of next year. In this example, I'm generating an iCal file for October and November of 2010 and writing everything to a file called ''city_pages.ical'':
 +<​code>​
 +$ ./​cpc2ical.py -o city_pages.ical 2010-10 2010-11
 +</​code>​
 +That's about it. **NOTE: The script fetches one RSS feed for every day of each month you ask for, so it will take a while to run. Don't CTRL-C out of it prematurely!** Once you have your file, just import it into a calendar app.
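
To get a feel for how much fetching a run involves, this small Python 3 sketch lists the daily feed URLs for the two months in the example. It reuses the script's base URL and date format; ''feed_urls'' is a name I made up for the illustration, and nothing is actually downloaded.

```python
import calendar
from datetime import date

# Same base URL the script uses for its daily feeds
CPBASEURL = 'http://www.citypages.com/syndication/events/date:'


def feed_urls(year, month):
    """Return the daily feed URL for every day of the given month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return [CPBASEURL + date(year, month, day).strftime('%Y-%m-%d')
            for day in range(1, days_in_month + 1)]


urls = feed_urls(2010, 10) + feed_urls(2010, 11)
print('%d feeds to fetch' % len(urls))
print(urls[0])
```

October plus November comes to 61 separate requests, which is why the run is not instant.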
 +===== Importing into Google Calendar =====
 +This is a very short description of how to import the file into Google Calendar (as of September 2010).
 +
 +  - Point your browser at http://google.com/calendar and log in.
 +  - In the top-right corner, click on ''Settings'' -> ''Calendar settings''.
 +  - Click on the ''Calendars'' tab at the top left, under the ''Calendar Settings'' heading (the ''General'' tab is selected by default).
 +  - Click the ''Import calendar'' link in the middle of the page.
 +  - A JavaScript window should pop up allowing you to select your calendar file (''city_pages.ical'' if you are using the example from the previous section).
 +  - Select the calendar you wish to import into. Personally, I created a separate calendar just for the City Pages stuff.
 +  - Click the ''Import'' button and wait a bit. You should get a message telling you the number of events that were imported.
 +  - Go back to your calendar(s) and you should see listings for all the events.
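
If you want something to compare that imported-events number against, a quick count of the VEVENT blocks in the generated file will do. This is a Python 3 sketch with a hypothetical ''count_components'' helper; the sample text below stands in for the contents of your real file.

```python
def count_components(ical_text, name='VEVENT'):
    """Return (begins, ends) counts for one component type."""
    lines = ical_text.splitlines()
    # Exact matches only, so folded continuation lines are never miscounted
    begins = lines.count('BEGIN:' + name)
    ends = lines.count('END:' + name)
    return begins, ends


# Stand-in for the text read from a generated calendar file
sample = ('BEGIN:VCALENDAR\r\n'
          'BEGIN:VEVENT\r\nSUMMARY:First\r\nEND:VEVENT\r\n'
          'BEGIN:VEVENT\r\nSUMMARY:Second\r\nEND:VEVENT\r\n'
          'END:VCALENDAR\r\n')
begins, ends = count_components(sample)
print('%d events, balanced: %s' % (begins, begins == ends))
```

For a real run, read ''city_pages.ical'' into a string and pass that in instead of ''sample''. If the two counts differ, the run was probably interrupted part way through.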
 +====== Future TODO ======
 +Here are some things that I may do in the future, but that would also make a good exercise for others.
 +
 +  * Parse the description and create better VEVENTS for things which are ongoing (spanning many days).
 +  * Parse the description for times and create VEVENTS that are not simply "full day" events, but instead cover the time period when the event actually takes place.
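
As a starting point for the second item, here is a rough Python 3 sketch that looks for a clock time in a description and emits timed DTSTART/DTEND lines, falling back to an all-day event when nothing is found. I have not checked how the City Pages descriptions actually write their times, so the pattern (things like "8 p.m." or "7:30 pm"), the two-hour default duration and the ''start_lines'' name are all assumptions. The TZID matches the VTIMEZONE block the script already writes.

```python
import re
from datetime import date, datetime, timedelta

# Assumed time format: "8 p.m.", "7:30 pm", "10AM" and similar
TIME_RE = re.compile(r'\b(\d{1,2})(?::(\d{2}))?\s*([ap])\.?m\b', re.IGNORECASE)


def start_lines(description, day, duration=timedelta(hours=2)):
    """Return DTSTART/DTEND lines: timed if a time is found, else all-day."""
    m = TIME_RE.search(description)
    if not m or not 1 <= int(m.group(1)) <= 12 or int(m.group(2) or 0) > 59:
        end = day + timedelta(days=1)
        return ['DTSTART;VALUE=DATE:' + day.strftime('%Y%m%d'),
                'DTEND;VALUE=DATE:' + end.strftime('%Y%m%d')]
    # Convert the 12-hour clock to 24-hour (12 am -> 0, 12 pm -> 12)
    hour = int(m.group(1)) % 12 + (12 if m.group(3).lower() == 'p' else 0)
    start = datetime(day.year, day.month, day.day, hour, int(m.group(2) or 0))
    end = start + duration  # arbitrary default, the feed gives no end time
    fmt = '%Y%m%dT%H%M%S'
    return ['DTSTART;TZID=America/Chicago:' + start.strftime(fmt),
            'DTEND;TZID=America/Chicago:' + end.strftime(fmt)]


for line in start_lines('Doors at 7:30 p.m., $10', date(2010, 10, 15)):
    print(line)
```

Only the first time found is used, which will be wrong for listings that mention several, so a real version would need to look at actual feed data first.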
 +
  
programming/python/cpc2ical.txt · Last modified: 2010/09/30 17:12 by jay