Billipede.net

"You just won't believe how vastly, hugely, mind-bogglingly big it is."

2017-04-14 MiLB Schedule in Org-Mode

I live in Austin, and like to go to baseball games. This means that, unless I want to drive to Dallas or Houston (and I very much don't), I have to make do with minor league baseball, specifically the Round Rock Express at Dell Diamond. In fact, this suits me just fine, since it's a beautiful, intimate little ballpark, tickets are relatively cheap, it's a short drive, and parking is easy. It's close enough that I can decide after work on any given day whether or not I'd like to go to a game that night, so I thought it might be nice to have Express home games show up in my Emacs org-mode agenda.
I started by finding the Express schedule in iCal format. The MiLB uses a site called stanza.co to handle their calendaring (there are other formats as well) and it can be found here. Choosing either "Apple" or "Other" gives you an iCal file, since I guess iCal has become the de facto calendar interchange format. Go figure.
Anyway, the reason I wanted an iCal file is that somebody has helpfully already written an awk script that will take an iCal file and turn it into an org-mode one. It's called ical2org.awk and you can get it here.
Note that the default awk on Ubuntu 16.04 is not gawk, as literally everyone would expect and prefer. It's some other one that nobody's ever heard of called mawk. The author of ical2org.awk is a practical-minded person, so the script relies on some gawk-isms (gensub, for one), and you'll obviously want to uninstall mawk and install gawk instead. You could install them side by side, but honestly you probably want gawk anyway, so take this opportunity to uncripple your system. With that out of the way, you can go ahead and run the conversion:
~ $ awk -f ical2org.awk < milb-roundrockexpress.ics > milb-roundrockexpress.org
awk: ical2org.awk:272: (FILENAME=- FNR=43) warning: gensub: third argument `' treated as 1
awk: ical2org.awk:284: (FILENAME=- FNR=43) warning: gensub: third argument `' treated as 1
...snip 279 lines...
awk: ical2org.awk:284: (FILENAME=- FNR=2563) warning: gensub: third argument `' treated as 1
Well, that didn't go as well as planned. After some time spelunking in the awk man page, I figured out that the script relies on some behavior that works but makes gawk print a warning every single time. The warnings go to stderr, so they clutter the terminal rather than the org file, and I could just send stderr to /dev/null. It turns out to be just as easy, though, to fix the two lines that are the problem:
~ $ diff ical2org.awk ical2org_fixed.awk
272c272
< print "* " gensub("^[ ]+", "", "", gensub("\\\\,", ",", "g", gensub("\\\\n", " ", "g", summary))) "\n<" date ">"
---
> print "* " gensub("^[ ]+", "", "1", gensub("\\\\,", ",", "g", gensub("\\\\n", " ", "g", summary))) "\n<" date ">"
284c284
< print gensub("^[ ]+", "", "", gensub("\\\\,", ",", "g", gensub("\\\\n", "\n", "g", entry)));
---
> print gensub("^[ ]+", "", "1", gensub("\\\\,", ",", "g", gensub("\\\\n", "\n", "g", entry)));
~ $
With that, the script runs perfectly:
~ $ gawk -f ical2org_fixed.awk < milb-roundrockexpress.ics > milb-roundrockexpress.org
~ $
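If the nested gensub calls on line 272 look opaque, here is roughly what they do to an event summary, rewritten as a sketch with plain gsub and sub so that it runs under any awk. This is not the code from ical2org.awk: the function name clean_summary and the sample summary are made up for illustration.

```shell
# Hypothetical helper: mimic the three nested substitutions from line 272.
clean_summary() {
    awk '{
        gsub(/\\n/, " ")     # escaped newline (backslash + n) becomes a space
        gsub(/\\,/, ",")     # escaped comma becomes a plain comma
        sub(/^[ ]+/, "")     # strip leading spaces; only one match is possible
        print
    }'
}

# Made-up summary line with iCal-style escapes and leading spaces.
printf '%s\n' '  Express vs. Visitors\, Dell Diamond\nGates open' | clean_summary
# prints: Express vs. Visitors, Dell Diamond Gates open
```

The pattern in the last step is anchored at the start of the line, so it can only ever match once, which is why "1" is a sensible third argument to gensub in the fixed script.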

Turning It Up To 11

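That's all well and good, but it's only good for Austinites like myself. Let's do the same for all MiLB teams. I dug into the stanza.co page with Dev Tools fully expecting to spend hours digging through minified JavaScript calls before I gave up, but a little fiddling revealed that the Express file lives at a predictable URL, https://www.stanza.co/api/schedules/milb-roundrockexpress/milb-roundrockexpress.ics, where only the team name changes. So I started from a raw list of teams: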
~/milb_schedules $ cat team_names_unclean.txt
Team Class League MLB Affiliation State Tickets
Aberdeen IronBirds Class A Short New York-Penn BAL MD
...snip...
Vermont Lake Monsters ClasTeam Class League MLB Affiliation State Tickets
...and cut it down like so:
~/milb_schedules $ awk -F"\t" '{print $1}' team_names_unclean.txt | tr '[:upper:]' '[:lower:]' | sed -e '1d' -e 's|[. /]||g' > team_names_clean.txt
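As a sketch, that cleanup can be wrapped in a function and tried on a few sample rows. The function name clean_team_names is hypothetical and the rows are illustrative (trimmed to two columns), not copied from the real list. The sketch removes every space, dot, and slash from each name, so multi-word names collapse into a single token.

```shell
# Hypothetical helper: tab-separated team table on stdin, cleaned names on stdout.
clean_team_names() {
    awk -F'\t' 'NR > 1 { print $1 }' |   # drop the header row, keep the team column
        tr '[:upper:]' '[:lower:]' |     # lowercase everything
        sed 's|[. /]||g'                 # remove every dot, space, and slash
}

# Illustrative rows only, trimmed to two columns.
printf 'Team\tClass\nAberdeen IronBirds\tClass A Short\nSt. Lucie Mets\tClass A Advanced\nRound Rock Express\tTriple-A\n' |
    clean_team_names
# prints:
# aberdeenironbirds
# stluciemets
# roundrockexpress
```

Getting each name down to one token with no whitespace matters because the download loop iterates over the output of cat, and the shell would split a name like "round rock express" into three separate words.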
Then, I tried gathering iCal files for all of them:
~/milb_schedules $ time for team in $( cat team_names_clean.txt ); do wget https://www.stanza.co/api/schedules/milb-${team}/milb-${team}.ics; done
There are 152 teams in this list, so it took a few minutes, but I was never rate limited or anything:
real 5m39.017s
user 0m1.688s
sys 0m0.552s
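For anyone repeating this, a slightly more defensive version of the download loop might look like the sketch below. The URL pattern is the one from the loop above. The function names, the one-second pause, and the failure handling are assumptions for illustration, not anything stanza.co documents.

```shell
# Build the schedule URL for one cleaned team name (pattern from the loop above).
schedule_url() {
    printf 'https://www.stanza.co/api/schedules/milb-%s/milb-%s.ics\n' "$1" "$1"
}

# Hypothetical helper: download every team listed in the file $1, one name per line.
fetch_schedules() {
    while IFS= read -r team; do
        [ -n "$team" ] || continue            # skip blank lines
        out="milb-${team}.ics"
        if ! wget -q -O "$out" "$(schedule_url "$team")"; then
            rm -f "$out"                      # a failed wget -O can leave an empty or partial file
            echo "failed: ${team}" >&2
        fi
        sleep 1                               # assumed courtesy pause, not a documented limit
    done < "$1"
}
```

It would be called as fetch_schedules team_names_clean.txt. Reading the list line by line avoids relying on shell word splitting, and any team whose cleaned name does not match a real schedule gets reported instead of leaving a junk file behind.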
Finally, I ran the fixed ical2org.awk on all of them:
~/milb_schedules $ for team in $( cat team_names_clean.txt ); do gawk -f ./ical2org_fixed.awk < milb-${team}.ics > milb-${team}.org; done
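If some of the downloads failed, that loop will happily produce empty org files. A sketch of a variant that skips missing or empty schedules is below. The function name convert_all is hypothetical, and the converter command is passed in as arguments so the loop itself does not care which awk is used.

```shell
# convert_all LIST CMD [ARGS...]
# Run CMD on milb-<team>.ics for each team in LIST, writing milb-<team>.org.
# Teams whose .ics file is missing or empty are skipped with a note on stderr.
convert_all() {
    list=$1
    shift
    while IFS= read -r team; do
        ics="milb-${team}.ics"
        if [ ! -s "$ics" ]; then
            echo "skipping ${team}: no schedule downloaded" >&2
            continue
        fi
        "$@" < "$ics" > "milb-${team}.org"
    done < "$list"
}
```

With the files from this post it would be invoked as convert_all team_names_clean.txt gawk -f ./ical2org_fixed.awk.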
None of these are really of any use to me except the Express file, but hopefully they are to someone else.