Over the past week after work, I've been working on a little side project to hone my Ruby scripting skills, and I'm ready to release it today.
I've written a Ruby script that parses all the parliamentary reports from http://www.parliament.gov.sg/Publications/votes_11thParl.htm (at the time of processing, the 27th April 2010 report is the latest), extracts the name, ward and date of each absentee from each individual report, and dumps them into a SQLite database.
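To give a feel for the extraction step, here is a minimal sketch in Ruby. It is not my actual script: the line format ("Name (Ward)"), the ABSENTEE_LINE pattern, the AbsenteeRecord struct, the parse_absentees helper and the sample text are all assumptions made up for illustration, and the real reports may be laid out differently. It only turns text into name/ward/date records; writing them to SQLite would be a separate step.

```ruby
require "date"

# Assumed line format (hypothetical): "Member A (Ward One)."
# The real reports may need a different pattern.
ABSENTEE_LINE = /\A\s*(?<name>[^()]+?)\s*\((?<ward>[^()]+)\)\s*[.,;]?\s*\z/

# One row per absentee per sitting: name, ward and date of absence.
AbsenteeRecord = Struct.new(:name, :ward, :date)

# Scan a report line by line and keep only the lines that look like
# "Name (Ward)". Lines that do not match (headings, blanks) are skipped.
def parse_absentees(report_text, sitting_date)
  report_text.each_line.filter_map do |line|
    match = ABSENTEE_LINE.match(line)
    next unless match

    AbsenteeRecord.new(match[:name].strip, match[:ward].strip, sitting_date)
  end
end

# Placeholder text, not taken from a real report.
sample_report = <<~TEXT
  ABSENT:
  Member A (Ward One).
  Member B (Ward Two).
TEXT

records = parse_absentees(sample_report, Date.new(2010, 4, 27))
records.each { |r| puts "#{r.name} | #{r.ward} | #{r.date}" }
```

Each record here maps to one row in the database: one absentee, one ward, one sitting date.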
Some interesting facts:
1) From the start of the 11th Parliament to the sitting on the 27th April 2010, there have been 928 absentee records.
2) The top 5 absentees in Parliament are:
Mr Lee Kuan Yew - 46
Dr Balaji Sadasivan - 38
Prof. Thio Li-ann - 38
Dr Loo Choon Yong - 28
Mr George Yong-boon Yeo - 25
3) Mr Low Thia Khiang (Hougang - Workers' Party) has not been absent since the 11th Parliament commenced. Mr Chiam See Tong (Potong Pasir - Singapore People's Party) has been absent 13 times, and Ms Sylvia Lim (NCMP - Workers' Party) 6 times.
4) The top 5 wards with the highest number of absentees in Parliament are:
Nominated Member - 146
Tanjong Pagar - 90
Pasir Ris-Punggol - 75
Ang Mo Kio - 69
Aljunied - 62
West Coast - 56
5) The top 5 Parliament sittings with the highest number of absentees are:
12th April 2007 - 28 (33.333% of Parliament)
24th March 2009 - 24
19th August 2007 - 23
20th August 2007 - 23
22nd January 2008 - 22
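If you want to reproduce rankings like the ones above from the records, a rough sketch in Ruby could look like this. The rows are placeholders (not real data), and the top_counts and percent_of_parliament helpers are hypothetical names of my own. The figure of 84 members is also an assumption: it is simply what 28 absentees at 33.333% implies.

```ruby
# Placeholder rows only: [name, ward, sitting date]. Not real data.
records = [
  ["Member A", "Ward One", "2010-01-01"],
  ["Member B", "Ward Two", "2010-01-01"],
  ["Member A", "Ward One", "2010-01-02"],
  ["Member C", "Ward One", "2010-01-02"],
  ["Member A", "Ward One", "2010-01-03"],
]

# Count how often each value of one column appears and return the
# `limit` most frequent values, ties broken alphabetically.
def top_counts(records, column, limit = 5)
  records.map { |row| row[column] }
         .tally
         .sort_by { |key, count| [-count, key] }
         .first(limit)
end

# Share of Parliament absent at one sitting, as a percentage.
def percent_of_parliament(absent, total_members)
  (absent.to_f / total_members * 100).round(3)
end

by_name = top_counts(records, 0) # top absentees
by_ward = top_counts(records, 1) # top wards
by_date = top_counts(records, 2) # sittings with the most absentees

by_name.each { |name, count| puts "#{name} - #{count}" }

# Assumes 84 members in total (inferred, not stated in the reports I parsed).
puts percent_of_parliament(28, 84)
```

The same three rankings are just one grouping applied to a different column each time, which is why a single helper covers all of them.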
I'm releasing the database under the Creative Commons 3.0 license, so you are free to use the data for any purpose (analysis, data visualization, etc.) without any restriction :) Although I'd love it if you would share with me where you are using the data :)
Grab it in either format:
Update: Here's my Ruby script for those who are curious. (If you run this on your own, you might notice that RAdm Lui's records are slightly misnamed. I had to manually edit those records to clean up the name for the released database. If you can improve on the script, I'd appreciate it too.)
Disclaimer: Because the data was collected by automated processing, there may be some inaccuracies in it, although as far as I know the initial parsing did not produce any errors or discrepancies. If you find something wrong, please contact me so that I can fix it.