Friday, March 27, 2009

Implementation details for GovCheck

Now that GovCheck.net has gone live and is doing relatively well, I think it's time to discuss how the site was built. The driving force behind all the technologies used to build this site is the Python programming language. Everything I've written for the site is in Python, using libraries that are themselves written in Python. With that said, here's a list of the various libraries being used:
  • BeautifulSoup - Used in the scraping code. It is truly one of the most intuitive and easy to use libraries I have used (with maybe the exception of the web framework being used).
  • SQLAlchemy - The ORM used to commit data being collected by the scraping code into the DB.
  • Elixir - This is a declarative layer on top of SQLAlchemy which gives the DB layer in my scraping code a feel similar to that of my web app code.
  • Django - This is the Web Framework being used. I won't be the first one to say that it is *the best* web framework I have used - hands down. In addition to the Django Core as well as the contrib apps, I am using multiple third party apps:
    • django-registration - The user registration process on the site is handled by this app.
    • django-tagging - This app is currently used to allow users to assign a custom tag to petitions. I imagine that this app will find many more uses as the site grows.
    • django-pagination - This app handles pagination wherever a listing spans multiple pages.
    • I have also developed a couple of reusable applications of my own (which serve various functions - such as the petitions functionality), some of which I plan to open source soon.
  • pygooglechart - This is a Python wrapper for the Google Chart API. It is used to generate the maps throughout the site.
  • jQuery - The JavaScript library being used on GovCheck.net. As their website says - "The write less, do more JavaScript library". I'll also add - the best JS library out there.
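
To give a feel for the scrape-then-store flow that the first few libraries in the list support, here is a minimal sketch. It is not the GovCheck code: so that it runs without any third-party packages, it uses the standard library's html.parser and sqlite3 as stand-ins for BeautifulSoup and SQLAlchemy/Elixir, and the HTML snippet, the "petition" class name and the table layout are all made up for illustration.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical markup; the real pages being scraped are not shown in this post.
SAMPLE_HTML = """
<ul>
  <li class="petition">Fix the roads</li>
  <li class="other">Ignore me</li>
  <li class="petition">Fund the library</li>
</ul>
"""


class PetitionParser(HTMLParser):
    """Collects the text of every <li class="petition"> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        if tag == "li" and dict(attrs).get("class") == "petition":
            self._inside = True
            self.titles.append("")

    def handle_data(self, data):
        # Only accumulate text while inside a matching <li>.
        if self._inside:
            self.titles[-1] += data

    def handle_endtag(self, tag):
        if tag == "li":
            self._inside = False


def scrape(html):
    """Return the stripped petition titles found in the given HTML."""
    parser = PetitionParser()
    parser.feed(html)
    parser.close()
    return [title.strip() for title in parser.titles]


def store(conn, titles):
    """Insert titles into a (hypothetical) petition table, skipping duplicates."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS petition "
        "(id INTEGER PRIMARY KEY, title TEXT UNIQUE)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO petition (title) VALUES (?)",
        [(title,) for title in titles],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
store(conn, scrape(SAMPLE_HTML))
rows = [row[0] for row in conn.execute("SELECT title FROM petition ORDER BY id")]
```

The UNIQUE constraint plus INSERT OR IGNORE means the scraper can be re-run over the same page without creating duplicate rows, which is one simple way to keep a periodic scrape idempotent.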
As for the deployment, the operating system is Ubuntu 8.04 Server Edition on a 256MB slice. The web server is nginx, which passes requests on to FastCGI instances running in the background. The database server is PostgreSQL. As of now, none of the pages are cached, but I plan to add caching soon using memcached. Finally, all the media files are served from Amazon Web Services (via a CloudFront distribution). Given this setup, I've got room for approximately 30 req/s on some of the heavier pages. I imagine that this number will go up significantly once I add memcached to the mix.

So there you have it - the software being used at GovCheck and the servers being used to serve the site.
