Showing posts with label Python. Show all posts

Thursday, June 23, 2011

Installing Pylint for Python 2.5 on Mac

In a previous post I discussed how to set up a Google App Engine (Python) development environment on Mac OS X 10.6. I'm liberally mining some of my own notes on setting up Google App Engine on Ubuntu 10.10.

In my previous Ubuntu environment I had a neat Ant script for building, testing and uploading my Google App Engine applications such as My Web Brain. One of the tests the script performed was a Python code style check using Pylint, which I have also mentioned before in the context of Ubuntu. I've set this up again on my Mac, so I thought I would post my notes here.

In this post I talk about:

  • Installing setuptools for Python 2.5,
  • Installing Pylint using setuptools, and
  • Configuring Aptana/Pydev to use Pylint
Let's go!

Sunday, June 19, 2011

Getting Started with Google App Engine, Python 2.5 and PyDev on OS X 10.6

Last week my shiny new Apple iMac 27 inch arrived and I thought I would share my steps in getting my Google App Engine (Python) environment set up. I have previously written about setting up Ubuntu 10.10 for the same purpose and that post remains one of the most popular on this blog.

To set up our Google App Engine environment we will need the correct version of Python, the Google App Engine SDK and a Python/GAE friendly IDE.

Monday, June 6, 2011

A Better Prisoner's Dilemma Simulation?

In my previous post I had started work on a Python program named generations to try to replicate the simulations of Robert Axelrod in comparing differing Prisoner's Dilemma strategies. At the time I was simulating single organisms ('Critters') with different strategies accumulating food over a number of iterations; the critter with the most food at the end of the iterations was judged the winner.

The result of this simple simulation was that two strategies - Grudger and Tit-for-tat - were continually vying for the top spot, with Grudger taking the mantle more often than not. In Axelrod's tournament Tit-for-Tat was the unambiguous winner, so I knew my generations script needed some work.

(Read my previous post for an introduction to Prisoner's Dilemma, why it interests me and what the various strategies were.)

I've been developing the simulation since. Technically it is a bit of a mess, but when I next refactor I will be focusing on new parts of the model aside from PD and so I wanted to deliver this conclusion to Prisoner's Dilemma now.

In this post I want to talk about how I changed the simulation and what effect these changes had. For the impatient - yes, I have managed to tweak the simulation such that Tit-for-tat is unambiguously the best strategy. Many critters had to die to make it so...


Monday, May 23, 2011

My Generations Project and Prisoner's Dilemma

Apologies for the lack of recent posts. I have been 'learning technical stuff' mostly at work and mostly not in a form that can be easily shared. I have recently found the time to start a new project I call Generations - a Python program I am developing to simulate some aspects of natural economy and natural selection.

Recently I re-read Richard Dawkins' The Selfish Gene and Robert Wright's The Moral Animal and was fascinated by the idea that computer simulations could tell us useful information about effective behavioral strategies, how those strategies prosper and in turn what effect these strategies may have on their own effectiveness and that of other strategies.

Currently this initial version of my program only simulates Prisoner's Dilemma. Wikipedia describes the Prisoner's Dilemma scenario, a classic game from the field of Game Theory, like this:
"Two suspects are arrested by the police. The police have insufficient evidence for a conviction, and, having separated the prisoners, visit each of them to offer the same deal. If one testifies for the prosecution against the other (defects) and the other remains silent (cooperates), the defector goes free and the silent accomplice receives the full one-year sentence. If both remain silent, both prisoners are sentenced to only one month in jail for a minor charge. If each betrays the other, each receives a three-month sentence. Each prisoner must choose to betray the other or to remain silent. Each one is assured that the other would not know about the betrayal before the end of the investigation. How should the prisoners act?"
Prisoner's Dilemma is an example of a non-zero-sum game, meaning that players are not attempting to beat other players, just achieve the best outcome for themselves. That is to say: winning doesn't automatically imply the other player or players have lost. When played repeatedly and scored (i.e. 'Iterated Prisoner's Dilemma') the interesting question becomes which strategy, if any, is the most successful against the widest range of counter-strategies. Always cooperate? Always defect? Some mixture of the two?
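To make the repeated game concrete, here is a minimal sketch in Python. It is not the Generations program itself: the sentence table simply encodes the jail terms from the Wikipedia description above (in months, lower is better), and the function names and the three strategies are my own illustrative choices.

```python
# Jail time in months, indexed by (my_move, their_move); lower is better.
# 'C' = stay silent (cooperate), 'D' = testify (defect).
SENTENCE = {
    ('C', 'C'): 1,   # both silent: one month each
    ('C', 'D'): 12,  # I stay silent, they testify: I serve the full year
    ('D', 'C'): 0,   # I testify, they stay silent: I go free
    ('D', 'D'): 3,   # both testify: three months each
}


def always_cooperate(their_history):
    return 'C'


def always_defect(their_history):
    return 'D'


def tit_for_tat(their_history):
    # Cooperate on the first round, then copy the opponent's previous move.
    return their_history[-1] if their_history else 'C'


def play(strategy_a, strategy_b, rounds=10):
    """Return the total months served by each player over repeated rounds."""
    moves_a, moves_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        # Each strategy sees only the opponent's past moves.
        move_a = strategy_a(moves_b)
        move_b = strategy_b(moves_a)
        total_a += SENTENCE[(move_a, move_b)]
        total_b += SENTENCE[(move_b, move_a)]
        moves_a.append(move_a)
        moves_b.append(move_b)
    return total_a, total_b
```

With these numbers, play(tit_for_tat, always_defect) returns (39, 27): Tit-for-tat is burned once in the first round and then defects for the remaining nine. Two Tit-for-tat players serve only 10 months each, which hints at why the mix of opponents matters so much to which strategy comes out on top.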

Monday, December 6, 2010

Link: Importing Python Modules

I ran into this blog post on importing Python modules a couple of weeks ago and it helped solidify my understanding of how import and from..import (and __import__()) operate in Python and what they actually do.

The PEP 8 Python Style Guide does not state a preference for import style but this blog post is fairly blunt in recommending import (and its lower amount of name pollution) over other approaches when possible.
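As a rough illustration of what "name pollution" means here, the sketch below runs an import statement in an empty namespace and lists the names it leaves behind. The helper names_added is a hypothetical function of my own, not something from the linked post.

```python
import math                # binds only the name 'math' in this module
from math import sqrt      # binds 'sqrt' directly into this module's namespace

# Both spellings reach the same function object.
assert math.sqrt is sqrt


def names_added(statement):
    """Run an import statement in a fresh namespace and list the names it binds."""
    namespace = {}
    exec(statement, namespace)
    # exec() adds '__builtins__' itself, so skip dunder names.
    return sorted(name for name in namespace if not name.startswith('__'))
```

Calling names_added('import math') gives ['math'], while names_added('from math import *') gives every public name in the module, which is the pollution the post warns about.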

A good, clear read. Thanks Fredrik!

Tuesday, August 24, 2010

Adding Library Paths to Google App Engine Python Projects

Apologies - it has been a while since my last post in this blog. Wedding preparations and a focus on design work for a new version of My Web Brain (along with my day job as a SQL developer on a SAP rollout) have kept me busy.

Did you know you could alter the system path dynamically on Google App Engine Python? This tip is something that I am sure old hands at Python consider basic, but I hadn't seen the technique done before.

As part of some research for the next version of My Web Brain, I evaluated some Google App Engine frameworks, one of which was Tipfy. Tipfy uses this technique front and centre in its main.py file to alter the system path:


import os
import sys

if 'lib' not in sys.path:
    # Add /lib as primary libraries directory, with fallback to /distlib
    # and optionally to distlib loaded using zipimport.
    sys.path[0:0] = ['lib', 'distlib', 'distlib.zip']


Taking control of the system path frees your package structure from the normal constraints of the project folder structure. And as the Tipfy code demonstrates, it also provides a good way of separating framework, vendor and customised code bases with the desired fallback order.

Update 24 Nov 2010: Just a note on what is obvious in hindsight. You need to update sys.path prior to importing from modules on the affected path(s). The same is true for modules with dependencies on the affected path.
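The ordering point in the update can be demonstrated without App Engine at all. The sketch below is an assumption-laden stand-in: it creates a temporary folder to play the role of 'lib' and a made-up module called vendored_example, then shows that the import fails before the path is updated and succeeds afterwards.

```python
import os
import sys
import tempfile

# Hypothetical stand-in for a project's 'lib' folder holding a vendored module.
lib_dir = tempfile.mkdtemp()
with open(os.path.join(lib_dir, 'vendored_example.py'), 'w') as handle:
    handle.write("GREETING = 'hello from lib'\n")


def import_vendored():
    """Try the import; return the module, or None if it is not on sys.path."""
    try:
        import vendored_example
        return vendored_example
    except ImportError:
        return None


# lib_dir is not on sys.path yet, so this import fails.
before = import_vendored()

if lib_dir not in sys.path:
    # Same slice-insert idiom as the Tipfy snippet: new entries go to the front.
    sys.path[0:0] = [lib_dir]

# Now the import is resolved from lib_dir.
after = import_vendored()
```

Because the new entries go to the front of sys.path, a module in the added folder also wins over a same-named module later on the path, which is what gives the lib, distlib, distlib.zip fallback order.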

Thursday, August 13, 2009

Book Review: Programming Collective Intelligence : Building Smart Web 2.0 Applications

Programming Collective Intelligence: Building Smart Web 2.0 Applications
By Toby Segaran
O'Reilly 2007
ISBN 978-0-596-52932-1

I bought and read Programming Collective Intelligence about a year ago and it remains one of my favourite books.


According to the author, Collective Intelligence is the "combining of behavior, preferences, or ideas of a group of people to create novel insights." This is a book about approaches and algorithms that allow you to do more with datasets of user-generated or contributed content.

The book takes a task-based approach. After an introductory chapter the book is divided into chapters, each built around performing a specific sort of analysis. You'll find chapters on making recommendations, discovering groups, searching and ranking linked content (like Google's PageRank), optimisation, document filtering, modelling with decision trees, price models, classification and genetic programming.

You will not find heavy theoretical discourse in this book. As required the book introduces new underlying algorithms but only in the context of the current task. Those with a background in the material might benefit from knowing that the following algorithms are covered: Bayesian classifiers, decision tree classifiers, neural networks, support vector machines, k-nearest neighbours, clustering, multi-dimensional scaling and non-negative matrix factorisation.

The book is black and white and is 334 pages long with a mid-sized typeface. The writing style is relaxed and conversational, but you will need to concentrate at times to follow the concepts being explained. Each chapter has many code samples and quite a few data tables and diagrams. The net effect is that where text, code or diagrams are hard to fully grasp, the other code, text or diagrams can clarify the meaning and intent.

The code samples are written in Python in a tutorial, incremental style, so you can follow along with the text. You will need to download and install some well-known third party libraries to recreate most of the examples, and many of the samples either require an internet connection to supply the data or use data sourced from the internet.

A very short Python primer is included in the preface. Knowing Python is not a pre-requisite to reading and understanding the code samples, however - I did not know Python at all when I read it and still managed to follow along. As someone who is now learning Python I can appreciate that the code samples are all very concise and polished, even if the powerful use of list comprehensions sometimes confused me at the time I read it.

I enjoyed this book immensely and it remains one of my favourites. I had only very limited exposure to the concepts prior to reading the book, but I found myself increasingly excited about the possibilities as I read. Even as I page through the book for the purposes of refreshing my memory it is impossible not to earmark certain sections for my current projects.

Who should read the book? I think non-programmers would struggle to get through the book, and readers wanting a deeper, more theoretical understanding of the algorithms may be disappointed. Programmers with a background in any language will probably extract a lot of value from the book, however, especially if these concepts are new to them. For that audience I strongly recommend this book. Unlike many other technical books I think this book will remain relevant and useful for quite a while.

Programming Collective Intelligence : Building Smart Web 2.0 Applications is available from Amazon, Oreilly.com and possibly your local technical bookstore.

(Full disclosure: As an Amazon associate, I will get paid if you purchase this book through the links to Amazon on this page. Use this Amazon link if you do not wish this to happen. Was the review helpful?)

Sunday, August 9, 2009

GAE Maintaining Reference Properties on Deletion

When using Python with Google App Engine and the Datastore, a common problem is maintaining Reference Properties on the Many-side of a relationship when the one-side changes (especially when it gets deleted). For example, consider the following simple model for an application that allows you to group Contacts into Categories:


from google.appengine.ext import db

class Category(db.Model):
    name = db.StringProperty()
    # some other props here

class Contact(db.Model):
    name = db.StringProperty()
    category = db.ReferenceProperty(Category, collection_name='contacts')
    email_address = db.EmailProperty()


In this simple model, a Contact object has a ReferenceProperty to a single Category. What happens when you need to delete the category? In this instance you might want to delete the related contacts, but more likely you simply want to clear the association with the category. Deleting the Category is easy, just fetch it and call delete():


category_to_delete = db.get(some_category_key)
category_to_delete.delete()

Easy, right? Not quite. The reference to the deleted Category still exists in each Contact that was in that category. If you try to dereference it, you will raise an exception. For example:


category_name_of_contact = some_contact.category.name

This causes an error. The Google documentation suggests testing the ReferenceProperty first:


if some_contact.category:
    contact_category_name = some_contact.category.name
else:
    contact_category_name = 'No Category Assigned'

But for a number of reasons this might not be ideal. I am not sure about the associated overhead with this approach but I am sure that manipulating the entire result for the purposes of display every time it is shown (on whatever output page) does not sit well with me. Nor does simply leaving bad references that must be constantly checked. For this reason I asked in the Google App Engine - Python Google Group about what the recommended approaches for this problem are.

Some recommended the approach I originally took: Clean up the references in your controller code at the same time you delete the potentially referenced object:


for contact in category_to_delete.contacts:
    contact.category = None
    contact.save()
category_to_delete.delete()

This cleans up the datastore as a delete is performed. If a Category has many Contacts, this might be a very expensive operation (possibly it can be improved, I am a Google App Engine padawan, after all). On the other hand, deleting a category is probably a much rarer occurrence than, say, outputting a Contact, so possibly the economics of page request performance make it worthwhile.

But this approach still has the problem that you need to perform it every time you delete a Category. Depending on your model and business rules, this might happen only in one location or in several. A cleaner way to encapsulate the deletion business rule, suggested by Vince Stross on the Google Group, is to override the delete() method on your model object. Here is how the updated model looks:


class Category(db.Model):
    name = db.StringProperty()
    # some other props here

    def delete(self):
        # Clear the reference on every related Contact before deleting.
        for contact in self.contacts:
            contact.category = None
            contact.save()
        db.Model.delete(self)

class Contact(db.Model):
    name = db.StringProperty()
    category = db.ReferenceProperty(Category, collection_name='contacts')
    email_address = db.EmailProperty()

Using the overridden method is easy:

category_to_delete.delete()


Behind the scenes, your method cleans up the references to the Category from Contact items, and then passes control across to the original delete() method defined by Google.
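The same pattern can be tried outside App Engine. The sketch below is only an in-memory imitation: the Model base class, its save() and delete() methods and the contacts list are hypothetical stand-ins for db.Model and the back-reference collection, so it shows the shape of the override and nothing about datastore behaviour or cost.

```python
class Model(object):
    """Tiny in-memory stand-in for db.Model (hypothetical, illustration only)."""

    def __init__(self):
        self.deleted = False

    def save(self):
        pass  # a real model would write to the datastore here

    def delete(self):
        self.deleted = True


class Category(Model):
    def __init__(self, name):
        Model.__init__(self)
        self.name = name
        self.contacts = []  # plays the role of the 'contacts' back-reference

    def delete(self):
        # Clear every reference to this category first...
        for contact in list(self.contacts):
            contact.category = None
            contact.save()
        self.contacts = []
        # ...then hand over to the base class for the actual delete.
        Model.delete(self)


class Contact(Model):
    def __init__(self, name, category=None):
        Model.__init__(self)
        self.name = name
        self.category = category
        if category is not None:
            category.contacts.append(self)
```

Callers still just write category.delete(); the clean-up rule lives in one place on the model rather than in every controller that deletes a category.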

I like this approach; it is much cleaner and barring any performance improvements that can be made on the iteration of the loop, it is the approach I will take.

I have asked the Google Group membership to consider Vince's approach with my sample code - I wait to see what they say. Any thoughts here?

Tuesday, July 7, 2009

Setting up a Python Google App Engine development environment

I am setting up a development environment for working on Google App Engine projects. I have a freshly minted Windows 7 RC virtual machine cloned from a base image and made individual. Having a separate development environment on a virtual machine for working with Google App Engine (or anything similar) makes a lot of sense, especially since the Google App Engine SDK is tied to a specific Python version.

What do you need to do to set up the development environment (at least in MS Windows)?

  1. Install Python 2.5.4 from the Python Website. I am not sure whether Google intends to support later versions but 2.5 is definitely not the latest, so be careful to install the correct version. I accept the default program folder of C:\python25, if only because I can vaguely recall issues with paths with spaces. I also install for all users.
  2. Make sure the Python interpreter is in the system path. You should be able to open a command prompt window and type 'python' from anywhere on the file system and have the interpreter found. Add the main Python folder (C:\python25 in my case) and the scripts folder (C:\python25\Scripts) to your path.
  3. Your file associations (.py --> python) should already be set by the installer.
  4. Install the App Engine SDK for Python from the App Engine Downloads page. There is nothing special to select during the install.
  5. The installer for the Python App Engine SDK should have put the scripts on the system path. If you open a (new) command prompt window you should be able to verify this by typing dev_appserver.py with no arguments.
UPDATE 15th July 2009: You should also consider installing PIL as discussed in this post.
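If you would rather check steps 1, 2 and 5 from a script than by hand, something like the following rough sketch would do. The helper names find_on_path and is_python_25 are my own inventions and not part of the SDK; the only real name relied on is dev_appserver.py from step 5.

```python
import os
import sys


def find_on_path(filename, search_path=None):
    """Return the first directory on the search path containing filename, else None."""
    if search_path is None:
        search_path = os.environ.get('PATH', '')
    for directory in search_path.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, filename)):
            return directory
    return None


def is_python_25(version_info=None):
    """True when the given (or running) interpreter is a 2.5.x release."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == (2, 5)


if __name__ == '__main__':
    print('Running Python 2.5: %s' % is_python_25())
    print('dev_appserver.py found in: %s' % find_on_path('dev_appserver.py'))
```

Run it with the interpreter you intend to use for App Engine; a result of None for dev_appserver.py suggests the SDK folder is missing from the path, or that the command prompt was opened before the installer ran.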

You are now ready to rock and roll.... as long as you do not mind working in Notepad or Notepad++.
I do not have strong opinions about an IDE at this early stage, and I am interested in hearing input. Google have a Google App Engine Eclipse Plugin but this is for Java only.

Eclipse already has an excellent plugin for Python called Pydev, which was recently acquired by Aptana. You can find some excellent notes for installing and configuring Pydev on its own for Google App Engine on Google's own help page, thanks to the contributions of Joscha Feth.

However I am going to explore going down a different path. Since Google App Engine (or Django or other Python web projects) still requires static assets like Javascript and CSS in addition to HTML templates, and since Aptana is the leader in these features for Eclipse, I am going to try my luck with installing Aptana studio. I will let you know how I go.

What is everyone else using?

I would also like to discover a lightweight alternative to Eclipse for editing on my Netbook (Eclipse is a beast and I am not sure my space- and processor-constrained Netbook would cope). If I find one I will post it here.