Last week my shiny new Apple iMac 27 inch arrived and I thought I would share my steps in getting my Google App Engine (Python) environment set up. I have previously written about setting up Ubuntu 10.10 for the same purpose and that post remains one of the most popular on this blog.
To set up our Google App Engine environment we will need the correct version of Python, the Google App Engine SDK and a Python/GAE friendly IDE.
Sunday, June 19, 2011
Monday, January 3, 2011
Static Resources and Cache-Busting on Google App Engine Python
Hello and happy new year!
In a couple of previous posts I discussed using the Google Closure JavaScript Compiler in your build process to minimise the size of your JavaScript resources. Arguably a bigger win for the end user experience is to ensure that caching of all of your website's static resources (JavaScript, stylesheets and images) is properly optimised. In this post I want to share how I think I have achieved this on Google App Engine (Python).
Ideally, we would like the end user's browser to use its cache for every static resource, on every request, until that resource changes because we have deployed a new application version. Adding the (relatively new) default_expiration and expiration directives to your app.yaml file will achieve the first part of this requirement:
- url: /images
  static_dir: images
  expiration: "99d"
Behind the scenes, these directives add additional headers to the HTTP response, letting the browser know that it can safely serve these files from cache on subsequent requests.
But how do we tell the end user browser when our static resources have changed?
Monday, November 8, 2010
Quick Note: Eclipse and Subclipse on Ubuntu (for Appengine)
I have previously written about configuring PyDev on top of an Eclipse/Aptana installation for Google App Engine development. Now that I am on the Ubuntu platform instead of Windows, the process is easier, since PyDev ships with the new Aptana Studio 3 Beta.
I thought I would write a quick note to future Ben (i.e. me) about configuring Eclipse and Subclipse on Ubuntu. Subclipse is a Subversion plugin for Eclipse (the alternative is Subversive, which I haven't tried).
Briefly my Eclipse install (for reference):
- sudo apt-get install eclipse (if using the Eclipse base install)
- Install Aptana Studio 3 beta, either standalone or as an Eclipse plugin
- Install the Google plugin for Eclipse from within Eclipse (page with download sites)
- Configure the Python 2.5 interpreter within Eclipse (assuming you have already installed Python 2.5)
- Create a new Google App Engine project (under the PyDev grouping)
- Configure Python Library Paths for Eclipse
- sudo apt-get install subversion
- sudo apt-get install libsvn-java
- Install Subclipse within Eclipse (download sites)
Labels:
Aptana Studio,
Eclipse,
Google App Engine,
PyDev,
Subclipse,
Subversion
Sunday, November 7, 2010
Using the provided OpenID functionality on Google App Engine
I've just released a small, simple application to he3-appengine-lib that demonstrates using the OpenID support Google added to App Engine in version X. I wrote the application from scratch to learn how a real application would behave, as I wasn't quite following the documentation.
When you set the authentication option for your application to OpenID, it behaves very similarly to the standard Google authentication. When you use Google authentication you redirect your users to the URL created by create_login_url() to log in and create_logout_url() to log out. Before and after, you can use the rest of the google.appengine.api.users API to find out the currently logged-in user, their nickname and email address.
Using OpenID works the same way, with only a few caveats:
- When calling create_login_url(), you need to pass the required OpenID service provider or actual OpenID URL. This is because the service provider provides the login screen (if required - the user might already be signed in).
- Some service providers might not supply an email address, which may be problematic depending on your requirements. Even if you don't need to email the user, the email address is a dependable identifier, whereas the nickname is not. For example, MyOpenID.com does not supply an email address by default, although this can be configured at the time of login. In my demonstration application, I detect when an email address has not been provided and reject the login.
- If you use directives in the app.yaml file to mandate that login or admin access is required for particular pages, you must implement a handler for /_ah/login and /_ah/login_required. The standard approach would be to direct the user to your dedicated login page (but I took the easy way out in my application and redirected to the home page).
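The email check from the second caveat can be sketched roughly as follows. This is only an illustration and not the code from the demonstration application: accept_openid_user and FakeUser are hypothetical names, and the only assumption about the real user object is that it exposes an email() method, as the users API's User class does.

```python
def accept_openid_user(user):
    """Return True only when a signed-in user supplied an email address.

    `user` is assumed to be None (nobody signed in) or an object with an
    email() method, like the object the App Engine users API returns.
    """
    if user is None:
        return False
    # Some OpenID providers return an empty email address by default.
    return bool(user.email())


class FakeUser(object):
    """Hypothetical stand-in so the sketch runs outside App Engine."""

    def __init__(self, email):
        self._email = email

    def email(self):
        return self._email


with_email = accept_openid_user(FakeUser('ben@example.com'))
without_email = accept_openid_user(FakeUser(''))
```

A handler would typically run a check like this right after the provider redirects back, and send the user to an explanatory page when it fails.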
Overall the transition from Google authentication to OpenID is not challenging from a technical standpoint. As Google themselves discuss, some of the challenges arise from simple usability choices in how you communicate to your users what their options are.
You can find my demonstration application on he3-appengine-lib, and you can find a working copy at he3-openidtest.appspot.com, although I can't guarantee I will keep it around forever.
I am yet to use the OpenID support in a production application, and I am sure I will learn more when I do. I'll record anything else of use that I find out here.
Labels:
Google App Engine,
he3-appengine-lib,
OpenID
Thursday, November 4, 2010
Mapreduce to be built in to Google App Engine Environment
If you read the release notes for the latest Google App Engine Python SDK (1.3.8) you might have noticed that the set of declarations available in the app.yaml application configuration has grown. There are some interesting additions.
The include directive allows the app.yaml file to be broken apart for better reuse. Within a single project this might not make sense, but you could imagine entire utility modules suddenly being a lot more transportable between projects and easily added to an existing application.
The builtin directive, built on top of the include functionality, provides easy inclusion of built-in handlers, including appstats, the administration console, the remote API and a 'datastore_admin'.
The 'datastore_admin' builtin seems to be built on the appengine-mapreduce project. As I have previously posted, MapReduce is a methodology of dividing a large task into smaller pieces of work that can be performed in parallel, before being reduced to a single outcome.
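To make the map-then-reduce idea concrete, here is a toy sketch in plain Python. It has nothing to do with the appengine-mapreduce API itself; the function names and the word-counting task are made up purely for illustration.

```python
from functools import reduce


def map_chunk(lines):
    """Map step: count the words in one independent piece of the work."""
    return sum(len(line.split()) for line in lines)


def total_words(chunks):
    """Reduce step: combine the partial counts into a single outcome."""
    # Each chunk is independent, so these calls could run in parallel.
    partial_counts = [map_chunk(chunk) for chunk in chunks]
    return reduce(lambda a, b: a + b, partial_counts, 0)


# Three illustrative chunks of a larger "document".
chunks = [["a b c", "d e"], ["f"], ["g h"]]
word_count = total_words(chunks)
```

On App Engine the pieces of work would be batches of datastore entities rather than lists of strings, but the shape of the computation is the same.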
I am excited about the new builtin - since I discovered appengine-mapreduce I have considered it essential for maintaining a large set of data in the datastore.
In case you were excited to try out the new Mapreduce functionality, you need to wait a little longer - until the next release of the SDK at least. Oddly enough the documentation lists the 'datastore_admin' as a valid builtin option, and the libraries are now included in the SDK, but some import errors prevent it from being used (namely simplejson and graphy.backends).
As this discussion thread on the Google Group shows, it isn't quite ready for primetime yet. Hopefully the next release will tidy things up.
Sunday, October 31, 2010
Installing Python Google Appengine SDK on Ubuntu 10.10
I am now back from my long hiatus from this blog (my excuse was a busy work schedule, wedding preparations and then the honeymoon) and I decided to have a look at Ubuntu 10.10. Ubuntu is a popular desktop Linux distribution which offers greater distribution-wide integration, consistency and polish than many other distributions.
I have a long but not terribly recent history of trying Linux but never fully adopting it. As an exercise I wondered how feasible it would be to port my development environment for Python Google App Engine (GAE) to my new Ubuntu virtual machine. Most things on a Linux platform are possible as long as you know what you are doing, but I don't, so getting GAE running required some investigation.
Ubuntu - at least as installed by my VMware Easy Install process - presented some challenges. Ubuntu no longer installs the correct version of Python, for example, nor does it install the source headers for libraries that some Python extensions required by the SDK need in order to compile. After discussing my problem on the Google App Engine forums I found two very helpful blog posts that present two different approaches to setting up the Google App Engine Python SDK under Ubuntu 10.10.
Tuesday, August 24, 2010
Adding Library Paths to Google App Engine Python Projects
Apologies - it has been a while since my last post on this blog. Wedding preparations and a focus on design work for a new version of My Web Brain (along with my day job as a SQL developer on a SAP rollout) have kept me busy.
Did you know you could alter the system path dynamically on Google App Engine Python? This tip is something that I am sure old hands at Python consider basic, but I hadn't seen the technique done before.
As part of some research of the next version of My Web Brain, I evaluated some Google App Engine frameworks, one of which was Tipfy. Tipfy uses this technique front and centre in its main.py file to alter the system path:
import os
import sys
if 'lib' not in sys.path:
    # Add /lib as primary libraries directory, with fallback to /distlib
    # and optionally to distlib loaded using zipimport.
    sys.path[0:0] = ['lib', 'distlib', 'distlib.zip']
Taking control of the system path frees your package structure from the normal constraints of the project folder structure. And as the Tipfy code demonstrates, it also provides a good way of separating framework, vendor and customised code bases with the desired fallback order.
Update 24 Nov 2010: Just a note on what is obvious in hindsight. You need to update sys.path prior to importing from modules on the affected path(s). The same is true for modules with dependencies on the affected path.
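The slice assignment in the Tipfy snippet inserts entries at the front of the list, so they are searched first and in the order given. A small standalone sketch of the same idea, where prepend_library_paths is a hypothetical helper and the directory names are just examples:

```python
import sys


def prepend_library_paths(paths, search_path=None):
    """Insert `paths` at the front of a search path, keeping their order.

    Entries that are already present are skipped. `search_path` defaults
    to sys.path; passing a plain list makes the behaviour easy to inspect.
    """
    if search_path is None:
        search_path = sys.path
    new_entries = [p for p in paths if p not in search_path]
    # Slice assignment at [0:0] inserts without replacing anything.
    search_path[0:0] = new_entries
    return search_path


example = prepend_library_paths(['lib', 'distlib'], ['/usr/lib/python'])
```

As the update above notes, a call like this has to run before any import that relies on the added directories.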
Saturday, July 3, 2010
Timezones using Gae-Pytz
This is a short post about my good experiences moving from standard pytz to the gae-pytz library for Google App Engine. Pytz is the standard python library for defining timezone information. Gae-pytz is a version of pytz optimised for Google App Engine.
If you have read one of my previous posts about implementing timezone support in Google App Engine, you'll know I have used the full pytz library in the past. In addition to improved performance, gae-pytz offers a couple of key advantages in comparison to vanilla pytz:
- All timezone objects are packaged and deployed in a zip file. There are hundreds of timezone files in the normal pytz distribution, which makes the library particularly burdensome given the 1000-file limit imposed by Google App Engine.
- Gae-pytz has built-in memcache caching of individual timezone objects.
My experience with implementing gae-pytz in my application was great. For the most part it is a drop-in replacement with no changes to code required. I could actually remove all of the caching code I had put in place, since the built-in caching already covers that functionality. If you have previously built an application with pytz for App Engine, check gae-pytz out.
Monday, June 28, 2010
Maintaining the App Engine Datastore with MapReduce
One of the items on the todo list for my application My Web Brain is to add features which will require changes to the data model used by the application. Making changes to an application's data model in the App Engine datastore is sometimes more painful than it should be, since the developer remains responsible for ensuring that existing entities are made consistent with the new model.
Having recently read about appengine-mapreduce I decided I would write a simple Mapper that looks for and fixes missing property values and invalid ReferenceProperty keys. You can find the ModelHygienceMapper (as I call it) at he3-appengine-lib in the new mappers.py file (some simple documentation coming soon).
Labels:
Google App Engine,
he3-appengine-lib,
mapreduce
Wednesday, June 16, 2010
PrefetchingQuery: Prefetch Reference Properties and Parents in App Engine
Back in January Nick Johnson from Google wrote about the benefits of prefetching reference properties for a collection of App Engine datastore entities. Prefetching can dramatically reduce the number of RPC calls* to the datastore by ensuring each entity referenced by a property is only retrieved once.
It has taken me a while but I finally got a chance to go back to Nick's blog entry and build some infrastructure around his code to make prefetching reference properties easier. I call the effort PrefetchingQuery and you can find it over at he3-appengine-lib in performing.py
Like PagedQuery, PrefetchingQuery is a facade for a normal db.Query or db.GqlQuery object. You can configure the PrefetchingQuery with the properties to prefetch in its constructor, with a class attribute on your model entity, or let it default to prefetching all reference properties. For example:
# create a prefetching query
myPrefetchingQuery = PrefetchingQuery(MyEntity.all(), (MyEntity.myReferenceProp1,))
configures the PrefetchingQuery to prefetch the myReferenceProp1 property. Actual prefetching occurs when you call fetch() on the query.
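The saving comes from collecting the referenced keys, fetching each distinct key once in a single batch, and then handing the results back to the entities. The sketch below shows that idea with plain dictionaries; it is not the PrefetchingQuery implementation, and prefetch, fake_fetch and the sample data are all hypothetical.

```python
def prefetch(entities, ref_name, fetch_many):
    """Resolve `ref_name` on every entity using one batched lookup.

    entities:   list of dicts that hold a key under `ref_name`
    fetch_many: callable taking a list of keys and returning the matching
                objects in the same order (stands in for a batch get)
    """
    keys = [entity[ref_name] for entity in entities]
    unique_keys = list(dict.fromkeys(keys))  # de-duplicate, keep order
    fetched = dict(zip(unique_keys, fetch_many(unique_keys)))
    for entity in entities:
        entity[ref_name] = fetched[entity[ref_name]]
    return entities


calls = []


def fake_fetch(keys):
    """Record each batch so the number of round trips can be inspected."""
    calls.append(list(keys))
    return [{"id": key} for key in keys]


items = [{"author": "k1"}, {"author": "k2"}, {"author": "k1"}]
prefetch(items, "author", fake_fetch)
```

Three entities referencing two distinct authors result in one batched call for two keys, instead of three separate lookups.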
Sunday, June 6, 2010
PageLinks - a simple helper class for PagedQuery
On May 26th I added PageLinks to the already published PagedQuery class available on he3-appengine-lib. PagedQuery is a python class to add a paging abstraction to the vanilla db.Query and db.GqlQuery used on Google App Engine (announced here). PageLinks is a simple helper class that generates a sequence of page number / URL pairs for use in page navigation, including previous and next links.
Saturday, May 15, 2010
Unit Tests for PagedQuery
I've just checked in some Gaeunit unit tests for PagedQuery. I have also included the gaeunit.py script and a simple app.yaml file.
To run the unit tests, check out the trunk (if you have not already) and launch the development server pointing at the source folder. Then navigate to the local /test folder in your browser (probably http://localhost:8080/test), which will run all of the tests defined by the library (19 so far).
Tuesday, April 27, 2010
PagedQuery - Easy paging using cursors on AppEngine
In the 1.3.1 release of Google App Engine, the App Engine team added support for Query cursors. I am happy to release a small and simple utility, PagedQuery, for providing a paging abstraction that uses the cursors and caches them to the memcache.
Labels:
GAE Datastore,
Google App Engine,
he3-appengine-lib
Saturday, January 30, 2010
App Engine: Am I running on the development server?
Sometimes you need to determine within your application whether it is currently running in a development or production environment. I just spotted a tip from Nickolas Daskalou in the Google App Engine Python forums that allows you to do just that:
import os
...
debug = os.environ['SERVER_SOFTWARE'].startswith('Dev')
This isn't particularly exciting, but I know I will come across this situation soon so I thought I would note the solution for myself in the future. Thanks Nick!
The Google App Engine development environment is generally pretty safe for Google-specific resources: no emails are actually sent, only the local datastore can be updated and task queues do not process without manual intervention. When working with external resources, such as read/write web services, however, you may want to limit what your application does within a development environment.
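A slightly more defensive variant of the tip above wraps the check in a function and uses a default value, so that it does not raise a KeyError when SERVER_SOFTWARE is missing (for example when running a script outside the SDK). The function name and the sample values used to exercise it are illustrative only.

```python
import os


def is_development_server(environ=None):
    """Return True when SERVER_SOFTWARE looks like the local dev server."""
    if environ is None:
        environ = os.environ
    # .get() with a default avoids a KeyError when the variable is unset.
    return environ.get('SERVER_SOFTWARE', '').startswith('Dev')


# Illustrative values; the exact strings reported by the SDK may differ.
on_dev = is_development_server({'SERVER_SOFTWARE': 'Development/1.0'})
on_prod = is_development_server({'SERVER_SOFTWARE': 'Google App Engine/1.3.0'})
```

Code that talks to an external read/write web service could check this flag and switch to a sandbox endpoint or skip the call entirely.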
Friday, January 29, 2010
App Engine: Google fails users.is_current_user_admin() test
The way Google App Engine executes cron jobs indicates that not all admin authentication is created equal. If you secure your cron or task queue URL in the app.yaml file, as Google suggests, your Google-automated tasks will be properly secured so that only an administrator can access the URL:
- url: /admin/my-cron-url
  script: main.py
  login: admin
However, if you would like to secure one of these Google-executed URLs yourself, you seem to be out of luck.
Yes, you could check the easily spoofed request user-agent, but on first glance the most useful method would be the users.is_current_user_admin() API method. However, this fails for Google cron and taskqueue page requests.
This is inconvenient for me; I do use Google user accounts for my applications (such as My Web Brain) but I like handling security within the confines of an event handler, where I can control the exception raised, HTTP status code, logging and actual response sent to the user.
Hopefully this inconsistency will be resolved in a future App Engine release, but my feeling is that Google jury-rigged the exception for their own services into their own interpretation of app.yaml, and that the inbuilt users API would not know Google's own requests from anyone else's.
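For what it is worth, later versions of the App Engine documentation describe request headers that the platform sets on its own cron and task queue requests (X-AppEngine-Cron and X-AppEngine-QueueName). Whether they were available when this post was written, and whether they are trustworthy in your setup, are assumptions to verify against the current documentation. A handler-side check might then look roughly like this, with the function name being hypothetical:

```python
def is_appengine_internal_request(headers):
    """Guess whether a request came from App Engine cron or a task queue.

    `headers` is any mapping of header names to values. The header names
    are taken from later App Engine documentation; check that they apply
    to your environment before relying on them for security.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    is_cron = lowered.get('x-appengine-cron') == 'true'
    is_task = 'x-appengine-queuename' in lowered
    return is_cron or is_task


cron_like = is_appengine_internal_request({'X-AppEngine-Cron': 'true'})
browser_like = is_appengine_internal_request({'User-Agent': 'Mozilla/5.0'})
```

Combined with users.is_current_user_admin() for human administrators, a check along these lines would keep the security decision inside the handler.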
Monday, January 4, 2010
Supporting Timezones in Google App Engine Pt. 3 - Your Timezone Aware App
I am covering how I added Timezone support to my web application My Web Brain in a series of posts. Hopefully someone will find the discussion useful or even better contribute ways to achieve the same effect.
- Part 0: The Starting Point - What you get out the box from Google App Engine and the Base Python Environment.
- Part 1: Getting Timezone information with pytz
- Part 2: Making the Google App Datastore more consistent by writing a UtcDatetimeProperty class
- Part 3: Handling timezones throughout your application (this post)
Note: I've scaled this post back, mostly since I was being tardy in completing it and also with the realisation that the extra detail I might add would be too specific to my recent experience.
Converting datetimes on input
Sometimes your application will accept as input a user provided date or datetime. In a timezone aware application the working assumption must be that this date is in the user's local timezone. It therefore needs to be converted to UTC for storage.
A handy feature of the datastore datetime property is that as long as your datetime is marked with a specific timezone, the datastore will automatically convert the value to UTC when it is persisted. Once you have parsed your datetime from the input, simply assign the user's timezone before persisting. For example:
user_datetime = parse_datetime(self.request.get('user_date'))
# localize() attaches the zone correctly; replace(tzinfo=...) with a pytz zone can pick the wrong offset
my_model_entity.user_datetime = pytz.timezone(user_timezone_name).localize(user_datetime)
my_model_entity.put()
Of course, if you are accepting input from the system time, you do not need to assign a timezone; it is already in UTC, even though it is a naive datetime.
my_model_entity.log_datetime = datetime.datetime.now()
my_model_entity.put()
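The conversion the datastore performs on a timezone-aware value can be reproduced with the standard library alone, which is handy for checking your expectations. This sketch assumes Python 3 and uses a fixed UTC+10 offset as a stand-in for the user's timezone; it does not use pytz, and to_utc is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# Stand-in for the user's timezone: a fixed UTC+10 offset (an assumption).
USER_TZ = timezone(timedelta(hours=10))


def to_utc(naive_local, tz=USER_TZ):
    """Attach `tz` to a naive local datetime and convert it to UTC."""
    # replace(tzinfo=...) is fine for fixed offsets like this one;
    # pytz zones should be attached with their localize() method instead.
    aware_local = naive_local.replace(tzinfo=tz)
    return aware_local.astimezone(timezone.utc)


# 9:30am on 10 January in UTC+10 is 11:30pm on 9 January in UTC.
stored = to_utc(datetime(2010, 1, 10, 9, 30))
```

Note how the stored UTC value falls on the previous calendar day, which is exactly why date-only input needs the extra care discussed next.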
Converting dates (only) on input
This is pretty straightforward. There is a situation though where your application would normally only care about dates, but because of your support for timezones you are persisting datetime objects. The user would select a date, say the 10th of January of this year, but when persisting this as a datetime you have a crucial decision to make - what time on the 10th of January should you store? In all situations you will want to set this time prior to converting to UTC. The decision about which time to use is non-trivial, since your application will likely want to use the information in the most effective manner. The answer will vary from application to application, but a couple of examples might illustrate the point.
In my application My Web Brain, the user can enter a Due Date for a next action that they define. In this application we will be asking one critical question on a regular basis concerning the due date - is the next action overdue? Something is overdue when the due date has passed, not during that date, so My Web Brain adds 23 hours and 59 minutes to the due date. This gets converted to UTC when it is persisted, so any Next Action which is overdue can be queried easily by using a condition comparing the due date with the system time.
To provide some contrast, My Web Brain also supports Someday Maybe items with 'tickle dates'. A tickle date is provided by the user to indicate when they should be reminded. In this situation the application will want to find all Someday Maybe items that are due to be 'tickled', and this should include any such item on the same local date the user entered. Therefore, we do not add any amount of time when combining to create a tickle date. Again, since this date was converted to UTC prior to being persisted, this query is easy for the application to do without knowing the user's local timezone.
Other situations will require other time offsets. Perhaps if a user's work day ends at 5pm, the due date should reflect this. The common thread is that the right time offset will depend on the application. Once you have determined this, the code is easy:
due_date = parse_date(self.request.get('user_date'))
# combine() is a classmethod of datetime.datetime; datetime.time takes hour=, not hours=
due_datetime = datetime.datetime.combine(due_date, datetime.time(hour=17))
next_action_entity.duedate = pytz.timezone(user_timezone_name).localize(due_datetime)
next_action_entity.put()
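The due date reasoning above can be tried out without the datastore. The sketch below assumes Python 3 and again uses a fixed UTC+10 offset in place of a real pytz timezone; due_datetime_utc and is_overdue are hypothetical helpers, not code from My Web Brain.

```python
from datetime import date, datetime, time, timedelta, timezone


def due_datetime_utc(due_date, local_time, tz):
    """Combine a local date and time of day, then convert to UTC."""
    local_dt = datetime.combine(due_date, local_time).replace(tzinfo=tz)
    return local_dt.astimezone(timezone.utc)


def is_overdue(due_utc, now_utc):
    """An item is overdue once the current UTC time is past the due time."""
    return now_utc > due_utc


user_tz = timezone(timedelta(hours=10))  # fixed offset stand-in (assumption)

# Due "by the end of" 10 January local time, i.e. 23:59 in UTC+10.
due = due_datetime_utc(date(2010, 1, 10), time(23, 59), user_tz)

before = is_overdue(due, datetime(2010, 1, 10, 13, 0, tzinfo=timezone.utc))
after = is_overdue(due, datetime(2010, 1, 10, 14, 0, tzinfo=timezone.utc))
```

Because the comparison is done entirely in UTC, the overdue check never needs to know which timezone the user entered the date in.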
Converting datetimes on output
At the other end of your application you will want to show the user dates and datetimes in their local time. From the datastore you will retrieve datetimes in UTC - these values need to be converted to the user's timezone.
local_duedate = next_action_entity.duedate.astimezone(pytz.timezone(user_timezone_name))
There is nothing overwhelmingly difficult about the above line of code, but you could be forgiven for wondering where in your application it should go. You need to decide if the derived local time of the data should belong to the model, view or controller part of your application:
- If you make the conversion to local time a part of the view, you need to share the user's timezone information with the view. The view is often implemented in a templating language like Django templates, which out of the box does not provide the capability to do this as a view operation. The best way to use the view in this way would be to define a custom tag for the templating engine, which is not overly difficult.
- You might make the conversion to local time a responsibility of the controller, which after all can access different parts of the model to find both the user's timezone preference and the data with the times to be converted. On the downside, converting the date times and providing them separately (or mixed in to the data) is messy and, in my opinion, increases the coupling between controller and view more than necessary.
- Making the conversion to local time a responsibility of the model is perhaps the easiest approach in the short term. Simply add a method to provide the local time to your entity's class. This has the benefit of making your entities less anemic, but introduces the requirement for one piece of your model to know the user's timezone, which might belong somewhere else in the model entirely. It can also lead to timezone conversion code being duplicated across your model.
For My Web Brain I have at the moment chosen to give the responsibility of converting times to the user's local time to the model, but I wonder if a view-based solution would provide more long-term benefits to performance and model design.
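As a rough sketch of the model-based option, the conversion can live in a method on the entity class that receives the timezone from the caller. Everything named here is hypothetical: `NextAction` is a plain Python stand-in rather than a real datastore model, and a fixed-offset `datetime.timezone` (Python 3) stands in for `pytz.timezone(user_timezone_name)`.

```python
import datetime

UTC = datetime.timezone.utc


class NextAction(object):
    """Hypothetical plain-Python stand-in for a datastore entity."""

    def __init__(self, duedate_utc):
        # The stored value is assumed to be an aware datetime in UTC (or None).
        self.duedate = duedate_utc

    def local_duedate(self, user_tz):
        """Return the due date converted to the supplied tzinfo."""
        if self.duedate is None:
            return None
        return self.duedate.astimezone(user_tz)


# Assumption: a fixed UTC+10 offset stands in for the user's pytz timezone.
user_tz = datetime.timezone(datetime.timedelta(hours=10))
action = NextAction(datetime.datetime(2009, 12, 1, 20, 0, tzinfo=UTC))
local = action.local_duedate(user_tz)
```

Passing the timezone in as an argument is one way to soften the drawback mentioned above: the entity does the conversion but does not have to know where the user's timezone preference is stored.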
Who knew the single line of code converting a time between timezones could require such thinking? Maybe it is only me who can make it so.
Other Questions
There are some other points to cover about making your application timezone aware. How do you generate a list of timezones for the user to pick from? I have a solution for this, but it isn't elegant. Another good question is what the best way to guess the correct timezone for a user would be. Those who have seen my frequent calls to pytz.timezone() may wonder what the performance penalty is (from what I have read, it is worth looking into).
But this entry is getting long and I might leave these topics for another occasion. Remember that if you have something to add or something to ask, you are welcome to go ahead and do so. I still have significant learning to do in this area and I will keep you posted when I do.
Sunday, December 6, 2009
Supporting Timezones in Google App Engine Pt. 2 - Writing UtcDateTimeProperty
I am covering how I added Timezone support to my web application My Web Brain in a series of posts. Hopefully someone will find the discussion useful or even better contribute ways to achieve the same effect.
- Part 0: The Starting Point - What you get out the box from Google App Engine and the Base Python Environment.
- Part 1: Getting Timezone information with pytz
- Part 2: Making the Google App Datastore more consistent by writing a UtcDatetimeProperty class
- Part 3: Handling timezones throughout your application
As we covered in Part 0: The Starting Point, the datastore does not store the timezone of DateTimeProperty properties defined in a model. If a datetime object has a timezone defined that is not UTC, it will be converted prior to storage to UTC time, but all DateTimeProperty values will be naive - without timezone information - when they are retrieved. That means:
- If you set a naive datetime as the value, it will be retrieved as a naive datetime
- If you set a datetime with a timezone of UTC, it will be retrieved as a naive datetime
- If you set a datetime with a timezone different from UTC, it will be converted to UTC prior to storage and retrieved naive (in UTC).
The primary effect we are looking for is to ensure that no naive datetime values are retrieved from the datastore properties of this type. Handling naive datetimes that are timezone sensitive repeatedly throughout your application is both redundant and dangerous, so we are trying to avoid this and make the assumption that all datastore datetimes are UTC explicit rather than implicit.
A smaller, less consequential effect is to ensure all datetime values passed to the original model property have a UTC timezone set if none would otherwise be present. This could be considered wasteful since naive datetimes set on DateTimeProperties are treated as UTC anyway. The purpose is to make that assumption explicit and to future proof the code against changes in the way the provided datastore properties work.
The code for my UtcDateTimeProperty is below:
#imports
import pytz
from google.appengine.ext import db

class UtcDateTimeProperty(db.DateTimeProperty):
    '''Marks DateTimeProperty values returned from the datastore as UTC. Ensures
    all values destined for the datastore are converted to UTC if marked with an
    alternate Timezone.

    Inspired by
    http://www.letsyouandhimfight.com/2008/04/12/time-zones-in-google-app-engine/
    http://code.google.com/appengine/articles/extending_models.html
    '''
    def get_value_for_datastore(self, model_instance):
        '''Returns the value for writing to the datastore. If value is None,
        return None, else ensure date is converted to UTC. Note Google App
        Engine already does this. Called by datastore
        '''
        date = super(UtcDateTimeProperty,
                     self).get_value_for_datastore(model_instance)
        if date:
            if date.tzinfo:
                return date.astimezone(pytz.utc)
            else:
                return date.replace(tzinfo=pytz.utc)
        else:
            return None

    def make_value_from_datastore(self, value):
        '''Returns the value retrieved from the datastore. Ensures all dates
        are properly marked as UTC if not None'''
        if value is None:
            return None
        else:
            return value.replace(tzinfo=pytz.utc)
The code and comments above should be relatively readable. As stated, the code ensures all datetimes saved to the datastore have an explicit timezone set and, when retrieved, also have an explicit timezone.
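If you want to try the two normalisation rules without an App Engine environment, they can be mirrored as plain functions. This is only a sketch under assumptions: the function names are made up for illustration, and Python 3's `datetime.timezone.utc` stands in for `pytz.utc`.

```python
import datetime

# Assumption: the stdlib UTC object stands in for pytz.utc.
UTC = datetime.timezone.utc


def to_utc_for_storage(value):
    """Mirrors get_value_for_datastore: always hand back an aware UTC value."""
    if value is None:
        return None
    if value.tzinfo is not None:
        return value.astimezone(UTC)    # aware: convert to UTC
    return value.replace(tzinfo=UTC)    # naive: treat it as already UTC


def mark_utc_on_read(value):
    """Mirrors make_value_from_datastore: label a naive value as UTC."""
    if value is None:
        return None
    return value.replace(tzinfo=UTC)


plus_two = datetime.timezone(datetime.timedelta(hours=2))
stored = to_utc_for_storage(
    datetime.datetime(2009, 12, 6, 12, 0, tzinfo=plus_two))
read_back = mark_utc_on_read(datetime.datetime(2009, 12, 6, 10, 0))
```

A value written as noon at UTC+2 and a naive 10:00 read back from storage end up as the same aware UTC datetime, which is the consistency the custom property is aiming for.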
Using the custom property is simple, since it inherits the methods and important behaviour from the DateTimeProperty class:
from model_ext import UtcDateTimeProperty
from google.appengine.ext import db

class MyModelObject(db.Model):
    my_tz_date = UtcDateTimeProperty(required=False)
You could go further and create a true TzDateTimeProperty class that stores and applies a provided timezone. In the case of my application this was not required, so you will not see the code of that custom property class here.
As usual, any questions, corrections or comments are very welcome. In the next part of this series I will look at the practical implementation of timezone support throughout your application.
Wednesday, December 2, 2009
Supporting Timezones in Google App Engine Pt. 1 - Pytz
I am covering how I added Timezone support to my web application My Web Brain in a series of posts. Hopefully someone will find the discussion useful or even better contribute ways to achieve the same effect.
- Part 0: The Starting Point - What you get out the box from Google App Engine and the Base Python Environment.
- Part 1: Getting Timezone information with pytz (this post)
- Part 2: Making the Google App Datastore more consistent by writing a UtcDatetimeProperty class
- Part 3: Handling timezones throughout your application
This post is Part 1, introducing Pytz. In my previous post I described how some of the core Python classes - datetime.datetime and datetime.time - support timezones out of the box. For example, you can create a datetime object with a timezone and then convert it into any other timezone:
my_datetime_in_est = datetime.datetime(day=1,month=12,year=2009, tzinfo=est_tz)
my_datetime_in_pst = my_datetime_in_est.astimezone(pst_tz)
But where do these timezone information (tzinfo) objects come from? Well, you can write your own, defining the algorithm for incorporating a UTC offset and accounting for daylight savings time depending on the time of year (if applicable). But no sensible person wants to adopt the challenge of maintaining their own timezone information for one timezone, let alone for the entire world. It would be nice if you could simply tap into a maintained and open library that had all of the major timezones from around the world.
That is exactly what the pytz library provides. Pytz provides you with a capability to generate any timezone (ie. tzinfo object) it knows about by name:
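import datetime
import pytz
est_tz = pytz.timezone('US/Eastern')
#datetime in eastern standard time
#localize() is used because passing a pytz zone as tzinfo= to the constructor picks up the wrong UTC offset
est_datetime = est_tz.localize(datetime.datetime(day=1, month=12, year=2009))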
Where do the names come from? Pytz has two lists, all_timezones and common_timezones, which you can use to source a set of valid timezones. all_timezones contains a great number of timezones, including many defunct and unused zones. common_timezones is a smaller set of just the timezones still existing today.
To demonstrate, to generate all possible tzinfo objects for the common_timezones list (for caching, obviously - you would not want to do this for every request) and store it in a handy dictionary, you could write:
tzinfo_dict = dict(
[(tzname, pytz.timezone(tzname)) for tzname in pytz.common_timezones]
)
For me, new to Python on anything but App Engine, getting pytz was one of the more trial and error parts of implementing timezone support. The downloadable source code I found on places such as SourceForge was out of date (and as you can imagine, timezone information changes considerably not only year to year but month to month). The solution for me was to use the easy_install command (from Setuptools) to download the Python Egg file to my local installation, then again to unpack the source to a local directory. From there I could cherry pick the library folder itself and add it to my Google App Engine application. From the command line, installing the latest version of pytz is easy:
easy_install --upgrade pytz
The latest version as I write is 2009r.
Unfortunately, although I managed to unpack the egg file, I can not remember how I did it (and I just spent some time trying to recreate the feat). Hopefully someone will present a clue (or a completely better way of getting the library into my App Engine source tree).
On that slightly demoralising note I will conclude this post on pytz. The series on adding timezone support will continue, however. There I will discuss writing a wrapper class for the DateTimeProperty to ensure consistency for timezone aware applications.
See you then - any comments more than welcome.
Sunday, November 29, 2009
Support Timezones in Google App Engine Pt. 0 - The Starting Point
As I mentioned in a previous post, I recently implemented timezone support in my GTD web application. I am writing a series on how I approached this on the off chance it would be useful to someone, or I might encourage someone to come forward with a better solution.
I thought I would start off examining the starting point, including:
- What the relevant functions are from the datetime module that ships with Python and is available on Google App Engine (GAE).
- Using DateTimeProperty, not DateProperty, for timezone sensitive dates in the GAE datastore
- How the GAE datastore behaves with respect to timezones
Vanilla Python includes a number of types for dates and times, including timezone support. The datetime module includes two timezone aware classes - datetime and time. If you do not supply a tzinfo (Timezone Info - information about how a timezone is named and how it relates to UTC) object during creation, your datetime or time is considered 'naive', meaning it is not timezone aware.
import datetime
naive_date = datetime.datetime(day=30, month=11, year=2009)
#est_tz is any tzinfo object for US Eastern time
tz_date = datetime.datetime(day=30, month=11, year=2009, tzinfo=est_tz)
datetime and time objects can be used in arithmetic and comparisons fairly naturally, but it is worth noting this only works where all of the objects used in the operation or comparison are consistently naive or consistently have a timezone set. This makes sense. Some objects in the datetime module are not timezone sensitive, like timedelta.
import datetime
naive_date = datetime.datetime(day=30, month=11, year=2009)
tz_date_est = datetime.datetime(day=30, month=11, year=2009, tzinfo=est_tz)
tz_date_pst = datetime.datetime(day=30, month=11, year=2009, tzinfo=pst_tz)
#this does not work (it raises a TypeError)
is_earlier = naive_date < tz_date_est
#this does
is_earlier = tz_date_est < tz_date_pst
#these both work
later = naive_date + datetime.timedelta(days=1)
earlier = tz_date_est - datetime.timedelta(hours=5)
You can change a naive date into a timezone aware date using the datetime replace() method. If you use the replace method on a datetime that has a timezone set, no conversion occurs. To properly convert times between timezones, use the datetime astimezone() method instead.
naive_date = datetime.datetime(hour=1, day=30, month=11,year=2009)
tz_date_est = naive_date.replace(tzinfo = est_tz)
#replaces the timezone information in the datetime without conversion
#tz_date_est is now 1:00 am, 30th of November
tz_date_pst = tz_date_est.astimezone(pst_tz)
#converts from datetime's timezone to pst
#tz_date_pst is now 10:00pm, 29th of November
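To run the snippets above you need est_tz and pst_tz objects from somewhere. The following is a minimal self-contained sketch, assuming a hand-written `FixedOffset` class (a hypothetical name) with constant offsets and no daylight saving rules, so it is not suitable for real Eastern or Pacific time.

```python
import datetime


class FixedOffset(datetime.tzinfo):
    """Hypothetical hand-written tzinfo: constant offset, no daylight saving."""

    def __init__(self, hours, name):
        self._offset = datetime.timedelta(hours=hours)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        # A real zone would need its daylight saving rules here.
        return datetime.timedelta(0)

    def tzname(self, dt):
        return self._name


est_tz = FixedOffset(-5, 'EST')
pst_tz = FixedOffset(-8, 'PST')

naive_date = datetime.datetime(hour=1, day=30, month=11, year=2009)
tz_date_est = naive_date.replace(tzinfo=est_tz)    # relabels; clock unchanged
tz_date_pst = tz_date_est.astimezone(pst_tz)       # converts to Pacific time
relabelled = tz_date_est.replace(tzinfo=pst_tz)    # still 1am, different instant

# Mixing naive and aware values in an ordering comparison is an error.
try:
    naive_date < tz_date_est
    mixed_comparison_failed = False
except TypeError:
    mixed_comparison_failed = True
```

The converted value compares equal to the original because both describe the same instant, while the relabelled value keeps the 1:00 am clock reading but points at a moment three hours later.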
The missing link is the tzinfo (timezone information) objects. Python does not ship with them and the expectation appears to be that you or others will write your own. Creating your own tzinfo is easy, but only assuming you know the exact rules for the timezone (for example, when daylight savings is in effect), have only a limited number to create, and are prepared to maintain it moving forward for any future changes. For an up to date and maintained list of timezones you can use pytz, which I will cover in more depth in a future post.
It is nice to have these timezone aware objects to work with to spare reinventing the wheel. To be useful though, in most use cases you need to persist the data, and that is where the Google App Engine datastore becomes involved. How does the datastore handle the persistence of timezone aware datetime objects?
Datetimes are persisted as part of the application model using the DateTimeProperty. All of Google App Engine runs on UTC time (ie. datetime.datetime.now() returns the current UTC time). It is natural then (and in most cases best practice) that the datastore is optimised to work in UTC. The interface is not completely consistent though:
- If you persist a naive datetime, a naive datetime will be returned (as you might expect)
- If you persist a datetime in UTC, a naive datetime will be returned (not quite as you would expect)
- If you persist a datetime in some other timezone, it will be converted to UTC and still returned as a naive datetime.
A naive datetime will always be read back from the datastore, and it is not obvious to the reader what the original timezone was, whether it was originally specified or even if the datastore has performed a conversion at the time of persistence. This means you need to be careful about how timezones are treated at both the writing and reading stages of your application. The simplest approach would be to:
- Use the datastore for UTC date times only. In practice this means ensuring all DateTimeProperty properties have a timezone set where they can differ from UTC. The datastore will kindly ensure all necessary conversion is done, but will not complain about naive datetimes (so be careful).
- When retrieving date times, they will be naive, but assuming you ensure all datetimes entering the system are marked with the correct timezone, you can assume they will be UTC and should probably ensure the timezone information is set correctly before using them.
Later in this series I will write a UtcDateTimeProperty custom model class, which does little except ensure timezones are made explicit in both the datastore reads and writes. It should be quite doable to craft a TzDateTimeProperty which also stores the original timezone and converts datetimes back to the correct timezone when they are read.
In the meantime I have one rather obvious piece of advice: If you are using a date (only) in your model and you know it will be timezone sensitive, use a DateTimeProperty instead of the simple DateProperty. The underlying datetime.date object is always naive, and even when you know the relevant timezone, converting it to another timezone is ambiguous.
In my application, I have a property duedate. On behalf of my users I am not interested in the time of day a due date might correspond with, but in order to properly serve timezone sensitive due dates (where a due date starts and ends in the correct timezone), a simple DateProperty - my original naive choice (that's a pun, by the way) - was not sufficient. Even knowing the timezone, a single date may correspond with more than one date in different timezones.
Because of this, if you are preparing to include timezone support for date only properties, you should convert them to DateTimeProperty objects. As with all model changes, you need to be careful to convert existing data to the correct type. The following code checks to see if an entity possesses data for a property duedate. If it does, the data-type is checked and converted:
if entity.duedate:
    if not isinstance(entity.duedate, datetime.datetime):
        entity.duedate = datetime.datetime.combine(entity.duedate,
            datetime.time(hour=0, minute=0, second=0))
Note we use the combine() method of datetime.datetime in order to combine a date with a time. The default time you use will vary depending on your application. The above example could be improved perhaps by using replace() to put the result into the user's timezone, so that it logically represents a real date from their perspective.
That concludes the topics I wanted to cover in this post. I plan the next in this series to talk about the pytz module and how to get your hands on prewritten and maintained tzinfo objects.
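The improvement suggested above, placing the converted value in the user's timezone, might look something like the sketch below. It rests on assumptions: `date_to_utc_datetime` is a hypothetical helper, and a fixed-offset `datetime.timezone` (Python 3) stands in for the user's real zone, which is the only reason `replace()` is safe here.

```python
import datetime

UTC = datetime.timezone.utc


def date_to_utc_datetime(value, user_tz, at=datetime.time(0, 0)):
    """Convert a stored date into an aware UTC datetime.

    datetime.datetime is a subclass of datetime.date, so the datetime
    check has to come first to leave already-converted values alone.
    """
    if value is None:
        return None
    if isinstance(value, datetime.datetime):
        return value
    local = datetime.datetime.combine(value, at).replace(tzinfo=user_tz)
    return local.astimezone(UTC)


# Assumption: a fixed UTC+10 offset stands in for the user's timezone.
user_tz = datetime.timezone(datetime.timedelta(hours=10))
converted = date_to_utc_datetime(datetime.date(2009, 11, 30), user_tz)
untouched = date_to_utc_datetime(converted, user_tz)
```

Midnight on 30 November for this user is 2:00 pm on 29 November in UTC, which shows why a bare date cannot be mapped to a single UTC date without also choosing a time and a zone.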
Supporting Timezones in Google App Engine
One of the features in the latest iteration of My Web Brain is timezone support. Google App Engine does not provide ideal support for timezones. DateTimeProperty objects with timezone information are converted by the datastore into UTC. When datetime objects are retrieved, no timezone information is present.
Many people using Python with Google App Engine are old Python hands. Some have experience with implementing timezone support. I am neither, so after some research and experimentation I thought I would write about how I managed to build timezone support into my application. Apologies for any serious errors - I hope someone corrects me.
Note: All of my examples are taken from a machine running Windows 7.
I intend to cover the subject in a series of posts:
- Part 0: The Starting Point - What you get out the box from Google App Engine and the Base Python Environment.
- Part 1: Getting Timezone information with pytz
- Part 2: Making the Google App Datastore more consistent by writing a UtcDatetimeProperty class
- Part 3: Handling timezones throughout your application