Cloud Computing Economies of Scale

Just watched a very interesting session on Cloud Computing by James Hamilton from MIX10.

http://channel9.msdn.com/events/MIX/MIX10/EX01

It makes a really interesting case for why you should use cloud computing instead of buying servers. Personally I already run my personal stuff in EC2, S3 and GAE, and I will not go back to running an SMTP server in the closet.

Also, my ISP had a 10-day outage this summer, and email doesn’t like that 😉

Google App Engine ReferenceProperty and HTML5 local storage

The best thing about my job is that I work with the same things that I can spend hours doing in my free time. Too bad you don’t get 40 hours a week of free time.

It’s been a while, but I have finally made some progress.

I had some trouble with the Datastore (the Bigtable-based database that you use in Google App Engine). I put pretty large arrays with weather data in a db.BlobProperty, but when I read the entities back from the database GAE ran out of memory, even though I didn’t touch the blob. After reading up on this I found out that I had to move the blob into a separate model and point to it with a db.ReferenceProperty.

As always the manual is not that clear so here is some example code:

from google.appengine.ext import db

class ForecastData(db.Model):
    # The large blob lives in its own entity
    values = db.BlobProperty()

class Forecast(db.Model):
    firstGridPoint = db.GeoPtProperty()
    lastGridPoint = db.GeoPtProperty()
    increment = db.FloatProperty()
    parameter = db.StringProperty()
    # Only the key of the ForecastData entity is stored here
    forecast_data = db.ReferenceProperty(ForecastData)
    reference_time = db.DateTimeProperty()
    forecast_time = db.DateTimeProperty()
    insert_time = db.DateTimeProperty(auto_now_add=True)

I put my blob in a separate model and referenced it with a db.ReferenceProperty(ModelName). Below is an example of putting data in the Datastore.

    # Create the data object
    forecast_data = ForecastData()
    forecast_data.values = values

    # Put it in the database; put() returns the entity's key
    forecast_data_key = forecast_data.put()

    # Create the forecast object
    forecast = Forecast()
    # Reference the data by its key
    forecast.forecast_data = forecast_data_key
    # Set the other properties as needed, then store the forecast
    forecast.put()

And getting the data is done like this:

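query = db.GqlQuery("SELECT * FROM Forecast WHERE forecast_time = :1", forecast_time)
forecast = query.get()  # first matching entity, or None
if forecast:
    # Read the stored key without loading the referenced entity
    data_key = Forecast.forecast_data.get_value_for_datastore(forecast)
    # Load the blob only now that it is actually needed
    values = ForecastData.get(data_key).values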

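I get the forecast object from the database with a GQL query. The get_value_for_datastore method returns the key stored in the reference property without loading the referenced entity, and ForecastData.get then loads the blob only when I actually need it.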

After this the application is much faster.

To minimize the data transferred I’m using HTML5 local storage (a very good guide to HTML5 can be found here).

To put something in the local storage:

window.localStorage.setItem('key', value);

and to get it back (even if the browser has been closed):

window.localStorage.getItem('key');

This is a very simple key/value store, and the values are stored as strings. Another useful method is clear(), which removes all saved values.

I’m hoping to launch the site for others to try out very soon but I want to get some more features in place.

Until then here is an up-to-date screenshot:

Weather and clouds

Ever since I wrote my master’s thesis I have had an interest in weather. My thesis was about metadata updates in a real-time database at the Swedish Meteorological and Hydrological Institute (SMHI). The thesis was mainly about weather stations that deliver observations used when running weather models.

After my thesis I ended up as a consultant, but for the last 2.5 years I have been back at SMHI doing various work, mostly backend. I have worked with real-time deliveries of weather data along with processing, visualisation etc. I worked with Diana (diana.met.no), a weather workstation software developed by Met.no.

Weather is a good thing since it is of interest to almost everyone. Telling someone that you’re a system developer is not a good conversation starter, but saying that you work with weather will at least lead to a couple of complaints about how bad the forecasts are. From a system engineering point of view weather is good since there are huge quantities of data to process, and it is a good source for testing new technologies.

Last fall I did a couple of tests using Python + CUDA + GRIBAPI for reading GRIB data (GRIB, short for gridded binary, is the format used for weather data) and using CUDA to run marching squares and render images. As always there is too little spare time to finish all the small projects, but at least I got a feel for CUDA and I hope to be able to use it in the future.
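For anyone who has not met marching squares before, here is a minimal pure-Python sketch of its first step: classifying every 2x2 cell of a grid against a contour level. This is only an illustration, not the CUDA code from those tests; the function name cell_cases, the bit ordering and the sample field are all assumptions made up for this example.

```python
def cell_cases(grid, level):
    """Return the marching squares case index (0-15) for every 2x2 cell.

    Bit order (an arbitrary choice for this sketch):
    8 = top-left, 4 = top-right, 2 = bottom-right, 1 = bottom-left.
    A bit is set when that corner's value is at or above `level`.
    """
    cases = []
    for r in range(len(grid) - 1):
        row_cases = []
        for c in range(len(grid[r]) - 1):
            case = 0
            if grid[r][c] >= level:
                case |= 8
            if grid[r][c + 1] >= level:
                case |= 4
            if grid[r + 1][c + 1] >= level:
                case |= 2
            if grid[r + 1][c] >= level:
                case |= 1
            row_cases.append(case)
        cases.append(row_cases)
    return cases


# Hypothetical 3 x 3 field with a single high value in the centre
field = [
    [0, 0, 0],
    [0, 5, 0],
    [0, 0, 0],
]
cases = cell_cases(field, 3)
```

Cells with case 0 or 15 lie entirely below or above the level and contain no contour. Every other case maps to one or two line segments, and since each cell is independent of its neighbours the work spreads nicely over many GPU threads.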

On to the clouds. Two years ago I went to EclipseCon and learned a lot about EC2. The founders of both Smugmug and Stackoverflow praised it, so after coming home I started using it at work. Today we have a couple of machines running in EC2 together with S3 for storage and RDS for databases. My primary server at home has handed over a lot of the work to a free-tier EC2 instance.

I started using Python last fall and I like it a lot. Being a shell scripting ninja, I like regular expressions and {ba,k,c}sh, but reusing shell scripts is not that easy, and Python is perfect for this.

Getting to the point, my latest project is to learn Google App Engine, and I will do this by building a weather viewer using free weather data from NOAA. Python and probably some shell scripting will download the data and upload it to GAE, and then I will use Django to render a site with Google Maps and my weather on it.

I want to build a free service that displays the raw weather, and I will build some kind of API so that people can get forecast values for latitude and longitude points.
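A minimal sketch of what such a point lookup might boil down to, assuming a regular latitude/longitude grid stored row by row and described by a first grid point and an increment. The function name nearest_grid_index, the row-major layout and the sample numbers are assumptions for illustration only, not the actual API.

```python
def nearest_grid_index(lat, lon, first_lat, first_lon, increment, n_rows, n_cols):
    """Return the flat index of the grid point nearest to (lat, lon).

    Assumes a regular grid stored row by row, starting at
    (first_lat, first_lon) and stepping by `increment` degrees
    in both directions. Returns None if the point is outside the grid.
    """
    row = int(round((lat - first_lat) / increment))
    col = int(round((lon - first_lon) / increment))
    if not (0 <= row < n_rows and 0 <= col < n_cols):
        return None
    return row * n_cols + col


# Hypothetical 3 x 4 grid starting at 55N 10E with 0.5 degree spacing
values = [float(i) for i in range(12)]
index = nearest_grid_index(55.6, 11.1, 55.0, 10.0, 0.5, 3, 4)
print(values[index])
```

Picking the nearest grid point is the simplest option. Interpolating between the four surrounding points would probably give smoother values, at the cost of a bit more arithmetic per request.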

Why? Mostly to learn Google App Engine but also since I know that there are people in need of such an API.