We’ve always been told to love our geeks, because they are the ones who allow us to publish whatever random thoughts we may have, any editorial content or strong analysis of the European Parliament. Poor editorial guys (like me) don’t get far with pen and paper these days. So, here comes Mathieu’s first contribution to this blog. He’s our Facebook developer and he will now explain to you how he developed our world-acclaimed Facebook Chat application. Once you’ve read this post, you’ll understand why I always, always, always specify that I am an editorial guy and NOT a technical one.
It was suggested that I write a blog post about my experience here at the European Parliament, inside the WebComm unit, and especially about the development of the Facebook Chat application we just finished.
I don’t feel terribly confident about writing non-scientific content, especially with all those talented editors around me, but I like challenges and I thought it might be an interesting experiment.
So where to start? First of all, let’s break some myths here. The European institutions, and especially the European Parliament, are not only composed of old-lazy-boring-suit-wearing guys. I mean, I have only discovered a small part of the European Parliament, the WebComm unit, part of DG Comm, and all I could see were young, motivated, dynamic people. This is quite the opposite of my preconceived ideas and I really like to be surprised. Of course we have to face a little bit of administrative slowness for some things, but in general I’m able to do my job in a smart, agile and constructive way. So that’s done.
About the Facebook chat application: I can say it has been quite a challenge, for mainly 3 reasons:
- Real-time web application
- Scalability
- Agility needed
1) Real-time web application
Traditional web applications are not designed to be “real-time”. By “real-time” I mean that all the users connected to the application must be notified of every change as soon as it is made (in real time), and not only when they refresh the page. Maybe it’s time to do a short (and incomplete) history of web development:
1) Static pages: at first there were static HTML pages. Someone would write HTML* code in text files and save them in a directory. Other people could then connect to www.someserver.com; their browser would start from the index.html file located at the root of the web directory, parse it and display it. The user would typically click a link, which would load another HTML file, e.g. contacts.html, and see the contacts page of the site.
This had many disadvantages :
1) HTML is not easy for everybody to write, so you need specialists for every page produced
2) You had to write HTML for each and every thing: if you wanted a gallery with 200 items, you would have 200 HTML files or 1 gigantic one.
3) Content was really static, meaning static text, borders and pictures**
To improve on this, server-side (scripting) technologies were developed. In fact, you don’t have to write every line of the HTML the user sees: you can write code that generates the HTML files. Let’s say you have an e-shop. You don’t have to create an HTML page for every article; you create an article template and an administration site where people can enter information about the products. You then associate URLs with the different products, so that www.eshop.com/products/1 or www.eshop.com/products/2 returns an HTML file generated “on the fly” by the server, using the template and filling in data from the administration site.
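To make the template idea a bit more concrete, here is a minimal sketch in Python. It is not the code of any real e-shop: the PRODUCTS dictionary, the template text and the render_product_page function are all made up for illustration, and the dictionary simply stands in for whatever the administration site would store.

```python
from string import Template

# Hypothetical product data, standing in for what the administration
# site would store in a database.
PRODUCTS = {
    1: {"name": "Coffee mug", "price": "7.50"},
    2: {"name": "T-shirt", "price": "15.00"},
}

# One single template shared by every product page.
PRODUCT_TEMPLATE = Template(
    "<html><body><h1>$name</h1><p>Price: $price EUR</p></body></html>"
)


def render_product_page(path):
    """Return the HTML for a path such as '/products/1', or None if unknown."""
    prefix = "/products/"
    if not path.startswith(prefix):
        return None
    try:
        product_id = int(path[len(prefix):])
    except ValueError:
        return None
    product = PRODUCTS.get(product_id)
    if product is None:
        return None
    # Fill the template "on the fly" with the data of this product.
    return PRODUCT_TEMPLATE.substitute(product)


print(render_product_page("/products/1"))
```

Adding a third product only means adding one entry to the data; no new HTML file has to be written.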
Even so, the browser only gets new content when it asks for it. For a chat, a solution could be to have each client connect to the server every second or so to ask if there is a new message. The problem with this, and that’s where we get to the second difficulty, is that “it doesn’t scale”! First let’s explain the term: “it doesn’t scale” is a shortcut for something like: it works for 5 users but it doesn’t scale up to 10000 users.
Why doesn’t this method of having each client ping the server every second scale? Because if you have 10000 clients connected, that makes 10000 connections per second to the server just for the “real-time” features. That doesn’t even include the users loading the page, interacting, liking and sending messages, and you get those 10000 connections per second even if nothing happens during that time.
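The arithmetic behind “it doesn’t scale” can be written down in a few lines of Python. This is only a back-of-the-envelope sketch: the function names are mine, and the push figure assumes that every new message is delivered once to every connected client.

```python
def polling_requests_per_second(clients, poll_interval_seconds):
    """Requests hitting the server when every client polls at a fixed interval."""
    return clients / poll_interval_seconds


def push_deliveries_per_second(clients, new_messages_per_second):
    """Deliveries when the server only talks to clients when a message arrives.

    Assumption: each new message is sent once to every connected client.
    """
    return clients * new_messages_per_second


for clients in (5, 10000):
    polling = polling_requests_per_second(clients, poll_interval_seconds=1.0)
    idle_push = push_deliveries_per_second(clients, new_messages_per_second=0)
    print(clients, "clients:", polling, "polling requests/s,",
          idle_push, "push deliveries/s while the chat is idle")
```

With polling, the load depends on the number of clients even when nobody is talking. With push, an idle chat costs nothing apart from keeping the connections open.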
The solution needs to be able to send messages to the clients as soon as (and only when) a new message arrives. That’s where “server push” comes in. It’s actually an umbrella term for the possibility of sending data from the server to the client when new data arrives, and not when the user refreshes. Real push is called “HTTP streaming” and is not supported by older browsers, aka IE. So tweakers came up with a solution called “long-polling”: the browser opens a connection to the server, and the server waits before answering, so the connection is kept open. When fresh data arrives, the server answers as it would have done directly. When the browser receives the data, it handles it and then reopens a “long-polling” connection to the server. You can visualize all this in Figure 1.
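For the curious, the long-polling loop can be imitated in plain Python without any network at all. This is a toy sketch and not how APE or our application is implemented: the LongPollServer class, its publish and poll methods and the client_loop function are invented for the example, a thread plays the browser, and a queue plays the open connection of a single client.

```python
import queue
import threading
import time


class LongPollServer:
    """Toy in-process stand-in for a long-polling endpoint (no real HTTP)."""

    def __init__(self):
        self._messages = queue.Queue()

    def publish(self, message):
        """Called when fresh data arrives on the server."""
        self._messages.put(message)

    def poll(self, timeout):
        """Hold the 'request' open until a message arrives or the timeout expires."""
        try:
            return self._messages.get(timeout=timeout)
        except queue.Empty:
            return None  # nothing happened, the client will simply ask again


def client_loop(server, received, expected_count, timeout=1.0):
    """Play the browser: handle each answer, then reopen a long-polling request."""
    while len(received) < expected_count:
        message = server.poll(timeout)
        if message is not None:
            received.append(message)


def run_demo():
    server = LongPollServer()
    received = []
    client = threading.Thread(target=client_loop, args=(server, received, 2))
    client.start()
    for text in ("hello", "world"):
        time.sleep(0.1)  # the chat is quiet for a moment
        server.publish(text)
    client.join()
    return received


print(run_demo())  # prints ['hello', 'world']
```

The client only gets an answer when there is something to say (or when the timeout expires), which is the whole difference from asking every second.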
After a lot of experimentation and researching I chose the following setup:
nginx + uwsgi + python/django + APE =)
What are all these geeky terms ?
Nginx is a next-gen web server that’s capable of handling a ridiculously huge number of connections on ridiculously small hardware. It is, in my opinion, a serious competitor to the fat Apache out there.
Uwsgi is a “fast, self-healing and developer/sysadmin-friendly application container server coded in pure C” and it executes python code that runs the django framework.
Python is an interpreted, object-oriented, high-level programming language with dynamic semantics. What this all means is that you can be a lot more productive with Python than with low-level compiled languages (because high-level means you don’t care about the low-level stupid stuff, and interpreted means that you don’t have to recompile the program after every change, it just runs directly).
Django is “The web framework for perfectionists with deadlines” or a high-level Python Web framework that encourages rapid development and clean, pragmatic design.
APE is the Ajax Push Engine, one of the more mature libraries out there for doing server push. It can be configured to use different transports and is also written in C (read: it’s really fast).
So all the business logic and the server side were developed in Django.
So now we have had enough “nerderies”, I think. Feel free to contact me if you have some technical questions.
That’s where I get to my last point: agile development. We didn’t want to come up with the “wonderfullest chat on paper” that would turn out to be unusable in practice. So we developed features progressively and tested them among ourselves to feel and touch the thing, changing it multiple times. And that’s what agile is all about: small, realistic iterations that are validated. Or in other words: if you take a lot of small steps, you can only take small steps in the wrong direction, but if you try to take a huge step, you can end up taking a huge step in the wrong direction and lose a lot of time and money.
Agile thus equals programmers not programming the wrong thing, but also not going crazy during long periods alone at their desks. And small iterations mean frequent tests and meetings, and that’s where we get to the most interesting part of all:
Humans working together to share with other humans their dream: democracy and freedom!
As a bonus, here is a video showing the coding process of our Facebook chat application.
Your fellow Facebook developer.
* think of HTML as a language to lay out text, images and frames on any screen
** pictures were introduced in 1994