Build highly scalable applications with PyF

A few days ago, we released PyF 2.0.

- Yeah, that's nice and all, but what's PyF anyway?

Right, I should have probably started with that. PyF is a pure Python framework for writing highly scalable data processing applications. PyF is Free software, distributed under the terms of the MIT license.

- Scalable, you say? How scalable exactly?

Well, PyF is based on flow-based programming. That means that instead of processing « a certain quantity of data », we process a « flow » of data, so that at any point, we only ever have one object in memory, no matter how much data we process in total. That's right, mining your huge customer database and generating reports with PyF will not bring your servers to their knees.

For the Python devs in here, we use Python generators everywhere. You can see it this way: each unit of the whole processing chain takes a generator as input and yields values as soon as they are processed. We could even handle a never-ending flow of input data and keep processing it, yielding each item one after the other!
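
To make the generator idea concrete, here is a minimal sketch in plain Python. It does not use PyF's own API: the function names (read_records, keep_large, to_report_line) and the record layout are made up for illustration. Each step pulls one item from the previous step, handles it, and yields it right away, so the endless source is never loaded in full.

```python
import itertools


def read_records():
    """Hypothetical source: an endless flow of customer records."""
    for i in itertools.count(1):
        yield {"id": i, "amount": i * 10}


def keep_large(records, threshold):
    """Pass along only the records whose amount reaches the threshold."""
    for record in records:
        if record["amount"] >= threshold:
            yield record


def to_report_line(records):
    """Turn each record into one line of a report."""
    for record in records:
        yield "customer %d: %d" % (record["id"], record["amount"])


# Chain the steps, output to input. Nothing runs until we ask for items.
flow = to_report_line(keep_large(read_records(), threshold=30))

# Take only the first three report lines from the never-ending flow.
first_three = list(itertools.islice(flow, 3))
print(first_three)
```

Because every step is lazy, asking for three report lines only reads as many records from the source as are needed to produce them.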

- Wow, that seems cool! How do I use it?

It depends on what you want to do. PyF is composed of several layers. At the low level, you have only the basic subset of core functions that will help you write flow-based applications.

At the highest level though, you will find a full-blown web application that allows you to graphically design your processing chain (we call it a tube) by dragging and dropping processing units (we call them components) and chaining them, output to input. We have several default generic components that can be used to do all sorts of processing and reporting already, and it is pretty easy to write your own if necessary (we will gladly help in any case). We even have a built-in scheduler so you can specify when to automatically launch your processes!
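
If you wonder what "chaining components, output to input" looks like in code, here is a rough sketch in plain Python. It only borrows the words tube and component from above. The make_tube helper and the two toy components are assumptions for illustration, not PyF's actual API.

```python
from functools import reduce


def make_tube(*components):
    """Hypothetical helper: feed each component's output to the next one."""
    def tube(source):
        # Wrap the source in each component, in order.
        return reduce(lambda flow, component: component(flow), components, source)
    return tube


def strip_blank(lines):
    """Toy component: drop empty or whitespace-only lines."""
    for line in lines:
        if line.strip():
            yield line


def upper(lines):
    """Toy component: upper-case every line."""
    for line in lines:
        yield line.upper()


tube = make_tube(strip_blank, upper)
result = list(tube(["alice", "", "bob"]))
print(result)
```

Since each component is just a generator function, writing a new one means writing one more function that takes an iterable and yields items.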

We wrote a simple tutorial to get PyF, and a series of tutorials to help you take your first steps.

- OK, you got me hooked. Where can I find more information?

I'm glad you asked. :-)

The project home page is at http://pyfproject.org

We already have some documentation which should be more than enough to get you started, although we are working on making it more comprehensive.

If you have any questions, come hang out on our mailing list or our IRC channel.

PyF is still a very young project, but we have been dogfooding it at work from day one. We are very friendly and welcoming, so don't be afraid to join us, report bugs, submit patches, or simply use it.

On a more personal note, PyF is the first big FOSS project I have worked on during $dayjob. I'm about to change both job and life pretty soon, but don't worry, the project is carried by some dedicated and much more talented people (two of them being core contributors to TurboGears). I will try to stay close to it since I grew quite attached to the project, and given how useful it can be, I will no doubt be needing it for a future job anyway. :-)