B2.0.0 background
=================

THIS WE NEED TO MAKE APPENDIX A. We can cannibalise it as we like for other sections. But the message from Stockholm is, 'Don't give them a background like this. The evaluators get bored.'

The relevance of Python
-----------------------

Python is a portable, interpreted, object-oriented Very-High-Level Language (VHLL). Its development started in 1990 at CWI (Centrum voor Wiskunde en Informatica), the National Research Institute for Mathematics and Computer Science in the Netherlands. Its principal author is Guido van Rossum, a Dutch citizen currently living in the United States. It is a Free/Open Source language. The most recent version of the language is Python 2.3, released under the Python Software Foundation License, which is approved by both the Open Source Initiative and the Free Software Foundation.

FIXME: Include this as a reference. http://www.opensource.org/licenses/PythonSoftFoundation.php

Python has tens, perhaps hundreds, of thousands of active developers worldwide, which makes it one of the top ten most popular programming languages in the world. [FIXME Footnote -- see section XXX and Appendix XXX for support for this assertion]. Some other languages, specifically C, C++, Java, Perl, and Visual Basic, have even larger user bases, but they are either proprietary or rather low-level languages. On the other hand, the languages which most excite the Computer Science research community -- Self, Lisp, Haskell, Limbo, ML, and so on -- are nowhere on this list, yet they are the targets of most European academic research and innovation. Thus European economic competitiveness suffers: the innovative research lives in academia, trapped in languages that are rarely used for commercial development.

Of those more popular languages, two, Java and Visual Basic, are proprietary. Sun Microsystems owns Java, and Microsoft owns Visual Basic.
Any company which writes its software in Java or Visual Basic is at the mercy of these large American companies. And this is a real, not a theoretical, threat. In 2002, Microsoft announced that it would no longer support Visual Basic 6.0 after the year 2005. All Visual Basic developers have been told to convert their code to run under Microsoft's new .NET framework. In 2001, after settling a lawsuit with Sun Microsystems, Microsoft immediately stopped supporting its Visual J++ language, which was meant to be a direct competitor to Java. Microsoft makes these decisions because they make business sense for Microsoft, regardless of the effects on businesses that develop software using Microsoft's proprietary products.

European SMEs are moving to Free/Open Source platforms
------------------------------------------------------

In the face of these threats to the very survival of their businesses, European SMEs are moving to Free/Open Source languages such as Python. In 2002, at EuroPython, the European Python Community Conference (www.europython.org), a group of SMEs that rely on the Python programming language came together to form the Python Business Forum (www.python-in-business.org).

FIXME! Include the bylaws of the PBF, its Board of Directors and the like as an Appendix.

Advancing the Python platform
-----------------------------

While each SME member of the Python Business Forum has sufficient faith in the Python programming language to use it for the development of its own projects, it was agreed that there are defects in the current implementation of the language. The two most often cited were that Python was too large for embedded applications and applications designed for handhelds, and that the interpreter itself ran too slowly.

Developers of embedded systems intended to run on tiny machines would like a language with a 'smaller footprint'.
They would like to strip out of the language everything which they do not need and run with the bare-bones minimum. This is hard to do in any language, and Python was not implemented with this goal in mind.

Another goal that was not paramount in Python's design and implementation was execution speed. Python is a dynamically-typed, late-binding, interpreted language. While this has proved to provide an extremely productive development environment, execution is sometimes not fast enough. Today, optimisation of such high-level languages must be done at run-time, and they are notoriously more difficult to optimise than statically typed, early-binding compiled languages such as C or C++.

A number of people and factors then came together to start what is now one of the most promising high-level-language projects.

Some high-profile research
--------------------------

Independently, some researchers who worked with Python were pondering writing an implementation of Python in Python itself. This group included Armin Rigo, author of Psyco (http://psyco.sourceforge.net/introduction.html), a specialising Just-In-Time compiler for Python. He is intimately familiar with both Python internals and advanced research in compilers and runtime systems, and saw a Python implementation in Python itself as a chance to put the two fields together. It is useful to quote in full the goals of Psyco as stated on his webpage:

My goal in programming Psyco is to contribute to reduce the following wide gap between academic computer science and industrial programming tools.

While the former develops a number of programming languages with very cool semantics and features, the latter stick with low-level languages principally for performance reasons, on the ground that the higher the level of a language, the slower it is.
Although clearly justified in practise, this belief is theoretically false, and even completely inverted --- for large, evolving systems like a whole operating system and its applications, high-level programming can deliver much higher performances.

The new class of languages called 'dynamic scripting languages', of which Python is an example, is semantically close to long-studied languages like Lisp. The constraints behind their designs are however different: some high-level languages can be relatively well statically compiled, we can do some type inference, and so on, whereas with Python it is much harder --- the design goal was different. We now have powerful enough machines to stick with interpretation for a number of applications. This, of course, contributes to the common belief that high-level languages are terribly slow.

Psyco is both an academic and an industrial project. It is an academic experiment testing some new techniques in the field of on-line specialization. It develops an industrially useful performance benefit for Python. And first of all it is a modest step towards:

High-level languages are faster than low-level ones!

FIXME Footnote to: http://psyco.sourceforge.net/introduction.html

Although Armin Rigo showed with his 'Psyco' project that higher-level languages can actually be optimized to be as fast as or faster than C, he was severely limited by the fact that Python is itself implemented in C, like almost all other languages in industrial use today. No real-life project had tried to go all the way and 'optimize high-level code down to machine code'. Christian Tismer, with his 'Stackless' project, had already come to a similar conclusion from a more industrial viewpoint: it is difficult to advance language technology while relying on a large C code base.
FIXME Footnote to: http://www.stackless.com/

Some mailing list discussions
-----------------------------

On the German Python mailing list an independent discussion evolved in which developers and academics pondered the possibility of developing a 'minimal' Python implemented in Python itself. Most notably, Christian Tismer, author of an industrial-use extension ('Stackless') of the Python language, noted in a postscript to one of his mails that a minimized version of Python could provide a new base from which to advance the language. His extensions to the language had already proved useful to companies that needed a way to have millions of active objects, and he had a branch of CPython to make this possible.

Some organizational experience
------------------------------

Meanwhile, Holger Krekel had joined the Python community in 2002. For some years he had designed platform architectures and consulted for the CEOs of some large banking centers in Europe. While participating in two 'coding sprints' for the Zope3 web platform (the intended successor of the successful Zope web platform), he realized that agile 'sprints', in combination with the rapid development language Python, provide an extremely productive way of communicating about and coding complex projects where traditional, slow-moving methods often fail. At a Sprint, a group of people assemble to write code and practise Agile software development techniques, such as Pair Programming. Not only is this a lot of fun, it is also a way to transmit knowledge and enthusiasm throughout the community.

ASK_STOCKHOLM Do we need to discuss Sprints more?

Holger Krekel, seeing the opportunity to launch the PyPy project with Armin Rigo and Christian Tismer, offered to organize the first one-week meeting, the 'Sprint towards a minimal Python'. Soon many interested developers joined, and intense academic and practical planning ensued.
Just a few weeks later the Sprint took place at 'Trillke-Gut', a castle-like building in Germany, bringing together some key developers, among them Michael Hudson, the release manager of version 2.2.1 of Python. Here is the mail that started this now rapidly evolving project: http://starship.python.net/pipermail/python-de/2003q1/002925.html

Some promising open-source development tools
--------------------------------------------

From the beginning, the PyPy developers were committed to using and integrating the most promising open-source technologies. Jens-Uwe Mager, the retired CTO of Helios (http://www.helios.de), attended the Sprint and helped set up a state-of-the-art open-source development environment. With his 12 years of experience setting up and leading an SME that is one of the worldwide leaders in print-preprocessing technology, he helped organise the development and net-connectivity for the various web services needed by the PyPy developers.

Research, pragmatism and industry experience combined
-----------------------------------------------------

This combination of research, applicability and industrial experience was precisely what the Python Business Forum members had been looking for: academics, developers and practitioners conversant with the latest in language and platform architecture, who were interested in producing the Python interpreter which the members needed. Laura Creighton and Jacob Hallén from the PBF attended the Sprint and began participating in the project.

The group soon decided to test their proof of concept, and their ability to work together, by developing a working prototype. They would meet for Sprints in their own private time and work on the prototype. After the first Sprint at Trillke-Gut in Hildesheim, Jacob Hallén and Laura Creighton organized the next Sprint at AB Strakt in Gothenburg, Sweden.
At this Sprint, Samuele Pedroni, the lead developer of Jython (a tight industrial-use integration of Java and Python), joined the project, not only because it was relevant to his own computer science research interests, but also to see if PyPy would help with his own Java-integration project. With all these experienced people at work, prototyping went quickly, and by the end of the week you could already run simple Python programs within PyPy. PyPy had gone from being 'a nice idea' to 'something we knew we could do'.

The third Sprint was organized by interested developers in Belgium and held June 20-24 at the University in Louvain-La-Neuve. We invited Guido van Rossum, the inventor of Python, to attend. He not only attended the Belgium Sprint but announced a few days later at the EuroPython conference that PyPy had a high priority on his list of 'dreams he hoped would come true' and that he had enjoyed sprinting with us a lot.

By the end of the third one-week Sprint at the University in Louvain-La-Neuve, the PyPy project had already produced a fully working prototype. There existed a working interpreter and a standard implementation of 90% of the Python types, as well as advanced language features such as nested scopes, generators and metaclasses. Armin Rigo and Guido van Rossum had started work on a type-inference engine, which is the foundation for the final missing step: generating a native machine-level version of Python from its high-level PyPy implementation. Perhaps most importantly, we had the enthusiastic support of the Python community at the EuroPython conference.

No work had been done on actually optimising the code, so it ran around 30,000 times slower than the existing CPython implementation, but this was expected from the start. Nevertheless, for a proof of concept representing approximately four weeks' work in total by a group of about a dozen people, it was clearly a success. It was time to look for funding.
< While some people suggested applying to DARPA for funding, we realized that most of us were rooted in Europe and that it would make more sense to look for European possibilities. I don't think that this is wise. Why remind them that we could get somebody else to fund it, maybe? Besides, everybody on the list of people they are getting is in Europe, not just most ... >

On June 17th, the 2nd Call of the Information Society Technologies [IST] Priority went out. Included in it was IST-2002-2.3.2.3 - Open development platforms for software and services. We believe that what we intend to do is a perfect fit for the goals of this call.