[z3-checkins] r30826 - in z3/deliverance/branches/packaged/deliverance/tests/test-data: . etc
ianb at codespeak.net
Mon Jul 31 22:17:55 CEST 2006
Author: ianb
Date: Mon Jul 31 22:17:50 2006
New Revision: 30826
Added:
z3/deliverance/branches/packaged/deliverance/tests/test-data/etc/appmap.xml
- copied unchanged from r30825, z3/deliverance/branches/packaged/deliverance/etc/appmap.xml
z3/deliverance/branches/packaged/deliverance/tests/test-data/etc/themecontent.xml
- copied unchanged from r30825, z3/deliverance/branches/packaged/deliverance/etc/themecontent.xml
z3/deliverance/branches/packaged/deliverance/tests/test-data/etc/themerules.xml
- copied unchanged from r30825, z3/deliverance/branches/packaged/deliverance/etc/themerules.xml
z3/deliverance/branches/packaged/deliverance/tests/test-data/intro.html (contents, props changed)
Modified:
z3/deliverance/branches/packaged/deliverance/tests/test-data/index.html
Log:
Move in rest of test content
Modified: z3/deliverance/branches/packaged/deliverance/tests/test-data/index.html
==============================================================================
--- z3/deliverance/branches/packaged/deliverance/tests/test-data/index.html (original)
+++ z3/deliverance/branches/packaged/deliverance/tests/test-data/index.html Mon Jul 31 22:17:50 2006
@@ -1,13 +1,11 @@
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
-<html>
-<head>
-<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-15">
-<title>Test file</title>
-</head>
-
-<body>
-<h1>Test file</h1>
-
-This is a test file!
-
-</body> </html>
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml">
+ <head>
+ <title>Hello World Content Page</title>
+ </head>
+ <body>
+ <p>Hello world, banjos everywhere.</p>
+ </body>
+</html>
\ No newline at end of file
Added: z3/deliverance/branches/packaged/deliverance/tests/test-data/intro.html
==============================================================================
--- (empty file)
+++ z3/deliverance/branches/packaged/deliverance/tests/test-data/intro.html Mon Jul 31 22:17:50 2006
@@ -0,0 +1,268 @@
+<?xml version="1.0" encoding="utf-8" ?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+ <head>
+ <title>Deliverance</title>
+ </head>
+ <body>
+ <div class="document" id="deliverance">
+ <div class="section" id="content-deliver-for-cms-systems">
+ <h1>
+ <a name="content-deliver-for-cms-systems">Content delivery for CMS systems</a>
+ </h1>
+ <p>CMS systems, particularly in Zope, excel at the structured environment of content
+ <em>production</em>. This area places a strong emphasis on security,
+ workflow, metadata, and other content services.</p>
+ <p>For content <em>delivery</em> on public sites, though, some of this machinery is
+ overkill. The framework for content production has a nasty side effect of
+ killing performance for content delivery, making reliability and debugging a
+ challenge, and forcing other audiences (like web designers in charge of
+ look-and-feel) to learn another way to do things.</p>
+ <p>For this reason, many ECM packages make a formal distinction between content
+ production and content delivery.</p>
+ <p>Deliverance is a lightweight, semi-static system for content delivery of CMS
+ resources. It runs in mod_python, generating branded pages and navigation
+ elements, giving high-performance throughput to anonymous visitors. Its primary
+ benefits:</p>
+ <ul>
+ <li>High performance</li>
+ <li>Simple re-branding</li>
+ <li>Trusted stack</li>
+ <li>Extreme productivity</li>
+ </ul>
+ <p>It is focused on audiences that want:</p>
+ <ul>
+ <li>Predictable delivery to anonymous visitors</li>
+ <li>Some portion of an airgap (logical/physical) between the CMS and the live
+ site</li>
+ <li>Integration with mainstream systems and technologies</li>
+ </ul>
+ <p>This document discusses how the system works, then revisits the benefits in
+ detail.</p>
+ </div>
+ <div class="section" id="overview">
+ <h1>
+ <a name="overview">Overview</a>
+ </h1>
+ <p>Deliverance has two major parts:</p>
+ <blockquote>
+ <p>o <em>Themes</em>. These apply a consistent look-and-feel to content that
+ streams through Apache. This content can be on the filesystem, in Zope with
+ mod_proxy, or using the other part of Deliverance. In a nutshell, a theme is
+ an HTML file (plus the CSS, images, etc.) containing boxes that get filled
+ by content.</p>
+ <p>o <em>Content maps</em>. A description of the content on a site, including
+ metadata and different organization schemes. The content map also has views
+ that, inside Deliverance, can generate HTML for navigation and other
+ purposes.</p>
+ </blockquote>
+ <p>Each of these can be used without the other.</p>
+ </div>
+ <div class="section" id="how-it-works">
+ <h1>
+ <a name="how-it-works">How It Works</a>
+ </h1>
+ <p>In a nutshell, Deliverance gets an XML map describing all the published content
+ at a point in time. It uses this map to draw navigation elements and issue HTTP
+ requests for content of single resources. Finally, a "theme"
+ provides the HTML to return with named boxes to be filled by rules.</p>
+ <p>Let's first introduce some major concepts, then walk a request through from start
+ to finish, using these concepts.</p>
+ <p>1) <em>Theme</em>. Web designers don't want to learn anything new. ZPT tried to
+ embrace this, but by the time the ZPT developer has injected all the tal and
+ refactored everything into macros, the web designer can't possibly continue.</p>
+ <p>A theme is the corporate identity for a site. It is <em>not</em> a template, as
+ it has zero stuff in it beyond HTML.</p>
+ <p>A theme is created by saving the customer's home page and identifying the boxes
+ to be replaced. For example, &lt;div id="sitemenu"&gt;
+ identifies a place where a generated menu should go. Web designers are familiar
+ with this, as CSS uses such selectors to apply style.</p>
+ <p>Deliverance has a ruleset that does the merge between the theme file and the
+ generated content. In essence, this rule file says: "Find 'site-menu'
+ in the theme and replace its children with the generated contents with an id of
+ 'generated-menu'."</p>
+ <p>The ruleset is under the control of the integrator, who bridges the gap between
+ what the CMS provides and what the UI designer makes available. This is done,
+ though, without touching anything on either side of the equation.</p>
+ <p>This allows the theme to be applied to all pages, without touching the pages. The
+ theme engine uses XSLT (via lxml) to perform the merge.</p>
+ <p>2) <em>Map</em>. By design, content delivery is separated from content
+ management. In fact, this separation acts as insulation. At various intervals,
+ the CMS makes an "edition" or a snapshot of its contents,
+ providing a map with metadata for all the contents that should be visible. (In
+ fact, the map could point at a certain <em>version</em> of a resource that
+ should be visible.)</p>
+ <p>This map is read by mod_python using lxml. It serves two functions:</p>
+ <p>a. <em>Navigation elements</em>. We can draw site menus and trees without
+ visiting the server. Since these can be done in XSLT, not only is the
+ performance very good, but we can draw many other kinds of pages. For example,
+ we can show all the contents modified in the last week, or all the contents in
+ France.</p>
+ <p>b. <em>Resource lookup</em>. The map controls how an incoming, virtual URL gets
+ mapped to a real resource. This means the URL space can be placeless. Content
+ can appear in multiple places. Equally, content from multiple CMSs, even
+ multiple remote hosts, can be integrated into the same map.</p>
+ <p>The identifier used for retrieving the content for the resource can be a normal
+ GET or a more complicated QUERY_STRING or even xml-rpc kind of lookup.</p>
+ <p>Note that the contents for many CMS resources are, in fact, very small amounts of
+ data. They could be cached inline in the content map and not looked up. For
+ frequent pages, this would provide a big win.</p>
+ <p>Finally, some resources in the map might be virtual, meaning the page can be
+ fully rendered in Deliverance. For example, a URL to show all the content with
+ the keyword of "CPS" can be serviced without a trip to the
+ server. All that is needed is an XSLT rule for generation. (Later, the XSLT
+ could be eliminated with a Python extension function in lxml.)</p>
+ <p>3) <em>Compilation</em>. The goal is high performance. There are certain aspects
+ that never change between requests:</p>
+ <ol class="loweralpha simple">
+ <li>The contents of the map.</li>
+ <li>The theme and the rule file for merging.</li>
+ <li>Site configuration, such as site menus.</li>
+ </ol>
+ <p>It makes no sense to re-parse DOMs and stylesheets on each request. Equally, it
+ makes no sense to have a multi-stage pipeline when several parts never change.</p>
+ <p>Deliverance gets a tremendous speedup by compiling the theme into a stylesheet.
+ It reads the XHTML file for the theme, identifies the nodes to be replaced, and
+ generates an XSLT with xsl:value-of and xsl:apply-templates statements in the
+ right location. Compilation also inlines the map data into the XSLT so it
+ doesn't have to be included later.</p>
+ <p>Compilation thus gives two benefits:</p>
+ <p>a. You can re-brand stuff without learning XSLT and without touching the HTML of
+ the theme file.</p>
+ <p>b. Most of the work needed for per-request transformations is done on startup.
+ Specifically, we avoid the 50ms hit that the "identity
+ transformation" pattern seemed to give.</p>
+ <p>4) <em>Retrieval</em>. mod_python has a Bobo-inspired publisher that walks the
+ URL, traversing Python objects using a set of rules.</p>
+ <p>Deliverance has a similar idea. The URL provides an identifier into the map file
+ to retrieve a map item. The map item then gives instructions on how to find the
+ content for the page and how to render it.</p>
+ <p>In most cases, some Python code will be issued to retrieve a page from the CMS.
+ For this, a very stripped-down skin will be used in the CMS, or perhaps no skin
+ at all. For example, the URL in the map file might request the DAV view of the
+ resource, thus giving just the data. For CMF-based systems, this is a 10x
+ speedup.</p>
+ <p>In other cases, the map might point to a virtual page, as discussed above.</p>
+ <p>The mapping provides some interesting possibilities for integration. First,
+ Deliverance could leverage Apache's infrastructure for retrieval and caching.
+ Second, libxml2 has several Python extension facilities (XPath functions, custom
+ resolvers) that allow the map to act as an integration facility. Simply put some
+ metadata on a map entry to make it look like a resource, with the actual
+ retrieval being done with custom code.</p>
+ <p>5) <em>Generation</em>. Deliverance does not have a parser of any kind. It uses
+ XSLT to generate HTML. As noted above, for important parts of usage, no XSLT
+ knowledge is required.</p>
+ <p>Using XSLT gives some benefits:</p>
+ <blockquote>
+ <p>o Extremely optimized.</p>
+ <p>o Extremely documented.</p>
+ <p>o Rich tool chain.</p>
+ <p>o Maintenance burden belongs to others.</p>
+ </blockquote>
+ <p>XSLT has a negative reputation. Thus, Deliverance works hard to allow people to
+ avoid using it, except when they need something custom. For example, navigation
+ boxes don't have to be generated by XSLT; they could be in the HTML looked up from
+ the CMS and inserted into the theme.</p>
+ </div>
+ <div class="section" id="a-typical-request">
+ <h1>
+ <a name="a-typical-request">A Typical Request</a>
+ </h1>
+ <p>With that background, how does Deliverance work, end-to-end? The following
+ section starts with an Apache restart, finishing with the last byte returned to
+ the browser.</p>
+ <dl class="docutils">
+ <dt><a href="#id3" name="id4">
+ <span class="problematic" id="id4">*</span>
+ </a>Note: This describes how things will be, not how they currently are.</dt>
+ <dd>
+ <p class="last">lxml needs some more work for a couple of things mentioned
+ herein.*</p>
+ </dd>
+ </dl>
+ <p>First, Apache is started. In the conf file, there is a section that maps part of
+ the URL space to a mod_python handler. This handler is part of Deliverance.</p>
+ <p>When the handler module is imported, it performs some one-time optimizations on
+ startup:</p>
+ <p>a. Read the map file, the theme's XHTML, and the site configuration into XML
+ DOMs.</p>
+ <p>b. Read the "blank-compilerdoc" and the compiler stylesheet
+ into a DOM and a processor, respectively.</p>
+ <p>c. Merge everything from (a) into the blank-compilerdoc (later replaced by
+ XInclude).</p>
+ <p>d. Create a compiled theme processor by applying the compiler stylesheet against
+ the blank-compilerdoc. The output is, in fact, another XSLT stylesheet. Namely,
+ it is a "compiled" stylesheet, ready to be applied to each
+ incoming request while doing the least amount of work needed.</p>
+ <p>When a request comes in, Apache passes it off to the Python handler function in
+ Deliverance. The handler takes the relevant part of the URI and does an XPath
+ lookup in the map, grabbing the node referenced by this URI fragment. This map
+ node contains instructions for the next two steps:</p>
+ <blockquote>
+ <p>o Retrieve the contents.</p>
+ <p>o Format the contents.</p>
+ </blockquote>
+ <p>The handler then retrieves the X(H)TML for the contents and applies the compiled
+ stylesheet. The compiled stylesheet has a rule for handling anything unique
+ about that resource type.</p>
+ <p>The results are serialized and returned.</p>
+ </div>
+ <div class="section" id="performance">
+ <h1>
+ <a name="performance">Performance</a>
+ </h1>
+ <p>Since much of the information needed for rendering a request is statically
+ contained in a specially-tuned, in-memory DOM, performance automatically gets a
+ boost. (This would be the same in Zope.)</p>
+ <p>The use of XSLT, especially compiled into a well-tuned state, gives another big
+ performance win. Many operations, such as drawing a tree or site map, fit the
+ XSLT pattern better than ZPT. Also, libxslt is a much more actively developed
+ project, used by 1,000x the number of people, than ZPT.</p>
+ <p>Memory usage is likely to be an issue. A content map with 400,000 entries could
+ occupy 150 MB of real memory. However:</p>
+ <blockquote>
+ <p>o Few sites have 400,000 public resources.</p>
+ <p>o Those that do can afford a gigabyte of RAM.</p>
+ </blockquote>
+ <p>For requests that don't require a trip to the CMS, 130 requests/sec should be
+ expected.</p>
+ </div>
+ <div class="section" id="productivity">
+ <h1>
+ <a name="productivity">Productivity</a>
+ </h1>
+ <p>You can speed up a computer by buying a bigger box. How do you speed up a
+ programmer? Unfortunately, Zope has accumulated layers and layers of
+ idiosyncratic frameworks. Some of this is hidden from the integrator and web
+ designer, but some of it peeks through.</p>
+ <p>Deliverance is a massive increase in UI productivity. First and foremost, the
+ entire UI can be developed outside of the CMS, using static models on disk. As
+ long as the CMS returns XML that looks the same as the sample documents and
+ sample content map, everything should just work.</p>
+ <p>Second, this approach gives multiple tools in the toolchain. I like using Oxygen,
+ a cheap but amazing XML/XSLT authoring environment. I can edit the dynamic UI
+ with files on disk, press a button, and see what it will look like when
+ rendered. If there is an error, I get a useful (non-ZPT!) error message, with
+ the cursor sitting on the offending line. I even get a stepwise debugger, where
+ I can watch the output get rendered and set a breakpoint to see the evaluation
+ context.</p>
+ <p>Alternatively, someone can run an xsltproc command like this:</p>
+ <p>xsltproc compiler.xsl blank-compiled.xml | xsltproc - ../tests/sampledoc1.xml</p>
+ <p>...and see what the page will look like. Finally, the simple Python scripts in
+ Deliverance can be run from the command line to process real output in the map.
+ Each part of the process can be inspected to find the offending problem.</p>
+ <p>More generally, the XML+XSLT approach is fundamentally easier. In ZPT, the data
+ model is exposed via baroque, undocumented APIs appearing in TAL expressions. In
+ XML, you just look at the file and visually see the data model. XPath gives a
+ wonderful, simple, but powerful way to manipulate the data model. And although
+ XSLT is baroque, so is the messy pile of deconstructed macros and slots
+ appearing ad-hoc in most large-scale Zope apps.</p>
+ <p>This approach gives other kinds of productivity. For example, there are tons of
+ books, and Google has an answer to every question you might have. Why? Because
+ the installed base of XML and XSLT is four orders of magnitude higher than
+ Zope+CMF+ZPT+[CPS/Plone/Silva].</p>
+ </div>
+
+ </div>
+ </body>
+</html>
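
Note: a minimal sketch of the merge rule that intro.html describes ("find 'site-menu' in the theme and replace its children with the generated contents with an id of 'generated-menu'"). It is not Deliverance code. It uses Python's standard xml.etree.ElementTree rather than the XSLT/lxml engine the document describes, and the names find_by_id and apply_rule, along with the sample markup, are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def find_by_id(root, element_id):
    """Return the first element whose id attribute matches, or None."""
    for element in root.iter():
        if element.get("id") == element_id:
            return element
    return None


def apply_rule(theme, content, theme_id, content_id):
    """Replace the children of theme_id in the theme with those of content_id.

    Hypothetical helper: the real ruleset format is not shown in this commit.
    """
    target = find_by_id(theme, theme_id)
    source = find_by_id(content, content_id)
    if target is None or source is None:
        return theme  # rule does not apply; leave the theme untouched
    # Drop the placeholder children, but keep the box element itself.
    for child in list(target):
        target.remove(child)
    target.text = source.text
    target.extend(list(source))
    return theme


# Sample theme and generated content (assumed markup, not from the repository).
theme = ET.fromstring(
    '<html><body><div id="site-menu"><p>placeholder</p></div></body></html>'
)
content = ET.fromstring(
    '<div id="generated-menu"><ul><li>Home</li><li>News</li></ul></div>'
)
merged = apply_rule(theme, content, "site-menu", "generated-menu")
print(ET.tostring(merged, encoding="unicode"))
```

The theme keeps its own box element and id, so neither the theme file nor the CMS output has to change; only the rule names the two ids.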