Dave's Page

Author: Dave Kuhlman
Revision: 1.1h
Date: April 08, 2012
Copyright: Copyright (c) 2004 Dave Kuhlman. This documentation is covered by The MIT License: http://www.opensource.org/licenses/mit-license.
Abstract: Open Source software projects by Dave Kuhlman. These projects are implemented in or for Python. These projects center around XML, parsing XML, etc. They provide tools for building data mapping and Web services. Keywords are: python, xml, editor, text processing, python training.


1   Caution

Work in Progress.

You are entering an alpha-ware zone.

2   Training Documents for Python

Here are several documents that are intended as part of a self-training course and as materials for Python training:

2.1   Python 101 -- Beginning Python

Beginning Python programmers and Pythonista-wannabes can start with "Python 101 -- Beginning Python". It is now included as Part 1 in A Python Book.

A version converted to ASCII text is at Python 101 at ASCII-World

2.2   Python 201 -- (Slightly) Advanced Python

Look at "Python 201 -- (Slightly) Advanced Python" for several slightly more advanced topics on Python programming. It is now included as Part 2 in A Python Book.

2.3   A Python Book

I've also combined material from Python 101 and Python 201 and a Python Workbook that I have been working on. You can find this combined book on Python here:

A printed copy is available at Lulu.com.

2.4   Proposed Python courses and self-training documents

Here is a summary of several courses that I am prepared to teach: Python Course Descriptions.

And, here are more detailed course outlines of these courses:

More self-learning documents -- Here are several documents that might help:

  • A "Python Workbook" has lots of exercises and solutions along with lots of sample code. It is now included as Part 3 in A Python Book.
  • This document is a bit old, but may help users of SciPy: SciPy Notes and Guide.

3   Python Software and Python Extensions

3.1   generateDS.py -- Generate Python data bindings from XML Schema

generateDS.py generates Python data structures from an Xschema document. It generates a file containing: (1) a Python class for each element definition, (2) a parser (using minidom from PyXML) for XML documents that satisfy the Xschema document. The class definitions contain: (1) a constructor with initializers for member variables, (2) get and set methods for member variables, (3) a 'build' member function used during parsing to populate an instance, (4) an 'export' function that will re-create the XML element in an XML document.
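To make the shape of such a class concrete, here is a minimal hand-written sketch. It is not actual generateDS.py output: the element name "person", its members, and the helper parse_person are assumptions made up for illustration, and the sketch uses xml.dom.minidom from the Python standard library.

```python
from xml.dom import minidom
from xml.sax.saxutils import escape


class Person:
    """Hand-written stand-in for a generated class (hypothetical element 'person')."""

    def __init__(self, name="", age=0):
        # (1) Constructor with initializers for member variables.
        self.name = name
        self.age = age

    # (2) Get and set methods for member variables.
    def get_name(self):
        return self.name

    def set_name(self, name):
        self.name = name

    def get_age(self):
        return self.age

    def set_age(self, age):
        self.age = age

    def build(self, node):
        # (3) Populate this instance from a minidom element during parsing.
        for child in node.childNodes:
            if child.nodeType != child.ELEMENT_NODE:
                continue
            text = "".join(
                c.data for c in child.childNodes if c.nodeType == c.TEXT_NODE
            )
            if child.tagName == "name":
                self.name = text
            elif child.tagName == "age":
                self.age = int(text)

    def export(self):
        # (4) Re-create the XML element as text.
        return "<person><name>%s</name><age>%d</age></person>" % (
            escape(self.name),
            self.age,
        )


def parse_person(xml_text):
    # Parser entry point: build the DOM with minidom, then populate an instance.
    doc = minidom.parseString(xml_text)
    person = Person()
    person.build(doc.documentElement)
    return person


person = parse_person("<person><name>Alice</name><age>30</age></person>")
print(person.export())
```

The real generated code handles nested elements, attributes, and repeated children; this sketch only shows the four parts listed above.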

The distribution now contains a SWIG subdirectory with support for processing the XML interface descriptions that SWIG 1.3 generates (with SWIG's "-xml" switch). It can serve as a reasonably extensive example of the use of generateDS.py and can also be used as a basis for building processors for the XML output from SWIG.

Here is some documentation on generateDS.py.

You can find the source distribution here:

A bit of documentation and a sample/test file is included.

Here is an analysis document that compares the use of generateDS for performing transformations on XML documents with the use of XSLT for this same purpose.

I've also compared the use of generateDS.py with the Gnosis/objectify library. You can find this Gnosis/objectify comparison document here.

generateDS.py is used extensively in my work on FSM/REST. See, for example, my work with AOLserver and FSM/REST.

3.2   docbook2odf_py

This Python script is a wrapper for the XSLT stylesheets in the docbook2odf project. These stylesheets transform DocBook documents into OpenDocument ODF files (.odt, .odp, ...).

You can learn about docbook2odf here: http://open.comsultia.com/docbook2odf/

And, you can obtain docbook2odf.py here: docbook2odf_py.

3.3   PySgrep

I'm working on Python wrappers for Sgrep, a tool that searches structured text. Here is more information about PySgrep.

You can build PySgrep by unrolling the original sgrep distribution (sgrep-1.94a.tar.gz), then unrolling pysgrep-1.1a.zip on top of it. (If you do not have unzip, use pysgrep-1.1a.tar.gz instead.) See pysgrep.html or README_pysgrep for details.

3.4   libxml_saxlib

libxml_saxlib is a Python extension module that enables you to use the SAX interface of libxml in order to parse XML documents.
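For readers new to SAX, the sketch below shows the event-driven style of parsing. It uses the xml.sax module from the Python standard library, not libxml_saxlib, so the module and handler names here should not be read as the libxml_saxlib API; the class TagCounter and the function count_tags are invented for this example.

```python
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Count how many times each element name occurs."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called by the parser for every opening tag.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_tags(xml_text):
    handler = TagCounter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.counts


print(count_tags("<doc><item/><item/><note/></doc>"))
```

The parser never builds a tree; it only calls the handler as it reads the document, which is why SAX suits large inputs.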

You can read about libxml_saxlib at libxml_saxlib.html.

And, you can find a version that will build on Linux using Python's Distutils at libxml_saxlib-1.1a.tar.gz.

3.5   libxml_domlib

libxml_domlib is a Python extension module that enables you to use the DOM interface of libxml in order to parse XML documents. You can read about libxml_domlib at libxml_domlib.html. And, you can find a version that will build on Linux using Python's Distutils at libxml_domlib-1.2a.tar.gz.

3.6   libxsltmod

libxsltmod is a Python extension module that enables you to use libxslt to perform XSLT transformations from Python scripts. You can read about libxsltmod at libxsltmod.html.


3.7   rxpop -- SAX support for PyXML built on RXP

rxpop is intended as an alternative to the pyexpat and sgmlop parsers. It is an alternative in the sense that the parser driver is implemented in C. It is an interface to the SAX parser in RXP, which is available at http://www.cogsci.ed.ac.uk/~richard/rxp.html.

A few notes on rxpop are at rxpop.html.

And, rxpop itself is at rxpop.zip.

3.8   libxmlop -- SAX support for PyXML built on libxml

libxmlop is intended as an alternative to the pyexpat and sgmlop parsers. It is an alternative in the sense that the parser driver is implemented in C. It is an interface to the SAX parser in libxml2, which is available at http://xmlsoft.org.

A few notes on libxmlop are at libxmlop.html.

And, libxmlop itself is at libxmlop.zip.

3.9   Tree support for libxml using SWIG

pytreeswiglibxml provides SWIG-generated wrappers for the tree support in libxml2, which is available at http://xmlsoft.org.

First, several qualifications -- There is tree support for libxml which comes with the libxml2 distribution. That is the official support, and the effort that has gone into it is much more extensive than what is described in this document.

Still, the tree support provided here does show what can be done with SWIG. It shows that we can very quickly produce extensive and usable support for a large C library using SWIG.

There are restrictions and limitations:

  • There are some types of nodes that either must be avoided or, when used, must be used in a restricted way. Basically, to use this wrapping you will need to be aware of the different types of tree nodes implemented by libxml and the capabilities and restrictions of each type of node.
  • Some memory management must be explicitly performed. In particular, when you are finished with a document (tree), you should call xmlFreeDoc() and not use the document or nodes in it after you have done so.

On the positive side, this implementation does satisfy several desirable requirements:

  • The tree, the nodes in it, and the connections between those nodes are represented in C, not in Python. This means that creating the tree is quite fast and does not take up space for Python objects. And, because the links between nodes are represented in C and not in Python, we do not have to worry about circular references between Python objects.
  • Python objects that wrap the tree and nodes in it (the shadow classes generated by SWIG) (1) are created as needed and (2) are destroyed when not needed (e.g. when the reference count reaches zero). A consequence of this is that we can walk a tree and, if we re-use and over-write the same variable, we will not keep Python objects for a large number of nodes.

Tree support for libxml2 generated with SWIG is available at pytreeswiglibxml-1.0a.tar.gz.

3.10   dtGenerator.py -- Generate Python data type implementation

Python enables you to define new data-types that can be manipulated from Python scripts. (See http://www.python.org/doc/current/ext/defining-new-types.html for instructions on how to do that.) One way to implement a new Python data-type is to copy the file Objects/xxobject.c in the source distribution of Python and then start replacing text and hacking on it. xxobject.c puts you many steps ahead of starting from scratch (many thanks to the Python development crew for providing it), but in my work on libxml_domlib, I had to implement several data-types and even starting with xxobject.c became tedious.

So I wrote dtGenerator.py to do some of the work for me. Basically, dtGenerator.py begins with a template that is very similar to xxobject.c, then does some of the replacement for you. It also generates skeletons of "getter" functions.
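The general idea of template-driven generation can be sketched in a few lines. This is not the dtGenerator.py template or its interface: the template text, the function generate_getters, and the type and member names are all hypothetical, chosen only to show placeholder replacement producing "getter" skeletons.

```python
from string import Template

# Hypothetical template for a C "getter" skeleton; the real dtGenerator.py
# template (modelled on Objects/xxobject.c) is much larger.
GETTER_TEMPLATE = Template(
    "static PyObject *\n"
    "${type_name}_get_${member}(${type_name}Object *self, void *closure)\n"
    "{\n"
    "    /* TODO: return the value of '${member}'. */\n"
    "    Py_RETURN_NONE;\n"
    "}\n"
)


def generate_getters(type_name, members):
    """Return C source text with one getter skeleton per member name."""
    return "\n".join(
        GETTER_TEMPLATE.substitute(type_name=type_name, member=member)
        for member in members
    )


print(generate_getters("Node", ["name", "parent"]))
```

Each skeleton still has to be filled in by hand, which matches the "does some of the replacement for you" scope described above.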

You can read some notes about dtGenerator.py in dtGenerator.html.

You can find a copy of it at dtGenerator.zip.

3.11   SWIG XML -- Generate XML (for Python) from SWIG

Note: SWIG (since at least version 1.3.15) contains built-in support for generating XML. And, this support is more extensive than the SWIG extension described here. My recommendation is that the built-in support be used. The SWIG XML support described here may be of interest if you need to learn how to extend SWIG.

This package provides an extension to SWIG that enables SWIG to generate XML (instead of Python code, Perl code, Java code, etc). The generated XML code can serve as input to a code generator (possibly written in Python) or a code analysis system.

Also included in this package is a Python module that can parse the XML output from the SWIG extension and create a tree of Python objects that represent the SWIG XML. Note that this Python module was generated by generateDS.py.

In order to use this SWIG extension, you must download the CVS development version of SWIG.

You can read more about swigxml at swigxml.html.

You can find the files needed to build this extension to SWIG at swigxml-1.0a.zip.

SWIG 1.3 now provides the ability to generate XML documents. You should consider that support more official than what I have implemented. Its output is very extensive, and you will not have to build anything extra. If you decide to use it, you may want to look at the SWIG sub-directory in the distribution of generateDS.py -- Generate Python data bindings from XML Schema, which contains Python support for processing the XML output from SWIG 1.3. And, if you use the XML capability in SWIG 1.3, be sure to send a message to the SWIG group thanking them and encouraging them to keep supporting it.

3.12   Data Mapping

I'm working on solutions to the problem of mapping (and converting) XML documents onto Python data structures and back. Here are some results from that work.
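As a rough illustration of what mapping "onto Python data structures and back" can mean, here is a minimal sketch using xml.etree.ElementTree from the standard library. It is not the technique or code from the documents below: the functions element_to_dict and dict_to_element and the sample document are assumptions, and the mapping ignores attributes, mixed content, and repeated child tags.

```python
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Map an element onto a dict: leaf children become strings, others recurse.

    Repeated child tags would overwrite each other; this sketch ignores that case.
    """
    result = {}
    for child in element:
        if len(child):
            result[child.tag] = element_to_dict(child)
        else:
            result[child.tag] = child.text or ""
    return result


def dict_to_element(tag, data):
    """Inverse mapping: rebuild an element tree from the nested dict."""
    element = ET.Element(tag)
    for key, value in data.items():
        if isinstance(value, dict):
            element.append(dict_to_element(key, value))
        else:
            ET.SubElement(element, key).text = value
    return element


xml_text = "<person><name>Alice</name><address><city>Davis</city></address></person>"
data = element_to_dict(ET.fromstring(xml_text))
print(data)
print(ET.tostring(dict_to_element("person", data), encoding="unicode"))
```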

3.12.1   XSLT transformations

This technique uses XSLT to transform an XML document into a canonical XML document, and then loads that canonical document into Python data structures.

You can learn more about this technique in this document on data mapping transforms.

And a sample of how to use it is at XsltDatamapping.zip.

3.13   A Parser for RELAX NG Compact Syntax

I've implemented (most of) a parser in Python for the RELAX NG compact syntax.

It's written in Python and uses PLY (yet another implementation of lex and yacc for Python). It produces a parse tree whose nodes are instances of a class ASTNode, which is defined in the parser module.

It recognizes most but not all of the compact syntax. Hopefully it recognizes enough to be useful, and it can be extended when necessary.

You can find documentation on the parser here: http://www.rexx.com/~dkuhlman/relaxngcompact.html

And, you can find a distribution file here: http://www.rexx.com/~dkuhlman/relaxngcompact-1.0a.tar.gz

3.14   A Generator for Adapters/Wrappers for Java Code

generate_wrappers.py generates support files that enable Python to use the classes and methods in a Java source code file.

You can find documentation here: http://www.rexx.com/~dkuhlman/generate_wrappers.html.

And, the distribution is here: http://www.rexx.com/~dkuhlman/generate_wrappers-1.0a.tar.gz.

4   Text Processing

4.1   ODF writer for Docutils

rst2odt.py is a writer for Docutils that translates reST (reStructuredText) into an ODF (Open Document Format) .odt file which is usable with the OpenOffice.org toolset.

Documentation -- You can learn more about odf-odt writer here: documentation on odf-odt writer for Docutils. Or, at the Docutils project here: Odt Writer for Docutils -- http://docutils.sourceforge.net/docs/user/odt.html.

Distribution -- odf-odt writer is available in the Docutils snapshot and from the Docutils Subversion repository. See: Docutils

4.2   ODF writer for the Silva CMS

silva2odt translates the pages and folders exported from the Silva Content Management System to ODF (Open Document Format) .odt files which are usable with the OpenOffice.org toolset.

Documentation -- You can learn more about silva2odt here: documentation on silva2odt.

Distribution -- The distribution file is here: source distribution of silva2odt.

silva2odt is also available via Subversion from the Silva repository under silva2odt. See: Silva download area.

For more on Silva see: The Silva Content Management System.

4.3   A Docutils writer for the Documenting Python system

I've written a reStructuredText writer for use with the Python project's documentation tools. It is intended to be used as part of the Docutils tool set.

A brief introduction is at rstpythonlatex_intro.html.

And, there is a distribution file at rstpythonlatex-1.0b.zip. The distribution contains a README and a bit of additional documentation.

4.4   Python LaTeX Setup Information

I frequently use the "Documenting Python" system for producing documentation on Python topics. This system translates LaTeX into various viewable formats. So, I've written some documentation and some support on how to set up for processing documents with the Python LaTeX documentation system.

I also use reStructuredText (reST) to create the LaTeX files that I feed to the "Documenting Python" system. In order to do so, I've extended Docutils with the ability to translate reStructuredText to Python LaTeX. You can learn more about Docutils and reStructuredText at the Docutils home page.

Here is documentation that describes how to do the set-up needed for this processing.

And, here is a distribution file for Python LaTeX setup that contains the source document, Makefile, etc.

4.6   Macros for the JED Text Editor

JED is a powerful but light-weight text editor. I've used a variety of text editors, and JED is my favorite.

This document has a number of macros that I find especially useful. You can find it here: Macros for the JED Text Editor.

5   More Python Stuff

5.1   Python Comments

I've written various notes about Python and Jython and Training.

5.2   A Python XML FAQ and How-to

I've written a small document which might help you get started on processing XML with Python. You can find it at http://www.rexx.com/~dkuhlman/pyxmlfaq.html.

5.3   SciTE Python Properties

SciTE is a very nice text editor for editing Python code on both Linux and MS Windows. It has lots of features. However, I've found that I have had to customize the Python properties file so that SciTE will use 4 spaces and no tabs for indentation.

Here are a few lines of code that you can copy and paste into your python.properties file in order to get this behavior. Add them below the lines that define file.patterns.py, which is near the top of python.properties:

# Use standard Python indentation and block comment characters.

Also, look for an occurrence of comment.block.python later in the file. It might look like the following:


You may want to comment that out, because it will override the earlier definition (which you just copied and pasted from the above). The string "##" is a more common block comment character sequence, I believe.

6   Utilities and miscellaneous information

6.1   zip-ls -- A Zip file listing program written in Python

I've implemented a Zip file listing program that gives me some of the listing and formatting options that I've wanted from unzip -l and unzip -Z. It's written in Python using the zipfile module from the Python standard library.
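The core of such a listing is small. The sketch below is not zip-ls itself; the function list_zip and its column layout are assumptions, and it only shows how the zipfile module exposes the per-member information a listing program needs.

```python
import io
import zipfile


def list_zip(source):
    """Return one formatted line per archive member: size, compressed size, name."""
    lines = []
    with zipfile.ZipFile(source) as archive:
        for info in archive.infolist():
            lines.append(
                "%10d %10d  %s" % (info.file_size, info.compress_size, info.filename)
            )
    return lines


# Build a small archive in memory so the example runs without any files on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("readme.txt", "hello")
    archive.writestr("src/main.py", "print('hi')\n")
buffer.seek(0)

for line in list_zip(buffer):
    print(line)
```

list_zip also accepts a file name in place of the in-memory buffer, since zipfile.ZipFile takes either.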

Documentation on this program is at zip-ls.html.

And, the distribution file is at zip-ls.zip.

6.2   Computer Assembly How-to

I've assembled my own computer. And, it works.

So, I've written a document that attempts to help you assemble a computer from components such as a case and power supply, CPU, motherboard, hard disk drive, etc. You can find this document here: Computer Assembly How-to.

6.3   Installation of IPTables-firewall on Debian

I've installed Arno's IPtables-firewall on my gateway machine. It provides a firewall and also does NAT (network address translation) and IP masquerading. So it both protects the machines on my small sub-net and gives them access to the Internet.

You can find instructions on how to install this firewall on a Debian system (Libranet Debian GNU/Linux, in my case) here: Installation of IPTables-firewall on Debian.

7   Pylons

Here is a quick start document on Pylons: Pylons Quick Site Development.

8   Zope, CMS, CPS, etc

Zope is powerful, but has a long, steep learning curve. This section has documentation and support for Zope.

8.1   CPS

These documents offer support on building sites with CPS.

8.1.1   Notes on Customizing a CPS Site

I'm working through the process of customizing a CPS site and developing an application with CPS. You can read notes on this here: Notes on Customizing a CPS Site.

8.1.2   A Workflow Implementation Procedure

This document contains notes on how to implement a business process as a CPS workflow. You can read it here: A Workflow Implementation Procedure.

8.1.3   Understanding and using the CPS Remote Controller

Here is a document that explains CPSRemoteControl, which is a CPS product that enables you to manipulate your CPS site using XML-RPC. You can read it here: Understanding and using the CPS Remote Controller.
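For a feel of what an XML-RPC call looks like on the wire, here is a small sketch using xmlrpc.client from the standard library. It makes no network connection, and the method name "listContent" and its argument are hypothetical; they are not taken from CPSRemoteControl, so consult the document above for the methods it really exposes.

```python
import xmlrpc.client

# Hypothetical method name and argument, used only to show the encoding.
request_body = xmlrpc.client.dumps(("workspaces/members",), methodname="listContent")
print(request_body)

# Decode the request again, as a server would.
params, method = xmlrpc.client.loads(request_body)
print(method, params)
```

Against a live site you would normally use xmlrpc.client.ServerProxy, which performs this encoding and the HTTP request for you.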

9   Applications and Samples and Documentation

9.2   Support for AOLserver and PyWX

9.2.1   AOLserver and PyWX

I've been exploring the use of AOLserver, PyWX (Python on top of AOLserver), and PostgreSQL (with AOLserver and Python).

Here is a how-to document on my experiences with AOLserver. You will also find sections on using the Quixote templating language and on using Quixote with AOLserver and PyWX.

9.3   Support for Quixote and REST Etc

You can find my writings about Quixote and REST and so on here: http://www.rexx.com/~dkuhlman/quixote_index.html.

9.4   Amazon Web services

Amazon.com has a Web services interface. It supports two styles, one of which is XML over HTTP, which is REST-like. Here is a bit of support for that XML over HTTP, REST-like interface to Amazon Web services. It helps you to parse and process the XML response documents from Amazon.com.
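Processing an XML-over-HTTP response usually comes down to pulling a few fields out of each repeated element. The sketch below uses xml.etree.ElementTree and an invented response document; the element names (Response, Item, Title, Price) and the function parse_items are assumptions, not the real Amazon format or the API of the support code mentioned here.

```python
import xml.etree.ElementTree as ET

# Invented response document; real Amazon responses use different element names.
SAMPLE_RESPONSE = """\
<Response>
  <Item><Title>Learning Python</Title><Price>29.95</Price></Item>
  <Item><Title>Python Cookbook</Title><Price>39.95</Price></Item>
</Response>
"""


def parse_items(xml_text):
    """Return a list of (title, price) tuples from the response text."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext("Title"), float(item.findtext("Price")))
        for item in root.findall("Item")
    ]


for title, price in parse_items(SAMPLE_RESPONSE):
    print("%-20s %6.2f" % (title, price))
```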

Here is documentation on Amazon Webservices support.

And, the code is at amazon_ws_support-1.0.tar.gz.