Parsing XML in Jython with Java SDK's libraries

In a somewhat similar fashion, it's easy to parse an XML document into a DOM tree in Jython with the Java SDK's libraries:
 from java.io import FileInputStream
 from javax.xml.parsers import DocumentBuilderFactory
 
 # Create a DOM parser through the standard JAXP factory
 factory = DocumentBuilderFactory.newInstance()
 builder = factory.newDocumentBuilder()
 
 # Parse the file into a DOM tree, then release the file handle
 stream = FileInputStream("myfile.xml")
 document = builder.parse(stream)
 stream.close()
 
 root = document.getDocumentElement()
 # etc.

There's no specific need for this, because you can do DOM work in recent Python implementations too. In my naive benchmarks, however, the above Jython snippet clocked at roughly 2 seconds with a 3 megabyte XML file, while Python 2.3 minidom parsing [1] clocked at 26 seconds. (I am not saying that the stock Python 2.3 minidom parser is representative of Python XML parsers; the figure is only meant as a reference point for the speed.)
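For what it's worth, a naive timing of the minidom side could be sketched roughly like this. It is only an illustration, not the script behind the numbers above: make_document and time_parse are made-up helper names, and the synthetic document is an assumption standing in for a real 3 megabyte file.

```python
import time
from xml.dom.minidom import parseString


def make_document(n_items):
    """Build a synthetic XML document with n_items child elements (hypothetical helper)."""
    items = "".join(['<item id="%d">value %d</item>' % (i, i) for i in range(n_items)])
    return "<root>%s</root>" % items


def time_parse(xml_text, repeats=3):
    """Return the best wall-clock time, in seconds, over several minidom parses."""
    best = None
    for _ in range(repeats):
        start = time.time()
        dom = parseString(xml_text)
        elapsed = time.time() - start
        dom.unlink()  # break reference cycles so the tree can be freed
        if best is None or elapsed < best:
            best = elapsed
    return best


# Assumed workload: 10000 small elements; adjust to approximate your own file.
xml_text = make_document(10000)
seconds = time_parse(xml_text)
print("parsed %d characters in %.3f seconds" % (len(xml_text), seconds))
```

Taking the best of a few runs smooths out one-off noise a little, but wall-clock timings like this are still naive: they include interpreter warm-up effects and depend heavily on the shape of the document.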

I am not, of course, suggesting that you should switch to Jython, but I hope I have demonstrated the usefulness of Jython as a scripting tool for various XML tasks.

[1]

 from xml.dom.minidom import parse
 dom1 = parse('myfile.xml')
 

Comments:

Posted by Paul Boddie at 02.09.2004, 17:30

What about using xml.dom.javadom? Does that affect the performance in any way? I'm assuming it still works with more recent Java XML toolkits.

Posted by Jarno Virtanen at 02.09.2004, 22:23

Hmm. Need to check that out.

Posted by Jarno Virtanen at 06.09.2004, 17:25

I'm definitely _not_ sure that my timings are correct. :-)

I'll try to rethink the timings as soon as I have the time and the energy. Don't hold your breath waiting.