
Fredrik asked about my suspicious timings, and I'm a bit puzzled too, now. Well, anyway, with XML (stolen from some David Mertz article that benchmarked XML parsers) like this:
 
   64.172.22.154
   -
   -
   19/Aug/2001:01:46:01
   -0500
   GET
   /
   HTTP/1.1
   200
   2131
 
 
repeated 3000 times (around 1 megabyte in total), the following code prints out a time of around 3 seconds:
 from xml.dom.minidom import parse
 import time
 # Wall-clock time for one full parse of the file into a minidom tree.
 start_time = time.time()
 dom1 = parse('myfile2.xml')
 print(time.time() - start_time)
 
(And similarly, this parsing with elementtree (with the Python parser) prints out a time of around 2.5 seconds:
 import time
 from elementtree import ElementTree
 # Wall-clock time for one full parse of the same file with elementtree.
 t1 = time.time()
 tree = ElementTree.parse("myfile2.xml")
 print(time.time() - t1)
 
)

The same parsing with Jython and Java's libraries takes around half a second, by the way. So the pure Python parsers are about 5-6 times slower here, versus over 10 times slower in the previous timing. The previous Python minidom timing suffered from memory consumption and swapping, I think.
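
For anyone who wants to rerun the CPython side of this comparison, here is a minimal sketch for a current Python 3. Several things in it are assumptions and not taken from the original setup. The tag names (`log`, `entry`, `host` and so on) are made up, since only the field values of the sample are shown above. The file is generated in a temporary location instead of using `myfile2.xml`. It also uses the standard library's `xml.etree.ElementTree`, which normally runs on a C accelerator, so its numbers are not comparable to the pure Python elementtree timing quoted above.

```python
import os
import tempfile
import time
from xml.dom import minidom
from xml.etree import ElementTree

# Hypothetical markup: the tag names are invented for illustration,
# only the field values come from the sample record in the post.
ENTRY = (
    "<entry>"
    "<host>64.172.22.154</host>"
    "<ident>-</ident>"
    "<user>-</user>"
    "<date>19/Aug/2001:01:46:01</date>"
    "<tz>-0500</tz>"
    "<method>GET</method>"
    "<path>/</path>"
    "<protocol>HTTP/1.1</protocol>"
    "<status>200</status>"
    "<bytes>2131</bytes>"
    "</entry>\n"
)


def write_sample(path, copies=3000):
    """Write one root element containing `copies` identical entries."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("<log>\n")
        for _ in range(copies):
            f.write(ENTRY)
        f.write("</log>\n")


def time_parse(parse, path):
    """Return (elapsed seconds, parsed document) for a single parse call."""
    start = time.perf_counter()
    result = parse(path)
    return time.perf_counter() - start, result


def run_benchmark(copies=3000):
    """Parse the same generated file with minidom and ElementTree."""
    fd, path = tempfile.mkstemp(suffix=".xml")
    os.close(fd)
    try:
        write_sample(path, copies)
        dom_secs, dom = time_parse(minidom.parse, path)
        et_secs, tree = time_parse(ElementTree.parse, path)
        # Count entries in both trees as a sanity check on the parse.
        counts = (
            len(dom.getElementsByTagName("entry")),
            len(tree.getroot().findall("entry")),
        )
    finally:
        os.remove(path)
    return dom_secs, et_secs, counts


if __name__ == "__main__":
    dom_secs, et_secs, counts = run_benchmark()
    print("minidom:     %.3f s" % dom_secs)
    print("ElementTree: %.3f s" % et_secs)
    print("entries seen:", counts)
```

A single run like this is noisy, just like the timings above. Repeating `run_benchmark()` a few times and taking the smallest value for each parser gives a steadier number.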
