This is a basic introduction to minidom. In fact, it is less a tutorial than a short list of notes about the library, since a short but good tutorial about it is already available online [1].
Let us consider this XML data:
You can either save this as an XML file or consider it as a string. In both cases, you will need to import the library:
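The import presumably looks like the following. The `xml.dom.minidom` module ships with the Python standard library, so nothing needs to be installed:

```python
# minidom is part of the standard library; no extra package is required.
from xml.dom import minidom
```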
In the latter case, copy and paste the XML data above into a string:
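As an illustration, here is a hypothetical document that matches the description used in these notes: a `stock` root with two `product` children, one of which has the id `p1`. The `name` elements, their values and the variable name `xml_data` are assumptions made for this sketch:

```python
# Hypothetical sample data: element contents are invented for illustration.
# The string starts right after the opening quotes, so the XML declaration
# is the very first thing the parser sees.
xml_data = """<?xml version="1.0"?>
<stock>
  <product id="p1"><name>Keyboard</name></product>
  <product id="p2"><name>Mouse</name></product>
</stock>"""

print(xml_data)
```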
Then parse it with the parseString() function, which returns an XML Document object:
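A minimal sketch of this step, assuming the same hypothetical sample data (a `stock` root with two `product` children):

```python
from xml.dom import minidom

# Hypothetical sample data, invented for illustration.
xml_data = """<?xml version="1.0"?>
<stock>
  <product id="p1"><name>Keyboard</name></product>
  <product id="p2"><name>Mouse</name></product>
</stock>"""

# parseString() takes the XML text and returns a Document object.
xml_doc = minidom.parseString(xml_data)
print(type(xml_doc).__name__)
```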
If the parser raises an error at this point, you probably need to remove the leading and trailing whitespace characters introduced by the copy/paste operation:
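The sketch below reproduces one way this can happen: a blank line before the XML declaration makes the parser raise an `ExpatError`, and `str.strip()` fixes it. The sample data and the variable name `pasted` are assumptions, and the exact message you see may differ:

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical sample data, invented for illustration.
xml_data = """<?xml version="1.0"?>
<stock>
  <product id="p1"><name>Keyboard</name></product>
  <product id="p2"><name>Mouse</name></product>
</stock>"""

# Simulate a copy/paste that added a newline before and after the data.
pasted = "\n" + xml_data + "\n"

try:
    minidom.parseString(pasted)
    failed = False
except ExpatError as exc:
    # The XML declaration must be the first thing in the document.
    failed = True
    print("Parsing failed:", exc)

# Stripping the surrounding whitespace makes the same data parse cleanly.
xml_doc = minidom.parseString(pasted.strip())
print(xml_doc.documentElement.tagName)
```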
If you instead save the data in an XML file (with the .xml extension), use parse() to parse it:
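A minimal sketch of the file-based variant. So that it runs anywhere, it first writes the hypothetical sample data to a temporary file; the file name `stock.xml` is an assumption:

```python
import os
import tempfile
from xml.dom import minidom

# Hypothetical sample data, invented for illustration.
xml_data = """<?xml version="1.0"?>
<stock>
  <product id="p1"><name>Keyboard</name></product>
  <product id="p2"><name>Mouse</name></product>
</stock>"""

with tempfile.TemporaryDirectory() as tmp_dir:
    # Save the data to a .xml file, as described above.
    xml_path = os.path.join(tmp_dir, "stock.xml")
    with open(xml_path, "w", encoding="utf-8") as xml_file:
        xml_file.write(xml_data)

    # parse() accepts a file name (or an open file object).
    xml_doc = minidom.parse(xml_path)

print(xml_doc.documentElement.tagName)
```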
Once the XML Document object xml_doc is obtained, the remaining operations are the same in both cases.
Getting the root element
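The root element is available through the `documentElement` attribute of the Document object. A short sketch, again with hypothetical sample data:

```python
from xml.dom import minidom

# Hypothetical sample data, invented for illustration.
xml_data = """<?xml version="1.0"?>
<stock>
  <product id="p1"><name>Keyboard</name></product>
  <product id="p2"><name>Mouse</name></product>
</stock>"""

xml_doc = minidom.parseString(xml_data)

# documentElement is the single top-level element of the document.
root = xml_doc.documentElement
print(root.tagName)
```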
We have only two products as direct children of stock, which is the root element of our XML Document xml_doc. So we expect the number of children of root = xml_doc.documentElement to be 2, but we get a different result:
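With the hypothetical sample data used in this sketch, where each product sits on its own indented line, the count comes out higher than 2:

```python
from xml.dom import minidom

# Hypothetical sample data, invented for illustration.
xml_data = """<?xml version="1.0"?>
<stock>
  <product id="p1"><name>Keyboard</name></product>
  <product id="p2"><name>Mouse</name></product>
</stock>"""

xml_doc = minidom.parseString(xml_data)
root = xml_doc.documentElement

# childNodes holds every child node, not only elements.
child_count = len(root.childNodes)
print(child_count)
```

The exact number depends on how the document is indented, so it may differ for your own data.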
To understand this weird result, you can inspect the output of root.childNodes:
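The extra entries are text nodes: minidom keeps the whitespace (newlines and indentation) between tags as separate children. The sketch below, based on hypothetical sample data, lists each child's node name and then keeps only the element nodes:

```python
from xml.dom import minidom

# Hypothetical sample data, invented for illustration.
xml_data = """<?xml version="1.0"?>
<stock>
  <product id="p1"><name>Keyboard</name></product>
  <product id="p2"><name>Mouse</name></product>
</stock>"""

xml_doc = minidom.parseString(xml_data)
root = xml_doc.documentElement

# Text nodes report "#text" as their node name; elements report their tag.
node_names = [child.nodeName for child in root.childNodes]
print(node_names)

# Keep only the element children to get the count we expected.
elements = [c for c in root.childNodes if c.nodeType == c.ELEMENT_NODE]
print(len(elements))
```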
For example, let us display the tag name of the root's child whose id is p1:
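One possible way to do this, assuming the hypothetical sample data below, is to loop over the children, skip the text nodes and compare the `id` attribute:

```python
from xml.dom import minidom

# Hypothetical sample data, invented for illustration.
xml_data = """<?xml version="1.0"?>
<stock>
  <product id="p1"><name>Keyboard</name></product>
  <product id="p2"><name>Mouse</name></product>
</stock>"""

xml_doc = minidom.parseString(xml_data)
root = xml_doc.documentElement

matching_tags = []
for child in root.childNodes:
    # Only element nodes have attributes; text nodes are skipped.
    if child.nodeType == child.ELEMENT_NODE and child.getAttribute("id") == "p1":
        matching_tags.append(child.tagName)
        print(child.tagName)
```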
One important thing to be aware of is that minidom consumes a lot of memory, because it builds the entire document tree in memory before you can work with it. It is therefore handy mainly for small XML files.
Another thing to know is that minidom does not support XPath expressions.
To remedy the two aforementioned drawbacks, it is recommended to use the lxml library.
The library is quite well documented [2]. If you do not have Internet access to check the documentation, you can always list the most important functions minidom offers by inspecting the XML Document object you created:
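One way to do this is the built-in `dir()` function. In the sketch below, filtering out names that start with an underscore is an optional addition, and the sample data is hypothetical:

```python
from xml.dom import minidom

# Hypothetical sample data, invented for illustration.
xml_data = """<?xml version="1.0"?>
<stock>
  <product id="p1"><name>Keyboard</name></product>
  <product id="p2"><name>Mouse</name></product>
</stock>"""

xml_doc = minidom.parseString(xml_data)

# dir() lists the attributes and methods of the object;
# names starting with "_" are internal and left out here.
public_names = [name for name in dir(xml_doc) if not name.startswith("_")]
print(public_names)
```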
You can then inspect an individual function (chosen from that list) this way:
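The built-in `help()` function serves this purpose. The sketch picks `getElementsByTagName` as an arbitrary example and uses hypothetical sample data:

```python
from xml.dom import minidom

# Hypothetical sample data, invented for illustration.
xml_data = """<?xml version="1.0"?>
<stock>
  <product id="p1"><name>Keyboard</name></product>
  <product id="p2"><name>Mouse</name></product>
</stock>"""

xml_doc = minidom.parseString(xml_data)

# help() prints the signature and any available documentation.
help(xml_doc.getElementsByTagName)
```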
[1] https://wiki.python.org/moin/MiniDom
[2] https://docs.python.org/3.0/library/xml.dom.html