How to preserve XML::Simple element order
In the FAQ, it is mentioned that in the future Tie::IxHash could be used in XML::Simple. I am using v2.18 at the moment. Is there any hack to preserve the order of the data in the hashref?
To the above question Grant McLean, the author of XML::Simple, answered:
Retaining element order is not and never will be a feature of XML::Simple. For some XML document types you might be able to hack it in by subclassing XML::Simple and overriding the new_hashref() method to supply a hashref tied to Tie::IxHash. That could solve the ABC case but it won't solve the ABA case.
The short answer is that if you care about element order then you should not use XML::Simple. XML::LibXML is an excellent alternative which for many use cases is really no harder to use than XML::Simple.
Sample XML file with ABA
In this example we have three elements inside the "order" root element: a pizza, a drink, and another pizza. If I am not mistaken, this is what Grant meant by the 'ABA' case.
examples/order.xml
<order>
  <pizza>
    <name>Margarita</name>
    <size>large</size>
  </pizza>
  <drink>
    <name>Coke</name>
    <size>1.5 L</size>
  </drink>
  <pizza>
    <name>Funghi</name>
    <size>XXL</size>
  </pizza>
</order>
Using XML::Simple we can read in this XML file and immediately print it out.
examples/xml_simple_order.pl
use strict;
use warnings;

use XML::Simple qw(XMLin XMLout);

my $xml = XMLin('order.xml', ForceArray => 1, KeepRoot => 1);
print XMLout($xml, KeepRoot => 1);
The result is this:
examples/order_out.xml
<order>
  <drink>
    <name>Coke</name>
    <size>1.5 L</size>
  </drink>
  <pizza>
    <name>Margarita</name>
    <size>large</size>
  </pizza>
  <pizza>
    <name>Funghi</name>
    <size>XXL</size>
  </pizza>
</order>
The order of the elements has changed.
This might be acceptable in some cases, but in other cases the order of the elements might be important.
Dumping the XML as a Perl data structure
So why has XML::Simple messed up the order? The answer is in how it holds the data after reading the XML file. We can use Data::Dumper to see the content of the $xml variable:
examples/xml_simple_order_dump.pl
use strict;
use warnings;

use XML::Simple qw(XMLin XMLout);
use Data::Dumper qw(Dumper);

my $xml = XMLin('order.xml', ForceArray => 1, KeepRoot => 1);
print Dumper $xml;
The output looks like this:
examples/order_dump_out.txt
$VAR1 = {
  'order' => [
    {
      'pizza' => [
        {
          'size' => [
            'large'
          ],
          'name' => [
            'Margarita'
          ]
        },
        {
          'size' => [
            'XXL'
          ],
          'name' => [
            'Funghi'
          ]
        }
      ],
      'drink' => [
        {
          'name' => [
            'Coke'
          ],
          'size' => [
            '1.5 L'
          ]
        }
      ]
    }
  ]
};
Here we can see that XML::Simple has merged the content of the two 'pizza' tags into one hash. From this data structure we cannot rebuild the original order of pizza-drink-pizza.
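The loss of information can be reproduced without XML::Simple at all. The following is only a minimal sketch, not the actual code of XML::Simple: the @children list and the %by_tag and @first_seen names are made up for illustration. It groups the child elements by tag name, the way the dumped structure does, and also remembers the order in which each tag name first appeared, which is roughly what an order-preserving hash such as Tie::IxHash would add.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the children of <order>, in document order.
my @children = (
    [ pizza => 'Margarita' ],
    [ drink => 'Coke' ],
    [ pizza => 'Funghi' ],
);

my %by_tag;        # tag name => list of names, like the dumped structure
my @first_seen;    # tag names in the order they first appeared

for my $child (@children) {
    my ($tag, $name) = @$child;
    push @first_seen, $tag unless exists $by_tag{$tag};
    push @{ $by_tag{$tag} }, $name;
}

# Try to rebuild the sequence of tags from the grouped data.
my @rebuilt = map { ($_) x scalar @{ $by_tag{$_} } } @first_seen;

print "original: ", join('-', map { $_->[0] } @children), "\n";
print "rebuilt:  ", join('-', @rebuilt), "\n";
```

This prints pizza-drink-pizza for the original and pizza-pizza-drink for the rebuilt sequence. Remembering the key order is enough when every tag name appears only once (the 'ABC' case), but once two pizzas share one key, the position of the drink between them is gone.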
Is this a bug in XML::Simple? You can say so, or you can say that it was a design decision. Either way, Grant, the author of the module, is not planning to change this behaviour and actually recommends against using XML::Simple for new projects. Back when XML::Simple was created it was a good solution for a lot of problems, but today there are other solutions. He recommends XML::LibXML.
Let's see how this round-trip works using XML::LibXML.
Round-trip with XML::LibXML
examples/xml_libxml_order.pl
use strict;
use warnings;

use XML::LibXML;

my $dom = XML::LibXML->load_xml(location => 'order.xml');
print $dom->toString();
Running this code we'll get the XML back exactly in the same order as we had in the original file.
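Because the DOM keeps the child nodes in document order, the items can also be processed one by one. Here is a minimal sketch, assuming the same content as order.xml (inlined as a string so the script is self-contained); the variable names are only for illustration.

```perl
use strict;
use warnings;

use XML::LibXML;

# Same content as order.xml, inlined to keep the example self-contained.
my $xml = <<'XML';
<order>
  <pizza><name>Margarita</name><size>large</size></pizza>
  <drink><name>Coke</name><size>1.5 L</size></drink>
  <pizza><name>Funghi</name><size>XXL</size></pizza>
</order>
XML

my $dom = XML::LibXML->load_xml(string => $xml);

my @items;
# The XPath expression '*' selects the child elements of <order>,
# returned in document order.
for my $item ($dom->documentElement->findnodes('*')) {
    push @items, sprintf '%s: %s (%s)',
        $item->nodeName,
        $item->findvalue('name'),
        $item->findvalue('size');
}

print "$_\n" for @items;
```

The three lines come out as pizza, drink, pizza, the same sequence as in the file.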
(This article was rescued from CPAN::Forum)
Published on 2015-09-12