The FAQ mentions that Tie::IxHash might be used in XML::Simple in the future. I am using v2.18 at the moment. Is there any hack to preserve the order of the data in the hashref?
To the above question Grant McLean, the author of XML::Simple, answered:
Retaining element order is not and never will be a feature of XML::Simple. For some XML document types you might be able to hack it in by subclassing XML::Simple and overriding the new_hashref() method to supply a hashref tied to Tie::IxHash. That could solve the ABC case but it won't solve the ABA case.
The short answer is that if you care about element order then you should not use XML::Simple. XML::LibXML is an excellent alternative which for many use cases is really no harder to use than XML::Simple.
Sample XML file with ABA
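In this example we have 3 elements inside the "order" root element: a "pizza", a "drink", and then another "pizza". If I am not mistaken, this is what Grant meant by the 'ABA' case: two elements with the same name separated by an element with a different name.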
<order>
  <pizza>
    <name>Margarita</name>
    <size>large</size>
  </pizza>
  <drink>
    <name>Coke</name>
    <size>1.5 L</size>
  </drink>
  <pizza>
    <name>Funghi</name>
    <size>XXL</size>
  </pizza>
</order>
Using XML::Simple we can read in this XML file and immediately print it out.
use strict;
use warnings;
use XML::Simple qw(XMLin XMLout);
my $xml = XMLin('order.xml', ForceArray => 1, KeepRoot => 1);
print XMLout($xml, KeepRoot => 1);
The result is this:
<order>
  <drink>
    <name>Coke</name>
    <size>1.5 L</size>
  </drink>
  <pizza>
    <name>Margarita</name>
    <size>large</size>
  </pizza>
  <pizza>
    <name>Funghi</name>
    <size>XXL</size>
  </pizza>
</order>
The order of the elements has changed: the drink now comes first, followed by the two pizzas.
This might be acceptable in some cases, but in others the order of the elements carries meaning and must be kept.
Dumping the XML as a Perl data structure
So why has XML::Simple messed up the order? The answer is in how it holds the data after reading the XML file.
We can use Data::Dumper to see the content of the $xml variable:
examples/xml_simple_order_dump.pl
use strict;
use warnings;
use XML::Simple qw(XMLin XMLout);
use Data::Dumper qw(Dumper);
my $xml = XMLin('order.xml', ForceArray => 1, KeepRoot => 1);
print Dumper $xml;
The output looks like this:
$VAR1 = {
          'order' => [
                       {
                         'pizza' => [
                                      {
                                        'size' => [
                                                    'large'
                                                  ],
                                        'name' => [
                                                    'Margarita'
                                                  ]
                                      },
                                      {
                                        'size' => [
                                                    'XXL'
                                                  ],
                                        'name' => [
                                                    'Funghi'
                                                  ]
                                      }
                                    ],
                         'drink' => [
                                      {
                                        'name' => [
                                                    'Coke'
                                                  ],
                                        'size' => [
                                                    '1.5 L'
                                                  ]
                                      }
                                    ]
                       }
                     ]
        };
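Here we can see that XML::Simple has grouped the two 'pizza' elements into a single array stored under one 'pizza' key, next to a separate 'drink' key. The structure remembers the order of the pizzas relative to each other, but not where the drink stood between them, so we cannot rebuild the original pizza-drink-pizza order from it.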
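To make the limitation concrete, here is a minimal sketch in plain Perl that uses no XML module at all. It groups a list of children by tag name, which is roughly what the dump above suggests XML::Simple does; this is an illustration, not XML::Simple's actual code. The names @children and %by_tag are made up for this example.

```perl
use strict;
use warnings;

# The children of <order> in document order, as [tag, name] pairs.
# The values are copied by hand from order.xml.
my @children = (
    [ pizza => 'Margarita' ],
    [ drink => 'Coke' ],
    [ pizza => 'Funghi' ],
);

# Group the children by tag name: one hash key per tag,
# holding an array of everything seen under that tag.
my %by_tag;
for my $child (@children) {
    my ($tag, $name) = @$child;
    push @{ $by_tag{$tag} }, $name;
}

# The order within each tag survives ...
print "pizza: @{ $by_tag{pizza} }\n";
print "drink: @{ $by_tag{drink} }\n";

# ... but the hash has only two keys, so nothing in it records
# that the drink sat between the two pizzas.
my @tags = sort keys %by_tag;
print "tags in the hash: @tags\n";

# The original list still knows the full sequence.
my @original = map { $_->[0] } @children;
print "tags in the document: @original\n";
```

This also shows why an ordered hash would not be enough for the ABA case: it could remember that 'pizza' was seen before 'drink', but it would still have a single 'pizza' key. A list of children, as in @children, is the kind of structure that keeps the full sequence.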
Is this a bug in XML::Simple? You could call it that, or you could call it a design decision. Either way, Grant, the author of the module, is not planning to change this behaviour and actually recommends against using XML::Simple for new projects. Back when XML::Simple was created it was a good solution for a lot of problems, but today there are other options. He recommends XML::LibXML.
Let's see how this round-trip works using XML::LibXML.
Round-trip with XML::LibXML
use strict;
use warnings;
use XML::LibXML;
my $dom = XML::LibXML->load_xml(location => 'order.xml');
print $dom->toString();
Running this code we'll get the XML back exactly in the same order as we had in the original file.
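Printing the document back is only part of the story; usually we also want to process the elements in their original order. The following is a small sketch of how that might look with XML::LibXML. To keep it self-contained it parses the XML from a string instead of order.xml, and the XPath expression '/order/*' and the @items array are choices made for this example.

```perl
use strict;
use warnings;
use XML::LibXML;

# The same content as order.xml, inlined so the example runs on its own.
my $dom = XML::LibXML->load_xml(string => <<'XML');
<order>
  <pizza><name>Margarita</name><size>large</size></pizza>
  <drink><name>Coke</name><size>1.5 L</size></drink>
  <pizza><name>Funghi</name><size>XXL</size></pizza>
</order>
XML

# findnodes returns the matching element nodes in document order.
my @items;
for my $node ($dom->findnodes('/order/*')) {
    push @items, sprintf '%s: %s (%s)',
        $node->nodeName,
        $node->findvalue('./name'),
        $node->findvalue('./size');
}

print "$_\n" for @items;
```

Because the DOM keeps the children as a sequence of nodes and not as a hash keyed by tag name, the loop sees pizza, drink, pizza, in the same order as in the file.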
(This article was rescued from CPAN::Forum)