Compare huge XML Files

John Bokma
Posted: Sat Feb 26, 2005 1:59 am    Post subject: Re: Compare huge XML Files

junnuthala wrote:

> The bottleneck is not in XML::Parser when I use the "Stream" option,
> which returns all the tags in XML format itself.
>
> But when I use the "Tree" option and process each tag, I get a long
> delay.
>
> I guess I have to use the "Stream" option and write my own callback
> functions for StartTag, EndTag, StartDocument and EndDocument.
>
> Does anyone have any suggestions on what would be the fastest way?

How much memory does the process use?

--
John
Small Perl scripts: http://johnbokma.com/perl/
Perl programmer available: http://castleamber.com/
Happy Customers: http://castleamber.com/testimonials.html

junnuthala
Posted: Fri Feb 25, 2005 9:01 pm    Post subject: Re: Compare huge XML Files

The bottleneck is not in XML::Parser when I use the "Stream" option,
which returns all the tags in XML format itself.

But when I use the "Tree" option and process each tag, I get a long
delay.

I guess I have to use the "Stream" option and write my own callback
functions for StartTag, EndTag, StartDocument and EndDocument.

Does anyone have any suggestions on what would be the fastest way?
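
For reference, a minimal sketch of what such callbacks could look like with the Stream style. XML::Parser looks the subs up by name in the package passed as Pkg. The package name ElementCounter, the %count hash and the sample document are made up for illustration. Text and EndTag are defined as no-ops because the Stream style prints anything it finds no sub for.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;

package ElementCounter;    # hypothetical package holding the callbacks

our %count;                # element name => number of start tags seen

sub StartDocument { %count = () }

sub StartTag {
    my ($expat, $name) = @_;
    $count{$name}++;       # the tag's attributes are available in %_
}

sub EndTag      { }        # no-op; without it the end tag would be printed
sub Text        { }        # no-op; the text itself arrives in $_
sub EndDocument { }

package main;

my $parser = XML::Parser->new(Style => 'Stream', Pkg => 'ElementCounter');
$parser->parse('<root><item id="1">a</item><item id="2">b</item></root>');

# Report how often each element name occurred.
printf "%-6s %d\n", $_, $ElementCounter::count{$_}
    for sort keys %ElementCounter::count;
```

For the sample document this counts two item elements and one root element. Nothing is kept in memory beyond the counters, which is the main difference from the Tree style.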

John Bokma
Posted: Fri Feb 25, 2005 4:42 pm    Post subject: Re: Compare huge XML Files

junnuthala wrote:

> I have a 6MB XML file, but it has more than 300,000 elements.
>
> XML::Parser is taking almost 35 minutes to get the result as a tree.

That sounds awfully slow. I parse over half a MB in seconds. I don't build
a tree but the overhead of doing so (if I do it) is insignificant.
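
One way to check whether the parser or the code around it is the slow part is to time the parse alone. A rough sketch, assuming a synthetic document: the element count comes from the post above, but the element names, the content and the time_parse helper are invented here. It compares empty handlers with the Tree style using Time::HiRes.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;
use Time::HiRes qw(time);

# Time a single parse and return (elapsed seconds, parse result).
sub time_parse {
    my ($parser, $xml) = @_;
    my $t0     = time();
    my $result = $parser->parse($xml);
    return (time() - $t0, $result);
}

# Synthetic input: element count from the post, content invented.
# Pass a smaller number as the first argument to save memory.
my $n   = shift // 300_000;
my $xml = '<root>';
$xml .= qq{<item id="$_">value $_</item>} for 1 .. $n;
$xml .= '</root>';

my @cases = (
    [ 'no-op handlers' => XML::Parser->new(
        Handlers => { Start => sub { }, End => sub { }, Char => sub { } }) ],
    [ 'Tree style'     => XML::Parser->new(Style => 'Tree') ],
);

for my $case (@cases) {
    my ($label, $parser) = @$case;
    my ($elapsed) = time_parse($parser, $xml);
    printf "%-15s %.2f s for %d elements\n", $label, $elapsed, $n + 1;
}
```

If both numbers come out in seconds rather than minutes, the delay is probably in whatever walks the tree afterwards, or the machine is swapping.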

junnuthala
Posted: Fri Feb 25, 2005 12:22 am    Post subject: Re: Compare huge XML Files

I have a 6MB XML file, but it has more than 300,000 elements.

XML::Parser is taking almost 35 minutes to get the result as a tree.

junnuthala
Posted: Fri Feb 25, 2005 12:07 am    Post subject: Re: Compare huge XML Files

Thomas Malt wrote:

> "junnuthala" <junnuthula@yahoo.com> writes:
>
>> John Bokma wrote:
>>> junnuthala wrote:
>>>
>>>> Hello,
>>>>
>>>> Can someone please suggest a Perl module for comparing huge XML
>>>> files?
>>>
>>> How do you want to compare them?
>>
>> Parse and read the XML elements, attributes and text into a tree or a
>> hash and then compare.
>>
>> I tried using XML::SemanticDiff, but it is taking a lot of time to
>> read the XML file into a hash.
>
> That will take a lot of time no matter what you do. But define
> "huge". And define "a lot of time" :-).

I have an XML file of size 6MB, but it has more than 300,000 elements.

> If you want efficiency more than anything else then XML::Parser is
> still the fastest. Or at least it was the last time I checked.

XML::Parser is taking almost 35 minutes to get the result as a tree.

> Implementing handlers to put attributes and CDATA into a hash is
> really straightforward, but if your files are in the 100MB area
> that could still take several minutes depending on your hardware.

Thomas Malt
Posted: Wed Feb 23, 2005 10:25 pm    Post subject: Re: Compare huge XML Files

"junnuthala" <junnuthula@yahoo.com> writes:

> John Bokma wrote:
>> junnuthala wrote:
>>
>>> Hello,
>>>
>>> Can someone please suggest a Perl module for comparing huge XML
>>> files?
>>
>> How do you want to compare them?
>
> Parse and read the XML elements, attributes and text into a tree or a
> hash and then compare.
>
> I tried using XML::SemanticDiff, but it is taking a lot of time to
> read the XML file into a hash.

That will take a lot of time no matter what you do. But define
"huge". And define "a lot of time" :-).

If you want efficiency more than anything else then XML::Parser is
still the fastest. Or at least it was the last time I checked.

Implementing handlers to put attributes and CDATA into a hash is
really straightforward, but if your files are in the 100MB area
that could still take several minutes depending on your hardware.
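
Something along these lines, as a rough sketch rather than a finished tool. The function name xml_to_hash and the path-style keys (such as /root[1]/item[2]) are choices made for this example. Only the Start, Char and End handlers of XML::Parser are used.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;

# Flatten a document into a hash keyed by element path, for example
# "/root[1]/item[2]" => { attr => { id => 2 }, text => 'b' }.
sub xml_to_hash {
    my ($xml) = @_;
    my %nodes;
    my @keys;           # path keys of the currently open elements
    my @seen = ({});    # per level: element name => siblings seen so far

    my $parser = XML::Parser->new(
        Handlers => {
            Start => sub {
                my ($expat, $name, %attr) = @_;
                my $index  = ++$seen[-1]{$name};
                my $parent = @keys ? $keys[-1] : '';
                my $key    = sprintf '%s/%s[%d]', $parent, $name, $index;
                push @keys, $key;
                push @seen, {};
                $nodes{$key} = { attr => {%attr}, text => '' };
            },
            Char => sub {
                my ($expat, $text) = @_;
                # Character data may arrive in several pieces; append each.
                $nodes{ $keys[-1] }{text} .= $text if @keys;
            },
            End => sub {
                pop @keys;
                pop @seen;
            },
        },
    );
    $parser->parse($xml);
    return \%nodes;
}

my $nodes = xml_to_hash('<root><item id="1">a</item><item id="2">b</item></root>');
for my $key (sort keys %$nodes) {
    my $attr = $nodes->{$key}{attr};
    print "$key ", join(',', map { "$_=$attr->{$_}" } sort keys %$attr),
          " '$nodes->{$key}{text}'\n";
}
```

For a file on disk, parsefile can be used in place of parse. The hash still holds one entry per element, so memory grows with the element count.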

Thomas
--
: Thomas Malt.: tm@linpro.no ...: http://www.malt.no/ ...: +4797748504 :
: Linpro AS...: info@linpro.no .: http://www.linpro.no/ .: +4722871180 :
: :... >> Ledende på Linux i Norge >> Best på alt i verden :

junnuthala
Posted: Wed Feb 23, 2005 7:44 pm    Post subject: Re: Compare huge XML Files

John Bokma wrote:

> junnuthala wrote:
>
>> Hello,
>>
>> Can someone please suggest a Perl module for comparing huge XML
>> files?
>
> How do you want to compare them?

Parse and read the XML elements, attributes and text into a tree or a
hash and then compare.

I tried using XML::SemanticDiff, but it is taking a lot of time to
read the XML file into a hash.
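
To make the "then compare" step concrete, here is a small sketch. It assumes both files have already been flattened into hashes of path => value. The diff_flat name, the path format and the sample data are invented for illustration, and this is not how XML::SemanticDiff works internally.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Compare two flattened documents (path => value) and return a list of
# human-readable differences, sorted by path.
sub diff_flat {
    my ($old, $new) = @_;
    my %all = map { ($_ => 1) } (keys %$old, keys %$new);
    my @diffs;
    for my $path (sort keys %all) {
        if (!exists $new->{$path}) {
            push @diffs, "removed: $path";
        }
        elsif (!exists $old->{$path}) {
            push @diffs, "added:   $path";
        }
        elsif ($old->{$path} ne $new->{$path}) {
            push @diffs, "changed: $path ('$old->{$path}' -> '$new->{$path}')";
        }
    }
    return @diffs;
}

# Made-up sample data standing in for two parsed files.
my %old = ('/root/item[1]' => 'a', '/root/item[2]' => 'b');
my %new = ('/root/item[1]' => 'a', '/root/item[2]' => 'c', '/root/item[3]' => 'd');
print "$_\n" for diff_flat(\%old, \%new);
```

With the sample data this reports item[2] as changed and item[3] as added. The comparison itself is a single pass over the union of keys, so the expensive part remains building the two hashes.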

John Bokma
Posted: Wed Feb 23, 2005 5:36 pm    Post subject: Re: Compare huge XML Files

junnuthala wrote:

> Hello,
>
> Can someone please suggest a Perl module for comparing huge XML
> files?

How do you want to compare them?

junnuthala
Posted: Wed Feb 23, 2005 8:01 am    Post subject: Compare huge XML Files

Hello,

Can someone please suggest a Perl module for comparing huge XML
files?

I tried XML::SemanticDiff, but it is taking a very long time to
load the XML file's nodes, elements and attributes into the hash.

Any suggestions would be really appreciated.

Thank you
-Venkat