Rupert Woodman *nix forums beginner
Joined: 21 Jun 2006
Posts: 5
Posted: Mon Jul 03, 2006 4:30 pm Post subject: Efficient way of doing this
Hi All,
I have a number of XML documents on disk, which need to be munged and
inserted into Berkeley DB for later querying.
The documents will be something like:
<?xml version="1.0" encoding="UTF-8" ?>
<metadata>
<test name="test1">
.
.
.
</test>
<test name="test2">
.
.
.
</test>
</metadata>
And I want to end up with 2 documents in the database, looking something
like:
<?xml version="1.0" encoding="UTF-8" ?>
<test name="test1" deliveryid="1">
..
..
..
</test>
and
<?xml version="1.0" encoding="UTF-8" ?>
<test name="test2" deliveryid="1">
..
..
..
</test>
So I want to pull out the <test> elements, insert a new attribute into each,
and write each one to the database as a separate document (and I'm assuming
I need to add the <?xml...> declaration to make it valid).
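As a rough sketch of the transformation described above (not a Berkeley DB XML solution), the splitting step could be done with the JAXP classes that ship with the JDK, which would avoid bundling a separate Xerces JAR. The class name TestSplitter and the method split are hypothetical, and the sketch assumes the <test> elements are direct children of the root element. Each returned String would then be handed to putDocument.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class TestSplitter {

    // Returns one serialized XML document per <test> child of the root,
    // each with a deliveryid attribute added.
    static List<String> split(String xml, String deliveryId) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document source = builder.parse(new InputSource(new StringReader(xml)));

            // Identity transformer, used here purely as a serializer.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

            List<String> documents = new ArrayList<>();
            NodeList children = source.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip text nodes, comments, and elements other than <test>.
                if (child.getNodeType() != Node.ELEMENT_NODE
                        || !"test".equals(child.getNodeName())) {
                    continue;
                }
                Element test = (Element) child;
                test.setAttribute("deliveryid", deliveryId);

                // Serialize the element on its own, as a standalone document.
                StringWriter out = new StringWriter();
                serializer.transform(new DOMSource(test), new StreamResult(out));
                documents.add(out.toString());
            }
            return documents;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not split the document", e);
        }
    }
}
```

Calling TestSplitter.split(xml, "1") on the sample above should give two Strings, one per "
        + "<other/>"
        + "<test name=\"test2\"><b/></test>"
        + "</metadata>";
java.util.List<String> docs = TestSplitter.split(xml, "1");
if (docs.size() != 2) throw new AssertionError("expected 2 documents, got " + docs.size());
if (!docs.get(0).contains("name=\"test1\"")) throw new AssertionError(docs.get(0));
if (!docs.get(0).contains("deliveryid=\"1\"")) throw new AssertionError(docs.get(0));
if (!docs.get(1).contains("name=\"test2\"")) throw new AssertionError(docs.get(1));
if (!docs.get(1).contains("deliveryid=\"1\"")) throw new AssertionError(docs.get(1));
if (docs.get(0).contains("metadata")) throw new AssertionError(docs.get(0));
if (docs.get(0).contains("test2")) throw new AssertionError(docs.get(0));
</test>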
I could extract the data using Xerces, add the attribute, and then use
putDocument, but I have a couple of problems with doing that:
1) Whenever I've read a document from disk into a String, putDocument
has failed with a UTFDataFormatException (which I've not managed to solve,
other than by using an XMLInputStream).
2) I am writing a web service running under Axis, and I use class loaders to
encapsulate the classes relevant to the task in hand. The way this
currently works is that if I need to deal with a different type of test, I
can just plonk a new JAR file in a specific directory and the service is then
able to process that type of test. Having to distribute Xerces with each JAR
file would make this much more difficult, and I'd end up distributing Xerces
multiple times.
It seems to me that this should be possible with the Berkeley DB XML Java API,
but I can't see a way. The API seems to only allow data to be modified after
it has been returned from a container.
Could anyone suggest a way of achieving what I want to do? I can't believe
this is an unusual requirement; I just can't see how to solve it!
Many thanks,
Rupert