XOM is a fantastic XML API for Java that sits on top of a SAX parser such as Apache Xerces. It is very memory efficient: even if you read an entire document into memory, XOM uses as little memory as possible. More importantly, XOM lets you filter documents as they’re built, so you don’t have to build the parts of the tree you aren’t interested in. For instance, you can skip building text nodes that only represent boundary white space, if that white space is not significant in your application. You can even process a document piece by piece and throw away each piece when you’re done with it. XOM has been used to process documents that are gigabytes in size.
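To make the filtering idea concrete, here is a minimal sketch of a custom `NodeFactory` passed to a `Builder`. It assumes the XOM jar is on the classpath. The class names `FilteringExample` and `StreamingFactory`, the `item` element name, and the `itemsSeen` counter are made up for illustration; a real application would do its own processing where the counter is incremented.

```java
import java.io.StringReader;

import nu.xom.Builder;
import nu.xom.Document;
import nu.xom.Element;
import nu.xom.NodeFactory;
import nu.xom.Nodes;

public class FilteringExample {

    // Hypothetical factory: drops whitespace-only text and discards
    // each <item> element as soon as it has been fully parsed.
    static class StreamingFactory extends NodeFactory {
        int itemsSeen = 0;

        @Override
        public Nodes makeText(String data) {
            // Returning an empty Nodes list means no Text node is built.
            if (data.trim().isEmpty()) {
                return new Nodes();
            }
            return super.makeText(data);
        }

        @Override
        public Nodes finishMakingElement(Element element) {
            // "item" is an assumed element name for this sketch.
            if ("item".equals(element.getLocalName())) {
                itemsSeen++;        // process the finished piece here
                return new Nodes(); // then drop it from the tree
            }
            return super.finishMakingElement(element);
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<items>\n  <item>a</item>\n  <item>b</item>\n</items>";
        StreamingFactory factory = new StreamingFactory();
        Document doc = new Builder(factory).build(new StringReader(xml));
        System.out.println("items processed: " + factory.itemsSeen);
        System.out.println("children kept under root: "
                + doc.getRootElement().getChildCount());
    }
}
```

Because each `item` is dropped right after it is handled, the tree that stays in memory does not grow with the number of items. Note that the root element itself has to be kept, so the factory only discards elements below it.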
You will find some cool examples here