We first describe the approach to XML data integration in the
XPathLog/LoPiX project that uses a warehouse strategy. We show that
the DOM model and the XML Query Data Model are not suitable for this
task since the integrated database is not necessarily a tree, but a
set of overlapping (original and integrated) trees. We solve this
problem with XTreeGraph, a node-labeled, graph-based data model for
the internal XML database that represents multiple overlapping XML
trees, or tree views.
In the second part, we return to the standard XML data model while
keeping the overlapping-tree idea by "simulating" it: the data is
represented internally as XML, and the "overlaid" result tree is
represented by XLink elements that refer to the original sources.
With a logical, transparent data model for XLinks, as investigated
in WWW-02, all queries behave as if they were stated against the
XTreeGraph. The use of links for partial
materialization also turns the approach from a warehouse approach
into a mixed approach that combines the advantages of the warehouse
approach and of the virtual approach. The approach is again
illustrated using XPathLog as the data integration language.
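
The overlapping-tree idea from the first part can be made concrete with a
small sketch. It is only an illustration under stated assumptions, not the
XTreeGraph definition used in the project: the class name Node, the helper
paths, and the sample element names and data are all hypothetical. The point
it shows is that a node carrying its own label can be a child of more than
one parent, so an original tree and an integrated result tree can share
subtrees, which a strict tree model does not allow.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node in a node-labeled graph (hypothetical illustration)."""
    label: str                                     # element name lives on the node
    text: str = ""                                 # optional text content
    children: list = field(default_factory=list)   # ordered child nodes

    def add(self, child):
        """Append an existing node as a child; the node is shared, not copied."""
        self.children.append(child)
        return child


def paths(root, prefix=""):
    """Enumerate the label paths of the tree view rooted at `root`."""
    here = f"{prefix}/{root.label}"
    yield here
    for child in root.children:
        yield from paths(child, here)


# Original source tree (sample data, invented for this sketch).
source = Node("countries")
country = source.add(Node("country"))
country.add(Node("name", "Germany"))

# Integrated result tree: it reuses the very same country node, so the
# two trees overlap in one graph instead of holding two copies.
result = Node("result")
continent = result.add(Node("continent"))
continent.add(country)
```

Each root yields its own tree view: list(paths(source)) walks the original
tree, list(paths(result)) walks the integrated one, and both reach the same
shared country node.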