Institute for Informatics
Georg-August-Universität Göttingen

Databases and Information Systems

International Workshop on Internet Data Management (IDM'99),
Firenze, Italy, September 2, 1999. Proc. DEXA 99 Workshop, IEEE Computer Society Press, pp. 721-725.

Modeling and Querying Structure and Contents of the Web

Wolfgang May

Abstract:

For accessing and processing the information provided on the Web, there is a need for extraction, restructuring, and integration of semistructured data from autonomous, heterogeneous sources. In this paper, we regard the Web and its contents as a unit, represented in an object-oriented data model: the Web structure (inter-document level), given by its hyperlinks, the parse-trees of Web pages (intra-document level), and their contents. The model is complemented by a rule-based object-oriented language which is extended by Web access capabilities and allows for querying and navigation in the unified model. We show the practicability of our approach by using the FLORID system.
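
As a rough illustration of the unified model described above, the following Python sketch represents pages as parse trees (intra-document level), derives hyperlinks from those trees (inter-document level), and runs one query that navigates across both. It is not the rule-based language of the paper and not FLORID syntax; the names Node, TreeBuilder, reachable, headings, and the in-memory WEB dictionary are hypothetical and chosen only for this example, and the pages are held in memory rather than fetched from the Web.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class Node:
    """One element of a page's parse tree (intra-document level)."""
    tag: str
    attrs: dict
    children: list = field(default_factory=list)
    text: str = ""


class TreeBuilder(HTMLParser):
    """Builds a Node tree from well-nested HTML (void elements are not handled)."""

    def __init__(self):
        super().__init__()
        self.root = Node("document", {})
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, dict(attrs))
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        self.stack[-1].text += data.strip()


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def descendants(node):
    """Yield every node below `node` in document order."""
    for child in node.children:
        yield child
        yield from descendants(child)


def links(root):
    """Hyperlink targets of a page (inter-document level)."""
    return [n.attrs["href"] for n in descendants(root)
            if n.tag == "a" and "href" in n.attrs]


def reachable(web, start):
    """Map each URL reachable from `start` via hyperlinks to its parse tree."""
    seen, todo = {}, [start]
    while todo:
        url = todo.pop()
        if url in seen or url not in web:
            continue
        seen[url] = parse(web[url])
        todo.extend(links(seen[url]))
    return seen


def headings(web, start):
    """Contents of all <h1> elements on pages reachable from `start`."""
    return {url: [n.text for n in descendants(tree) if n.tag == "h1"]
            for url, tree in reachable(web, start).items()}


# Hypothetical two-page Web, held in memory instead of fetched over HTTP.
WEB = {
    "a.html": "<html><body><h1>Home</h1><a href='b.html'>next</a></body></html>",
    "b.html": "<html><body><h1>Details</h1><a href='a.html'>back</a></body></html>",
}
```

Calling headings(WEB, "a.html") follows the link from a.html to b.html and collects the heading text of both pages, so a single query touches link structure, parse trees, and contents together.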

[The Paper (IEEE Digital Library)]
[Slides]