Semistructured Data and XML Summer 2024
Prof. Dr. Wolfgang May
Date and Time:
- Monday 14-16 ct,
IFI SR 2.101
- Wednesday 10-12 ct,
IFI SR 2.101
- This year, DBIS will mainly teach non-live, using pre-recorded material.
There will be some live online meetings with BigBlueButton provided by GWDG;
the rooms/meetings can be entered via StudIP.
There may also be optional in-person meetings (Monday afternoon slots)
at the IFI.
- Materials for self-study (in English) will be linked below, week by week:
- revised videos taken from summer term 2020 (as the "original" dates in the filenames indicate)
- PDF slides
- Please also read the general and technical information
about DBIS virtual teaching.
Lecture and exercises are mixed (see announcements on this page). There will be non-mandatory
exercise sheets whose solutions will be discussed as part of the lecture.
All materials and announcements can be found HERE on the "blue DBIS pages".
Module M.Inf.1141, 4 SWS, 6 ECTS.
The module's home is the MSc studies in
Applied CS. It can also be credited in the BSc studies in Applied CS
(as "Vertiefung Softwaresysteme und Daten", B.inf.1706),
and in several other studies:
BSc/MSc Wirtschaftsinformatik, Mathematik (BSc/MSc), Digital Humanities,
Teaching/2-Fach-Bachelor, PhD GAUSS, ...
Course Description
One of the most important factors that led to the overall success of XML
is that the "XML world" combines a lot of already known concepts in an
optimal way for coping with a broad spectrum of requirements.
The course will first review some of these preceding (partially even historic)
concepts (network database model, relational databases, object-oriented
databases) and the integration of data and metadata (SchemaSQL). Then,
the idea of "semistructured data" is introduced by showing early
representatives that helped to shape the XML world (F-Logic, OEM).
In the main part, XML is presented as a data model and a markup metalanguage,
and the current languages of the XML world are systematically
investigated and applied: DTD, XPath, XQuery, XSLT, XLink, XML Schema,
and SQL/XML.
The lecture uses the geographical sample database "Mondial"
in its XML version for illustrations.
For practical exercises, the XML software is installed in the IFI CIP
Pool.
The software playground page can be found
here;
the XPath/XQuery/XSLT Web interface is available
here.
The sample code fragments can be found in the CIP pool under
/afs/informatik.uni-goettingen.de/course/xml-lecture/.
Dates & Topics
Part 0: Preface
Materials for self-studying:
Part I: History, evolution, and comparison of data models until 1995 and requirements for XML
- This part is not meant as a "technical lecture" for learning the details of some languages,
but shows how ideas and concepts, in this case data management and data models (and the concept
of high-level declarative query languages), evolved, and
how new requirements (Web, data integration, data interoperability, handling
documents+data and metadata) led to XML in the mid/late 1990s.
- 10.4., 15./17.4., 22./24.4.: Self-studying, no pre-scheduled live meetings.
- Overview:
- Data Models: about structuring data and the development of query languages.
Slides: Review of the Relational Model
Recording: General concepts
of data models, the relational model (with its querying concepts) as an example data model.
- The Requirements for "semistructured data" in the mid 90s, history of data models:
appropriateness of the data model for modeling data and other requirements of the time.
Slides: data models
Recording: the network database model (1960s,
pre-declarative querying) and the object-oriented database model (late 1980s)
Recording: the object-oriented database model
(late 1980s, OQL, OIF, Corba)
- Some references to read about database history (optional):
- "History" continued - requirements and academic prototypes of the early 1990s:
SchemaSQL (an
extremely powerful, yet syntactically minimal "opening" of SQL to metadata) and
early semistructured data models (Tsimmis/OEM and F-Logic):
Slides: early semistructured data models
Recording: SchemaSQL
Recording: Tsimmis/OEM (part 1)
Recording: Tsimmis (part 2), F-Logic and the situation pre-XML
- (Exercise: if you have knowledge about JSON (JavaScript Object Notation), compare JSON with the
concepts discussed above in Part I.)
Part II: XML concepts and technology - aspects of XML as a data structure in Computer Science
- Provisional exam date: Thursday, July 25th, 10-13h, online exam with ILIAS (as in previous years).
Confirmed in the meeting on April 29.
- Monday, April 29th, 14:15: Live online meeting
to get some feedback, answer questions, and to discuss/give a roadmap for the
rest of the lecture.
From now on, the course gets "productive" and continues with XML and the languages of the XML world.
- (Exercise: if you have knowledge about JSON, continuously compare JSON with the following aspects of XML.)
- 29.4.: XML: data model, language, DTDs etc.
Slides: XML basics
Recording: XML basics
- 1.5.: Holiday
- 6.5.: XML: DTDs etc. (cont'd)
Recording: DTD, the xmllint tool
- Exercise Sheet 1
(XML basics, parsing, grammar aspects)
If there are questions etc.,
the RocketChat dbis channel can be used
(participants are also encouraged to answer each other's questions).
- 8.5.: XML parsing ...
Recording: XML parsing, XHTML (and parsing)
- 13.5.: Discussion of Exercise Sheet 1:
Solutions to Exercise Sheet 1.
Recording:
- You find the recording from 2022 as follows: register for "Semistructured Data Summer
Term 2022",
and use the recording of the Meeting "2022-05-30-exercise-sheet-1" (this was an
"experimental" session in the seminar room, using the Smartboard and BBB)
Part III: Languages of the XML world: XPath, XQuery, XSLT ...
- Monday, May 13th, 14:15: Live online meeting
questions & answers, summary, outlook ...
- 15.5.: XPath: navigation and addressing language for XML
Slides: XPath
Recording: XPath I
- 20.5.: Holiday (Whitsun Monday)
- 22.5.: XPath (cont'd)
Recall slide from last time: XML Axes for XPath
XPath position functions (local) with graphics
Recording: XPath II
- Exercise Sheet 2: XPath
- 27.5.: XPath (cont'd)
Recording: XPath III
- 29.5./3.6.:
XPath (conclusions), XML Query Languages: History/Evolution - XQL, XML-QL
Recording: XPath IV: Conclusions
XPath tree navigation sketch (pdf)
Slides: XQuery
Recording: XQL, XML-QL
- 5.6.:
Solutions to Exercise Sheet 2
Existing Recording: Exercise 2.5, including some
insights into storage of XML data and the query evaluation strategy.
PDF graphics for Ex. 2.5 (axes/traversals)
- Monday, June 10th, 14:15: Live online meeting
to get some feedback, answer questions, and to discuss/give a roadmap for the
next lectures.
- 10.6.: XQuery
Recording: XQuery (I)
Notes on XML query language design (rule metaconcept, index-based eval)
Note: for
experimenting with XQuery, running saxon from the command line (on
the IFI CIP Pool computers, or installed on your own computer)
provides better error messages than the Web service.
- 12.6.: XQuery (cont'd)
Exercise Sheet 3 (XQuery)
Recording: XQuery (II+III)
pdf notes: functional "if"/"case" statement
- 17./19.6.: XQuery (cont'd)
Recording: XQuery (IV)
Recording: XML Query Languages - Conclusion
- Solutions to Exercise Sheet 3
- Monday, June 24th, 14:15: Live online meeting
to get some feedback, answer questions, and to discuss/give a roadmap for the
next lectures.
- 24./26.6./1.7.: XSLT
Slides: XSLT
Recording: XSLT (I)
Exercise Sheet 4 (XSLT)
Recording: XSLT (II)
Recording: XSLT (III)
- 3.7.: Solutions to Exercise Sheet 4
- Monday, July 8th, 14:15: Live online meeting
to get some feedback and to answer questions.
- Five ILIAS example exams are available online
via StudIP->SSD/XML-SS2024->Learning Modules:
a first sample exam based on the one from 2008 (edited for Ilias for SS2020),
and the regular ILIAS exams from 2020/2021/2022/2023.
If something does not work or looks strange, send me a mail or use the dbis RocketChat.
I cannot test what the students see and can (not) do with these exams.
- 8.7.: XML Schema
Slides: XML Schema
(note: the XML Schema atomic datatypes are also used in the RDF data model)
Recording: XML Schema
- 10.7.: Optional, belonging to the SQL lab course: XML in SQL/SQLX
Slides: XML and Databases
SQL Web interface (in addition to the
Mondial tables, the tables mondial, countryXML, cityXML used on the slides exist)
Recording: SQLX
- Note: the rest of the slide set belongs to the XML lab course,
which may take place again in the winter term 2024/25.
- Announcement:
Praktikum/Lab course XML WS2024/25
- Announcement:
Semantic Web WS2024/25
- (General) information about E-Exams
(click upper right for German language)
- 12.7.2024 End of lecture period.
- It is planned to have one or more (optional, recorded) online meetings, including questions and answers, beforehand.
- 22.7.2024 14-16 optional questions and answers meeting in BBB
- 24.7.2024 14-15 optional last-minute questions and answers meeting in BBB
- 25.7.2024 10:15 (IDENT), 10:30 (Ilias) - ca. 13:00 Exam (see below)
Exams
The exam will take place as a written/online exam using the ILIAS system.
Thursday, July 25th, 2024, 10-13h (time details see below)
Online, at home (or anywhere), with ILIAS.
- The exam is an "open-book exam", i.e., you may use any documentation
whenever you want (but it is intended that you should not need it much, except maybe
for looking up syntax; DBIS exams are not "learning" exams, but competence exams).
Help from other persons is not allowed.
- Strongly recommended: prepare a "cheat sheet" (German: Spickzettel) where you put
everything that you may want to look up quickly. Preparing it also helps you to become
familiar with the material.
Details of the syntax of the pre-XML history section are not relevant. That section is
for understanding the concepts and the problems.
- As in a "paper exam", solutions that do not work (completely) can be
submitted and will receive appropriate partial credit.
The Exam Procedure with IDENT, ILIAS, and BBB (as in 2023; technical details may change)
- Official information about online exams:
German / English
- We will use the IDENT feature (with photo). Enter IDENT via FlexNow
(the IDENT feature opens at 10:15).
You must identify yourself by uploading a photo of your face and your student ID card.
Before entering ILIAS, you will find an information text there (basically the same as here) and the ILIAS password
[not yet the password].
(Note: when entering, the IDENT system usually starts in German, so you have to switch
manually to English there.)
- We meet in a StudIP-BBB-Room, via the StudIP lecture -> Meetings -> "SSD-2024-07-25-EXAM".
- The BBB meeting officially begins at 10:30
(I will be there at about 10:10).
Then, I will read through the whole exam, give some comments, and answer
first questions, as in a synchronous in-person exam.
This is expected to take about 15 minutes, until about 10:45.
- Also in the StudIP course, under "Learning Modules" -> "Course in Ilias", you can
log into ILIAS from 10:30 on, using the above password.
It is strongly recommended not to start working in ILIAS immediately, but to
first follow the read-through of the whole exam.
After the read-through, ILIAS will remain open for 120 minutes, ending at about 12:45.
- You should have installed xmllint (for XML/DTD validation; it gives
better error messages than saxon) and saxon (for XQuery and XSLT), or
whatever XML software you want to use, on your computer.
Exam Preparation
Note: this section is usually published at the end of the lecture, when it starts being useful for exam preparation.
For now, it just gives an overview of what the exam looks like.
- You need to design a small but useful XML instance from the given text (which will contain
sample data as in the earlier exams). Note that for queries in the "all x such that for all y ..."
style, it is helpful to have such x,y in the instance.
- Have a plan and some experience in how to edit an XML file quickly by copying and pasting elements.
This can be much faster than in a paper exam, where the verbose XML markup must be written
several times.
Choose short element names and attribute names. You may also abbreviate text contents like
person names to initials.
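The first point can be illustrated with a minimal sketch. It assumes a hypothetical mini instance: the element names `uni`, `c`, `p`, `ok` and all data are invented for illustration and are not taken from any exam. Python's standard library is used only to make the check runnable; in the exam, the query itself would be written in XPath/XQuery. The instance deliberately contains one person who satisfies the "for all" condition and two who do not.

```python
import xml.etree.ElementTree as ET

# Hypothetical mini instance (invented for illustration):
# courses (c) and persons (p) with the courses they passed (ok).
# Short element and attribute names keep typing and copy+paste fast.
XML = """
<uni>
  <c id="c1"/>
  <c id="c2"/>
  <p n="A"><ok c="c1"/><ok c="c2"/></p>
  <p n="B"><ok c="c1"/></p>
  <p n="C"/>
</uni>
"""

def passed_all(root):
    """Names of all persons p such that for all courses c, p passed c."""
    courses = {c.get("id") for c in root.findall("c")}
    result = []
    for p in root.findall("p"):
        passed = {ok.get("c") for ok in p.findall("ok")}
        # "for all c": the set of all courses is a subset of the passed ones
        if courses <= passed:
            result.append(p.get("n"))
    return result

root = ET.fromstring(XML)
print(passed_all(root))  # ['A']
```

Person A is the only answer; B and C exist so that the query also has non-answers to reject. In XQuery, such a condition is typically expressed with the `every ... satisfies` quantifier.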
Training example exams
- Three computer-based ILIAS training exams are available via StudIP, see below.
In case of any questions, or in case something does not work, preferably use the
RocketChat:dbis channel.
- The written exams from previous years are also a good preparation (most of them are only
available in German, but they still give an overview of what the exams looked like):
The 2024 Exam
- [13.10.] The grading of the SSD/XML exam is finished. You should be
able to see your grades and the comments on your solutions in Ilias
(enter it as before via StudIP). The grades are not yet in FlexNow.
Grades: passed with 41P or more, 3.7 with 45P, and so on with one grade step every 4P, up to 1.0 with 77P or more.
- In case of any questions, there is an open post-exam review meeting
(German: Klausureinsicht) on Tuesday, Oct. 15, 13:30-14:30, in the
same BBB room as we used in the exam (enter it again via
StudIP->SSDXML->Meetings->). You can also ask questions in the
RocketChat or by mail, and we can arrange a meeting later.
- Exam 2024 without solutions (in English)
- Exam 2024 with solutions (in English)
- The grades will be submitted to FlexNow at the end of next week.
Short statistical overview:
4x 1.0, 3x 1.3, 5x 1.7, 1x 2.0, 2x 2.7, 3x 3.7, 4x not passed.
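The grade thresholds stated above can be read as a simple step function. The following Python sketch assumes that the steps not spelled out follow the usual German grade scale (4.0, 3.7, 3.3, ..., 1.0), with one step every 4 points starting at 41P. This reading and the function name `grade_for_points` are assumptions for illustration, not the official grading rule.

```python
# Assumed scale: usual German grade steps from "just passed" to "very good".
SCALE = [4.0, 3.7, 3.3, 3.0, 2.7, 2.3, 2.0, 1.7, 1.3, 1.0]

def grade_for_points(points):
    """Map exam points to a grade; returns None if not passed (below 41P)."""
    if points < 41:
        return None
    # One grade step per 4 points above the pass mark, capped at 1.0.
    step = min(int((points - 41) // 4), len(SCALE) - 1)
    return SCALE[step]

for pts in (40, 41, 45, 76, 77):
    print(pts, grade_for_points(pts))
```

Under this assumption, the stated anchor points are reproduced: 41P is a pass, 45P gives 3.7, and 77P or more gives 1.0.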