DataGuides
A DataGuide
is a "structural summary" for semistructured data
and may be considered an analog of a traditional database schema in the context of
semistructured data management.
A DataGuide is a descriptive schema for XML/SXML data.
While prescriptive schemas (DTD, XML Schema, RELAX NG)
act more like a traditional database schema, restricting the allowable
XML data, a DataGuide infers rather than imposes structure.
A DataGuide describes the actual (rather than the possible) structure of XML data
by extracting that structure from the data itself.
It may be used as a schema for semistructured data
that has no explicit schema declaration, such as non-validated XML documents.
A strong DataGuide for an SXML tree s is an SXML tree dg such
that exactly one element or attribute exists in dg for every location path
of s, and every location path of dg is a location path of
s.
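The definition above can be illustrated with a minimal sketch. It is written in Python rather than Scheme, models an SXML tree as nested lists, and is not the code of guides.ss. The function names strong_dataguide and _merge, the sample document, and the decision to ignore text nodes and attribute values are all assumptions made for this illustration.

```python
# Sketch only: an SXML element (name (@ (attr "v") ...) child ...) is modelled
# as the nested list [name, ["@", [attr, "v"], ...], child, ...].
# Hypothetical helper names; text nodes and attribute values are dropped.

def _merge(name, nodes):
    """Build one guide node for all elements that share a location path."""
    attrs = []    # attribute names, in first-seen order
    groups = {}   # child element name -> every such child under `nodes`
    for node in nodes:
        for child in node[1:]:
            if not isinstance(child, list):
                continue                      # text carries no structure
            if child[0] == "@":               # SXML attribute list
                for attr in child[1:]:
                    if attr[0] not in attrs:
                        attrs.append(attr[0])
            else:
                groups.setdefault(child[0], []).append(child)
    guide = [name]
    if attrs:
        guide.append(["@"] + [[a] for a in attrs])
    for child_name, members in groups.items():
        guide.append(_merge(child_name, members))
    return guide


def strong_dataguide(tree):
    """Return a tree with exactly one node per location path of `tree`."""
    return _merge(tree[0], [tree])


doc = ["library",
       ["book", ["@", ["id", "b1"]], ["title", "SICP"], ["author", "Abelson"]],
       ["book", ["@", ["id", "b2"], ["lang", "en"]], ["title", "HtDP"]],
       ["journal", ["title", "JFP"]]]

print(strong_dataguide(doc))
# ['library', ['book', ['@', ['id'], ['lang']], ['title'], ['author']], ['journal', ['title']]]
```

Both book elements collapse into a single book node whose attributes and children are the union of what occurs under either one, so every location path of the input appears once in the result and no path is invented.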
A flat DataGuide for an SXML tree s is a list of location paths
dg such that every location path of s appears exactly once in dg,
and every location path in dg is a location
path of s.
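A flat DataGuide can be sketched in the same spirit. The example below is Python, not the Scheme of guides.ss, and models an SXML tree as nested lists. The function name flat_dataguide, the sample document, and the XPath-like notation for paths ("/a/b" for elements, "/a/@x" for attributes) are assumptions for illustration; the output format of the actual tools is not shown here.

```python
# Sketch only: an SXML element (name (@ (attr "v") ...) child ...) is modelled
# as the nested list [name, ["@", [attr, "v"], ...], child, ...].
# The path notation and function name are hypothetical.

def flat_dataguide(tree):
    """Return each distinct location path of `tree` once, in first-seen order."""
    paths = []
    seen = set()

    def record(path):
        if path not in seen:
            seen.add(path)
            paths.append(path)

    def visit(node, prefix):
        path = prefix + "/" + node[0]
        record(path)
        for child in node[1:]:
            if not isinstance(child, list):
                continue                      # text nodes are ignored
            if child[0] == "@":               # SXML attribute list
                for attr in child[1:]:
                    record(path + "/@" + attr[0])
            else:
                visit(child, path)

    visit(tree, "")
    return paths


doc = ["library",
       ["book", ["@", ["id", "b1"]], ["title", "SICP"], ["author", "Abelson"]],
       ["book", ["@", ["id", "b2"], ["lang", "en"]], ["title", "HtDP"]],
       ["journal", ["title", "JFP"]]]

for p in flat_dataguide(doc):
    print(p)
```

Because the result is an ordinary list of strings, two such lists can be compared with set operations to see which paths occur in one document but not in another.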
Software
guides.ss
is included in the SXML collection for PLT Scheme.
dg.scm
A command-line tool for DataGuide extraction and comparison.
Requires the "ssax" and "sxml" collections, available from the link above.