SXLink
A W3C XLink implementation and Scheme API

Contents:
1. What is SXLink?
2. How can SXLink be used
  2.1. Link Resolution
  2.2. Node Inclusion
  2.3. Link validation
3. Usage guide
  3.1. Loading XML documents containing XLink elements
  3.2. Uniting XLink information
  3.3. XPointer dereference
  3.4. Link Resolution and Node Inclusion
4. Project description

------------
1. What is SXLink?

SXLink is an Application Program Interface (API) that provides XLink link
management for Scheme.

SXLink allows an application to:
 - open XML documents containing XLink elements and convert these
   documents to an extended SXML (XLink-related information is stored in
   a so-called auxiliary list)
 - access remote resources via the HTTP protocol
 - work with multiple documents containing XLink elements
 - load XLink linkbases automatically (see 5.1.5 in the XML Linking
   Language Specification for details)
 - unite XLink information defined in several documents
 - process links between portions of resources addressed via the XPointer
   language (in particular, the document's DTD is automatically read and
   ID attributes are discovered)
 - get information about traversals that have a given document node as a
   starting resource (in particular, get the ending resource of the
   traversal)
 - perform Link Resolution and Node Inclusion operations (see 2.1, 2.2
   for details)
 - verify link validity (see 2.3 for details)

------------
2. How can SXLink be used

2.1. Link Resolution

Unlike HTML hyperlinks, XLink links can be not only outbound (the starting
resource is local and the ending resource is remote), but also inbound
(vice versa: the starting resource is remote and the ending resource is
local) and third-party (both connected resources are remote, i.e. the link
is defined separately from the resources it connects). This extension
makes linking very flexible.
However, traversing between documents becomes difficult, because XLink
traversals related to a document can typically be defined externally.

Link Resolution is the operation that transforms all links into outbound
ones. This means that links which have a node of the document as a
starting resource become explicitly defined within the document as
outbound links. The new document is called the "resolved document". For
example, the resolved document can then be transformed to HTML and viewed
in a browser.

Link Resolution for an XML document named "hub3.xml" can be performed by
a single function call:

  (xlink:link-resolution "hub3.xml"
     xlink:lrh-simple xlink:elh-remove-related xlink:alh-remove)

The SXML representation of the resolved document is returned.

NOTE 1: Linkbases referred to by "hub3.xml" are automatically loaded
(recursively) and analysed in order to discover XLink links related to
"hub3.xml".
NOTE 2: The last three arguments of the XLINK:LINK-RESOLUTION function are
themselves functions. They are used as event handlers and thus define the
behavior of XLINK:LINK-RESOLUTION. For example, XLINK:LRH-SIMPLE
constructs a simple outbound link. If it is replaced with
XLINK:LRH-EXTENDED, an extended outbound link is constructed instead.

2.2. Node Inclusion

Node Inclusion is a further development of the Link Resolution operation.
Every node within a document that serves as the starting resource of a
link is replaced with the ending resource of this link. For example,
ending resources can contain detailed information, which is then
explicitly inserted into the document created by the Node Inclusion
operation.

Node Inclusion for an XML document named "hub2.xml" is again performed by
a single API function:

  (xlink:node-inclusion "hub2.xml"
     xlink:nih-ending-resource xlink:elh-remove-related xlink:alh-remove)

The SXML representation of the new document (a document with the new
nodes included) is returned.
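
The replacement step can be pictured with a minimal sketch in plain Scheme.
This is not SXLink's implementation: the toy tree, the association list
TRAVERSALS and the name INCLUDE-NODES-SKETCH are hypothetical and exist
only for illustration. The sketch also assumes exactly one ending resource
per starting node.

```scheme
; Toy illustration of the Node Inclusion idea; this is NOT the SXLink code.
; TRAVERSALS is a hypothetical association list of
; (starting-node . ending-resource) pairs.
(define traversals
  (list (cons '(ref "ch1")
              '(chapter (title "Intro") (para "Details...")))))

; Walk the tree. A node that is a starting resource is replaced with its
; ending resource; every other node is copied unchanged.
; The tree is assumed to consist of proper lists, symbols and strings.
(define (include-nodes-sketch tree)
  (let ((hit (assoc tree traversals)))   ; ASSOC compares with EQUAL?
    (cond
      (hit (cdr hit))
      ((pair? tree) (map include-nodes-sketch tree))
      (else tree))))

(define doc '(book (title "Hub") (ref "ch1") (para "End")))

(include-nodes-sketch doc)
; ==> (book (title "Hub")
;           (chapter (title "Intro") (para "Details..."))
;           (para "End"))
```

In SXLink itself, the decision of what to insert is delegated to the
handler functions passed to XLINK:NODE-INCLUSION.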
NOTE: SXLink allows the application to define the behavior of the Node
Inclusion operation when several ending resources correspond to a single
starting resource. This is achieved by the second argument
(XLINK:NIH-ENDING-RESOURCE in the example above). SXLink provides several
basic handler functions for this purpose. Moreover, the application is
free to implement its own handler functions for advanced purposes (for
example, the ending resource can be transformed before being inserted in
place of the starting resource).

2.3. Link validation

XLink links can potentially connect many resources, which can be located
in different places on the World Wide Web. These resources can be
modified, created or deleted, and so can the links themselves. When the
number of resources and links is large, it becomes very hard to keep all
the links valid.

SXLink provides a means of automatic XLink link validation. The validator
checks that:
 a) all resources participating in links are available;
 b) all portions of resources defined by XPointer fragment identifiers
    are available;
 c) (if required) XLink roles and/or arcroles are URIs of existing
    resources.

NOTE: One possible interpretation of an XLink role/arcrole is the address
of a resource containing information about the role. That is why SXLink
implements c).

The SXLink link validator takes two types of arguments:
 - Options which define the validation mode. Options include the
   following:
   'linkbases - recursively load linkbases referred to by XLink markup
   'docs      - recursively load documents referred to by XLink markup
   'roles     - check the validity of roles
   'arcroles  - check the validity of arcroles
 - URIs of resources serving as the "starting point" of the validation
   process (additional resources are loaded and checked as well if
   required by the options).
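
The two kinds of arguments are supplied in two stages: options first, then
URIs. A minimal sketch of that calling pattern in plain Scheme follows.
MAKE-CHECKER-SKETCH and the stub predicate AVAILABLE? are hypothetical
names; they only imitate the shape of the call and say nothing about how
SXLink performs the checks.

```scheme
; Hypothetical stub: pretend that only these two resources exist.
(define (available? uri)
  (if (member uri '("hub1.xml" "hub2.xml")) #t #f))

; Stage 1 takes option symbols and returns a procedure.
; Stage 2 takes the starting URIs and returns a boolean.
; The options are accepted but ignored in this sketch.
(define (make-checker-sketch . options)
  (lambda uris
    (let loop ((rest uris) (ok #t))
      (cond
        ((null? rest) ok)
        ((available? (car rest)) (loop (cdr rest) ok))
        ; a real validator would report the problem here
        (else (loop (cdr rest) #f))))))

((make-checker-sketch 'docs) "hub1.xml" "hub2.xml")     ; ==> #t
((make-checker-sketch 'docs) "hub1.xml" "missing.xml")  ; ==> #f
```

Returning a procedure from the first stage lets one set of options be
reused for several groups of starting documents.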

An example of a validator call:

  ((xlink:validator 'arcroles 'docs) "hub1.xml" "hub2.xml" "hub3.xml")

Here 'arcroles and 'docs are options (see above); "hub1.xml", "hub2.xml"
and "hub3.xml" are the resources serving as the "starting point" of the
validation process. The 'docs option specifies that any document
encountered as a participant of a link will be loaded and analysed for
the presence of XLink elements (this can cause further documents to be
loaded, and so on, until the whole "network" of connected documents is
loaded and validated).

XLINK:VALIDATOR returns a boolean value (whether the links are valid).
Moreover, in case of #f (some links are invalid), diagnostics are printed
to stderr.

------------
3. Usage guide

The previous sections illustrated the highest-level API functions. They
are powerful and simple to use, but they offer somewhat limited
possibilities. This section describes the whole API. By combining several
lower-level API functions, one can reach the full expressive power of
SXLink.

3.1. Loading XML documents containing XLink elements

First of all, we would like to load some documents:

  (define doc-set (xlink:load-docs "hub1.xml" "hub2.xml" "hub3.xml"))

The XLINK:LOAD-DOCS function call loads the documents, transforms them to
their SXML representation and adds XLink-related information to their
auxiliary lists (this information is SXLink-specific).

DOC-SET is similar to a node-set in SXPath: it is a list of SXML
documents.

The number of documents in the DOC-SET:
  (length doc-set)
  ==> 3
The list of URIs present in the DOC-SET:
  (xlink:uris doc-set)
  ==> '("hub1.xml" "hub2.xml" "hub3.xml")
We can access a document via its URI:
  (xlink:find-doc "hub3.xml" doc-set)
  ==> ;the SXML representation of the document

Once DOC-SET is created, it can be analyzed, and additional documents can
be loaded.
For example, we might wish to load the linkbases that are referred to from
within DOC-SET:

  (define doc-set2 (xlink:load-linkbases-recursively doc-set))

NOTE 1: "Recursively" means that a linkbase may itself refer to another
linkbase and so on. The whole "chain" of linkbases is then loaded by
XLINK:LOAD-LINKBASES-RECURSIVELY. If a circular dependency between
linkbases exists, duplicates are not loaded.
NOTE 2: XLINK:LOAD-LINKBASES-RECURSIVELY has an optional second argument,
MAX-STEPS. This argument, if present, defines the maximal number of steps
in the "chain" of linkbases that will be loaded.

3.2. Uniting XLink information

After finishing step 3.1, we have all the documents we are currently
interested in. Recall that the XLink Recommendation allows outbound,
inbound and third-party links. This means that our documents may define
links for each other (for example, document A may define a link with the
starting resource within document B and the ending resource in document
C). That is why effective link management requires uniting the XLink
information encountered in multiple documents:

  (define sorted-info (xlink:doc-set->sorted-traversal doc-set2))

SORTED-INFO now contains the complete XLink information about links
defined in DOC-SET2.

NOTE 1: This information is automatically sorted in accordance with the
starting resources of XLink traversals.
NOTE 2: SORTED-INFO is stored separately from auxiliary lists, because
SORTED-INFO doesn't belong to any single document in DOC-SET2. SORTED-INFO
is a characteristic of the whole DOC-SET2.

3.3. XPointer dereference

Once we possess global (sorted) information about XLink traversals, it is
time to associate starting resources with document nodes.
This is done by evaluating XPointer fragment identifiers for the starting
resources which reside within a given document:

  (define hub1
    (xlink:xpointer-dereference
      sorted-info
      (xlink:find-doc "hub1.xml" doc-set2)))

HUB1 is the SXML document containing a new auxiliary list for resolved
XPointer identifiers and related traversal information:

  (define resolved (xlink:resolved hub1))

Suppose we have somehow chosen a node within the HUB1 document:

  (define node ...)

We can now learn about the XLink traversals which have this NODE as a
starting resource by typing:

  (assoc node resolved)

3.4. Link Resolution and Node Inclusion

Let's perform Link Resolution and Node Inclusion using lower-level API
functions than in 2.1 and 2.2. There is not much to do once we've got
resolved XPointer fragment identifiers:

  (xlink:resolve-links hub1 doc-set2
     xlink:lrh-simple xlink:elh-remove-related xlink:alh-remove)

  (xlink:include-nodes hub1 doc-set2
     xlink:nih-ending-resource xlink:elh-remove-related xlink:alh-remove)

------------
4. Project description

libs/ - the folder containing basic Scheme libraries
sxml-tools/sxml-tools.scm - SXML tools
sxml/ssax.scm - Oleg Kiselyov's SSAX parser
sxml/sxpathlib.scm - SXPath
sxpath/sxpath-ext.scm - an extension for the SXPath library
sxpath/txpath.scm - XPointer implementation
sxpath/test/sxpath-ext-tf.scm - test fixture for the XPath extension
sxpath/test/txpath-tf.scm - test fixture for XPointer
multi-parser/id/srfi-12.scm - SRFI-12 implementation
multi-parser/id/mime.scm - handling of MIME entities
multi-parser/id/http.scm - HTTP protocol
multi-parser/id/access-remote.scm - accessing resources
multi-parser/id/id.scm - reads the DTD and creates an id-index
xlink-parser/xlink/xlink-parser.scm - processes XLink elements
xlink-parser/multi-parser.scm - combines an ID parser and an XLink parser
xlink-parser/ssax-prim.scm - a helper module for multi-parser
xlink-parser/test/multi-parser-tf.scm - test fixture for the multi-parser
xlink-api/xlink-api.scm - XLink API
xlink-api/api-examples.scm - examples illustrated by this readme
xlink-api/test/xlink-api-tf.scm - test fixture for the XLink API

This code is in the Public Domain.

Please send bug reports and suggestions to:
Dmitry Lizorkin