X-Git-Url: http://git.indexdata.com/?p=idzebra-moved-to-github.git;a=blobdiff_plain;f=doc%2Frecordmodel-alvisxslt.xml;fp=doc%2Frecordmodel-alvisxslt.xml;h=328bbce68b27c9a79765944d0663605c19a74fea;hp=e64bb840dc92a6221853a122a25a4a51529028cb;hb=5ca4e60e990af6ad6b62ebff855d7b642f37c3ec;hpb=e6ff84c71e457ff668dce640382fc1ad88c37d6d diff --git a/doc/recordmodel-alvisxslt.xml b/doc/recordmodel-alvisxslt.xml index e64bb84..328bbce 100644 --- a/doc/recordmodel-alvisxslt.xml +++ b/doc/recordmodel-alvisxslt.xml @@ -1,15 +1,15 @@ - - ALVIS XML Record Model and Filter Module + + ALVIS &xml; Record Model and Filter Module The record model described in this chapter applies to the fundamental, - structured XML + structured &xml; record type alvis, introduced in - . The ALVIS XML record model + . The ALVIS &xml; record model is experimental, and it's inner workings might change in future - releases of the Zebra Information Server. + releases of the &zebra; Information Server. This filter has been developed under the @@ -22,7 +22,7 @@
ALVIS Record Filter - The experimental, loadable Alvis XML/XSLT filter module + The experimental, loadable Alvis &xml;/XSLT filter module mod-alvis.so is packaged in the GNU/Debian package libidzebra1.4-mod-alvis. It is invoked by the zebra.cfg configuration statement @@ -35,7 +35,7 @@ path db/filter_alvis_conf.xml. The Alvis XSLT filter configuration file must be - valid XML. It might look like this (This example is + valid &xml;. It might look like this (This example is used for indexing and display of OAI harvested records): <?xml version="1.0" encoding="UTF-8"?> @@ -66,7 +66,7 @@ The <split level="2"/> decides where the - XML Reader shall split the + &xml; Reader shall split the collections of records into individual records, which then are loaded into DOM, and have the indexing XSLT stylesheet applied. @@ -78,13 +78,13 @@
ALVIS Internal Record Representation - When indexing, an XML Reader is invoked to split the input - files into suitable record XML pieces. Each record piece is then - transformed to an XML DOM structure, which is essentially the + When indexing, an &xml; Reader is invoked to split the input + files into suitable record &xml; pieces. Each record piece is then + transformed to an &xml; DOM structure, which is essentially the record model. Only XSLT transformations can be applied during index, search and retrieval. Consequently, output formats are - restricted to whatever XSLT can deliver from the record XML - structure, be it other XML formats, HTML, or plain text. In case + restricted to whatever XSLT can deliver from the record &xml; + structure, be it other &xml; formats, HTML, or plain text. In case you have libxslt1 running with EXSLT support, you can use this functionality inside the Alvis filter configuration XSLT stylesheets. @@ -127,13 +127,13 @@ </z:record> - This means the following: From the original XML file - one-record.xml (or from the XML record DOM of the + This means the following: From the original &xml; file + one-record.xml (or from the &xml; record DOM of the same form coming from a splitted input file), the indexing - stylesheet produces an indexing XML record, which is defined by + stylesheet produces an indexing &xml; record, which is defined by the record element in the magic namespace xmlns:z="http://indexdata.dk/zebra/xslt/1". - Zebra uses the content of + &zebra; uses the content of z:id="oai:JTRS:CP-3290---Volume-I" as internal record ID, and - in case static ranking is set - the content of z:rank="47896" as static rank. Following the @@ -236,7 +236,7 @@ As mentioned above, there can be only one indexing stylesheet, and configuration of the indexing process is a synonym - of writing an XSLT stylesheet which produces XML output containing the + of writing an XSLT stylesheet which produces &xml; output containing the magic elements discussed in . Obviously, there are million of different ways to accomplish this @@ -246,19 +246,19 @@ Stylesheets can be written in the pull or the push style: pull - means that the output XML structure is taken as starting point of + means that the output &xml; structure is taken as starting point of the internal structure of the XSLT stylesheet, and portions of - the input XML are pulled out and inserted - into the right spots of the output XML structure. On the other + the input &xml; are pulled out and inserted + into the right spots of the output &xml; structure. On the other side, push XSLT stylesheets are recursavly calling their template definitions, a process which is commanded - by the input XML structure, and avake to produce some output XML + by the input &xml; structure, and avake to produce some output &xml; whenever some special conditions in the input styelsheets are met. The pull type is well-suited for input - XML with strong and well-defined structure and semantcs, like the + &xml; with strong and well-defined structure and semantcs, like the following OAI indexing example, whereas the push type might be the only possible way to - sort out deeply recursive input XML formats. + sort out deeply recursive input &xml; formats. A pull stylesheet example used to index @@ -313,16 +313,16 @@ Notice also, that the names and types of the indexes can be defined in the indexing XSLT stylesheet dynamically according to - content in the original XML records, which has + content in the original &xml; records, which has opportunities for great power and wizardery as well as grande disaster. The following excerpt of a push stylesheet might - be a good idea according to your strict control of the XML + be a good idea according to your strict control of the &xml; input format (due to rigerours checking against well-defined and - tight RelaxNG or XML Schema's, for example): + tight RelaxNG or &xml; Schema's, for example): @@ -333,11 +333,11 @@ ]]> This template creates indexes which have the name of the working - node of any input XML file, and assigns a '1' to the index. + node of any input &xml; file, and assigns a '1' to the index. The example query find @attr 1=xyz 1 finds all files which contain at least one - xyz XML element. In case you can not control + xyz &xml; element. In case you can not control which element names the input files contain, you might ask for disaster and bad karma using this technique. @@ -378,15 +378,15 @@ XSLT transformation, as far as the stylesheet is registered in the main Alvis XSLT filter configuration file, see . - In principle anything that can be expressed in XML, HTML, and + In principle anything that can be expressed in &xml;, HTML, and TEXT can be the output of a schema or element set directive during search, as long as the information comes from the - original input record XML DOM tree - (and not the transformed and indexed XML!!). + original input record &xml; DOM tree + (and not the transformed and indexed &xml;!!). - In addition, internal administrative information from the Zebra + In addition, internal administrative information from the &zebra; indexer can be accessed during record retrieval. The following example is a summary of the possibilities: @@ -492,7 +492,7 @@ c) Main "alvis" XSLT filter config file: see: http://www.indexdata.com/yaz/doc/tools.tkl#tools.cql.map - in db/ an indexing XSLT stylesheet. This is a PULL-type XSLT thing, - as it constructs the new XML structure by pulling data out of the + as it constructs the new &xml; structure by pulling data out of the respective elements/attributes of the old structure. Notice the special zebra namespace, and the special elements in this @@ -502,7 +502,7 @@ c) Main "alvis" XSLT filter config file: indicates that a new record with given id and static rank has to be updated. - encloses all the text/XML which shall be indexed in the index named + encloses all the text/&xml; which shall be indexed in the index named "title" and of index type "w" (see file default.idx in your zebra installation)