From 49decee635c51f48079d9c7b035ff239f24ed500 Mon Sep 17 00:00:00 2001
From: Marc Cromme
Date: Wed, 21 Feb 2007 14:15:07 +0000
Subject: [PATCH] added more content on dom filter pipelines

---
 doc/recordmodel-domxml.xml | 67 ++++++++++++++++++++++++++------------------
 1 file changed, 39 insertions(+), 28 deletions(-)

diff --git a/doc/recordmodel-domxml.xml b/doc/recordmodel-domxml.xml
index 8dfcdb6..009d0fd 100644
--- a/doc/recordmodel-domxml.xml
+++ b/doc/recordmodel-domxml.xml
@@ -1,5 +1,5 @@
- + &dom; &xml; Record Model and Filter Module
@@ -216,50 +216,61 @@
Extract pipeline + + The <extract> pipeline takes documents + from any common &dom; &xml; format to the &zebra;-specific + indexing &dom; &xml; format. + It may consist of zero or more + ]]> + &xslt; transformations, and the outcome is handed to the + &zebra; core to drive the process of building the inverted + indexes. See + for + details. +
Store pipeline -
+ The <store> pipeline takes documents + from any common &dom; &xml; format to the &zebra;-specific + storage &dom; &xml; format. + It may consist of zero or more + ]]> + &xslt; transformations, and the outcome is handed to the + &zebra; core for deposition into the internal storage system. +
Retrieve pipeline - - All named stylesheets defined inside - schema element tags - are for presentation after search, including - the indexing stylesheet (which is a great debugging help). The - names defined in the name attributes must be - unique, these are the literal schema or + Finally, there may be one or more + <retrieve> pipeline definitions, each + of them again consisting of zero or more + ]]> + &xslt; transformations. These are used for document + presentation after search, and take the internal storage &dom; + &xml; to the requested output formats during record present + requests. + + + The possible multiple + <retrieve> pipeline definitions + are distinguished by their unique name + attributes; these are the literal schema or element set names used in &srw;, &sru; and - &z3950; protocol queries. + &z3950; protocol queries.
-
- &dom; filter internal record representation - When indexing, an &xml; Reader is invoked to split the input - files into suitable record &xml; pieces. Each record piece is then - transformed to an &xml; &dom; structure, which is essentially the - record model. Only &xslt; transformations can be applied during - index, search and retrieval. Consequently, output formats are - restricted to whatever &xslt; can deliver from the record &xml; - structure, be it other &xml; formats, HTML, or plain text. In case - you have libxslt1 running with E&xslt; support, - you can use this functionality inside the &dom; - filter configuration &xslt; stylesheets. - -
- -
- &dom; Canonical Indexing Format +
+ Canonical Indexing Format The output of the indexing &xslt; stylesheets must contain certain elements in the magic - xmlns:z="http://indexdata.dk/zebra/xslt/1" + xmlns:z="http://indexdata.dk/zebra-2.0" namespace. The output of the &xslt; indexing transformation is then parsed using &dom; methods, and the contained instructions are performed on the magic elements and their -- 1.7.10.4