   1 <chapter id="record-model-domxml">
   2   <!-- $Id: recordmodel-domxml.xml,v 1.5 2007-02-21 12:29:52 marc Exp $ -->
   3   <title>&dom; &xml; Record Model and Filter Module</title>
   4
   5   <para>
   6    The record model described in this chapter applies to the fundamental,
   7    structured &xml;
   8    record type <literal>dom</literal>, introduced in
   9    <xref linkend="componentmodulesdom"/>. The &dom; &xml; record model
   is experimental, and its inner workings might change in future
  11    releases of the &zebra; Information Server.
  12   </para>
  13
  14
  15
  16   <section id="record-model-domxml-filter">
  17    <title>&dom; Record Filter Architecture</title>
  18
  19      <para>
  20       The &dom; &xml; filter uses a standard &dom; &xml; structure as
  21       internal data model, and can therefore parse, index, and display
      any &xml; document type. It is well suited to work on
      standardized &xml;-based formats such as Dublin Core, MODS, METS,
      MARCXML, OAI-PMH, and RSS, and performs equally well on any
      non-standard &xml; format.
  26     </para>
  27     <para>
      A parser for binary &marc; records based on the ISO2709 library
      standard is provided; it transforms these records to the internal
      &marcxml; &dom; representation. Other binary document parsers
      are planned to follow.
  32     </para>
  33
  34     <para>
      The &dom; filter architecture consists of four
      different pipelines, each being a chain of arbitrarily many successive
      &xslt; transformations of the internal &dom; &xml;
      representations of documents.
  39     </para>
  40
  41     <figure id="record-model-domxml-architecture-fig">
  42       <title>&dom; &xml; filter architecture</title>
  43       <mediaobject>
  44        <imageobject>
  45          <imagedata fileref="domfilter.pdf" format="PDF" scale="50"/>
  46         </imageobject>
  47         <imageobject>
  48           <imagedata fileref="domfilter.png" format="PNG"/>
  49         </imageobject>
  50         <textobject>
  51         <!-- Fall back if none of the images can be used -->
  52         <phrase>
          [Here there should be a diagram showing the &dom; &xml;
           filter architecture, but it seems that your
           tool chain has not been able to include the diagram in this
           document.]
  57          </phrase>
  58         </textobject>
  59       </mediaobject>
  60      </figure>
  61
  62
  63     <table id="record-model-domxml-architecture-table" frame="top">
  64       <title>&dom; &xml; filter pipelines overview</title>
  65       <tgroup cols="5">
  66        <thead>
  67         <row>
  68          <entry>Name</entry>
  69          <entry>When</entry>
  70          <entry>Description</entry>
  71          <entry>Input</entry>
  72          <entry>Output</entry>
  73         </row>
  74        </thead>
  75
  76        <tbody>
  77         <row>
  78          <entry><literal>input</literal></entry>
  79          <entry>first</entry>
  80          <entry>input parsing and initial
  81           transformations to common &xml; format</entry>
  82          <entry>raw &xml; record buffers, &xml;  streams and
  83                 binary &marc; buffers</entry>
  84          <entry>single &dom; &xml; documents suitable for indexing and
  85                 internal storage</entry>
  86         </row>
  87         <row>
  88          <entry><literal>extract</literal></entry>
  89          <entry>second</entry>
  90          <entry>indexing term extraction
  91           transformations</entry>
  92          <entry>common single &dom; &xml; format</entry>
  93          <entry>&zebra; internal indexing &dom; &xml; document</entry>
  94         </row>
  95         <row>
  96          <entry><literal>store</literal></entry>
  97          <entry>second</entry>
  98          <entry> transformations before internal document
  99           storage</entry>
 100          <entry>common single &dom; &xml; format</entry>
 101          <entry>&zebra; internal storage &dom; &xml; document</entry>
 102         </row>
 103         <row>
 104          <entry><literal>retrieve</literal></entry>
 105          <entry>third</entry>
         <entry>document retrieval transformations from
          storage to output formats; multiple different output
          formats are possible</entry>
 109          <entry>&zebra; internal storage &dom; &xml; document</entry>
 110          <entry>output &xml; syntax and requested format</entry>
 111         </row>
 112        </tbody>
 113       </tgroup>
 114      </table>
 115
 116     <para>
      The &dom; &xml; filter pipelines use &xslt; (and, if supported on
      your platform, even &exslt;). This brings full &xpath;
      support to the indexing, storage, and display rules of not only
      &xml; documents, but also binary &marc; records.
 121     </para>
 122    </section>
 123
 124
 125    <section id="record-model-domxml-pipeline">
 126     <title>&dom; &xml; filter pipeline configuration</title>
 127
 128    <para>
 129     The experimental, loadable  &dom; &xml;/&xslt; filter module
 130    <literal>mod-dom.so</literal>
 131     is invoked by the <filename>zebra.cfg</filename> configuration statement
 132     <screen>
 133      recordtype.xml: dom.db/filter_dom_conf.xml
 134     </screen>
    In this example, the filter is applied to all data files with suffix
    <filename>*.xml</filename>, and the
    &dom; &xslt; filter configuration file is found at the
    path <filename>db/filter_dom_conf.xml</filename>.
 139    </para>
 140
 141    <para>The &dom; &xslt; filter configuration file must be
 142     valid &xml;. It might look like this:
 143     <screen>
 144     <![CDATA[
    <?xml version="1.0" encoding="UTF-8"?>
 146     <dom xmlns="http://indexdata.com/zebra-2.0">
 147       <input>
 148         <xmlreader level="1"/>
 149         <!-- <marc inputcharset="marc-8"/> -->
 150       </input>
      <extract>
 152          <xslt stylesheet="common2index.xsl"/>
 153       </extract>
 154       <store>
 155          <xslt stylesheet="common2store.xsl"/>
 156       </store>
 157       <retrieve name="dc">
 158         <xslt stylesheet="store2dc.xsl"/>
 159       </retrieve>
 160       <retrieve name="mods">
 161         <xslt stylesheet="store2mods.xsl"/>
 162       </retrieve>
 163     </dom>
 164     ]]>
 165     </screen>
 166    </para>
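   <para>
    For binary &marc; input, the <literal>input</literal> pipeline could
    instead start with the <literal>marc</literal> parser that is
    commented out in the example above. The following is a minimal
    sketch only: the stylesheet name
    <filename>marcxml2common.xsl</filename> is hypothetical, and it is
    an assumption, based on the pipeline overview table, that an
    <literal>xslt</literal> step may directly follow the parser inside
    <literal>input</literal>.
    <screen>
    <![CDATA[
    <?xml version="1.0" encoding="UTF-8"?>
    <dom xmlns="http://indexdata.com/zebra-2.0">
      <input>
        <!-- parse binary ISO2709 records in MARC-8 encoding -->
        <marc inputcharset="marc-8"/>
        <!-- hypothetical: initial transformation to the common XML format -->
        <xslt stylesheet="marcxml2common.xsl"/>
      </input>
      <extract>
        <xslt stylesheet="common2index.xsl"/>
      </extract>
      <store>
        <xslt stylesheet="common2store.xsl"/>
      </store>
      <retrieve name="dc">
        <xslt stylesheet="store2dc.xsl"/>
      </retrieve>
    </dom>
    ]]>
    </screen>
   </para>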
 167
 168    <para>
 169     All named stylesheets defined inside
 170     <literal>schema</literal> element tags
 171     are for presentation after search, including
 172     the indexing stylesheet (which is a great debugging help). The
 173     names defined in the <literal>name</literal> attributes must be
    unique; these are the literal <literal>schema</literal> or
    <literal>element set</literal> names used in
      <ulink url="http://www.loc.gov/standards/sru/srw/">&srw;</ulink>,
      <ulink url="&url.sru;">&sru;</ulink>, and
    &z3950; protocol queries.
    The paths in the <literal>stylesheet</literal> attributes
    are either relative to &zebra;'s working directory or absolute
    from the file system root.
 182    </para>
 183    <para>
 184     The <literal>&lt;split level="2"/&gt;</literal> decides where the
 185     &xml; Reader shall split the
 186     collections of records into individual records, which then are
 187     loaded into &dom;, and have the indexing &xslt; stylesheet applied.
 188    </para>
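   <para>
    As an illustration, consider a hypothetical input file that wraps
    two records in a collection element. Assuming that the level counts
    element depth below the document root, a split level of 1 (as in
    <literal>&lt;xmlreader level="1"/&gt;</literal> from the
    configuration above) would hand each <literal>record</literal>
    element to the pipeline as a separate &dom; document. The element
    names and content are invented for this sketch.
    <screen>
    <![CDATA[
    <collection>
      <!-- depth 1: each child of the root would become one record -->
      <record><title>First record</title></record>
      <record><title>Second record</title></record>
    </collection>
    ]]>
    </screen>
   </para>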
 189    <para>
 190     There must be exactly one indexing &xslt; stylesheet, which is
 191     defined by the magic attribute
 192     <literal>identifier="http://indexdata.dk/zebra/xslt/1"</literal>.
 193    </para>
 194
 195    <section id="record-model-domxml-internal">
 196     <title>&dom; filter internal record representation</title>
 197     <para>When indexing, an &xml; Reader is invoked to split the input
 198     files into suitable record &xml; pieces. Each record piece is then
 199     transformed to an &xml; &dom; structure, which is essentially the
    record model. Only &xslt; transformations can be applied during
    indexing, search, and retrieval. Consequently, output formats are
    restricted to whatever &xslt; can deliver from the record &xml;
    structure, be it other &xml; formats, HTML, or plain text. In case
    you have <literal>libxslt1</literal> running with &exslt; support,
 205     you can use this functionality inside the &dom;
 206     filter configuration &xslt; stylesheets.
 207     </para>
 208    </section>
 209
 210    <section id="record-model-domxml-canonical">
 211     <title>&dom; Canonical Indexing Format</title>
 212     <para>The output of the indexing &xslt; stylesheets must contain
 213     certain elements in the magic
 214      <literal>xmlns:z="http://indexdata.dk/zebra/xslt/1"</literal>
 215     namespace. The output of the &xslt; indexing transformation is then
 216     parsed using &dom; methods, and the contained instructions are
 217     performed on the <emphasis>magic elements and their
 218     subtrees</emphasis>.
 219     </para>
 220     <para>
 221     For example, the output of the command
 222      <screen>
 223       xsltproc xsl/oai2index.xsl one-record.xml
 224      </screen>
 225      might look like this:
 226      <screen>
 227       &lt;?xml version="1.0" encoding="UTF-8"?&gt;
 228       &lt;z:record xmlns:z="http://indexdata.dk/zebra/xslt/1"
 229            z:id="oai:JTRS:CP-3290---Volume-I"
 230            z:rank="47896"
 231            z:type="update"&gt;
 232        &lt;z:index name="oai_identifier" type="0"&gt;
 233                 oai:JTRS:CP-3290---Volume-I&lt;/z:index&gt;
 234        &lt;z:index name="oai_datestamp" type="0"&gt;2004-07-09&lt;/z:index&gt;
 235        &lt;z:index name="oai_setspec" type="0"&gt;jtrs&lt;/z:index&gt;
 236        &lt;z:index name="dc_all" type="w"&gt;
 237           &lt;z:index name="dc_title" type="w"&gt;Proceedings of the 4th
 238                 International Conference and Exhibition:
 239                 World Congress on Superconductivity - Volume I&lt;/z:index&gt;
 240           &lt;z:index name="dc_creator" type="w"&gt;Kumar Krishen and *Calvin
 241                 Burnham, Editors&lt;/z:index&gt;
 242        &lt;/z:index&gt;
 243      &lt;/z:record&gt;
 244      </screen>
 245     </para>
 246     <para>This means the following: From the original &xml; file
 247      <literal>one-record.xml</literal> (or from the &xml; record &dom; of the
     same form coming from a split input file), the indexing
     stylesheet produces an indexing &xml; record, which is defined by
     the <literal>record</literal> element in the magic namespace
     <literal>xmlns:z="http://indexdata.dk/zebra/xslt/1"</literal>.
     &zebra; uses the content of
     <literal>z:id="oai:JTRS:CP-3290---Volume-I"</literal> as internal
     record ID, and, in case static ranking is set, the content of
     <literal>z:rank="47896"</literal> as static rank. Following the
     discussion in <xref linkend="administration-ranking"/>,
     we see that this record is internally ordered
     lexicographically according to the value of the string
     <literal>oai:JTRS:CP-3290---Volume-I47896</literal>.
     The type of action performed during indexing is defined by
     <literal>z:type="update"</literal>, with recognized values
     <literal>insert</literal>, <literal>update</literal>, and
     <literal>delete</literal>.
 264     </para>
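    <para>
     Since <literal>z:type</literal> is an ordinary attribute of the
     stylesheet output, an indexing stylesheet could compute it from
     the input record. The following sketch assumes &oai; input in
     which deleted records carry <literal>status="deleted"</literal>
     on their header element, as the &oai; protocol specifies; how
     &zebra; treats the remaining content of a record marked for
     deletion is not covered here.
     <screen>
      <![CDATA[
      <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
       xmlns:z="http://indexdata.dk/zebra/xslt/1"
       xmlns:oai="http://www.openarchives.org/OAI/2.0/"
       version="1.0">

       <!-- disable all default text node output -->
       <xsl:template match="text()"/>

       <xsl:template match="/">
        <!-- choose the indexing action from the OAI header status -->
        <xsl:variable name="action">
         <xsl:choose>
          <xsl:when test="oai:record/oai:header/@status = 'deleted'">delete</xsl:when>
          <xsl:otherwise>update</xsl:otherwise>
         </xsl:choose>
        </xsl:variable>
        <z:record z:id="{normalize-space(oai:record/oai:header/oai:identifier)}"
         z:type="{$action}">
         <xsl:apply-templates/>
        </z:record>
       </xsl:template>

      </xsl:stylesheet>
      ]]>
     </screen>
    </para>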
 265     <para>In this example, the following literal indexes are constructed:
 266      <screen>
 267        oai_identifier
 268        oai_datestamp
 269        oai_setspec
 270        dc_all
 271        dc_title
 272        dc_creator
 273      </screen>
 274      where the indexing type is defined in the
 275      <literal>type</literal> attribute
 276      (any value from the standard configuration
 277      file <filename>default.idx</filename> will do). Finally, any
 278      <literal>text()</literal> node content recursively contained
 279      inside the <literal>index</literal> will be filtered through the
 280      appropriate charmap for character normalization, and will be
 281      inserted in the index.
 282     </para>
 283     <para>
     Specific to this example, we see that the single word
     <literal>oai:JTRS:CP-3290---Volume-I</literal> will be inserted
     literally, byte for byte and without any form of character
     normalization, into the index named
     <literal>oai_identifier</literal>. The text
     <literal>Kumar Krishen and *Calvin Burnham, Editors</literal>
     will be inserted using the <literal>w</literal> character
     normalization defined in <filename>default.idx</filename> into
     the index <literal>dc_creator</literal> (that is, after character
     normalization the index will keep the individual words
     <literal>kumar</literal>, <literal>krishen</literal>,
     <literal>and</literal>, <literal>calvin</literal>,
     <literal>burnham</literal>, and <literal>editors</literal>).
     Finally, both the texts
     <literal>Proceedings of the 4th International Conference and Exhibition:
      World Congress on Superconductivity - Volume I</literal>
     and
     <literal>Kumar Krishen and *Calvin Burnham, Editors</literal>
     will be inserted into the index <literal>dc_all</literal> using
     the same character normalization map <literal>w</literal>.
 304     </para>
 305     <para>
     Finally, this example configuration can be queried using &pqf;
     queries, either transported by &z3950; (here using yaz-client)
 308      <screen>
 309       <![CDATA[
 310       Z> open localhost:9999
 311       Z> elem dc
 312       Z> form xml
 313       Z>
 314       Z> f @attr 1=dc_creator Kumar
 315       Z> scan @attr 1=dc_creator adam
 316       Z>
 317       Z> f @attr 1=dc_title @attr 4=2 "proceeding congress superconductivity"
 318       Z> scan @attr 1=dc_title abc
 319       ]]>
 320      </screen>
     or by the proprietary
     extensions <literal>x-pquery</literal> and
     <literal>x-pScanClause</literal> to
     &sru; and &srw;
 325      <screen>
 326       <![CDATA[
 327       http://localhost:9999/?version=1.1&operation=searchRetrieve&x-pquery=%40attr+1%3Ddc_creator+%40attr+4%3D6+%22the
 328       http://localhost:9999/?version=1.1&operation=scan&x-pScanClause=@attr+1=dc_date+@attr+4=2+a
 329       ]]>
 330      </screen>
 331      See <xref linkend="zebrasrv-sru"/> for more information on &sru;/&srw;
 332      configuration, and <xref linkend="gfs-config"/> or the &yaz;
 333      <ulink url="&url.yaz.cql;">&cql; section</ulink>
     for the details of the &yaz; frontend server.
 335     </para>
 336     <para>
 337      Notice that there are no <filename>*.abs</filename>,
 338      <filename>*.est</filename>, <filename>*.map</filename>, or other &grs1;
     filter configuration files involved in this process, and that the
 340      literal index names are used during search and retrieval.
 341     </para>
 342    </section>
 343   </section>
 344
 345
 346   <section id="record-model-domxml-conf">
 347    <title>&dom; Record Model Configuration</title>
 348
 349
 350   <section id="record-model-domxml-index">
 351    <title>&dom; Indexing Configuration</title>
 352     <para>
     As mentioned above, there can be only one indexing
     stylesheet, and configuring the indexing process is synonymous
     with writing an &xslt; stylesheet which produces &xml; output containing the
     magic elements discussed in
     <xref linkend="record-model-domxml-canonical"/>.
     Obviously, there are millions of different ways to accomplish this
     task, and some comments and code snippets are in order to lead
     our padawans on the right track to the good side of the force.
 361     </para>
 362     <para>
 363      Stylesheets can be written in the <emphasis>pull</emphasis> or
 364      the <emphasis>push</emphasis> style: <emphasis>pull</emphasis>
 365      means that the output &xml; structure is taken as starting point of
 366      the internal structure of the &xslt; stylesheet, and portions of
 367      the input &xml; are <emphasis>pulled</emphasis> out and inserted
     into the right spots of the output &xml; structure. On the other
     hand, <emphasis>push</emphasis> &xslt; stylesheets recursively
     call their template definitions, a process which is driven
     by the input &xml; structure, and wake up to produce some output &xml;
     whenever some special conditions in the input &xml; are
     met. The <emphasis>pull</emphasis> type is well suited for input
     &xml; with strong and well-defined structure and semantics, like the
 375      following &oai; indexing example, whereas the
 376      <emphasis>push</emphasis> type might be the only possible way to
 377      sort out deeply recursive input &xml; formats.
 378     </para>
 379     <para>
 380      A <emphasis>pull</emphasis> stylesheet example used to index
 381      &oai; harvested records could use some of the following template
 382      definitions:
 383      <screen>
 384       <![CDATA[
 385       <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 386        xmlns:z="http://indexdata.dk/zebra/xslt/1"
       xmlns:oai="http://www.openarchives.org/OAI/2.0/"
       xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
 389        xmlns:dc="http://purl.org/dc/elements/1.1/"
 390        version="1.0">
 391
 392        <xsl:output indent="yes" method="xml" version="1.0" encoding="UTF-8"/>
 393
 394         <!-- disable all default text node output -->
 395         <xsl:template match="text()"/>
 396
 397          <!-- match on oai xml record root -->
 398          <xsl:template match="/">
 399           <z:record z:id="{normalize-space(oai:record/oai:header/oai:identifier)}"
 400            z:type="update">
           <!-- you might want to use z:rank="{some XSLT function here}" -->
 402            <xsl:apply-templates/>
 403           </z:record>
 404          </xsl:template>
 405
         <!-- OAI indexing templates -->
 407          <xsl:template match="oai:record/oai:header/oai:identifier">
 408           <z:index name="oai_identifier" type="0">
 409            <xsl:value-of select="."/>
 410           </z:index>
 411          </xsl:template>
 412
 413          <!-- etc, etc -->
 414
 415          <!-- DC specific indexing templates -->
 416          <xsl:template match="oai:record/oai:metadata/oai_dc:dc/dc:title">
 417           <z:index name="dc_title" type="w">
 418            <xsl:value-of select="."/>
 419           </z:index>
 420          </xsl:template>
 421
 422          <!-- etc, etc -->
 423
 424       </xsl:stylesheet>
 425       ]]>
 426      </screen>
 427     </para>
 428     <para>
     Notice also
     that the names and types of the indexes can be defined in the
     indexing &xslt; stylesheet <emphasis>dynamically according to
     content in the original &xml; records</emphasis>, which opens
     opportunities for great power and wizardry as well as grand
     disaster.
 435     </para>
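    <para>
     One cautious way to use dynamic names is to derive them only from
     a fixed set of input elements, so that the set of indexes stays
     known in advance. The template below is a sketch under
     assumptions: it relies on the namespace prefixes declared in the
     pull stylesheet above, and the index name
     <literal>dc_subject</literal> is introduced here purely for
     illustration.
     <screen>
      <![CDATA[
      <!-- index names are built from the element name, but only for
           the three DC elements listed in the match pattern -->
      <xsl:template match="oai_dc:dc/dc:title
                           | oai_dc:dc/dc:creator
                           | oai_dc:dc/dc:subject">
       <z:index name="dc_{local-name()}" type="w">
        <xsl:value-of select="."/>
       </z:index>
      </xsl:template>
      ]]>
     </screen>
     With this pattern, only <literal>dc_title</literal>,
     <literal>dc_creator</literal>, and <literal>dc_subject</literal>
     can ever be created, whatever else the input records contain.
    </para>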
 436     <para>
 437      The following excerpt of a <emphasis>push</emphasis> stylesheet
 438      <emphasis>might</emphasis>
     be a good idea, provided you have strict control of the &xml;
     input format (due to rigorous checking against well-defined and
     tight RelaxNG or &xml; Schemas, for example):
 442      <screen>
 443       <![CDATA[
 444       <xsl:template name="element-name-indexes">
 445        <z:index name="{name()}" type="w">
 446         <xsl:value-of select="'1'"/>
 447        </z:index>
 448       </xsl:template>
 449       ]]>
 450      </screen>
     This template creates indexes which have the name of the working
     node of any input &xml; file, and assigns a '1' to the index.
     The example query
     <literal>find @attr 1=xyz 1</literal>
     finds all records which contain at least one
     <literal>xyz</literal> &xml; element. In case you cannot control
     which element names the input files contain, you might ask for
     disaster and bad karma using this technique.
 459     </para>
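    <para>
     A named template produces output only when it is called. One
     possible way to wire <literal>element-name-indexes</literal> into
     a push stylesheet is sketched below; it assumes that a root
     template, like the one in the pull example, wraps the result in a
     <literal>z:record</literal> element and applies templates.
     <screen>
      <![CDATA[
      <!-- visit every element, emit one index entry for it, then
           recurse into its child elements -->
      <xsl:template match="*">
       <!-- call-template keeps the current node, so name() inside the
            named template refers to the element matched here -->
       <xsl:call-template name="element-name-indexes"/>
       <xsl:apply-templates select="*"/>
      </xsl:template>
      ]]>
     </screen>
     Note that <literal>name()</literal> returns the prefixed name of
     namespaced elements, for example <literal>dc:title</literal>;
     <literal>local-name()</literal> could be substituted if the
     prefix is unwanted in index names.
    </para>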
 460     <para>
     One variation on the theme <emphasis>dynamically created
     indexes</emphasis> will definitely be unwise:
 463      <screen>
 464       <![CDATA[
 465       <!-- match on oai xml record root -->
 466       <xsl:template match="/">
 467        <z:record z:type="update">
 468
 469         <!-- create dynamic index name from input content -->
 470         <xsl:variable name="dynamic_content">
 471          <xsl:value-of select="oai:record/oai:header/oai:identifier"/>
 472         </xsl:variable>
 473
 474         <!-- create zillions of indexes with unknown names -->
 475         <z:index name="{$dynamic_content}" type="w">
 476          <xsl:value-of select="oai:record/oai:metadata/oai_dc:dc"/>
 477         </z:index>
 478        </z:record>
 479
 480       </xsl:template>
 481       ]]>
 482      </screen>
     Don't be tempted to cross
     the line to the dark side of the force, padawan; this leads
     to suffering and pain, and universal
     disintegration of your project schedule.
 487     </para>
 488   </section>
 489
 490   <section id="record-model-domxml-elementset">
 491    <title>&dom; Exchange Formats</title>
 492    <para>
     An exchange format can be anything which can be the outcome of an
     &xslt; transformation, as long as the stylesheet is registered in
     the main &dom; &xslt; filter configuration file, see
     <xref linkend="record-model-domxml-pipeline"/>.
     In principle, anything that can be expressed in &xml;, HTML, or
     plain text can be the output of a <literal>schema</literal> or
     <literal>element set</literal> directive during search, as long as
     the information comes from the
     <emphasis>original input record &xml; &dom; tree</emphasis>
     (and not the transformed and <emphasis>indexed</emphasis> &xml;).
 503     </para>
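    <para>
     As a minimal sketch of such an exchange format, a retrieval
     stylesheet like the <filename>store2dc.xsl</filename> named in
     the example configuration might simply copy the Dublin Core
     payload out of the stored record. The content below is an
     assumption made for illustration, not the stylesheet shipped with
     &zebra;; it presumes that &oai; records were stored unchanged.
     <screen>
      <![CDATA[
      <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
       xmlns:oai="http://www.openarchives.org/OAI/2.0/"
       xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
       version="1.0">

       <xsl:output indent="yes" method="xml" version="1.0" encoding="UTF-8"/>

       <!-- disable all default text node output -->
       <xsl:template match="text()"/>

       <!-- copy the Dublin Core element and its subtree verbatim -->
       <xsl:template match="oai:record/oai:metadata/oai_dc:dc">
        <xsl:copy-of select="."/>
       </xsl:template>

      </xsl:stylesheet>
      ]]>
     </screen>
    </para>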
 504     <para>
 505      In addition, internal administrative information from the &zebra;
 506      indexer can be accessed during record retrieval. The following
 507      example is a summary of the possibilities:
 508      <screen>
 509       <![CDATA[
 510       <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 511        xmlns:z="http://indexdata.dk/zebra/xslt/1"
 512        version="1.0">
 513
 514        <!-- register internal zebra parameters -->
 515        <xsl:param name="id" select="''"/>
 516        <xsl:param name="filename" select="''"/>
 517        <xsl:param name="score" select="''"/>
 518        <xsl:param name="schema" select="''"/>
 519
 520        <xsl:output indent="yes" method="xml" version="1.0" encoding="UTF-8"/>
 521
       <!-- use them for display of internal information -->
 523        <xsl:template match="/">
 524          <z:zebra>
 525            <id><xsl:value-of select="$id"/></id>
 526            <filename><xsl:value-of select="$filename"/></filename>
 527            <score><xsl:value-of select="$score"/></score>
 528            <schema><xsl:value-of select="$schema"/></schema>
 529          </z:zebra>
 530        </xsl:template>
 531
 532       </xsl:stylesheet>
 533       ]]>
 534      </screen>
 535     </para>
 536
 537   </section>
 538
 539   <section id="record-model-domxml-example">
 540    <title>&dom; Filter &oai; Indexing Example</title>
 541    <para>
     The source code tarball contains a working &dom; filter example in
 543      the directory <filename>examples/dom-oai/</filename>, which
 544      should get you started.
 545     </para>
 546     <para>
     More example data can be harvested from any &oai; compliant server,
 548      see details at the  &oai;
 549      <ulink url="http://www.openarchives.org/">
 550       http://www.openarchives.org/</ulink> web site, and the community
 551       links at
 552      <ulink url="http://www.openarchives.org/community/index.html">
 553       http://www.openarchives.org/community/index.html</ulink>.
 554      There is a  tutorial
 555      found at
 556      <ulink url="http://www.oaforum.org/tutorial/">
 557       http://www.oaforum.org/tutorial/</ulink>.
 558     </para>
 559    </section>
 560
 561   </section>
 562
 563
 564  </chapter>
 565
 566
 567 <!--
 568
 569 c)  Main "dom" &xslt; filter config file:
 570   cat db/filter_dom_conf.xml
 571
 572   <?xml version="1.0" encoding="UTF8"?>
 573   <schemaInfo>
 574     <schema name="dom" stylesheet="db/dom2dom.xsl" />
 575     <schema name="index" identifier="http://indexdata.dk/zebra/xslt/1"
 576             stylesheet="db/dom2index.xsl" />
 577     <schema name="dc" stylesheet="db/dom2dc.xsl" />
 578     <schema name="dc-short" stylesheet="db/dom2dc_short.xsl" />
 579     <schema name="snippet" snippet="25" stylesheet="db/dom2snippet.xsl" />
 580     <schema name="help" stylesheet="db/dom2help.xsl" />
 581     <split level="1"/>
 582   </schemaInfo>
 583
 584   the paths are relative to the directory where zebra.init is placed
 585   and is started up.
 586
 587   The split level decides where the SAX parser shall split the
 588   collections of records into individual records, which then are
 589   loaded into &dom;, and have the indexing &xslt; stylesheet applied.
 590
 591   The indexing stylesheet is found by it's identifier.
 592
 593   All the other stylesheets are for presentation after search.
 594
 595 - in data/ a short sample of harvested carnivorous plants
 596   ZEBRA_INDEX_DIRS=data/carnivor_20050118_2200_short-346.xml
 597
 598 - in root also one single data record - nice for testing the xslt
 599   stylesheets,
 600
 601   xsltproc db/dom2index.xsl carni*.xml
 602
 603   and so on.
 604
 605 - in db/ a cql2pqf.txt yaz-client config file
 606   which is also used in the yaz-server <ulink url="&url.cql;">&cql;</ulink>-to-&pqf; process
 607
 608    see: http://www.indexdata.com/yaz/doc/tools.tkl#tools.cql.map
 609
 610 - in db/ an indexing &xslt; stylesheet. This is a PULL-type XSLT thing,
 611   as it constructs the new &xml; structure by pulling data out of the
 612   respective elements/attributes of the old structure.
 613
 614   Notice the special zebra namespace, and the special elements in this
 615   namespace which indicate to the zebra indexer what to do.
 616
 617   <z:record id="67ht7" rank="675" type="update">
 618   indicates that a new record with given id and static rank has to be updated.
 619
 620   <z:index name="title" type="w">
 621    encloses all the text/&xml; which shall be indexed in the index named
 622    "title" and of index type "w" (see  file default.idx in your zebra
 623    installation)
 624
 625
 626    </para>
 627
 628    <para>
 629 -->
 630
 631
 632
 633
 634  <!-- Keep this comment at the end of the file
 635  Local variables:
 636  mode: sgml
 637  sgml-omittag:t
 638  sgml-shorttag:t
 639  sgml-minimize-attributes:nil
 640  sgml-always-quote-attributes:t
 641  sgml-indent-step:1
 642  sgml-indent-data:t
 643  sgml-parent-document: "zebra.xml"
 644  sgml-local-catalogs: nil
 645  sgml-namecase-general:t
 646  End:
 647  -->