added section on idxpath query model
[idzebra-moved-to-github.git] / doc / querymodel.xml
index 8852fc7..c7ebc17 100644 (file)
  <chapter id="querymodel">
-  <!-- $Id: querymodel.xml,v 1.3 2006-06-14 12:20:06 marc Exp $ -->
+  <!-- $Id: querymodel.xml,v 1.10 2006-06-21 13:32:33 marc Exp $ -->
   <title>Query Model</title>
   
   <sect1 id="querymodel-overview">
    <title>Query Model Overview</title>
    
+
+   <sect2 id="querymodel-query-languages">
+    <title>Query Languages</title>
+    <para>
+     Zebra is born as a networking Information Retrieval engine adhering
+     to the international standards 
+     <ulink url="&url.z39.50;">Z39.50</ulink> and
+     <ulink url="&url.sru;">SRU</ulink>,
+     and implement the 
+     <literal>type-1 Reverse Polish Notation (RPN)</literal> query
+     model defined there.
+     Unfortunately, this model has only defined a binary
+     encoded representation, which is used as transport packaging in
+     the Z39.50 protocol layer. This representation is not human
+     readable, nor defines any convenient way to specify queries. 
+    </para>
+    <para>
+     Since the <literal>type-1 (RPN)</literal> 
+     query structure has no direct, useful string
+     representation, every origin application needs to provide some
+     form of mapping from a local query notation or representation to it.
+     </para>
+
+
+   <sect3 id="querymodel-query-languages-pqf">
+    <title>Prefix Query Format (PQF)</title>
+
    <para>
-    Zebra is born as a networking Information Retrieval engine adhering
-    to the international standards 
-    <ulink url="&url.z39.50;">Z39.50</ulink> and
-    <ulink url="&url.sru;">SRU</ulink>,
-    and implement the query model defined there.
-    Unfortunately, the Z39.50 query model has only defined a binary
-    encoded representation, which is used as transport packaging in
-    the Z39.50 protocol layer. This representation is not human
-    readable, nor defines any convenient way to specify queries. 
-   </para>
-   <para>
-    Therefore, Index Data has defined a textual representaion in the 
-    <literal>Prefix Query Format</literal>, short
-    <literal>PQF</literal>, which then has been adopted by other
-    parties developing Z39.50 software. It is also often referred to as
-    <literal>Prefix Query Notation</literal>, or in short 
-    <literal>PQN</literal>, and is thoroughly explained in       
-    <xref linkend="querymodel-pqf"/>. 
-   </para>
-   
-   <para>
-    In addition, Zebra can be configured to understand and map the 
-    <literal>Common Query Language</literal>
-    (<ulink url="&url.cql;">CQL</ulink>)
-    to PQF. See an introduction on the mapping to the internal query
-    representation in  
-    <xref linkend="querymodel-cql-to-pqf"/>.
-   </para>
-  </sect1>
+     Index Data has defined a textual representaion in the 
+     <literal>Prefix Query Format</literal>, short
+     <literal>PQF</literal>, which mappes 
+      <literal>one-to-one</literal> to binary encoded  
+      <literal>type-1 RPN</literal> query packages.
+      It has been adopted by other
+      parties developing Z39.50 software, and is often referred to as
+     <literal>Prefix Query Notation</literal>, or in short 
+     <literal>PQN</literal>. See       
+     <xref linkend="querymodel-pqf"/> for further explanaitions and
+     descriptions of Zebra's capabilities.  
+    </para>
+   </sect3>    
+
+   <sect3 id="querymodel-query-languages-cql">
+    <title>Common Query Language (CQL)</title>
+     <para>
+      The query model of the   <literal>type-1 RPN</literal>,
+      expressed in <literal>PQF/PQN</literal> is natively supported. 
+      On the other hand, the default <literal>SRU</literal>
+      webservices <literal>Common Query Language</literal>
+     <ulink url="&url.cql;">CQL</ulink> is not natively supported.
+     </para>
+     <para>
+     Zebra can be configured to understand and map CQL to PQF. See
+     <xref linkend="querymodel-cql-to-pqf"/>.
+    </para>
+   </sect3>    
+   </sect2>
+
+   <sect2 id="querymodel-operation-types">
+    <title>Operation types</title>
+    <para>
+     Zebra supports all of the three different
+     <literal>Z39.50/SRU</literal> operations defined in the
+     standards: <literal>explain</literal>, <literal>search</literal>, 
+     and <literal>scan</literal>. A short description of the
+     functionality and purpose of each is quite in order here. 
+    </para>
+
+    <sect3 id="querymodel-operation-type-explain">
+     <title>Explain Operation</title>
+     <para>
+      The <emphasis>syntax</emphasis> of Z39.50/SRU queries is
+      well known to any client, but the specific
+      <emphasis>semantics</emphasis> - taking into account a
+      particular servers functionalities and abilities - must be
+      discovered from case to case. Enters the 
+      <literal>explain</literal> operation, which provides the means
+      for learning which  
+      <emphasis>fields</emphasis> (also called
+      <emphasis>indexes</emphasis> or <emphasis>access points</emphasis>
+      are provided, which default parameter the server uses, which
+      retrieve document formats are defined, and which specific parts
+      of the general query model are supported.      
+     </para>
+     <para>
+      The Z39.50 embeddes the <literal>explain</literal> operation
+      by perfoming a 
+      <literal>search</literal> in the magic 
+      <literal>IR-Explain-1</literal> database;
+      see <xref linkend="querymodel-exp1"/>. 
+     </para>
+     <para>
+      In SRU, <literal>explain</literal> is an entirely  seperate
+      operation, which returns an  <literal>Zeerex
+      XML</literal> record according to the 
+      structure defined by the protocol.
+     </para>
+     <para>
+      In both cases, the information gathered through
+      <literal>explain</literal> operations can be used to
+      auto-configure a client user interface to the servers
+      capabilities.  
+     </para>
+    </sect3>
+
+    <sect3 id="querymodel-operation-type-search">
+     <title>Search Operation</title>
+     <para>
+      Search and retrieve interactions are the raison d'ĂȘtre. 
+      They are used to query the remote database and
+      return search result documents.  Search queries span from
+      simple free text searches to nested complex boolean queries,
+      targeting specific indexes, and possibly enhanced with many
+      query semantic specifications. Search interactions are the heart
+      and soul of Z39.50/SRU servers.
+     </para>
+    </sect3>
+
+    <sect3 id="querymodel-operation-type-scan">
+     <title>Scan Operation</title>
+     <para>
+      The <literal>scan</literal> operation is a helper functionality,
+       which operates on one index or access point a time. 
+     </para>
+     <para>
+      It provides
+      the means to investigate the content of specific indexes.
+      Scanning an index returns a handfull of terms actually fond in
+      the indexes, and in addition the <literal>scan</literal>
+      operation returns th enumber of documents indexed by each term.
+      A search client can use this information to propose proper
+      spelling of search terms, to auto-fill search boxes, or to 
+      display  controlled vocabularies.
+     </para>
+    </sect3>
+
+   </sect2>
+
+ </sect1>
+
   
   <sect1 id="querymodel-pqf">
    <title>Prefix Query Format structure and syntax</title>
      may start with one specification of the 
      <emphasis>attribute set</emphasis> used. Following is a query
      tree, which 
-     consists of <emphasis>atomic query parts</emphasis>, eventually
+     consists of <emphasis>atomic query parts (APT)</emphasis> or
+     <emphasis>named result sets</emphasis>, eventually
      paired by <emphasis>boolean binary operators</emphasis>, and 
      finally  <emphasis>recursively combined </emphasis> into 
      complex query trees.   
       issued. Zebra comes with some predefined attribute set
       definitions, others can easily be defined and added to the
       configuration.
-      <note>
-       The Zebra internal query procesing is modeled after 
-       the <literal>Bib1</literal> attribute set, and the non-use
-       attributes type 2-9 are hard-wired in. It is therefore essential
-       to be familiar with <xref linkend="querymodel-bib1"/>. 
-      </note>
      </para>
+
      
-     <table id="querymodel-attribute-sets-table">
+     <table id="querymodel-attribute-sets-table"
+      frame="all" rowsep="1" colsep="1" align="center">
+
       <caption>Attribute sets predefined in Zebra</caption>
-       <!--
+       
        <thead>
-       <tr><td>one</td><td>two</td></tr>
+       <tr>
+         <td>Attribute set</td>
+         <td>Short hand</td>
+         <td>Status</td>
+         <td>Notes</td>
+        </tr>
       </thead>
-       -->
+      
        <tbody>
         <tr>
-         <td><emphasis>exp-1</emphasis></td>
-         <td><literal>Explain</literal> attribute set</td>
+         <td><literal>Explain</literal></td>
+         <td><literal>exp-1</literal></td>
          <td>Special attribute set used on the special automagic
           <literal>IR-Explain-1</literal> database to gain information on
           server capabilities, database names, and database
           and semantics.</td>
+         <td>predefined</td>
         </tr>
         <tr>
-         <td><emphasis>bib-1</emphasis></td>
-         <td><literal>Bib1</literal> attribute set</td>
+         <td><literal>Bib1</literal></td>
+         <td><literal>bib-1</literal></td>
          <td>Standard PQF query language attribute set which defines the
           semantics of Z39.50 searching. In addition, all of the
-          non-use attributes (type 2-9) define the Zebra internal query
-          processing</td>
+          non-use attributes (type 2-9) define the hard-wired 
+          Zebra internal query
+          processing.</td>
+         <td>default</td>
         </tr>
         <tr>
-         <td><emphasis>gils</emphasis></td>
-         <td><literal>GILS</literal> attribute set</td>
+         <td><literal>GILS</literal></td>
+         <td><literal>gils</literal></td>
          <td>Extention to the <literal>Bib1</literal> attribute set.</td>
+         <td>predefined</td>
+        </tr>
+        <tr>
+         <td><literal>IDXPATH</literal></td>
+         <td><literal>idxpath</literal></td>
+         <td>Hardwired XPATH like attribute set, only available for
+             indexing with the GRS record model</td>
+         <td>depreciated</td>
         </tr>
        </tbody>
      </table>
     </sect3>
+
+    <para>
+     The use attributes (type 1) of the predefined attribute sets can
+     be reconfigured by  tweaking the files
+     <filename>tab/*.att</filename>.
+     New attribute sets can be defined by adding similar files in the
+     configuration path of the server.  
+    </para>
+
+    <note>
+     The Zebra internal query processing is modeled after 
+     the <literal>Bib1</literal> attribute set, and the non-use
+     attributes type 2-6 are hard-wired in. It is therefore essential
+     to be familiar with <xref linkend="querymodel-bib1-nonuse"/>. 
+    </note>
+
     
     <sect3 id="querymodel-boolean-operators">
      <title>Boolean operators</title>
       using the standard boolean operators into new query trees.
      </para>
      
-     <table id="querymodel-boolean-operators-table">
+     <table id="querymodel-boolean-operators-table"
+      frame="all" rowsep="1" colsep="1" align="center">
+
       <caption>Boolean operators</caption>
        <!--
        <thead>
       </thead>
        -->
        <tbody>
-        <tr><td><emphasis>@and</emphasis></td>
+        <tr><td><literal>@and</literal></td>
          <td>binary <literal>AND</literal> operator</td>
          <td>Set intersection of two atomic queries hit sets</td>
         </tr>
-        <tr><td><emphasis>@or</emphasis></td>
+        <tr><td><literal>@or</literal></td>
          <td>binary <literal>OR</literal> operator</td>
          <td>Set union of two atomic queries hit sets</td>
         </tr>
-        <tr><td><emphasis>@not</emphasis></td>
+        <tr><td><literal>@not</literal></td>
          <td>binary <literal>AND NOT</literal> operator</td>
          <td>Set complement of two atomic queries hit sets</td>
         </tr>
-        <tr><td><emphasis>@prox</emphasis></td>
+        <tr><td><literal>@prox</literal></td>
          <td>binary <literal>PROXIMY</literal> operator</td>
          <td>Set intersection of two atomic queries hit sets. In 
           addition, the intersection set is purged for all 
     
     
     <sect3 id="querymodel-atomic-queries">
-     <title>Atomic queries</title>
+     <title>Atomic queries (APT)</title>
      <para>
       Atomic queries are the query parts which work on one acess point
       only. These consist of <literal>an attribute list</literal>
       followed by a <literal>single term</literal> or a
-      <literal>quoted term list</literal>.
+      <literal>quoted term list</literal>, and are often called 
+      <emphasis>Attributes-Plus-Terms (APT)</emphasis> queries.
      </para>
      <para>
       Unsupplied non-use attributes type 2-9 are either inherited from
       See <xref linkend="querymodel-bib1"/> for details. 
      </para>
      
-     <table id="querymodel-atomic-queries-table">
+     <table id="querymodel-atomic-queries-table"
+      frame="all" rowsep="1" colsep="1" align="center">
+
       <caption>Atomic queries</caption>
        <!--
        <thead>
       </screen>
      </para>
      <para>
-      Equivalent query fully specified:
+      Equivalent query fully specified including all default values:
       <screen>
        Z> find @attrset bib-1 @attr 1=1017 @attr 2=3 @attr 3=3 @attr 4=1 @attr 5=100 @attr 6=1 "information"
       </screen>
 
     </sect3>
     
+    
+    <sect3 id="querymodel-resultset">
+     <title>Named Result Sets</title>
+     <para>
+      Named result sets are supported in Zebra, and result sets can be
+      used as operands without limitations.
+     </para>
+     <para>      
+      After the execution of a search, the result set is available at
+      the server, such that the client can use it for subsequent
+      searches or retrieval requests. The Z30.50 standard actually
+      stresses the fact that result sets are voliatile. It may cease
+      to exist at any time point after search, and the server will
+      send a diagnostic to the effect that the requested
+      result set does not exist any more.
+     </para>
+     
+     <para>
+      Defining a named result set and re-using it in the next query,
+      using <literal>yaz-client</literal>. 
+      <screen>
+       Z> f @attr 1=4 mozart
+       ...
+       Number of hits: 43, setno 1
+       ...
+       Z> f @and @set 1 @attr 1=4 amadeus
+       ...
+       Number of hits: 14, setno 2
+       ...
+       Z> f @attr 1=1016 beethoven
+       ...
+       Number of hits: 26, setno 3
+       ...
+      </screen>
+     </para>
+     
+     <note>
+      Named result sets are only supported by the Z39.50 protocol.
+      The SRU web service is stateless, and therefore the notion of
+      named result sets does not exist when acessing a Zebra server by
+      the SRU protocol.
+     </note>
+    </sect3>
+
+
     <sect3 id="querymodel-use-string">
      <title>Zebra's special use attribute type 1 of form 'string'</title>
      <para>
       this facility when speed is essential, and the database content
       size is medium to large. 
      </warning>
+
     </sect3>
     
    </sect2>
    <sect2 id="querymodel-bib1">
     <title>Bib1 Attribute Set</title>
     <para>
-     Something about querying to be written ..
-    </para>
-    <para>
      Most of the information contained in this section is an excerpt of
      the <literal>ATTRIBUTE SET BIB-1 (Z39.50-1995)
       SEMANTICS</literal>, 
      <ulink url="&url.z39.50.attset.bib1;">Bib-1
       Attribute Set</ulink> 
      version from 2003. Index Data is not the copyright holder of this
-     information. 
+     information, except for the configuration details, the listing of
+     Zebra's capabilities, and the example queries. 
     </para>
     
     
    <sect3 id="querymodel-bib1-use">
-     <title>Use Attributes (type = 1)</title>
-    </sect3>
+     <title>Use Attributes (type 1)</title>
+
+    <para>
+     A use attribute specifies an access point for any atomic query.
+     These acess points are highly dependent on the attribute set used
+     in the query, and are user configurable using the following
+     default configuration files:
+     <filename>tab/bib1.att</filename>,
+     <filename>tab/dan1.att</filename>,
+     <filename>tab/explain.att</filename>, and
+     <filename>tab/gils.att</filename>.
+     New attribute sets can be added by adding new 
+     <filename>tab/*.att</filename> configuration files, which need to
+     be sourced in the main configuration <filename>zebra.cfg</filename>.
+     </para>
+
+    <para>
+     In addition, Zebra allows the acess of 
+     <emphasis>internal index names</emphasis> and <emphasis>dynamic
+     XPath</emphasis> as use attributes. 
+     See  <xref linkend="querymodel-use-string"/> and 
+     <xref linkend="querymodel-use-xpath"/> for
+     alternative acess to the Zebra internal index names and XPath queries.
+    </para> 
 
     <para>
      Phrase search for <emphasis>information retrieval</emphasis> in
       Z> find @attr 1=4 "information retrieval"
      </screen>
     </para>
+    </sect3>
+
+   </sect2>
+
 
+   <sect2 id="querymodel-bib1-nonuse">
+     <title>Zebra general Bib1 Non-Use Attributes (type 2-6)</title>
     
     <sect3 id="querymodel-bib1-relation">
-     <title>Relation Attributes (type = 2)</title>
-    </sect3>
+     <title>Relation Attributes (type 2)</title>
+     
+     <para>
+      Relation attributes describe the relationship of the access
+      point (left side 
+      of the relation) to the search term as qualified by the attributes (right
+      side of the relation), e.g., Date-publication &lt;= 1975.
+      </para>
+
+     <table id="querymodel-bib1-relation-table"
+      frame="all" rowsep="1" colsep="1" align="center">
+
+      <caption>Relation Attributes (type 2)</caption>
+      <thead>
+        <tr>
+         <td>Relation</td>
+         <td>Value</td>
+         <td>Notes</td>
+        </tr>
+       </thead>
+       <tbody>
+        <tr>
+         <td> Less than</td>
+         <td>1</td>
+         <td>supported</td>
+        </tr>
+        <tr>
+         <td>Less than or equal</td>
+         <td>2</td>
+         <td>supported</td>
+        </tr>
+        <tr>
+         <td>Equal</td>
+         <td>3</td>
+         <td>default</td>
+        </tr>
+        <tr>
+         <td>Greater or equal</td>
+         <td>4</td>
+         <td>supported</td>
+        </tr>
+        <tr>
+         <td>Greater than</td>
+         <td>5</td>
+         <td>supported</td>
+        </tr>
+        <tr>
+         <td>Not equal</td>
+         <td>6</td>
+         <td>unsupported</td>
+        </tr>
+        <tr>
+         <td>Phonetic</td>
+         <td>100</td>
+         <td>unsupported</td>
+        </tr>
+        <tr>
+         <td>Stem</td>
+         <td>101</td>
+         <td>unsupported</td>
+        </tr>
+        <tr>
+         <td>Relevance</td>
+         <td>102</td>
+         <td>supported</td>
+        </tr>
+        <tr>
+         <td>AlwaysMatches</td>
+         <td>103</td>
+         <td>unsupported</td>
+        </tr>
+       </tbody>
+     </table>
+
+     <para>
+      The relation attribute 
+      <literal>relevance (102)</literal> is supported, see
+      <xref linkend="administration-ranking"/> for full information.
+      <!-- always-matches (103) not supported for all indexes -->
+     </para>
+     
     <para>
+     All ordering operations are based on a lexicographical ordering, 
+     <emphasis>expect</emphasis> when the 
+     <literal>structure attribute numeric (109)</literal> is used. In
+     this case, ordering is numerical. See 
+      <xref linkend="querymodel-bib1-structure"/>.
     </para>
 
-    <para>
+     <para>
      Ranked search for <emphasis>information retrieval</emphasis> in
-     the title-register
-     (see <xref linkend="administration-ranking"/> for the glory details):
+     the title-register:
      <screen>
       Z> find @attr 1=4 @attr 2=102 "information retrieval"
      </screen>
     </para>
-    
+    </sect3>
+
     <sect3 id="querymodel-bib1-position">
-     <title>Position Attributes (type = 3)</title>
+     <title>Position Attributes (type 3)</title>
+     <para>
+      The position attribute specifies the location of the search term
+      within the field or subfield in which it appears.
+     </para>
+
+     <table id="querymodel-bib1-position-table"
+      frame="all" rowsep="1" colsep="1" align="center">
+
+      <caption>Position Attributes (type 3)</caption>
+      <thead>
+        <tr>
+         <td>Position</td>
+         <td>Value</td>
+         <td>Notes</td>
+        </tr>
+       </thead>
+       <tbody>
+        <tr>
+         <td>First in field </td>
+         <td>1</td>
+         <td>unsupported</td>
+        </tr>
+        <tr>
+         <td>First in subfield</td>
+         <td>2</td>
+         <td>unsupported</td>
+        </tr>
+        <tr>
+         <td>Any position in field</td>
+         <td>3</td>
+         <td>default</td>
+        </tr>
+       </tbody>
+     </table>
+    <para>
+      The position attribute values <literal>first in field (1)</literal>,
+      and <literal>first in subfield(2)</literal> are unsupported.
+      Using them does not trigger an error, but silent defaults to 
+      <literal>any position in field (3)</literal>.
+      <!-- It should -->
+      </para>
     </sect3>
     
     <sect3 id="querymodel-bib1-structure">
-     <title>Structure Attributes (type = 4)</title>
+     <title>Structure Attributes (type 4)</title>
+   
+     <para>
+      The structure attribute specifies the type of search
+      term. This causes the search to be mapped on
+      different Zebra internal indexes, which must have been defined
+      at index time. 
+     </para>
+
+     <para> 
+      The possible values of the  
+      <literal>structure attribute (type 4)</literal> can be defined
+      using the configuraiton file <filename>
+      tab/default.idx</filename>.
+      The default configuration is summerized in this table.
+     </para>
+
+     <table id="querymodel-bib1-structure-table"
+      frame="all" rowsep="1" colsep="1" align="center">
+
+      <caption>Structure Attributes (type 4)</caption>
+      <thead>
+        <tr>
+         <td>Structure</td>
+         <td>Value</td>
+         <td>Notes</td>
+        </tr>
+       </thead>
+       <tbody>
+        <tr>
+         <td>Phrase </td>
+         <td>1</td>
+         <td>default</td>
+        </tr>
+        <tr>
+         <td>Word</td>
+         <td>2</td>
+         <td>supported</td>
+        </tr>
+        <tr>
+         <td>Key</td>
+         <td>3</td>
+         <td>supported</td>
+        </tr>
+        <tr>
+         <td>Year</td>
+         <td>4</td>
+         <td>supported</td>
+        </tr>
+        <tr>
+         <td>Date (normalized)</td>
+         <td>5</td>
+         <td>supported</td>
+        </tr>
+        <tr>
+         <td>Word list</td>
+         <td>6</td>
+         <td>supported</td>
+        </tr>
+        <tr>
+         <td>Date (un-normalized)</td>
+         <td>100</td>
+         <td>unsupported</td>
+        </tr>
+        <tr>
+         <td>Name (normalized) </td>
+         <td>101</td>
+         <td>unsupported</td>
+        </tr>
+        <tr>
+         <td>Name (un-normalized) </td>
+         <td>102</td>
+         <td>unsupported</td>
+        </tr>
+        <tr>
+         <td>Structure</td>
+         <td>103</td>
+         <td>unsupported</td>
+        </tr>
+        <tr>
+         <td>Urx</td>
+         <td>104</td>
+         <td>supported</td>
+        </tr>
+        <tr>
+         <td>Free-form-text</td>
+         <td>105</td>
+         <td>supported</td>
+        </tr>
+        <tr>
+         <td>Document-text</td>
+         <td>106</td>
+         <td>supported</td>
+        </tr>
+        <tr>
+         <td>Local-number</td>
+         <td>107</td>
+         <td>supported</td>
+        </tr>
+        <tr>
+         <td>String</td>
+         <td>108</td>
+         <td>unsupported</td>
+        </tr>
+        <tr>
+         <td>Numeric string</td>
+         <td>109</td>
+         <td>supported</td>
+        </tr>
+       </tbody>
+     </table>
     </sect3>
     
+    <para>
+     The structure attribute value <literal>local-number
+      (107)</literal>
+     is supported, and maps always to the Zebra internal document ID.
+     </para>
 
     <para>
      For example, in
 
     <sect3 id="querymodel-bib1-truncation">
      <title>Truncation Attributes (type = 5)</title>
+
+     <para>
+      The truncation attribute specifies whether variations of one or
+      more characters are allowed between serch term and hit terms, or
+      not. Using non-default truncation attributes will broaden the
+      document hit set of a search query.
+     </para>
+
+     <table id="querymodel-bib1-truncation-table"
+      frame="all" rowsep="1" colsep="1" align="center">
+
+      <caption>Truncation Attributes (type 5)</caption>
+      <thead>
+        <tr>
+         <td>Truncation</td>
+         <td>Value</td>
+         <td>Notes</td>
+        </tr>
+       </thead>
+       <tbody>
+        <tr>
+         <td>Right truncation </td>
+         <td>1</td>
+         <td>supported</td>
+        </tr>
+        <tr>
+         <td>Left truncation</td>
+         <td>2</td>
+         <td>supported</td>
+        </tr>
+        <tr>
+         <td>Left and right truncation</td>
+         <td>3</td>
+         <td>supported</td>
+        </tr>
+        <tr>
+         <td>Do not truncate</td>
+         <td>100</td>
+         <td>default</td>
+        </tr>
+        <tr>
+         <td>Process # in search term</td>
+         <td>101</td>
+         <td>supported</td>
+        </tr>
+        <tr>
+         <td>RegExpr-1 </td>
+         <td>102</td>
+         <td>supported</td>
+        </tr>
+        <tr>
+         <td>RegExpr-2</td>
+         <td>103</td>
+         <td>supported</td>
+        </tr>
+       </tbody>
+     </table>
+
+     <para>
+      Truncation attribute value 
+      <literal>Process # in search term (100)</literal> is a
+      poor-man's regular expression search. It maps
+      each <literal>#</literal> to <literal>.*</literal>, and
+      performes then a <literal>Regexp-1 (102)</literal> regular
+      expression search.
+     </para>
+     <para>
+      Truncation attribute value 
+       <literal>Regexp-1 (102)</literal> is a normal regular search,
+      see.
+     </para>
+     <para>
+       Truncation attribute value 
+      <literal>Regexp-2 (103) </literal> is a Zebra specific extention
+      which allows <emphasis>fuzzy</emphasis> matches. One single
+      error in spelling of search terms is allowed, i.e., a document
+      is hit if it includes a term which can be mapped to the used
+      search term by one character substitution, addition, deletion or
+      change of posiiton. 
+      </para>  
+      <!--
+      Special 104, 105, 106 are deprecated and will be removed! -->
     </sect3>
     
     <sect3 id="querymodel-bib1-completeness">
     <title>Completeness Attributes (type = 6)</title>
+     <para>
+      This attribute is ONLY used if structure w, p is to be
+      chosen. completeness is ignorned if not w, p is to be
+      used..
+      Incomplete field(1) is the default and makes Zebra use
+      register type w.
+      complete subfield(2) and complete field(3) both triggers
+      search field type p.
+     </para>
     </sect3>
    </sect2>
     
      set used in a <literal>search</literal> operation query.
     </para>
 
-     <table id="querymodel-zebra-attr-search-table">
+     <table id="querymodel-zebra-attr-search-table"
+      frame="all" rowsep="1" colsep="1" align="center">
+
       <caption>Zebra Search Attribute Extentions</caption>
        <thead>
         <tr>
-         <td><emphasis>Name and Type</emphasis></td>
+         <td>Name</td>
+         <td>Value</td>
          <td>Operation</td>
          <td>Zebra version</td>
         </tr>
       </thead>
        <tbody>
         <tr>
-         <td><emphasis>Embedded Sort (type 7)</emphasis></td>
+         <td>Embedded Sort</td>
+         <td>7</td>
          <td>search</td>
          <td>1.1</td>
         </tr>
         <tr>
-         <td><emphasis>Term Set (type 8)</emphasis></td>
+         <td>Term Set</td>
+         <td>8</td>
          <td>search</td>
          <td>1.1</td>
         </tr>
         <tr>
-         <td><emphasis>Rank weight  (type 9)</emphasis></td>
+         <td>Rank Weight</td>
+         <td>9</td>
          <td>search</td>
          <td>1.1</td>
         </tr>
         <tr>
-         <td><emphasis>Approx Limit (type 9)</emphasis></td>
+         <td>Approx Limit</td>
+         <td>9</td>
          <td>search</td>
          <td>1.4</td>
         </tr>
         <tr>
-         <td><emphasis>Term Reference (type 10)</emphasis></td>
+         <td>Term Reference</td>
+         <td>10</td>
          <td>search</td>
          <td>1.4</td>
         </tr>
      <title>Zebra Extention Term Reference Attribute (type 10)</title>
     </sect3>
     <para>
-     Zebra supports the searchResult-1 facility. If attribute 10 is
+     Zebra supports the <literal>searchResult-1</literal> facility. 
+     If the <literal>Term Reference Attribute (type 10)</literal> is
      given, that specifies a subqueryId value returned as part of the
      search result. It is a way for a client to name an APT part of a
      query. 
      recognized regardless of attribute 
      set used in a <literal>scan</literal> operation query.
     </para>
-     <table id="querymodel-zebra-attr-scan-table">
+     <table id="querymodel-zebra-attr-scan-table"
+      frame="all" rowsep="1" colsep="1" align="center">
+
       <caption>Zebra Scan Attribute Extentions</caption>
        <thead>
         <tr>
-         <td><emphasis>Name and Type</emphasis></td>
+         <td>Name</td>
+         <td>Type</td>
          <td>Operation</td>
          <td>Zebra version</td>
         </tr>
       </thead>
        <tbody>
         <tr>
-         <td><emphasis>Result Set Narrow (type 8)</emphasis></td>
+         <td>Result Set Narrow</td>
+         <td>8</td>
          <td>scan</td>
          <td>1.3</td>
         </tr>
         <tr>
-         <td><emphasis>Approximative Limit (type 9)</emphasis></td>
+         <td>Approximative Limit</td>
+         <td>9</td>
          <td>scan</td>
          <td>1.4</td>
         </tr>
        </tbody>
       </table>      
 
-    <sect3 id="querymodel-zebra-attr-xyz">
+    <sect3 id="querymodel-zebra-attr-narrow">
      <title>Zebra Extention Result Set Narrow (type 8)</title>
     </sect3>
     <para>
-     If attribute 8 is given for scan, the value is the name of a
-     result set. Each hit count in scan is @and'ed with the result set
-     given. 
+     If attribute <literal>Result Set Narrow (type 8)</literal> 
+     is given for <literal>scan</literal>, the value is the name of a
+     result set. Each hit count in <literal>scan</literal> is 
+     <literal>@and</literal>'ed with the result set given. 
     </para>
-    <!--
     <para>
+     Consider for example 
+     the case of scanning all title fields around the
+     scanterm <emphasis>mozart</emphasis>, then refining the scan by
+     issuing a filtering query for <emphasis>amadeus</emphasis> to
+     restric the scan to the result set of the query:  
      <screen>
+      Z> scan @attr 1=4 mozart 
+      ...
+      * mozart (43)
+        mozartforskningen (1)
+        mozartiana (1)
+        mozarts (16)
+      ...
+      Z> f @attr 1=4 amadeus   
+      ...
+      Number of hits: 15, setno 2
+      ...
+      Z> scan @attr 1=4 @attr 8=2 mozart
+      ...
+      * mozart (14)
+        mozartforskningen (0)
+        mozartiana (0)
+        mozarts (1)
+      ...
      </screen>
     </para>
-    -->
+   
     <warning>
-     Experimental and buggy. Definitely not to be used in production code.
+     Experimental. Do not use in production code.
     </warning>
 
-    <sect3 id="querymodel-zebra-attr-xyz">
+    <sect3 id="querymodel-zebra-attr-approx">
      <title>Zebra Extention Approximative Limit (type 9)</title>
     </sect3>
     <para>
-     The approximative limit (as for search) is a way to enable approx
-     hit counts for scan hit counts. 
+     The <literal>Zebra Extention Approximative Limit (type
+      9)</literal> is a way to enable approx
+     hit counts for <literal>scan</literal> hit counts, in the same
+     way as for <literal>search</literal> hit counts. 
     </para>
     <!--
     <para>
     </para>
     -->
     <warning>
-     Experimental. Do not use in production code.
+     Experimental and buggy. Definitely not to be used in production code.
     </warning>
 
 
    </sect2>
-    
+   
+   
+   <sect2 id="querymodel-idxpath">
+    <title>Zebra special IDXPATH Attribute Set for GRS indexing</title>
+    <para>
+     The attribute-set <literal>idxpath</literal> consists of a single 
+     <literal>Use (type 1)</literal> attribute. All non-use attributes
+     behave as normal. 
+    </para>
+    <para>
+     This feature is enabled when defining the
+     <literal>xpath enable</literal> option in the GRS filter
+     <literal>*.abs</literal> configuration files. If one wants to use
+     the special <literal>idxpath</literal> numeric attribute set, the
+     main Zebra configuraiton file <filename>zebra.cfg</filename>
+     directive <literal>attset: idxpath.att</literal> must be enabled.
+    </para>
+    <warning>The <literal>idxpath</literal> is depreciated, may not be
+     supported in future Zebra versions, and should definitely
+     not be used in production code.
+    </warning>
+
+    <sect3 id="querymodel-idxpath-use">
+    <title>IDXPATH Use Attributes (type = 1)</title>
+     <para>
+      This attribute set allows one to search GRS filter indexed
+      records by XPATH like structured index names. It is enabled by
+      specifying the <literal></literal>
+     </para>
+
+
+     <warning>The <literal>idxpath</literal> option defines hard-coded
+      index names, which might clash with your own index names.
+     </warning>
+     
+     <table id="querymodel-idxpath-use-table"
+      frame="all" rowsep="1" colsep="1" align="center">
+
+      <caption>Zebra specific IDXPATH Use Attributes (type 1)</caption>
+      <thead>
+        <tr>
+         <td>IDXPATH</td>
+         <td>Value</td>
+         <td>String Index</td>
+         <td>Notes</td>
+        </tr>
+       </thead>
+       <tbody>
+        <tr>
+         <td>XPATH Begin</td>
+         <td>1</td>
+         <td>_XPATH_BEGIN</td>
+         <td>depreciated</td>
+        </tr>
+        <tr>
+         <td>XPATH End</td>
+         <td>2</td>
+         <td>_XPATH_END</td>
+         <td>depreciated</td>
+        </tr>
+        <tr>
+         <td>XPATH CData</td>
+         <td>1016</td>
+         <td>_XPATH_CDATA</td>
+         <td>depreciated</td>
+        </tr>
+        <tr>
+         <td>XPATH Attribute Name</td>
+         <td>3</td>
+         <td>_XPATH_ATTR_NAME</td>
+         <td>depreciated</td>
+        </tr>
+        <tr>
+         <td>XPATH Attribute CData</td>
+         <td>1015</td>
+         <td>_XPATH_ATTR_CDATA</td>
+         <td>depreciated</td>
+        </tr>
+       </tbody>
+     </table>
+
+
+     <para>
+      See <filename>tab/idxpath.att</filename> for more information.
+     </para>
+     <para>
+      Search for all documents starting with root element 
+      <literal>/root</literal> (either using the numeric or the string
+      use attributes):
+      <screen>
+       Z> find @attrset idxpath @attr 1=1 @attr 4=3 root/ 
+       Z> find @attr idxpath 1=1 @attr 4=3 root/ 
+       Z> find @attr 1=_XPATH_BEGIN @attr 4=3 root/ 
+      </screen>
+     </para>
+     <para>
+      Search for all documents where specific nested XPATH 
+      <literal>/c1/c2/../cn</literal> exists. Notice the very
+      counter-intuitive <emphasis>reverse</emphasis> notation!
+      <screen>
+       Z> find @attrset idxpath @attr 1=1 @attr 4=3 cn/cn-1/../c1/ 
+       Z> find @attr 1=_XPATH_BEGIN @attr 4=3 cn/cn-1/../c1/ 
+      </screen>
+     </para>
+     <para>
+      Search for CDATA string <emphasis>text</emphasis> in any  element
+      <screen>
+       Z> find @attrset idxpath @attr 1=1016 text
+       Z> find @attr 1=_XPATH_CDATA text
+      </screen>
+     </para>
+     <para>
+       Search for CDATA string <emphasis>anothertext</emphasis> in any
+       attribute: 
+      <screen> 
+       Z> find @attrset idxpath @attr 1=1015 anothertext
+       Z> find @attr 1=_XPATH_ATTR_CDATA anothertext
+      </screen>
+     </para>
+     <para>
+       Search for all documents with have an XML element node
+       including an XML  attribute named <emphasis>creator</emphasis> 
+      <screen> 
+       Z> find @attrset idxpath @attr 1=3 @attr 4=3 creator 
+       Z> find @attr 1=_XPATH_ATTR_NAME @attr 4=3 creator 
+      </screen>
+     </para>
+     <para>
+      Combining usual <literal>bib-1</literal> attribut set searches
+      with <literal>idxpath</literal> attribute set searches:
+      <screen>
+       Z> find @and @attr idxpath 1=1 @attr 4=3 link/ @attr 1=4 mozart
+       Z> find @and @attr 1=_XPATH_BEGIN @attr 4=3 link/ @attr 1=_XPATH_CDATA mozart
+      </screen>
+     </para>
+
+    </sect3>
+   </sect2>
+
 
    <sect2 id="querymodel-bib1-mapping">
     <title>Mapping from Bib1 Attributes to Zebra internal 
      Both query types follow the same syntax with the operands:
     </para>
 
-     <table id="querymodel-regular-operands-table">
+     <table id="querymodel-regular-operands-table"
+      frame="all" rowsep="1" colsep="1" align="center">
+
       <caption>Regular Expression Operands</caption>
        <!--
        <thead>
        -->
        <tbody>
         <tr>
-         <td><emphasis>x</emphasis></td>
-         <td>Matches the character <emphasis>x</emphasis>.</td>
+         <td><literal>x</literal></td>
+         <td>Matches the character <literal>x</literal>.</td>
         </tr>
         <tr>
-         <td><emphasis>.</emphasis></td>
+         <td><literal>.</literal></td>
          <td>Matches any character.</td>
         </tr>
         <tr>
-         <td><emphasis>[ .. ]</emphasis></td>
+         <td><literal>[ .. ]</literal></td>
          <td>Matches the set of characters specified;
          such as <literal>[abc]</literal> or <literal>[a-c]</literal>.</td>
         </tr>
      The above operands can be combined with the following operators:
     </para>
 
-    
-     <table id="querymodel-regular-operators-table">
+     <table id="querymodel-regular-operators-table"
+      frame="all" rowsep="1" colsep="1" align="center">
       <caption>Regular Expression Operators</caption>
        <!--
        <thead>
        -->
        <tbody>
         <tr>
-         <td><emphasis>x*</emphasis></td>
-         <td>Matches <emphasis>x</emphasis> zero or more times. 
+         <td><literal>x*</literal></td>
+         <td>Matches <literal>x</literal> zero or more times. 
           Priority: high.</td>
         </tr>
         <tr>
-         <td><emphasis>x+</emphasis></td>
-         <td>Matches <emphasis>x</emphasis> one or more times. 
+         <td><literal>x+</literal></td>
+         <td>Matches <literal>x</literal> one or more times. 
           Priority: high.</td>
         </tr>
         <tr>
-         <td><emphasis>x?</emphasis></td>
-         <td> Matches <emphasis>x</emphasis> zero or once. 
+         <td><literal>x?</literal></td>
+         <td> Matches <literal>x</literal> zero or once. 
           Priority: high.</td>
         </tr>
         <tr>
-         <td><emphasis>xy</emphasis></td>
-         <td> Matches <emphasis>x</emphasis>, then <emphasis>y</emphasis>.
+         <td><literal>xy</literal></td>
+         <td> Matches <literal>x</literal>, then <literal>y</literal>.
          Priority: medium.</td>
         </tr>
         <tr>
-         <td><emphasis>x|y</emphasis></td>
-         <td> Matches either <emphasis>x</emphasis> or <emphasis>y</emphasis>.
+         <td><literal>x|y</literal></td>
+         <td> Matches either <literal>x</literal> or <literal>y</literal>.
          Priority: low.</td>
         </tr>
         <tr>
-         <td><emphasis>( )</emphasis></td>
+         <td><literal>( )</literal></td>
          <td>The order of evaluation may be changed by using parentheses.</td>
         </tr>
        </tbody>
       </table>      
-    
+
     <para>
-     If the first character of the <emphasis>Regxp-2</emphasis> query
+     If the first character of the <literal>Regxp-2</literal> query
      is a plus character (<literal>+</literal>) it marks the
      beginning of a section with non-standard specifiers.
      The next plus character marks the end of the section.
 
     <para>
      Combinations with other attributes are possible. For example, a
-     ranked search with a regular expression 
-     (see <xref linkend="administration-ranking"/> for the glory details):
+     ranked search with a regular expression:
      <screen>
       Z> find @attr 1=4 @attr 5=102 @attr 2=102 "informat.* retrieval"
      </screen>
     process input records.
     Two basic types of processing are available - raw text and structured
     data. Raw text is just that, and it is selected by providing the
-    argument <emphasis>text</emphasis> to Zebra. Structured records are
+    argument <literal>text</literal> to Zebra. Structured records are
     all handled internally using the basic mechanisms described in the
     subsequent sections.
     Zebra can read structured records in many different formats.