added a lot of info about attribute sets, PQF query structure, and string use attributes

diff --git a/doc/server.xml b/doc/server.xml
index 52b0bc2..dd0a9e9 100644
--- a/doc/server.xml
+++ b/doc/server.xml
@@ -1,5 +1,5 @@
  <chapter id="server">
- <!-- $Id: server.xml,v 1.17 2006-03-08 10:47:37 mike Exp $ -->
+ <!-- $Id: server.xml,v 1.24 2006-06-13 13:45:08 marc Exp $ -->
   <title>The Z39.50 Server</title>
   
   <sect1 id="zebrasrv">
@@ -242,245 +242,8 @@
      also the following section).
     </para>
  
-   <para>
-    <emphasis>Use</emphasis> attributes are interpreted according to the
-    attribute sets which have been loaded in the
-    <literal>zebra.cfg</literal> file, and are matched against specific
-    fields as specified in the <literal>.abs</literal> file which
-    describes the profile of the records which have been loaded.
-    If no Use attribute is provided, a default of Bib-1 Any is assumed.
-   </para>
-
-   <para>
-    If a <emphasis>Structure</emphasis> attribute of
-    <emphasis>Phrase</emphasis> is used in conjunction with a
-    <emphasis>Completeness</emphasis> attribute of
-    <emphasis>Complete (Sub)field</emphasis>, the term is matched
-    against the contents of the phrase (long word) register, if one
-    exists for the given <emphasis>Use</emphasis> attribute.
-    A phrase register is created for those fields in the
-    <literal>.abs</literal> file that contains a
-    <literal>p</literal>-specifier.
-    <!-- ### whatever the hell _that_ is -->
-   </para>
-
-   <para>
-    If <emphasis>Structure</emphasis>=<emphasis>Phrase</emphasis> is
-    used in conjunction with <emphasis>Incomplete Field</emphasis> - the
-    default value for <emphasis>Completeness</emphasis>, the
-    search is directed against the normal word registers, but if the term
-    contains multiple words, the term will only match if all of the words
-    are found immediately adjacent, and in the given order.
-    The word search is performed on those fields that are indexed as
-    type <literal>w</literal> in the <literal>.abs</literal> file.
-   </para>
-
-   <para>
-    If the <emphasis>Structure</emphasis> attribute is
-    <emphasis>Word List</emphasis>,
-    <emphasis>Free-form Text</emphasis>, or
-    <emphasis>Document Text</emphasis>, the term is treated as a
-    natural-language, relevance-ranked query.
-    This search type uses the word register, i.e. those fields
-    that are indexed as type <literal>w</literal> in the
-    <literal>.abs</literal> file.
-   </para>
-
-   <para>
-    If the <emphasis>Structure</emphasis> attribute is
-    <emphasis>Numeric String</emphasis> the term is treated as an integer.
-    The search is performed on those fields that are indexed
-    as type <literal>n</literal> in the <literal>.abs</literal> file.
-   </para>
-
-   <para>
-    If the <emphasis>Structure</emphasis> attribute is
-    <emphasis>URx</emphasis> the term is treated as a URX (URL) entity.
-    The search is performed on those fields that are indexed as type
-    <literal>u</literal> in the <literal>.abs</literal> file.
-   </para>
-
-   <para>
-    If the <emphasis>Structure</emphasis> attribute is
-    <emphasis>Local Number</emphasis> the term is treated as
-    native Zebra Record Identifier.
-   </para>
-
-   <para>
-    If the <emphasis>Relation</emphasis> attribute is
-    <emphasis>Equals</emphasis> (default), the term is matched
-    in a normal fashion (modulo truncation and processing of
-    individual words, if required).
-    If <emphasis>Relation</emphasis> is <emphasis>Less Than</emphasis>,
-    <emphasis>Less Than or Equal</emphasis>,
-    <emphasis>Greater than</emphasis>, or <emphasis>Greater than or
-     Equal</emphasis>, the term is assumed to be numerical, and a
-    standard regular expression is constructed to match the given
-    expression.
-    If <emphasis>Relation</emphasis> is <emphasis>Relevance</emphasis>,
-    the standard natural-language query processor is invoked.
-   </para>
-
-   <para>
-    For the <emphasis>Truncation</emphasis> attribute,
-    <emphasis>No Truncation</emphasis> is the default.
-    <emphasis>Left Truncation</emphasis> is not supported.
-    <emphasis>Process # in search term</emphasis> is supported, as is
-    <emphasis>Regxp-1</emphasis>.
-    <emphasis>Regxp-2</emphasis> enables the fault-tolerant (fuzzy)
-    search. As a default, a single error (deletion, insertion, 
-    replacement) is accepted when terms are matched against the register
-    contents.
-   </para>
-
-   <sect3>
-    <title>Regular expressions</title>
-    
-    <para>
-     Each term in a query is interpreted as a regular expression if
-     the truncation value is either <emphasis>Regxp-1</emphasis> (102)
-     or <emphasis>Regxp-2</emphasis> (103).
-     Both query types follow the same syntax with the operands:
-     <variablelist>
-
-      <varlistentry>
-       <term>x</term>
-       <listitem>
-        <para>
-         Matches the character <emphasis>x</emphasis>.
-        </para>
-       </listitem>
-      </varlistentry>
-      <varlistentry>
-       <term>.</term>
-       <listitem>
-        <para>
-         Matches any character.
-        </para>
-       </listitem>
-      </varlistentry>
-      <varlistentry>
-       <term><literal>[</literal>..<literal>]</literal></term>
-       <listitem>
-        <para>
-         Matches the set of characters specified;
-         such as <literal>[abc]</literal> or <literal>[a-c]</literal>.
-        </para>
-       </listitem>
-      </varlistentry>
-     </variablelist>
-     and the operators:
-     <variablelist>
-      
-      <varlistentry>
-       <term>x*</term>
-       <listitem>
-        <para>
-         Matches <emphasis>x</emphasis> zero or more times. Priority: high.
-        </para>
-       </listitem>
-      </varlistentry>
-      <varlistentry>
-       <term>x+</term>
-       <listitem>
-        <para>
-         Matches <emphasis>x</emphasis> one or more times. Priority: high.
-        </para>
-       </listitem>
-      </varlistentry>
-      <varlistentry>
-       <term>x?</term>
-       <listitem>
-        <para>
-         Matches <emphasis>x</emphasis> zero or once. Priority: high.
-        </para>
-       </listitem>
-      </varlistentry>
-      <varlistentry>
-       <term>xy</term>
-       <listitem>
-        <para>
-         Matches <emphasis>x</emphasis>, then <emphasis>y</emphasis>.
-         Priority: medium.
-        </para>
-       </listitem>
-      </varlistentry>
-      <varlistentry>
-       <term>x|y</term>
-       <listitem>
-        <para>
-         Matches either <emphasis>x</emphasis> or <emphasis>y</emphasis>.
-         Priority: low.
-        </para>
-       </listitem>
-      </varlistentry>
-     </variablelist>
-     The order of evaluation may be changed by using parentheses.
-    </para>
-
-    <para>
-     If the first character of the <emphasis>Regxp-2</emphasis> query
-     is a plus character (<literal>+</literal>) it marks the
-     beginning of a section with non-standard specifiers.
-     The next plus character marks the end of the section.
-     Currently Zebra only supports one specifier, the error tolerance,
-     which consists one digit. 
-    </para>
-
-    <para>
-     Since the plus operator is normally a suffix operator the addition to
-     the query syntax doesn't violate the syntax for standard regular
-     expressions.
-    </para>
-
-   </sect3>
-
-   <sect3>
-    <title>Query examples</title>
-
-    <para>
-     Phrase search for <emphasis>information retrieval</emphasis> in
-     the title-register:
-     <screen>
-      @attr 1=4 "information retrieval"
-     </screen>
-    </para>
-
-    <para>
-     Ranked search for the same thing:
-     <screen>
-      @attr 1=4 @attr 2=102 "Information retrieval"
-     </screen>
-    </para>
-
-    <para>
-     Phrase search with a regular expression:
-     <screen>
-      @attr 1=4 @attr 5=102 "informat.* retrieval"
-     </screen>
-    </para>
-
-    <para>
-     Ranked search with a regular expression:
-     <screen>
-      @attr 1=4 @attr 5=102 @attr 2=102 "informat.* retrieval"
-     </screen>
-    </para>
-
-    <para>
-     In the GILS schema (<literal>gils.abs</literal>), the
-     west-bounding-coordinate is indexed as type <literal>n</literal>,
-     and is therefore searched by specifying
-     <emphasis>structure</emphasis>=<emphasis>Numeric String</emphasis>.
-     To match all those records with west-bounding-coordinate greater
-     than -114 we use the following query:
-     <screen>
-      @attr 4=109 @attr 2=5 @attr gils 1=2038 -114
-     </screen> 
-    </para>
-   </sect3>
-  </sect2>
-
+   </sect2>
+   
    <sect2>
     <title>Present</title>
     <para>
@@ -534,6 +297,32 @@
      timeout.
     </para>
    </sect2>
+   
+   <sect2>
+    <title>Explain</title>
+    <para>
+     Zebra maintains a "classic" 
+     <ulink url="&url.z39.50.explain;">Explain</ulink> database
+     on the side. 
+     This database is called <literal>IR-Explain-1</literal> and can be
+     searched using the attribute set <literal>exp-1</literal>.
+    </para>
+    <para>
+     The records in the explain database are of type 
+     <literal>grs.sgml</literal>.
+     The root element for the Explain grs.sgml records is 
+     <literal>explain</literal>, thus 
+     <filename>explain.abs</filename> is used for indexing.
+    </para>
+    <note>
+     <para>
+      Zebra <emphasis>must</emphasis> be able to locate
+      <filename>explain.abs</filename> in order to index the Explain
+      records properly. Zebra will work without it but the information
+      will not be searchable.
+     </para>
+    </note>
+   </sect2>
   </sect1>
  </chapter>