Move some private notes in here.

diff --git a/doc/examples.xml b/doc/examples.xml

index 1a08eae..0341fd5 100644
--- a/doc/examples.xml
+++ b/doc/examples.xml
@@ -1,5 +1,5 @@
  <chapter id="examples">
- <!-- $Id: examples.xml,v 1.1 2002-08-29 01:16:12 mike Exp $ -->
+ <!-- $Id: examples.xml,v 1.4 2002-08-30 01:18:40 mike Exp $ -->
   <title>Example Configurations</title>
  
   <sect1>
@@ -44,55 +44,141 @@
   </sect1>
  
   <sect1>
-  <title>First Example: Minimal Configuration</title>
+  <title>Example 1: Minimal Configuration</title>
  
    <para>
-   This example shows how Zebra can be used, with absolutely minimal
-   configuration, to index a body of XML documents, and search them
+   This example shows how Zebra can be used with absolutely minimal
+   configuration to index a body of XML documents, and search them
     using XPath expressions to specify access points.
    </para>
    <para>
-   Go to the
-   <literal>zebra/examples/dinosauricon</literal>
-   directory.  There you will find three significant files:
+   Go to the <literal>zebra/examples/dinosauricon</literal> directory.
+   There you will find a <literal>records</literal> subdirectory,
+   which contains some raw XML data to be added to the database: in
+   this case, two files, <literal>genera.xml</literal> and
+   <literal>taxa.xml</literal>, which contain information about all
+   the known dinosaur genera as of August 2002.
    </para>
+  <para>
+   Now we need to create the Zebra database, which we do with the
+   Zebra indexer, <literal>zebraidx</literal>.  This program's
+   behaviour is driven by a configuration file, generally called
+   <literal>zebra.cfg</literal>, although this can be changed with the
+   <literal>-c</literal> option.  For our purposes, we don't need any
+   special behaviour - we can use the defaults - so an empty
+   configuration will do just fine.  We can either create an empty
+   <literal>zebra.cfg</literal> or specify the name of an existing
+   empty file using, for example, <literal>-c /dev/null</literal>.
+  </para>
+  <para>
+   In this case, we'll use an empty <literal>zebra.cfg</literal> so
+   we can add more configuration to it later.
+  </para>
+  <para>
+   That's all you need for a minimal Zebra configuration.  Now you can
+   roll the XML records into the database and build the indexes:
+   <screen>
+       zebraidx -t grs.sgml update records
+   </screen>
+   (### What does "grs.sgml" actually mean?)
+  </para>
+  <para>
+   Now start the server.  Like the indexer, its behaviour is
+   controlled by a configuration file, generally
+   <literal>zebra.cfg</literal>; and like the indexer, it works just
+   fine with an empty configuration.
+   <screen>
+       zebrasrv
+   </screen>
+   By default, the server listens on TCP port number 9999, although
+   this can easily be changed.
+  </para>
+  <para>
+   Now you can use the Z39.50 client program of your choice to execute
+   XPath-based boolean queries and fetch the XML records that satisfy
+   them:
+   <screen>
+       Z&gt; open tcp:@:9999
+       Connecting...Ok.
+       Z&gt; find @attr 1=/GENUS/MEANING @or vertebra jaw
+       Number of hits: 1
+       Z&gt; format xml
+       Z&gt; show 1
+       Z> show 1
+       &lt;GENUS name="Hudiesaurus" type="with" xmlns:idzebra="http://www.indexdata.dk/zebra/"&gt;
+        &lt;MEANING&gt;
+         butterfly &lt;LOW&gt;vertebra&lt;/LOW&gt; lizard
+        &lt;/MEANING&gt;
+        &lt;LENGTH value="30"&gt;&lt;/LENGTH&gt;
+        &lt;PLACE name="China"&gt;&lt;/PLACE&gt;
+        &lt;REMAINS content="4 teeth, forelimb, first dorsal vertebra"&gt;&lt;/REMAINS&gt;
+        &lt;SPECIES name="sinojapanorum" status="nudum"&gt;
+         &lt;AUTHOR name="Dong" year="1997"&gt;&lt;/AUTHOR&gt;
+         &lt;MEANING&gt;
+          Chinese-Japanese
+         &lt;/MEANING&gt;
+        &lt;/SPECIES&gt;
+       &lt;idzebra:size&gt;359&lt;/idzebra:size&gt;&lt;idzebra:localnumber&gt;447&lt;/idzebra:localnumber&gt;&lt;idzebra:filename&gt;records/genera.xml&lt;/idzebra:filename&gt;&lt;/GENUS&gt;
+   </screen>
+  </para>
+  <para>
+   Now wasn't that easy?
+  </para>
+ </sect1>
  
-  <itemizedlist>
-   <listitem>
-    <para>
-     The <literal>records</literal> subdirectory, which contains the
-     raw XML data to be added to the database: in this case, just one
-     file, <literal>genera.xml</literal>, which contains information
-     about all the known dinosaur genera as of October 2000.
-     <!-- ### Get more recent data -->
-    </para>
-   </listitem>
+ <sect1>
+  <title>Example 2: Adding Some Configuration</title>
+
+  <para>
+   You may have noticed as <literal>zebraidx</literal> was building
+   the database that it issued several warnings, which we ignored at
+   the time:
+   <screen>
+zebraidx -t grs.sgml update records
+02:12:32-30/08: zebraidx(18151) [warn] default.idx [No such file or directory]
+02:12:32-30/08: zebraidx(18151) [warn] Couldn't open explain.abs [No such file or directory]
+02:12:32-30/08: zebraidx(18151) [warn] records/genera.xml:0 Couldn't open GENUS.abs [No such file or directory]
+02:12:32-30/08: zebraidx(18151) [warn] records/genera.xml:0 Unknown register type: 0
+02:12:32-30/08: zebraidx(18151) [warn] records/genera.xml:0 Unknown register type: w
+02:12:35-30/08: zebraidx(18151) [warn] records/taxa.xml:0 Couldn't open TAXON.abs [No such file or directory]
+   </screen>
+   And the server issued several more as the client connected to it,
+   then searched for and retrieved a record:
+   <screen>
+02:17:10-30/08: zebrasrv(18165) [warn] default.idx [No such file or directory]
+02:17:10-30/08: zebrasrv(18165) [warn] Couldn't open explain.abs [No such file or directory]
+02:17:57-30/08: zebrasrv(18165) [warn] Unknown register type: w
+02:18:42-30/08: zebrasrv(18165) [warn] Couldn't open GENUS.abs [No such file or directory]
+   </screen>
+  </para>
+ </sect1>
+</chapter>
+
+<!--
  
     <listitem>
      <para>
       The master configuration file, <literal>zebra.cfg</literal>,
       which is as short and simple as it can be:
-     <!-- ### Keep this up to date -->
       <screen>
-       # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.1 2002-08-29 01:16:12 mike Exp $
+       # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.4 2002-08-30 01:18:40 mike Exp $
         # Bare-bones master configuration file for Zebra
-       attset: bib1.att
+       profilePath: .:../../tab:../../../yaz/tab
       </screen>
       Apart from the comments, which are ignored, all this specifies is
       that the server should recognise the attribute set described in
       the file called
       <literal>bib1.att</literal>.
+     ### What is an attribute set?
      </para>
-    <!-- ### What is an attribute set? -->
     </listitem>
  
     <listitem>
      <para>
       The BIB-1 attribute set configuration file,
       <literal>bib1.att</literal>, which is also as short as possible:
-     <!-- ### Keep this up to date -->
       <screen>
-       # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.1 2002-08-29 01:16:12 mike Exp $
+       # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.4 2002-08-30 01:18:40 mike Exp $
         # Bare-bones BIB-1 attribute set file for Zebra
         reference Bib-1
       </screen>
@@ -101,44 +187,40 @@
       <literal>Bib-1</literal>, a name recognised by the system as
       referring to a well-known opaque identifier that is transmitted
       by clients as part of their searches.
-     <!-- ### Yeuch!  Surely we can say that better! -->
+     ### Yeuch!  Surely we can say that better!
      </para>
      <para>
       ### Can't we somehow say this trivial thing in the main
       configuration file?
      </para>
     </listitem>
-  </itemizedlist>
+-->
  
-  <para>
-   That's all you need for a minimal Zebra configuration.  Now you can
-   roll the XML records into the database and build the indexes:
-   <screen>
-       zebraidx -t grs.sgml update records
-   </screen>
-   <!-- ### What does "grs.sgml" actually mean? -->
-   and start the server which, by default listens on port 9999:
-   <screen>
-       zebrasrv
-   </screen>
-  </para>
-  <para>
-   Now you can use the Z39.50 client program of your choice to execute
-   XPath-based boolean queries and fetch the XML records that satisfy
-   them:
-   <screen>
-       Z&gt; open tcp:@:9999
-       Connecting...Ok.
-       Z&gt; find @attr 1=/GENUS/MEANING @or vertebra jaw
-       Number of hits: 2
-       Z&gt; format xml
-       Z&gt; show 1
-       &lt;GENUS name="Anurognathus" type="with" xmlns:idzebra="http://www.indexdata.dk/zebra/"&gt;&lt;SPECIES name="ammoni"&gt;&lt;AUTHOR name="Doederline" year="1923"&gt;&lt;/AUTHOR&gt;&lt;/SPECIES&gt;&lt;MEANING&gt;tailless&lt;I&gt;or&lt;/I&gt;anuran&lt;LOW&gt;(frog)&lt;/LOW&gt;jaw&lt;/MEANING&gt;&lt;TIME value="Tithonian" section="late"&gt;&lt;/TIME&gt;&lt;PLACE name="Germany"&gt;&lt;/PLACE&gt;&lt;LENGTH wingspan="1" value=".5"&gt;&lt;/LENGTH&gt;&lt;idzebra:size&gt;304&lt;/idzebra:size&gt;&lt;idzebra:localnumber&gt;70&lt;/idzebra:localnumber&gt;&lt;idzebra:filename&gt;records/genera.xml&lt;/idzebra:filename&gt;&lt;/GENUS&gt;
-   </screen>
-  </para>
- </sect1>
-
-</chapter>
+<!--
+       The simplest hello-world example could go like this:
+       
+       Index the document
+       
+       <book>
+          <title>The art of motorcycle maintenance</title>
+          <subject scheme="Dewey">zen</subject>
+       </book>
+       
+       And search it like
+       
+       f @attr 1=/book/title motorcycle
+       
+       f @attr 1=/book/subject[@scheme=Dewey] zen
+       
+       If you suddenly decide you want broader interop, you can add
+       an abs file (more or less like this):
+       
+       attset bib1.att
+       tagset tagsetg.tag
+       
+       elm (2,1)       title   title
+       elm (2,21)      subject  subject
+-->
  
   <!-- Keep this comment at the end of the file
   Local variables: