<chapter id="examples">
- <!-- $Id: examples.xml,v 1.6 2002-09-20 09:58:04 mike Exp $ -->
+ <!-- $Id: examples.xml,v 1.7 2002-10-08 08:09:43 mike Exp $ -->
<title>Example Configurations</title>
<sect1>
<listitem>
<para>
- Where to find the default indexing rules (### default.idx)
+ Where to find subsidiary configuration files, including
+ <literal>default.idx</literal>,
+ which specifies the default indexing rules.
</para>
</listitem>
<listitem>
<para>
- ### Something to do with explain.abs?!
+ What attribute sets to recognise in searches.
</para>
</listitem>
<listitem>
<para>
- ### Where to find other configuration files, e.g. searches using
- BIB-1 attributes require a bib1.att configuration file (even if
- the access point is actually an XPath expression). These are
- searched for in the working directory unless otherwise
- specified.
+ Policy details such as what record type to expect, what
+ low-level indexing algorithm to use, how to identify potential
+ duplicate records, etc.
</para>
</listitem>
</itemizedlist>
</para>
+ <para>
+ Now let's see what goes in the <literal>zebra.cfg</literal> file
+ for some example configurations.
+ </para>
</sect1>
<sect1 id="example1">
- <title>Example 1: Minimal Configuration</title>
+ <title>Example 1: XML Indexing And Searching</title>
<para>
This example shows how Zebra can be used with absolutely minimal
- configuration to index a body of XML documents, and search them
- using XPath expressions to specify access points.
+ configuration to index a body of
+ <ulink url="http://www.w3.org/xml/###">XML</ulink>
+ documents, and search them using
+ <ulink url="http://www.w3.org/xpath/###">XPath</ulink>
+ expressions to specify access points.
</para>
<para>
- Go to the <literal>zebra/examples/dinosauricon</literal> directory.
+ Go to the <literal>examples/dinosauricon</literal> subdirectory
+ of the distribution archive.
There you will find a <literal>records</literal> subdirectory,
which contains some raw XML data to be added to the database: in
- this case, two files, <literal>genera.xml</literal> and
- <literal>taxa.xml</literal>, which contain information about all
- the known dinosaur genera as of August 2002.
+ this case, a single file, <literal>genera.xml</literal>,
+ which contains information about all the known dinosaur genera as of
+ August 2002.
</para>
<para>
Now we need to create the Zebra database, which we do with the
- Zebra indexer, <literal>zebraidx</literal>. This program's
- behaviour is driven by a configuration life, generally called
- <literal>zebra.cfg</literal>, although this can be changed with the
- <literal>-c</literal> option. For our purposes, we don't need any
- special behaviour - we can use the defaults - so an empty
- configuration will do just fine. We can either create an empty
- <literal>zebra.cfg</literal> or specify the name of an existing
- empty file using, for example, <literal>-c /dev/null</literal>.
- </para>
- <para>
- In this case, we'll use an empty <literal>zebra.cfg</literal> so
- we can add more configuration to it later.
+ Zebra indexer, <literal>zebraidx</literal>, which is
+ driven by the <literal>zebra.cfg</literal> configuration file.
+ For our purposes, we don't need any
+ special behaviour - we can use the defaults - so we start with a
+ minimal file that just tells <literal>zebraidx</literal> where to
+ find the default indexing rules, and how to parse the records:
+ <screen>
+ profilePath: .:../../tab:../../../yaz/tab
+ recordType: grs.sgml
+ </screen>
</para>
<para>
That's all you need for a minimal Zebra configuration. Now you can
roll the XML records into the database and build the indexes:
<screen>
- zebraidx -t grs.sgml update records
+ zebraidx update records
</screen>
- (### What does "grs.sgml" actually mean?)
</para>
<para>
Now start the server. Like the indexer, its behaviour is
- controlled by a configuration file, generally
- <literal>zebra.cfg</literal>; and like the indexer, it works just
- fine with an empty configuration.
+ controlled by the
+ <literal>zebra.cfg</literal> file; and like the indexer, it works
+ just fine with this minimal configuration.
<screen>
zebrasrv
</screen>
By default, the server listens on IP port number 9999, although
- this can easily be changed.
+ this can easily be changed - see
+ <xref linkend="zebrasrv"/>.
</para>
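+ <para>
+ As a rough sketch of changing the port, a listener address can be
+ given on the <literal>zebrasrv</literal> command line. The port
+ number 2100 below is an arbitrary choice for illustration, not
+ something this example depends on:
+ <screen>
+ zebrasrv @:2100
+ </screen>
+ A client would then connect to that port instead, for example with
+ <literal>yaz-client tcp:@:2100</literal>.
+ </para>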
<para>
Now you can use the Z39.50 client program of your choice to execute
XPath-based boolean queries and fetch the XML records that satisfy
them:
<screen>
- Z> open tcp:@:9999
- Connecting...Ok.
- Z> find @attr 1=/GENUS/MEANING @or vertebra jaw
- Number of hits: 1
- Z> format xml
- Z> show 1
- Z> show 1
- <GENUS name="Hudiesaurus" type="with" xmlns:idzebra="http://www.indexdata.dk/zebra/">
- <MEANING>
- butterfly <LOW>vertebra</LOW> lizard
- </MEANING>
- <LENGTH value="30"></LENGTH>
- <PLACE name="China"></PLACE>
- <REMAINS content="4 teeth, forelimb, first dorsal vertebra"></REMAINS>
- <SPECIES name="sinojapanorum" status="nudum">
- <AUTHOR name="Dong" year="1997"></AUTHOR>
- <MEANING>
- Chinese-Japanese
- </MEANING>
- </SPECIES>
- <idzebra:size>359</idzebra:size><idzebra:localnumber>447</idzebra:localnumber><idzebra:filename>records/genera.xml</idzebra:filename></GENUS>
+ $ yaz-client tcp:@:9999
+ Connecting...Ok.
+ Z> find @attr 1=/GENUS/MEANING @and lizard earthquakes
+ Number of hits: 1
+ Z> format xml
+ Z> show 1
+ <GENUS name="Sauroposeidon" type="with">
+ <MEANING>lizard Poseidon <LOW>(Greek god of, among other things, earthquakes)</LOW></MEANING>
+ <SPECIES name="proteles">
+ <AUTHOR type="vide" name="Franklin" year="2000"></AUTHOR>
+ <AUTHOR name="Wedel, Cifelli, Sanders"></AUTHOR>
+ </SPECIES>
+ <PLACE name="Oklahoma"></PLACE>
+ <TIME value="Albian"></TIME>
+ <LENGTH value="30" q="1"></LENGTH>
+ <REMAINS content="rib, cervical vertebrae"></REMAINS>
+ <ESSAY>
+ <P> This new <NOMEN name="Brachiosaurus"></NOMEN>-like <LINK content="dinosaur"></LINK>
+ was perhaps the tallest. With its head raised, it stood 60 feet (nearly
+ 20 m) tall. </P>
+ </ESSAY>
+
+ <idzebra xmlns="http://www.indexdata.dk/zebra/">
+ <size>593</size>
+ <localnumber>891</localnumber>
+ <filename>records/genera.xml</filename>
+ </idzebra>
+ </GENUS>
</screen>
</para>
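+ <para>
+ The query above is written in prefix notation: the
+ <literal>@and</literal> operator comes before its two operands, and
+ <literal>@attr 1=/GENUS/MEANING</literal> sets the access point
+ (attribute type 1) to an XPath expression that applies to both
+ terms. As an illustrative variation, not taken from a recorded
+ session, replacing <literal>@and</literal> with
+ <literal>@or</literal> should match records whose
+ <literal>MEANING</literal> element contains either term:
+ <screen>
+ Z> find @attr 1=/GENUS/MEANING @or lizard earthquakes
+ </screen>
+ The number of hits will depend on the records that have been indexed.
+ </para>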
<para>
</sect1>
<sect1 id="example2">
- <title>Example 2: Adding Some Configuration</title>
+ <title>Example 2: Supporting Z39.50 Searches</title>
<para>
You may have noticed as <literal>zebraidx</literal> was building
- the database that it issued several warnings, which we ignored at
- the time:
- <screen>
-zebraidx -t grs.sgml update records
-02:12:32-30/08: zebraidx(18151) [warn] default.idx [No such file or directory]
-02:12:32-30/08: zebraidx(18151) [warn] Couldn't open explain.abs [No such file or directory]
-02:12:32-30/08: zebraidx(18151) [warn] records/genera.xml:0 Couldn't open GENUS.abs [No such file or directory]
-02:12:32-30/08: zebraidx(18151) [warn] records/genera.xml:0 Unknown register type: 0
-02:12:32-30/08: zebraidx(18151) [warn] records/genera.xml:0 Unknown register type: w
-02:12:35-30/08: zebraidx(18151) [warn] records/taxa.xml:0 Couldn't open TAXON.abs [No such file or directory]
- </screen>
- And the server issued several more as the client connected to it,
- then searched for and retrieved a record:
+ the database that it issued a warning, which we ignored at the
+ time:
<screen>
-02:17:10-30/08: zebrasrv(18165) [warn] default.idx [No such file or directory]
-02:17:10-30/08: zebrasrv(18165) [warn] Couldn't open explain.abs [No such file or directory]
-02:17:57-30/08: zebrasrv(18165) [warn] Unknown register type: w
-02:18:42-30/08: zebrasrv(18165) [warn] Couldn't open GENUS.abs [No such file or directory]
+ $ zebraidx update records
+ 00:45:46-08/10: ../../index/zebraidx(5016) [warn] records/genera.xml:0 Couldn't open GENUS.abs [No such file or directory]
</screen>
</para>
</sect1>
The master configuration file, <literal>zebra.cfg</literal>,
which is as short and simple as it can be:
<screen>
- # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.6 2002-09-20 09:58:04 mike Exp $
+ # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.7 2002-10-08 08:09:43 mike Exp $
# Bare-bones master configuration file for Zebra
profilePath: .:../../tab:../../../yaz/tab
</screen>
The BIB-1 attribute set configuration file,
<literal>bib1.att</literal>, which is also as short as possible:
<screen>
- # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.6 2002-09-20 09:58:04 mike Exp $
+ # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.7 2002-10-08 08:09:43 mike Exp $
# Bare-bones BIB-1 attribute set file for Zebra
reference Bib-1
</screen>