<chapter id="examples">
- <!-- $Id: examples.xml,v 1.19 2002-12-30 12:56:07 adam Exp $ -->
+ <!-- $Id: examples.xml,v 1.26 2007-02-02 11:10:08 marc Exp $ -->
<title>Example Configurations</title>
- <sect1>
+ <sect1 id="examples-overview">
<title>Overview</title>
<para>
- <literal>zebraidx</literal> and <literal>zebrasrv</literal> are both
+ <command>zebraidx</command> and
+ <command>zebrasrv</command> are both
driven by a master configuration file, which may refer to other
subsidiary configuration files. By default, they try to use
<filename>zebra.cfg</filename> in the working directory as the
option to specify an alternative master configuration file.
</para>
<para>
- The master configuration file tells Zebra:
+ The master configuration file tells &zebra;:
<itemizedlist>
<listitem>
</sect1>
<sect1 id="example1">
- <title>Example 1: XML Indexing And Searching</title>
+ <title>Example 1: &xml; Indexing And Searching</title>
<para>
- This example shows how Zebra can be used with absolutely minimal
+ This example shows how &zebra; can be used with absolutely minimal
configuration to index a body of
- <ulink url="http://www.w3.org/XML/">XML</ulink>
+ <ulink url="&url.xml;">&xml;</ulink>
documents, and search them using
- <ulink url="http://www.w3.org/TR/xpath">XPath</ulink>
+ <ulink url="&url.xpath;">XPath</ulink>
expressions to specify access points.
</para>
<para>
records are generated from the family tree in the file
<literal>dino.tree</literal>.)
Type <literal>make records/dino.xml</literal>
- to make the XML data file.
- (Or you could just type <literal>make dino</literal> to build the XML
+ to make the &xml; data file.
+ (Or you could just type <literal>make dino</literal> to build the &xml;
data file, create the database and populate it with the taxonomic
records all in one shot - but then you wouldn't learn anything,
would you? :-)
</para>
<para>
- Now we need to create a Zebra database to hold and index the XML
+ Now we need to create a &zebra; database to hold and index the &xml;
records. We do this with the
- Zebra indexer, <literal>zebraidx</literal>, which is
+ &zebra; indexer, <command>zebraidx</command>, which is
driven by the <literal>zebra.cfg</literal> configuration file.
For our purposes, we don't need any
special behaviour - we can use the defaults - so we can start with a
- minimal file that just tells <literal>zebraidx</literal> where to
+ minimal file that just tells <command>zebraidx</command> where to
find the default indexing rules, and how to parse the records:
<screen>
profilePath: .:../../tab
</screen>
</para>
<para>
- That's all you need for a minimal Zebra configuration. Now you can
- roll the XML records into the database and build the indexes:
+ That's all you need for a minimal &zebra; configuration. Now you can
+ roll the &xml; records into the database and build the indexes:
<screen>
zebraidx update records
</screen>
<xref linkend="zebrasrv"/>.
</para>
<para>
- Now you can use the Z39.50 client program of your choice to execute
- XPath-based boolean queries and fetch the XML records that satisfy
+ Now you can use the &z3950; client program of your choice to execute
+ XPath-based boolean queries and fetch the &xml; records that satisfy
them:
<screen>
$ yaz-client @:9999
significantly because it ties searching semantics to the physical
structure of the searched records. You can't use the same search
specification to search two databases if their internal
- representations are different. Consider an different taxonomy
+ representations are different. Consider a different taxonomy
database in which the records have taxon names specified
inside a <literal>&lt;name&gt;</literal> element nested within a
<literal>&lt;identification&gt;</literal> element
<para>
How, then, can we build broadcasting Information Retrieval
applications that look for records in many different databases?
- The Z39.50 protocol offers a powerful and general solution to this:
- abstract ``access points''. In the Z39.50 model, an access point
+ The &z3950; protocol offers a powerful and general solution to this:
+ abstract ``access points''. In the &z3950; model, an access point
is simply a point at which searches can be directed. Nothing is
said about implementation: in a given database, an access point
might be implemented as an index, a path into physical records, an
algorithm for interrogating relational tables or whatever works.
- The only important thing point is that the semantics of an access
- point are fixed and well defined.
+ The only important thing is that the semantics of an access
+ point is fixed and well defined.
</para>
<para>
For convenience, access points are gathered into <firstterm>attribute
- sets</firstterm>. For example, the BIB-1 attribute set is supposed to
+ sets</firstterm>. For example, the &bib1; attribute set is supposed to
contain bibliographic access points such as author, title, subject
and ISBN; the GEO attribute set contains access points pertaining
to geospatial information (bounding coordinates, stratum, latitude
(provenance, inscriptions, etc.)
</para>
<para>
- In practice, the BIB-1 attribute set has tended to be a dumping
+ In practice, the &bib1; attribute set has tended to be a dumping
ground for all sorts of access points, so that, for example, it
includes some geospatial access points as well as strictly
bibliographic ones. Nevertheless, this model
records in databases.
</para>
<para>
- In the BIB-1 attribute set, a taxon name is probably best
+ In the &bib1; attribute set, a taxon name is probably best
interpreted as a title - that is, a phrase that identifies the item
- in question. BIB-1 represents title searches by
- access point 4. (See
- <ulink url="ftp://ftp.loc.gov/pub/z3950/defs/bib1.txt"
- >The BIB-1 Attribute Set Semantics</ulink>)
+ in question. &bib1; represents title searches by
+ access point 4. (See
+ <ulink url="&url.z39.50.bib1.semantics;">The &bib1; Attribute
+ Set Semantics</ulink>.)
So we need to configure our dinosaur database so that searches for
- BIB-1 access point 4 look in the
+ &bib1; access point 4 look in the
<literal>&lt;termName&gt;</literal> element,
inside the top-level
<literal>&lt;Zthes&gt;</literal> element.
</para>
<para>
- This is a two-step process. First, we need to tell Zebra that we
- want to support the BIB-1 attribute set. Then we need to tell it
+ This is a two-step process. First, we need to tell &zebra; that we
+ want to support the &bib1; attribute set. Then we need to tell it
which elements of its record pertain to access point 4.
</para>
<para>
</callout>
<callout arearefs="attset.attset">
<para>
- Declare Bib-1 attribute set. See <filename>bib1.att</filename> in
- Zebra's <filename>tab</filename> directory.
+ Declare &bib1; attribute set. See <filename>bib1.att</filename> in
+ &zebra;'s <filename>tab</filename> directory.
</para>
</callout>
<callout arearefs="termId">
<callout arearefs="termName">
<para>
Make <literal>termName</literal> word searchable by both
- Zthes attribute termName (1002) and Bib-1 atttribute title (4).
+ Zthes attribute termName (1002) and &bib1; attribute title (4).
</para>
</callout>
</calloutlist>
</programlistingco>
<para>
- After re-indexing, we can search the database using Bib-1
+ After re-indexing, we can search the database using &bib1;
attribute, title, as follows:
<screen>
Z> form xml
Z> s
Sent presentRequest (1+1).
Records: 1
-[Default]Record type: XML
+[Default]Record type: &xml;
<Zthes>
<termId>2</termId>
<termName>Eoraptor</termName>
by exporting a line-drawing done in TGIF, then converted that to the
GIF using a shell-script called "epstogif" which used an appallingly
baroque sequence of conversions, which I would prefer not to pollute
-the Zebra build environment with:
+the &zebra; build environment with:
#!/bin/sh