<chapter id="administration">
- <!-- $Id: administration.xml,v 1.3 2002-04-09 13:26:26 adam Exp $ -->
+ <!-- $Id: administration.xml,v 1.12 2002-10-30 11:09:39 adam Exp $ -->
<title>Administrating Zebra</title>
-
+ <!-- ### It's a bit daft that this chapter (which describes half of
+ the configuration-file formats) is separated from
+ "recordmodel.xml" (which describes the other half) by the
+ instructions on running zebraidx and zebrasrv. Some careful
+ re-ordering is required here.
+ -->
+
<para>
Unlike many simpler retrieval systems, Zebra supports safe, incremental
updates to an existing index.
Indexing is a per-record process, in which either insert/modify/delete
will occur. Before a record is indexed search keys are extracted from
whatever might be the layout the original record (sgml,html,text, etc..).
- The Zebra system currently supports two fundamantal types of records:
+ The Zebra system currently supports two fundamental types of records:
structured and simple text.
To specify a particular extraction process, use either the
command line option <literal>-t</literal> or specify a
<para>
You can edit the configuration file with a normal text editor.
- parameter names and values are seperated by colons in the file. Lines
- starting with a hash sign (<literal>#</literal>) are
+ Parameter names and values are separated by colons in the file. Lines
+ starting with a hash sign (<literal>#</literal>) are
treated as comments.
</para>
explained further in the following sections.
</para>
+ <!--
+ FIXME - Didn't Adam make something to have multiple databases in multiple dirs...
+ -->
+
<para>
<variablelist>
<varlistentry>
<term>
<emphasis>group</emphasis>
- .recordType[<emphasis>.name</emphasis>]:
+ .recordType[<emphasis>.name</emphasis>]:
<replaceable>type</replaceable>
</term>
<listitem>
<listitem>
<para>
Specifies the Z39.50 database name.
+ <!-- FIXME - now we can have multiple databases in one server. -H -->
</para>
</listitem>
</varlistentry>
group of records. If you plan to update/delete this type of
records later this should be specified as 1; otherwise it
should be 0 (default), to save register space.
+ <!-- ### this is the first mention of "register" -->
See <xref linkend="file-ids"/>.
</para>
</listitem>
</listitem>
</varlistentry>
<varlistentry>
+ <!-- ### probably a better place to define "register" -->
<term>register: <replaceable>register-location</replaceable></term>
<listitem>
<para>
<term>keyTmpDir: <replaceable>directory</replaceable></term>
<listitem>
<para>
- Directory in which temporary files used during zebraidx' update
+ Directory in which temporary files used during zebraidx's update
phase are stored.
</para>
</listitem>
<para>
Specifies a directory base for Zebra. All relative paths
given (in profilePath, register, shadow) are based on this
- directory. This setting is useful if if you Zebra server
+ directory. This setting is useful if your Zebra server
is running in a different directory from where
<literal>zebra.cfg</literal> is located.
</para>
<title>Locating Records</title>
<para>
- The default behaviour of the Zebra system is to reference the
+ The default behavior of the Zebra system is to reference the
records from their original location, i.e. where they were found when you
ran <literal>zebraidx</literal>.
That is, when a client wishes to retrieve a record
following a search operation, the files are accessed from the place
where you originally put them - if you remove the files (without
- running <literal>zebraidx</literal> again, the client
- will receive a diagnostic message.
+ running <literal>zebraidx</literal> again), the server will return
+ diagnostic number 14 (``System error in presenting records'') to
+ the client.
</para>
<para>
disk space than simpler indexing methods, but it makes it easier for
you to keep the index in sync with a frequently changing set of data.
If you combine this system with the <emphasis>safe update</emphasis>
- facility (see below), you never have to take your server offline for
+ facility (see below), you never have to take your server off-line for
maintenance or register updating purposes.
</para>
in the configuration file. In addition, you should set
<literal>storeKeys</literal> to <literal>1</literal>, since the Zebra
indexer must save additional information about the contents of each record
- in order to modify the indices correctly at a later time.
+ in order to modify the indexes correctly at a later time.
</para>
+ <!--
+ FIXME - There must be a simpler way to do this with Adam's string tags -H
+ -->
+
<para>
For example, to update records of group <literal>esdd</literal>
located below
and then run <literal>zebraidx</literal> with the
<literal>update</literal> command.
</para>
+ <!-- ### what happens if a file contains multiple records? -->
</sect1>
<sect1 id="generic-ids">
<title>Indexing with General Record IDs</title>
<para>
- When using this method you construct an (almost) arbritrary, internal
+ When using this method you construct an (almost) arbitrary, internal
record key based on the contents of the record itself and other system
information. If you have a group of records that explicitly associates
an ID with each record, this method is convenient. For example, the
each directory in the order specified and use the next specified
directories as needed.
The <emphasis>size</emphasis> is an integer followed by a qualifier
- code, <literal>M</literal> for megabytes,
+ code,
+ <literal>b</literal> for bytes,
- <literal>k</literal> for kilobytes.
+ <literal>k</literal> for kilobytes,
+ <literal>M</literal> for megabytes,
+ <literal>G</literal> for gigabytes.
</para>
<para>
For instance, if you have allocated two disks for your register, and
the first disk is mounted
- on <literal>/d1</literal> and has 200 Mb of free space and the
- second, mounted on <literal>/d2</literal> has 300 Mb, you could
+ on <literal>/d1</literal> and has 2 GB of free space and the
+ second, mounted on <literal>/d2</literal>, has 3.6 GB, you could
put this entry in your configuration file:
<screen>
- register: /d1:200M /d2:300M
+ register: /d1:2G /d2:3600M
</screen>
</para>
your responsibility to ensure that enough space is available, and that
other applications do not attempt to use the free space. In a large
production system, it is recommended that you allocate one or more
- filesystem exclusively to the Zebra register files.
+ file systems exclusively to the Zebra register files.
</para>
</sect1>
In order to make changes to the system take effect for the
users, you'll have to submit a "commit" command after a
(sequence of) update operation(s).
- You can ask the indexer to commit the changes immediately
- after the update operation:
</para>
<para>
<screen>
- $ zebraidx update /d1/records update /d2/more-records commit
+ $ zebraidx update /d1/records
+ $ zebraidx commit
</screen>
</para>
<para>
<screen>
- $ zebraidx -g books update /d1/records update /d2/more-records
+ $ zebraidx -g books update /d1/records /d2/more-records
$ zebraidx -g fun update /d3/fun-records
$ zebraidx commit
</screen>