<chapter id="server">
- <!-- $Id: server.xml,v 1.3 2002-04-10 14:47:49 heikki Exp $ -->
+ <!-- $Id: server.xml,v 1.13 2006-02-16 12:28:52 mike Exp $ -->
<title>The Z39.50 Server</title>
<sect1 id="zebrasrv">
<title>Running the Z39.50 Server (zebrasrv)</title>
- <para>
+ <!--
FIXME - We need to be consistent here, zebraidx had the options at the
end, and lots of explaining text before them. Same for zebrasvr! -H
FIXME - At least we need a small intro, what is zebrasvr, and how it
can be run (inetd, nt service, stand-alone program, daemon...) -H
- </para>
+ -->
+
+ <!-- re-write by MC, using the newly created input files for the
+ zebrasrv manpage -->
+
+
+ <sect2><title>DESCRIPTION</title>
+ <para>Zebra is a high-performance, general-purpose structured text indexing
+ and retrieval engine. It reads structured records in a variety of input
+ formats (e.g. email, XML, MARC) and allows access to them through exact
+ boolean search expressions and relevance-ranked free-text queries.
+ </para>
+ <para>
+ <command>zebrasrv</command> is the Z39.50 and <ulink url="http://www.loc.gov/standards/sru/srw/">SRW</ulink>/U frontend
+ server for the <command>Zebra</command> indexer.
+ </para>
+ <para>
+ On Unix you can run the <command>zebrasrv</command>
+ server from the command line - and put it
+ in the background. It may also operate under the inet daemon.
+ On WIN32 you can run the server as a console application or
+ as a WIN32 Service.
+ </para>
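+ <para>
+ As a rough illustration only, a stand-alone Unix invocation that
+ puts the server in the background might look like the following.
+ The listener address <literal>@:9999</literal> (any local
+ interface, port 9999) is an assumption made for this sketch;
+ adjust it and the configuration file name to suit your
+ installation.
+ </para>
+ <screen>
+ zebrasrv -c zebra.cfg @:9999 &amp;
+ </screen>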
+ </sect2>
+
+ <sect2>
+ <title>SYNOPSIS</title>
+ &zebrasrv-synopsis;
+ </sect2>
+
+ <sect2>
+ <title>OPTIONS</title>
+
+ <para>
+ The options for <command>zebrasrv</command> are the same
+ as those for YAZ' <command>yaz-ztest</command>.
+ Option <literal>-c</literal> specifies a Zebra configuration
+ file - if omitted <filename>zebra.cfg</filename> is read.
+ </para>
+
+ &zebrasrv-options;
+ </sect2>
+ <sect2 id="gfs-config"><title>VIRTUAL HOSTS</title>
+ <para>
+ <command>zebrasrv</command> uses the YAZ server frontend and does
+ support multiple virtual servers behind multiple listening sockets.
+ </para>
+ &zebrasrv-virtual;
+ </sect2>
+ <sect2><title>FILES</title>
+ <para>
+ <filename>zebra.cfg</filename>
+ </para>
+ </sect2>
+ <sect2><title>SEE ALSO</title>
+ <para>
+ <citerefentry>
+ <refentrytitle>zebraidx</refentrytitle>
+ <manvolnum>1</manvolnum>
+ </citerefentry>,
+ <citerefentry>
+ <refentrytitle>yaz-ztest</refentrytitle>
+ <manvolnum>8</manvolnum>
+ </citerefentry>
+ </para>
+ <para>
+ Section "The Z39.50 Server" in the Zebra manual.
+ <filename>http://www.indexdata.dk/zebra/doc/server.tkl</filename>
+ </para>
+ <para>
+ Section "Virtual Hosts" in the YAZ manual.
+ <filename>http://www.indexdata.dk/yaz/doc/server.vhosts.tkl</filename>
+ </para>
+ <para>
+ Section "Specification of <ulink url="http://www.loc.gov/standards/sru/cql/">CQL</ulink> to RPN mappings" in the YAZ manual.
+ <filename>http://www.indexdata.dk/yaz/doc/tools.tkl#tools.cql.map</filename>
+ </para>
+ <para>
+ The Zebra software is Copyright <command>Index Data</command>
+ <filename>http://www.indexdata.dk</filename>
+ and distributed under the
+ GPLv2 license.
+ </para>
+ </sect2>
+ <!--
<para>
<emphasis remap="bf">Syntax</emphasis>
<screen>
- zebrasrv [options] [listener-address ...]
+ zebrasrv [options] [listener-address ...]
</screen>
</para>
<listitem>
<para>
The log level. Use a comma-separated list of members of the set
- {fatal,debug,warn,log,all,none}.
+ {fatal,debug,warn,log,all,none}.
</para>
</listitem>
</varlistentry>
</varlistentry>
</variablelist>
</para>
-
- <para>
- A <replaceable>listener-address</replaceable> consists of an optional
- transport mode followed by a colon (:) followed by a listener address.
- The transport mode is either <literal>ssl</literal> or
- <literal>tcp</literal> (default).
- </para>
-
- <para>
- For TCP, an address has the form
- </para>
-
- <para>
-
- <screen>
- hostname | IP-number [: portnumber]
- </screen>
-
- </para>
-
- <para>
- The port number defaults to 210 (standard Z39.50 port) for
- privileged users (root), and 9999 for normal users.
- </para>
-
- <para>
- Examples
- </para>
-
- <para>
-
- <screen>
- tcp:dranet.dra.com
-
- ssl:secure.lib.com:3000
- </screen>
-
- </para>
-
- <para>
- In both cases, the special hostname "@" is mapped to
- the address INADDR_ANY, which causes the server to listen on any local
- interface. To start the server listening on the registered port for
- Z39.50, and to drop root privileges once the ports are bound, execute
- the server like this (from a root shell):
- </para>
-
- <para>
-
- <screen>
- zebrasrv -u daemon @
- </screen>
-
- </para>
-
- <para>
- You can replace <literal>daemon</literal> with another user, eg.
- your own account, or a dedicated IR server account.
- </para>
-
- <para>
- The default behavior for <literal>zebrasrv</literal> is to establish
- a single TCP/IP listener, for the Z39.50 protocol, on port 9999.
- </para>
-
+ -->
</sect1>
+
<sect1 id="protocol-support">
<title>Z39.50 Protocol Support and Behavior</title>
<sect2 id="search">
<title>Search</title>
- <para>
+ <!--
FIXME - Need to explain the string tag stuff before people get bogged
down with all these attribute numbers. Perhaps in its own
chapter? -H
- </para>
+ -->
<para>
The supported query types are 1 and 101. All operators are currently
</para>
<para>
- The server has full support for piggy-backed present requests (see
+ The server has full support for piggy-backed retrieval (see
also the following section).
</para>
A phrase register is created for those fields in the
<literal>.abs</literal> file that contains a
<literal>p</literal>-specifier.
+ <!-- ### whatever the hell _that_ is -->
</para>
<para>
For the <emphasis>Truncation</emphasis> attribute,
<emphasis>No Truncation</emphasis> is the default.
<emphasis>Left Truncation</emphasis> is not supported.
- <emphasis>Process #</emphasis> is supported, as is
+ <emphasis>Process # in search term</emphasis> is supported, as is
<emphasis>Regxp-1</emphasis>.
<emphasis>Regxp-2</emphasis> enables the fault-tolerant (fuzzy)
search. As a default, a single error (deletion, insertion,
<term>x?</term>
<listitem>
<para>
- Matches <emphasis>x</emphasis> once or twice. Priority: high.
- FIXME Is this right? Std regexp has '?' meaning zero or one -H
+ Matches <emphasis>x</emphasis> zero or one times. Priority: high.
</para>
</listitem>
</varlistentry>
</listitem>
</varlistentry>
<varlistentry>
- <term>x|y</term>
+ <term>x|y</term>
<listitem>
<para>
Matches either <emphasis>x</emphasis> or <emphasis>y</emphasis>.
</sect2>
</sect1>
</chapter>
+
+
+<chapter id="server-sru">
+ <title>The SRU/SRW Server</title>
+ <para>
+ In addition to Z39.50, Zebra supports the more recent and
+ web-friendly IR protocol SRU, described at
+ <ulink url="http://www.loc.gov/sru"/>.
+ SRU is ``Search/Retrieve via URL'', a simple, REST-like protocol
+ that uses HTTP GET to request search responses. The request
+ itself is made of parameters such as
+ <literal>query</literal>,
+ <literal>startRecord</literal>,
+ <literal>maximumRecords</literal>
+ and
+ <literal>recordSchema</literal>;
+ the response is an XML document containing hit-count, result-set
+ records, diagnostics, etc. SRU can be thought of as a re-casting
+ of Z39.50 semantics in web-friendly terms; or as a standardisation
+ of the ad-hoc query parameters used by search engines such as Google
+ and AltaVista; or as a superset of A9's OpenSearch (which it
+ predates).
+ </para>
+ <para>
+ Zebra further supports SRW, described at
+ <ulink url="http://www.loc.gov/srw"/>.
+ SRW is the ``Search/Retrieve Web Service'', a SOAP-based alternative
+ implementation of the abstract protocol that SRU implements as HTTP
+ GET requests. In SRW, requests are encoded as XML documents which
+ are posted to the server. The responses are identical to those
+ returned by SRU servers, except that they are wrapped in several
+ layers of SOAP envelope.
+ </para>
+ <para>
+ Zebra supports all three protocols - Z39.50, SRU and SRW - on the
+ same port, recognising what protocol is used by each incoming
+ request and handling it accordingly. This is achieved through
+ the use of Deep Magic; civilians are warned not to stand too close.
+ </para>
+ <para>
+ From here on, ``SRU'' is used to indicate both the SRU and SRW
+ protocols, as they are identical except for the transport used for
+ the protocol packets and Zebra's support for them is equivalent.
+ </para>
+
+ <sect1 id="server-sru-run">
+ <title>Running the SRU Server (zebrasrv)</title>
+ <para>
+ Because Zebra supports all three protocols on one port, it would
+ seem to follow that the SRU server is run in the same way as
+ the Z39.50 server, as described above. This is true, but only in
+ an uninterestingly vacuous way: a Zebra server run in this manner
+ will indeed recognise and accept SRU requests; but since it
+ doesn't know how to handle the CQL queries that these protocols
+ use, all it can do is send failure responses.
+ </para>
+ <note>
+ <para>
+ It is possible to cheat, by having SRU search Zebra with
+ a PQF query instead of CQL, using the
+ <literal>x-pquery</literal>
+ parameter instead of
+ <literal>query</literal>.
+ This is a
+ <emphasis role="strong">non-standard extension</emphasis>
+ of SRU, and a
+ <emphasis role="strong">very naughty</emphasis>
+ thing to do, but it does give you a way to see Zebra serving SRU
+ ``right out of the box''. If you start your favourite Zebra
+ server in the usual way, on port 9999, then you can send your web
+ browser to:
+ </para>
+ <screen>
+ http://localhost:9999/Default?version=1.1&amp;
+ operation=searchRetrieve&amp;
+ x-pquery=mineral&amp;
+ startRecord=1&amp;
+ maximumRecords=1
+ </screen>
+ <para>
+ This will display the XML-formatted SRU response that includes the
+ first record in the result-set found by the query
+ <literal>mineral</literal>. (For clarity, the SRU URL is shown
+ here broken across lines, but the lines should be joined together
+ to make a single-line URL for the browser to submit.)
+ </para>
+ </note>
+ <para>
+ In order to turn on Zebra's support for CQL queries, it's necessary
+ to have the YAZ generic front-end (which Zebra uses) translate them
+ into the Z39.50 Type-1 query format that is used internally. And
+ to do this, the generic front-end's own configuration file must be
+ used. This file is described
+ <link linkend="gfs-config">elsewhere</link>;
+ the salient point for SRU support is that
+ <command>zebrasrv</command>
+ must be started with the
+ <literal>-f frontendConfigFile</literal>
+ option rather than the
+ <literal>-c zebraConfigFile</literal>
+ option,
+ and that the front-end configuration file must include both a
+ reference to the Zebra configuration file and the CQL-to-PQF
+ translator configuration file.
+ </para>
+ <para>
+ A minimal front-end configuration file that does this would read as
+ follows:
+ </para>
+ <screen><![CDATA[
+ <yazgfs>
+ <server>
+ <config>zebra.cfg</config>
+ <cql2rpn>../../tab/pqf.properties</cql2rpn>
+ </server>
+ </yazgfs>
+]]></screen>
+ <para>
+ The
+ <literal>&lt;config&gt;</literal>
+ element contains the name of the Zebra configuration file that was
+ previously specified by the
+ <literal>-c</literal>
+ command-line argument, and the
+ <literal>&lt;cql2rpn&gt;</literal>
+ element contains the name of the CQL properties file specifying how
+ various CQL indexes, relations, etc. are translated into Type-1
+ queries.
+ </para>
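+ <para>
+ As a sketch, assuming the configuration above has been saved in a
+ file called <filename>yazgfs.xml</filename> (a name chosen here
+ purely for illustration) and that the paths inside it resolve from
+ the directory in which the server is started, the server could be
+ launched with:
+ </para>
+ <screen>
+ zebrasrv -f yazgfs.xml
+ </screen>
+ <para>
+ A standard CQL search, using <literal>query</literal> rather than
+ <literal>x-pquery</literal>, should then be answerable. The
+ following request assumes the server is listening on port 9999 of
+ the local machine and that the properties file maps a bare CQL term
+ to a suitable default index; as before, the lines should be joined
+ into a single URL:
+ </para>
+ <screen>
+ http://localhost:9999/Default?version=1.1&amp;
+ operation=searchRetrieve&amp;
+ query=mineral&amp;
+ startRecord=1&amp;
+ maximumRecords=1
+ </screen>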
+ </sect1>
+
+ <sect1 id="server-sru-support">
+ <title>SRU and SRW Protocol Support and Behavior</title>
+ <para>
+ Zebra running as an SRU server supports SRU version 1.1, including
+ CQL version 1.1. In particular, it provides support for the
+ following elements of the protocol.
+ </para>
+
+ <sect2>
+ <title>Search and Retrieval</title>
+ <para>
+ Zebra fully supports SRU's core
+ <literal>searchRetrieve</literal>
+ operation, as described at
+ <ulink url="http://www.loc.gov/standards/sru/sru-spec.html"/>
+ </para>
+ <para>
+ One of the great strengths of SRU is that it mandates a standard
+ query language, CQL, and that all conforming implementations can
+ therefore be trusted to correctly interpret the same queries. It
+ is with some shame, then, that we admit that Zebra also supports
+ an additional query language, our own Prefix Query Format (PQF,
+ <ulink url="http://indexdata.com/yaz/doc/tools.tkl#PQF"/>).
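+ PQF queries are submitted using the non-standard
+ <literal>x-pquery</literal> parameter in place of
+ <literal>query</literal>.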
+ </para>
+ </sect2>
+
+ <sect2>
+ <title>Scan</title>
+ <para>
+ ###
+ </para>
+ </sect2>
+
+ <sect2>
+ <title>Explain</title>
+ <para>
+ ###
+ </para>
+ </sect2>
+
+ <sect2>
+ <title>Initialization, Present, Sort, Close</title>
+ <para>
+ In the Z39.50 protocol, Initialization, Present, Sort and Close
+ are separate operations. In SRU, however, these operations do not
+ exist.
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ SRU has no explicit initialization handshake phase, but
+ commences immediately with searching, scanning and explain
+ operations.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Neither does SRU have a close operation, since the protocol is
+ stateless and each request is self-contained. (It is true that
+ multiple SRU request/response pairs may be implemented as
+ multiple HTTP request/response pairs over a single persistent
+ TCP/IP connection; but the closure of that connection is not a
+ protocol-level operation.)
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Retrieval in SRU is part of the
+ <literal>searchRetrieve</literal> operation, in which a search
+ is submitted and the response includes a subset of the records
+ in the result set. There is no direct analogue of Z39.50's
+ Present operation which requests records from an established
+ result set. In SRU, this is achieved by sending a subsequent
+ <literal>searchRetrieve</literal> request with the query
+ <literal>cql.resultSetId=</literal><emphasis>id</emphasis> where
+ <emphasis>id</emphasis> is the identifier of the previously
+ generated result-set.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Sorting in SRU is done within the
+ <literal>searchRetrieve</literal> operation - in v1.1, by an
+ explicit <literal>sortKeys</literal> parameter, but the forthcoming
+ v1.2 or v2.0 will most likely use an extension of the query
+ language, CQL, for sorting: see
+ <ulink url="http://zing.z3950.org/cql/sorting.html"/>
+ </para>
+ </listitem>
+ </itemizedlist>
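+ <para>
+ To illustrate the Present-like behaviour described in the list
+ above, a follow-up request for records 11 to 20 of an existing
+ result set might look like the following sketch. Here
+ <replaceable>id</replaceable> stands for whatever result-set
+ identifier the server reported in its earlier response; the host,
+ port and database name are assumptions; and the
+ <literal>=</literal> inside the query is URL-encoded as
+ <literal>%3D</literal>. The lines should be joined into a single
+ URL.
+ </para>
+ <screen>
+ http://localhost:9999/Default?version=1.1&amp;
+ operation=searchRetrieve&amp;
+ query=cql.resultSetId%3D<replaceable>id</replaceable>&amp;
+ startRecord=11&amp;
+ maximumRecords=10
+ </screen>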
+ <para>
+ It can be seen, then, that while Zebra operating as an SRU server
+ does not provide the same set of operations as when operating as a
+ Z39.50 server, it does provide equivalent functionality.
+ </para>
+ </sect2>
+ </sect1>
+</chapter>
+
<!-- Keep this comment at the end of the file
Local variables:
mode: sgml