+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ A list of element descriptions (this is the actual ARS of the
+ schema, in Z39.50 terms), which lists the ways in which the various
+ tags can be used and organized hierarchically.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
+ </para>
+
+ <para>
+ Several of the entries above simply refer to other files, which
+ describe the given objects.
+ </para>
+
+ </sect2>
+
+ <sect2>
+ <title>The Configuration Files</title>
+
+ <para>
+ This section describes the syntax and use of the various tables which
+ are used by the retrieval module.
+ </para>
+
+ <para>
+ The number of different file types may appear daunting at first, but
+ each type corresponds fairly clearly to a single aspect of the Z39.50
+ retrieval facilities. Further, the average database administrator,
+ who is simply reusing an existing profile for which tables already
+ exist, shouldn't have to worry too much about the contents of these tables.
+ </para>
+
+ <para>
+ Generally, the files are simple ASCII files, which can be maintained
+ using any text editor. Blank lines, and lines beginning with a (#), are
+ ignored. Any characters on a line following a (#) are also ignored.
+ All other lines contain <emphasis>directives</emphasis>, which provide
+ some setting or value to the system.
+ Generally, settings are characterized by a single
+ keyword, identifying the setting, followed by a number of parameters.
+ Some settings are repeatable (r), while others may occur only once in a
+ file. Some settings are optional (o), while others again are
+ mandatory (m).
+ </para>
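+
+ <para>
+ As a sketch of these rules, the following fragment (illustrative only,
+ not taken from a shipped file) contains a comment line, a blank line,
+ and two directives, the second of which carries a trailing comment.
+ The <literal>name</literal> and <literal>esetname</literal>
+ directives used here are among those described for the .abs files.
+ </para>
+
+ <para>
+
+ <screen>
+ # This whole line is a comment and is ignored
+
+ name gils
+ esetname F @ # everything from the (#) onwards is ignored
+ </screen>
+
+ </para>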
+
+ </sect2>
+
+ <sect2 id="abs-file">
+ <title>The Abstract Syntax (.abs) Files</title>
+
+ <para>
+ The name of this file type is slightly misleading in Z39.50 terms,
+ since, apart from the actual abstract syntax of the profile, it also
+ includes most of the other definitions that go into a database
+ profile.
+ </para>
+
+ <para>
+ When a record in the canonical, SGML-like format is read from a file
+ or from the database, the first tag of the file should reference the
+ profile that governs the layout of the record. If the first tag of the
+ record is, say, <literal>&lt;gils&gt;</literal>, the system will look
+ for the profile definition in the file <literal>gils.abs</literal>.
+ Profile definitions are cached, so they only have to be read once
+ during the lifespan of the current process.
+ </para>
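+
+ <para>
+ For illustration, a record governed by the GILS profile might begin as
+ sketched below. The element names and their content are assumptions
+ made for this example; only the role of the first tag,
+ <literal>&lt;gils&gt;</literal>, which selects the file
+ <literal>gils.abs</literal>, is taken from the description above.
+ </para>
+
+ <para>
+
+ <screen>
+ &lt;gils&gt;
+ &lt;title&gt;An example title&lt;/title&gt;
+ &lt;abstract&gt;An example abstract.&lt;/abstract&gt;
+ &lt;/gils&gt;
+ </screen>
+
+ </para>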
+
+ <para>
+ When writing your own input filters, the
+ <emphasis>record-begin</emphasis> command
+ introduces the profile, and should always be called first thing when
+ introducing a new record.
+ </para>
+
+ <para>
+ The file may contain the following directives:
+ </para>
+
+ <para>
+ <variablelist>
+
+ <varlistentry>
+ <term>name <replaceable>symbolic-name</replaceable></term>
+ <listitem>
+ <para>
+ (m) This provides a shorthand name or
+ description for the profile. Mostly useful for diagnostic purposes.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>reference <replaceable>OID-name</replaceable></term>
+ <listitem>
+ <para>
+ (m) The reference name of the OID for the profile.
+ The reference names can be found in the <emphasis>util</emphasis>
+ module of YAZ.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>attset <replaceable>filename</replaceable></term>
+ <listitem>
+ <para>
+ (m) The attribute set that is used for
+ indexing and searching records belonging to this profile.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>tagset <replaceable>filename</replaceable></term>
+ <listitem>
+ <para>
+ (o) The tag set (if any) that describes
+ the fields of the records.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>varset <replaceable>filename</replaceable></term>
+ <listitem>
+ <para>
+ (o) The variant set used in the profile.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>maptab <replaceable>filename</replaceable></term>
+ <listitem>
+ <para>
+ (o,r) This points to a
+ conversion table that might be used if the client asks for the record
+ in a different schema from the native one.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>marc <replaceable>filename</replaceable></term>
+ <listitem>
+ <para>
+ (o) Points to a file containing parameters
+ for representing the record contents in the ISO2709 syntax.
+ Read the description of the MARC representation facility below.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>esetname <replaceable>name filename</replaceable></term>
+ <listitem>
+ <para>
+ (o,r) Associates the
+ given element set name with an element selection file. If an (@) is
+ given in place of the filename, this corresponds to a null mapping for
+ the given element set name.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>any <replaceable>tags</replaceable></term>
+ <listitem>
+ <para>
+ (o) This directive specifies a list of attributes
+ which should be appended to the attribute list given for each
+ element. The effect is to make every single element in the abstract
+ syntax searchable by way of the given attributes. This directive
+ provides an efficient way of supporting free-text searching across all
+ elements. However, it does increase the size of the index
+ significantly. The attributes can be qualified with a structure, as in
+ the <replaceable>elm</replaceable> directive below.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>elm <replaceable>path name attributes</replaceable></term>
+ <listitem>
+ <para>
+ (o,r) Adds an element to the abstract record syntax of the schema.
+ The <replaceable>path</replaceable> follows the
+ syntax which is suggested by the Z39.50 document - that is, a sequence
+ of tags separated by slashes (/). Each tag is given as a
+ comma-separated pair of tag type and tag value, surrounded by parentheses.
+ The <replaceable>name</replaceable> is the name of the element, and
+ the <replaceable>attributes</replaceable>
+ specifies which attributes to use when indexing the element in a
+ comma-separated list.
+ A ! in place of the attribute name is equivalent to
+ specifying an attribute name identical to the element name.
+ A - in place of the attribute name
+ specifies that no indexing is to take place for the given element.
+ The attributes can be qualified with <replaceable>field
+ types</replaceable> to specify which
+ character set should govern the indexing procedure for that field.
+ The same data element may be indexed into several different
+ fields, using different character set definitions.
+ See the <xref linkend="field-structure-and-character-sets"/>.
+ The default field type is <literal>w</literal> for
+ <emphasis>word</emphasis>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>xelm <replaceable>xpath attributes</replaceable></term>
+ <listitem>
+ <para>
+ Specifies indexing for record nodes given by
+ <replaceable>xpath</replaceable>. Unlike the
+ elm directive, this directive allows you to index attribute
+ contents. The <replaceable>xpath</replaceable> uses
+ a syntax similar to XPath. The <replaceable>attributes</replaceable>
+ have the same syntax and meaning as in the elm directive, except that
+ the ! operator refers to the nodes selected by <replaceable>xpath</replaceable>.
+ <!--
+ xelm / !:w default index
+ xelm // !:w additional index
+ xelm /gils/title/@att myatt:w index attribute @att in myatt
+ xelm title/@att myatt:w same meaning.
+ -->
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>encoding <replaceable>encodingname</replaceable></term>
+ <listitem>
+ <para>
+ This directive specifies the character encoding for external records.
+ For records such as XML, which specify their encoding within the
+ file via a header, this directive is ignored.
+ If this directive is not given and no encoding is set
+ within the external records, ISO-8859-1 encoding is assumed.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>xpath <literal>enable</literal>/<literal>disable</literal></term>
+ <listitem>
+ <para>
+ If this directive is followed by <literal>enable</literal>,
+ then extra indexing is performed to allow for XPath-like queries.
+ If this directive is not specified, which is equivalent to
+ <literal>disable</literal>, no extra XPath indexing is performed.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <!-- Adam's version
+ <varlistentry>
+ <term>systag <replaceable>systemtag</replaceable> <replaceable>element</replaceable></term>
+ <listitem>
+ <para>
+ This directive maps system information to an element during
+ retrieval. This information is dynamically created. The
+ following system tags are defined
+ <variablelist>
+ <varlistentry>
+ <term>size</term>
+ <listitem>
+ <para>
+ Size of record in bytes. By default this
+ is mapped to element <literal>size</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>rank</term>
+ <listitem>
+ <para>
+ Score/rank of record. By default this
+ is mapped to element <literal>rank</literal>.
+ If no score was calculated for the record (non-ranked
+ search), this directive is ignored.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>sysno</term>
+ <listitem>
+ <para>
+ Zebra's system number (record ID) for the
+ record. By default this is mapped to element
+ <literal>localControlNumber</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ If you do not want a particular system tag to be applied,
+ then set the resulting element to something undefined in the
+ abs file (such as <literal>none</literal>).
+ </para>
+ </listitem>
+ </varlistentry>
+ -->
+
+ <!-- Mike's version -->
+ <varlistentry>
+ <term>
+ systag
+ <replaceable>systemTag</replaceable>
+ <replaceable>actualTag</replaceable>
+ </term>
+ <listitem>
+ <para>
+ Specifies what information, if any, Zebra should
+ automatically include in retrieval records for the
+ <quote>system fields</quote> that it supports.
+ <replaceable>systemTag</replaceable> may
+ be any of the following:
+ <variablelist>
+ <varlistentry>
+ <term><literal>rank</literal></term>
+ <listitem><para>
+ An integer indicating the relevance-ranking score
+ assigned to the record.
+ </para></listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><literal>sysno</literal></term>
+ <listitem><para>
+ An automatically generated identifier for the record,
+ unique within this database. It is represented by the
+ <literal>&lt;localControlNumber&gt;</literal> element in
+ XML and the <literal>(1,14)</literal> tag in GRS-1.
+ </para></listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><literal>size</literal></term>
+ <listitem><para>
+ The size, in bytes, of the retrieved record.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ <para>
+ The <replaceable>actualTag</replaceable> parameter may be
+ <literal>none</literal> to indicate that the named element
+ should be omitted from retrieval records.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+
+ <note>
+ <para>
+ The mechanism for controlling indexing is not adequate for
+ complex databases, and will probably be moved into a separate
+ configuration table eventually.
+ </para>
+ </note>
+
+ <para>
+ The following is an excerpt from the abstract syntax file for the GILS
+ profile.
+ </para>
+
+ <para>
+
+ <screen>
+ name gils
+ reference GILS-schema
+ attset gils.att
+ tagset gils.tag
+ varset var1.var
+
+ maptab gils-usmarc.map
+
+ # Element set names
+
+ esetname VARIANT gils-variant.est # for WAIS-compliance
+ esetname B gils-b.est
+ esetname G gils-g.est
+ esetname F @
+
+ elm (1,10) rank -
+ elm (1,12) url -
+ elm (1,14) localControlNumber Local-number
+ elm (1,16) dateOfLastModification Date/time-last-modified
+ elm (2,1) title w:!,p:!
+ elm (4,1) controlIdentifier Identifier-standard
+ elm (2,6) abstract Abstract
+ elm (4,51) purpose !
+ elm (4,52) originator -
+ elm (4,53) accessConstraints !
+ elm (4,54) useConstraints !
+ elm (4,70) availability -
+ elm (4,70)/(4,90) distributor -
+ elm (4,70)/(4,90)/(2,7) distributorName !
+ elm (4,70)/(4,90)/(2,10) distributorOrganization !
+ elm (4,70)/(4,90)/(4,2) distributorStreetAddress !
+ elm (4,70)/(4,90)/(4,3) distributorCity !
+ </screen>
+
+ </para>
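+
+ <para>
+ The excerpt above does not use the <literal>xelm</literal>,
+ <literal>encoding</literal>, <literal>xpath</literal> or
+ <literal>systag</literal> directives. The following fragment is a
+ hypothetical sketch of how they might appear in an .abs file; the
+ index name <literal>myatt</literal>, the attribute name
+ <literal>att</literal> and the encoding name
+ <literal>UTF-8</literal> are assumptions chosen for the example.
+ </para>
+
+ <para>
+
+ <screen>
+ # Assumed encoding name for external records without their own header
+ encoding UTF-8
+
+ # Perform the extra indexing needed for XPath-like queries
+ xpath enable
+
+ # Index the content of attribute att of the title element in myatt
+ xelm /gils/title/@att myatt:w
+
+ # Omit the size system field from retrieval records
+ systag size none
+ </screen>
+
+ </para>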
+
+ </sect2>
+
+ <sect2 id="attset-files">
+ <title>The Attribute Set (.att) Files</title>
+
+ <para>
+ This file type describes the <replaceable>Use</replaceable> elements of
+ an attribute set.
+ It contains the following directives.
+ </para>
+
+ <para>
+ <variablelist>
+ <varlistentry>
+ <term>name <replaceable>symbolic-name</replaceable></term>
+ <listitem>
+ <para>
+ (m) This provides a shorthand name or
+ description for the attribute set.
+ Mostly useful for diagnostic purposes.
+ </para>
+ </listitem></varlistentry>
+ <varlistentry>
+ <term>reference <replaceable>OID-name</replaceable></term>
+ <listitem>
+ <para>
+ (m) The reference name of the OID for
+ the attribute set.
+ The reference names can be found in the <replaceable>util</replaceable>
+ module of <replaceable>YAZ</replaceable>.
+ </para>
+ </listitem></varlistentry>
+ <varlistentry>
+ <term>include <replaceable>filename</replaceable></term>
+ <listitem>
+ <para>
+ (o,r) This directive is used to
+ include another attribute set as a part of the current one. This is
+ used when a new attribute set is defined as an extension to another
+ set. For instance, many new attribute sets are defined as extensions
+ to the <replaceable>bib-1</replaceable> set.
+ This is an important feature of the retrieval
+ system of Z39.50: it ensures the highest possible level of
+ interoperability, because those access points of your database which are
+ derived from the external set (say, bib-1) can be used even by clients
+ that are unaware of the new set.
+ </para>
+ </listitem></varlistentry>
+ <varlistentry>
+ <term>att
+ <replaceable>att-value att-name [local-value]</replaceable></term>
+ <listitem>
+ <para>
+ (o,r) This
+ repeatable directive introduces a new attribute to the set. The
+ attribute value is stored in the index (unless a
+ <replaceable>local-value</replaceable> is
+ given, in which case this is stored). The name is used to refer to the
+ attribute from the <replaceable>abstract syntax</replaceable>.
+ </para>
+ </listitem></varlistentry>
+ </variablelist>
+ </para>
+
+ <para>
+ This is an excerpt from the GILS attribute set definition.
+ Notice how the file describing the <emphasis>bib-1</emphasis>
+ attribute set is referenced.
+ </para>
+
+ <para>
+
+ <screen>
+ name gils
+ reference GILS-attset
+ include bib1.att
+
+ att 2001 distributorName
+ att 2002 indextermsControlled
+ att 2003 purpose
+ att 2004 accessConstraints
+ att 2005 useConstraints
+ </screen>
+
+ </para>
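+
+ <para>
+ None of the lines above uses the optional
+ <replaceable>local-value</replaceable> parameter. As a hypothetical
+ sketch (the numbers and the attribute name are invented for this
+ example), the following line would define attribute 2010 under the name
+ <literal>projectCode</literal>, but store the value 3010 in the index.
+ </para>
+
+ <para>
+
+ <screen>
+ att 2010 projectCode 3010
+ </screen>
+
+ </para>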
+
+ </sect2>
+
+ <sect2>
+ <title>The Tag Set (.tag) Files</title>
+
+ <para>
+ This file type defines the tagset of the profile, possibly by
+ referencing other tag sets (most tag sets, for instance, will include
+ tagsetG and tagsetM from the Z39.50 specification). The file may
+ contain the following directives.
+ </para>
+
+ <para>
+ <variablelist>
+
+ <varlistentry>
+ <term>name <emphasis>symbolic-name</emphasis></term>
+ <listitem>
+ <para>
+ (m) This provides a shorthand name or
+ description for the tag set. Mostly useful for diagnostic purposes.
+ </para>
+ </listitem></varlistentry>
+ <varlistentry>
+ <term>reference <emphasis>OID-name</emphasis></term>
+ <listitem>
+ <para>
+ (o) The reference name of the OID for the tag set.
+ The reference names can be found in the <emphasis>util</emphasis>
+ module of <emphasis>YAZ</emphasis>.
+ The directive is optional, since not all tag sets
+ are registered outside of their schema.
+ </para>
+ </listitem></varlistentry>
+ <varlistentry>
+ <term>type <emphasis>integer</emphasis></term>
+ <listitem>
+ <para>
+ (m) The type number of the tagset within the schema
+ profile (note: this specification really should belong to the .abs
+ file. This will be fixed in a future release).
+ </para>
+ </listitem></varlistentry>
+ <varlistentry>
+ <term>include <emphasis>filename</emphasis></term>
+ <listitem>
+ <para>
+ (o,r) This directive is used
+ to include the definitions of other tag sets into the current one.
+ </para>
+ </listitem></varlistentry>
+ <varlistentry>
+ <term>tag <emphasis>number names type</emphasis></term>
+ <listitem>
+ <para>
+ (o,r) Introduces a new tag to the set.
+ The <emphasis>number</emphasis> is the tag number as used
+ in the protocol (there is currently no mechanism for
+ specifying string tags, but this would be quick
+ work to add).
+ The <emphasis>names</emphasis> parameter is a list of names
+ by which the tag should be recognized in the input file format.
+ The names should be separated by slashes (/).
+ The <emphasis>type</emphasis> is the recommended data type of
+ the tag.
+ It should be one of the following:
+
+ <itemizedlist>