+ </sect1>
+
+ <sect1 id="shadow-registers">
+ <title>Safe Updating - Using Shadow Registers</title>
+
+ <sect2 id="shadow-registers-description">
+ <title>Description</title>
+
+ <para>
+ The &zebra; server supports <emphasis>updating</emphasis> of the index
+ structures. That is, you can add, modify, or remove records from
+ databases managed by &zebra; without rebuilding the entire index.
+ Since this process involves modifying structured files with various
+ references between blocks of data in the files, the update process
+ is inherently sensitive to system crashes, or to process interruptions:
+ Anything but a successfully completed update process will leave the
+ register files in an unknown state, and you will essentially have no
+ recourse but to re-index everything, or to restore the register files
+ from a backup medium.
+ Further, while the update process is active, users cannot be
+ allowed to access the system, as the contents of the register files
+ may change unpredictably.
+ </para>
+
+ <para>
+ You can solve these problems by enabling the shadow register system in
+ &zebra;.
+ During the updating procedure, <literal>zebraidx</literal> will temporarily
+ write changes to the involved files in a set of "shadow
+ files", without modifying the files that are accessed by the
+ active server processes. If the update procedure is interrupted by a
+ system crash or a signal, you simply repeat the procedure - the
+ register files have not been changed or damaged, and the partially
+ written shadow files are automatically deleted before the new updating
+ procedure commences.
+ </para>
+
+ <para>
+ At the end of the updating procedure (or in a separate operation, if
+ you so desire), the system enters a "commit mode". First,
+ any active server processes are forced to access those blocks that
+ have been changed from the shadow files rather than from the main
+ register files; the unmodified blocks are still accessed at their
+ normal location (the shadow files are not a complete copy of the
+ register files - they only contain those parts that have actually been
+ modified). If the commit process is interrupted at any point,
+ the server processes will continue to access the
+ shadow files until you can repeat the commit procedure and complete
+ the writing of data to the main register files. You can perform
+ multiple update operations to the registers before you commit the
+ changes to the system files, or you can execute the commit operation
+ at the end of each update operation. When the commit phase has
+ completed successfully, any running server processes are instructed to
+ switch their operations to the new, operational register, and the
+ temporary shadow files are deleted.
+ </para>
+
+ </sect2>
+
+ <sect2 id="shadow-registers-how-to-use">
+ <title>How to Use Shadow Register Files</title>
+
+ <para>
+ The first step is to allocate space on your system for the shadow
+ files.
+ You do this by adding a <literal>shadow</literal> entry to the
+ <literal>zebra.cfg</literal> file.
+ The syntax of the <literal>shadow</literal> entry is exactly the
+ same as for the <literal>register</literal> entry
+ (see <xref linkend="register-location"/>).
+ The location of the shadow area should be
+ <emphasis>different</emphasis> from the location of the main register
+ area (if you have specified one - remember that if you provide no
+ <literal>register</literal> setting, the default register area is the
+ working directory of the server and indexing processes).
+ </para>
+
+ <para>
+ The following excerpt from a <literal>zebra.cfg</literal> file shows
+ one example of a setup that configures both the main register
+ location and the shadow file area.
+ Note that two directories or partitions have been set aside
+ for the shadow file area. You can specify any number of directories
+ for each of the file areas, but remember that there should be no
+ overlaps between the directories used for the main registers and the
+ shadow files, respectively.
+ </para>
+ <para>
+
+ <screen>
+ register: /d1:500M
+ shadow: /scratch1:100M /scratch2:200M
+ </screen>
+
+ </para>
+
+ <para>
+ When shadow files are enabled, an extra command is available at the
+ <literal>zebraidx</literal> command line.
+ In order to make changes to the system take effect for the
+ users, you'll have to submit a "commit" command after a
+ (sequence of) update operation(s).
+ </para>
+
+ <para>
+
+ <screen>
+ $ zebraidx update /d1/records
+ $ zebraidx commit
+ </screen>
+
+ </para>
+
+ <para>
+ Or you can execute multiple updates before committing the changes:
+ </para>
+
+ <para>
+
+ <screen>
+ $ zebraidx -g books update /d1/records /d2/more-records
+ $ zebraidx -g fun update /d3/fun-records
+ $ zebraidx commit
+ </screen>
+
+ </para>
+
+ <para>
+ If one of the update operations above had been interrupted, the commit
+ operation on the last line would fail: <literal>zebraidx</literal>
+ will not let you commit changes that would destroy the running register.
+ You'll have to rerun all of the update operations since your last
+ commit operation, before you can commit the new changes.
+ </para>
+
+ <para>
+ Similarly, if the commit operation fails, <literal>zebraidx</literal>
+ will not let you start a new update operation before you have
+ successfully repeated the commit operation.
+ The server processes will keep accessing the shadow files rather
+ than the (possibly damaged) blocks of the main register files
+ until the commit operation has successfully completed.
+ </para>
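+
+ <para>
+ As a sketch of the two recovery paths just described, reusing the
+ hypothetical groups and record directories from the example above:
+ after an interrupted update, rerun every update made since the last
+ commit and then commit; after an interrupted commit, repeat only
+ the commit.
+ </para>
+
+ <para>
+
+ <screen>
+ # Case 1: an update was interrupted
+ $ zebraidx -g books update /d1/records /d2/more-records
+ $ zebraidx -g fun update /d3/fun-records
+ $ zebraidx commit
+
+ # Case 2: the commit itself was interrupted
+ $ zebraidx commit
+ </screen>
+
+ </para>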
+
+ <para>
+ You should be aware that update operations may take slightly longer
+ when the shadow register system is enabled, since more file access
+ operations are involved. Further, while the disk space required for
+ the shadow register data is modest for a small update operation, you
+ may prefer to disable the system if you are adding a very large number
+ of records to an already very large database (we use the terms
+ <emphasis>large</emphasis> and <emphasis>modest</emphasis>
+ very loosely here, since every application will have a
+ different perception of size).
+ To update the system without the use of the shadow files,
+ simply run <literal>zebraidx</literal> with the <literal>-n</literal>
+ option (note that you do not have to execute the
+ <emphasis>commit</emphasis> command of <literal>zebraidx</literal>
+ when you temporarily disable the use of the shadow registers in
+ this fashion).
+ Note also that, just as when the shadow registers are not enabled,
+ server processes will be barred from accessing the main register
+ while the update procedure takes place.
+ </para>
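+
+ <para>
+ For illustration, a one-off update that bypasses the shadow files
+ might look like the following; the record directory is the one
+ assumed in the earlier examples, and no commit step follows:
+ </para>
+
+ <para>
+
+ <screen>
+ $ zebraidx -n update /d1/records
+ </screen>
+
+ </para>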
+
+ </sect2>
+
+ </sect1>
+
+
+ <sect1 id="administration-ranking">
+ <title>Relevance Ranking and Sorting of Result Sets</title>
+
+ <sect2 id="administration-overview">
+ <title>Overview</title>
+ <para>
+ The default ordering of a result set is left up to the server,
+ which inside &zebra; means sorting in ascending document ID order.
+ This is not always the order in which humans want to browse the sometimes
+ quite large hit sets. Ranking and sorting come to the rescue.
+ </para>
+
+ <para>
+ In cases where a good presentation ordering can be computed at
+ indexing time, we can use a fixed <literal>static ranking</literal>
+ scheme, which is provided for the <literal>alvis</literal>
+ indexing filter. This defines a fixed ordering of hit lists,
+ independently of the query issued.
+ </para>
+
+ <para>
+ There are cases, however, where relevance of hit set documents is
+ highly dependent on the query processed.
+ Simply put, <literal>dynamic relevance ranking</literal>
+ sorts a set of retrieved records such that those most likely to be
+ relevant to your request are retrieved first.
+ Internally, &zebra; retrieves all documents that satisfy your
+ query, and re-orders the hit list to arrange them based on
+ a measurement of similarity between your query and the content of
+ each record.
+ </para>
+
+ <para>
+ Finally, there are situations where hit sets of documents should be
+ <literal>sorted</literal> during query time according to the
+ lexicographical ordering of certain sort indexes created at
+ indexing time.
+ </para>
+ </sect2>
+
+
+ <sect2 id="administration-ranking-static">
+ <title>Static Ranking</title>
+
+ <para>
+ &zebra; internally uses inverted indexes to look up term frequencies
+ in documents. Multiple queries from different indexes can be
+ combined by the binary boolean operations <literal>AND</literal>,
+ <literal>OR</literal> and/or <literal>NOT</literal> (which
+ is in fact a binary <literal>AND NOT</literal> operation).
+ To ensure fast query execution
+ speed, all indexes have to be sorted in the same order.
+ </para>
+ <para>
+ The indexes are normally sorted according to document
+ <literal>ID</literal> in
+ ascending order, and any query which does not invoke a special
+ re-ranking function will therefore retrieve the result set in
+ document
+ <literal>ID</literal>
+ order.
+ </para>
+ <para>
+ If one defines the
+ <screen>
+ staticrank: 1
+ </screen>
+ directive in the main core &zebra; configuration file, the internal document
+ keys used for ordering are augmented by a preceding integer, which
+ contains the static rank of a given document, and the index lists
+ are ordered
+ first by ascending static rank,
+ then by ascending document <literal>ID</literal>.
+ Zero
+ is the ``best'' rank, as it occurs at the
+ beginning of the list; higher numbers represent worse scores.
+ </para>
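+ <para>
+ As an illustration with made-up values, five documents with the
+ following static ranks and document <literal>ID</literal>s would
+ appear in the index lists in this order:
+ <screen>
+ static rank   document ID
+ 0             17
+ 0             42
+ 1             3
+ 5             8
+ 5             23
+ </screen>
+ Document 3 has a lower <literal>ID</literal> than document 17, but
+ sorts after it because its static rank is higher, that is, worse.
+ </para>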
+ <para>
+ The experimental <literal>alvis</literal> filter provides a
+ directive to fetch static rank information out of the indexed &acro.xml;
+ records, thus making <emphasis>all</emphasis> hit sets ordered
+ by <emphasis>ascending</emphasis> static
+ rank, and for those documents which have the same static rank, ordered
+ by <emphasis>ascending</emphasis> document <literal>ID</literal>.
+ See <xref linkend="record-model-alvisxslt"/> for the gory details.
+ </para>
+ </sect2>
+
+
+ <sect2 id="administration-ranking-dynamic">
+ <title>Dynamic Ranking</title>
+ <para>
+ In order to fiddle with the static rank order, it is necessary to
+ invoke additional re-ranking/re-ordering using dynamic
+ ranking or score functions. These functions return positive
+ integer scores, where <emphasis>highest</emphasis> score is
+ ``best'';
+ hit sets are sorted according to <emphasis>descending</emphasis>
+ scores (in contrast
+ to the index lists, which are sorted according to
+ ascending rank number and document ID).
+ </para>
+ <para>
+ Dynamic ranking is enabled by a directive like one of the
+ following in the zebra configuration file (use only one of these at a time!):
+ <screen>
+ rank: rank-1 # default TF-IDF like
+ rank: rank-static # dummy do-nothing
+ </screen>
+ </para>
+
+ <para>
+ Dynamic ranking is done at query time rather than
+ indexing time (this is why we
+ call it ``dynamic ranking'' in the first place).
+ It is invoked by adding
+ the &acro.bib1; relation attribute with
+ value ``relevance'' to the &acro.pqf; query (that is,
+ <literal>@attr 2=102</literal>, see also
+ <ulink url="&url.z39.50;bib1.html">
+ The &acro.bib1; Attribute Set Semantics</ulink>, also in
+ <ulink url="&url.z39.50.attset.bib1;">HTML</ulink>).
+ To find all articles with the word <literal>Eoraptor</literal> in
+ the title, and present them relevance ranked, issue the &acro.pqf; query:
+ <screen>
+ @attr 2=102 @attr 1=4 Eoraptor
+ </screen>
+ </para>
+
+ <sect3 id="administration-ranking-dynamic-rank1">
+ <title>Dynamically ranking using &acro.pqf; queries with the 'rank-1'
+ algorithm</title>
+
+ <para>
+ The default <literal>rank-1</literal> ranking module implements a
+ TF/IDF (Term Frequency over Inverse Document Frequency) like
+ algorithm. In contrast to the usual definition of TF/IDF
+ algorithms, which only considers searching in one full-text
+ index, this one works on multiple indexes at the same time.
+ More precisely,
+ &zebra; does boolean queries and searches in specific addressed
+ indexes (there are inverted indexes pointing from terms in the
+ dictionary to documents and term positions inside documents).
+ It works like this:
+ <variablelist>
+ <varlistentry>
+ <term>Query Components</term>
+ <listitem>
+ <para>
+ First, the boolean query is dismantled into its principal components,
+ i.e. atomic queries where one term is looked up in one index.
+ For example, the query
+ <screen>
+ @attr 2=102 @and @attr 1=1010 Utah @attr 1=1018 Springer
+ </screen>
+ is a boolean AND between the atomic parts
+ <screen>
+ @attr 2=102 @attr 1=1010 Utah
+ </screen>
+ and
+ <screen>
+ @attr 2=102 @attr 1=1018 Springer
+ </screen>
+ each of which is processed on its own.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Atomic hit lists</term>
+ <listitem>
+ <para>
+ Second, for each atomic query, the hit list of documents is
+ computed.
+ </para>
+ <para>
+ In this example, two hit lists are computed, one for each of the indexes
+ <literal>@attr 1=1010</literal> and
+ <literal>@attr 1=1018</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Atomic scores</term>
+ <listitem>
+ <para>
+ Third, each document in the hit list is assigned a score (<emphasis>if</emphasis> ranking
+ is enabled and requested in the query) using a TF/IDF scheme.
+ </para>
+ <para>
+ In this example, both atomic parts of the query carry the
+ <literal>@attr 2=102</literal> relevance attribute, so both are
+ used by the relevance ranking function.
+ </para>
+ <para>
+ It is possible to apply dynamic ranking on only parts of the
+ &acro.pqf; query:
+ <screen>
+ @and @attr 2=102 @attr 1=1010 Utah @attr 1=1018 Springer
+ </screen>
+ searches for all documents which have the term 'Utah' in the
+ body of text, and which have the term 'Springer' in the publisher
+ field, and sorts them in the order of the relevance ranking made on
+ the body-of-text index only.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Hit list merging</term>
+ <listitem>
+ <para>
+ Fourth, the atomic hit lists are merged according to the boolean
+ conditions to a final hit list of documents to be returned.
+ </para>
+ <para>
+ This step is always performed, whether or not
+ dynamic ranking is enabled.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Document score computation</term>
+ <listitem>
+ <para>
+ Fifth, the total score of a document is computed as a linear
+ combination of the atomic scores of the atomic hit lists.
+ </para>
+ <para>
+ Ranking weights may be used to pass a value to a ranking
+ algorithm, using the non-standard &acro.bib1; attribute type 9.
+ This allows one branch of a query to use one value while
+ another branch uses a different one. For example, we can search
+ for <literal>utah</literal> in the
+ <literal>@attr 1=4</literal> index with weight 30, as
+ well as for <literal>city</literal> in the <literal>@attr 1=1010</literal> index with weight 20:
+ <screen>
+ @attr 2=102 @or @attr 9=30 @attr 1=4 utah @attr 9=20 @attr 1=1010 city
+ </screen>
+ </para>
+ <para>
+ The default weight is 34
+ (roughly sqrt(1000), which is about 31.6), as the &acro.z3950; standard prescribes that the top score
+ is 1000 and the bottom score is 0, encoded in integers.
+ </para>
+ <warning>
+ <para>
+ The ranking-weight feature is experimental. It may change in future
+ releases of zebra.
+ </para>
+ </warning>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Re-sorting of hit list</term>
+ <listitem>
+ <para>
+ Finally, the final hit list is re-ordered according to scores.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ </para>
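+
+ <para>
+ As a rough worked example of the final two steps, take the weights
+ 30 and 20 from the query above and assume, purely for illustration,
+ that two documents received the atomic scores shown below. The
+ numbers are invented, and the actual <literal>rank-1</literal>
+ implementation may scale or normalize scores differently:
+ <screen>
+ document  score(1=4)  score(1=1010)  weighted sum
+ A         12          5              30*12 + 20*5  = 460
+ B         4           20             30*4  + 20*20 = 520
+ </screen>
+ Since hit sets are sorted by descending score, document B would be
+ listed before document A.
+ </para>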
+
+
+ <para>
+ The <literal>rank-1</literal> algorithm
+ does not use the static rank
+ information in the list keys, and will produce the same ordering
+ with or without static ranking enabled.
+ </para>
+
+
+ <!--
+ <sect3 id="administration-ranking-dynamic-rank1">
+ <title>Dynamically ranking &acro.pqf; queries with the 'rank-static'
+ algorithm</title>
+ <para>
+ The dummy <literal>rank-static</literal> reranking/scoring
+ function returns just
+ <literal>score = max int - staticrank</literal>
+ in order to preserve the static ordering of hit sets that would
+ have been produced had it not been invoked.
+ Obviously, to combine static and dynamic ranking usefully,
+ it is necessary
+ to make a new ranking
+ function; this is left
+ as an exercise for the reader.
+ </para>
+ </sect3>
+ -->
+
+ <warning>
+ <para>
+ <literal>Dynamic ranking</literal> is not compatible
+ with <literal>estimated hit sizes</literal>, as all documents in
+ a hit set must be accessed to compute the correct placing in a
+ ranking-sorted list. Therefore the attribute setting
+ <literal>@attr 2=102</literal> clashes with
+ <literal>@attr 9=integer</literal>.
+ </para>
+ </warning>
+
+ <!--
+ we might want to add ranking like this:
+ UNPUBLISHED:
+ Simple BM25 Extension to Multiple Weighted Fields
+ Stephen Robertson, Hugo Zaragoza and Michael Taylor
+ Microsoft Research
+ ser@microsoft.com
+ hugoz@microsoft.com
+ mitaylor2microsoft.com
+ -->
+
+ </sect3>
+
+ <sect3 id="administration-ranking-dynamic-cql">
+ <title>Dynamically ranking &acro.cql; queries</title>
+ <para>
+ Dynamic ranking can be enabled during server-side &acro.cql;
+ query expansion by adding <literal>@attr 2=102</literal>
+ chunks to the &acro.cql; config file. For example
+ <screen>
+ relationModifier.relevant = 2=102
+ </screen>
+ invokes dynamic ranking each time a &acro.cql; query of the form
+ <screen>
+ Z> querytype cql
+ Z> f alvis.text =/relevant house
+ </screen>
+ is issued. Dynamic ranking can also be automatically used on
+ specific &acro.cql; indexes by (for example) setting
+ <screen>
+ index.alvis.text = 1=text 2=102
+ </screen>
+ which then invokes dynamic ranking each time a &acro.cql; query of the form
+ <screen>
+ Z> querytype cql
+ Z> f alvis.text = house
+ </screen>
+ is issued.
+ </para>
+
+ </sect3>
+
+ </sect2>
+
+
+ <sect2 id="administration-ranking-sorting">
+ <title>Sorting</title>
+ <para>
+ &zebra; sorts efficiently using special sorting indexes
+ (type=<literal>s</literal>), so each sortable index must be known
+ at indexing time, specified in the configuration of record
+ indexing. For example, to enable sorting according to the &acro.bib1;
+ <literal>Date/time-added-to-db</literal> field, one could add the line
+ <screen>
+ xelm /*/@created Date/time-added-to-db:s
+ </screen>
+ to any <literal>.abs</literal> record-indexing configuration file.
+ Similarly, one could add an indexing element of the form
+ <screen><![CDATA[
+ <z:index name="date-modified" type="s">
+ <xsl:value-of select="some/xpath"/>
+ </z:index>
+ ]]></screen>
+ to any <literal>alvis</literal>-filter indexing stylesheet.
+ </para>
+ <para>
+ Sorting can be specified at search time using a query term
+ carrying the non-standard
+ &acro.bib1; attribute-type <literal>7</literal>. This removes the
+ need to send a &acro.z3950; <literal>Sort Request</literal>
+ separately, and can dramatically improve latency when the client
+ and server are on separate networks.
+ The sorting part of the query is separate from the rest of the
+ query - the actual search specification - and must be combined
+ with it using OR.
+ </para>
+ <para>
+ A sorting subquery needs two attributes: an index (such as a
+ &acro.bib1; type-1 attribute) specifying which index to sort on, and a
+ type-7 attribute whose value is <literal>1</literal> for
+ ascending sorting, or <literal>2</literal> for descending. The
+ term associated with the sorting attribute is the priority of
+ the sort key, where <literal>0</literal> specifies the primary
+ sort key, <literal>1</literal> the secondary sort key, and so
+ on.
+ </para>
+ <para>For example, a search for water, sort by title (ascending),
+ is expressed by the &acro.pqf; query
+ <screen>
+ @or @attr 1=1016 water @attr 7=1 @attr 1=4 0
+ </screen>
+ whereas a search for water, sort by title ascending,
+ then date descending would be
+ <screen>
+ @or @or @attr 1=1016 water @attr 7=1 @attr 1=4 0 @attr 7=2 @attr 1=30 1
+ </screen>
+ </para>
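+ <para>
+ Following the same pattern, and assuming the same indexes as above,
+ a search for water sorted only by date, newest first, could be written as
+ <screen>
+ @or @attr 1=1016 water @attr 7=2 @attr 1=30 0
+ </screen>
+ Here the date is the only sort key, so it carries priority
+ <literal>0</literal>.
+ </para>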
+ <para>
+ Notice the fundamental differences between <literal>dynamic
+ ranking</literal> and <literal>sorting</literal>: there can be
+ only one ranking function defined and configured; but multiple
+ sorting indexes can be specified dynamically at search
+ time. Ranking does not need to use specific indexes, so
+ dynamic ranking can be enabled and disabled without
+ re-indexing; whereas, sorting indexes need to be
+ defined before indexing.
+ </para>
+
+ </sect2>
+
+
+ </sect1>
+
+ <sect1 id="administration-extended-services">
+ <title>Extended Services: Remote Insert, Update and Delete</title>
+
+ <note>
+ <para>
+ Extended services are only supported when accessing the &zebra;
+ server using the <ulink url="&url.z39.50;">&acro.z3950;</ulink>
+ protocol. The <ulink url="&url.sru;">&acro.sru;</ulink> protocol does
+ not support extended services.
+ </para>
+ </note>
+
+ <para>
+ The extended services are not enabled by default in &zebra;, because
+ they modify the system. &zebra; can be configured
+ to allow anybody to
+ search, and to allow updates only for a particular admin user,
+ in the main &zebra; configuration file <filename>zebra.cfg</filename>.
+ For user <literal>admin</literal>, you could use:
+ <screen>
+ perm.anonymous: r
+ perm.admin: rw
+ passwd: passwordfile
+ </screen>
+ And in the password file
+ <filename>passwordfile</filename>, you have to specify users and
+ encrypted passwords as colon separated strings.
+ Use a tool like <filename>htpasswd</filename>
+ to maintain the encrypted passwords.
+ <screen>
+ admin:secret
+ </screen>
+ It is essential to configure &zebra; to store records internally,
+ and to support
+ modifications and deletion of records:
+ <screen>
+ storeData: 1
+ storeKeys: 1
+ </screen>
+ The general record type should be set to any record filter which
+ is able to parse &acro.xml; records. You may use either of the two
+ declarations (but not both simultaneously!):
+ <screen>
+ recordType: dom.filter_dom_conf.xml
+ # recordType: grs.xml
+ </screen>
+ Note the difference from the specific instructions
+ <screen>
+ recordType.xml: dom.filter_dom_conf.xml
+ # recordType.xml: grs.xml
+ </screen>
+ which only work when indexing XML files from the filesystem using
+ the <literal>*.xml</literal> naming convention.
+ </para>
+ <para>
+ To enable transaction safe shadow indexing,
+ which is extra important for this kind of operation, set
+ <screen>
+ shadow: directoryname: size (e.g. 1000M)
+ </screen>
+ See <xref linkend="zebra-cfg"/> for additional information on
+ these configuration options.
+ </para>
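+ <para>
+ Putting the settings from this section together, a
+ <filename>zebra.cfg</filename> prepared for extended services might
+ contain an excerpt like the following. This is only a sketch: the
+ shadow directory <filename>/scratch1</filename> and its size are
+ example values, and the password file name is the one assumed above.
+ <screen>
+ perm.anonymous: r
+ perm.admin: rw
+ passwd: passwordfile
+ storeData: 1
+ storeKeys: 1
+ recordType: dom.filter_dom_conf.xml
+ shadow: /scratch1:1000M
+ </screen>
+ </para>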
+ <note>
+ <para>
+ It is not possible to carry information about record types or
+ similar to &zebra; when using extended services, due to
+ limitations of the <ulink url="&url.z39.50;">&acro.z3950;</ulink>
+ protocol. Therefore, indexing filters cannot be chosen on a
+ per-record basis. One and only one general &acro.xml; indexing filter
+ must be defined.
+ <!-- but because it is represented as an OID, we would need some
+ form of proprietary mapping scheme between record type strings and
+ OIDs. -->
+ <!--
+ However, as a minimum, it would be extremely useful to enable
+ people to use &acro.marc21;, assuming grs.marcxml.marc21 as a record
+ type.
+ -->
+ </para>
+ </note>
+
+
+ <sect2 id="administration-extended-services-z3950">
+ <title>Extended services in the &acro.z3950; protocol</title>
+
+ <para>
+ The <ulink url="&url.z39.50;">&acro.z3950;</ulink> standard allows
+ servers to accept special binary <emphasis>extended services</emphasis>
+ protocol packages, which may be used to insert, update and delete
+ records on servers. These carry control and update
+ information to the servers, which is encoded in seven package fields:
+ </para>
+
+ <table id="administration-extended-services-z3950-table" frame="top">
+ <title>Extended services &acro.z3950; Package Fields</title>
+ <tgroup cols="3">
+ <thead>
+ <row>
+ <entry>Parameter</entry>
+ <entry>Value</entry>
+ <entry>Notes</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry><literal>type</literal></entry>
+ <entry><literal>'update'</literal></entry>
+ <entry>Must be set to trigger extended services</entry>
+ </row>
+ <row>
+ <entry><literal>action</literal></entry>
+ <entry><literal>string</literal></entry>
+ <entry>
+ Extended service action type with
+ one of four possible values: <literal>recordInsert</literal>,
+ <literal>recordReplace</literal>,
+ <literal>recordDelete</literal>,
+ and <literal>specialUpdate</literal>
+ </entry>
+ </row>
+ <row>
+ <entry><literal>record</literal></entry>
+ <entry><literal>&acro.xml; string</literal></entry>
+ <entry>An &acro.xml; formatted string containing the record</entry>
+ </row>
+ <row>
+ <entry><literal>syntax</literal></entry>
+ <entry><literal>'xml'</literal></entry>
+ <entry>XML/SUTRS/MARC. GRS-1 not supported.
+ The default filter (record type) as given by recordType in
+ zebra.cfg is used to parse the record.</entry>
+ </row>
+ <row>
+ <entry><literal>recordIdOpaque</literal></entry>
+ <entry><literal>string</literal></entry>
+ <entry>
+ Optional client-supplied, opaque record
+ identifier used under insert operations.
+ </entry>
+ </row>
+ <row>
+ <entry><literal>recordIdNumber</literal></entry>
+ <entry><literal>positive number</literal></entry>
+ <entry>&zebra;'s internal system number,
+ not allowed for <literal>recordInsert</literal> or
+ <literal>specialUpdate</literal> actions which result in fresh
+ record inserts.
+ </entry>
+ </row>
+ <row>
+ <entry><literal>databaseName</literal></entry>
+ <entry><literal>database identifier</literal></entry>
+ <entry>
+ The name of the database to which the extended services should be
+ applied.
+ </entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+
+ <para>
+ The <literal>action</literal> parameter can be any of
+ <literal>recordInsert</literal> (will fail if the record already exists),
+ <literal>recordReplace</literal> (will fail if the record does not exist),
+ <literal>recordDelete</literal> (will fail if the record does not
+ exist), and
+ <literal>specialUpdate</literal> (will insert or update the record
+ as needed; record deletion is not possible).
+ </para>
+
+ <para>
+ During all actions, the
+ usual rules for internal record ID generation apply, unless an
+ optional <literal>recordIdNumber</literal> &zebra; internal ID or a
+ <literal>recordIdOpaque</literal> string identifier is assigned.
+ The default ID generation is
+ configured using the <literal>recordId:</literal> directive in
+ <filename>zebra.cfg</filename>.
+ See <xref linkend="zebra-cfg"/>.
+ </para>
+
+ <para>
+ Setting of the <literal>recordIdNumber</literal> parameter,
+ which must be an existing &zebra; internal system ID number, is not
+ allowed during any <literal>recordInsert</literal> or
+ <literal>specialUpdate</literal> action resulting in fresh record
+ inserts.
+ </para>
+
+ <para>
+ When retrieving existing
+ records indexed with &acro.grs1; indexing filters, the &zebra; internal
+ ID number is returned in the field
+ <literal>/*/id:idzebra/localnumber</literal> in the namespace
+ <literal>xmlns:id="http://www.indexdata.dk/zebra/"</literal>,
+ where it can be picked up for later record updates or deletes.
+ </para>
+
+ <para>
+ A new element set for retrieval of internal record
+ data has been added, which can be used to access minimal records
+ containing only the <literal>recordIdNumber</literal> &zebra;
+ internal ID, or the <literal>recordIdOpaque</literal> string
+ identifier. This works for any indexing filter used.
+ See <xref linkend="special-retrieval"/>.
+ </para>
+
+ <para>
+ The <literal>recordIdOpaque</literal> string parameter
+ is a client-supplied, opaque record
+ identifier, which may be used under
+ insert, update and delete operations. The
+ client software is responsible for assigning these to
+ records. This identifier will
+ replace zebra's own automagic identifier generation with a unique
+ mapping from <literal>recordIdOpaque</literal> to the
+ &zebra; internal <literal>recordIdNumber</literal>.
+ <emphasis>The opaque <literal>recordIdOpaque</literal> string
+ identifiers
+ are not visible in retrieval records, nor are
+ searchable, so the value of this parameter is
+ questionable. It serves mostly as a convenient mapping from
+ application domain string identifiers to &zebra; internal IDs.
+ </emphasis>
+ </para>
+ </sect2>
+
+
+ <sect2 id="administration-extended-services-yaz-client">
+ <title>Extended services from yaz-client</title>
+
+ <para>
+ We can now start a yaz-client admin session and create a database:
+ <screen>
+ <![CDATA[
+ $ yaz-client localhost:9999 -u admin/secret
+ Z> adm-create
+ ]]>
+ </screen>
+ Now that the <literal>Default</literal> database has been created,
+ we can insert an &acro.xml; file (esdd0006.grs
+ from example/gils/records) and index it:
+ <screen>
+ <![CDATA[
+ Z> update insert id1234 esdd0006.grs
+ ]]>
+ </screen>
+ The 3rd parameter - <literal>id1234</literal> here -
+ is the <literal>recordIdOpaque</literal> package field.
+ </para>
+ <para>
+ Actually, we should have a way to specify "no opaque record id" for
+ yaz-client's update command. We'll fix that.
+ </para>
+ <para>
+ The newly inserted record can be searched as usual:
+ <screen>
+ <![CDATA[
+ Z> f utah
+ Sent searchRequest.
+ Received SearchResponse.
+ Search was a success.
+ Number of hits: 1, setno 1
+ SearchResult-1: term=utah cnt=1
+ records returned: 0
+ Elapsed: 0.014179
+ ]]>
+ </screen>
+ </para>
+ <para>
+ Let's delete the beast, using the same
+ <literal>recordIdOpaque</literal> string parameter:
+ <screen>
+ <![CDATA[
+ Z> update delete id1234
+ No last record (update ignored)
+ Z> update delete 1 esdd0006.grs
+ Got extended services response
+ Status: done
+ Elapsed: 0.072441
+ Z> f utah
+ Sent searchRequest.
+ Received SearchResponse.
+ Search was a success.
+ Number of hits: 0, setno 2
+ SearchResult-1: term=utah cnt=0
+ records returned: 0
+ Elapsed: 0.013610
+ ]]>
+ </screen>
+ </para>
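+ <para>
+ The remaining two actions can be tried in the same way. The
+ following sketch assumes that yaz-client's
+ <literal>update</literal> command accepts
+ <literal>replace</literal> and <literal>update</literal> as action
+ names for <literal>recordReplace</literal> and
+ <literal>specialUpdate</literal>, alongside the
+ <literal>insert</literal> and <literal>delete</literal> forms shown
+ above, and that the record is still present in the database; check
+ the yaz-client documentation of your version before relying on it.
+ Server responses are omitted.
+ <screen>
+ <![CDATA[
+ Z> update replace id1234 esdd0006.grs
+ Z> update update id1234 esdd0006.grs
+ ]]>
+ </screen>
+ </para>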
+ <para>
+ If the shadow register system is enabled in your
+ <filename>zebra.cfg</filename>,
+ you must run the adm-commit command
+ <screen>
+ <![CDATA[
+ Z> adm-commit
+ ]]>
+ </screen>
+ after each update session in order to write your changes from the
+ shadow to the live register space.
+ </para>
+ </sect2>
+
+
+ <sect2 id="administration-extended-services-yaz-php">
+ <title>Extended services from yaz-php</title>
+
+ <para>
+ Extended services are also available from the &yaz; &acro.php; client layer. An
+ example of a &yaz;-&acro.php; extended service transaction is given here:
+ <screen>
+ <![CDATA[
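+ // Assumption: open a connection first; the address and credentials
+ // are those of the yaz-client example earlier in this chapter.
+ $yaz = yaz_connect('localhost:9999',
+ array('user' => 'admin', 'password' => 'secret'));
+
+ $record = '<record><title>A fine specimen of a record</title></record>';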
+
+ $options = array('action' => 'recordInsert',
+ 'syntax' => 'xml',
+ 'record' => $record,
+ 'databaseName' => 'mydatabase'
+ );
+
+ yaz_es($yaz, 'update', $options);
+ yaz_es($yaz, 'commit', array());
+ yaz_wait();
+
+ if ($error = yaz_error($yaz))
+ echo "$error";
+ ]]>
+ </screen>
+ </para>
+ </sect2>
+
+ <sect2 id="administration-extended-services-debugging">
+ <title>Extended services debugging guide</title>
+ <para>
+ When debugging ES over PHP we recommend the following order of tests:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ Make sure you have a nice record on your filesystem, which you can
+ index from the filesystem by use of the zebraidx command.
+ Do it exactly as you planned, using one of the GRS-1 filters,
+ or the DOMXML filter.
+ When this works, proceed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Check that your server setup is OK before you have coded a single
+ line of PHP using ES.
+ Take the same record from the file system, and send it as ES via
+ <literal>yaz-client</literal> as described in
+ <xref linkend="administration-extended-services-yaz-client"/>,
+ and
+ remember the <literal>-a</literal> option which tells you what
+ goes over the wire! Notice also the section on permissions:
+ try
+ <screen>
+ perm.anonymous: rw
+ </screen>
+ in <literal>zebra.cfg</literal> to make sure you do not run into
+ permission problems (but never expose such an insecure setup on the
+ Internet!). Then, make sure to set the general
+ <literal>recordType</literal> instruction, pointing correctly
+ to the GRS-1 filters,
+ or the DOMXML filters.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ If you insist on using the <literal>sysno</literal> in the
+ <literal>recordIdNumber</literal> setting,
+ please make sure you do only updates and deletes. Zebra's internal
+ system number is not allowed for
+ <literal>recordInsert</literal> or
+ <literal>specialUpdate</literal> actions
+ which result in fresh record inserts.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ If <literal>shadow register</literal> is enabled in your
+ <literal>zebra.cfg</literal>, you must remember to run the
+ <screen>
+ Z> adm-commit
+ </screen>
+ command as well.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ If this works, then proceed to do the same thing in your PHP script.
+ </para>
+ </listitem>
+ </itemizedlist>
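+
+ <para>
+ The yaz-client part of this checklist might look like the following
+ session. It is a sketch only: the host, port, and credentials are
+ those assumed earlier in this chapter, and <literal>-a -</literal>
+ is assumed to print the protocol packages to standard output.
+ Server responses are omitted.
+ <screen>
+ <![CDATA[
+ $ yaz-client -a - localhost:9999 -u admin/secret
+ Z> update insert id1234 esdd0006.grs
+ Z> adm-commit
+ Z> f utah
+ ]]>
+ </screen>
+ </para>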
+
+
+ </sect2>
+
+ </sect1>