-<!-- $Id: tools.xml,v 1.56 2007-02-08 09:03:31 adam Exp $ -->
<chapter id="tools"><title>Supporting Tools</title>
<para>
symbolic language for expressing boolean query structures.
</para>
- <para>
- The EUROPAGATE research project working under the Libraries programme
- of the European Commission's DG XIII has, amongst other useful tools,
- implemented a general-purpose CCL parser which produces an output
- structure that can be trivially converted to the internal RPN
- representation of &yaz; (The <literal>Z_RPNQuery</literal> structure).
- Since the CCL utility - along with the rest of the software
- produced by EUROPAGATE - is made freely available on a liberal
- license, it is included as a supplement to &yaz;.
- </para>
-
<sect3 id="ccl.syntax">
<title>CCL Syntax</title>
</table>
</para>
<para>
- Refer to the complete
+ Refer to <xref linkend="bib1"/> or the complete
<ulink url="&url.z39.50.attset.bib1;">list of Bib-1 attributes</ulink>
</para>
<para>
<para>
The basic YAZ representation of an OID is an array of integers,
- terminated with the value -1. The &odr; module provides two
- utility-functions to create and copy this type of data elements:
- </para>
-
- <screen>
- Odr_oid *odr_getoidbystr(ODR o, char *str);
- </screen>
-
- <para>
- Creates an OID based on a string-based representation using dots (.)
- to separate elements in the OID.
- </para>
-
- <screen>
- Odr_oid *odr_oiddup(ODR odr, Odr_oid *o);
- </screen>
-
- <para>
- Creates a copy of the OID referenced by the <emphasis>o</emphasis>
- parameter.
- Both functions take an &odr; stream as parameter. This stream is used to
- allocate memory for the data elements, which is released on a
- subsequent call to <function>odr_reset()</function> on that stream.
- </para>
-
- <para>
- The OID module provides a higher-level representation of the
- family of object identifiers which describe the Z39.50 protocol and its
- related objects. The definition of the module interface is given in
- the <filename>oid.h</filename> file.
- </para>
-
- <para>
- The interface is mainly based on the <literal>oident</literal> structure.
- The definition of this structure looks like this:
- </para>
-
- <screen>
-typedef struct oident
-{
- oid_proto proto;
- oid_class oclass;
- oid_value value;
- int oidsuffix[OID_SIZE];
- char *desc;
-} oident;
- </screen>
-
- <para>
- The proto field takes one of the values
- </para>
-
- <screen>
- PROTO_Z3950
- PROTO_GENERAL
- </screen>
-
- <para>
- Use <literal>PROTO_Z3950</literal> for Z39.50 Object Identifers,
- <literal>PROTO_GENERAL</literal> for other types (such as
- those associated with ILL).
+ terminated with the value -1. This integer is of type
+ <literal>Odr_oid</literal>.
</para>
<para>
-
- The oclass field takes one of the values
+ Fundamental OID operations and the type <literal>Odr_oid</literal>
+ are defined in <filename>yaz/oid_util.h</filename>.
</para>
-
- <screen>
- CLASS_APPCTX
- CLASS_ABSYN
- CLASS_ATTSET
- CLASS_TRANSYN
- CLASS_DIAGSET
- CLASS_RECSYN
- CLASS_RESFORM
- CLASS_ACCFORM
- CLASS_EXTSERV
- CLASS_USERINFO
- CLASS_ELEMSPEC
- CLASS_VARSET
- CLASS_SCHEMA
- CLASS_TAGSET
- CLASS_GENERAL
- </screen>
-
<para>
- corresponding to the OID classes defined by the Z39.50 standard.
-
- Finally, the value field takes one of the values
+ An OID can either be declared as an automatic variable or it can be
+ allocated using the memory utilities or ODR/NMEM. It's
+ guaranteed that an OID can fit in <literal>OID_SIZE</literal> integers.
</para>
-
- <screen>
- VAL_APDU
- VAL_BER
- VAL_BASIC_CTX
- VAL_BIB1
- VAL_EXP1
- VAL_EXT1
- VAL_CCL1
- VAL_GILS
- VAL_WAIS
- VAL_STAS
- VAL_DIAG1
- VAL_ISO2709
- VAL_UNIMARC
- VAL_INTERMARC
- VAL_CCF
- VAL_USMARC
- VAL_UKMARC
- VAL_NORMARC
- VAL_LIBRISMARC
- VAL_DANMARC
- VAL_FINMARC
- VAL_MAB
- VAL_CANMARC
- VAL_SBN
- VAL_PICAMARC
- VAL_AUSMARC
- VAL_IBERMARC
- VAL_EXPLAIN
- VAL_SUTRS
- VAL_OPAC
- VAL_SUMMARY
- VAL_GRS0
- VAL_GRS1
- VAL_EXTENDED
- VAL_RESOURCE1
- VAL_RESOURCE2
- VAL_PROMPT1
- VAL_DES1
- VAL_KRB1
- VAL_PRESSET
- VAL_PQUERY
- VAL_PCQUERY
- VAL_ITEMORDER
- VAL_DBUPDATE
- VAL_EXPORTSPEC
- VAL_EXPORTINV
- VAL_NONE
- VAL_SETM
- VAL_SETG
- VAL_VAR1
- VAL_ESPEC1
- </screen>
-
- <para>
- again, corresponding to the specific OIDs defined by the standard.
- Refer to the
- <ulink url="&url.z39.50.oids;">
- Registry of Z39.50 Object Identifiers</ulink> for the
- whole list.
- </para>
-
- <para>
- The desc field contains a brief, mnemonic name for the OID in question.
- </para>
-
+ <example id="tools.oid.bib1.1"><title>Create OID on stack</title>
+ <para>
+ We can create an OID for the Bib-1 attribute set with:
+ <screen>
+ Odr_oid bib1[OID_SIZE];
+ bib1[0] = 1;
+ bib1[1] = 2;
+ bib1[2] = 840;
+ bib1[3] = 10003;
+ bib1[4] = 3;
+ bib1[5] = 1;
+ bib1[6] = -1;
+ </screen>
+ </para>
+ </example>
<para>
- The function
+ An OID may also be filled from a string-based representation using
+ dots (.). This is achieved by the function
+ <screen>
+ int oid_dotstring_to_oid(const char *name, Odr_oid *oid);
+ </screen>
+ This function returns 0 if name could be converted; -1 otherwise.
</para>
-
- <screen>
- struct oident *oid_getentbyoid(int *o);
- </screen>
-
- <para>
- takes as argument an OID, and returns a pointer to a static area
- containing an <literal>oident</literal> structure. You typically use
- this function when you receive a PDU containing an OID, and you wish
- to branch out depending on the specific OID value.
+ <example id="tools.oid.bib1.2"><title>Using oid_dotstring_to_oid</title>
+ <para>
+ We can fill the Bib-1 attribute set OID more easily with:
+ <screen>
+ Odr_oid bib1[OID_SIZE];
+ oid_dotstring_to_oid("1.2.840.10003.3.1", bib1);
+ </screen>
</para>
-
+ </example>
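To make the dotted-string convention concrete, here is a minimal sketch of such a conversion in plain C. It is not the YAZ implementation: `sketch_dotstring_to_oid` and `SKETCH_OID_SIZE` are hypothetical names, plain `int` stands in for `Odr_oid`, and the capacity of 20 is an assumption. Only the return convention (0 on success, -1 otherwise) and the -1 terminator are taken from the text above.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define SKETCH_OID_SIZE 20 /* assumed capacity, not YAZ's OID_SIZE */

/* Hypothetical helper: parse "1.2.840" into {1, 2, 840, -1}.
 * Returns 0 on success, -1 on malformed input or too many arcs.
 * On failure the contents of oid are unspecified. */
int sketch_dotstring_to_oid(const char *name, int *oid)
{
    int n = 0;

    assert(name != NULL && oid != NULL);
    while (*name != '\0') {
        char *end;
        long v;

        if (*name < '0' || *name > '9')
            return -1;              /* every arc must start with a digit */
        if (n >= SKETCH_OID_SIZE - 1)
            return -1;              /* keep room for the -1 terminator */
        v = strtol(name, &end, 10);
        if (v > INT_MAX)
            return -1;              /* arc does not fit in an int */
        oid[n++] = (int) v;
        if (*end == '.') {
            if (end[1] == '\0')
                return -1;          /* trailing dot */
            name = end + 1;
        } else if (*end == '\0') {
            name = end;
        } else {
            return -1;              /* unexpected character */
        }
    }
    if (n == 0)
        return -1;                  /* empty string */
    oid[n] = -1;
    return 0;
}
```

Calling it with `"1.2.840.10003.3.1"` fills the same seven integers that the stack example assigns by hand.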
<para>
- The function
- </para>
-
+ We can also allocate an OID dynamically on an ODR stream with:
<screen>
- int *oid_ent_to_oid(struct oident *ent, int *dst);
+ Odr_oid *odr_getoidbystr(ODR o, const char *str);
</screen>
-
- <para>
- Takes as argument an <literal>oident</literal> structure - in which
- the <literal>proto</literal>, <literal>oclass</literal>/, and
- <literal>value</literal> fields are assumed to be set correctly -
- and returns a pointer to a the buffer as given by <literal>dst</literal>
- containing the base
- representation of the corresponding OID. The function returns
- NULL and the array dst is unchanged if a mapping couldn't place.
- The array <literal>dst</literal> should be at least of size
- <literal>OID_SIZE</literal>.
+ This creates an OID from a string-based representation using dots.
+ This function takes an &odr; stream as parameter. This stream is used to
+ allocate memory for the data elements, which is released on a
+ subsequent call to <function>odr_reset()</function> on that stream.
</para>
- <para>
- The <function>oid_ent_to_oid()</function> function can be used whenever
- you need to prepare a PDU containing one or more OIDs. The separation of
- the <literal>protocol</literal> element from the remainder of the
- OID-description makes it simple to write applications that can
- communicate with either Z39.50 or OSI SR-based applications.
- </para>
+ <example id="tools.oid.bib1.3"><title>Using odr_getoidbystr</title>
+ <para>
+ We can create an OID for the Bib-1 attribute set with:
+ <screen>
+ Odr_oid *bib1 = odr_getoidbystr(odr, "1.2.840.10003.3.1");
+ </screen>
+ </para>
+ </example>
<para>
The function
+ <screen>
+ char *oid_oid_to_dotstring(const Odr_oid *oid, char *oidbuf);
+ </screen>
+ does the reverse of <function>oid_dotstring_to_oid</function>. It
+ converts an OID to the string-based representation using dots.
+ The supplied char buffer <literal>oidbuf</literal> holds the resulting
+ string and must be at least <literal>OID_STR_MAX</literal> in size.
</para>
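The reverse direction can be sketched the same way. This is an illustration only, not the YAZ code: `sketch_oid_to_dotstring` is a hypothetical name, and unlike the real prototype it takes an explicit buffer size so that the sketch can check bounds rather than rely on an `OID_STR_MAX`-sized buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write a -1 terminated OID as "n.n.n" into buf.
 * Returns buf on success, NULL if the result does not fit in bufsize. */
char *sketch_oid_to_dotstring(const int *oid, char *buf, size_t bufsize)
{
    size_t used = 0;
    int i;

    assert(oid != NULL && buf != NULL);
    if (bufsize == 0)
        return NULL;
    buf[0] = '\0';
    for (i = 0; oid[i] >= 0; i++) {
        /* prefix every arc except the first with a dot */
        int w = snprintf(buf + used, bufsize - used, "%s%d",
                         i ? "." : "", oid[i]);
        if (w < 0 || (size_t) w >= bufsize - used)
            return NULL;            /* output error or truncation */
        used += (size_t) w;
    }
    return buf;
}
```

Passing the size explicitly is a design choice of the sketch; with the real function the caller guarantees the size up front instead.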
- <screen>
- oid_value oid_getvalbyname(const char *name);
- </screen>
-
- <para>
- takes as argument a mnemonic OID name, and returns the
- <literal>/value</literal> field of the first entry in the database that
- contains the given name in its <literal>desc</literal> field.
- </para>
-
- <para>
- Three utility functions are provided for translating OIDs'
- symbolic names (e.g. <literal>Usmarc</literal> into OID structures
- (int arrays) and strings containing the OID in dotted notation
- (e.g. <literal>1.2.840.10003.9.5.1</literal>). They are:
- </para>
-
- <screen>
- int *oid_name_to_oid(oid_class oclass, const char *name, int *oid);
- char *oid_to_dotstring(const int *oid, char *oidbuf);
- char *oid_name_to_dotstring(oid_class oclass, const char *name, char *oidbuf);
- </screen>
-
- <para>
- <literal>oid_name_to_oid()</literal>
- translates the specified symbolic <literal>name</literal>,
- interpreted as being of class <literal>oclass</literal>. (The
- class must be specified as many symbolic names exist within
- multiple classes - for example, <literal>Zthes</literal> is the
- symbolic name of an attribute set, a schema and a tag-set.) The
- sequence of integers representing the OID is written into the
- area <literal>oid</literal> provided by the caller; it is the
- caller's responsibility to ensure that this area is large enough
- to contain the translated OID. As a convenience, the address of
- the buffer (i.e. the value of <literal>oid</literal>) is
- returned.
- </para>
- <para>
- <literal>oid_to_dotstring()</literal>
- Translates the int-array <literal>oid</literal> into a dotted
- string which is written into the area <literal>oidbuf</literal>
- supplied by the caller; it is the caller's responsibility to
- ensure that this area is large enough. The address of the buffer
- is returned.
- </para>
<para>
- <literal>oid_name_to_dotstring()</literal>
- combines the previous two functions to derive a dotted string
- representing the OID specified by <literal>oclass</literal> and
- <literal>name</literal>, writing it into the buffer passed as
- <literal>oidbuf</literal> and returning its address.
+ OIDs can be copied with <function>oid_oidcpy</function> which takes
+ two OID lists as arguments. Alternatively, an OID copy can be allocated
+ on an ODR stream with:
+ <screen>
+ Odr_oid *odr_oiddup(ODR odr, const Odr_oid *o);
+ </screen>
</para>
-
+
<para>
- Finally, the module provides the following utility functions, whose
- meaning should be obvious:
+ OIDs can be compared with <function>oid_oidcmp</function> which returns
+ zero if the two OIDs provided are identical; non-zero otherwise.
</para>
+
+ <sect2 id="tools.oid.database"><title>OID database</title>
+ <para>
+ From YAZ version 3 and later, the oident system has been replaced
+ by an OID database. The name is something of a misnomer, since the old
+ oident system was also a database.
+ </para>
+ <para>
+ The OID database is really just a map between named Object Identifiers
+ (string) and their OID raw equivalents. Most operations either
+ convert from string to OID or the other way around.
+ </para>
+ <para>
+ Unfortunately, whenever we supply a string we must also specify the
+ <emphasis>OID class</emphasis>. The class is necessary because some
+ strings correspond to multiple OIDs. An example of such a string is
+ <literal>Bib-1</literal> which may either be an attribute-set
+ or a diagnostic-set.
+ </para>
+ <para>
+ Applications using the YAZ database should include
+ <filename>yaz/oid_db.h</filename>.
+ </para>
+ <para>
+ A YAZ database handle is of type <literal>yaz_oid_db_t</literal>.
+ This is actually a pointer, but you need not deal with that.
+ YAZ has a built-in database which can be considered "constant" for
+ most purposes.
+ We can get hold of it by using the function <function>yaz_oid_std</function>.
+ </para>
+ <para>
+ All functions with prefix <function>yaz_string_to_oid</function>
+ convert from class + string to OID. We have variants of this
+ operation due to different memory allocation strategies.
+ </para>
+ <para>
+ All functions with prefix
+ <function>yaz_oid_to_string</function> convert from OID to string
+ + class.
+ </para>
- <screen>
- void oid_oidcpy(int *t, int *s);
- void oid_oidcat(int *t, int *s);
- int oid_oidcmp(int *o1, int *o2);
- int oid_oidlen(int *o);
- </screen>
+ <example id="tools.oid.bib1.4"><title>Create OID with YAZ DB</title>
+ <para>
+ We can create an OID for the Bib-1 attribute set on the ODR stream
+ odr with:
+ <screen>
+ Odr_oid *bib1 =
+ yaz_string_to_oid_odr(yaz_oid_std(), CLASS_ATTSET, "Bib-1", odr);
+ </screen>
+ This is more complex than using <function>odr_getoidbystr</function>.
+ You would only use <function>yaz_string_to_oid_odr</function> when the
+ string (here Bib-1) is supplied by a user or configuration.
+ </para>
+ </example>
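The need for an OID class can be illustrated with a toy lookup table. This is a sketch, not the YAZ database: the `sketch_` names, the two-entry table and the exact string comparison are all assumptions made for the example. The two OIDs are the Bib-1 attribute set (1.2.840.10003.3.1) and the Bib-1 diagnostic set (1.2.840.10003.4.1).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classes; YAZ has its own, larger set. */
enum sketch_class { SKETCH_ATTSET, SKETCH_DIAGSET };

struct sketch_entry {
    enum sketch_class oclass;
    const char *name;
    int oid[8];                     /* -1 terminated */
};

/* Toy database: the same name appears under two classes. */
static const struct sketch_entry sketch_db[] = {
    { SKETCH_ATTSET,  "Bib-1", {1, 2, 840, 10003, 3, 1, -1} },
    { SKETCH_DIAGSET, "Bib-1", {1, 2, 840, 10003, 4, 1, -1} },
};

/* Hypothetical lookup: class + name to OID, NULL if there is no entry. */
const int *sketch_string_to_oid(enum sketch_class oclass, const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof sketch_db / sizeof sketch_db[0]; i++) {
        if (sketch_db[i].oclass == oclass
            && strcmp(sketch_db[i].name, name) == 0)
            return sketch_db[i].oid;
    }
    return NULL;
}
```

Without the class argument, a lookup of "Bib-1" could not tell which of the two entries the caller means.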
- <note>
+ </sect2>
+ <sect2 id="tools.oid.std"><title>Standard OIDs</title>
+
+ <para>
+ All the object identifiers in the standard OID database as returned
+ by <function>yaz_oid_std</function> can be referenced directly in a
+ program as a constant OID.
+ Each constant OID is prefixed with <literal>yaz_oid_</literal> -
+ followed by OID class (lowercase) - then by OID name (normalized and
+ lowercase).
+ </para>
<para>
- The OID module has been criticized - and perhaps rightly so
- - for needlessly abstracting the
- representation of OIDs. Other toolkits use a simple
- string-representation of OIDs with good results. In practice, we have
- found the interface comfortable and quick to work with, and it is a
- simple matter (for what it's worth) to create applications compatible
- with both ISO SR and Z39.50. Finally, the use of the
- <literal>/oident</literal> database is by no means mandatory.
- You can easily create your own system for representing OIDs, as long
- as it is compatible with the low-level integer-array representation
- of the ODR module.
+ See <xref linkend="list-oids"/> for a list of all object identifiers
+ built into YAZ.
+ These are declared in <filename>yaz/oid_std.h</filename> but are
+ included by <filename>yaz/oid_db.h</filename> as well.
</para>
- </note>
+ <example id="tools.oid.bib1.5"><title>Use a built-in OID</title>
+ <para>
+ We can allocate our own OID filled with the constant OID for
+ Bib-1 with:
+ <screen>
+ Odr_oid *bib1 = odr_oiddup(o, yaz_oid_attset_bib1);
+ </screen>
+ </para>
+ </example>
+ </sect2>
</sect1>
-
<sect1 id="tools.nmem"><title>Nibble Memory</title>
<para>
</example>
</sect1>
+ <sect1 id="tools.retrieval">
+ <title>Retrieval Facility</title>
+ <para>
+ YAZ version 2.1.20 or later includes a Retrieval facility tool
+ which allows an SRU/Z39.50 server to describe itself and perform record
+ conversions. The idea is the following:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ An SRU/Z39.50 client sends a retrieval request which includes
+ a combination of the following parameters: syntax (format),
+ schema (or element set name).
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ The retrieval facility is invoked with parameters in a
+ server/proxy. The retrieval facility matches the parameters against a set of
+ "supported" retrieval types.
+ If there is no match, the retrieval signals an error
+ (syntax and / or schema not supported).
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ For a successful match, the backend is invoked with the same
+ or altered retrieval parameters (syntax, schema). If
+ a record is received from the backend, it is converted to the
+ frontend name / syntax.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ The resulting record is sent back to the client and tagged with
+ the frontend syntax / schema.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+ </para>
+ <para>
+ The Retrieval facility is driven by an XML configuration. The
+ configuration is neither Z39.50 ZeeRex nor SRU ZeeRex, but it
+ should be easy to generate both of them from the XML configuration
+ (unfortunately the two versions
+ of ZeeRex differ substantially in this regard).
+ </para>
+ <sect2 id="tools.retrieval.format">
+ <title>Retrieval XML format</title>
+ <para>
+ All elements should be covered by namespace
+ <literal>http://indexdata.com/yaz</literal>.
+ The root element node must be <literal>retrievalinfo</literal>.
+ </para>
+ <para>
+ The <literal>retrievalinfo</literal> must include one or
+ more <literal>retrieval</literal> elements. Each
+ <literal>retrieval</literal> defines a specific combination of
+ syntax, name and identifier supported by this retrieval service.
+ </para>
+ <para>
+ The <literal>retrieval</literal> element may include any of the
+ following attributes:
+ <variablelist>
+ <varlistentry><term><literal>syntax</literal> (REQUIRED)</term>
+ <listitem>
+ <para>
+ Defines the record syntax. Possible values are any
+ of the names defined in YAZ' OID database or a raw
+ OID in dotted form (n.n ... n).
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry><term><literal>name</literal> (OPTIONAL)</term>
+ <listitem>
+ <para>
+ Defines the name of the retrieval format. This can be
+ any string. For SRU, the value is equivalent to schema (short-hand);
+ for Z39.50 it's equivalent to simple element set name.
+ For YAZ 3.0.24 and later this name may be specified as a glob
+ expression with operators
+ <literal>*</literal> and <literal>?</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry><term><literal>identifier</literal> (OPTIONAL)</term>
+ <listitem>
+ <para>
+ Defines the URI schema name of the retrieval format. This can be
+ any string. For SRU, the value is equivalent to URI schema.
+ For Z39.50, there is no equivalent.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
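The matching step described above (request parameters checked against the configured `retrieval` entries, with `name` allowed to be a glob using `*` and `?`) might be sketched as follows. This is an illustration under stated assumptions, not the code in `yaz/retrieval.h`: every `sketch_` name is hypothetical, syntaxes are compared as exact strings rather than via the OID database, the first matching entry wins, and the `identifier` attribute is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical glob match: '*' matches any run of characters (possibly
 * empty), '?' matches exactly one. Simple recursion; it can be slow for
 * patterns with many '*'. */
int sketch_glob_match(const char *pat, const char *s)
{
    assert(pat != NULL && s != NULL);
    if (*pat == '\0')
        return *s == '\0';
    if (*pat == '*') {
        do {
            if (sketch_glob_match(pat + 1, s))
                return 1;           /* '*' consumed the characters so far */
        } while (*s++ != '\0');
        return 0;
    }
    if (*s == '\0')
        return 0;
    if (*pat != '?' && *pat != *s)
        return 0;
    return sketch_glob_match(pat + 1, s + 1);
}

/* One configured retrieval entry: syntax plus a name that may be a glob. */
struct sketch_retrieval {
    const char *syntax;
    const char *name;
};

/* Return the first entry supporting (syntax, name), or NULL, which the
 * caller would report as "syntax and / or schema not supported". */
const struct sketch_retrieval *sketch_find(const struct sketch_retrieval *tab,
                                           size_t n, const char *syntax,
                                           const char *name)
{
    size_t i;

    assert(tab != NULL && syntax != NULL && name != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(tab[i].syntax, syntax) == 0
            && sketch_glob_match(tab[i].name, name))
            return &tab[i];
    }
    return NULL;
}
```

With a table holding usmarc/F, usmarc/B, xml/marc* and xml/dc, a request for syntax `xml` and name `marcxml` would select the third entry, while a request for an unconfigured syntax would yield NULL.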
+ <para>
+ The <literal>retrieval</literal> may include one
+ <literal>backend</literal> element. If a <literal>backend</literal>
+ element is given, it specifies how the records are retrieved by
+ some backend and how the records are converted from the backend to
+ the "frontend".
+ </para>
+ <para>
+ The attributes <literal>name</literal> and <literal>syntax</literal>
+ may be specified for the <literal>backend</literal> element. The
+ semantics of these attributes are equivalent to those for the
+ <literal>retrieval</literal> element. However, these values are passed to
+ the "backend".
+ </para>
+ <para>
+ The <literal>backend</literal> element may include one or more
+ conversion instructions (as child elements). The supported
+ conversions are:
+ <variablelist>
+ <varlistentry><term><literal>marc</literal></term>
+ <listitem>
+ <para>
+ The <literal>marc</literal> element specifies a conversion
+ to and from ISO2709-encoded MARC and
+ <ulink url="&url.marcxml;">&acro.marcxml;</ulink>/MarcXchange.
+ The following attributes may be specified:
+
+ <variablelist>
+ <varlistentry><term><literal>inputformat</literal> (REQUIRED)</term>
+ <listitem>
+ <para>
+ Format of input. Supported values are
+ <literal>marc</literal> (for ISO2709) and <literal>xml</literal>
+ (for MARCXML/MarcXchange).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term><literal>outputformat</literal> (REQUIRED)</term>
+ <listitem>
+ <para>
+ Format of output. Supported values are
+ <literal>line</literal> (MARC line format);
+ <literal>marcxml</literal> (for MARCXML),
+ <literal>marc</literal> (ISO2709),
+ <literal>marcxchange</literal> (for MarcXchange).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term><literal>inputcharset</literal> (OPTIONAL)</term>
+ <listitem>
+ <para>
+ Encoding of input. For XML input formats, this need not
+ be given, but for ISO2709 based inputformats, this should
+ be set to the encoding used. For MARC21 records, a common
+ inputcharset value would be <literal>marc-8</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term><literal>outputcharset</literal> (OPTIONAL)</term>
+ <listitem>
+ <para>
+ Encoding of output. If outputformat is XML based, it is
+ strongly recommended to use <literal>utf-8</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry><term><literal>xslt</literal></term>
+ <listitem>
+ <para>
+ The <literal>xslt</literal> element specifies a conversion
+ via &acro.xslt;. The following attributes may be specified:
+
+ <variablelist>
+ <varlistentry><term><literal>stylesheet</literal> (REQUIRED)</term>
+ <listitem>
+ <para>
+ Stylesheet file.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </sect2>
+ <sect2 id="tools.retrieval.examples">
+ <title>Retrieval Facility Examples</title>
+ <example id="tools.retrieval.marc21">
+ <title>MARC21 backend</title>
+ <para>
+ A typical way to use the retrieval facility is to enable XML
+ for servers that only support ISO2709 encoded MARC21 records.
+ </para>
+ <programlisting><![CDATA[
+ <retrievalinfo>
+ <retrieval syntax="usmarc" name="F"/>
+ <retrieval syntax="usmarc" name="B"/>
+ <retrieval syntax="xml" name="marcxml"
+ identifier="info:srw/schema/1/marcxml-v1.1">
+ <backend syntax="usmarc" name="F">
+ <marc inputformat="marc" outputformat="marcxml"
+ inputcharset="marc-8"/>
+ </backend>
+ </retrieval>
+ <retrieval syntax="xml" name="dc">
+ <backend syntax="usmarc" name="F">
+ <marc inputformat="marc" outputformat="marcxml"
+ inputcharset="marc-8"/>
+ <xslt stylesheet="MARC21slim2DC.xsl"/>
+ </backend>
+ </retrieval>
+ </retrievalinfo>
+]]>
+ </programlisting>
+ <para>
+ This means that our frontend supports:
+ <itemizedlist>
+ <listitem>
+ <para>
+ MARC21 F(ull) records.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ MARC21 B(rief) records.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ MARCXML records.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Dublin core records.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </example>
+ </sect2>
+ <sect2 id="tools.retrieval.api">
+ <title>API</title>
+ <para>
+ It should be easy to use the retrieval systems from applications. Refer
+ to the headers
+ <filename>yaz/retrieval.h</filename> and
+ <filename>yaz/record_conv.h</filename>.
+ </para>
+ </sect2>
+ </sect1>
</chapter>
<!-- Keep this comment at the end of the file