+
+ <sect1 id="tools.log"><title>Log</title>
+ <para>
+ &yaz; has evolved a fairly complex log system which should be useful
+ for debugging &yaz; itself, for debugging applications that use &yaz;,
+ and for production use of those applications.
+ </para>
+ <para>
+ The log functions are declared in header <filename>yaz/log.h</filename>
+ and implemented in <filename>src/log.c</filename>.
+ Due to a name clash with syslog and some math utilities, the logging
+ interface has been modified as of YAZ 2.0.29. The obsolete interface
+ is still available in header file <filename>yaz/log.h</filename>.
+ The key points of the interface are:
+ </para>
+ <screen>
+ void yaz_log(int level, const char *fmt, ...);
+
+ void yaz_log_init(int level, const char *prefix, const char *name);
+ void yaz_log_init_file(const char *fname);
+ void yaz_log_init_level(int level);
+ void yaz_log_init_prefix(const char *prefix);
+ void yaz_log_time_format(const char *fmt);
+ void yaz_log_init_max_size(int mx);
+
+ int yaz_log_mask_str(const char *str);
+ int yaz_log_module_level(const char *name);
+ </screen>
+
+ <para>
+ The reason for the whole log module is the <function>yaz_log</function>
+ function. It takes a bitmask indicating the log levels, a
+ <literal>printf</literal>-like format string, and a variable number of
+ arguments to log.
+ </para>
+
+ <para>
+ The <literal>log level</literal> is a bit mask that says on which level(s)
+ the log entry should be made, and optionally sets some behaviour of the
+ logging. In the simplest cases, it can be one of <literal>YLOG_FATAL,
+ YLOG_DEBUG, YLOG_WARN, YLOG_LOG</literal>. Those can be combined with bits
+ that modify the way the log entry is written: <literal>YLOG_ERRNO,
+ YLOG_NOTIME, YLOG_FLUSH</literal>.
+ Most of the remaining bits are deprecated and should not be used. Use
+ the dynamic log levels instead.
+
+ <para>
+ Applications that use &yaz; should not use <literal>YLOG_LOG</literal> for
+ ordinary messages, but should make use of the dynamic log level system. This
+ consists of two parts: defining the log level and checking it.
+ </para>
+
+ <para>
+ To define the log levels, the (main) program should pass a string to
+ <function>yaz_log_mask_str</function> to define which log levels are to be
+ logged. This string should be a comma-separated list of log level names,
+ and can contain both hard-coded names and dynamic ones. The log level
+ calculation starts with <literal>YLOG_DEFAULT_LEVEL</literal> and adds a bit
+ for each word it meets, unless the word starts with a '-', in which case it
+ clears the bit. If the string <literal>'none'</literal> is found,
+ all bits are cleared. Typically this string comes from the command-line,
+ often identified by <literal>-v</literal>.
+ <function>yaz_log_mask_str</function> returns a log level that should be
+ passed to <function>yaz_log_init_level</function> for it to take effect.
+ </para>
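+ <para>
+ The calculation described above can be sketched in plain C. This is only
+ an illustration of the rule (start from a default, set a bit per word,
+ clear it for a leading '-', clear everything on 'none'), not the code in
+ <filename>src/log.c</filename>. The names <literal>sk_mask_str</literal>,
+ <literal>sk_bit_for</literal> and the <literal>SK_*</literal> bit values
+ are hypothetical, and the sketch only knows a fixed table of names, whereas
+ <function>yaz_log_mask_str</function> also accepts dynamic ones.
+ </para>
+ <programlisting><![CDATA[
```c
#include <assert.h>
#include <string.h>

/* Hypothetical level bits for this sketch; these are NOT YAZ's values. */
#define SK_FATAL   0x01
#define SK_DEBUG   0x02
#define SK_WARN    0x04
#define SK_LOG     0x08
#define SK_DEFAULT (SK_FATAL | SK_WARN | SK_LOG)

static const struct { const char *name; int bit; } sk_levels[] = {
    { "fatal", SK_FATAL }, { "debug", SK_DEBUG },
    { "warn",  SK_WARN  }, { "log",   SK_LOG   }
};

/* Return the bit for one level name, or 0 if the name is unknown. */
static int sk_bit_for(const char *word, size_t len)
{
    size_t i;
    for (i = 0; i < sizeof sk_levels / sizeof sk_levels[0]; i++)
        if (strlen(sk_levels[i].name) == len &&
            strncmp(sk_levels[i].name, word, len) == 0)
            return sk_levels[i].bit;
    return 0;
}

/* Start from SK_DEFAULT; each word sets its bit, "-word" clears it,
   and "none" clears every bit accumulated so far. */
int sk_mask_str(const char *str)
{
    int level = SK_DEFAULT;

    assert(str != NULL);
    while (*str) {
        size_t len = strcspn(str, ",");          /* length of this word */
        int negate = (len > 0 && *str == '-');
        const char *word = negate ? str + 1 : str;
        size_t wlen = negate ? len - 1 : len;

        if (!negate && wlen == 4 && strncmp(word, "none", 4) == 0)
            level = 0;
        else if (negate)
            level &= ~sk_bit_for(word, wlen);
        else
            level |= sk_bit_for(word, wlen);

        str += len;
        if (*str == ',')
            str++;                               /* skip the separator */
    }
    return level;
}
```
+]]>
+ </programlisting>
+ <para>
+ With these assumed bits, <literal>sk_mask_str("none,debug")</literal>
+ yields only the debug bit, and <literal>sk_mask_str("-warn")</literal>
+ yields the default mask without the warn bit.
+ </para>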
+
+ <para>
+ Each module should check which log bits it should use, by calling
+ <function>yaz_log_module_level</function> with a suitable name for the
+ module. Any leading path and any extension are stripped from the name,
+ so it is quite possible to use <literal>__FILE__</literal> for it. If the
+ name has been passed to <function>yaz_log_mask_str</function>, the routine
+ returns a non-zero bitmask, which should then be used in subsequent calls
+ to <function>yaz_log</function>. (It can also be tested, so as to avoid
+ unnecessary calls to <function>yaz_log</function> in time-critical places,
+ or when the log entry would take time to construct.)
+ </para>
+
+ <para>
+ &yaz; uses the following dynamic log levels:
+ <literal>server, session, request, requestdetail</literal> for the server
+ functionality.
+ <literal>zoom</literal> for the ZOOM client API.
+ <literal>ztest</literal> for the simple test server.
+ <literal>malloc, nmem, odr, eventl</literal> for internal debugging of &yaz; itself.
+ Of course, any program using &yaz; is welcome to define as many new ones as
+ it needs.
+ </para>
+
+ <para>
+ By default the log is written to stderr, but this can be changed by a call
+ to <function>yaz_log_init_file</function> or
+ <function>yaz_log_init</function>. If the log is directed to a file, the
+ file size is checked at every write, and if it exceeds the limit given in
+ <function>yaz_log_init_max_size</function>, the log is rotated. The
+ rotation keeps one old version (with a <literal>.1</literal> appended to
+ the name). The size defaults to 1GB. Setting it to zero will disable the
+ rotation feature.
+ </para>
+
+ <screen>
+ A typical yaz-log looks like this
+ 13:23:14-23/11 yaz-ztest(1) [session] Starting session from tcp:127.0.0.1 (pid=30968)
+ 13:23:14-23/11 yaz-ztest(1) [request] Init from 'YAZ' (81) (ver 2.0.28) OK
+ 13:23:17-23/11 yaz-ztest(1) [request] Search Z: @attrset Bib-1 foo OK:7 hits
+ 13:23:22-23/11 yaz-ztest(1) [request] Present: [1] 2+2 OK 2 records returned
+ 13:24:13-23/11 yaz-ztest(1) [request] Close OK
+ </screen>
+
+ <para>
+ The log entries start with a time stamp. This can be omitted by setting the
+ <literal>YLOG_NOTIME</literal> bit in the loglevel. That way automated tests
+ stand a better chance of producing identical log files, which are easy to diff. The
+ format of the time stamp can be set with
+ <function>yaz_log_time_format</function>, which takes a format string just
+ like <function>strftime</function>.
+ </para>
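+ <para>
+ As a small illustration of such a format string, the sketch below feeds
+ a pattern to the standard <function>strftime</function> directly. The
+ pattern <literal>%H:%M:%S-%d/%m</literal> is an assumption chosen to
+ mimic the stamps in the sample log above, and
+ <literal>sk_stamp</literal> is a hypothetical helper, not part of &yaz;.
+ </para>
+ <programlisting><![CDATA[
```c
#include <string.h>
#include <time.h>

/* Format the given wall-clock time with a strftime pattern.
   Returns the number of characters written, or 0 if buf is too small. */
size_t sk_stamp(char *buf, size_t size, const char *fmt,
                int hour, int min, int sec, int mday, int month)
{
    struct tm tm;

    memset(&tm, 0, sizeof tm);
    tm.tm_hour = hour;
    tm.tm_min = min;
    tm.tm_sec = sec;
    tm.tm_mday = mday;
    tm.tm_mon = month - 1;   /* tm_mon counts months from 0 */
    return strftime(buf, size, fmt, &tm);
}
```
+]]>
+ </programlisting>
+ <para>
+ With the assumed pattern, 13:23:14 on 23 November formats as
+ <literal>13:23:14-23/11</literal>.
+ </para>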
+
+ <para>
+ Next in a log line comes the prefix, often the name of the program. For
+ &yaz;-based servers, it can also contain the session number. Then
+ come one or more logbits in square brackets, depending on the logging
+ level set by <function>yaz_log_init_level</function> and the loglevel
+ passed to <function>yaz_log</function>. Finally come the format
+ string and additional values passed to <function>yaz_log</function>.
+ </para>
+
+ <para>
+ The log level <literal>YLOG_LOGLVL</literal>, enabled by the string
+ <literal>loglevel</literal>, will log all the log-level affecting
+ operations. This can come in handy if you need to know what other log
+ levels would be useful. Grep the logfile for <literal>[loglevel]</literal>.
+ </para>
+
+ <para>
+ The log system is almost independent of the rest of &yaz;. The only
+ important dependency is on <filename>nmem</filename>, and that only for
+ using the semaphore definition there.
+ </para>
+
+ <para>
+ The dynamic log levels and log rotation were introduced in &yaz; 2.0.28. At
+ the same time, the log bit names were changed from
+ <literal>LOG_something</literal> to <literal>YLOG_something</literal>,
+ to avoid collision with <filename>syslog.h</filename>.
+ </para>
+
+ </sect1>
+
+ <sect1 id="marc"><title>MARC</title>
+
+ <para>
+ YAZ provides a fast utility for working with MARC records.
+ Early versions of the MARC utility only allowed decoding of ISO2709.
+ Today the utility can both encode and decode a variety of formats.
+ </para>
+ <synopsis><![CDATA[
+ #include <yaz/marcdisp.h>
+
+ /* create handler */
+ yaz_marc_t yaz_marc_create(void);
+ /* destroy */
+ void yaz_marc_destroy(yaz_marc_t mt);
+
+ /* set XML mode YAZ_MARC_LINE, YAZ_MARC_SIMPLEXML, ... */
+ void yaz_marc_xml(yaz_marc_t mt, int xmlmode);
+ #define YAZ_MARC_LINE 0
+ #define YAZ_MARC_SIMPLEXML 1
+ #define YAZ_MARC_OAIMARC 2
+ #define YAZ_MARC_MARCXML 3
+ #define YAZ_MARC_ISO2709 4
+ #define YAZ_MARC_XCHANGE 5
+ #define YAZ_MARC_CHECK 6
+ #define YAZ_MARC_TURBOMARC 7
+
+ /* supply iconv handle for character set conversion .. */
+ void yaz_marc_iconv(yaz_marc_t mt, yaz_iconv_t cd);
+
+ /* set debug level, 0=none, 1=more, 2=even more, .. */
+ void yaz_marc_debug(yaz_marc_t mt, int level);
+
+ /* decode MARC in buf of size bsize. Returns >0 on success; <=0 on failure.
+ On success, result in *result with size *rsize. */
+ int yaz_marc_decode_buf(yaz_marc_t mt, const char *buf, int bsize,
+ const char **result, size_t *rsize);
+
+ /* decode MARC in buf of size bsize. Returns >0 on success; <=0 on failure.
+ On success, result in WRBUF */
+ int yaz_marc_decode_wrbuf(yaz_marc_t mt, const char *buf,
+ int bsize, WRBUF wrbuf);
+]]>
+ </synopsis>
+ <note>
+ <para>
+ The synopsis is just a basic subset of all functionality. Refer
+ to the actual header file <filename>marcdisp.h</filename> for
+ details.
+ </para>
+ </note>
+ <para>
+ A MARC conversion handle must be created by using
+ <function>yaz_marc_create</function> and destroyed
+ by calling <function>yaz_marc_destroy</function>.
+ </para>
+ <para>
+ All other functions operate on a <literal>yaz_marc_t</literal> handle.
+ The output is specified by a call to <function>yaz_marc_xml</function>.
+ The <literal>xmlmode</literal> must be one of
+ <variablelist>
+ <varlistentry>
+ <term>YAZ_MARC_LINE</term>
+ <listitem>
+ <para>
+ A simple line-by-line format suitable for display but not
+ recommended for further (machine) processing.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>YAZ_MARC_MARCXML</term>
+ <listitem>
+ <para>
+ <ulink url="&url.marcxml;">MARCXML</ulink>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>YAZ_MARC_ISO2709</term>
+ <listitem>
+ <para>
+ ISO2709 (sometimes just referred to as "MARC").
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>YAZ_MARC_XCHANGE</term>
+ <listitem>
+ <para>
+ <ulink url="&url.marcxchange;">MarcXchange</ulink>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>YAZ_MARC_CHECK</term>
+ <listitem>
+ <para>
+ Pseudo format for validation only. Does not generate
+ any real output except diagnostics.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>YAZ_MARC_TURBOMARC</term>
+ <listitem>
+ <para>
+ XML format with same semantics as MARCXML but more compact
+ and geared towards fast processing with XSLT. Refer to
+ <xref linkend="tools.turbomarc"/> for more information.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </para>
+ <para>
+ The actual conversion functions are
+ <function>yaz_marc_decode_buf</function> and
+ <function>yaz_marc_decode_wrbuf</function>, which decode and encode
+ a MARC record. The former function operates on simple buffers; the
+ latter stores the resulting record in a WRBUF handle (WRBUF is a simple
+ string type).
+ </para>
+ <example id="example.marc.display">
+ <title>Display of MARC record</title>
+ <para>
+ The following program snippet illustrates how the MARC API may
+ be used to convert a MARC record to the line-by-line format:
+ <programlisting><![CDATA[
+ #include <stdio.h>
+ #include <yaz/marcdisp.h>
+
+ void print_marc(const char *marc_buf, int marc_buf_size)
+ {
+     const char *result; /* for result buf, owned by mt */
+     size_t result_len; /* for size of result */
+     yaz_marc_t mt = yaz_marc_create();
+     yaz_marc_xml(mt, YAZ_MARC_LINE);
+     /* only print if decoding succeeded (return value > 0) */
+     if (yaz_marc_decode_buf(mt, marc_buf, marc_buf_size,
+                             &result, &result_len) > 0)
+         fwrite(result, result_len, 1, stdout);
+     yaz_marc_destroy(mt); /* note that result is now freed... */
+ }
+]]>
+ </programlisting>
+ </para>
+ </example>
+ <sect2 id="tools.turbomarc">
+ <title>TurboMARC</title>
+ <para>
+ TurboMARC is yet another XML encoding of a MARC record. The format
+ was designed for fast processing with XSLT.
+ </para>
+ <para>
+ Applications like
+ Pazpar2 use XSLT to convert an XML-encoded MARC record to an internal
+ representation. This conversion mostly checks the tag of a MARC field
+ to determine the basic rules in the conversion. This check is
+ costly when the tag is encoded as an attribute, as in MARCXML.
+ Having the tag value as part of the element name instead makes processing
+ many times faster (at least for Libxslt).
+ </para>
+ <para>
+ TurboMARC is encoded as follows:
+ <itemizedlist>
+ <listitem><para>
+ Record elements are part of namespace
+ "<literal>http://www.indexdata.com/turbomarc</literal>".
+ </para></listitem>
+ <listitem><para>
+ A record is enclosed in element <literal>r</literal>.
+ </para></listitem>
+ <listitem><para>
+ A collection of records is enclosed in element
+ <literal>collection</literal>.
+ </para></listitem>
+ <listitem><para>
+ The leader is encoded as element <literal>l</literal> with the
+ leader content as its (text) value.
+ </para></listitem>
+ <listitem><para>
+ A control field is encoded as element <literal>c</literal> concatenated
+ with the tag value of the control field if the tag value
+ matches the regular expression <literal>[a-zA-Z0-9]*</literal>.
+ If the tag value does not match the regular expression
+ <literal>[a-zA-Z0-9]*</literal>, the control field is encoded
+ as element <literal>c</literal> and attribute <literal>code</literal>
+ holds the tag value.
+ This rule ensures that, in the rare cases where a tag value would
+ result in XML that is not well-formed, YAZ encodes it as a coded attribute
+ (as in MARCXML).
+ </para>
+ <para>
+ The control field content is the text value of this element.
+ Indicators are encoded as attribute names
+ <literal>i1</literal>, <literal>i2</literal>, etc., with the
+ corresponding value for each indicator.
+ </para></listitem>
+ <listitem><para>
+ A data field is encoded as element <literal>d</literal> concatenated
+ with the tag value of the data field or using the attribute
+ <literal>code</literal> as described in the rules for control fields.
+ The children of the data field element are subfield elements.
+ Each subfield element is encoded as <literal>s</literal>
+ concatenated with the subfield code.
+ The text of the subfield element is the contents of the subfield.
+ Indicators are encoded as attributes for the data field element, similar
+ to the encoding for control fields.
+ </para></listitem>
+ </itemizedlist>
+ </para>
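+ <para>
+ The naming rule for control and data field elements can be sketched as
+ follows. This is a minimal illustration of the rule as stated above, not
+ the encoder used by &yaz;. The function name
+ <literal>sk_turbomarc_name</literal> and its return convention are
+ hypothetical, and <function>isalnum</function> is assumed to run in the
+ default "C" locale so that it matches <literal>[a-zA-Z0-9]</literal>.
+ </para>
+ <programlisting><![CDATA[
```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Build the element name for a field into buf.
   prefix is 'c' (control field) or 'd' (data field).
   Returns 1 if the tag was folded into the name (for example "d245"),
   0 if only the bare prefix was written and the caller must emit the
   tag in a code attribute, and -1 if buf is too small. */
int sk_turbomarc_name(char prefix, const char *tag, char *buf, size_t size)
{
    size_t i, len;
    int plain = 1;

    assert(tag != NULL && buf != NULL);
    len = strlen(tag);
    for (i = 0; i < len; i++)
        if (!isalnum((unsigned char) tag[i]))
            plain = 0;               /* tag does not match [a-zA-Z0-9]* */

    if (!plain)
        len = 0;                     /* fall back to the bare prefix */
    if (size < len + 2)
        return -1;
    buf[0] = prefix;
    memcpy(buf + 1, tag, len);
    buf[len + 1] = '\0';
    return plain;
}
```
+]]>
+ </programlisting>
+ <para>
+ For a data field with tag <literal>245</literal> the sketch produces the
+ element name <literal>d245</literal>. For a tag such as
+ <literal>0/1</literal> it produces the bare name and signals that the
+ <literal>code</literal> attribute is needed.
+ </para>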
+ </sect2>
+ </sect1>
+
+ <sect1 id="tools.retrieval">
+ <title>Retrieval Facility</title>
+ <para>
+ YAZ version 2.1.20 or later includes a Retrieval facility tool
+ which allows an SRU/Z39.50 server to describe itself and perform record
+ conversions. The idea is the following:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ An SRU/Z39.50 client sends a retrieval request which includes
+ a combination of the following parameters: syntax (format),
+ schema (or element set name).
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ The retrieval facility is invoked with parameters in a
+ server/proxy. The retrieval facility matches the parameters against a set of
+ "supported" retrieval types.
+ If there is no match, the retrieval facility signals an error
+ (syntax and / or schema not supported).
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ For a successful match, the backend is invoked with the same
+ or altered retrieval parameters (syntax, schema). If
+ a record is received from the backend, it is converted to the
+ frontend name / syntax.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ The resulting record is sent back to the client and tagged with
+ the frontend syntax / schema.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+ </para>
+ <para>
+ The Retrieval facility is driven by an XML configuration. The
+ configuration is neither Z39.50 ZeeRex nor SRU ZeeRex, but it
+ should be easy to generate both of them from the XML configuration
+ (unfortunately, the two versions
+ of ZeeRex differ substantially in this regard).
+ </para>
+ <sect2 id="tools.retrieval.format">
+ <title>Retrieval XML format</title>
+ <para>
+ All elements should be covered by namespace
+ <literal>http://indexdata.com/yaz</literal>.
+ The root element node must be <literal>retrievalinfo</literal>.
+ </para>
+ <para>
+ The <literal>retrievalinfo</literal> must include one or
+ more <literal>retrieval</literal> elements. Each
+ <literal>retrieval</literal> defines a specific combination of
+ syntax, name and identifier supported by this retrieval service.
+ </para>
+ <para>
+ The <literal>retrieval</literal> element may include any of the
+ following attributes:
+ <variablelist>
+ <varlistentry><term><literal>syntax</literal> (REQUIRED)</term>
+ <listitem>
+ <para>
+ Defines the record syntax. Possible values are any
+ of the names defined in YAZ' OID database or a raw
+ OID in dotted form (n.n ... n).
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry><term><literal>name</literal> (OPTIONAL)</term>
+ <listitem>
+ <para>
+ Defines the name of the retrieval format. This can be
+ any string. For SRU, the value is equivalent to the schema (short-hand);
+ for Z39.50, it is equivalent to the simple element set name.
+ For YAZ 3.0.24 and later this name may be specified as a glob
+ expression with operators
+ <literal>*</literal> and <literal>?</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry><term><literal>identifier</literal> (OPTIONAL)</term>
+ <listitem>
+ <para>
+ Defines the URI schema name of the retrieval format. This can be
+ any string. For SRU, the value is equivalent to the URI schema.
+ For Z39.50, there is no equivalent.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ <para>
+ The <literal>retrieval</literal> may include one
+ <literal>backend</literal> element. If a <literal>backend</literal>
+ element is given, it specifies how the records are retrieved by
+ some backend and how the records are converted from the backend to
+ the "frontend".
+ </para>
+ <para>
+ The attributes <literal>name</literal> and <literal>syntax</literal>
+ may be specified for the <literal>backend</literal> element. The
+ semantics of these attributes are equivalent to those for the
+ <literal>retrieval</literal> element. However, these values are passed to
+ the "backend".
+ </para>
+ <para>
+ The <literal>backend</literal> element may include one or more
+ conversion instructions (as child elements). The supported
+ conversions are:
+ <variablelist>
+ <varlistentry><term><literal>marc</literal></term>
+ <listitem>
+ <para>
+ The <literal>marc</literal> element specifies a conversion
+ to and from ISO2709-encoded MARC and
+ <ulink url="&url.marcxml;">&acro.marcxml;</ulink>/MarcXchange.
+ The following attributes may be specified:
+
+ <variablelist>
+ <varlistentry><term><literal>inputformat</literal> (REQUIRED)</term>
+ <listitem>
+ <para>
+ Format of input. Supported values are
+ <literal>marc</literal> (for ISO2709); and <literal>xml</literal>
+ for MARCXML/MarcXchange.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term><literal>outputformat</literal> (REQUIRED)</term>
+ <listitem>
+ <para>
+ Format of output. Supported values are
+ <literal>line</literal> (MARC line format);
+ <literal>marcxml</literal> (for MARCXML),
+ <literal>marc</literal> (ISO2709),
+ <literal>marcxchange</literal> (for MarcXchange).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term><literal>inputcharset</literal> (OPTIONAL)</term>
+ <listitem>
+ <para>
+ Encoding of input. For XML input formats, this need not
+ be given, but for ISO2709 based inputformats, this should
+ be set to the encoding used. For MARC21 records, a common
+ inputcharset value would be <literal>marc-8</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry><term><literal>outputcharset</literal> (OPTIONAL)</term>
+ <listitem>
+ <para>
+ Encoding of output. If outputformat is XML based, it is
+ strongly recommended to use <literal>utf-8</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry><term><literal>xslt</literal></term>
+ <listitem>
+ <para>
+ The <literal>xslt</literal> element specifies a conversion
+ via &acro.xslt;. The following attributes may be specified:
+
+ <variablelist>
+ <varlistentry><term><literal>stylesheet</literal> (REQUIRED)</term>
+ <listitem>
+ <para>
+ Stylesheet file.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </sect2>
+ <sect2 id="tools.retrieval.examples">
+ <title>Retrieval Facility Examples</title>
+ <example id="tools.retrieval.marc21">
+ <title>MARC21 backend</title>
+ <para>
+ A typical way to use the retrieval facility is to enable XML
+ for servers that only support ISO2709 encoded MARC21 records.
+ </para>
+ <programlisting><![CDATA[
+ <retrievalinfo>
+ <retrieval syntax="usmarc" name="F"/>
+ <retrieval syntax="usmarc" name="B"/>
+ <retrieval syntax="xml" name="marcxml"
+ identifier="info:srw/schema/1/marcxml-v1.1">
+ <backend syntax="usmarc" name="F">
+ <marc inputformat="marc" outputformat="marcxml"
+ inputcharset="marc-8"/>
+ </backend>
+ </retrieval>
+ <retrieval syntax="xml" name="dc">
+ <backend syntax="usmarc" name="F">
+ <marc inputformat="marc" outputformat="marcxml"
+ inputcharset="marc-8"/>
+ <xslt stylesheet="MARC21slim2DC.xsl"/>
+ </backend>
+ </retrieval>
+ </retrievalinfo>
+]]>
+ </programlisting>
+ <para>
+ This means that our frontend supports:
+ <itemizedlist>
+ <listitem>
+ <para>
+ MARC21 F(ull) records.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ MARC21 B(rief) records.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ MARCXML records.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Dublin Core records.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </example>
+ </sect2>
+ <sect2 id="tools.retrieval.api">
+ <title>API</title>
+ <para>
+ It should be easy to use the retrieval systems from applications. Refer
+ to the headers
+ <filename>yaz/retrieval.h</filename> and
+ <filename>yaz/record_conv.h</filename>.
+ </para>
+ </sect2>
+ </sect1>