<?xml version="1.0" standalone="no"?>
-<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1//EN"
- "http://www.oasis-open.org/docbook/xml/4.1/docbookx.dtd"
+<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+ "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"
[
+ <!ENTITY copyright SYSTEM "copyright.xml">
<!ENTITY % local SYSTEM "local.ent">
%local;
<!ENTITY manref SYSTEM "manref.xml">
- <!ENTITY progref SYSTEM "progref.xml">
- <!ENTITY % common SYSTEM "common/common.ent">
- %common;
- <!-- Next line allows imagedata/@format="PDF" and is taken from
- http://lists.oasis-open.org/archives/docbook/200303/msg00163.html
- -->
- <!ENTITY % local.notation.class "| PDF">
- <!-- Next line is necessary for some XML parsers, for reasons I
- don't understand. I got this from
- http://lists.oasis-open.org/archives/docbook/200303/msg00180.html
- -->
- <!NOTATION PDF SYSTEM "PDF">
+ <!ENTITY gpl2 SYSTEM "gpl-2.0.xml">
+ <!ENTITY % idcommon SYSTEM "common/common.ent">
+ %idcommon;
]>
-<!-- $Id: book.xml,v 1.51 2007-01-18 09:24:47 marc Exp $ -->
-<book id="metaproxy">
+<book>
<bookinfo>
<title>Metaproxy - User's Guide and Reference</title>
<authorgroup>
</authorgroup>
<releaseinfo>&version;</releaseinfo>
<copyright>
- <year>2005-2007</year>
- <holder>Index Data ApS</holder>
+ <year>2005-2015</year>
+ <holder>Index Data</holder>
</copyright>
<abstract>
<simpara>
processes, interprets and redirects requests from IR clients using
standard protocols such as the binary
<ulink url="&url.z39.50;">ANSI/NISO Z39.50</ulink>
- and the information search and retireval
- web services <ulink url="&url.sru;">SRU</ulink>
- and <ulink url="&url.srw;">SRW</ulink>, as
- well as functioning as a limited
- <ulink url="&url.http;">HTTP</ulink> server.
+ and the information search and retrieval
+ web service <ulink url="&url.sru;">SRU</ulink>
+ as well as functioning as a limited
+ <ulink url="&url.http;">HTTP</ulink> server.
</simpara>
<simpara>
Metaproxy is configured by an XML file which
using the filter API.
</simpara>
<simpara>
- Metaproxy is <emphasis>not</emphasis> open-source software, but
- may be freely downloaded, unpacked, inspected, built and run for
- evaluation purposes. Deployment requires a separate, commercial,
- license.
+ Metaproxy is covered by the GNU General Public License version 2.
</simpara>
<simpara>
<inlinemediaobject>
<chapter id="introduction">
<title>Introduction</title>
-
-
+
<para>
<ulink url="&url.metaproxy;">Metaproxy</ulink>
is a stand alone program that acts as a universal router, proxy and
encapsulated metasearcher for information retrieval protocols such
- as <ulink url="&url.z39.50;">Z39.50</ulink>, and in the future
- <ulink url="&url.sru;">SRU</ulink> and <ulink url="&url.srw;">SRW</ulink>.
+ as <ulink url="&url.z39.50;">Z39.50</ulink> and
+ <ulink url="&url.sru;">SRU</ulink>.
To clients, it acts as a server of these protocols: it can be searched,
- records can be retrieved from it, etc.
+ records can be retrieved from it, etc.
To servers, it acts as a client: it searches in them,
retrieves records from them, etc. it satisfies its clients'
requests by transforming them, multiplexing them, forwarding them
</para>
</chapter>
- <chapter id="license">
- <title>The Metaproxy License</title>
- <orderedlist numeration="arabic">
- <listitem>
- <para>
- You are allowed to download this software for evaluation purposes.
- You can unpack it, build it, run it, see how it works and how it fits
- your needs, all at zero cost.
- </para>
- </listitem>
- <listitem>
- <para>
- You may NOT deploy the software. For the purposes of this license,
- deployment means running it for any purpose other than evaluation,
- whether or not you or anyone else makes a profit from doing so. If
- you wish to deploy the software, you must first contact Index Data and
- arrange to purchase a DEPLOYMENT LICENCE. If you are unsure
- whether or not your proposed use of the software constitutes
- deployment, email us at <literal>info@indexdata.com</literal>
- for clarification.
- </para>
- </listitem>
- <listitem>
- <para>
- You may modify your copy of the software (fix bugs, add features)
- if you need to. We encourage you to send your changes back to us for
- integration into the master copy, but you are not obliged to do so. You
- may NOT pass your changes on to any other party.
- </para>
- </listitem>
- <listitem>
- <para>
- There is NO WARRANTY for this software, to the extent permitted by
- applicable law. We provide the software ``as is'' without warranty of
- any kind, either expressed or implied, including, but not limited to, the
- implied warranties of MERCHANTABILITY and FITNESS FOR A
- PARTICULAR PURPOSE. The entire risk as to the quality and
- performance of the software is with you. Should the software prove
- defective, you assume the cost of all necessary servicing, repair or
- correction. In no event unless required by applicable law will we be
- liable to you for damages, arising out of the use of the software,
- including but not limited to loss of data or data being rendered
- inaccurate.
- </para>
- </listitem>
- <listitem>
- <para>
- All rights to the software are reserved by Index Data except where
- this license explicitly says otherwise.
- </para>
- </listitem>
- </orderedlist>
- </chapter>
-
<chapter id="installation">
<title>Installation</title>
<para>
</varlistentry>
<varlistentry><term><ulink url="&url.libxslt;">Libxslt</ulink></term>
<listitem>
- <para>This is an XSLT processor - based on
+ <para>This is an XSLT processor - based on
<ulink url="&url.libxml2;">Libxml2</ulink>. Both Libxml2 and
Libxslt must be installed with the development components
(header files, etc.) as well as the run-time libraries.
<listitem>
<para>
The popular C++ library. Initial versions of Metaproxy
- was built with 1.33.0. Version 1.33.1 works too.
+ were built with Boost 1.32, but this is no longer supported.
+ Metaproxy is known to work with Boost version 1.33 through 1.55.
</para>
</listitem>
</varlistentry>
</para>
<para>
We have successfully built Metaproxy using the compilers
- <ulink url="&url.gcc;">GCC</ulink> version 4.0 and
- <ulink url="&url.vstudio;">Microsoft Visual Studio</ulink> 2003/2005.
+ <ulink url="&url.gcc;">GCC</ulink> and
+ <ulink url="&url.vstudio;">Microsoft Visual Studio</ulink>.
</para>
+ <para>
+ Optionally, Metaproxy may also be compiled with
+ <ulink url="&url.usemarcon;">USEMARCON</ulink> support, which enables
+ MARC conversions in the <xref linkend="ref-record_transform"/> filter.
+ </para>
<section id="installation.unix">
<title>Installation on Unix (from Source)</title>
<para>
tools binary packages. If, for example, Libxml2/libxslt are already
installed as development packages use those (and omit compilation).
</para>
-
- <para>
- Libxml2/libxslt:
- </para>
- <screen>
- gunzip -c libxml2-version.tar.gz|tar xf -
- cd libxml2-version
- ./configure
- make
- su
- make install
- </screen>
- <screen>
- gunzip -c libxslt-version.tar.gz|tar xf -
- cd libxslt-version
- ./configure
- make
- su
- make install
- </screen>
- <para>
- YAZ/YAZ++:
- </para>
- <screen>
- gunzip -c yaz-version.tar.gz|tar xf -
- cd yaz-version
- ./configure
- make
- su
- make install
- </screen>
- <screen>
- gunzip -c yazpp-version.tar.gz|tar xf -
- cd yazpp-version
- ./configure
- make
- su
- make install
- </screen>
- <para>
- Boost:
- </para>
- <screen>
- gunzip -c boost-version.tar.gz|tar xf -
- cd boost-version
- ./configure
- make
- su
- make install
- </screen>
- <para>
- Metaproxy:
- </para>
- <screen>
- gunzip -c metaproxy-version.tar.gz|tar xf -
- cd metaproxy-version
- ./configure
- make
- su
- make install
- </screen>
+
+ <note>
+ <para>
+ <ulink url="&url.usemarcon;">USEMARCON</ulink> is not available
+ as a package at the moment, so Metaproxy must be built from source
+ if that is to be used.
+ </para>
+ </note>
+
+ <section id="libxml2.fromsource">
+ <title>Libxml2/libxslt</title>
+ <para>
+ Libxml2/libxslt:
+ </para>
+ <screen>
+ gunzip -c libxml2-version.tar.gz|tar xf -
+ cd libxml2-version
+ ./configure
+ make
+ su
+ make install
+ </screen>
+ <screen>
+ gunzip -c libxslt-version.tar.gz|tar xf -
+ cd libxslt-version
+ ./configure
+ make
+ su
+ make install
+ </screen>
+ </section>
+ <section id="usemarcon">
+ <title>USEMARCON (optional)</title>
+ <screen>
+ gunzip -c usemarcon317.tar.gz|tar xf -
+ cd usemarcon317
+ ./configure
+ make
+ su
+ make install
+ </screen>
+ </section>
+
+ <section id="yaz.fromsource">
+ <title>YAZ/YAZ++</title>
+ <screen>
+ gunzip -c yaz-version.tar.gz|tar xf -
+ cd yaz-version
+ ./configure
+ make
+ su
+ make install
+ </screen>
+ <screen>
+ gunzip -c yazpp-version.tar.gz|tar xf -
+ cd yazpp-version
+ ./configure
+ make
+ su
+ make install
+ </screen>
+ </section>
+ <section id="boost.fromsource">
+ <title>Boost</title>
+ <para>
+ Metaproxy needs the components thread, test and regex from
+ Boost.
+ </para>
+ <screen>
+ gunzip -c boost-version.tar.gz|tar xf -
+ cd boost-version
+ ./configure --with-libraries=thread,test,regex --with-toolset=gcc
+ make
+ su
+ make install
+ </screen>
+ <para>
+ Under the hood, configure uses bjam, which can also be invoked directly:
+ </para>
+ <screen>
+ ./bjam --toolset=gcc --with-thread --with-test --with-regex stage
+ </screen>
+ <para>
+ Replace <literal>stage</literal> with <literal>clean</literal> or
+ <literal>install</literal> to clean or install, respectively.
+ </para>
+ <para>
+ Add <literal>--prefix=DIR</literal> to install Boost under a
+ prefix other than <literal>/usr/local</literal>.
+ </para>
+ </section>
+ <section id="metaproxy.fromsource">
+ <title>Metaproxy</title>
+ <screen>
+ gunzip -c metaproxy-version.tar.gz|tar xf -
+ cd metaproxy-version
+ ./configure
+ make
+ su
+ make install
+ </screen>
+ <para>
+ You may have to tell configure where Boost is installed by supplying
+ the options <literal>--with-boost</literal> and <literal>--with-boost-toolset</literal>.
+ The former sets the PREFIX for Boost (the same as <literal>--prefix</literal> for Boost above).
+ The latter sets the compiler toolset (e.g. gcc34).
+ </para>
+ <para>
+ Pass <literal>--help</literal> to configure to get a list of
+ available options.
+ </para>
+ </section>
</section>
<section id="installation.debian">
<title>Installation on Debian GNU/Linux</title>
<para>
- All dependencies for Metaproxy are available as
- <ulink url="&url.debian;">Debian</ulink>
- packages for the sarge (stable in 2005) and etch (testing in 2005)
- distributions.
+ All dependencies for Metaproxy are available as
+ <ulink url="&url.debian;">Debian</ulink> packages.
</para>
<para>
The procedures for Debian based systems, such as
</para>
<para>
There is currently no official Debian package for YAZ++.
- And the Debian package for YAZ is probably too old.
+ The official Debian package for YAZ is probably too old.
+ However, Index Data builds newer versions of both for Debian (i386 and amd64 only).
+ </para>
+ <para>
Update the <filename>/etc/apt/sources.list</filename>
to include the Index Data repository.
See YAZ' <ulink url="&url.yaz.download.debian;">Download Debian</ulink>
</para>
<screen>
apt-get install libxslt1-dev
- apt-get install libyazpp-dev
+ apt-get install libyazpp6-dev
apt-get install libboost-dev
+ apt-get install libboost-system-dev
apt-get install libboost-thread-dev
- apt-get install libboost-date-time-dev
- apt-get install libboost-program-options-dev
apt-get install libboost-test-dev
+ apt-get install libboost-regex-dev
</screen>
<para>
With these packages installed, the usual configure + make
</para>
</section>
+ <section id="installation.rpm">
+ <title>Installation on RPM based Linux Systems</title>
+ <para>
+ All external dependencies for Metaproxy are available as
+ RPM packages, either from your distribution site, or from the
+ <ulink url="http://fr.rpmfind.net/">RPMfind</ulink> site.
+ </para>
+ <para>
+ For example, the required Boost C++ development
+ libraries can be installed on Red Hat Fedora Core 4 and 5 like this:
+ <screen>
+ wget ftp://fr.rpmfind.net/wlinux/fedora/core/updates/testing/4/SRPMS/boost-1.33.0-3.fc4.src.rpm
+ sudo rpmbuild --buildroot src/ --rebuild -p fc4/boost-1.33.0-3.fc4.src.rpm
+ sudo rpm -U /usr/src/redhat/RPMS/i386/boost-*rpm
+ </screen>
+ </para>
+ <para>
+ The <ulink url="&url.yaz;">YAZ</ulink> library is needed to
+ compile &metaproxy;; see the YAZ pages
+ for more information on available RPM packages.
+ </para>
+ <para>
+ There is currently no official RPM package for YAZ++.
+ See the <ulink url="&url.yazplusplus;">YAZ++</ulink> pages
+ for more information on a Unix tarball install.
+ </para>
+ <para>
+ With these packages installed, the usual configure + make
+ procedure can be used for Metaproxy as outlined in
+ <xref linkend="installation.unix"/>.
+ </para>
+ </section>
+
<section id="installation.windows">
<title>Installation on Windows</title>
<para>
- Metaproxy can be compiled with Microsoft
+ Metaproxy has been tested with Microsoft
<ulink url="&url.vstudio;">Visual Studio</ulink>.
- Version 2003 (C 7.1) and 2005 (C 8.0) is known to work.
+ Version 2013 (C 12.0) is known to work.
</para>
<section id="installation.windows.boost">
<title>Boost</title>
<para>
- Get Boost from its <ulink url="&url.boost;">home page</ulink>.
- You also need Boost Jam (an alternative to make).
- That's also available from the Boost home page.
- The files to be downloaded are called something like:
- <filename>boost_1_33-1.exe</filename>
- and
- <filename>boost-jam-3.1.12-1-ntx86.zip</filename>.
- Unpack Boost Jam first. Put <filename>bjam.exe</filename>
- in your system path. Make a command prompt and ensure
- it can be found automatically. If not check the PATH.
- The Boost .exe is a self-extracting exe with
- complete source for Boost. Compile that source with
- Boost Jam (An alternative to Make).
- The compilation takes a while.
- For Visual Studio 2003, use
- <screen>
- bjam "-sTOOLS=vc-7_1"
- </screen>
- Here <literal>vc-7_1</literal> refers to a "Toolset" (compiler system).
- For Visual Studio 2005, use
- <screen>
- bjam "-sTOOLS=vc-8_0"
- </screen>
- To install the libraries in a common place, use
- <screen>
- bjam "-sTOOLS=vc-7_1" install
- </screen>
- (or vc-8_0 for VS 2005).
- </para>
- <para>
- By default, the Boost build process installs the resulting
- libraries + header files in
- <literal>\boost\lib</literal>, <literal>\boost\include</literal>.
+ For Windows, it's easiest to get the precompiled Boost
+ package from <ulink url="&url.boost.windows.download;">here</ulink>.
+ Several versions of the Boost libraries may be selected when
+ installing Boost for Windows. Please choose at least the
+ <emphasis>multithreaded</emphasis> (non-DLL) version because
+ the Metaproxy makefile uses that.
</para>
<para>
For more information about installing Boost refer to the
<para>
<ulink url="&url.libxslt;">Libxslt</ulink> can be downloaded
for Windows from
- <ulink url="&url.libxml2.download.win32;">here</ulink>.
+ <ulink url="&url.libxml2.download.windows;">here</ulink>.
</para>
<para>
- Libxslt has other dependencies, but these can all be downloaded
- from the same site. Get the following:
- iconv, zlib, libxml2, libxslt.
+ Libxslt also requires libxml2 to operate.
</para>
</section>
<title>YAZ++</title>
<para>
Get <ulink url="&url.yazplusplus;">YAZ++</ulink> as well.
- Version 1.0 or later is required. For now get it from
- Index Data's
- <ulink url="&url.snapshot.download;">Snapshot area</ulink>.
+ Version 1.6.0 or later is required.
</para>
<para>
YAZ++ includes NMAKE makefiles, similar to those found in the
</para>
</listitem>
</varlistentry>
-
+
</variablelist>
-
+
<para>
After successful compilation you'll find
<literal>metaproxy.exe</literal> in the
</section>
</chapter>
-
+
+<chapter id="yazproxy-comparison">
+ <title>YAZ Proxy Comparison</title>
+ <para>
+ The table below lists facilities supported by either
+ <ulink url="&url.yazproxy;">YAZ Proxy</ulink> or Metaproxy.
+ </para>
+<table id="yazproxy-comparison-table">
+ <title>Metaproxy / YAZ Proxy comparison</title>
+ <tgroup cols="3">
+ <thead>
+ <row>
+ <entry>Facility</entry>
+ <entry>Metaproxy</entry>
+ <entry>YAZ Proxy</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>Z39.50 server</entry>
+ <entry>Using filter <xref linkend="ref-frontend_net"/></entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>SRU server</entry>
+ <entry>Supported with filter <xref linkend="ref-sru_z3950"/></entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>Z39.50 client</entry>
+ <entry>Supported with filter <xref linkend="ref-z3950_client"/></entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>SRU client</entry>
+ <entry>Supported with filter <xref linkend="ref-zoom"/></entry>
+ <entry>Unsupported</entry>
+ </row>
+ <row>
+ <entry>Connection reuse</entry>
+ <entry>Supported with filter <literal>session_shared</literal></entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>Connection share</entry>
+ <entry>Supported with filter <literal>session_shared</literal></entry>
+ <entry>Unsupported</entry>
+ </row>
+ <row>
+ <entry>Result set reuse</entry>
+ <entry>Supported with filter <literal>session_shared</literal></entry>
+ <entry>Within one Z39.50 session / HTTP keep-alive</entry>
+ </row>
+ <row>
+ <entry>Record cache</entry>
+ <entry>Supported by filter <literal>session_shared</literal></entry>
+ <entry>Supported for the last result set within one Z39.50 session / HTTP keep-alive</entry>
+ </row>
+ <row>
+ <entry>Z39.50 Virtual database, i.e. select any Z39.50 target for database</entry>
+ <entry>Supported with filter <literal>virt_db</literal></entry>
+ <entry>Unsupported</entry>
+ </row>
+ <row>
+ <entry>SRU Virtual database, i.e. select any Z39.50 target for path</entry>
+ <entry>Supported with filter <literal>virt_db</literal>,
+ <literal>sru_z3950</literal></entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>Multi target search</entry>
+ <entry>Supported with filter <literal>multi</literal> (round-robin)</entry>
+ <entry>Unsupported</entry>
+ </row>
+ <row>
+ <entry>Retrieval and search limits</entry>
+ <entry>Supported using filter <literal>limit</literal></entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>Bandwidth limits</entry>
+ <entry>Supported using filter <literal>limit</literal></entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>Connect limits</entry>
+ <entry>Supported by filter <literal>frontend_net</literal> (connect-max)</entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>Retrieval sanity check and conversions</entry>
+ <entry>Supported using filter <literal>record_transform</literal></entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>Query check</entry>
+ <entry>
+ Supported by <literal>query_rewrite</literal>, which may check
+ a query and throw diagnostics (errors)
+ </entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>Query rewrite</entry>
+ <entry>Supported with <literal>query_rewrite</literal></entry>
+ <entry>Unsupported</entry>
+ </row>
+ <row>
+ <entry>Session invalidate for -1 hits</entry>
+ <entry>Unsupported</entry>
+ <entry>Supported</entry>
+ </row>
+ <row>
+ <entry>Architecture</entry>
+ <entry>Multi-threaded + select for networked modules (such as
+ <literal>frontend_net</literal>)</entry>
+ <entry>Single-threaded using select</entry>
+ </row>
+
+ <row>
+ <entry>Extensibility</entry>
+ <entry>Most functionality implemented as loadable modules</entry>
+ <entry>Unsupported and experimental</entry>
+ </row>
+
+ <row>
+ <entry><ulink url="&url.usemarcon;">USEMARCON</ulink></entry>
+ <entry>Supported with <literal>record_transform</literal></entry>
+ <entry>Supported</entry>
+ </row>
+
+ <row>
+ <entry>Portability</entry>
+ <entry>
+ Requires YAZ, YAZ++ and a modern C++ compiler supporting
+ <ulink url="&url.boost;">Boost</ulink>.
+ </entry>
+ <entry>
+ Requires YAZ and YAZ++.
+ STL is not required so pretty much any C++ compiler out there should work.
+ </entry>
+ </row>
+
+ </tbody>
+ </tgroup>
+</table>
+</chapter>
+
<chapter id="architecture">
<title>The Metaproxy Architecture</title>
<para>
plugins that provide new filters. The filter API is small and
conceptually simple, but there are many details to master. See
the section below on
- <link linkend="extensions">extensions</link>.
+ <link linkend="filters">Filters</link>.
</para>
</listitem>
</varlistentry>
<chapter id="filters">
<title>Filters</title>
-
-
- <section>
+
+
+ <section id="filters-introductory-notes">
<title>Introductory notes</title>
<para>
It's useful to think of Metaproxy as an interpreter providing a small
others are sinks: they consume packages and return a result
(<literal>backend_test</literal>,
<literal>bounce</literal>,
- <literal>http_file</literal>,
+ <literal>http_file</literal>,
<literal>z3950_client</literal>);
the others are true filters, that read, process and pass on the
packages they are fed
<literal>virt_db</literal>).
</para>
</section>
-
-
+
+
<section id="overview.filter.types">
<title>Overview of filter types</title>
<para>
<para>
The filters are here listed in alphabetical order:
</para>
-
+
<!--
### New filters:
-->
- <section>
+ <section id="auth_simple">
<title><literal>auth_simple</literal>
(mp::filter::AuthSimple)</title>
<para>
the user.
</para>
</section>
-
- <section>
+
+ <section id="backend_test">
<title><literal>backend_test</literal>
(mp::filter::Backend_test)</title>
<para>
even read this section.
</para>
</section>
-
- <section>
+
+ <section id="bounce">
<title><literal>bounce</literal>
(mp::filter::Bounce)</title>
<para>
- A sink that swallows <emphasis>all packages</emphasis>,
+ A sink that swallows <emphasis>all packages</emphasis>,
and returns them almost unprocessed.
It never sends any package of any type further down the row, but
sets Z39.50 packages to Z_Close, and HTTP_Request packages to
HTTP_Response err code 400 packages, and adds a suitable bounce
- message.
- The bounce filter is usually added at end of each filter chain
- config.xml to prevent infinite hanging of for example HTTP
- requests packages when only the Z39.50 client partial sink
+ message.
+ The bounce filter is usually added at the end of each route
+ to prevent packages, for example HTTP request packages, from
+ hanging indefinitely when only the Z39.50 client partial sink
filter is found in the
- route.
+ route.
</para>
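+ <para>
+ A minimal sketch of such a route is shown below. It assumes that
+ filters with the illustrative ids <literal>frontend</literal>
+ (type <literal>frontend_net</literal>) and <literal>backend</literal>
+ (type <literal>z3950_client</literal>) are defined in the
+ <literal>filters</literal> section of the configuration:
+ </para>
+ <screen><![CDATA[
+ <route id="start">
+ <filter refid="frontend"/>
+ <filter refid="backend"/>
+ <!-- last in the route: answers packages that z3950_client passed on untouched -->
+ <filter type="bounce"/>
+ </route>
+]]></screen>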
</section>
-
- <section>
+
+ <section id="cql_rpn">
+ <title><literal>cql_rpn</literal>
+ (mp::filter::CQLtoRPN)</title>
+ <para>
+ A query language transforming filter which catches Z39.50
+ <literal>searchRequest</literal>
+ packages containing <literal>CQL</literal> queries, transforms
+ those to <literal>RPN</literal> queries,
+ and sends the <literal>searchRequests</literal> on to the next
+ filters. Among other things, it is useful in an SRU context.
+ </para>
+ </section>
+
+ <section id="frontend_net">
<title><literal>frontend_net</literal>
(mp::filter::FrontendNet)</title>
<para>
</para>
</section>
- <section>
+ <section id="http_file">
<title><literal>http_file</literal>
(mp::filter::HttpFile)</title>
<para>
- A partial sink which swallows only HTTP_Request packages, and
+ A partial sink which swallows only
+ <literal>HTTP_Request</literal> packages, and
returns the contents of files from the local
- filesystem in response to HTTP requests.
+ filesystem in response to HTTP requests.
It lets Z39.50 packages and all other forthcoming package types
- pass untouched.
+ pass untouched.
(Yes, Virginia, this
does mean that Metaproxy is also a Web-server in its spare time. So
far it does not contain either an email-reader or a Lisp
interpreter, but that day is surely coming.)
</para>
</section>
-
- <section>
+
+ <section id="load_balance">
<title><literal>load_balance</literal>
(mp::filter::LoadBalance)</title>
<para>
Performs load balancing for incoming Z39.50 init requests.
It is used together with the <literal>virt_db</literal> filter,
but unlike the <literal>multi</literal> filter it does send an
- entire session to only one of the virtual backends. The
+ entire session to only one of the virtual backends. The
<literal>load_balance</literal> filter is assuming that
all backend targets have equal content, and chooses the backend
with least load cost for a new session.
</warning>
</para>
</section>
-
- <section>
+
+ <section id="log">
<title><literal>log</literal>
(mp::filter::Log)</title>
<para>
</para>
</section>
- <section>
+ <section id="multi">
<title><literal>multi</literal>
(mp::filter::Multi)</title>
<para>
of virtual databases and multi-database searching below.
</para>
</section>
-
- <section>
+
+ <section id="query_rewrite">
<title><literal>query_rewrite</literal>
(mp::filter::QueryRewrite)</title>
<para>
- Rewrites Z39.50 Type-1 and Type-101 (``RPN'') queries by a
+ Rewrites Z39.50 <literal>Type-1</literal>
+ and <literal>Type-101</literal> (``<literal>RPN</literal>'')
+ queries by a
three-step process: the query is transliterated from Z39.50
packet structures into an XML representation; that XML
representation is transformed by an XSLT stylesheet; and the
structure.
</para>
</section>
-
-
- <section>
+
+
+ <section id="record_transform">
<title><literal>record_transform</literal>
(mp::filter::RecordTransform)</title>
<para>
</para>
</section>
- <section>
+ <section id="session_shared">
<title><literal>session_shared</literal>
(mp::filter::SessionShared)</title>
<para>
- When this is finished, it will implement global sharing of
+ This filter implements global sharing of
result sets (i.e. between threads and therefore between
- clients), yielding performance improvements especially when
- incoming requests are from a stateless environment such as a
- web-server, in which the client process representing a session
- might be any one of many. However:
+ clients), yielding performance improvements by clever resource
+ pooling.
</para>
- <warning>
- <para>
- This filter is not yet completed.
- </para>
- </warning>
</section>
- <section>
+ <section id="sru_z3950">
<title><literal>sru_z3950</literal>
(mp::filter::SRUtoZ3950)</title>
<para>
messages.
The <literal>sru_z3950</literal> filter processes also SRU
GET/POST/SOAP explain requests, returning
- either the absolute minimum required by the standard, or a full
+ either the absolute minimum required by the standard, or a full
pre-defined ZeeReX explain record.
- See the
+ See the
<ulink url="&url.zeerex.explain;">ZeeReX Explain</ulink>
- standard pages and the
+ standard pages and the
<ulink url="&url.sru.explain;">SRU Explain</ulink> pages
for more information on the correct explain syntax.
SRU scan requests are not supported yet.
</para>
</section>
-
- <section>
+
+ <section id="template">
<title><literal>template</literal>
(mp::filter::Template)</title>
<para>
intended for civilians.
</para>
</section>
-
- <section>
+
+ <section id="virt_db">
<title><literal>virt_db</literal>
(mp::filter::VirtualDB)</title>
<para>
of virtual databases and multi-database searching below.
</para>
</section>
-
- <section>
+
+ <section id="z3950_client">
<title><literal>z3950_client</literal>
(mp::filter::Z3950Client)</title>
<para>
the route. Subsequent requests are sent to the same address,
which is remembered at Init time in a Session object.
HTTP_Request packages and all other forthcoming package types
- are passed untouched.
+ are passed untouched.
</para>
</section>
- <section>
+ <section id="zeerex_explain">
<title><literal>zeerex_explain</literal>
(mp::filter::ZeerexExplain)</title>
<para>
Z39.50 explain requests, returning a static ZeeReX
Explain XML record from the config section. All other packages
are passed through.
- See the
+ See the
<ulink url="&url.zeerex.explain;">ZeeReX Explain</ulink>
standard pages
for more information on the correct explain syntax.
</para>
</warning>
</section>
-
+
</section>
-
-
+
+
<section id="future.directions">
<title>Future directions</title>
<para>
</variablelist>
</section>
</chapter>
-
-
-
+
+
+
<chapter id="configuration">
<title>Configuration: the Metaproxy configuration file format</title>
-
-
- <section>
+
+
+ <section id="configuration-introductory-notes">
<title>Introductory notes</title>
<para>
If Metaproxy is an interpreter providing operations on packages, then
Metaproxy.)
</para>
</section>
-
+
<section id="overview.xml.structure">
<title>Overview of the config file XML structure</title>
<para>
<metaproxy xmlns="http://indexdata.com/metaproxy" version="1.0">
</screen>
<para>
- The top-level element is <metaproxy>. This contains a
- <start> element, a <filters> element and a
- <routes> element, in that order. <filters> is
- optional; the other two are mandatory. All three are
- non-repeatable.
+ The top-level element is <metaproxy>. This contains
+ a <dlpath> element,
+ a <start> element,
+ a <filters> element and
+ a <routes> element, in that order. <dlpath> and
+ <filters> are optional; the other two are mandatory.
+ All four are non-repeatable.
+ </para>
+ <para>
+ The <dlpath> element contains text which
+ specifies the location of filter modules. This is only needed
+ if Metaproxy must load third-party filters (most filters shipped with
+ Metaproxy are built into the Metaproxy application).
</para>
<para>
The <start> element is empty, but carries a
</para>
<screen><![CDATA[<?xml version="1.0"?>
<metaproxy xmlns="http://indexdata.com/metaproxy" version="1.0">
+ <dlpath>/usr/lib/metaproxy/modules</dlpath>
<start route="start"/>
<filters>
<filter id="frontend" type="frontend_net">
<filter id="backend" type="z3950_client">
</filter>
</filters>
- <routes>
+ <routes>
<route id="start">
<filter refid="frontend"/>
<filter type="log"/>
mutton, beef and trout packages.
When the response arrives, it is handed
back to the <literal>log</literal> filter, which emits another
- message; and then to the <literal>frontend_net</literal> filter,
+ message; and then to the <literal>frontend_net</literal> filter,
which returns the response to the client.
</para>
</section>
- <section id="checking.xml.syntax">
+
+ <section id="config-file-modularity">
+ <title>Config file modularity</title>
+ <para>
+ Metaproxy XML configuration snippets can be reused by other
+ filters using the <literal>XInclude</literal> standard, as seen in
+ the <literal>/etc/config-sru-to-z3950.xml</literal> example SRU
+ configuration.
+ <screen><![CDATA[
+ <filter id="sru" type="sru_z3950">
+ <database name="Default">
+ <xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
+ href="explain.xml"/>
+ </database>
+ </filter>
+]]></screen>
+ </para>
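+ <para>
+ As an illustration only, the included <literal>explain.xml</literal>
+ could be a small ZeeReX record along the following lines. The host
+ and port values are placeholders, and the element names are taken
+ from the ZeeReX Explain format rather than from this guide, so check
+ them against the ZeeReX documentation before relying on them:
+ </para>
+ <screen><![CDATA[
+ <explain xmlns="http://explain.z3950.org/dtd/2.0/">
+ <serverInfo>
+ <!-- placeholder values: replace with the address clients connect to -->
+ <host>localhost</host>
+ <port>9000</port>
+ <database>Default</database>
+ </serverInfo>
+ </explain>
+]]></screen>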
+ </section>
+
+ <section id="config-file-syntax-check">
<title>Config file syntax checking</title>
<para>
The distribution contains RelaxNG Compact and XML syntax checking
files, as well as XML Schema files. These are found in the
- distribution paths
+ distribution paths
<screen>
xml/schema/metaproxy.rnc
xml/schema/metaproxy.rng
configuration files. For example, using the utility
<filename>xmllint</filename>, syntax checking is done like this:
<screen>
- xmllint --noout --schema xml/schema/metaproxy.xsd etc/config-local.xml
- xmllint --noout --relaxng xml/schema/metaproxy.rng etc/config-local.xml
+ xmllint --noout --schema xml/schema/metaproxy.xsd etc/config-local.xml
+ xmllint --noout --relaxng xml/schema/metaproxy.rng etc/config-local.xml
</screen>
(A recent version of <literal>libxml2</literal> is required, as
support for XML Schemas is a relatively recent addition.)
<title>Virtual databases and multi-database searching</title>
- <section>
+ <section id="multidb-introductory-notes">
<title>Introductory notes</title>
<para>
Two of Metaproxy's filters are concerned with multiple-database
<screen><![CDATA[<filter type="virt_db">
<virtual>
<database>lc</database>
- <target>z3950.loc.gov:7090/voyager</target>
+ <target>lx2.loc.gov:210/LCDB_MARC8</target>
</virtual>
<virtual>
<database>marc</database>
<filter type="virt_db">
<virtual>
<database>lc</database>
- <target>z3950.loc.gov:7090/voyager</target>
+ <target>lx2.loc.gov:210/LCDB_MARC8</target>
</virtual>
<virtual>
<database>marc</database>
</virtual>
<virtual>
<database>all</database>
- <target>z3950.loc.gov:7090/voyager</target>
+ <target>lx2.loc.gov:210/LCDB_MARC8</target>
<target>indexdata.com/marc</target>
</virtual>
</filter>
merges them into a single Search response, which is what
eventually makes it back to the client.
</para>
- </section>
-
- <section id="multidb.picture">
- <title>A picture is worth a thousand words (but only five hundred on 64-bit architectures)</title>
- <simpara>
- <inlinemediaobject>
+ <mediaobject>
<imageobject>
<imagedata fileref="multi.pdf" format="PDF" scale="50"/>
</imageobject>
document.]
</phrase>
</textobject>
-<!-- ### This used to work with an older version of DocBook
<caption>
- <para>Caption: progress of packages through filters.</para>
+ <para>A picture is worth a thousand words (but only five hundred on 64-bit architectures)</para>
</caption>
--->
- </inlinemediaobject>
- </simpara>
+ </mediaobject>
</section>
</chapter>
+ <chapter id="sru-server">
+ <title>Combined SRU web service and Z39.50 server configuration</title>
+ <para>
+ Metaproxy can act as an
+ <ulink url="&url.sru;">SRU</ulink>
+ web service server, which translates web service requests to
+ <ulink url="&url.z39.50;">ANSI/NISO Z39.50</ulink> packages and
+ sends them on to commonly available targets.
+ </para>
+ <para>
+ A typical setup for this operation needs a filter route including the
+ following modules:
+ </para>
+
+ <table id="sru-server-table-config" frame="top">
+ <title>SRU/Z39.50 Server Filter Route Configuration</title>
+ <tgroup cols="3">
+ <thead>
+ <row>
+ <entry>Filter</entry>
+ <entry>Importance</entry>
+ <entry>Purpose</entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry><literal>frontend_net</literal></entry>
+ <entry>required</entry>
+ <entry>Accepting HTTP connections and passing them to the following
+ filters. Since this filter also accepts Z39.50 connections, the
+ server works as an SRU and Z39.50 server on the same port.</entry>
+ </row>
+ <row>
+ <entry><literal>sru_z3950</literal></entry>
+ <entry>required</entry>
+ <entry>Accepting SRU GET/POST/SOAP explain and
+ searchRetrieve requests for the configured databases.
+ Explain requests are directly served from the static XML configuration.
+ SearchRetrieve requests are
+ transformed to Z39.50 search and present packages.
+ All other HTTP and Z39.50 packages are passed unaltered.</entry>
+ </row>
+ <row>
+ <entry><literal>http_file</literal></entry>
+ <entry>optional</entry>
+ <entry>Serving HTTP requests from the filesystem. This is only
+ needed if the server should serve XSLT stylesheets, static HTML
+ files or JavaScript for thin browser-based clients.
+ Z39.50 packages are passed unaltered.</entry>
+ </row>
+ <row>
+ <entry><literal>cql_rpn</literal></entry>
+ <entry>required</entry>
+ <entry>Usually, Z39.50 servers do not talk CQL, hence the
+ translation of the CQL query language to RPN is mandatory in
+ most cases. Affects only Z39.50 search packages.</entry>
+ </row>
+ <row>
+ <entry><literal>record_transform</literal></entry>
+ <entry>optional</entry>
+ <entry>Some Z39.50 backend targets cannot present XML record
+ syntaxes in commonly wanted element sets. Using this filter, one
+ can transform binary MARC records to MARCXML records, and
+ further transform those to any needed XML schema/format by XSLT
+ transformations. Changes only Z39.50 present packages.</entry>
+ </row>
+ <row>
+ <entry><literal>session_shared</literal></entry>
+ <entry>optional</entry>
+ <entry>The stateless nature of web services requires frequent
+ re-searching of the same targets for display of paged result set
+ records. This might be an unacceptable burden for the accessed
+ backend Z39.50 targets, and this module can be added for
+ efficient backend target resource pooling.</entry>
+ </row>
+ <row>
+ <entry><literal>z3950_client</literal></entry>
+ <entry>required</entry>
+ <entry>Finally, a Z39.50 package sink is needed in the filter
+ chain to provide the response packages. The Z39.50 client module
+ is used to access external targets over the network, but any
+ future local Z39.50 package sink could be used instead.</entry>
+ </row>
+ <row>
+ <entry><literal>bounce</literal></entry>
+ <entry>required</entry>
+ <entry>Any Metaproxy package arriving here did not do so on
+ purpose, and is bounced back with connection closure. This
+ prevents packages from hanging indefinitely inside the SRU server.</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+ <para>
+ A typical minimal example <ulink url="&url.sru;">SRU</ulink>
+ server configuration file is found in the tarball distribution at
+ <literal>etc/config-sru-to-z3950.xml</literal>.
+ </para>
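+ <para>
+ As a rough sketch only, a route built from just the required filters
+ in the table above might be laid out as follows. The port number, the
+ CQL conversion file name and the timeout are placeholder assumptions
+ rather than values taken from the distribution, and target selection
+ is not shown; the shipped
+ <literal>etc/config-sru-to-z3950.xml</literal> remains the
+ authoritative example.
+ <screen><![CDATA[
+ <metaproxy xmlns="http://indexdata.com/metaproxy" version="1.0">
+ <start route="start"/>
+ <routes>
+ <route id="start">
+ <!-- accepts both HTTP (SRU) and Z39.50 connections -->
+ <filter type="frontend_net">
+ <port>@:9000</port> <!-- placeholder port -->
+ </filter>
+ <!-- serves explain, turns searchRetrieve into Z39.50 packages -->
+ <filter type="sru_z3950">
+ <database name="Default">
+ <xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
+ href="explain.xml"/>
+ </database>
+ </filter>
+ <!-- translates CQL to RPN; the file name is a placeholder -->
+ <filter type="cql_rpn">
+ <conversion file="cql2pqf.txt"/>
+ </filter>
+ <!-- package sink talking to the backend targets -->
+ <filter type="z3950_client">
+ <timeout>30</timeout> <!-- placeholder value -->
+ </filter>
+ <!-- anything that gets this far is bounced back -->
+ <filter type="bounce"/>
+ </route>
+ </routes>
+ </metaproxy>
+]]></screen>
+ The optional <literal>http_file</literal>,
+ <literal>record_transform</literal> and
+ <literal>session_shared</literal> filters would be placed in the
+ positions given by the table order.
+ </para>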
+ <para>
+ Of course, any other Metaproxy modules can be integrated into an
+ SRU server solution, including, but not limited to, load balancing,
+ multiple target querying
+ (see <xref linkend="multidb"/>), and complex RPN query rewrites.
+ </para>
+
+
+ </chapter>
+ <!--
<chapter id="extensions">
<title>Writing extensions for Metaproxy</title>
<para>### To be written</para>
</chapter>
-
+ -->
<title>Classes in the Metaproxy source code</title>
- <section>
+ <section id="classes-introductory-notes">
<title>Introductory notes</title>
<para>
<emphasis>Stop! Do not read this!</emphasis>
parentheses.
</para>
- <section>
+ <section id="class-FactoryFilter">
<title><literal>mp::FactoryFilter</literal>
(<filename>factory_filter.cpp</filename>)</title>
<para>
</para>
</section>
- <section>
+ <section id="class-FactoryStatic">
<title><literal>mp::FactoryStatic</literal>
(<filename>factory_static.cpp</filename>)</title>
<para>
</para>
</section>
- <section>
+ <section id="class-filter-Base">
<title><literal>mp::filter::Base</literal>
(<filename>filter.cpp</filename>)</title>
<para>
</para>
</section>
- <section>
+ <section id="class-AuthSimple">
<title><literal>mp::filter::AuthSimple</literal>,
<literal>Backend_test</literal>, etc.
(<filename>filter_auth_simple.cpp</filename>,
</itemizedlist>
</section>
- <section>
+ <section id="class-Package">
<title><literal>mp::Package</literal>
(<filename>package.cpp</filename>)</title>
<para>
</para>
</section>
- <section>
+ <section id="class-Pipe">
<title><literal>mp::Pipe</literal>
(<filename>pipe.cpp</filename>)</title>
<para>
</para>
</section>
- <section>
+ <section id="class-RouterChain">
<title><literal>mp::RouterChain</literal>
(<filename>router_chain.cpp</filename>)</title>
<para>
</para>
</section>
- <section>
+ <section id="class-RouterFleXML">
<title><literal>mp::RouterFleXML</literal>
(<filename>router_flexml.cpp</filename>)</title>
<para>
</para>
</section>
- <section>
+ <section id="class-Session">
<title><literal>mp::Session</literal>
(<filename>session.cpp</filename>)</title>
<para>
</para>
</section>
- <section>
+ <section id="class-ThreadPoolSocketObserver">
<title><literal>mp::ThreadPoolSocketObserver</literal>
(<filename>thread_pool_observer.cpp</filename>)</title>
<para>
</para>
</section>
- <section>
+ <section id="class-util">
<title><literal>mp::util</literal>
(<filename>util.cpp</filename>)</title>
<para>
</para>
</section>
- <section>
+ <section id="class-xml">
<title><literal>mp::xml</literal>
(<filename>xmlutil.cpp</filename>)</title>
<para>
</para>
</section>
</chapter>
-
-
+
+
<reference id="reference">
<title>Reference</title>
- <partintro>
+ <partintro id="reference-introduction">
<para>
The material in this chapter is drawn directly from the individual
manual entries. In particular, the Metaproxy invocation section is
</partintro>
&manref;
</reference>
+
+<appendix id="license">
+ <title>License</title>
+
+ &copyright;
+
+ <para>
+ Metaproxy is free software; you can redistribute it and/or modify it under
+ the terms of the GNU General Public License as published by the Free
+ Software Foundation; either version 2, or (at your option) any later
+ version.
+ </para>
+
+ <para>
+ Metaproxy is distributed in the hope that it will be useful, but WITHOUT ANY
+ WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ for more details.
+ </para>
+
+ <para>
+ You should have received a copy of the GNU General Public License
+ along with Metaproxy; see the file LICENSE. If not, write to the
+ Free Software Foundation,
+ 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ </para>
+
+ </appendix>
+
+ &gpl2;
</book>
- <!-- Keep this comment at the end of the file
- Local variables:
- mode: sgml
- sgml-omittag:t
- sgml-shorttag:t
- sgml-minimize-attributes:nil
- sgml-always-quote-attributes:t
- sgml-indent-step:1
- sgml-indent-data:t
- sgml-parent-document: nil
- sgml-local-catalogs: nil
- sgml-namecase-general:t
- End:
- -->
+<!-- Keep this comment at the end of the file
+Local variables:
+mode: nxml
+nxml-child-indent: 1
+End:
+-->