X-Git-Url: http://git.indexdata.com/?a=blobdiff_plain;f=doc%2Fbook.xml;h=dffd47497f61499fd04e72dffe73503b41c7bf06;hb=1267943141756671d8c2944fb43f70ea7371dcc5;hp=a191019d16ef9584644ac5e05f76e768be094d7e;hpb=e5901333c8011101505ee1a283df326663637a7e;p=metaproxy-moved-to-github.git diff --git a/doc/book.xml b/doc/book.xml index a191019..dffd474 100644 --- a/doc/book.xml +++ b/doc/book.xml @@ -2,7 +2,8 @@ + + %local; @@ -17,34 +18,43 @@ --> ]> - + Metaproxy - User's Guide and Reference - - Adam Dickmeiss - - - Marc Cromme - - - Mike Taylor - + + + Adam Dickmeiss + + + Marc Cromme + + + Mike Taylor + + + &version; 2005-2007 Index Data ApS + This manual is part of Metaproxy version &version;. + + Metaproxy is a universal router, proxy and encapsulated metasearcher for information retrieval protocols. It accepts, processes, interprets and redirects requests from IR clients using - standard protocols such as + standard protocols such as the binary ANSI/NISO Z39.50 - (and in the future SRU - and SRW), as + and the information search and retrieval + web services SRU + and SRW, as well as functioning as a limited HTTP server. + + Metaproxy is configured by an XML file which specifies how the software should function in terms of routes that the request packets can take through the proxy, each step on a @@ -572,7 +582,7 @@ plugins that provide new filters. The filter API is small and conceptually simple, but there are many details to master. See the section below on - extensions. + Filters. @@ -653,8 +663,7 @@ the core Metaproxy binary. This overview is intended to give a flavor of the available functionality; more detailed information about each type of filter is included below in - the reference guide to Metaproxy filters. + . 
The filters are here named by the string that is used as the @@ -732,8 +741,8 @@ Figure out what additional information we need in: sets Z39.50 packages to Z_Close, and HTTP_Request packages to HTTP_Response err code 400 packages, and adds a suitable bounce message. - The bounce filter is usually added at end of each filter chain - config.xml to prevent infinite hanging of for example HTTP + The bounce filter is usually added at end of each filter chain route + to prevent infinite hanging of for example HTTP requests packages when only the Z39.50 client partial sink filter is found in the route. @@ -741,6 +750,19 @@ Figure out what additional information we need in:
+ <literal>cql_rpn</literal> + (mp::filter::CQLtoRPN) + + A query language transforming filter which catches Z39.50 + searchRequest + packages containing CQL queries, transforms + those to RPN queries, + and sends the searchRequests on to the next + filters. Among other things, it is useful in an SRU context. +
+ +
<literal>frontend_net</literal> (mp::filter::FrontendNet) @@ -755,7 +777,8 @@ Figure out what additional information we need in: <literal>http_file</literal> (mp::filter::HttpFile) - A partial sink which swallows only HTTP_Request packages, and + A partial sink which swallows only + HTTP_Request packages, and returns the contents of files from the local filesystem in response to HTTP requests. It lets Z39.50 packages and all other forthcoming package types @@ -778,6 +801,12 @@ Figure out what additional information we need in: load_balance filter is assuming that all backend targets have equal content, and chooses the backend with least load cost for a new session. + + + This filter is experimental and not yet mature enough for heavy-load + production sites. + +
@@ -806,7 +835,9 @@ Figure out what additional information we need in: <literal>query_rewrite</literal> (mp::filter::QueryRewrite) - Rewrites Z39.50 Type-1 and Type-101 (``RPN'') queries by a + Rewrites Z39.50 Type-1 + and Type-101 (``RPN'') + queries by a three-step process: the query is transliterated from Z39.50 packet structures into an XML representation; that XML representation is transformed by an XSLT stylesheet; and the @@ -834,18 +865,11 @@ Figure out what additional information we need in: <literal>session_shared</literal> (mp::filter::SessionShared) - When this is finished, it will implement global sharing of + This filter implements global sharing of result sets (i.e. between threads and therefore between - clients), yielding performance improvements especially when - incoming requests are from a stateless environment such as a - web-server, in which the client process representing a session - might be any one of many. However: + clients), yielding performance improvements by clever resource + pooling. - - - This filter is not yet completed. - -
@@ -857,12 +881,16 @@ Figure out what additional information we need in: and present requests, and wraps the received hit counts and XML records into suitable SRU response messages. - The sru_z3950 filter does only process SRU - GET/POST/SOAP explain requests in a very crude fashion, returning - the absolute minimum required by the standard. Full ZeeReX - explain support is added by including the - zeerex_explain filter before the - sru_z3950 filter. + The sru_z3950 filter also processes SRU + GET/POST/SOAP explain requests, returning + either the absolute minimum required by the standard, or a full + pre-defined ZeeReX explain record. + See the + ZeeReX Explain + standard pages and the + SRU Explain pages + for more information on the correct explain syntax. + SRU scan requests are not supported yet.
@@ -917,17 +945,19 @@ Figure out what additional information we need in: (mp::filter::ZeerexExplain) This filter acts as a sink for - SRU GET/POST/SOAP explain requests, returning a static ZeeReX + Z39.50 explain requests, returning a static ZeeReX Explain XML record from the config section. All other packages - are passed through, including SRU GET/POST/SOAP searchRetrieve - requests, which are handled by a following - sru_z3950 filter. + are passed through. See the ZeeReX Explain - standard pages and the - SRU Explain pages + standard pages for more information on the correct explain syntax. + + + This filter is not yet completed. + + @@ -983,16 +1013,11 @@ Figure out what additional information we need in: If Metaproxy is an interpreter providing operations on packages, then its configuration file can be thought of as a program for that - interpreter. Configuration is by means of a single file, the name + interpreter. Configuration is by means of a single XML file, the name of which is supplied as the sole command-line argument to the metaproxy program. (See - the reference guide - below for more information on invoking Metaproxy.) - - - The configuration files are written in XML. (But that's just an - implementation detail - they could just as well have been written - in YAML or Lisp-like S-expressions, or in a custom syntax.) + below for more information on invoking + Metaproxy.) @@ -1028,7 +1053,7 @@ Figure out what additional information we need in: and contain various elements that provide suitable configuration for a filter of its type. The filter-specific elements are described in - the reference guide below. + . Filters defined in this part of the file must carry an id attribute so that they can be referenced from elsewhere. @@ -1112,7 +1137,26 @@ Figure out what additional information we need in: which returns the response to the client.
-
+ +
+ Config file modularity + + Metaproxy XML configuration snippets can be reused by other + filters using the XInclude standard, as seen in + the /etc/config-sru-to-z3950.xml example SRU + configuration. + + + + + +]]> + +
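As a minimal sketch of the XInclude mechanism described above (not taken from the patch itself), a filter definition might pull in a separately maintained snippet as shown below. The XInclude namespace URI is the standard W3C one; the database child element and the file name explain.xml are assumptions made for illustration and should be checked against the shipped example configuration and the RelaxNG schema.

```xml
<!-- Hypothetical fragment: the element names inside the filter and the
     href target are illustrative assumptions, not the shipped config. -->
<filter type="sru_z3950">
  <database name="Default">
    <!-- Pull a separately maintained explain record into this filter -->
    <xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
                href="explain.xml"/>
  </database>
</filter>
```

Keeping the included snippet in its own file lets several configuration files share it; the reference is only resolved when the configuration is read by an XInclude-aware parser.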
+ +
Config file syntax checking The distribution contains RelaxNG Compact and XML syntax checking @@ -1514,12 +1558,123 @@ Z> + + Combined SRU webservice and Z39.50 server configuration + + Metaproxy can act as an + SRU and + SRW + web service server, which translates web service requests to + ANSI/NISO Z39.50 packages and + sends them off to commonly available targets. + + + A typical setup for this operation needs a filter route including the + following modules: + + + + SRU/Z39.50 Server Filter Route Configuration + + + + Filter + Importance + Purpose + + + + + + frontend_net + required + Accepting HTTP connections and passing them to following + filters. Since this filter also accepts Z39.50 connections, the + server works as SRU and Z39.50 server on the same port. + + + sru_z3950 + required + Accepting SRU GET/POST/SOAP explain and + searchRetrieve requests for the configured databases. + Explain requests are directly served from the static XML configuration. + SearchRetrieve requests are + transformed to Z39.50 search and present packages. + All other HTTP and Z39.50 packages are passed unaltered. + + + http_file + optional + Serving HTTP requests from the filesystem. This is only + needed if the server should serve XSLT stylesheets, static HTML + files or JavaScript for thin browser-based clients. + Z39.50 packages are passed unaltered. + + + cql_rpn + required + Usually, Z39.50 servers do not talk CQL, hence the + translation of the CQL query language to RPN is mandatory in + most cases. Affects only Z39.50 search packages. + + + record_transform + optional + Some Z39.50 backend targets cannot present XML record + syntaxes in commonly wanted element sets. Using this filter, one + can transform binary MARC records to MARCXML records, and + further transform those to any needed XML schema/format by XSLT + transformations. Changes only Z39.50 present packages. 
+ + + session_shared + optional + The stateless nature of web services requires frequent + re-searching of the same targets for display of paged result set + records. This might be an unacceptable burden for the accessed + backend Z39.50 targets, and this module can be added for + efficient backend target resource pooling. + + + z3950_client + required + Finally, a Z39.50 package sink is needed in the filter + chain to provide the response packages. The Z39.50 client module + is used to access external targets over the network, but any + future local Z39.50 package sink could be used instead. + + + bounce + required + Any Metaproxy package arriving here did not do so on + purpose, and is bounced back with connection closure. This + prevents packages from hanging indefinitely inside the SRU server. + + + +
+ + A typical minimal example SRU and + SRW server configuration file is found + in the tarball distribution at + etc/config-sru-to-z3950.xml. + + + Of course, any other Metaproxy modules can be integrated into an + SRU server solution, including, but not limited to, load balancing, + multiple target querying + (see ), and complex RPN query rewrites. + + + +
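The filter route listed in the table above can be sketched as a configuration skeleton. This is only an outline under stated assumptions: the route and filter element names follow the general layout described in this manual but may differ between Metaproxy versions, the route id and port value are hypothetical, and all filter-specific settings are left as comments. The file etc/config-sru-to-z3950.xml in the distribution remains the authoritative example.

```xml
<!-- Sketch only: the filter order follows the table above. Filter bodies
     are placeholders, and element names are assumptions to verify against
     the shipped example and the RelaxNG schema. -->
<route id="start">
  <!-- required: accepts HTTP (SRU) and Z39.50 connections on one port -->
  <filter type="frontend_net">
    <port>@:9000</port> <!-- hypothetical port -->
  </filter>
  <!-- required: serves explain, turns searchRetrieve into Z39.50 packages -->
  <filter type="sru_z3950">
    <!-- per-database explain configuration goes here -->
  </filter>
  <!-- optional: static files such as XSLT stylesheets or HTML -->
  <filter type="http_file">
    <!-- filesystem mapping goes here -->
  </filter>
  <!-- required in most cases: CQL to RPN translation -->
  <filter type="cql_rpn">
    <!-- CQL conversion settings go here -->
  </filter>
  <!-- optional: MARC to MARCXML and further XSLT transformations -->
  <filter type="record_transform">
    <!-- record conversion rules go here -->
  </filter>
  <!-- optional: backend resource pooling for stateless clients -->
  <filter type="session_shared"/>
  <!-- required: the Z39.50 package sink that contacts the targets -->
  <filter type="z3950_client">
    <!-- client settings such as timeouts go here -->
  </filter>
  <!-- required: closes anything that was not handled above -->
  <filter type="bounce"/>
</route>
```

The order matters because each package travels through the filters from top to bottom: the translation and transformation filters must come before the z3950_client sink, and bounce comes last so that it only sees packages no earlier filter has consumed.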
+ @@ -1532,7 +1687,7 @@ Z> Stop! Do not read this! You won't enjoy it at all. You should just skip ahead to - the reference guide, + , which tells @@ -1781,9 +1936,9 @@ Z> - - - Reference guide + + Reference + The material in this chapter is drawn directly from the individual manual entries. In particular, the Metaproxy invocation section is @@ -1791,7 +1946,8 @@ Z> on each individual filter is available using the name of the filter as the argument to the man command. - &manref; + + &manref;