From: Adam Dickmeiss Date: Thu, 15 Apr 2004 12:04:01 +0000 (+0000) Subject: More documentation X-Git-Tag: YAZPROXY.0.8~12 X-Git-Url: http://git.indexdata.com/?p=yazproxy-moved-to-github.git;a=commitdiff_plain;h=101d9c2ca072f3cd7fb6cb89c67573318b14b8c1;ds=sidebyside More documentation --- diff --git a/doc/Makefile.am b/doc/Makefile.am index 3d21a54..d7e0a14 100644 --- a/doc/Makefile.am +++ b/doc/Makefile.am @@ -1,4 +1,4 @@ -## $Id: Makefile.am,v 1.4 2004-04-11 14:58:02 adam Exp $ +## $Id: Makefile.am,v 1.5 2004-04-15 12:04:01 adam Exp $ docdir=$(datadir)/doc/@PACKAGE@ SUPPORTFILES = \ @@ -10,7 +10,8 @@ SUPPORTFILES = \ XMLFILES = \ introduction.xml \ installation.xml \ - proxy.xml \ + reference.xml \ + using.xml \ yaz-proxy-ref.xml \ yaz-proxy-man.sgml \ license.xml \ @@ -26,16 +27,18 @@ HTMLFILES = \ otherinfo-encoding.html \ proxy-config-file.html \ proxy-keepalive.html \ + proxy-reference.html \ proxy-target.html \ proxy-usage.html \ - proxy.html \ query-cache.html \ query-validation.html \ record-cache.html \ record-validation.html \ + support.html \ + using.html \ windows.html \ - yazproxy-ref.html \ - yazproxy.html + yazproxy-man.html \ + yazproxy.html doc_DATA = $(HTMLFILES) yazproxy.pdf id.png yaz.css diff --git a/doc/installation.xml b/doc/installation.xml index 62a5228..fab423e 100644 --- a/doc/installation.xml +++ b/doc/installation.xml @@ -1,5 +1,5 @@ - + Installation You need a C++ compiler to compile and use YAZ proxy. @@ -17,6 +17,14 @@ You need to install these first. For some platforms there are binary packages for YAZ/YAZ++. + + We also highly recommend that + libxml2 and + libxslt are installed. + YAZ must be configured with libxml2 support. + If not, SRW/SRU is not supported. + The YAZ Proxy uses libxslt for record conversions via XSLT. +
Building on Unix On UNIX, the software is compiled as follows: @@ -70,8 +78,8 @@ Configure uses GCC's C/C++ compiler if available. To specify another compiler, set CXX. To use other compiler flags, - specify CXXFLAGS. To use CC - with debugging you could use: + specify CXXFLAGS. For example, to use + CC with debugging do: CXXFLAGS="-g" CXX=CC ./configure @@ -82,8 +90,8 @@ src/yazproxy - The YAZ Z39.50 Proxy. - This program gets installed in your binaries directory + The YAZ Proxy program. + It gets installed in your binaries directory (prefix/bin). @@ -92,7 +100,7 @@ src/libyazproxy.la The YAZ proxy library. This library gets installed in - your libraries directory + the libraries directory (prefix/lib). @@ -100,12 +108,22 @@ include/yazproxy/*.h - Various C++ header files, which you'll need for YAZ proxy - development. All these are installed in your header files area + C++ header files, which you'll need for YAZ proxy + development. All these are installed in the header files area (prefix/include/yazproxy). + + etc + + Various files that may be read by YAZ proxy - including + configuration file, XSLT files, CQL to RPN conversion. + These files are installed in the YAZ proxy's data area + (prefix/share/yazproxy). + + +
@@ -118,7 +136,16 @@ Version 6 and .NET has been tested. We expect that YAZ++ compiles with version 5 as well. + + The YAZ proxy has never been used in production on Windows. Although + it compiles and runs, that does not mean it scales on that platform. + Furthermore, the + YAZ proxy currently doesn't run as a Service - only as a Console + application. + + + Start a command prompt and switch the sub directory WIN where the file makefile is located. Customize the installation by editing the @@ -137,6 +164,18 @@ + YAZ_DIR + + This must be set to the home of the YAZ source directory. + + + + YAZPP_DIR + + This must be set to the home of the YAZ++ source directory. + + + HAVE_XSLT, LIBXSLT_DIR @@ -238,7 +277,6 @@ bin/yazproxy.exe YAZ proxy. It's a WIN32 console application. - See for more information. diff --git a/doc/introduction.xml b/doc/introduction.xml index 9ddfe34..e87211f 100644 --- a/doc/introduction.xml +++ b/doc/introduction.xml @@ -1,19 +1,90 @@ - + Introduction - YAZ proxy - is a Z39.50/SRW/SRU proxy. The proxy accepts connections from - protocol for information retrieval (client and server side). - While YAZ itself can be used from both C and C++ it is limited by the - common denominator C. + The YAZ Proxy is + highly configurable and can be used in a number of different + applications, ranging from debugging Z39.50-based applications + and protecting overworked servers, to improving the performance of + stateless WWW/Z39.50 gateways. 
Among other features, it includes: + + + + Configurable logging + + + + + + SRU/SRW server function, to allow any Z39.50 server to also support + the ZiNG + protocols + + + + + + Load balancing across multiple backend servers + + + + + + Session-sharing and pre-initialization to improve performance + in servers with expensive session initialization + + + + + + Configurable request filtering, to keep bad requests from + reaching the server + + + + + + XML support -- MARC records can be converted to MARCXML, and + XSLT-transformations allow the proxy to support arbitrary + retrieval schemas in XML + + + + + + Load governor function limits requests from aggressive + batch-mode clients + + + + + + Efficient multiplexing software enables small memory footprint + and very high performance + + + +
Licensing - The proxy application and the proxy library is covered by the + The proxy application and the proxy library are covered by the GPL.
+ +
+ Support + + Configuration and installation assistance and ongoing support are + available for the YAZ Proxy. + For further information about support or licensing options, please + contact David Dorman in the + US (dorman at indexdata.com, 860-346-1237 or toll free 866-489-1568) + or Sebastian Hammer in Denmark (quinn at indexdata.com, or +45 3341 0100) + + +
+ License
GPL @@ -29,7 +29,6 @@ 02111-1307, USA. -
GNU General Public License GNU GENERAL PUBLIC LICENSE Version 2, June 1991 @@ -311,8 +310,7 @@ PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS - -
+
@@ -325,7 +323,7 @@ POSSIBILITY OF SUCH DAMAGES. sgml-always-quote-attributes:t sgml-indent-step:1 sgml-indent-data:t - sgml-parent-document: "yaz++.xml" + sgml-parent-document: "yazproxy.xml" sgml-local-catalogs: nil sgml-namecase-general:t End: diff --git a/doc/proxy.xml b/doc/proxy.xml deleted file mode 100644 index 2f15f7f..0000000 --- a/doc/proxy.xml +++ /dev/null @@ -1,690 +0,0 @@ - - The YAZ Proxy - - The YAZ proxy is a transparent SRW/SRU/Z39.50-to-Z39.50 gateway. - That is, it is a SRW/SRU/Z39.50 server which has as its back-end a - Z39.50 client that forwards requests on to another server (known as - the backend target.) - - - -- All config directives -- - -- SRW/SRU .. - -- Example config - -- Mention XSLT conversion - - - The YAZ Proxy is useful for debugging SRW/SRU/Z39.50 software, logging - APDUs, redirecting Z39.50 packages through firewalls, etc. - Furthermore, it offers facilities that often - boost performance for connectionless Z39.50 clients such - as web gateways. - - - Unlike most other server software, the proxy runs single-threaded, - single-process. Every I/O operation - is non-blocking so it is very lightweight and extremely fast. - It does not store any state information on the hard drive, - except any log files you ask for. - - -
- Example: Using the Proxy to Log APDUs - - Suppose you use a commercial Z39.50 client for which you do not - have source code, and it's not behaving how you think it should - when running against some specific server that you have no control - over. One way to diagnose the problem is to find out what packets - (APDUs) are being sent and received, but not all client - applications have facilities to do APDU logging. - - - No problem. Run the proxy on a friendly machine, get it to log - APDUs, and point the errant client at the proxy instead of - directly at the server that's causing it problems. - - - Suppose the server is running on foo.bar.com, - port 18398. Run the proxy on the machine of your choice, say - your.company.com like this: - - - yazproxy -a - -t tcp:foo.bar.com:18398 tcp:@:9000 - - - (The -a - option requests APDU logging on - standard output, -t tcp:foo.bar.com:18398 - specifies where the backend target is, and - tcp:@:9000 tells the proxy to listen on port - 9000 and accept connections from any machine.) - - - Now change your client application's configuration so that instead - of connecting to foo.bar.com port 18398, it - connects to your.company.com port 9000, and - start it up. It will work exactly as usual, but all the packets - will be sent via the proxy, which will generate a log like this: - - - -
- -
- Specifying the Backend Target - - When the proxy receives a Z39.50 Initialize Request from a Z39.50 - client, it determines the backend target by the following rules: - - - If the InitializeRequest PDU from the - client includes an - otherInfo - element with OID - 1.2.840.10003.10.1000.81.1, then the - contents of that element specify the target to be used, in the - usual YAZ address format (typically - tcp:hostname:port) - as described in - the Addresses section of the YAZ manual. - - - - Otherwise, the Proxy uses the default target, if one was - specified on the command-line with the -t - option. A default target can also be specified in the - XML Config file. - - - - Otherwise, the proxy closes the connection with - the client. - - - - -
-
- Keep-alive Facility - - The keep-alive is a facility where the proxy keeps the connection to the - backend - even if the client closes the connection to the proxy. - - - If a new or another client connects to the proxy again and requests the - same backend it will be reassigned to this backend. In this case, the - proxy sends an initialize response directly to the client and an - initialize handshake with the backend is omitted. - - - When a client reconnects, query and record caching works better, if the - proxy assigns it to the same backend as before. And the result set - (if any) is re-used. To achieve this, Index Data defined a session - cookie which identifies the backend session. - - - The cookie is defined by the client and is sent as part of the - Initialize Request and passed in an - otherInfo - element with OID 1.2.840.10003.10.1000.81.2. - - - Clients that do not send a cookie as part of the initialize request - may still better performance, since the init handshake is saved. - -
- -
- Query Caching - - Simple stateless clients often send identical Z39.50 searches - in a relatively short period of time (e.g. in order to produce a - results-list page, the next page, - a single full-record, etc). And for many targets, it's - much more expensive to produce a new result set than to - reuse an existing one. - - - The proxy tries to solve that by remembering the last query for each - backend target, so that if an identical query is received next, it - is turned into Present Requests rather than new Search Requests. - - - - In a future we release will will probably allows for - an arbitrary-sized cache for targets supporting named result sets. - - - - You can enable/disable query caching using option -o. - -
- -
- Record Caching - - As an option, the proxy may also cache result set records for the - last search. - The proxy takes into account the Record Syntax and CompSpec. - The CompSpec includes simple element set names as well. - By default the cache is 200000 bytes per session. - -
- -
- Query Validation - - The Proxy may also be configured to trap particular attributes in - Type-1 queries and send Bib-1 diagnostics back to the client without - even consulting the backend target. This facility may be useful if - a target does not properly issue diagnostics when unsupported attributes - are send to it. - -
- -
- Record Syntax Validation - - The proxy may be configured to accept, reject or convert records. - When accepted, the target passes search/present requests to the - backend target under the assumption that the target can honor the - request (In fact it may not do that). When a record is rejected because - the record syntax is "unsupported" the proxy returns a diagnostic to the - client. Finally, the proxy may convert records. - - - The proxy can convert from MARC to MARCXML and thereby offer an - XML version of any MARC record as long as it is ISO2709 encoded. - If the proxy is compiled with libXSLT support it can also - perform XSLT on XML. - -
- -
- Other Optimizations - - We've had some plans to support global caching of result set records, - but this has not yet been implemented. - -
- -
- Proxy Configuration File - - The Proxy may read a configuration file using option - -c followed by the filename of a config file. - - - The config file is XML based. The YAZ proxy must be compiled - with libxml2 and - libXSLT support in - order for the config file facility to be enabled. - - - To check for a config file to be well-formed, the yazproxy may - be invoked without specifying a listening port, i.e. - - yazproxy -c myconfig.xml - - If this does not produce errors, the file is well-formed. - - -
- Proxy Configuration Header - - The proxy config file must have a root element called - proxy. All information except an optional XML - header must be stored within the proxy element. - - - <?xml version="1.0"?> - <proxy> - <!-- content here .. --> - </proxy> - -
-
- Configuration: target - - The element target which may be repeated zero - or more times with parent element proxy contains - information about each backend target. - The target element have two attributes: - name which holds the logical name of the backend - target (required) and default (optional) which - (when given) specifies that the backend target is the default target - - equivalent to command line option -t. - - - - <?xml version="1.0"?> - <proxy> - <target name="server1" default="1"> - <!-- description of server1 .. --> - </target> - <target name="server2"> - <!-- description of server2 .. --> - </target> - </proxy> - - -
-
- Configuration:url - - The url which may be repeated one or more times - should be the child of the target element. - The CDATA of url is the Z-URL of the backend. - - - Multiple url element may be used. In that case, then - a client initiates a session, the proxy chooses the URL with the lowest - number of active sessions, thereby distributing the load. It is - assumed that each URL represents the same database (data). - -
-
- Configuration: keepalive - The keepalive element holds information about - the keepalive Z39.50 sessions. Keepalive sessions are proxy-to-backend - sessions that is no longer associated with a client session. - - The keepalive element which is the child of - the targetholds two elements: - bandwidth and pdu. - The bandwidth is the maximum total bytes - transferred to/from the target. If a target session exceeds this - limit, it is shut down (and no longer kept alive). - The pdu is the maximum number of requests sent - to the target. If a target session exceeds this limit, it is - shut down. The idea of these two limits is that avoid very long - sessions that use resources in a backend (that leaks!). - - - The following sets maximum number of bytes transferred in a - target session to 1 MB and maxinum of requests to 400. - - <keepalive> - <bandwidth>1048576</bandwidth> - <retrieve>400</retrieve> - </keepalive> - - -
-
- Configuration: limit - - The limit section specifies bandwidth/pdu requests - limits for an active session. - The proxy records bandwidth/pdu requests during the last 60 seconds - (1 minute). The limit may include the - elements bandwidth, pdu, - and retrieve. The bandwidth - measures the number of bytes transferred within the last minute. - The pdu is the number of requests in the last - minute. The retrieve holds the maximum records to - be retrieved in one Present Request. - - - If a bandwidth/pdu limit is reached the proxy will postpone the - requests to the target and wait one or more seconds. The idea of the - limit is to ensure that clients that downloads hundreds or thousands of - records do not hurt other users. - - - The following sets maximum number of bytes transferred per minute to - 500Kbytes and maximum number of requests to 40. - - <limit> - <bandwidth>524288</bandwidth> - <retrieve>40</retrieve> - </limit> - - - - - Typically the limits for keepalive are much higher than - those for session minute average. - - -
- -
- Configuration: attribute - - The attribute element specifies accept or reject - or a particular attribute type, value pair. - Well-behaving targets will reject unsupported attributes on their - own. This feature is useful for targets that do not gracefully - handle unsupported attributes. - - - Attribute elements may be repeated. The proxy inspects the attribute - specifications in the order as specified in the configuration file. - When a given attribute specification matches a given attribute list - in a query, the proxy takes appropriate action (reject, accept). - - - If no attribute specifications matches the attribute list in a query, - it is accepted. - - - The attribute element has two required attributes: - type which is the Attribute Type-1 type, and - value which is the Attribute Type-1 value. - The special value/type * matches any attribute - type/value. A value may also be specified as a list with each - value separated by comma, a value may also be specified as a - list: low value - dash - high value. - - - If attribute error is given, that holds a - Bib-1 diagnostic which is sent to the client if the particular - type, value is part of a query. - - - If attribute error is not given, the attribute - type, value is accepted and passed to the backend target. - - - A target that supports use attributes 1,4, 1000 through 1003 and - no other use attributes, could use the following rules: - - <attribute type="1" value="1,4,1000-1003"> - <attribute type="1" value="*" error="114"/> - - -
- -
- Configuration: syntax - - The syntax element specifies accept or reject - or a particular record syntax request from the client. - - - The syntax has one required attribute: - type which is the Preferred Record Syntax. - - - If attribute error is given, that holds a - Bib-1 diagnostic which is sent to the client if the particular - record syntax is part of a present - or search request. - - - If attribute error is not given, the record syntax - is accepted and passed to the backend target. - - - If attribute marcxml is given, the proxy will - perform MARC21 to MARCXML conversion. In this case the - type should be XML. The proxy will use - preferred record syntax USMARC/MARC21 against the backend target. - - To accept USMARC and offer MARCXML XML records but reject - all other requests the following configuration could be used: - - <proxy> - <target name="mytarget"> - <syntax type="usmarc"/> - <syntax type="xml" marcxml="1"/> - <syntax type="*" error="238"/> - </target> - </proxy> - - -
- -
- Configuration: target-timeout - - The element target-timeout is the child of element - target and specifies the amount in seconds before - a target session is shut down. - - - This can also be specified on the command line by using option - -T. Refer to . - -
- -
- Configuration: client-timeout - - The element client-timeout is the child of element - target and specifies the amount in seconds before - a client session is shut down. - - - This can also be specified on the command line by using option - -i. Refer to . - -
- -
- Configuration: preinit - - The element preinit is the child of element - target and specifies the number of spare - connection to a target. By default no spare connection are - created by the proxy. If the proxy uses a target exclusive or - a lot, the preinit session will ensure that target sessions - have been made before the client makes a connection and will therefore - reduce the connect-init handshake dramatically. Never set this to - more than 5. - -
- -
- Configuration: max-clients - - The element max-clients is the child of element - proxy and specifies the total number of - allowed connections to targets (all targets). If this limit - is reached the proxy will close the least recently used connection. - - - Note, that many Unix systems impose a system on the number of - open files allowed in a single process, typically in the - range 256 (Solaris) to 1024 (Linux). - The proxy uses 2 sockets per session + a few files - for logging. As a rule of thumb, ensure that 2*max-clients + 5 - can be opened by the proxy process. - - - - Using the - bash shell, you can set the limit with - ulimit -nno. - Use ulimit -a to display limits. - - -
- -
- Configuration: log - - The element log is the child of element - proxy and specifies what to be logged by the - proxy. - - - Specify the log file with command-line option -l. - - - The text of the log element is a sequence of - options separated by white space. See the table below: - Logging options - - - - - - Option - Description - - - - - client-apdu - - Log APDUs as reported by YAZ for the - communication between the client and the proxy. - This facility is equivalent to the APDU logging that - happens when using option -a, however - this tells the proxy to log in the same file as given - by -l. - - - - server-apdu - - Log APDUs as reported by YAZ for the - communication between the proxy and the server (backend). - - - - clients-requests - - Log a brief description about requests transferred between - the client and the proxy. The name of the request and the size - of the APDU is logged. - - - - server-requests - - Log a brief description about requests transferred between - the proxy and the server (backend). The name of the request - and the size of the APDU is logged. - - - - -
-
- - To log communication in details between the proxy and the backend, th - following configuration could be used: - - server-apdu server-requests - -]]> - - -
- -
-
- Proxy Usage - - - - &yaz-proxy-ref; - -
-
OtherInformation Encoding - - The proxy uses the OtherInformation definition to carry - information about the target address and cookie. - - - OtherInformation ::= [201] IMPLICIT SEQUENCE OF SEQUENCE{ - category [1] IMPLICIT InfoCategory OPTIONAL, - information CHOICE{ - characterInfo [2] IMPLICIT InternationalString, - binaryInfo [3] IMPLICIT OCTET STRING, - externallyDefinedInfo [4] IMPLICIT EXTERNAL, - oid [5] IMPLICIT OBJECT IDENTIFIER}} --- - InfoCategory ::= SEQUENCE{ - categoryTypeId [1] IMPLICIT OBJECT IDENTIFIER OPTIONAL, - categoryValue [2] IMPLICIT INTEGER} - - - The categoryTypeId is either - OID 1.2.840.10003.10.1000.81.1, 1.2.840.10003.10.1000.81.2 - for proxy target and proxy cookie respectively. The - integer element category is set to 0. - The value proxy and cookie is stored in element - characterInfo of the information - choice. - -
-
- - diff --git a/doc/reference.xml b/doc/reference.xml new file mode 100644 index 0000000..15274ea --- /dev/null +++ b/doc/reference.xml @@ -0,0 +1,643 @@ + + Proxy Reference +
+ Operating Environment + + The YAZ proxy is a single program. After startup it spawns + a child process (except on Windows or if option -X is given). + The child process is the core of the proxy and it handles all + communication with clients and servers. The parent process + will restart the child process if it dies unexpectedly and report + the reason. For options for YAZ proxy, + see . + + + As an option the proxy may change user identity to a less privileged + user. + +
+
+ Specifying the Backend Server + + When the proxy receives a Z39.50 Initialize Request from a Z39.50 + client, it determines the backend server by the following rules: + + + If the InitializeRequest PDU from the + client includes an + otherInfo + element with OID + 1.2.840.10003.10.1000.81.1, then the + contents of that element specify the server to be used, in the + usual YAZ address format (typically + tcp:hostname:port) + as described in + the Addresses section of the YAZ manual. + + + + + Otherwise, the Proxy uses the default server, if one was + specified in the proxy configuration file. See + . + + + + + Otherwise, the Proxy uses the default server, if one was + specified on the command-line with the -t + option. + + + + Otherwise, the proxy closes the connection with + the client. + + + + +
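The selection order above can be sketched as a simple fallback chain. This is only an illustration of the rules as documented, not the proxy's actual code; the function name `choose_backend` and its three parameters are hypothetical.

```python
from typing import Optional


def choose_backend(other_info_target: Optional[str],
                   config_default: Optional[str],
                   cmdline_default: Optional[str]) -> Optional[str]:
    """Return the backend address to use, or None to close the connection.

    Hypothetical helper: mirrors the documented order only.
    1. target carried in the otherInfo element (OID 1.2.840.10003.10.1000.81.1)
    2. default target from the proxy configuration file
    3. default target from the -t command-line option
    4. otherwise no backend, so the client connection is closed
    """
    for candidate in (other_info_target, config_default, cmdline_default):
        if candidate:
            return candidate
    return None


# A target named by the client wins over both defaults.
print(choose_backend("tcp:foo.bar.com:18398", "tcp:localhost:9999", None))
```

A return value of `None` corresponds to the last rule, where the proxy closes the connection with the client.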
+
+ Keep-alive Facility + + The keep-alive is a facility where the proxy keeps the connection to the + backend server - even if the client closes the connection to the proxy. + + + If a new or another client connects to the proxy again and requests the + same backend it will be reassigned to this backend. In this case, the + proxy sends an initialize response directly to the client and an + initialize handshake with the backend is omitted. + + + When a client reconnects, query and record caching work better if the + proxy assigns it to the same backend as before. And the result set + (if any) is re-used. To achieve this, Index Data defined a session + cookie which identifies the backend session. + + + The cookie is defined by the client and is sent as part of the + Initialize Request and passed in an + otherInfo + element with OID 1.2.840.10003.10.1000.81.2. + + + Clients that do not send a cookie as part of the initialize request + may still see better performance, since the init handshake is saved. + + + Refer to on how to set up + configuration parameters for keepalive. + +
+ +
+ Proxy Configuration File + + The Proxy may read a configuration file using option + -c followed by the filename of a config file. + + + The config file is XML based. The YAZ proxy must be compiled + with libxml2 and + libXSLT support in + order for the config file facility to be enabled. + + + To check that a config file is well-formed, the yazproxy may + be invoked without specifying a listening port, i.e. + + yazproxy -c myconfig.xml + + If this does not produce errors, the file is well-formed. + + +
+ Proxy Configuration Header + + The proxy config file must have a root element called + proxy. All information except an optional XML + header must be stored within the proxy element. + + + <?xml version="1.0"?> + <proxy> + <!-- content here .. --> + </proxy> + +
+
+ target + + The element target which may be repeated zero + or more times with parent element proxy contains + information about each backend target. + The target element has two attributes: + name which holds the logical name of the backend + target (required) and default (optional) which + (when given) specifies that the backend target is the default target - + equivalent to command line option -t. + + + + <?xml version="1.0"?> + <proxy> + <target name="server1" default="1"> + <!-- description of server1 .. --> + </target> + <target name="server2"> + <!-- description of server2 .. --> + </target> + </proxy> + + +
+
+ url + + The url which may be repeated one or more times + should be the child of the target element. + The CDATA of url is the Z-URL of the backend. + + + Multiple url elements may be used. In that case, when + a client initiates a session, the proxy chooses the URL with the lowest + number of active sessions, thereby distributing the load. It is + assumed that each URL represents the same database (data). + +
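The load distribution described for multiple url elements amounts to picking the backend with the fewest active sessions. A minimal sketch, assuming the session counts are available as a mapping from Z-URL to count (the function name `pick_url` and the mapping are illustrative, not taken from the proxy source):

```python
from typing import Dict


def pick_url(active_sessions: Dict[str, int]) -> str:
    """Return the Z-URL that currently has the fewest active sessions.

    Hypothetical helper; on a tie the URL listed first is chosen,
    which is an assumption of this sketch.
    """
    return min(active_sessions, key=lambda url: active_sessions[url])


sessions = {"tcp:server1:210": 3, "tcp:server2:210": 1}
print(pick_url(sessions))
```

Because every URL is assumed to serve the same data, the choice affects only load, not search results.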
+ +
+ target-timeout + + The element target-timeout is the child of element + target and specifies the amount in seconds before + a target session is shut down. + + + This can also be specified on the command line by using option + -T. Refer to OPTIONS. + +
+ +
+ client-timeout + + The element client-timeout is the child of element + target and specifies the amount in seconds before + a client session is shut down. + + + This can also be specified on the command line by using option + -i. Refer to OPTIONS. + +
+ +
+ keepalive + The keepalive element holds information about + the keepalive Z39.50 sessions. Keepalive sessions are proxy-to-backend + sessions that are no longer associated with a client session. + + The keepalive element which is the child of + the target holds two elements: + bandwidth and pdu. + The bandwidth is the maximum total bytes + transferred to/from the target. If a target session exceeds this + limit, it is shut down (and no longer kept alive). + The pdu is the maximum number of requests sent + to the target. If a target session exceeds this limit, it is + shut down. The idea of these two limits is to avoid very long + sessions that use resources in a backend (that leaks!). + + + The following sets maximum number of bytes transferred in a + target session to 1 MB and maximum of requests to 400. + + <keepalive> + <bandwidth>1048576</bandwidth> + <pdu>400</pdu> + </keepalive> + + +
+
+ limit + + The limit section specifies bandwidth/pdu requests + limits for an active session. + The proxy records bandwidth/pdu requests during the last 60 seconds + (1 minute). The limit may include the + elements bandwidth, pdu, + and retrieve. The bandwidth + measures the number of bytes transferred within the last minute. + The pdu is the number of requests in the last + minute. The retrieve holds the maximum records to + be retrieved in one Present Request. + + + If a bandwidth/pdu limit is reached the proxy will postpone the + requests to the target and wait one or more seconds. The idea of the + limit is to ensure that clients that download hundreds or thousands of + records do not hurt other users. + + + The following sets maximum number of bytes transferred per minute to + 500Kbytes and maximum number of requests to 40. + + <limit> + <bandwidth>524288</bandwidth> + <pdu>40</pdu> + </limit> + + + + + Typically the limits for keepalive are much higher than + those for session minute average. + + +
+ +
+ attribute + + The attribute element specifies whether to accept or reject + a particular attribute type, value pair. + Well-behaving targets will reject unsupported attributes on their + own. This feature is useful for targets that do not gracefully + handle unsupported attributes. + + + Attribute elements may be repeated. The proxy inspects the attribute + specifications in the order as specified in the configuration file. + When a given attribute specification matches a given attribute list + in a query, the proxy takes appropriate action (reject, accept). + + + If no attribute specification matches the attribute list in a query, + it is accepted. + + + The attribute element has two required attributes: + type which is the Attribute Type-1 type, and + value which is the Attribute Type-1 value. + The special value/type * matches any attribute + type/value. A value may also be specified as a list with each + value separated by comma, or as a + range: low value - dash - high value. + + + If attribute error is given, that holds a + Bib-1 diagnostic which is sent to the client if the particular + type, value is part of a query. + + + If attribute error is not given, the attribute + type, value is accepted and passed to the backend target. + + + A target that supports use attributes 1,4, 1000 through 1003 and + no other use attributes, could use the following rules: + + <attribute type="1" value="1,4,1000-1003"/> + <attribute type="1" value="*" error="114"/> + + +
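To make the matching rules concrete, here is a small sketch of first-match evaluation for a single attribute type, value pair. It is an assumption-laden illustration, not the proxy's implementation: the names `spec_matches` and `check_attribute` and the tuple layout of a rule are hypothetical, and a real query carries a list of attributes rather than one pair.

```python
from typing import List, Optional, Tuple

# (type spec, value spec, Bib-1 error or None); layout is hypothetical.
Rule = Tuple[str, str, Optional[int]]


def spec_matches(spec: str, number: int) -> bool:
    """Match "*", a comma-separated list, and low-high ranges."""
    if spec == "*":
        return True
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-", 1)
            if int(low) <= number <= int(high):
                return True
        elif part and int(part) == number:
            return True
    return False


def check_attribute(rules: List[Rule], attr_type: int,
                    attr_value: int) -> Optional[int]:
    """Return the Bib-1 error of the first matching rule.

    None means the pair is accepted: either the matching rule has no
    error, or no rule matched at all.
    """
    for type_spec, value_spec, error in rules:
        if spec_matches(type_spec, attr_type) and spec_matches(value_spec, attr_value):
            return error
    return None


# The two rules from the example above, in configuration order.
rules = [("1", "1,4,1000-1003", None), ("1", "*", 114)]
print(check_attribute(rules, 1, 1002))  # accepted
print(check_attribute(rules, 1, 7))     # rejected with diagnostic 114
```

Rule order matters here: placing the wildcard rule first would reject every use attribute, because evaluation stops at the first match.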
+ + + +
+ syntax + + The syntax element specifies whether to accept or reject + a particular record syntax request from the client. + + + The syntax has one required attribute: + type which is the Preferred Record Syntax. + + + If attribute error is given, that holds a + Bib-1 diagnostic which is sent to the client if the particular + record syntax is part of a present or search request. + + + If attribute error is not given, the record syntax + is accepted and passed to the backend target. + + + If attribute marcxml is given, the proxy will + perform MARC21 to MARCXML conversion. In this case the + type should be XML. The proxy will use + preferred record syntax USMARC/MARC21 against the backend target. + + To accept USMARC and offer MARCXML XML records but reject + all other requests the following configuration could be used: + + <proxy> + <target name="mytarget"> + <syntax type="usmarc"/> + <syntax type="xml" marcxml="1"/> + <syntax type="*" error="238"/> + </target> + </proxy> + + +
+ +
+ explain + + The explain element includes Explain information + for SRW/SRU about the server in the target section. This + information must include a serverInfo element + with a database, which is the URL path under which this target + is made available. + For example, + + + myhost.org + 8000 + mydatabase + + + + ]]> + + In the above case, the SRW/SRU service is available as + http://myhost.org:8000/mydatabase. + + +
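The element markup of the explain example does not survive in every rendering of this page. As a rough sketch only, the three values shown (myhost.org, 8000, mydatabase) might be arranged as below. The child element names host, port and database are assumptions borrowed from the ZeeRex Explain vocabulary, and the target name mytarget is hypothetical; a real configuration would likely also need the Explain namespace declared on the explain element.

```xml
<proxy>
  <target name="mytarget">
    <explain>
      <!-- host, port and database element names are assumed here,
           not confirmed by the text of this section -->
      <serverInfo>
        <host>myhost.org</host>
        <port>8000</port>
        <database>mydatabase</database>
      </serverInfo>
    </explain>
  </target>
</proxy>
```

With these values, the database name becomes the URL path, which matches the service address http://myhost.org:8000/mydatabase given in the text.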
+ +
+ cql2rpn + + The content of cql2rpn refers to a CQL to RPN conversion + file for the server in the target section. This element + is required for SRW/SRU searches to operate against a Z39.50 + server that doesn't support CQL. Most Z39.50 servers only support + Type-1/RPN so this is usually required. + See the YAZ documentation for more information about the + CQL + to PQF conversion. See also the + pqf.properties file in the etc + (or prefix/share/yazproxy) + directory of the YAZ proxy. + +
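A minimal sketch of how this element might appear in a configuration, assuming it sits directly inside the target section as the text describes. The target name mytarget is hypothetical, and the path to pqf.properties should be adjusted to wherever the file lives in your installation.

```xml
<proxy>
  <target name="mytarget">
    <!-- name of the CQL to RPN conversion file; the location shown is
         only a placeholder for the sample file shipped in etc -->
    <cql2rpn>pqf.properties</cql2rpn>
  </target>
</proxy>
```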
+ +
+ preinit + + The element preinit is the child of element + target and specifies the number of spare + connections to a target. By default no spare connections are + created by the proxy. If the proxy uses a target exclusively or + heavily, preinit sessions ensure that target sessions + are established before a client connects, which greatly + reduces the time spent on the connect-init handshake. Never set this to + more than 5. + +
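As an illustration, assuming a hypothetical target named mytarget, two spare sessions could be requested like this. The value 2 is an arbitrary choice within the stated upper bound of 5.

```xml
<proxy>
  <target name="mytarget">
    <!-- keep two sessions to the backend initialized in advance -->
    <preinit>2</preinit>
  </target>
</proxy>
```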
+ +
+ max-clients + + The element max-clients is the child of element + proxy and specifies the total number of + allowed connections to targets (all targets). If this limit + is reached the proxy will close the least recently used connection. + + + Note that many Unix systems impose a limit on the number of + open files allowed in a single process, typically in the + range 256 (Solaris) to 1024 (Linux). + The proxy uses 2 sockets per session plus a few files + for logging. As a rule of thumb, ensure that 2*max-clients + 5 files + can be opened by the proxy process. + + + + Using the + bash shell, you can set the limit with + ulimit -n no, where no is the new limit. + Use ulimit -a to display limits. + + +
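A small worked example of the rule of thumb, with 50 chosen purely for illustration: the proxy process would then need to be able to open 2*50 + 5 = 105 files, which fits within a 256-file limit.

```xml
<proxy>
  <!-- 50 target connections: 2*50 + 5 = 105 open files required -->
  <max-clients>50</max-clients>
</proxy>
```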
+ +
+ log + + The element log is the child of element + proxy and specifies what is logged by the + proxy. + + + Specify the log file with command-line option -l. + + + The text of the log element is a sequence of + options separated by white space. See the table below: + Logging options + + + + + Option + Description + + + + + client-apdu + + Log APDUs as reported by YAZ for the + communication between the client and the proxy. + This facility is equivalent to the APDU logging that + happens when using option -a; however, + this tells the proxy to log to the same file as given + by -l. + + + + server-apdu + + Log APDUs as reported by YAZ for the + communication between the proxy and the server (backend). + + + + clients-requests + + Log a brief description of requests transferred between + the client and the proxy. The name of the request and the size + of the APDU are logged. + + + + server-requests + + Log a brief description of requests transferred between + the proxy and the server (backend). The name of the request + and the size of the APDU are logged. + + + + +
+
+ + To log communication in detail between the proxy and the backend, the + following configuration could be used: + + server-apdu server-requests + + ]]> + + +
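Since the element markup around the two option names may be lost in some renderings of this page, here is a sketch of the same configuration written out in full. It relies only on what the section states: log is a child of proxy and its text is a whitespace-separated list of options.

```xml
<proxy>
  <!-- log both the APDUs and a brief request summary for
       traffic between the proxy and the backend -->
  <log>server-apdu server-requests</log>
</proxy>
```

The log file itself is still chosen with the command-line option -l.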
+
+ +
+ Query Caching + + Simple stateless clients often send identical Z39.50 searches + in a relatively short period of time (e.g. in order to produce a + results-list page, the next page, + a single full-record, etc). And for many targets, it's + much more expensive to produce a new result set than to + reuse an existing one. + + + The proxy tries to solve that by remembering the last query for each + backend target, so that if an identical query is received next, it + is turned into Present Requests rather than new Search Requests. + + + + A future release will probably allow for + an arbitrary-sized cache for targets supporting named result sets. + + + + You can enable/disable query caching using option -o. + +
+ +
+ Record Caching + + As an option, the proxy may also cache result set records for the + last search. + The proxy takes into account the Record Syntax and CompSpec. + The CompSpec includes simple element set names as well. + By default the cache is 200000 bytes per session. + +
+ +
+ Query Validation + + The Proxy may also be configured to trap particular attributes in + Type-1 queries and send Bib-1 diagnostics back to the client without + even consulting the backend target. This facility may be useful if + a target does not properly issue diagnostics when unsupported attributes + are sent to it. + +
+ +
+ Record Syntax Validation + + The proxy may be configured to accept, reject or convert records. + When accepted, the proxy passes search/present requests to the + backend target under the assumption that the target can honor the + request (in fact, the target may not). When a record is rejected because + the record syntax is "unsupported" the proxy returns a diagnostic to the + client. Finally, the proxy may convert records. + + + The proxy can convert from MARC to MARCXML and thereby offer an + XML version of any MARC record as long as it is ISO2709 encoded. + If the proxy is compiled with libXSLT support it can also + perform XSLT conversions on XML records. + +
+ +
+ Other Optimizations + + We've had some plans to support global caching of result set records, + but this has not yet been implemented. + +
+ +
+ Proxy Usage (man page) + + &yaz-proxy-ref; + +
+ +
+ OtherInformation Encoding + + The proxy uses the OtherInformation definition to carry + information about the target address and cookie. + + + OtherInformation ::= [201] IMPLICIT SEQUENCE OF SEQUENCE{ + category [1] IMPLICIT InfoCategory OPTIONAL, + information CHOICE{ + characterInfo [2] IMPLICIT InternationalString, + binaryInfo [3] IMPLICIT OCTET STRING, + externallyDefinedInfo [4] IMPLICIT EXTERNAL, + oid [5] IMPLICIT OBJECT IDENTIFIER}} +-- + InfoCategory ::= SEQUENCE{ + categoryTypeId [1] IMPLICIT OBJECT IDENTIFIER OPTIONAL, + categoryValue [2] IMPLICIT INTEGER} + + + The categoryTypeId is either + OID 1.2.840.10003.10.1000.81.1 or 1.2.840.10003.10.1000.81.2, + for proxy target and proxy cookie respectively. The + integer element category is set to 0. + The proxy target and cookie values are stored in element + characterInfo of the information + choice. + +
+
+ + + diff --git a/doc/using.xml b/doc/using.xml new file mode 100644 index 0000000..f6106f8 --- /dev/null +++ b/doc/using.xml @@ -0,0 +1,145 @@ + + Using YAZ proxy + + As mentioned in the introduction the YAZ proxy has many uses. + This chapter includes a few examples. + + + -- All config directives -- + -- SRW/SRU .. + -- Example config + -- Mention XSLT conversion + + + The YAZ Proxy is useful for debugging SRW/SRU/Z39.50 software, logging + APDUs, redirecting Z39.50 packages through firewalls, etc. + Furthermore, it offers facilities that often + boost performance for connectionless Z39.50 clients such + as web gateways. + + + Unlike most other server software, the proxy runs single-threaded, + single-process. Every I/O operation + is non-blocking so it is very lightweight and extremely fast. + It does not store any state information on the hard drive, + except any log files you ask for. + + + + Using the Proxy to Log APDUs + + Suppose you use a commercial Z39.50 client for which you do not + have source code, and it's not behaving how you think it should + when running against some specific server that you have no control + over. One way to diagnose the problem is to find out what packets + (APDUs) are being sent and received, but not all client + applications have facilities to do APDU logging. + + + No problem. Run the proxy on a friendly machine, get it to log + APDUs, and point the errant client at the proxy instead of + directly at the server that's causing it problems. + + + Suppose the server is running on foo.bar.com, + port 18398. Run the proxy on the machine of your choice, say + your.company.com like this: + + + yazproxy -a - -t tcp:foo.bar.com:18398 tcp:@:9000 + + + (The -a - option requests APDU logging on + standard output, -t tcp:foo.bar.com:18398 + specifies where the backend target is, and + tcp:@:9000 tells the proxy to listen on port + 9000 and accept connections from any machine.) 
+ + + Now change your client application's configuration so that instead + of connecting to foo.bar.com port 18398, it + connects to your.company.com port 9000, and + start it up. It will work exactly as usual, but all the packets + will be sent via the proxy, which will generate a log like this: + + + + + + + + diff --git a/doc/yaz-proxy-ref.xml b/doc/yaz-proxy-ref.xml index 8ba2090..c1301cb 100644 --- a/doc/yaz-proxy-ref.xml +++ b/doc/yaz-proxy-ref.xml @@ -4,7 +4,7 @@ yazproxy - The YAZ toolkit's transparent Z39.50 proxy + The YAZ toolkit's transparent Z39.50/SRW/SRU proxy @@ -27,7 +27,8 @@ DESCRIPTION - yazproxy is a Z39.50 optimizing proxy daemon. + yazproxy is a proxy that accepts connections + from Z39.50/SRW/SRU clients and contacts a Z39.50 backend. The listening port must be specified on the command-line. inetd operation is not supported. The host:port @@ -46,6 +47,7 @@ reopens log files when it receives the hangup signal, SIGHUP. + OPTIONS -a filename @@ -143,62 +145,46 @@ EXAMPLES The following command starts the proxy, listening on port - 9000, with its default backend target set to the Library of - Congress bibliographic server: - + 9000, with its default backend target set to Index Data's + test server: + - $ yazproxy -t z3950.loc.gov:7090 @:9000 + $ yazproxy -t indexdata.dk:210 @:9000 - The LOC target is sometimes very slow. You can connect to - it using yaz-client as follows: + You can connect to the proxy via yaz-client as follows: - $ yaz-client localhost:9000/voyager - Connecting...Ok. + $ ./yaz-client localhost:9000/gils + Connecting...OK. Sent initrequest. - Connection accepted by target. - ID : 34 - Name : Voyager LMS - Z39.50 Server - Version: 1.13 - Options: search present - Elapsed: 7.131197 - Z> f computer - Sent searchRequest. - Received SearchResponse. - Search was a success. - Number of hits: 10000 - records returned: 0 - Elapsed: 6.695174 + Connection accepted by v3 target. 
+ ID : 81 + Name : Zebra Information Server/GFS/YAZ (YAZ Proxy) + Version: Zebra 1.3.15/1.23/2.0.19 + Options: search present delSet scan sort extendedServices namedResultSets + Elapsed: 0.152108 Z> f computer Sent searchRequest. Received SearchResponse. Search was a success. - Number of hits: 10000 + Number of hits: 3, setno 1 + SearchResult-1: computer(3) records returned: 0 - Elapsed: 0.001417 + Elapsed: 0.172533 - In this test, the second search was more than 4000 times faster - than the first, because the proxy cached the result of the first - search and noticed that the second was the same. - - The YAZ command-line client, yaz-client, allows you to set the proxy address by specifying option -p. In that case, the actual backend target is specified as part of the Initialize Request. - Suppose you have a proxy running on localhost, - port 9000 and wish to connect to Index Data's test target at - indexdata.dk:210/gils you could use: - - yaz-client -p localhost:9000 indexdata.dk:210/gils - - Since port 210 is the default, the port can be omitted: + Suppose the proxy is running on localhost, port 9000. + To connect to the British Library's server at + blpcz.bl.uk:21021 use: - yaz-client -p localhost:9000 indexdata.dk/gils + yaz-client -p localhost:9000 blpcz.bl.uk:21021/BLPC-ALL diff --git a/doc/yazproxy.xml.in b/doc/yazproxy.xml.in index 497ad07..3f34d49 100644 --- a/doc/yazproxy.xml.in +++ b/doc/yazproxy.xml.in @@ -3,11 +3,12 @@ "@DTD_DIR@/docbookx.dtd" [ - + + ]> - + YAZ proxy User's Guide and Reference @@ -33,7 +34,7 @@ This manual covers version @VERSION@. - CVS ID: $Id: yazproxy.xml.in,v 1.2 2004-04-11 11:58:34 adam Exp $ + CVS ID: $Id: yazproxy.xml.in,v 1.3 2004-04-15 12:04:01 adam Exp $ @@ -47,7 +48,8 @@ &chap-introduction; &chap-installation; - &chap-proxy; + &chap-using; + &chap-reference; &app-license;