X-Git-Url: http://git.indexdata.com/?a=blobdiff_plain;f=doc%2Fproxy.xml;fp=doc%2Fproxy.xml;h=0000000000000000000000000000000000000000;hb=213d52b0de82aa144159df657968a44cf9dafeab;hp=3f0c27ce1c1c9ed569fb2d71fb98109da6bf76d5;hpb=1b6019ff065a98af709be905adc6c08094471d57;p=yazpp-moved-to-github.git diff --git a/doc/proxy.xml b/doc/proxy.xml deleted file mode 100644 index 3f0c27c..0000000 --- a/doc/proxy.xml +++ /dev/null @@ -1,690 +0,0 @@ - - The YAZ Proxy - - The YAZ proxy is a transparent SRW/SRU/Z39.50-to-Z39.50 gateway. - That is, it is a SRW/SRU/Z39.50 server which has as its back-end a - Z39.50 client that forwards requests on to another server (known as - the backend target.) - - - -- All config directives -- - -- SRW/SRU .. - -- Example config - -- Mention XSLT conversion - - - The YAZ Proxy is useful for debugging SRW/SRU/Z39.50 software, logging - APDUs, redirecting Z39.50 packages through firewalls, etc. - Furthermore, it offers facilities that often - boost performance for connectionless Z39.50 clients such - as web gateways. - - - Unlike most other server software, the proxy runs single-threaded, - single-process. Every I/O operation - is non-blocking so it is very lightweight and extremely fast. - It does not store any state information on the hard drive, - except any log files you ask for. - - -
- Example: Using the Proxy to Log APDUs - - Suppose you use a commercial Z39.50 client for which you do not - have source code, and it's not behaving how you think it should - when running against some specific server that you have no control - over. One way to diagnose the problem is to find out what packets - (APDUs) are being sent and received, but not all client - applications have facilities to do APDU logging. - - - No problem. Run the proxy on a friendly machine, get it to log - APDUs, and point the errant client at the proxy instead of - directly at the server that's causing it problems. - - - Suppose the server is running on foo.bar.com, - port 18398. Run the proxy on the machine of your choice, say - your.company.com like this: - - - yaz-proxy -a - -t tcp:foo.bar.com:18398 tcp:@:9000 - - - (The -a - option requests APDU logging on - standard output, -t tcp:foo.bar.com:18398 - specifies where the backend target is, and - tcp:@:9000 tells the proxy to listen on port - 9000 and accept connections from any machine.) - - - Now change your client application's configuration so that instead - of connecting to foo.bar.com port 18398, it - connects to your.company.com port 9000, and - start it up. It will work exactly as usual, but all the packets - will be sent via the proxy, which will generate a log like this: - - - -
- -
- Specifying the Backend Target - - When the proxy receives a Z39.50 Initialize Request from a Z39.50 - client, it determines the backend target by the following rules: - - - If the InitializeRequest PDU from the - client includes an - otherInfo - element with OID - 1.2.840.10003.10.1000.81.1, then the - contents of that element specify the target to be used, in the - usual YAZ address format (typically - tcp:hostname:port) - as described in - the Addresses section of the YAZ manual. - - - - Otherwise, the Proxy uses the default target, if one was - specified on the command-line with the -t - option. A default target can also be specified in the - XML Config file. - - - - Otherwise, the proxy closes the connection with - the client. - - - - -
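The three rules above can be sketched as a small lookup. This is only an illustration of the decision order described in the text, not the proxy's actual code: the function name `resolve_target` and the dictionary representation of the `otherInfo` elements (OID mapped to string value) are assumptions made for the example.

```python
from typing import Mapping, Optional

# OID that carries the proxy target in the InitializeRequest otherInfo
PROXY_TARGET_OID = "1.2.840.10003.10.1000.81.1"


def resolve_target(other_info: Mapping[str, str],
                   default_target: Optional[str]) -> Optional[str]:
    """Return the backend address, or None if the client should be dropped.

    `other_info` is a hypothetical OID -> value view of the otherInfo
    elements found in the client's InitializeRequest.
    """
    # Rule 1: an explicit target in otherInfo takes precedence.
    target = other_info.get(PROXY_TARGET_OID)
    if target:
        return target
    # Rule 2: fall back to the default target (-t or the XML config file).
    if default_target:
        return default_target
    # Rule 3: no target known, so the connection is closed.
    return None
```

For instance, `resolve_target({}, "tcp:foo.bar.com:18398")` falls through to the default target, while a call with neither an `otherInfo` target nor a default returns `None`.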
-
- Keep-alive Facility - - The keep-alive is a facility where the proxy keeps the connection to the - backend - even if the client closes the connection to the proxy. - - - If the same or another client later connects to the proxy and requests the - same backend, it is assigned to this kept-alive backend session. In this case, the - proxy sends an initialize response directly to the client and the - initialize handshake with the backend is omitted. - - - When a client reconnects, query and record caching work better if the - proxy assigns it to the same backend session as before, and the result set - (if any) is re-used. To achieve this, Index Data defined a session - cookie which identifies the backend session. - - - The cookie is defined by the client and is sent as part of the - Initialize Request and passed in an - otherInfo - element with OID 1.2.840.10003.10.1000.81.2. - - - Clients that do not send a cookie as part of the initialize request - may still see better performance, since the init handshake is saved. -
- -
- Query Caching - - Simple stateless clients often send identical Z39.50 searches - in a relatively short period of time (e.g. in order to produce a - results-list page, the next page, - a single full-record, etc). And for many targets, it's - much more expensive to produce a new result set than to - reuse an existing one. - - - The proxy tries to solve that by remembering the last query for each - backend target, so that if an identical query is received next, it - is turned into Present Requests rather than new Search Requests. - - - - A future release will probably allow for - an arbitrary-sized cache for targets supporting named result sets. - - - - You can enable/disable query caching using option -o. -
- -
- Record Caching - - As an option, the proxy may also cache result set records for the - last search. - The proxy takes into account the Record Syntax and CompSpec. - The CompSpec includes simple element set names as well. - By default the cache is 200000 bytes per session. - -
- -
- Query Validation - - The Proxy may also be configured to trap particular attributes in - Type-1 queries and send Bib-1 diagnostics back to the client without - even consulting the backend target. This facility may be useful if - a target does not properly issue diagnostics when unsupported attributes - are sent to it. -
- -
- Record Syntax Validation - - The proxy may be configured to accept, reject or convert records. - When a record syntax is accepted, the proxy passes search/present requests to the - backend target under the assumption that the target can honor the - request (in fact, it may not). When a record is rejected because - the record syntax is "unsupported", the proxy returns a diagnostic to the - client. Finally, the proxy may convert records. - - - The proxy can convert from MARC to MARCXML and thereby offer an - XML version of any MARC record as long as it is ISO2709 encoded. - If the proxy is compiled with libXSLT support it can also - perform XSLT on XML. -
- -
- Other Optimizations - - We've had some plans to support global caching of result set records, - but this has not yet been implemented. - -
- -
- Proxy Configuration File - - The Proxy may read a configuration file using option - -c followed by the filename of a config file. - - - The config file is XML based. The YAZ proxy must be compiled - with libxml2 and - libXSLT support in - order for the config file facility to be enabled. - - - To check that a config file is well-formed, yaz-proxy may - be invoked without specifying a listening port, i.e. - - yaz-proxy -c myconfig.xml - - If this does not produce errors, the file is well-formed. - - -
- Proxy Configuration Header - - The proxy config file must have a root element called - proxy. All information except an optional XML - header must be stored within the proxy element. - - - <?xml version="1.0"?> - <proxy> - <!-- content here .. --> - </proxy> - -
-
- Configuration: target - - The element target, which may be repeated zero - or more times with parent element proxy, contains - information about each backend target. - The target element has two attributes: - name which holds the logical name of the backend - target (required) and default (optional) which - (when given) specifies that the backend target is the default target - - equivalent to command line option -t. - - - - <?xml version="1.0"?> - <proxy> - <target name="server1" default="1"> - <!-- description of server1 .. --> - </target> - <target name="server2"> - <!-- description of server2 .. --> - </target> - </proxy> - - -
-
- Configuration: url - - The url element, which may be repeated one or more times, - should be the child of the target element. - The CDATA of url is the Z-URL of the backend. - - - Multiple url elements may be used. In that case, when - a client initiates a session, the proxy chooses the URL with the lowest - number of active sessions, thereby distributing the load. It is - assumed that each URL represents the same database (data). -
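The load-distribution rule can be sketched in a few lines. This is a minimal illustration of "pick the URL with the fewest active sessions", not the proxy's implementation; the function name `pick_url` and the session-count dictionary are assumptions, and the tie-breaking rule (first URL listed wins) is a choice made for the example, since the text does not say how ties are resolved.

```python
from typing import Dict


def pick_url(active_sessions: Dict[str, int]) -> str:
    """Return the Z-URL with the lowest number of active sessions.

    `active_sessions` maps each configured url (in configuration order)
    to its current session count. On a tie, min() keeps the first
    candidate it encounters, i.e. the URL listed first.
    """
    return min(active_sessions, key=lambda url: active_sessions[url])


# Hypothetical backends that all serve the same database.
sessions = {"tcp:a.example.org:210": 3,
            "tcp:b.example.org:210": 1,
            "tcp:c.example.org:210": 1}
chosen = pick_url(sessions)
sessions[chosen] += 1  # the new client session now counts against that URL
```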
-
- Configuration: keepalive - The keepalive element holds information about - the keepalive Z39.50 sessions. Keepalive sessions are proxy-to-backend - sessions that are no longer associated with a client session. - - The keepalive element, which is the child of - the target element, holds two elements: - bandwidth and pdu. - The bandwidth is the maximum total number of bytes - transferred to/from the target. If a target session exceeds this - limit, it is shut down (and no longer kept alive). - The pdu is the maximum number of requests sent - to the target. If a target session exceeds this limit, it is - shut down. The idea of these two limits is to avoid very long - sessions that use up resources in a backend (for example, one that leaks). - - - The following sets the maximum number of bytes transferred in a - target session to 1 MB and the maximum number of requests to 400. - - <keepalive> - <bandwidth>1048576</bandwidth> - <pdu>400</pdu> - </keepalive> - - -
-
- Configuration: limit - - The limit section specifies bandwidth/pdu request - limits for an active session. - The proxy records bandwidth/pdu requests during the last 60 seconds - (1 minute). The limit may include the - elements bandwidth, pdu, - and retrieve. The bandwidth - measures the number of bytes transferred within the last minute. - The pdu is the number of requests in the last - minute. The retrieve holds the maximum number of records to - be retrieved in one Present Request. - - - If a bandwidth/pdu limit is reached, the proxy will postpone the - requests to the target and wait one or more seconds. The idea of the - limit is to ensure that clients that download hundreds or thousands of - records do not hurt other users. - - - The following sets the maximum number of bytes transferred per minute to - 512 KB (524288 bytes) and the maximum number of requests per minute to 40. - - <limit> - <bandwidth>524288</bandwidth> - <pdu>40</pdu> - </limit> - - - - - Typically the limits for keepalive are much higher than - the per-minute limits for an active session. - - -
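The per-minute accounting can be pictured as a sliding window over recent requests. The sketch below only illustrates that idea under stated assumptions: the class name `MinuteWindow`, its methods, and the use of caller-supplied timestamps are all hypothetical, and the proxy's real bookkeeping may differ.

```python
from collections import deque


class MinuteWindow:
    """Track bytes and request counts over the last `window` seconds."""

    def __init__(self, max_bytes: int, max_pdus: int, window: float = 60.0):
        self.max_bytes = max_bytes
        self.max_pdus = max_pdus
        self.window = window
        self.events = deque()  # (timestamp, nbytes), oldest first

    def record(self, now: float, nbytes: int) -> None:
        """Register one request of `nbytes` bytes seen at time `now`."""
        self.events.append((now, nbytes))

    def should_postpone(self, now: float) -> bool:
        """True if either limit has been reached within the window."""
        # Drop events that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            self.events.popleft()
        total_bytes = sum(nbytes for _, nbytes in self.events)
        return total_bytes >= self.max_bytes or len(self.events) >= self.max_pdus


# Limits from the example above: 524288 bytes and 40 requests per minute.
limit = MinuteWindow(max_bytes=524288, max_pdus=40)
```

A caller would check `should_postpone()` before forwarding a request and, while it returns true, wait a second and check again, which mirrors the "wait one or more seconds" behaviour described above.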
- -
- Configuration: attribute - - The attribute element specifies accept or reject - for a particular attribute type, value pair. - Well-behaving targets will reject unsupported attributes on their - own. This feature is useful for targets that do not gracefully - handle unsupported attributes. - - - Attribute elements may be repeated. The proxy inspects the attribute - specifications in the order in which they appear in the configuration file. - When a given attribute specification matches a given attribute list - in a query, the proxy takes appropriate action (reject, accept). - - - If no attribute specification matches the attribute list in a query, - it is accepted. - - - The attribute element has two required attributes: - type which is the Attribute Type-1 type, and - value which is the Attribute Type-1 value. - The special value/type * matches any attribute - type/value. A value may also be specified as a list with each - value separated by a comma, or as a - range: low value - dash - high value. - - - If attribute error is given, that holds a - Bib-1 diagnostic which is sent to the client if the particular - type, value is part of a query. - - - If attribute error is not given, the attribute - type, value is accepted and passed to the backend target. - - - A target that supports use attributes 1, 4, 1000 through 1003 and - no other use attributes could use the following rules: - - <attribute type="1" value="1,4,1000-1003"/> - <attribute type="1" value="*" error="114"/> - - -
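The first-match evaluation of attribute rules can be sketched as follows. This is an illustration of the matching rules described above, not the proxy's code: the function names, the tuple representation of a rule, and the simplification of checking one type/value pair at a time (rather than a whole attribute list) are assumptions made for the example.

```python
from typing import List, Optional, Tuple

# One rule per <attribute> element: (type spec, value spec, error or None)
Rule = Tuple[str, str, Optional[int]]


def spec_matches(spec: str, number: int) -> bool:
    """Match `number` against '*', a comma-separated list, or low-high ranges."""
    if spec == "*":
        return True
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-", 1)
            if int(low) <= number <= int(high):
                return True
        elif int(part) == number:
            return True
    return False


def check_attribute(rules: List[Rule], attr_type: int, attr_value: int) -> Optional[int]:
    """Return the Bib-1 diagnostic of the first matching rule.

    None means the pair is accepted: either the matching rule has no
    error attribute, or no rule matched at all.
    """
    for type_spec, value_spec, error in rules:
        if spec_matches(type_spec, attr_type) and spec_matches(value_spec, attr_value):
            return error
    return None


# The two rules from the example above.
rules = [("1", "1,4,1000-1003", None), ("1", "*", 114)]
```

With these rules, use attribute 1002 is accepted, use attribute 5 yields diagnostic 114, and an attribute of any other type is accepted because no rule matches it.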
- -
- Configuration: syntax - - The syntax element specifies accept or reject - for a particular record syntax request from the client. - - - The syntax element has one required attribute: - type which is the Preferred Record Syntax. - - - If attribute error is given, that holds a - Bib-1 diagnostic which is sent to the client if the particular - record syntax is part of a present - or search request. - - - If attribute error is not given, the record syntax - is accepted and passed to the backend target. - - - If attribute marcxml is given, the proxy will - perform MARC21 to MARCXML conversion. In this case the - type should be XML. The proxy will use - preferred record syntax USMARC/MARC21 against the backend target. - - To accept USMARC and offer MARCXML records but reject - all other requests, the following configuration could be used: - - <proxy> - <target name="mytarget"> - <syntax type="usmarc"/> - <syntax type="xml" marcxml="1"/> - <syntax type="*" error="238"/> - </target> - </proxy> - - -
- -
- Configuration: target-timeout - - The element target-timeout is the child of element - target and specifies the time in seconds before - a target session is shut down. - - - This can also be specified on the command line by using option - -T. Refer to . -
- -
- Configuration: client-timeout - - The element client-timeout is the child of element - target and specifies the time in seconds before - a client session is shut down. - - - This can also be specified on the command line by using option - -i. Refer to . -
- -
- Configuration: preinit - - The element preinit is the child of element - target and specifies the number of spare - connections to a target. By default no spare connections are - created by the proxy. If the proxy uses a target exclusively or - a lot, the preinit sessions will ensure that target sessions - have been made before the client makes a connection and will therefore - dramatically reduce the time spent on the connect-init handshake. Never set this to - more than 5. -
- -
- Configuration: max-clients - - The element max-clients is the child of element - proxy and specifies the total number of - allowed connections to targets (all targets). If this limit - is reached the proxy will close the least recently used connection. - - - Note that many Unix systems impose a system limit on the number of - open files allowed in a single process, typically in the - range 256 (Solaris) to 1024 (Linux). - The proxy uses 2 sockets per session + a few files - for logging. As a rule of thumb, ensure that 2*max-clients + 5 - files can be opened by the proxy process. - - - - Using the - bash shell, you can set the limit with - ulimit -n no. - Use ulimit -a to display limits. - - -
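The rule of thumb is simple arithmetic, and it can be compared with the limit the operating system currently grants the process. The sketch below assumes a Unix system (the standard `resource` module is not available elsewhere); the function names are made up for the example.

```python
import resource


def required_descriptors(max_clients: int) -> int:
    """Rule of thumb from the text: 2 sockets per session plus 5 spare."""
    return 2 * max_clients + 5


def fits_current_limit(max_clients: int) -> bool:
    """Compare the rule of thumb with this process's soft open-file limit."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return True
    return required_descriptors(max_clients) <= soft


# With a soft limit of 1024 open files, the largest max-clients value that
# satisfies 2*max-clients + 5 <= 1024 is 509.
largest_for_1024 = (1024 - 5) // 2
```

Running `fits_current_limit()` from the same shell that will start the proxy shows whether the limit displayed by `ulimit -n` needs raising for a planned `max-clients` value.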
- -
- Configuration: log - - The element log is the child of element - proxy and specifies what is to be logged by the - proxy. - - - Specify the log file with command-line option -l. - - - The text of the log element is a sequence of - options separated by white space. See the table below: - Logging options - - - - - - Option - Description - - - - - client-apdu - - Log APDUs as reported by YAZ for the - communication between the client and the proxy. - This facility is equivalent to the APDU logging that - happens when using option -a; however, - this tells the proxy to log to the same file as given - by -l. - - - - server-apdu - - Log APDUs as reported by YAZ for the - communication between the proxy and the server (backend). - - - - clients-requests - - Log a brief description of requests transferred between - the client and the proxy. The name of the request and the size - of the APDU are logged. - - - - server-requests - - Log a brief description of requests transferred between - the proxy and the server (backend). The name of the request - and the size of the APDU are logged. - - - - -
-
- - To log communication in detail between the proxy and the backend, the - following configuration could be used: - - <log>server-apdu server-requests</log> - - - -
- -
-
- Proxy Usage - - - - &yaz-proxy-ref; - -
-
OtherInformation Encoding - - The proxy uses the OtherInformation definition to carry - information about the target address and cookie. - - - OtherInformation ::= [201] IMPLICIT SEQUENCE OF SEQUENCE{ - category [1] IMPLICIT InfoCategory OPTIONAL, - information CHOICE{ - characterInfo [2] IMPLICIT InternationalString, - binaryInfo [3] IMPLICIT OCTET STRING, - externallyDefinedInfo [4] IMPLICIT EXTERNAL, - oid [5] IMPLICIT OBJECT IDENTIFIER}} --- - InfoCategory ::= SEQUENCE{ - categoryTypeId [1] IMPLICIT OBJECT IDENTIFIER OPTIONAL, - categoryValue [2] IMPLICIT INTEGER} - - - The categoryTypeId is either - OID 1.2.840.10003.10.1000.81.1 or 1.2.840.10003.10.1000.81.2, - for proxy target and proxy cookie respectively. The - integer element categoryValue is set to 0. - The value of the proxy target or cookie is stored in element - characterInfo of the information - choice. -
-
- -