X-Git-Url: http://git.indexdata.com/?p=yazpp-moved-to-github.git;a=blobdiff_plain;f=doc%2Fproxy.xml;h=556245410a2c4366e16f7100a49b887549bba276;hp=9d04a650f4a208b1a509da61f4c6732cc21a356c;hb=d84b43231c7c5b0786e9aa62d0f7ca7ecd83bdb5;hpb=5554ddf9c4d9670aaaa8f8b9ce6def1dadff3c96 diff --git a/doc/proxy.xml b/doc/proxy.xml index 9d04a65..5562454 100644 --- a/doc/proxy.xml +++ b/doc/proxy.xml @@ -1,5 +1,5 @@ - - The YAZ Proxy + + The YAZ Proxy The YAZ proxy is a transparent Z39.50-to-Z39.50 gateway. That is, it is a Z39.50 server which has as its back-end a Z39.50 client @@ -58,7 +58,7 @@ start it up. It will work exactly as usual, but all the packets will be sent via the proxy, which will generate a log like this: - + - +
Specifying the Backend Target @@ -145,7 +146,8 @@ Otherwise, the Proxy uses the default target, if one was specified on the command-line with the -t - option. + option. A default target can also be specified in the + XML Config file. @@ -159,33 +161,33 @@
Keep-alive Facility - The keep-alive is a facility where the proxy keeps the connection to the - backend - even if the client closes the connection to the proxy. - If a new or another client connects to the proxy again and requests the - same backend it will be reassigned to this backend. In this case, the - proxy sends an initialize response directly to the client and an - initialize handshake with the backend is omitted. - When a client reconnects, query and record caching works better, if the - proxy assigns it to the same backend as before. And the result set - (if any) is re-used. To achive this, Index Data defined a session - cookie which identifies the backend session. - The cookie is defined by the client and is sent as part of the - Initialize Request and passed in an + The keep-alive is a facility where the proxy keeps the connection to the + backend - even if the client closes the connection to the proxy. + If a new or another client connects to the proxy again and requests the + same backend it will be reassigned to this backend. In this case, the + proxy sends an initialize response directly to the client and an + initialize handshake with the backend is omitted. + When a client reconnects, query and record caching work better if the + proxy assigns it to the same backend as before, and the result set + (if any) is re-used. To achieve this, Index Data defined a session + cookie which identifies the backend session. + The cookie is defined by the client and is sent as part of the + Initialize Request and passed in an otherInfo - element with OID 1.2.840.10003.10.1000.81.2. + element with OID 1.2.840.10003.10.1000.81.2. - Clients that do not send a cookie as part of the initialize request - may still better performance, since the init handshake is saved. + Clients that do not send a cookie as part of the initialize request + may still see better performance, since the init handshake is saved.
- +
Query Caching @@ -212,37 +214,421 @@
-
- Record Caching - - As an option, the proxy may also cache result set records for the - last search. - The proxy takes into account the Record Syntax and CompSpec. - The CompSpec includes simple element set names as well. - + Record Caching + + As an option, the proxy may also cache result set records for the + last search. + The proxy takes into account the Record Syntax and CompSpec. + The CompSpec includes simple element set names as well. + By default the cache is 200000 bytes per session. +
- +
- Query Validation - - + Query Validation + + The Proxy may also be configured to trap particular attributes in + Type-1 queries and send Bib-1 diagnostics back to the client without + even consulting the backend target. This facility may be useful if + a target does not properly issue diagnostics when unsupported attributes + are sent to it. +
- +
- Record Syntax Validation - - + Record Syntax Validation + + The proxy may be configured to accept, reject or convert records. + When accepted, the proxy passes search/present requests to the + backend target under the assumption that the target can honor the + request (in fact it may not do that). When a record is rejected because + the record syntax is "unsupported" the proxy returns a diagnostic to the + client. Finally, the proxy may convert records. + + + In the current version the only supported conversion is + MARC21/USMARC in MARC-8 charset to MARCXML in UTF-8. Future versions of + the proxy may do other record/charset conversions. +
- +
Other Optimizations - - We've had some plans to support global caching of result set records, + + We've had some plans to support global caching of result set records, but this has not yet been implemented.
+
+ Proxy Configuration File + + As an option, the proxy may read a configuration file using option + -c followed by the filename of a config file. + + + The config file is in XML format. The YAZ proxy must be compiled + with libxml2 and + libXSLT support in + order for the config file facility to be enabled. + + + To check that a config file is well-formed, the yaz-proxy may + be invoked without specifying a listening port, i.e. + + yaz-proxy -c myconfig.xml + + If this does not produce errors, the file is well-formed. + + +
+ Proxy Configuration Header + + The proxy config file must have a root element called + proxy. All information except an optional XML + header must be stored within the proxy element. + + + <?xml version="1.0"?> + <proxy> + <!-- content here .. --> + </proxy> + +
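The root-element rule above can also be checked outside the proxy. The following is a minimal sketch in Python, not part of the YAZ proxy; the function name `check_proxy_config` is hypothetical, and it only tests well-formedness and the name of the root element.

```python
import xml.etree.ElementTree as ET


def check_proxy_config(text):
    """Return None if text looks like a proxy config, else a short message.

    Hypothetical helper: it checks only that the XML is well-formed and
    that the root element is called "proxy".
    """
    try:
        root = ET.fromstring(text)
    except ET.ParseError as exc:
        return f"not well-formed: {exc}"
    if root.tag != "proxy":
        return f"root element is <{root.tag}>, expected <proxy>"
    return None


sample = '<?xml version="1.0"?>\n<proxy>\n  <!-- content here .. -->\n</proxy>'
problem = check_proxy_config(sample)  # None for the header shown above
```

Running `yaz-proxy -c myconfig.xml` remains the authoritative check, since only the proxy itself knows which child elements it accepts.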
+
+ Configuration: target + + The element target which may be repeated zero + or more times with parent element proxy contains + information about each backend target. + The target element has two attributes: + name which holds the logical name of the backend + target (required) and default (optional) which + (when given) specifies that the backend target is the default target - + equivalent to command line option -t. + + + + <?xml version="1.0"?> + <proxy> + <target name="server1" default="1"> + <!-- description of server1 .. --> + </target> + <target name="server2"> + <!-- description of server2 .. --> + </target> + </proxy> + + +
+
+ Configuration: url + + The url which may be repeated one or more times + should be the child of the target element. + The CDATA of url is the Z-URL of the backend. + + + Multiple url elements may be used. In that case, when + a client initiates a session, the proxy chooses the URL with the lowest + number of active sessions, thereby distributing the load. It is + assumed that each URL represents the same database (data). +
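The selection rule for multiple url elements can be sketched as follows. This is an illustration in Python, not the proxy's own code; the function name, the host names, and the tie-breaking rule (first listed URL wins) are assumptions.

```python
def pick_backend_url(active_sessions):
    """Return the URL with the fewest active sessions.

    active_sessions maps each configured Z-URL to its current session count.
    Ties go to the URL listed first (an assumption; the text does not say).
    """
    if not active_sessions:
        raise ValueError("target has no url elements")
    return min(active_sessions, key=active_sessions.get)


# Hypothetical Z-URLs for one target, with their current session counts.
sessions = {"backend1.example.org:210": 3, "backend2.example.org:210": 1}
chosen = pick_backend_url(sessions)
sessions[chosen] += 1  # the new client session is now counted
```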
+
+ Configuration: keepalive + The keepalive element holds information about + the keepalive Z39.50 sessions. Keepalive sessions are proxy-to-backend + sessions that are no longer associated with a client session. + + The keepalive element which is the child of + the target holds two elements: + bandwidth and pdu. + The bandwidth is the maximum total bytes + transferred to/from the target. If a target session exceeds this + limit, it is shut down (and no longer kept alive). + The pdu is the maximum number of requests sent + to the target. If a target session exceeds this limit, it is + shut down. The idea of these two limits is to avoid very long + sessions that use resources in a backend (that leaks!). + + + The following sets maximum number of bytes transferred in a + target session to 1 MB and maximum number of requests to 400. + + <keepalive> + <bandwidth>1048576</bandwidth> + <pdu>400</pdu> + </keepalive> + + +
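The two keepalive limits amount to a simple check on per-session counters. A minimal sketch in Python, assuming hypothetical class and field names; it is not taken from the proxy source.

```python
from dataclasses import dataclass


@dataclass
class KeepaliveLimits:
    bandwidth: int  # maximum total bytes transferred to/from the target
    pdu: int        # maximum number of requests sent to the target


@dataclass
class TargetSession:
    bytes_transferred: int = 0
    requests_sent: int = 0


def may_keep_alive(session, limits):
    """True while the session has exceeded neither keepalive limit."""
    return (session.bytes_transferred <= limits.bandwidth
            and session.requests_sent <= limits.pdu)


limits = KeepaliveLimits(bandwidth=1048576, pdu=400)
quiet = TargetSession(bytes_transferred=20000, requests_sent=12)
busy = TargetSession(bytes_transferred=20000, requests_sent=401)
```

Here `quiet` would stay in the keepalive pool, while `busy` has passed the request limit and would be shut down.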
+
+ Configuration: limit + + The limit section specifies bandwidth/pdu request + limits for an active session. + The proxy records bandwidth/pdu requests during the last 60 seconds + (1 minute). The limit may include the + elements bandwidth, pdu, + and retrieve. The bandwidth + measures the number of bytes transferred within the last minute. + The pdu is the number of requests in the last + minute. The retrieve holds the maximum records to + be retrieved in one Present Request. + + + If a bandwidth/pdu limit is reached the proxy will postpone the + requests to the target and wait one or more seconds. The idea of the + limit is to ensure that clients that download hundreds or thousands of + records do not hurt other users. + + + The following sets maximum number of bytes transferred per minute to + 512 Kbytes (524288 bytes) and maximum number of requests to 40. + + <limit> + <bandwidth>524288</bandwidth> + <pdu>40</pdu> + </limit> + + + + + Typically the limits for keepalive are much higher than + those for session minute average. + + +
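One way to picture the per-minute accounting is a sliding window of recent requests. The sketch below is in Python with hypothetical names; how the proxy actually stores its counters is not described in the text, so treat the data structure as an assumption.

```python
from collections import deque


class MinuteWindow:
    """Tracks bytes and request counts over the last 60 seconds."""

    WINDOW = 60.0

    def __init__(self):
        self._events = deque()  # (timestamp in seconds, bytes transferred)

    def record(self, now, nbytes):
        self._events.append((now, nbytes))

    def totals(self, now):
        # Drop events that are 60 seconds old or older.
        while self._events and self._events[0][0] <= now - self.WINDOW:
            self._events.popleft()
        return sum(b for _, b in self._events), len(self._events)


def should_postpone(window, now, max_bandwidth, max_pdu):
    """True if either per-minute limit has been reached."""
    nbytes, npdu = window.totals(now)
    return nbytes >= max_bandwidth or npdu >= max_pdu


window = MinuteWindow()
for second in range(40):
    window.record(float(second), 1000)  # 40 requests of 1000 bytes each
```

With the limits from the example above, the 41st request at second 40 would be postponed; once the oldest requests fall out of the window, traffic flows again.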
+ +
+ Configuration: attribute + + The attribute element specifies whether to accept or reject + a particular attribute type, value pair. + Well-behaving targets will reject unsupported attributes on their + own. This feature is useful for targets that do not gracefully + handle unsupported attributes. + + + Attribute elements may be repeated. The proxy inspects the attribute + specifications in the order specified in the configuration file. + When a given attribute specification matches a given attribute list + in a query, the proxy takes appropriate action (reject, accept). + + + If no attribute specification matches the attribute list in a query, + it is accepted. + + + The attribute element has two required attributes: + type which is the Attribute Type-1 type, and + value which is the Attribute Type-1 value. + The special value/type * matches any attribute + type/value. A value may also be specified as a list with each + value separated by comma, or as a + range: low value - dash - high value. + + + If attribute error is given, that holds a + Bib-1 diagnostic which is sent to the client if the particular + type, value is part of a query. + + + If attribute error is not given, the attribute + type, value is accepted and passed to the backend target. + + + A target that supports use attributes 1,4, 1000 through 1003 and + no other use attributes, could use the following rules: + + <attribute type="1" value="1,4,1000-1003"/> + <attribute type="1" value="*" error="114"/> + + +
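The first-match evaluation of attribute rules can be sketched as below. This Python illustration is a simplification: it checks one type, value pair at a time rather than a whole attribute list, and the function names and rule representation are hypothetical.

```python
def value_matches(spec, value):
    """Match an integer against a spec such as "*", "4" or "1,4,1000-1003"."""
    if spec == "*":
        return True
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-", 1)
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def check_attribute(rules, attr_type, attr_value):
    """Return the Bib-1 error of the first matching rule, or None to accept."""
    for rule in rules:
        if (value_matches(rule["type"], attr_type)
                and value_matches(rule["value"], attr_value)):
            return rule.get("error")
    return None  # no rule matched: the attribute is accepted


# The two rules from the example above, in configuration order.
RULES = [
    {"type": "1", "value": "1,4,1000-1003"},
    {"type": "1", "value": "*", "error": 114},
]
```

Because rules are tried in order, the accepting rule must come before the catch-all rule that carries the error.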
+ +
+ Configuration: syntax + + The syntax element specifies whether to accept or reject + a particular record syntax request from the client. + + + The syntax has one required attribute: + type which is the Preferred Record Syntax. + + + If attribute error is given, that holds a + Bib-1 diagnostic which is sent to the client if the particular + record syntax is part of a present or search request. + + + If attribute error is not given, the record syntax + is accepted and passed to the backend target. + + + If attribute marcxml is given, the proxy will + perform MARC21 to MARCXML conversion. In this case the + type should be XML. The proxy will use + preferred record syntax USMARC/MARC21 against the backend target. + + To accept USMARC and offer MARCXML records but reject + all other requests the following configuration could be used: + + <proxy> + <target name="mytarget"> + <syntax type="usmarc"/> + <syntax type="xml" marcxml="1"/> + <syntax type="*" error="238"/> + </target> + </proxy> + + +
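The three outcomes for a requested record syntax (accept, convert, reject) can be sketched like this. The Python below is illustrative only: the function name is hypothetical, and both the first-match order and the pass-through for unmatched syntaxes are assumptions, since the text states them only for attribute rules.

```python
def resolve_syntax(rules, requested):
    """Return (action, detail) for a requested record syntax.

    rules is a list of dicts mirroring syntax elements, checked in order.
    """
    wanted = requested.lower()
    for rule in rules:
        if rule["type"] in ("*", wanted):
            if "error" in rule:
                return ("reject", rule["error"])  # Bib-1 diagnostic code
            if rule.get("marcxml"):
                return ("convert", "usmarc")      # ask the backend for USMARC
            return ("accept", wanted)
    return ("accept", wanted)  # assumption: unmatched syntaxes pass through


# The three rules from the example above.
RULES = [
    {"type": "usmarc"},
    {"type": "xml", "marcxml": "1"},
    {"type": "*", "error": 238},
]
```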
+ +
+ Configuration: target-timeout + + The element target-timeout is the child of element + target and specifies the amount in seconds before + a target session is shut down. + + + This can also be specified on the command line by using option + -T. Refer to . + +
+ +
+ Configuration: client-timeout + + The element client-timeout is the child of element + target and specifies the amount in seconds before + a client session is shut down. + + + This can also be specified on the command line by using option + -i. Refer to . + +
+ +
+ Configuration: preinit + + The element preinit is the child of element + target and specifies the number of spare + connections to a target. By default no spare connections are + created by the proxy. If the proxy uses a target exclusively or + a lot, the preinit sessions will ensure that target sessions + have been made before the client makes a connection and will therefore + reduce the connect-init handshake dramatically. Never set this to + more than 5. + +
+ +
+ Configuration: max-clients + + The element max-clients is the child of element + proxy and specifies the total number of + allowed connections to targets (all targets). If this limit + is reached the proxy will close the least recently used connection. + + + Note that many Unix systems impose a limit on the number of + open files allowed in a single process, typically in the + range 256 (Solaris) to 1024 (Linux). + The proxy uses 2 sockets per session + a few files + for logging. As a rule of thumb, ensure that 2*max-clients + 5 + can be opened by the proxy process. + + + + Using the + bash shell, you can set the limit with + ulimit -n no. + Use ulimit -a to display limits. + + +
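The rule of thumb can be checked against the current process limit before choosing a max-clients value. A small sketch in Python for Unix systems; the function names are hypothetical, and the check reads the limit of the Python process, so it is only meaningful when run from the same shell that will start the proxy.

```python
import resource


def required_descriptors(max_clients):
    """Rule of thumb from the text: 2 sockets per session plus a few files."""
    return 2 * max_clients + 5


def limit_is_sufficient(max_clients):
    """Compare the rule of thumb with the soft open-file limit."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return True
    return soft >= required_descriptors(max_clients)


needed = required_descriptors(500)  # 500 is an arbitrary example value
```

With a soft limit of 1024, for instance, 500 clients would need 1005 descriptors and just fit, while 600 would not.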
+ +
+ Configuration: log + + The element log is the child of element + proxy and specifies what is to be logged by the + proxy. + + + Specify the log file with command-line option -l. + + + The text of the log element is a sequence of + options separated by white space. See the table below: + Logging options + + + + + + Option + Description + + + + + client-apdu + + Log APDUs as reported by YAZ for the + communication between the client and the proxy. + This facility is equivalent to the APDU logging that + happens when using option -a, however + this tells the proxy to log in the same file as given + by -l. + + + + server-apdu + + Log APDUs as reported by YAZ for the + communication between the proxy and the server (backend). + + + + client-requests + + Log a brief description about requests transferred between + the client and the proxy. The name of the request and the size + of the APDU is logged. + + + + server-requests + + Log a brief description about requests transferred between + the proxy and the server (backend). The name of the request + and the size of the APDU is logged. + + + + +
+
+ + To log communication in detail between the proxy and the backend, the + following configuration could be used: + + server-apdu server-requests + +]]> + + +
+ +
Proxy Usage @@ -270,15 +656,15 @@ categoryValue [2] IMPLICIT INTEGER} - The categoryTypeId is either - OID 1.2.840.10003.10.1000.81.1, 1.2.840.10003.10.1000.81.2 - for proxy target and proxy cookie respectively. The - integer element category is set to 0. - The value proxy and cookie is stored in element - characterInfo of the information - choice. - -
+ The categoryTypeId is either + OID 1.2.840.10003.10.1000.81.1, 1.2.840.10003.10.1000.81.2 + for proxy target and proxy cookie respectively. The + integer element category is set to 0. + The value proxy and cookie is stored in element + characterInfo of the information + choice. +
+
+