X-Git-Url: http://git.indexdata.com/?p=yazproxy-moved-to-github.git;a=blobdiff_plain;f=doc%2Freference.xml;h=2ef809c4f7706bbfa77a921930d1063e69c11788;hp=15274ea7b372d7d630412f5e7cd589b1da887b43;hb=9b8dda8226cff9fb0ee5bf8d7c0e88e9613df63e;hpb=101d9c2ca072f3cd7fb6cb89c67573318b14b8c1 diff --git a/doc/reference.xml b/doc/reference.xml index 15274ea..2ef809c 100644 --- a/doc/reference.xml +++ b/doc/reference.xml @@ -3,7 +3,7 @@
Operating Environment - The YAZ proxy is a single program. After startup it spawns + The YAZ proxy is a console program. After startup it spawns a child process (except on Windows or if option -X is given). The child process is the core of the proxy and it handles all communication with clients and servers. The parent process @@ -12,12 +12,12 @@ see . - As an option the proxy may change user identity to a less priviledged + As an option, the proxy may change user identity to a less privileged user.
- Specifying the Backend Server + Choosing the Backend Server When the proxy receives a Z39.50 Initialize Request from a Z39.50 client, it determines the backend server by the following rules: @@ -32,7 +32,7 @@ usual YAZ address format (typically tcp:hostname:port) as described in - the Addresses section of the YAZ manual. @@ -57,6 +57,22 @@ + + If the proxy receives an SRU request, the following rules are used. + + + If the default target has Explain information with a + database that matches the path of the + SRU HTTP request, that backend server is used for SRU operation. + + + + + Otherwise the service will return HTTP 404 (Not found). + + + +
Keep-alive Facility @@ -92,6 +108,81 @@
+ +
+ Query Caching + + Simple stateless clients often send identical Z39.50 searches + in a relatively short period of time (e.g. in order to produce a + results-list page, the next page, + a single full-record, etc). And for many targets, it's + much more expensive to produce a new result set than to + reuse an existing one. + + + The proxy tries to solve that by remembering the last query for each + backend target, so that if an identical query is received next, it + is turned into Present Requests rather than new Search Requests. + + + + In a future release we will probably allow for + an arbitrary-sized cache for targets supporting named result sets. + + + + You can enable/disable query caching using option -o. +
+ +
+ Record Caching + + As an option, the proxy may also cache result set records for the + last search. + The proxy takes into account the Record Syntax and CompSpec. + The CompSpec includes simple element set names as well. + By default the cache is 200000 bytes per session. + +
+ +
+ Query Validation + + The proxy may also be configured to trap particular attributes in + Type-1 queries and send Bib-1 diagnostics back to the client without + even consulting the backend target. This facility may be useful if + a target does not properly issue diagnostics when unsupported attributes + are sent to it. +
+ +
+ Record Syntax Validation + + The proxy may be configured to accept, reject or convert records. + When accepted, the proxy passes search/present requests to the + backend target under the assumption that the target can honor the + request (in fact, it may not). When a record is rejected because + the record syntax is "unsupported" the proxy returns a diagnostic to the + client. Finally, the proxy may convert records. + + + The proxy can convert from MARC to MARCXML and thereby offer an + XML version of any MARC record as long as it is ISO2709 encoded. + If the proxy is compiled with libXSLT support it can also + perform XSLT on XML. +
+ +
+ Other Optimizations + + We've had some plans to support global caching of result set records, + but this has not yet been implemented. + +
+
Proxy Configuration File @@ -100,10 +191,14 @@ The config file is XML based. The YAZ proxy must be compiled - with libxml2 and - libXSLT support in + with libxml2 and + libXSLT support in order for the config file facility to be enabled. + + See for an XML schema + for the configuration. + To check for a config file to be well-formed, the yazproxy may be invoked without specifying a listening port, i.e. @@ -117,14 +212,16 @@ Proxy Configuration Header The proxy config file must have a root element called - proxy. All information except an optional XML - header must be stored within the proxy element. + proxy and scoped within namespace + xmlns="http://indexdata.dk/yazproxy/schema/0.9/". + All information except an optional XML header must be stored + within the proxy element. <?xml version="1.0"?> - <proxy> - <!-- content here .. --> - </proxy> + <proxy xmlns="http://indexdata.dk/yazproxy/schema/0.9/"> + <!-- content here .. --> + </proxy>
@@ -140,16 +237,17 @@ equivalent to command line option -t. - - <?xml version="1.0"?> - <proxy> - <target name="server1" default="1"> - <!-- description of server1 .. --> - </target> - <target name="server2"> - <!-- description of server2 .. --> - </target> - </proxy> + + + + + + + + + + ]]>
@@ -177,7 +275,7 @@ This can also be specified on the command line by using option - -T. Refer to OPTIONS. + -T. Refer to OPTIONS in . @@ -190,7 +288,17 @@ This can also be specified on the command line by using option - -i. Refer to OPTIONS. + -i. Refer to OPTIONS in . + + + +
+ max-sockets + + The element max-sockets is a child of element + target and specifies the maximum number of sockets + to use for the target for all sessions using it. In other words: the maximum + number of Z39.50 sessions to the target.
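As a minimal sketch of the max-sockets element described above, a target could cap its backend connections as follows. The value 10 is an arbitrary illustration, and the target name is borrowed from the earlier target example; neither is a recommended setting.

```xml
<?xml version="1.0"?>
<proxy xmlns="http://indexdata.dk/yazproxy/schema/0.9/">
  <target name="server1" default="1">
    <!-- other settings for server1 .. -->
    <!-- hypothetical value: at most 10 Z39.50 sessions to this target -->
    <max-sockets>10</max-sockets>
  </target>
</proxy>
```

The position of max-sockets relative to its sibling elements may be constrained by the configuration schema, so check the order against yazproxy.xsd.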
@@ -213,11 +321,11 @@ The following sets maximum number of bytes transferred in a - target session to 1 MB and maxinum of requests to 400. + target session to 1 MB and maximum of requests to 400. <keepalive> <bandwidth>1048576</bandwidth> - <retrieve>400</retrieve> + <pdu>400</pdu> </keepalive> @@ -230,32 +338,37 @@ The proxy records bandwidth/pdu requests during the last 60 seconds (1 minute). The limit may include the elements bandwidth, pdu, - and retrieve. The bandwidth + retrieve and search. + The bandwidth measures the number of bytes transferred within the last minute. The pdu is the number of requests in the last minute. The retrieve holds the maximum records to - be retrieved in one Present Request. + which may be retrieved in one Present Request. + The search is the maximum number of searches + within the last minute. - If a bandwidth/pdu limit is reached the proxy will postpone the + If a bandwidth/pdu/search limit is reached the proxy will postpone the requests to the target and wait one or more seconds. The idea of the limit is to ensure that clients that download hundreds or thousands of records do not hurt other users. The following sets maximum number of bytes transferred per minute to - 500Kbytes and maximum number of requests to 40. + 500Kbytes, maximum number of record retrievals to 40 + and maximum number of searches to 20. <limit> <bandwidth>524288</bandwidth> <retrieve>40</retrieve> + <search>20</search> </limit> - Typically the limits for keepalive are much higher than - those for session minute average. + Typically the values in the keepalive section are much higher + than their equivalent limit counterparts (bandwidth, pdu). @@ -301,29 +414,17 @@ A target that supports use attributes 1,4, 1000 through 1003 and no other use attributes, could use the following rules: - <attribute type="1" value="1,4,1000-1003"> + <attribute type="1" value="1,4,1000-1003"/> <attribute type="1" value="*" error="114"/> - - -
syntax The syntax element specifies accept or reject - or a particular record syntax request from the client. + or a particular record syntax request from the client. It also + allows record conversion of XML records via XSLT. The syntax has one required attribute: @@ -342,27 +443,92 @@ If attribute marcxml is given, the proxy will perform MARC21 to MARCXML conversion. In this case the type should be XML. The proxy will use - preferred record syntax USMARC/MARC21 against the backend target. + preferred record syntax USMARC/MARC21 or backendtype + (if given) against the backend target. + For the special case where backendtype is + opac the proxy will convert the OPAC + record to OPACXML. - To accept USMARC and offer MARCXML XML records but reject - all other requests the following configuration could be used: - - <proxy> + + When marcxml is used, yazproxy assumes + that records retrieved from the backend are encoded in the + MARC-8 character set. + This is correct for most MARC21 based systems, but not for + other MARC variants or UTF-8 based MARC21 systems. + The backendcharset attribute specifies + the character set of the MARC records to be converted. + + + If attribute backendtype is given, that holds the + record syntax to be transmitted to backend. + + + If attribute backendelementset is given, that holds + elementset to be transmitted to backend. An empty value of + backendelementset has the effect of omitting + any Comp-Spec (and elementset) sent to backend. + + If backendelementset is omitted, the element + set from client is used, except if marcxml is used. + In that case (using marcxml), no Comp-Spec and no + elementset is sent to backend. + + + If attribute stylesheet is given, the proxy + will convert XML record from server via XSLT. It is important + that the content from server is XML. If used in conjunction with + attribute marcxml, the MARC to MARCXML/OPACXML + conversion takes place before the XSLT conversion takes place. 
+ + + If attribute identifier is given, that is the + SRU record schema identifier for the resulting output record (after + MARCXML and/or XSLT conversion). + + + If sub element title is given (as child element + of syntax), then that is the official SRU + name of the resulting record schema. + + + If sub element name is given, that is an alias + for the record schema identifier. Multiple names + may be specified. + + + MARCXML conversion + To accept USMARC and offer MARCXML XML plus Dublin Core (via + XSLT conversion) but reject all other requests, the following configuration could be used: + + <proxy> <target name="mytarget"> + .. <syntax type="usmarc"/> - <syntax type="xml" marcxml="1"/> - <syntax type="*" error="238"/> + <syntax type="xml" marcxml="1" + identifier="info:srw/schema/1/marcxml-v1.1"> + <title>MARCXML</title> + <name>marcxml</name> + </syntax> + <syntax type="xml" marcxml="1" stylesheet="MARC21slim2SRWDC.xsl" + identifier="info:srw/schema/1/dc-v1.1"> + <title>Dublin Core</title> + <name>dc</name> + </syntax> + <syntax type="*" error="238"/> + .. </target> - </proxy> - + </proxy> + + +
explain The explain element includes Explain information - for SRW/SRU about the server in the target section. This + for SRU about the server in the target section. This information must have a serverInfo element with a database that this target must be available as (URL path). For example, @@ -377,7 +543,7 @@ ]]> - In the above case, the SRW/SRU service is available as + In the above case, the SRU service is available as http://myhost.org:8000/mydatabase. @@ -386,17 +552,20 @@
cql2rpn - The CDATA of cql2rpn refers to CQL to a RPN conversion - file - for the server in the target section. This element - is required for SRW/SRU searches to operate against a Z39.50 - server that doesn't support CQL. Most Z39.50 servers only support + The content of the cql2rpn element specifies + the path from the working directory to a CQL-to-RPN conversion + file for the server in the target section. This element + is required for SRU searches to operate against Z39.50 + servers that don't support CQL. Most Z39.50 servers only support Type-1/RPN so this is usually required. + + See YAZ documentation for more information about the - CQL - to PQF conversion. See also the + CQL to PQF conversion. + See also the pqf.properties in the etc (or prefix/share/yazproxy) - directory of the YAZ proxy. + directory of the YAZ proxy distribution.
@@ -414,6 +583,67 @@
+
+ target-authentication + + The element target-authentication specifies + fixed authentication information to be sent to the backend target. + + + This element takes an attribute type which is + the authentication type to be used. + + + none + + + No authentication. There is no CDATA associated with this. + + + + + anonymous + + + Anonymous authentication. There is no CDATA associated with this. + + + + + open + + + Open authentication. The CDATA consists of the + open authentication string. + + + + + idPass + + + IdPass authentication. The CDATA consists of + three terms: user, group and password. + + + + +
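The following sketch shows how the target-authentication element described above might look for the open type. The target name and the authentication string are hypothetical placeholders. The idPass form is deliberately not shown because the text above does not say how its three terms are separated.

```xml
<?xml version="1.0"?>
<proxy xmlns="http://indexdata.dk/yazproxy/schema/0.9/">
  <target name="mytarget">
    <!-- other settings for mytarget .. -->
    <!-- hypothetical credentials: the element content is the open
         authentication string sent to the backend target -->
    <target-authentication type="open">someuser/somepassword</target-authentication>
  </target>
</proxy>
```

For the none and anonymous types the element would carry only the type attribute and no content.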
+ +
+ target-charset + + The element target-charset specifies the + native character set that the target uses for queries. + + + If this is specified the proxy will act as a Z39.50 server + supporting character set negotiation. And in SRU mode + it will convert from UTF-8 (UNICODE) to this native character + set (if possible). + +
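A minimal sketch of target-charset, assuming a backend that expects queries in ISO-8859-1. The target name is hypothetical, and the exact character set names accepted depend on the conversion library the proxy was built with, so treat the value as an assumption.

```xml
<?xml version="1.0"?>
<proxy xmlns="http://indexdata.dk/yazproxy/schema/0.9/">
  <target name="mytarget">
    <!-- other settings for mytarget .. -->
    <!-- assumed charset name: the native character set the backend
         uses for queries -->
    <target-charset>iso-8859-1</target-charset>
  </target>
</proxy>
```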
+
max-clients @@ -432,8 +662,8 @@ - Using the - bash shell, you can set the limit with + Using the bash shell, you can set + the limit with ulimit -nno. Use ulimit -a to display limits. @@ -497,6 +727,13 @@ and the size of the APDU is logged. + + client-ip + + Log the client IP for each log entry. By default, the client IP + is only logged when a new session starts. + + @@ -512,84 +749,77 @@
- - -
- Query Caching - - Simple stateless clients often send identical Z39.50 searches - in a relatively short period of time (e.g. in order to produce a - results-list page, the next page, - a single full-record, etc). And for many targets, it's - much more expensive to produce a new result set than to - reuse an existing one. - - - The proxy tries to solve that by remembering the last query for each - backend target, so that if an identical query is received next, it - is turned into Present Requests rather than new Search Requests. - - + +
+ max-connect - In a future we release will will probably allows for - an arbitrary-sized cache for targets supporting named result sets. + The element max-connect is a child of element + proxy and specifies the maximum number + of connections to be initiated within the last minute (or + value of period-connect). - - - You can enable/disable query caching using option -o. -
- -
- Record Caching - - As an option, the proxy may also cache result set records for the - last search. - The proxy takes into account the Record Syntax and CompSpec. - The CompSpec includes simple element set names as well. - By default the cache is 200000 bytes per session. - -
- -
- Query Validation - - The Proxy may also be configured to trap particular attributes in - Type-1 queries and send Bib-1 diagnostics back to the client without - even consulting the backend target. This facility may be useful if - a target does not properly issue diagnostics when unsupported attributes - are send to it. - -
- -
- Record Syntax Validation - - The proxy may be configured to accept, reject or convert records. - When accepted, the target passes search/present requests to the - backend target under the assumption that the target can honor the - request (In fact it may not do that). When a record is rejected because - the record syntax is "unsupported" the proxy returns a diagnostic to the - client. Finally, the proxy may convert records. - - - The proxy can convert from MARC to MARCXML and thereby offer an - XML version of any MARC record as long as it is ISO2709 encoded. - If the proxy is compiled with libXSLT support it can also - perform XSLT on XML. - -
- -
- Other Optimizations - - We've had some plans to support global caching of result set records, - but this has not yet been implemented. - + + If the maximum number is reached the proxy will terminate the + just initiated session (connection terminated). + +
+ +
+ limit-connect + + The element limit-connect is a child of element + proxy and specifies the limit of number + of connections to be initiated within the last minute (or + value of period-connect). + + + If the limit is reached the proxy delays the first operation + in the session by one second. +
+ +
+ period-connect + + The element period-connect is a child of element + proxy and specifies the period, in seconds, + over which limit-connect and + max-connect + measure connections. + + + If period-connect is omitted, 60 seconds is used. +
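The three connection-throttling elements described above (max-connect, limit-connect and period-connect) can be combined as in the sketch below. All numbers are illustrative assumptions, not recommended values, and the order of the elements may be constrained by yazproxy.xsd.

```xml
<?xml version="1.0"?>
<proxy xmlns="http://indexdata.dk/yazproxy/schema/0.9/">
  <!-- target definitions .. -->

  <!-- when this many connections were initiated within the period,
       the first operation of a session is delayed by one second -->
  <limit-connect>20</limit-connect>

  <!-- when this many connections were initiated within the period,
       the just initiated session is terminated -->
  <max-connect>40</max-connect>

  <!-- measure connections over 30 seconds instead of the default 60 -->
  <period-connect>30</period-connect>
</proxy>
```

Setting limit-connect below max-connect means the proxy first slows new sessions down and only drops them once the higher threshold is reached.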
+ +
+ docpath + + The element docpath is a child of element + proxy and specifies an allowed HTTP path + for local file access. Using docpath the + proxy may return static file content. + + + The value of docpath serves both as an HTTP path prefix + and as a local file prefix. + If a value of etc is used, only URLs with the + prefix /etc/ result in a local file access to the + directory etc within the working directory + of yazproxy. + + + + Care has been taken to ensure that hostile URLs are rejected - including + strings such as .. and / (absolute + file system access). + + +
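The etc case described above corresponds to a configuration along these lines. This is a sketch only; which files are present in the directory depends on the installation.

```xml
<?xml version="1.0"?>
<proxy xmlns="http://indexdata.dk/yazproxy/schema/0.9/">
  <!-- target definitions .. -->

  <!-- URLs beginning with /etc/ are served from the directory etc,
       relative to the working directory of yazproxy -->
  <docpath>etc</docpath>
</proxy>
```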
+
-
- Proxy Usage (man page) + Proxy Manual Pages &yaz-proxy-ref; @@ -618,12 +848,156 @@ The categoryTypeId is either OID 1.2.840.10003.10.1000.81.1, 1.2.840.10003.10.1000.81.2 for proxy target and proxy cookie respectively. The - integer element category is set to 0. + categoryValue is set to 1. The value proxy and cookie is stored in element characterInfo of the information choice. -
+ +
+ YAZ Proxy Configuration Schema + + Here is an XML Schema for the YAZ proxy configuration file. + The schema, yazproxy.xsd, is located in the sub + directory etc of the distribution. + + [Schema listing (contents of yazproxy.xsd) not preserved in this extract.] +]]> +