X-Git-Url: http://git.indexdata.com/?a=blobdiff_plain;f=doc%2Fserver.xml;h=063802a84222a58b9e256c7c07815e605ce95d41;hb=47054fae00306e75212a26ee5305f00032c99001;hp=993fe63f60f73e6b51567e73cbcc785b0125d4ad;hpb=d2e692248eac6469ef7a3a3f8044010cb5cc1da7;p=idzebra-moved-to-github.git diff --git a/doc/server.xml b/doc/server.xml index 993fe63..063802a 100644 --- a/doc/server.xml +++ b/doc/server.xml @@ -1,5 +1,5 @@ - + The Z39.50 Server @@ -12,6 +12,72 @@ can be run (inetd, nt service, stand-alone program, daemon...) -H --> + + + + Description + Zebra is a high-performance, general-purpose structured text indexing + and retrieval engine. It reads structured records in a variety of input + formats (eg. email, XML, MARC) and allows access to them through exact + boolean search expressions and relevance-ranked free-text queries. + + + zebrasrv is the Z39.50 and SRW/U frontend + server for the Zebra indexer. + + + On Unix you can run the zebrasrv + server from the command line - and put it + in the background. It may also operate under the inet daemon. + On WIN32 you can run the server as a console application or + as a WIN32 Service. + + + + + Synopsis + &zebrasrv-synopsis; + + + + Options + + + The options for zebrasrv are the same + as those for YAZ' yaz-ztest. + Option -c specifies a Zebra configuration + file - if omitted zebra.cfg is read. + + + &zebrasrv-options; + + + Files + + zebra.cfg + + + See Also + + + zebraidx + 1 + , + + yaz-ztest + 8 + + + + The Zebra software is Copyright Index Data + http://www.indexdata.dk + and distributed under the + GPLv2 license. + + + + + Z39.50 Protocol Support and Behavior @@ -533,17 +536,339 @@ - + + + + The SRU/SRW Server + + In addition to Z39.50, Zebra supports the more recent and + web-friendly IR protocol SRU, described at + . + SRU is ``Search/Retrieve via URL'', a simple, REST-like protocol + that uses HTTP GET to request search responses. The request + itself is made of parameters such as + query, + startRecord, + maximumRecords + and + recordSchema; + the response is an XML document containing hit-count, result-set + records, diagnostics, etc. SRU can be thought of as a re-casting + of Z39.50 semantics in web-friendly terms; or as a standardisation + of the ad-hoc query parameters used by search engines such as Google + and AltaVista; or as a superset of A9's OpenSearch (which it + predates). + + + Zebra further supports SRW, described at + . + SRW is the ``Search/Retrieve Web Service'', a SOAP-based alternative + implementation of the abstract protocol that SRU implements as HTTP + GET requests. In SRW, requests are encoded as XML documents which + are posted to the server. The responses are identical to those + returned by SRU servers, except that they are wrapped in a several + layers of SOAP envelope. + + + Zebra supports all three protocols - Z39.50, SRU and SRW - on the + same port, recognising what protocol is used by each incoming + requests and handling them accordingly. This is a achieved through + the use of Deep Magic; civilians are warned not to stand too close. + + + From here on, ``SRU'' is used to indicate both the SRU and SRW + protocols, as they are identical except for the transport used for + the protocol packets and Zebra's support for them is equivalent. + + + + Running the SRU Server (zebrasrv) + + Because Zebra supports all three protocols on one port, it would + seem to follow that the SRU server is run in the same way as + the Z39.50 server, as described above. This is true, but only in + an uninterestingly vacuous way: a Zebra server run in this manner + will indeed recognise and accept SRU requests; but since it + doesn't know how to handle the CQL queries that these protocols + use, all it can do is send failure responses. + + + + It is possible to cheat, by having SRU search Zebra with + a PQF query instead of CQL, using the + x-pquery + parameter instead of + query. + This is a + non-standard extension + of CQL, and a + very naughty + thing to do, but it does give you a way to see Zebra serving SRU + ``right out of the box''. If you start your favourite Zebra + server in the usual way, on port 9999, then you can send your web + browser to: + + + http://localhost:9999/Default?version=1.1 + &operation=searchRetrieve + &x-pquery=mineral + &startRecord=1 + &maximumRecords=1 + + + This will display the XML-formatted SRU response that includes the + first record in the result-set found by the query + mineral. (For clarity, the SRU URL is shown + here broken across lines, but the lines should be joined to gether + to make single-line URL for the browser to submit.) + + + + In order to turn on Zebra's support for CQL queries, it's necessary + to have the YAZ generic front-end (which Zebra uses) translate them + into the Z39.50 Type-1 query format that is used internally. And + to do this, the generic front-end's own configuration file must be + used. This file is described + elsewhere; + the salient point for SRU support is that + zebrasrv + must be started with the + -f frontendConfigFile + option rather than the + -c zebraConfigFile + option, + and that the front-end configuration file must include both a + reference to the Zebra configuration file and the CQL-to-PQF + translator configuration file. + + + A minimal front-end configuration file that does this would read as + follows: + + + + zebra.cfg + ../../tab/pqf.properties + + +]]> + + The + <config> + element contains the name of the Zebra configuration file that was + previously specified by the + -c + command-line argument, and the + <cql2rpn> + element contains the name of the CQL properties file specifying how + various CQL indexes, relations, etc. are translated into Type-1 + queries. + + + A zebra server running with such a configuration can then be + queried using proper, conformant SRU URLs with CQL queries: + + + http://localhost:9999/Default?version=1.1 + &operation=searchRetrieve + &query=title=utah and description=epicent* + &startRecord=1 + &maximumRecords=1 + + + + + SRU and SRW Protocol Support and Behavior + + Zebra running as an SRU server supports SRU version 1.1, including + CQL version 1.1. In particular, it provides support for the + following elements of the protocol. + + + + Search and Retrieval + + Zebra fully supports SRU's core + searchRetrieve + operation, as described at + + + + One of the great strengths of SRU is that it mandates a standard + query language, CQL, and that all conforming implementations can + therefore be trusted to correctly interpret the same queries. It + is with some shame, then, that we admit that Zebra also supports + an additional query language, our own Prefix Query Format (PQF, + ). + A PQF query is submitted by using the extension parameter + x-pquery, + in which case the + query + parameter must be omitted, which makes the request not valid SRU. + Please don't do this. + + + + + Scan + + Zebra does not support SRU's + scan + operation, as described at + + + + This is a rather embarrassing surprise as the pieces are all + there: Z39.50 scan is supported, and SRU scan requests are + recognised and diagnosed. To add further to the embarrassment, a + mutant form of SRU scan is supported, using + the non-standard x-pScanClause parameter in + place of the standard scanClause to scan on a + PQF query clause. + + + + + Explain + + Zebra fully supports SRU's core + explain + operation, as described at + + + + The ZeeRex record explaining a database may be requested either + with a fully fledged SRU request (with + operation=explain + and version-number specified) + or with a simple HTTP GET at the server's basename. + The ZeeRex record returned in response is the one embedded + in the YAZ Frontend Server configuration file that is described in the + Virtual Hosts documentation. + + + Unfortunately, the data found in the + CQL-to-PQF text file must be added by hand-craft into the explain + section of the YAZ Frontend Server configuration file to be able + to provide a suitable explain record. + Too bad, but this is all extreme + new alpha stuff, and a lot of work has yet to be done .. + + + There is no linkeage whatsoever between the Z39.50 explain model + and the SRU/SRW explain response (well, at least not implemented + in Zebra, that is ..). Zebra does not provide a means using + Z39.50 to obtain the ZeeRex record. + + + + + Some SRU Examples + + Surf into http://localhost:9999 + to get an explain response, or use + + + + See number of hits for a query + + + + Fetch record 5-7 in Dublin Core format + + + + Even search using PQF queries using the extended naughty + verb x-pquery + + + + Or scan indexes using the extended extremely naughty + verb x-pScanClause + + Don't do this in production code! + But it's a great fast debugging aid. + + + + + Initialization, Present, Sort, Close + + In the Z39.50 protocol, Initialization, Present, Sort and Close + are separate operations. In SRU, however, these operations do not + exist. + + + + + SRU has no explicit initialization handshake phase, but + commences immediately with searching, scanning and explain + operations. + + + + + Neither does SRU have a close operation, since the protocol is + stateless and each request is self-contained. (It is true that + multiple SRU request/response pairs may be implemented as + multiple HTTP request/response pairs over a single persistent + TCP/IP connection; but the closure of that connection is not a + protocol-level operation.) + + + + + Retrieval in SRU is part of the + searchRetrieve operation, in which a search + is submitted and the response includes a subset of the records + in the result set. There is no direct analogue of Z39.50's + Present operation which requests records from an established + result set. In SRU, this is achieved by sending a subsequent + searchRetrieve request with the query + cql.resultSetId=id where + id is the identifier of the previously + generated result-set. + + + + + Sorting in CQL is done within the + searchRetrieve operation - in v1.1, by an + explicit sort parameter, but the forthcoming + v1.2 or v2.0 will most likely use an extension of the query + language, CQL for sorting: see + + + + + + It can be seen, then, that while Zebra operating as an SRU server + does not provide the same set of operations as when operating as a + Z39.50 server, it does provide equivalent functionality. + + + + + + +