X-Git-Url: http://git.indexdata.com/?p=idzebra-moved-to-github.git;a=blobdiff_plain;f=doc%2Fzebrasrv.xml;fp=doc%2Fzebrasrv.xml;h=20d8c3e6a5fd9893400bd9fecf056de38711eab7;hp=3e7a1542d15926d6aca608722598e9663647822c;hb=972bceaa6386f904bc3e4845f1c5598656c5c6f2;hpb=bb39ca3dd76e6339f66813bca1e64b644760e5a2 diff --git a/doc/zebrasrv.xml b/doc/zebrasrv.xml index 3e7a154..20d8c3e 100644 --- a/doc/zebrasrv.xml +++ b/doc/zebrasrv.xml @@ -1,4 +1,4 @@ - @@ -20,7 +20,7 @@ 8 Commands - + zebrasrv Zebra Server @@ -29,17 +29,17 @@ &zebrasrv-synopsis; - DESCRIPTION + DESCRIPTION Zebra is a high-performance, general-purpose structured text indexing and retrieval engine. It reads structured records in a variety of input formats (e.g. email, &acro.xml;, &acro.marc;) and allows access to them through exact - boolean search expressions and relevance-ranked free-text queries. + boolean search expressions and relevance-ranked free-text queries. zebrasrv is the &acro.z3950; and &acro.sru; frontend server for the Zebra search engine and indexer. - - + + On Unix you can run the zebrasrv server from the command line - and put it in the background. It may also operate under the inet daemon. @@ -49,17 +49,17 @@ OPTIONS - - + + The options for zebrasrv are the same as those for &yaz;' yaz-ztest. Option -c specifies a Zebra configuration file - if omitted zebra.cfg is read. - + &zebrasrv-options; - + &acro.z3950; Protocol Support and Behavior @@ -78,7 +78,7 @@ &acro.z3950; Search - + The supported query type are 1 and 101. All operators are currently supported with the restriction that only proximity units of type "word" @@ -88,14 +88,14 @@ without limitations. Searches may span multiple databases. - + The server has full support for piggy-backed retrieval (see also the following section). - - + + &acro.z3950; Present @@ -123,7 +123,7 @@ Of these Zebra supports the attribute specification type in which case the use attribute specifies the "Sort register". 
Sort registers are created for those fields that are of type "sort" in - the default.idx file. + the default.idx file. The corresponding character mapping file in default.idx specifies the ordinal of each character used in the actual sort. @@ -149,25 +149,25 @@ timeout. - - - &acro.z3950; Explain - - Zebra maintains a "classic" + + + &acro.z3950; Explain + + Zebra maintains a "classic" &acro.z3950; Explain database - on the side. + on the side. This database is called IR-Explain-1 and can be searched using the attribute set exp-1. - The records in the explain database are of type + The records in the explain database are of type grs.sgml. - The root element for the Explain grs.sgml records is - explain, thus + The root element for the Explain grs.sgml records is + explain, thus explain.abs is used for indexing. - + Zebra must be able to locate explain.abs in order to index the Explain records properly. Zebra will work without it but the information @@ -181,20 +181,20 @@ In addition to &acro.z3950;, Zebra supports the more recent and web-friendly IR protocol &acro.sru;. - &acro.sru; can be carried over &acro.soap; or a &acro.rest;-like protocol - that uses HTTP &acro.get; or &acro.post; to request search responses. The request - itself is made of parameters such as - query, - startRecord, - maximumRecords - and - recordSchema; - the response is an &acro.xml; document containing hit-count, result-set - records, diagnostics, etc. &acro.sru; can be thought of as a re-casting - of &acro.z3950; semantics in web-friendly terms; or as a standardisation - of the ad-hoc query parameters used by search engines such as Google - and AltaVista; or as a superset of A9's OpenSearch (which it - predates). + &acro.sru; can be carried over &acro.soap; or a &acro.rest;-like protocol + that uses HTTP &acro.get; or &acro.post; to request search responses. 
The request + itself is made of parameters such as + query, + startRecord, + maximumRecords + and + recordSchema; + the response is an &acro.xml; document containing hit-count, result-set + records, diagnostics, etc. &acro.sru; can be thought of as a re-casting + of &acro.z3950; semantics in web-friendly terms; or as a standardisation + of the ad-hoc query parameters used by search engines such as Google + and AltaVista; or as a superset of A9's OpenSearch (which it + predates). Zebra supports &acro.z3950;, &acro.sru; &acro.get;, SRU &acro.post;, SRU &acro.soap; (&acro.srw;) @@ -202,96 +202,96 @@ requests and handling them accordingly. This is a achieved through the use of Deep Magic; civilians are warned not to stand too close. - - Running zebrasrv as an &acro.sru; Server - - Because Zebra supports all protocols on one port, it would - seem to follow that the &acro.sru; server is run in the same way as - the &acro.z3950; server, as described above. This is true, but only in - an uninterestingly vacuous way: a Zebra server run in this manner - will indeed recognise and accept &acro.sru; requests; but since it - doesn't know how to handle the &acro.cql; queries that these protocols - use, all it can do is send failure responses. - - + + Running zebrasrv as an &acro.sru; Server - It is possible to cheat, by having &acro.sru; search Zebra with - a &acro.pqf; query instead of &acro.cql;, using the - x-pquery - parameter instead of - query. - This is a - non-standard extension - of &acro.cql;, and a - very naughty - thing to do, but it does give you a way to see Zebra serving &acro.sru; - ``right out of the box''. If you start your favourite Zebra - server in the usual way, on port 9999, then you can send your web - browser to: - - - http://localhost:9999/Default?version=1.1 + Because Zebra supports all protocols on one port, it would + seem to follow that the &acro.sru; server is run in the same way as + the &acro.z3950; server, as described above. 
This is true, but only in + an uninterestingly vacuous way: a Zebra server run in this manner + will indeed recognise and accept &acro.sru; requests; but since it + doesn't know how to handle the &acro.cql; queries that these protocols + use, all it can do is send failure responses. + + + + It is possible to cheat, by having &acro.sru; search Zebra with + a &acro.pqf; query instead of &acro.cql;, using the + x-pquery + parameter instead of + query. + This is a + non-standard extension + of &acro.cql;, and a + very naughty + thing to do, but it does give you a way to see Zebra serving &acro.sru; + ``right out of the box''. If you start your favourite Zebra + server in the usual way, on port 9999, then you can send your web + browser to: + + + http://localhost:9999/Default?version=1.1 &operation=searchRetrieve &x-pquery=mineral &startRecord=1 &maximumRecords=1 - + + + This will display the &acro.xml;-formatted &acro.sru; response that includes the + first record in the result-set found by the query + mineral. (For clarity, the &acro.sru; URL is shown + here broken across lines, but the lines should be joined together + to make single-line URL for the browser to submit.) + + - This will display the &acro.xml;-formatted &acro.sru; response that includes the - first record in the result-set found by the query - mineral. (For clarity, the &acro.sru; URL is shown - here broken across lines, but the lines should be joined together - to make single-line URL for the browser to submit.) + In order to turn on Zebra's support for &acro.cql; queries, it's necessary + to have the &yaz; generic front-end (which Zebra uses) translate them + into the &acro.z3950; Type-1 query format that is used internally. And + to do this, the generic front-end's own configuration file must be + used. 
See ; + the salient point for &acro.sru; support is that + zebrasrv + must be started with the + -f frontendConfigFile + option rather than the + -c zebraConfigFile + option, + and that the front-end configuration file must include both a + reference to the Zebra configuration file and the &acro.cql;-to-&acro.pqf; + translator configuration file. - - - In order to turn on Zebra's support for &acro.cql; queries, it's necessary - to have the &yaz; generic front-end (which Zebra uses) translate them - into the &acro.z3950; Type-1 query format that is used internally. And - to do this, the generic front-end's own configuration file must be - used. See ; - the salient point for &acro.sru; support is that - zebrasrv - must be started with the - -f frontendConfigFile - option rather than the - -c zebraConfigFile - option, - and that the front-end configuration file must include both a - reference to the Zebra configuration file and the &acro.cql;-to-&acro.pqf; - translator configuration file. - - - A minimal front-end configuration file that does this would read as - follows: - - - - - zebra.cfg - ../../tab/pqf.properties - - -]]> - - The - <config> - element contains the name of the Zebra configuration file that was - previously specified by the - -c - command-line argument, and the - <cql2rpn> - element contains the name of the &acro.cql; properties file specifying how - various &acro.cql; indexes, relations, etc. are translated into Type-1 - queries. 
- - - A zebra server running with such a configuration can then be - queried using proper, conformant &acro.sru; URLs with &acro.cql; queries: - - - http://localhost:9999/Default?version=1.1 + + A minimal front-end configuration file that does this would read as + follows: + + + + + zebra.cfg + ../../tab/pqf.properties + + + ]]> + + The + <config> + element contains the name of the Zebra configuration file that was + previously specified by the + -c + command-line argument, and the + <cql2rpn> + element contains the name of the &acro.cql; properties file specifying how + various &acro.cql; indexes, relations, etc. are translated into Type-1 + queries. + + + A zebra server running with such a configuration can then be + queried using proper, conformant &acro.sru; URLs with &acro.cql; queries: + + + http://localhost:9999/Default?version=1.1 &operation=searchRetrieve &query=title=utah and description=epicent* &startRecord=1 @@ -306,11 +306,11 @@ &acro.cql; version 1.1. In particular, it provides support for the following elements of the protocol. - + &acro.sru; Search and Retrieval - Zebra supports the + Zebra supports the &acro.sru; searchRetrieve operation. @@ -319,7 +319,7 @@ query language, &acro.cql;, and that all conforming implementations can therefore be trusted to correctly interpret the same queries. It is with some shame, then, that we admit that Zebra also supports - an additional query language, our own Prefix Query Format + an additional query language, our own Prefix Query Format (&acro.pqf;). A &acro.pqf; query is submitted by using the extension parameter x-pquery, @@ -332,7 +332,7 @@ query parameter. - + &acro.sru; Scan @@ -365,11 +365,11 @@ in the &yaz; Frontend Server configuration file that is described in the . 
- - Unfortunately, the data found in the + + Unfortunately, the data found in the &acro.cql;-to-&acro.pqf; text file must be added by hand-craft into the explain section of the &yaz; Frontend Server configuration file to be able - to provide a suitable explain record. + to provide a suitable explain record. Too bad, but this is all extreme new alpha stuff, and a lot of work has yet to be done .. @@ -415,7 +415,7 @@ Present operation which requests records from an established result set. In &acro.sru;, this is achieved by sending a subsequent searchRetrieve request with the query - cql.resultSetId=id where + cql.resultSetId=id where id is the identifier of the previously generated result-set. @@ -437,49 +437,49 @@ - + - &acro.sru; Examples - - Surf into http://localhost:9999 - to get an explain response, or use - - - - See number of hits for a query - - - - Fetch record 5-7 in Dublin Core format - - - - Even search using &acro.pqf; queries using the extended naughty - parameter x-pquery - - - - Or scan indexes using the extended extremely naughty - parameter x-pScanClause - - Don't do this in production code! - But it's a great fast debugging aid. - + &acro.sru; Examples + + Surf into http://localhost:9999 + to get an explain response, or use + + + + See number of hits for a query + + + + Fetch record 5-7 in Dublin Core format + + + + Even search using &acro.pqf; queries using the extended naughty + parameter x-pquery + + + + Or scan indexes using the extended extremely naughty + parameter x-pScanClause + + Don't do this in production code! + But it's a great fast debugging aid. + @@ -494,7 +494,7 @@ 1 - +
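The &acro.sru; request URLs quoted in the documentation above are shown broken across lines and must be joined into a single line before being submitted. As a rough illustration only, the following Python sketch assembles such a URL from the parameters named in the text (version, operation, query or x-pquery, startRecord, maximumRecords, recordSchema). The helper name `sru_search_url` and its keyword arguments are hypothetical and not part of Zebra or &yaz;; the host, port and database name are taken from the examples above and assume a server started locally on port 9999.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def sru_search_url(base, query, *, pqf=False, start_record=1,
                   maximum_records=1, record_schema=None, version="1.1"):
    """Build a single-line SRU searchRetrieve URL (hypothetical helper).

    With pqf=False the query is sent as CQL in the standard `query`
    parameter; with pqf=True it is sent as PQF in the non-standard
    `x-pquery` extension parameter described in the text.
    """
    params = {
        "version": version,
        "operation": "searchRetrieve",
        ("x-pquery" if pqf else "query"): query,
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    if record_schema is not None:
        params["recordSchema"] = record_schema
    # urlencode percent-escapes spaces, '=' and '*' inside the query value,
    # so the result is always one line that a browser or HTTP client accepts.
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    base = "http://localhost:9999/Default"  # assumed local server and database
    # CQL query: needs the front-end configuration with a cql2rpn element.
    print(sru_search_url(base, "title=utah and description=epicent*"))
    # PQF query: works against a server started without CQL support.
    print(sru_search_url(base, "mineral", pqf=True))
    # Decode the query string again to inspect what would be sent.
    print(parse_qs(urlsplit(sru_search_url(base, "mineral", pqf=True)).query))
```

The same helper could be pointed at a previously generated result set by passing a query of the form cql.resultSetId=id, as described in the section on retrieving records from an established result set; whether the server still holds that result set is outside what this sketch checks.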