X-Git-Url: http://git.indexdata.com/?a=blobdiff_plain;f=doc%2Fserver.xml;h=dd0a9e9ac244681d659a91a13c21dd0b553af0b9;hb=558bf94a5f36eb89b0ca7ac4780b641da852c36b;hp=83a5e3c626fd17b216e8fedb7f4027eb0184274b;hpb=b367b4068098c28aa284c447edf42ba2c8d7c42b;p=idzebra-moved-to-github.git diff --git a/doc/server.xml b/doc/server.xml index 83a5e3c..dd0a9e9 100644 --- a/doc/server.xml +++ b/doc/server.xml @@ -1,5 +1,5 @@ - + The Z39.50 Server @@ -16,7 +16,7 @@ zebrasrv manpage --> - DESCRIPTION + Description Zebra is a high-performance, general-purpose structured text indexing and retrieval engine. It reads structured records in a variety of input formats (eg. email, XML, MARC) and allows access to them through exact @@ -36,12 +36,12 @@ - SYNOPSIS + Synopsis &zebrasrv-synopsis; - OPTIONS + Options The options for zebrasrv are the same @@ -52,19 +52,13 @@ &zebrasrv-options; - VIRTUAL HOSTS - - zebrasrv uses the YAZ server frontend and does - support multiple virtual servers behind multiple listening sockets. - - &zebrasrv-virtual; - - FILES + + Files zebra.cfg - SEE ALSO + See Also zebraidx @@ -76,18 +70,6 @@ - Section "The Z39.50 Server" in the Zebra manual. - http://www.indexdata.dk/zebra/doc/server.tkl - - - Section "Virtual Hosts" in the YAZ manual. - http://www.indexdata.dk/yaz/doc/server.vhosts.tkl - - - Section "Specification of CQL to RPN mappings" in the YAZ manual. - http://www.indexdata.dk/yaz/doc/tools.tkl#tools.cql.map - - The Zebra software is Copyright Index Data http://www.indexdata.dk and distributed under the @@ -260,245 +242,8 @@ also the following section). - - Use attributes are interpreted according to the - attribute sets which have been loaded in the - zebra.cfg file, and are matched against specific - fields as specified in the .abs file which - describes the profile of the records which have been loaded. - If no Use attribute is provided, a default of Bib-1 Any is assumed. 
- - - - If a Structure attribute of - Phrase is used in conjunction with a - Completeness attribute of - Complete (Sub)field, the term is matched - against the contents of the phrase (long word) register, if one - exists for the given Use attribute. - A phrase register is created for those fields in the - .abs file that contains a - p-specifier. - - - - - If Structure=Phrase is - used in conjunction with Incomplete Field - the - default value for Completeness, the - search is directed against the normal word registers, but if the term - contains multiple words, the term will only match if all of the words - are found immediately adjacent, and in the given order. - The word search is performed on those fields that are indexed as - type w in the .abs file. - - - - If the Structure attribute is - Word List, - Free-form Text, or - Document Text, the term is treated as a - natural-language, relevance-ranked query. - This search type uses the word register, i.e. those fields - that are indexed as type w in the - .abs file. - - - - If the Structure attribute is - Numeric String the term is treated as an integer. - The search is performed on those fields that are indexed - as type n in the .abs file. - - - - If the Structure attribute is - URx the term is treated as a URX (URL) entity. - The search is performed on those fields that are indexed as type - u in the .abs file. - - - - If the Structure attribute is - Local Number the term is treated as - native Zebra Record Identifier. - - - - If the Relation attribute is - Equals (default), the term is matched - in a normal fashion (modulo truncation and processing of - individual words, if required). - If Relation is Less Than, - Less Than or Equal, - Greater than, or Greater than or - Equal, the term is assumed to be numerical, and a - standard regular expression is constructed to match the given - expression. - If Relation is Relevance, - the standard natural-language query processor is invoked. 
- - - - For the Truncation attribute, - No Truncation is the default. - Left Truncation is not supported. - Process # in search term is supported, as is - Regxp-1. - Regxp-2 enables the fault-tolerant (fuzzy) - search. As a default, a single error (deletion, insertion, - replacement) is accepted when terms are matched against the register - contents. - - - - Regular expressions - - - Each term in a query is interpreted as a regular expression if - the truncation value is either Regxp-1 (102) - or Regxp-2 (103). - Both query types follow the same syntax with the operands: - - - - x - - - Matches the character x. - - - - - . - - - Matches any character. - - - - - [..] - - - Matches the set of characters specified; - such as [abc] or [a-c]. - - - - - and the operators: - - - - x* - - - Matches x zero or more times. Priority: high. - - - - - x+ - - - Matches x one or more times. Priority: high. - - - - - x? - - - Matches x zero or once. Priority: high. - - - - - xy - - - Matches x, then y. - Priority: medium. - - - - - x|y - - - Matches either x or y. - Priority: low. - - - - - The order of evaluation may be changed by using parentheses. - - - - If the first character of the Regxp-2 query - is a plus character (+) it marks the - beginning of a section with non-standard specifiers. - The next plus character marks the end of the section. - Currently Zebra only supports one specifier, the error tolerance, - which consists one digit. - - - - Since the plus operator is normally a suffix operator the addition to - the query syntax doesn't violate the syntax for standard regular - expressions. 
- - - - - - Query examples - - - Phrase search for information retrieval in - the title-register: - - @attr 1=4 "information retrieval" - - - - - Ranked search for the same thing: - - @attr 1=4 @attr 2=102 "Information retrieval" - - - - - Phrase search with a regular expression: - - @attr 1=4 @attr 5=102 "informat.* retrieval" - - - - - Ranked search with a regular expression: - - @attr 1=4 @attr 5=102 @attr 2=102 "informat.* retrieval" - - - - - In the GILS schema (gils.abs), the - west-bounding-coordinate is indexed as type n, - and is therefore searched by specifying - structure=Numeric String. - To match all those records with west-bounding-coordinate greater - than -114 we use the following query: - - @attr 4=109 @attr 2=5 @attr gils 1=2038 -114 - - - - - + + Present @@ -552,6 +297,32 @@ timeout. + + + Explain + + Zebra maintains a "classic" + Explain database + on the side. + This database is called IR-Explain-1 and can be + searched using the attribute set exp-1. + + + The records in the explain database are of type + grs.sgml. + The root element for the Explain grs.sgml records is + explain, thus + explain.abs is used for indexing. + + + + Zebra must be able to locate + explain.abs in order to index the Explain + records properly. Zebra will work without it but the information + will not be searchable. + + + @@ -627,11 +398,11 @@ browser to: - http://localhost:9999/Default?version=1.1& - operation=searchRetrieve& - x-pquery=mineral& - startRecord=1& - maximumRecords=1 + http://localhost:9999/Default?version=1.1 + &operation=searchRetrieve + &x-pquery=mineral + &startRecord=1 + &maximumRecords=1 This will display the XML-formatted SRU response that includes the @@ -683,6 +454,17 @@ various CQL indexes, relations, etc. are translated into Type-1 queries. 
+ + A Zebra server running with such a configuration can then be + queried using proper, conformant SRU URLs with CQL queries: + + + http://localhost:9999/Default?version=1.1 + &operation=searchRetrieve + &query=title=utah and description=epicent* + &startRecord=1 + &maximumRecords=1 + @@ -708,24 +490,110 @@ is with some shame, then, that we admit that Zebra also supports an additional query language, our own Prefix Query Format (PQF, ). -x-pquery - - + A PQF query is submitted by using the extension parameter + x-pquery, + in which case the + query + parameter must be omitted, which makes the request not valid SRU. + Please don't do this. Scan - ### + Zebra does not support SRU's + scan + operation, as described at + + + + This is a rather embarrassing surprise as the pieces are all + there: Z39.50 scan is supported, and SRU scan requests are + recognised and diagnosed. To add further to the embarrassment, a + mutant form of SRU scan is supported, using + the non-standard x-pScanClause parameter in + place of the standard scanClause to scan on a + PQF query clause. Explain - ### + Zebra fully supports SRU's core + explain + operation, as described at + + + The ZeeRex record explaining a database may be requested either + with a fully fledged SRU request (with + operation=explain + and version-number specified) + or with a simple HTTP GET at the server's basename. + The ZeeRex record returned in response is the one embedded + in the YAZ Frontend Server configuration file that is described in the + Virtual Hosts documentation. + + + Unfortunately, the data found in the + CQL-to-PQF text file must be added by hand to the explain + section of the YAZ Frontend Server configuration file to be able + to provide a suitable explain record. + Too bad, but this is all extremely + new alpha stuff, and a lot of work has yet to be done .. 
+ + + There is no linkage whatsoever between the Z39.50 explain model + and the SRU/SRW explain response (well, at least not implemented + in Zebra, that is ..). Zebra does not provide a means using + Z39.50 to obtain the ZeeRex record. + + + + + Some SRU Examples + + Surf into http://localhost:9999 + to get an explain response, or use + + + + See number of hits for a query + + + + Fetch records 5-7 in Dublin Core format + + + + Even search using PQF queries using the extended naughty + verb x-pquery + + + + Or scan indexes using the extended extremely naughty + verb x-pScanClause + + Don't do this in production code! + But it's a great fast debugging aid. +
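The request URLs discussed in the SRU sections can also be assembled programmatically, which takes care of the percent-encoding that a hand-written CQL query needs. The following is only a minimal sketch: it assumes a server at `http://localhost:9999` with a database named `Default`, as in the browser example, and `sru_url` is a hypothetical helper name, not part of Zebra or YAZ.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: host, port and database name follow the browser example.
BASE = "http://localhost:9999/Default"


def sru_url(base, operation, params=None):
    """Build an SRU 1.1 GET URL (hypothetical helper, not a Zebra/YAZ API)."""
    query = {"version": "1.1", "operation": operation}
    query.update(params or {})
    # urlencode percent-encodes spaces, '=' and '*' inside parameter values.
    return base + "?" + urlencode(query)


# Conformant SRU request carrying a CQL query.
cql = sru_url(BASE, "searchRetrieve", {
    "query": "title=utah and description=epicent*",
    "startRecord": 1,
    "maximumRecords": 1,
})

# Non-standard request: x-pquery replaces query, so this is not valid SRU.
pqf = sru_url(BASE, "searchRetrieve", {
    "x-pquery": "mineral",
    "startRecord": 1,
    "maximumRecords": 1,
})

# Explain request with operation and version given explicitly.
explain = sru_url(BASE, "explain")

for url in (cql, pqf, explain):
    # Decode the query string again to show what the server would receive.
    print(url)
    print("  ", parse_qs(urlsplit(url).query))
```

Keeping the PQF variant in a separate call makes it obvious which requests depend on the `x-pquery` extension and are therefore best kept to debugging sessions.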