X-Git-Url: http://git.indexdata.com/?a=blobdiff_plain;f=doc%2Fquerymodel.xml;h=831eeff71e5e7e94013c342a9a0a897c37183af8;hb=930d9dc74a2bee5037ee1372dd5de4a6963ecd6b;hp=ff0107b15ac4df5c6f1cebaa2a4b36874aaa6126;hpb=d2a2cd0f1bb0a5f31bfc59039012efae30d8955e;p=idzebra-moved-to-github.git

diff --git a/doc/querymodel.xml b/doc/querymodel.xml
index ff0107b..831eeff 100644
--- a/doc/querymodel.xml
+++ b/doc/querymodel.xml
@@ -1,5 +1,5 @@
- + Query Model
@@ -210,7 +210,7 @@
bib-1 Standard PQF query language attribute set which defines the semantics of Z39.50 searching. In addition, all of the - non-use attributes (type 2-9) define the hard-wired + non-use attributes (types 2-11) define the hard-wired Zebra internal query processing. default
@@ -227,7 +227,7 @@
idxpath Hardwired XPATH like attribute set, only available for indexing with the GRS record model - depreciated + deprecated -->
@@ -350,7 +350,7 @@
Atomic (APT) queries are always leaf nodes in the PQF query tree. - UN-supplied non-use attributes type 2-9 are either inherited from + UN-supplied non-use attributes types 2-11 are either inherited from higher nodes in the query tree, or are set to Zebra's default values. See for details.
@@ -607,7 +607,8 @@
Escaping PQF keywords and other non-parseable XPath constructs - with '{ }' to prevent syntax errors: + with '{ }' to prevent client-side PQF parsing + syntax errors: Z> find @attr {1=/root/first[@attr='danish']} content Z> find @attr {1=/record/@set} oai
@@ -785,13 +786,35 @@
tab/dan1.att, tab/explain.att, and tab/gils.att. + + + For example, some few Bib-1 use + attributes from the tab/bib1.att are: + + att 1 Personal-name + att 2 Corporate-name + att 3 Conference-name + att 4 Title + ... + att 1009 Subject-name-personal + att 1010 Body-of-text + att 1011 Date/time-added-to-db + ... + att 1016 Any + att 1017 Server-choice + att 1018 Publisher + ...
+ att 1035 Anywhere + att 1036 Author-Title-Subject + + + New attribute sets can be added by adding new tab/*.att configuration files, which need to - be sourced in the main configuration zebra.cfg. + be sourced in the main configuration zebra.cfg. - - In addition, Zebra allows the access of + In addition, Zebra allows the access of internal index names and dynamic XPath as use attributes; see and
@@ -994,7 +1017,7 @@
Any position in field 3 - default + supported
@@ -1002,9 +1025,9 @@
The position attribute values first in field (1), and first in subfield(2) are unsupported. - Using them does not trigger an error, but silently defaults to - any position in field (3). - + Using them silently maps to + any position in field (3). A proper diagnostic + should have been issued.
@@ -1349,7 +1372,7 @@
Complete subfield 2 - depreciated + deprecated Complete field
@@ -1535,9 +1558,21 @@
+ + + + + Zebra Extension Rank Weight Attribute (type 9)
@@ -1575,31 +1613,46 @@
- Zebra Extension Approximative Limit Attribute (type 9) + Zebra Extension Approximative Limit Attribute (type 11) - Newer Zebra versions normally estimate hit count for every APT + Zebra computes - unless otherwise configured - + the exact hit count for every APT (leaf) in the query tree. These hit counts are returned as part of the searchResult-1 facility in the binary encoded Z39.50 search response packages. - By setting a limit for the APT we can make Zebra turn into - approximate hit count when a certain hit count limit is - reached. A value of zero means exact hit count. + By setting an estimation limit size of the resultset of the APT + leaves, Zebra stoppes processing the result set when the limit + length is reached. + Hit counts under this limit are still precise, but hit counts over it + are estimated using the statistics gathered from the chopped + result set. + + + Specifying a limit of 0 resuts in exact hit counts.
For example, we might be interested in exact hit count for a, but for b we allow hit count estimates for 1000 and higher. - Z> find @and a @attr 9=1000 b + Z> find @and a @attr 11=1000 b The estimated hit count facility makes searches faster, as one only needs to process large hit lists partially. + It is mostly used in huge databases, where you you want trade + exactness of hit counts against speed of execution. + Do not use approximative hit count limits + in conjunction with relevance ranking, as re-sorting of the + result set obviosly only works when the entire result set has + been processed. + + This facility clashes with rank weight, because there all documents in the hit lists need to be examined for scoring and re-sorting.
@@ -1708,11 +1761,11 @@
- Zebra Extension Approximative Limit (type 9) + Zebra Extension Approximative Limit (type 11) The Zebra Extension Approximative Limit (type - 9) is a way to enable approximate + 11) is a way to enable approximate hit counts for scan hit counts, in the same way as for search hit counts.
@@ -1745,7 +1798,7 @@
main Zebra configuration file zebra.cfg directive attset: idxpath.att must be enabled. - The idxpath is depreciated, may not be + The idxpath is deprecated, may not be supported in future Zebra versions, and should definitely not be used in production code.
@@ -1778,31 +1831,31 @@
XPATH Begin 1 _XPATH_BEGIN - depreciated + deprecated XPATH End 2 _XPATH_END - depreciated + deprecated XPATH CData 1016 _XPATH_CDATA - depreciated + deprecated XPATH Attribute Name 3 _XPATH_ATTR_NAME - depreciated + deprecated XPATH Attribute CData 1015 _XPATH_ATTR_CDATA - depreciated + deprecated