X-Git-Url: http://git.indexdata.com/?a=blobdiff_plain;f=doc%2Fquerymodel.xml;h=b3df12e3da4e077b27ca229810020a11ddae71b2;hb=b19b79e382ef8196f1625763db1af3a82b1e0c81;hp=afdb4078ceae66d6eb96dc5f100c4e03a8c6dfb9;hpb=5ca4e60e990af6ad6b62ebff855d7b642f37c3ec;p=idzebra-moved-to-github.git diff --git a/doc/querymodel.xml b/doc/querymodel.xml index afdb407..b3df12e 100644 --- a/doc/querymodel.xml +++ b/doc/querymodel.xml @@ -1,5 +1,5 @@ - + Query Model
@@ -11,18 +11,18 @@ &zebra; is born as a networking Information Retrieval engine adhering to the international standards - Z39.50 and - SRU, + &z3950; and + &sru;, and implement the - type-1 Reverse Polish Notation (RPN) query + type-1 Reverse Polish Notation (&rpn;) query model defined there. Unfortunately, this model has only defined a binary encoded representation, which is used as transport packaging in - the Z39.50 protocol layer. This representation is not human + the &z3950; protocol layer. This representation is not human readable, nor defines any convenient way to specify queries. - Since the type-1 (RPN) + Since the type-1 (&rpn;) query structure has no direct, useful string representation, every client application needs to provide some form of mapping from a local query notation or representation to it. @@ -30,33 +30,33 @@
- Prefix Query Format (PQF) + Prefix Query Format (&pqf;) Index Data has defined a textual representation in the Prefix Query Format, short - PQF, which maps + &pqf;, which maps one-to-one to binary encoded - type-1 RPN queries. - PQF has been adopted by other - parties developing Z39.50 software, and is often referred to as + type-1 &rpn; queries. + &pqf; has been adopted by other + parties developing &z3950; software, and is often referred to as Prefix Query Notation, or in short - PQN. See + &pqn;. See for further explanations and descriptions of &zebra;'s capabilities.
- Common Query Language (CQL) + Common Query Language (&cql;) - The query model of the type-1 RPN, - expressed in PQF/PQN is natively supported. - On the other hand, the default SRU + The query model of the type-1 &rpn;, + expressed in &pqf;/&pqn; is natively supported. + On the other hand, the default &sru; web services Common Query Language - CQL is not natively supported. + &cql; is not natively supported. - &zebra; can be configured to understand and map CQL to PQF. See + &zebra; can be configured to understand and map &cql; to &pqf;. See .
@@ -67,7 +67,7 @@ Operation types &zebra; supports all of the three different - Z39.50/SRU operations defined in the + &z3950;/&sru; operations defined in the standards: explain, search, and scan. A short description of the functionality and purpose of each is quite in order here. @@ -76,7 +76,7 @@
Explain Operation - The syntax of Z39.50/SRU queries is + The syntax of &z3950;/&sru; queries is well known to any client, but the specific semantics - taking into account a particular servers functionalities and abilities - must be @@ -89,15 +89,15 @@ of the general query model are supported. - The Z39.50 embeds the explain operation + The &z3950; embeds the explain operation by performing a search in the magic IR-Explain-1 database; see . - In SRU, explain is an entirely separate - operation, which returns an ZeeRex XML record according to the + In &sru;, explain is an entirely separate + operation, which returns an ZeeRex &xml; record according to the structure defined by the protocol. @@ -117,7 +117,7 @@ simple free text searches to nested complex boolean queries, targeting specific indexes, and possibly enhanced with many query semantic specifications. Search interactions are the heart - and soul of Z39.50/SRU servers. + and soul of &z3950;/&sru; servers.
@@ -145,24 +145,24 @@
- RPN queries and semantics + &rpn; queries and semantics - The PQF grammar - is documented in the YAZ manual, and shall not be - repeated here. This textual PQF representation + The &pqf; grammar + is documented in the &yaz; manual, and shall not be + repeated here. This textual &pqf; representation is not transmistted to &zebra; during search, but it is in the - client mapped to the equivalent Z39.50 binary + client mapped to the equivalent &z3950; binary query parse tree.
- RPN tree structure + &rpn; tree structure - The RPN parse tree - or the equivalent textual representation in PQF - + The &rpn; parse tree - or the equivalent textual representation in &pqf; - may start with one specification of the attribute set used. Following is a query tree, which - consists of atomic query parts (APT) or + consists of atomic query parts (&apt;) or named result sets, eventually paired by boolean binary operators, and finally recursively combined into @@ -184,7 +184,7 @@ Attribute set - PQF notation (Short hand) + &pqf; notation (Short hand) Status Notes @@ -201,10 +201,10 @@ predefined - Bib-1 + &bib1; bib-1 - Standard PQF query language attribute set which defines the - semantics of Z39.50 searching. In addition, all of the + Standard &pqf; query language attribute set which defines the + semantics of &z3950; searching. In addition, all of the non-use attributes (types 2-12) define the hard-wired &zebra; internal query processing. @@ -213,15 +213,15 @@ GILS gils - Extension to the Bib-1 attribute set. + Extension to the &bib1; attribute set. predefined @@ -238,7 +238,7 @@ The &zebra; internal query processing is modeled after - the Bib-1 attribute set, and the non-use + the &bib1; attribute set, and the non-use attributes type 2-6 are hard-wired in. It is therefore essential to be familiar with . @@ -317,7 +317,7 @@ retrieval, taking proximity into account: The hit set is a subset of the corresponding AND query - (see the PQF grammar for + (see the &pqf; grammar for details on the proximity operator): Z> find @prox 0 3 0 2 k 2 information retrieval @@ -338,23 +338,23 @@
- Atomic queries (APT) + Atomic queries (&apt;) Atomic queries are the query parts which work on one access point only. These consist of an attribute list followed by a single term or a quoted term list, and are often called - Attributes-Plus-Terms (APT) queries. + Attributes-Plus-Terms (&apt;) queries. - Atomic (APT) queries are always leaf nodes in the PQF query tree. + Atomic (&apt;) queries are always leaf nodes in the &pqf; query tree. UN-supplied non-use attributes types 2-12 are either inherited from higher nodes in the query tree, or are set to &zebra;'s default values. See for details. - Atomic queries (APT) + Atomic queries (&apt;) @@ -407,7 +407,7 @@ The scan operation is only supported with - atomic APT queries, as it is bound to one access point at a + atomic &apt; queries, as it is bound to one access point at a time. Boolean query trees are not allowed during scan. @@ -429,8 +429,8 @@ Named result sets are supported in &zebra;, and result sets can be used as operands without limitations. It follows that named - result sets are leaf nodes in the PQF query tree, exactly as - atomic APT queries are. + result sets are leaf nodes in the &pqf; query tree, exactly as + atomic &apt; queries are. After the execution of a search, the result set is available at @@ -460,10 +460,10 @@ - Named result sets are only supported by the Z39.50 protocol. - The SRU web service is stateless, and therefore the notion of + Named result sets are only supported by the &z3950; protocol. + The &sru; web service is stateless, and therefore the notion of named result sets does not exist when accessing a &zebra; server by - the SRU protocol. + the &sru; protocol. @@ -501,7 +501,7 @@ It is possible to search in any silly string index - if it's defined in your - indexation rules and can be parsed by the PQF parser. + indexation rules and can be parsed by the &pqf; parser. This is definitely not the recommended use of this facility, as it might confuse your users with some very unexpected results. @@ -512,14 +512,14 @@ See also for details, and - for the SRU PQF query extension using string names as a fast + for the &sru; &pqf; query extension using string names as a fast debugging facility.
&zebra;'s special access point of type 'XPath' - for GRS filters + for &grs1; filters As we have seen above, it is possible (albeit seldom a great idea) to emulate @@ -531,15 +531,15 @@ be defined at indexation time, no new undefined XPath queries can entered at search time, and second, it might confuse users very much that an XPath-alike index name in fact - gets populated from a possible entirely different XML element + gets populated from a possible entirely different &xml; element than it pretends to access. - When using the GRS Record Model + When using the &grs1; Record Model (see ), we have the possibility to embed life XPath expressions - in the PQF queries, which are here called + in the &pqf; queries, which are here called use (type 1) xpath attributes. You must enable the xpath enable directive in your @@ -549,14 +549,14 @@ Only a very restricted subset of the XPath 1.0 - standard is supported as the GRS record model is simpler than - a full XML DOM structure. See the following examples for + standard is supported as the &grs1; record model is simpler than + a full &xml; &dom; structure. See the following examples for possibilities. Finding all documents which have the term "content" - inside a text node found in a specific XML DOM + inside a text node found in a specific &xml; &dom; subtree, whose starting element is addressed by XPath. @@ -586,7 +586,7 @@ Filter the addressing XPath by a predicate working on exact string values in - attributes (in the XML sense) can be done: return all those docs which + attributes (in the &xml; sense) can be done: return all those docs which have the term "english" contained in one of all text sub nodes of the subtree defined by the XPath /record/title[@lang='en']. And similar @@ -607,8 +607,8 @@ - Escaping PQF keywords and other non-parseable XPath constructs - with '{ }' to prevent client-side PQF parsing + Escaping &pqf; keywords and other non-parseable XPath constructs + with '{ }' to prevent client-side &pqf; parsing syntax errors: Z> find @attr {1=/root/first[@attr='danish']} content @@ -630,7 +630,7 @@
Explain Attribute Set - The Z39.50 standard defines the + The &z3950; standard defines the Explain attribute set Exp-1, which is used to discover information about a server's search semantics and functional capabilities @@ -644,11 +644,11 @@ In addition, the non-Use - Bib-1 attributes, that is, the types + &bib1; attributes, that is, the types Relation, Position, Structure, Truncation, and Completeness are imported from - the Bib-1 attribute set, and may be used + the &bib1; attribute set, and may be used within any explain query. @@ -669,7 +669,7 @@ See tab/explain.att and the - Z39.50 standard + &z3950; standard for more information.
@@ -678,11 +678,11 @@ Explain searches with yaz-client Classic Explain only defines retrieval of Explain information - via ASN.1. Practically no Z39.50 clients supports this. Fortunately + via ASN.1. Practically no &z3950; clients supports this. Fortunately they don't have to - &zebra; allows retrieval of this information in other formats: - SUTRS, XML, - GRS-1 and ASN.1 Explain. + &sutrs;, &xml;, + &grs1; and ASN.1 Explain. @@ -743,7 +743,7 @@ Default. This query is very useful to study the internal &zebra; indexes. If records have been indexed using the alvis - XSLT filter, the string representation names of the known indexes can be + &xslt; filter, the string representation names of the known indexes can be found. Z> base IR-Explain-1 @@ -760,13 +760,13 @@
- Bib-1 Attribute Set + &bib1; Attribute Set Most of the information contained in this section is an excerpt of - the ATTRIBUTE SET BIB-1 (Z39.50-1995) SEMANTICS - found at . The Bib-1 + the ATTRIBUTE SET &bib1; (&z3950;-1995) SEMANTICS + found at . The &bib1; Attribute Set Semantics from 1995, also in an updated - Bib-1 + &bib1; Attribute Set version from 2003. Index Data is not the copyright holder of this information, except for the configuration details, the listing of @@ -788,7 +788,7 @@ tab/gils.att. - For example, some few Bib-1 use + For example, some few &bib1; use attributes from the tab/bib1.att are: att 1 Personal-name @@ -979,7 +979,7 @@ AlwaysMatches (103) is a great way to discover how many documents have been indexed in a given field. The search term is ignored, but needed for correct - PQF syntax. An empty search term may be supplied. + &pqf; syntax. An empty search term may be supplied. Z> find @attr 1=Title @attr 2=103 "" Z> find @attr 1=Title @attr 2=103 @attr 4=1 "" @@ -1159,7 +1159,7 @@ is supported, and maps to the boolean AND combination of words supplied. The word list is useful when google-like bag-of-word queries need to be translated from a GUI - query language to PQF. For example, the following queries + query language to &pqf;. For example, the following queries are equivalent: Z> find @attr 1=Title @attr 4=6 "mozart amadeus" @@ -1213,7 +1213,7 @@ - The exact mapping between PQF queries and &zebra; internal indexes + The exact mapping between &pqf; queries and &zebra; internal indexes and index types is explained in . @@ -1408,14 +1408,14 @@ The Complete subfield (2) is a reminiscens - from the happy MARC + from the happy &marc; binary format days. &zebra; does not support it, but maps silently to Complete field (3). - The exact mapping between PQF queries and &zebra; internal indexes + The exact mapping between &pqf; queries and &zebra; internal indexes and index types is explained in . @@ -1427,7 +1427,7 @@
- Extended &zebra; RPN Features + Extended &zebra; &rpn; Features The &zebra; internal query engine has been extended to specific needs not covered by the bib-1 attribute set query @@ -1478,7 +1478,7 @@