From: Marc Cromme
Date: Thu, 15 Jun 2006 13:41:49 +0000 (+0000)
Subject: additional information on zebra functionalities and attribute lists added
X-Git-Tag: before.bug.529~22
X-Git-Url: http://git.indexdata.com/?p=idzebra-moved-to-github.git;a=commitdiff_plain;h=0002d3ccff37e5598553683e95714ca5711f05e8

additional information on zebra functionalities and attribute lists added
---
diff --git a/doc/querymodel.xml b/doc/querymodel.xml
index f1a7964..bed7a2e 100644
--- a/doc/querymodel.xml
+++ b/doc/querymodel.xml
@@ -1,5 +1,5 @@
-
+
 Query Model
@@ -590,10 +590,33 @@
- Use Attributes (type = 1)
+ Use Attributes (type 1)
+ A use attribute specifies an access point for any atomic query.
+ These access points are highly dependent on the attribute set used
+ in the query, and are user configurable using the following
+ default configuration files:
+ tab/bib1.att,
+ tab/dan1.att,
+ tab/explain.att, and
+ tab/gils.att.
+ New attribute sets can be added by adding new
+ tab/*.att configuration files, which need to
+ be sourced in the main configuration zebra.cfg.
+
+
+
+ In addition, Zebra allows access to
+ internal index names and dynamic
+ XPath as use attributes.
+ See
+ for
+ alternative access to the Zebra internal index names and XPath queries.
+
+
+
 Phrase search for information retrieval in the title-register:
@@ -601,28 +624,96 @@
-
- See also
- for
- alternative access to the Zebra internal index names and XPath queries.
-
-
- Relation Attributes (type = 2)
-
- Supported operations: = (default, if omitted), <, >, <=, >=.
- Unsupported: Not equal.
+ Relation Attributes (type 2)
- The following relation attributes are also supported: relevance (102).
-
+
+ Relation attributes describe the relationship of the access
+ point (left side
+ of the relation) to the search term as qualified by the attributes (right
+ side of the relation), e.g., Date-publication <= 1975.
+
- All operations are based on a lexicographical ordering,
- except in the case of the
- following structure attributes: numeric(109).
-
+ Relation Attributes (type 2)
+ Relation              Value  Notes
+ Less than             1      supported
+ Less than or equal    2      supported
+ Equal                 3      default
+ Greater or equal      4      supported
+ Greater than          5      supported
+ Not equal             6      unsupported
+ Phonetic              100    unsupported
+ Stem                  101    unsupported
+ Relevance             102    supported
+ AlwaysMatches         103    supported
+
+ The relation attribute
+ relevance (102) is supported, see
+ for full information.
+
+
+
+ All ordering operations are based on a lexicographical ordering,
+ except when the
+ structure attribute numeric (109) is used. In
+ this case, ordering is numerical. See
+ .
+
+
+
 Ranked search for information retrieval in the title-register (see for the gory details):
@@ -633,21 +724,169 @@
- Position Attributes (type = 3)
+ Position Attributes (type 3)
+
- Only value of (any position(3) is supported. first in field(1),
- and first in subfield(2) are unsupported but using them
- does not trigger an error.
+ The position attribute specifies the location of the search term
+ within the field or subfield in which it appears.
+
+ Position Attributes (type 3)
+ Position                 Value  Notes
+ First in field           1      unsupported
+ First in subfield        2      unsupported
+ Any position in field    3      default
+
+
+ The position attribute values first in field (1),
+ and first in subfield (2) are unsupported.
+ Using them does not trigger an error; they silently default to
+ any position in field (3).
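
The relation and position values listed above can be tried out by composing a query string on the client side. The following is a minimal Python sketch, assuming YAZ-style PQF syntax (`@attr type=value`) and the Bib-1 use attribute 4 (title). The helper names `effective_position` and `pqf` are hypothetical and not part of Zebra or YAZ; the fallback only imitates the silent default described above and is not Zebra's own code.

```python
# Hypothetical client-side helper; names are illustrative, not a Zebra/YAZ API.

# Supported relation attribute values (type 2), taken from the table above.
RELATION = {
    "lt": 1, "le": 2, "eq": 3, "ge": 4, "gt": 5,
    "relevance": 102, "always": 103,
}

ANY_POSITION = 3                 # position attribute default (type 3)
UNSUPPORTED_POSITIONS = {1, 2}   # first in field, first in subfield


def effective_position(value):
    """Imitate the described behaviour: unsupported values fall back to 3."""
    return ANY_POSITION if value in UNSUPPORTED_POSITIONS else value


def pqf(term, use, relation="eq", position=ANY_POSITION):
    """Build a PQF string with use (1), relation (2) and position (3) attributes."""
    if '"' in term:
        # Quoting rules are outside the scope of this sketch.
        raise ValueError("term must not contain a double quote")
    attrs = [
        f"@attr 1={use}",
        f"@attr 2={RELATION[relation]}",
        f"@attr 3={effective_position(position)}",
    ]
    return " ".join(attrs) + f' "{term}"'


if __name__ == "__main__":
    # Ranked title search; position 1 is requested but treated as 3.
    print(pqf("information retrieval", use=4, relation="relevance", position=1))
```

Keeping the fallback in a separate function makes the difference between the requested and the effective position attribute visible to the caller.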
- Structure Attributes (type = 4)
-
+ Structure Attributes (type 4)
+
+
+ The structure attribute specifies the type of search
+ term. This causes the search to be mapped to
+ different Zebra internal indexes, which must have been defined
+ at index time.
+
+
+
+ The possible values of the
+ structure attribute (type 4) can be defined
+ using the configuration file
+ tab/default.idx.
+ The default configuration is summarized in this table.
+
+ Structure Attributes (type 4)
+ Structure               Value  Notes
+ Phrase                  1      default
+ Word                    2      supported
+ Key                     3      supported
+ Year                    4      supported
+ Date (normalized)       5      supported
+ Word list               6      supported
+ Date (un-normalized)    100    unsupported
+ Name (normalized)       101    unsupported
+ Name (un-normalized)    102    unsupported
+ Structure               103    unsupported
+ Urx                     104    supported
+ Free-form-text          105    supported
+ Document-text           106    supported
+ Local-number            107    supported
+ String                  108    unsupported
+ Numeric string          109    supported
+ The structure attribute value local-number
+ (107)
+ is supported, and always maps to the Zebra internal document ID.
+
+
+
 For example, in the GILS schema (gils.abs), the west-bounding-coordinate is indexed as type n,
@@ -662,16 +901,86 @@
 Truncation Attributes (type = 5)
+
- Supported are: No truncation(100) which is the default,
- Right truncation(1), Left truncation(2),
- Left&Right truncation(3),
- Process # in term(100) which maps
- each # to .*,
- Regexp-1(102) normal regular, Regexp-2(103) (regular with fuzzy),
+ The truncation attribute specifies whether variations of one or
+ more characters are allowed between search term and hit terms, or
+ not. Using non-default truncation attributes will broaden the
+ document hit set of a search query.
+
+ Truncation Attributes (type 5)
+ Truncation                  Value  Notes
+ Right truncation            1      supported
+ Left truncation             2      supported
+ Left and right truncation   3      supported
+ Do not truncate             100    default
+ Process # in search term    101    supported
+ RegExpr-1                   102    supported
+ RegExpr-2                   103    supported
+
+
+ Truncation attribute value
+ Process # in search term (101) is a
+ poor-man's regular expression search. It maps
+ each # to .*, and
+ then performs a Regexp-1 (102) regular
+ expression search.
+
+
+ Truncation attribute value
+ Regexp-1 (102) is a normal regular expression search,
+ see.
+
+
+ Truncation attribute value
+ Regexp-2 (103) is a Zebra-specific extension
+ which allows fuzzy matches. One single
+ error in spelling of search terms is allowed, i.e., a document
+ is hit if it includes a term which can be mapped to the used
+ search term by one character substitution, addition, deletion or
+ change of position.
+
-
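
The two truncation behaviours described above, mapping each # to .* and allowing a single spelling error, can be illustrated outside Zebra. The following Python sketch is not Zebra's implementation. It assumes that a pattern must match an entire index term, that characters other than # are taken literally, and that "change of position" means swapping two adjacent characters. The function names are hypothetical.

```python
import re


def hash_to_regex(term):
    """Map each '#' to '.*' and escape every other character literally."""
    return ".*".join(re.escape(part) for part in term.split("#"))


def matches_hash_term(term, candidate):
    """True if candidate matches the '#'-pattern as a whole term (assumption)."""
    return re.fullmatch(hash_to_regex(term), candidate) is not None


def within_one_error(a, b):
    """True if b equals a, or differs from it by exactly one substitution,
    addition, deletion, or swap of two adjacent characters."""
    if a == b:
        return True
    la, lb = len(a), len(b)
    if abs(la - lb) > 1:
        return False
    # Index of the first differing character.
    i = 0
    while i < min(la, lb) and a[i] == b[i]:
        i += 1
    if la == lb:
        if a[i + 1:] == b[i + 1:]:
            return True  # one substitution
        return (i + 1 < la and a[i] == b[i + 1] and a[i + 1] == b[i]
                and a[i + 2:] == b[i + 2:])  # adjacent swap
    # Lengths differ by one: drop one character from the longer string.
    longer, shorter = (a, b) if la > lb else (b, a)
    return longer[i + 1:] == shorter[i:]


if __name__ == "__main__":
    print(hash_to_regex("comp#"))
    print(matches_hash_term("c#r", "computer"))
    print(within_one_error("form", "from"))
```

The single-error check stops at the first mismatch instead of computing a full edit distance, which is sufficient when at most one error is allowed.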
@@ -701,34 +1010,40 @@
 Zebra Search Attribute Extensions
- Name and Type
+ Name
+ Value
 Operation
 Zebra version
- Embedded Sort (type 7)
+ Embedded Sort
+ 7
 search
 1.1
- Term Set (type 8)
+ Term Set
+ 8
 search
 1.1
- Rank weight (type 9)
+ Rank Weight
+ 9
 search
 1.1
- Approx Limit (type 9)
+ Approx Limit
+ 9
 search
 1.4
- Term Reference (type 10)
+ Term Reference
+ 10
 search
 1.4