X-Git-Url: http://git.indexdata.com/?a=blobdiff_plain;f=doc%2Ftools.xml;h=56fe958ff30604d8c0f034e1028ea49a82ac4b59;hb=07a80eea989576eaa13633f4e96e57e14b40ea0f;hp=51de23355a296217fa9cd0ecbb01a34250bcac95;hpb=d193403feb3df490f60175d387603f4daf89cf1f;p=yaz-moved-to-github.git diff --git a/doc/tools.xml b/doc/tools.xml index 51de233..56fe958 100644 --- a/doc/tools.xml +++ b/doc/tools.xml @@ -1,4 +1,4 @@ - + Supporting Tools @@ -181,45 +181,218 @@ - Z39.50 version 3 defines various encoding of terms. - Use the @term operator to indicate the encoding type: - general, numeric, - string (for InternationalString), .. + Version 3 of the Z39.50 specification defines various encoding of terms. + Use @term type + string, + where type is one of: general, + numeric or string + (for InternationalString). If no term type has been given, the general form - is used which is the only encoding allowed in both version 2 - and 3 + is used. This is the only encoding allowed in both versions 2 and 3 of the Z39.50 standard. - - The following are all examples of valid queries in the PQF. - - - - dylan - - "bob dylan" - - @or "dylan" "zimmerman" - - @set Result-1 - - @or @and bob dylan @set Result-1 - - @attr 1=4 computer - - @attr 4=1 @and @attr 1=1 "bob dylan" @attr 1=4 "slow train coming" - - @attr 4=1 @attr 1=4 "self portrait" - - @prox 0 3 1 2 k 2 dylan zimmerman - - @and @attr 2=4 @attr gils 1=2038 -114 @attr 2=2 @attr gils 1=2039 -109 - - @term string "a UTF-8 string, maybe?" + + Using Proximity Operators with PQF + + + This is an advanced topic, describing how to construct + queries that make very specific requirements on the + relative location of their operands. + You may wish to skip this section and go straight to + the example PQF queries. + + + + + Most Z39.50 servers do not support proximity searching, or + support only a small subset of the full functionality that + can be expressed using the PQF proximity operator. 
Be + aware that the ability to express a + query in PQF is no guarantee that any given server will + be able to execute it. + + + + + + The proximity operator @prox is a special + and more restrictive version of the conjunction operator + @and. Its semantics are described in + section 3.7.2 (Proximity) of the Z39.50 standard itself, which + can be read on-line at + + + + In PQF, the proximity operation is represented by a sequence + of the form + +@prox exclusion distance ordered relation which-code unit-code + + in which the meanings of the parameters are as described in + the standard, and they can take the following values: + + exclusion + 0 = false (i.e. the proximity condition specified by the + remaining parameters must be satisfied) or + 1 = true (the proximity condition specified by the + remaining parameters must not be + satisfied). + + distance + An integer specifying the difference between the locations + of the operands: e.g. two adjacent words would have + distance=1 since their locations differ by one unit. + + ordered + 1 = ordered (the operands must occur in the order the + query specifies them) or + 0 = unordered (they may appear in either order). + + relation + Recognised values are + 1 (lessThan), + 2 (lessThanOrEqual), + 3 (equal), + 4 (greaterThanOrEqual), + 5 (greaterThan) and + 6 (notEqual). + + which-code + known + or + k + (the unit-code parameter is taken from the well-known list + of alternatives described below) or + private + or + p + (the unit-code parameter has semantics specific to an + out-of-band agreement such as a profile). + + unit-code + If the which-code parameter is known + then the recognised values are + 1 (character), + 2 (word), + 3 (sentence), + 4 (paragraph), + 5 (section), + 6 (chapter), + 7 (document), + 8 (element), + 9 (subelement), + 10 (elementType) and + 11 (byte). + If which-code is private then the + acceptable values are determined by the profile. 
+ + + (The numeric values of the relation and well-known unit-code + parameters are taken straight from + the ASN.1 of the proximity structure in the standard.) + + - @attr 1=/book/title computer - + PQF queries + Queries using simple terms. + + dylan + "bob dylan" + + + Boolean operators. + + @or "dylan" "zimmerman" + @and @or dylan zimmerman when + @and when @or dylan zimmerman + + + + Reference to result sets. + + @set Result-1 + @and @set seta setb + + + + Attributes for terms. + + @attr 1=4 computer + @attr 1=4 @attr 4=1 "self portrait" + @attr exp1 @attr 1=1 CategoryList + @attr gils 1=2008 Copenhagen + @attr 1=/book/title computer + + + + Proximity. + + @prox 0 3 1 2 k 2 dylan zimmerman + + + Here the parameters 0, 3, 1, 2, k and 2 represent exclusion, + distance, ordered, relation, which-code and unit-code, in that + order. So: + + + exclusion = 0: the proximity condition must hold + + + distance = 3: the terms must be three units apart + + + ordered = 1: they must occur in the order they are specified + + + relation = 2: lessThanOrEqual (to the distance of 3 units) + + + which-code is ``known'', so the standard unit-codes are used + + + unit-code = 2: word. + + + So the whole proximity query means that the words + dylan and zimmerman must + both occur in the record, in that order, differing in position + by three or fewer words (i.e. with two or fewer words between + them.) The query would find ``Bob Dylan, aka. Robert + Zimmerman'', but not ``Bob Dylan, born as Robert Zimmerman'' + since the distance in this case is four. + + + + Specifying term type. + + @term string "a UTF-8 string, maybe?" 
+ + + Mixed queries + + @or @and bob dylan @set Result-1 + + @attr 4=1 @and @attr 1=1 "bob dylan" @attr 1=4 "slow train coming" + + @and @attr 2=4 @attr gils 1=2038 -114 @attr 2=2 @attr gils 1=2039 -109 + + + + The last of these examples is a spatial search: in + the GILS attribute set, + access point + 2038 indicates West Bounding Coordinate and + 2039 indicates East Bounding Coordinate, + so the query is for areas extending from -114 degrees + to no more than -109 degrees. + + + + Common Command Language @@ -293,40 +466,43 @@ -- Proximity operator - - - The following queries are all valid: - - - - dylan - - "bob dylan" - - dylan or zimmerman - - set=1 - - (dylan and bob) or set=1 - - - - Assuming that the qualifiers ti, au - and date are defined we may use: - - - - ti=self portrait - - au=(bob dylan and slow train coming) - - date>1980 and (ti=((self portrait))) - - - + + CCL queries + + The following queries are all valid: + + + + dylan + + "bob dylan" + + dylan or zimmerman + + set=1 + + (dylan and bob) or set=1 + + + + Assuming that the qualifiers ti, + au + and date are defined we may use: + + + + ti=self portrait + + au=(bob dylan and slow train coming) + + date>1980 and (ti=((self portrait))) + + + + CCL Qualifiers - + Qualifiers are used to direct the search to a particular searchable index, such as title (ti) and author indexes (au). The CCL standard @@ -340,66 +516,67 @@ - Consider a scenario where the target support ranked searches in the - title-index. In this case, the user could specify - - - - ti,ranked=knuth computer - - - and the ranked would map to relation=relevance - (2=102) and the ti would map to title (1=4). - - - - A "profile" with a set predefined CCL qualifiers can be read from a - file. The YAZ client reads its CCL qualifiers from a file named + A CCL profile is a set of predefined CCL qualifiers that may be + read from a file. + The YAZ client reads its CCL qualifiers from a file named default.bib. 
Each line in the file has the form: qualifier-name - type=val - type=val ... + [attributeset,]type=val + [attributeset,]type=val ... where qualifier-name is the name of the qualifier to be used (eg. ti), - type is a BIB-1 category type and - val is the corresponding BIB-1 attribute - value. - The type can be either numeric or it may be - either u (use), r (relation), - p (position), s (structure), - t (truncation) or c (completeness). - The qualifier-name term - has a special meaning. - The types and values for this definition is used when - no qualifiers are present. - - - - Consider the following definition: - - - - ti u=4 s=1 - au u=1 s=1 - term s=105 - - - Two qualifiers are defined, ti and - au. - They both set the structure-attribute to phrase (1). - ti - sets the use-attribute to 4. au sets the - use-attribute to 1. - When no qualifiers are used in the query the structure-attribute is - set to free-form-text (105). + type is the attribute type in the attribute + set (Bib-1 is used if no attribute set is given) and + val is the attribute value. + The type can be specified as an + integer or as a single letter: + u for use, + r for relation, p for position, + s for structure, t for truncation + or c for completeness. + The attributes for the special qualifier name term + are used when no CCL qualifier is given in a query. + CCL profile + + Consider the following definition: + + + + ti u=4 s=1 + au u=1 s=1 + term s=105 + ranked r=102 + + + Three qualifiers are defined, ti, + au and ranked. + ti and au both set the + structure attribute to phrase (s=1). + ti + sets the use-attribute to 4. au sets the + use-attribute to 1. + When no qualifiers are used in the query the structure-attribute is + set to free-form-text (105). + + + You can combine attributes. To search for "ranked title" you + can do + + ti,ranked=knuth computer + + which will use "relation is ranked", "use is title", "structure is + phrase". 
+ + + CCL API @@ -600,7 +777,7 @@ struct cql_node { struct cql_node *right; struct cql_node *modifiers; struct cql_node *prefixes; - } bool; + } boolean; struct { char *name; char *value; @@ -742,8 +919,23 @@ int cql_transform_buf(cql_transform_t ct, If conversion failed, cql_transform_buf - returns a non-zero error code; otherwise zero is returned - (conversion successful). + returns a non-zero SRW error code; otherwise zero is returned + (conversion successful). The meanings of the numeric error + codes are listed in the SRW specifications at + + + + If conversion fails, more information can be obtained by calling + +int cql_transform_error(cql_transform_t ct, char **addinfop); + + This function returns the most recently returned numeric + error-code and sets the string-pointer at + *addinfop to point to a string containing + additional information about the error that occurred: for + example, if the error code is 15 (``Illegal or unsupported index + set''), the additional information is the name of the requested + index set that was not recognised. If you wish to be able to produce a PQF result in a different @@ -902,12 +1094,12 @@ int cql_transform_FILE(cql_transform_t ct, - Small CQL to RPN mapping file + CQL to RPN mapping file - This small file defines two index sets, three qualifiers and three + This simple file defines two index sets, three qualifiers and three relations, a position pattern and a default structure. - + With the mappings above, the CQL query