X-Git-Url: http://git.indexdata.com/?a=blobdiff_plain;f=doc%2Ftools.xml;h=450b3fc1602b7dd3a949da8e3ddb6c5d6a4113b2;hb=73f8c92214bd7afdb0e465dec053272130b53bb5;hp=7f1997c3a78894478d631e232ebbd3e9d8e51f5f;hpb=1e5554fc59c8cad2105b2a8cf095d13315dc6ed1;p=yaz-moved-to-github.git diff --git a/doc/tools.xml b/doc/tools.xml index 7f1997c..450b3fc 100644 --- a/doc/tools.xml +++ b/doc/tools.xml @@ -1,4 +1,4 @@ - + Supporting Tools @@ -182,16 +182,120 @@ Version 3 of the Z39.50 specification defines various encoding of terms. - Use the @term type, + Use @term type + string, where type is one of: general, - numeric, string - (for InternationalString), .. + numeric or string + (for InternationalString). If no term type has been given, the general form - is used which is the only encoding allowed in both version 2 - and 3 + is used. This is the only encoding allowed in both versions 2 and 3 of the Z39.50 standard. - PQF queries + + Using Proximity Operators with PQF + + + This is an advanced topic, describing how to construct + queries that make very specific requirements on the + relative location of their operands. + You may wish to skip this section and go straight to + the example PQF queries. + + + + + Most Z39.50 servers do not support proximity searching, or + support only a small subset of the full functionality that + can be expressed using the PQF proximity operator. Be + aware that the ability to express a + query in PQF is no guarantee that any given server will + be able to execute it. + + + + + + The proximity operator @prox is a special + and more restrictive version of the conjunction operator + @and. 
Its semantics are described in + section 3.7.2 (Proximity) of the Z39.50 standard itself, which + can be read on-line at + + + + In PQF, the proximity operation is represented by a sequence + of the form + +@prox exclusion distance ordered relation which-code unit-code + + in which the meanings of the parameters are as described in + the standard, and they can take the following values: + + exclusion + 0 = false (i.e. the proximity condition specified by the + remaining parameters must be satisfied) or + 1 = true (the proximity condition specified by the + remaining parameters must not be + satisfied). + + distance + An integer specifying the difference between the locations + of the operands: e.g. two adjacent words would have + distance=1 since their locations differ by one unit. + + ordered + 1 = ordered (the operands must occur in the order the + query specifies them) or + 0 = unordered (they may appear in either order). + + relation + Recognised values are + 1 (lessThan), + 2 (lessThanOrEqual), + 3 (equal), + 4 (greaterThanOrEqual), + 5 (greaterThan) and + 6 (notEqual). + + which-code + known + or + k + (the unit-code parameter is taken from the well-known list + of alternatives described below) or + private + or + p + (the unit-code parameter has semantics specific to an + out-of-band agreement such as a profile). + + unit-code + If the which-code parameter is known + then the recognised values are + 1 (character), + 2 (word), + 3 (sentence), + 4 (paragraph), + 5 (section), + 6 (chapter), + 7 (document), + 8 (element), + 9 (subelement), + 10 (elementType) and + 11 (byte). + If which-code is private then the + acceptable values are determined by the profile. + + + (The numeric values of the relation and well-known unit-code + parameters are taken straight from + the ASN.1 of the proximity structure in the standard.) + + + + PQF queries Queries using simple terms. @@ -227,8 +331,40 @@ Proximity.
@prox 0 3 1 2 k 2 dylan zimmerman - - + + + Here the parameters 0, 3, 1, 2, k and 2 represent exclusion, + distance, ordered, relation, which-code and unit-code, in that + order. So: + + + exclusion = 0: the proximity condition must hold + + + distance = 3: the terms must be three units apart + + + ordered = 1: they must occur in the order they are specified + + + relation = 2: lessThanOrEqual (to the distance of 3 units) + + + which-code is ``known'', so the standard unit-codes are used + + + unit-code = 2: word. + + + So the whole proximity query means that the words + dylan and zimmerman must + both occur in the record, in that order, differing in position + by three or fewer words (i.e. with two or fewer words between + them). The query would find ``Bob Dylan, aka. Robert + Zimmerman'', but not ``Bob Dylan, born as Robert Zimmerman'' + since the distance in this case is four. + + Specifying term type. @@ -243,16 +379,28 @@ @and @attr 2=4 @attr gils 1=2038 -114 @attr 2=2 @attr gils 1=2039 -109 + + + The last of these examples is a spatial search: in + the GILS attribute set, + access point + 2038 indicates West Bounding Coordinate and + 2039 indicates East Bounding Coordinate, + so the query is for areas extending from -114 degrees + to no more than -109 degrees. + + - + - Common Command Language + CCL Not all users enjoy typing in prefix query structures and numerical attribute values, even in a minimalistic test client. In the library - world, the more intuitive Common Command Language (or ISO 8777) has - enjoyed some popularity - especially before the widespread + world, the more intuitive Common Command Language - CCL (ISO 8777) + has enjoyed some popularity - especially before the widespread availability of graphical interfaces. It is still useful in applications where you for some reason or other need to provide a symbolic language for expressing boolean query structures. @@ -362,73 +510,247 @@ suggest a few short-hand notations.
You can customize the CCL parser to support a particular set of qualifiers to reflect the current target profile. Traditionally, a qualifier would map to a particular - use-attribute within the BIB-1 attribute set. However, you could also - define qualifiers that would set, for example, the - structure-attribute. + use-attribute within the BIB-1 attribute set. It is also + possible to set other attributes, such as the structure + attribute. A CCL profile is a set of predefined CCL qualifiers that may be - read from a file. + read from a file or set in the CCL API. The YAZ client reads its CCL qualifiers from a file named - default.bib. Each line in the file has the form: - - - - qualifier-name - [attributeset,]type=val - [attributeset,]type=val ... - - - - where qualifier-name is the name of the - qualifier to be used (eg. ti), - type is attribute type in the attribute - set (Bib-1 is used if no attribute set is given) and - val is attribute value. - The type can be specified as an - integer or as it be specified either as a single-letter: - u for use, - r for relation,p for position, - s for structure,t for truncation - or c for completeness. - The attributes for the special qualifier name term - are used when no CCL qualifier is given in a query. + default.bib. There are four types of + lines in a CCL profile: qualifier specification, + qualifier alias, comments and directives. - - CCL profile + Qualifier specification - Consider the following definition: + A qualifier specification is of the form: - - ti u=4 s=1 - au u=1 s=1 - term s=105 - ranked r=102 - - Three qualifiers are defined, ti, - au and ranked. - ti and au both set - structure attribute to phrase (s=1). - ti - sets the use-attribute to 4. au sets the - use-attribute to 1. - When no qualifiers are used in the query the structure-attribute is - set to free-form-text (105). + qualifier-name + [attributeset,]type=val + [attributeset,]type=val ... 
+ + + + where qualifier-name is the name of the + qualifier to be used (e.g. ti), + type is attribute type in the attribute + set (Bib-1 is used if no attribute set is given) and + val is attribute value. + The type can be specified as an + integer or as a single letter: + u for use, + r for relation, p for position, + s for structure, t for truncation + or c for completeness. + The attributes for the special qualifier name term + are used when no CCL qualifier is given in a query. - You can combine attributes. To Search for "ranked title" you - can do + The attribute value val may be + specified as an integer. It is also possible to specify + non-numeric values, which are used in combination with + certain types. The special combinations are: + + s=pw + + The structure is set to either word or phrase depending + on the number of tokens in a term (phrase-word). + + + + + s=al + + Each token in the term is ANDed (and-list). + This does not set the structure at all. + + + + + s=ol + + Each token in the term is ORed (or-list). + This does not set the structure at all. + + + + + r=o + + Allows operators greater-than, less-than, ... equals and + sets relation attribute accordingly (relation ordered). + + + + + t=l + + Allows term to be left-truncated. + If term is of the form ?x, the resulting + Type-1 term is x and truncation is left. + + + + + t=r + + Allows term to be right-truncated. + If term is of the form x?, the resulting + Type-1 term is x and truncation is right. + + + + + t=n + + If term does not include ?, the + truncation attribute is set to none (100). + + + + + t=b + + Allows term to be both left and right truncated. + If term is of the form ?x?, the + resulting term is x and truncation is + set to both left and right. + + + + + + + CCL profile + + Consider the following definition: + + - ti,ranked=knuth computer - - which will use "relation is ranked", "use is title", "structure is - phrase".
+ ti u=4 s=1 + au u=1 s=1 + term s=105 + ranked r=102 + + + Three qualifiers are defined, ti, + au and ranked. + ti and au both set + structure attribute to phrase (s=1). + ti + sets the use-attribute to 4. au sets the + use-attribute to 1. + When no qualifiers are used in the query the structure-attribute is + set to free-form-text (105). - - + + You can combine attributes. To search for "ranked title" you + can do + + ti,ranked=knuth computer + + which will use "relation is ranked", "use is title", "structure is + phrase". + + + + Qualifier alias + + A qualifier alias is of the form: + + + q + q1 q2 .. + + + which declares q to + be an alias for q1, + q2... such that the CCL + query q=x is equivalent to + q1=x or q2=x or .... + + + + Comments + + Lines containing only white space, or lines that begin with + the character #, are treated as comments. + + + + Directives + + Directive specifications take the form + + @directive value + + CCL directives + + + + + + + Name + Description + Default + + + + + truncation + Truncation character + ? + + + field + Specifies how multiple fields are to be + combined. There are two modes: or: + multiple qualifier fields are ORed, + merge: attributes for the qualifier + fields are merged and assigned to one term. + + merge + + + case + Specifies whether CCL operators and qualifiers should be + compared with case sensitivity or not. Specify 0 for + case sensitive; 1 for case insensitive. + 0 + + + + and + Specifies token for CCL operator AND. + and + + + + or + Specifies token for CCL operator OR. + or + + + + not + Specifies token for CCL operator NOT. + not + + + + set + Specifies token for CCL operator SET. + set + + + +
+
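The four kinds of CCL profile line described above (qualifier specification, qualifier alias, comment and directive) can be told apart by shape alone. The following is a minimal Python sketch of that classification, written only to illustrate the rules in the text; it is not YAZ's parser, the function name classify_ccl_profile_line is hypothetical, and the alias name "any" in the sample profile is an assumption made for the example.

```python
def classify_ccl_profile_line(line):
    """Classify one CCL profile line (illustrative sketch, not YAZ code)."""
    stripped = line.strip()
    # Blank lines and lines starting with '#' are comments.
    if not stripped or stripped.startswith("#"):
        return ("comment", None)
    tokens = stripped.split()
    # Directives take the form: @directive value
    if tokens[0].startswith("@"):
        return ("directive", (tokens[0][1:], " ".join(tokens[1:])))
    name, rest = tokens[0], tokens[1:]
    # A qualifier specification is a name followed by type=val pairs.
    if rest and all("=" in tok for tok in rest):
        attrs = [tuple(tok.split("=", 1)) for tok in rest]
        return ("qualifier", (name, attrs))
    # Otherwise treat it as an alias: a name followed by other qualifier names.
    return ("alias", (name, rest))


# Sample profile; "any" is a hypothetical alias for ti and au.
PROFILE = """\
# sample profile
@case 1
ti u=4 s=1
au u=1 s=1
term s=105
ranked r=102
any ti au
"""

if __name__ == "__main__":
    for raw in PROFILE.splitlines():
        print(classify_ccl_profile_line(raw))
```

The sketch keeps attribute types and values as strings, so single-letter types such as u and non-numeric values such as pw pass through unchanged; mapping them to Bib-1 numbers is left out on purpose.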
CCL API @@ -570,7 +892,7 @@ int cql_parser_string(CQL_parser cp, const char *str); A CQL query is parsed by the cql_parser_string which takes a query str. If the query was valid (no syntax errors), then zero is returned; - otherwise a non-zero error code is returned. + otherwise -1 is returned to indicate a syntax error. @@ -594,7 +916,7 @@ int cql_parser_stdio(CQL_parser cp, FILE *f); CQL tree - The the query string is validl, the CQL parser + If the query string is valid, the CQL parser generates a tree representing the structure of the CQL query. @@ -771,8 +1093,30 @@ int cql_transform_buf(cql_transform_t ct, If conversion failed, cql_transform_buf - returns a non-zero error code; otherwise zero is returned - (conversion successful). + returns a non-zero SRW error code; otherwise zero is returned + (conversion successful). The meanings of the numeric error + codes are listed in the SRW specifications at + + + + If conversion fails, more information can be obtained by calling + +int cql_transform_error(cql_transform_t ct, char **addinfop); + + This function returns the most recently returned numeric + error-code and sets the string-pointer at + *addinfop to point to a string containing + additional information about the error that occurred: for + example, if the error code is 15 (``Illegal or unsupported index + set''), the additional information is the name of the requested + index set that was not recognised. + + + The SRW error-codes may be translated into brief human-readable + error messages using + +const char *cql_strerror(int code); + If you wish to be able to produce a PQF result in a different