X-Git-Url: http://git.indexdata.com/?a=blobdiff_plain;f=doc%2Ftools.xml;h=7f1997c3a78894478d631e232ebbd3e9d8e51f5f;hb=ccb068efa655ace7835a122c1226356f2f933992;hp=051daf29a46cdf490139aa8e540edcb87bad3e5a;hpb=6406d7955629f3b8d4bbeae9fc93ec1bb9d5366a;p=yaz-moved-to-github.git
diff --git a/doc/tools.xml b/doc/tools.xml
index 051daf2..7f1997c 100644
--- a/doc/tools.xml
+++ b/doc/tools.xml
@@ -1,4 +1,4 @@
-
+
 Supporting Tools
@@ -181,45 +181,70 @@
- Z39.50 version 3 defines various encoding of terms.
- Use the @term operator to indicate the encoding type:
- general, numeric,
- string (for InternationalString), ..
+ Version 3 of the Z39.50 specification defines various encoding of terms.
+ Use the @term type,
+ where type is one of: general,
+ numeric, string
+ (for InternationalString), ..
 If no term type has been given, the general form is used which is the only encoding allowed in both version 2
- and 3 of the Z39.50 standard.
-
- The following are all examples of valid queries in the PQF.
-
-
-
- dylan
-
- "bob dylan"
-
- @or "dylan" "zimmerman"
-
- @set Result-1
-
- @or @and bob dylan @set Result-1
-
- @attr 1=4 computer
-
- @attr 4=1 @and @attr 1=1 "bob dylan" @attr 1=4 "slow train coming"
-
- @attr 4=1 @attr 1=4 "self portrait"
-
- @prox 0 3 1 2 k 2 dylan zimmerman
-
- @and @attr 2=4 @attr gils 1=2038 -114 @attr 2=2 @attr gils 1=2039 -109
-
- @term string "a UTF-8 string, maybe?"
-
- @attr 1=/book/title computer
-
+ PQF queries
+ Queries using simple terms.
+
+ dylan
+ "bob dylan"
+
+
+ Boolean operators.
+
+ @or "dylan" "zimmerman"
+ @and @or dylan zimmerman when
+ @and when @or dylan zimmerman
+
+
+
+ Reference to result sets.
+
+ @set Result-1
+ @and @set seta setb
+
+
+
+ Attributes for terms.
+
+ @attr 1=4 computer
+ @attr 1=4 @attr 4=1 "self portrait"
+ @attr exp1 @attr 1=1 CategoryList
+ @attr gils 1=2008 Copenhagen
+ @attr 1=/book/title computer
+
+
+
+ Proximity.
+
+ @prox 0 3 1 2 k 2 dylan zimmerman
+
+
+
+ Specifying term type.
+
+ @term string "a UTF-8 string, maybe?"
+
+
+ Mixed queries
+
+ @or @and bob dylan @set Result-1
+
+ @attr 4=1 @and @attr 1=1 "bob dylan" @attr 1=4 "slow train coming"
+
+ @and @attr 2=4 @attr gils 1=2038 -114 @attr 2=2 @attr gils 1=2039 -109
+
+
+
 Common Command Language
@@ -293,40 +318,43 @@
 -- Proximity operator
-
-
- The following queries are all valid:
-
-
-
- dylan
-
- "bob dylan"
-
- dylan or zimmerman
-
- set=1
-
- (dylan and bob) or set=1
-
-
-
- Assuming that the qualifiers ti, au
- and date are defined we may use:
-
-
-
- ti=self portrait
-
- au=(bob dylan and slow train coming)
-
- date>1980 and (ti=((self portrait)))
-
-
-
+
+ CCL queries
+
+ The following queries are all valid:
+
+
+
+ dylan
+
+ "bob dylan"
+
+ dylan or zimmerman
+
+ set=1
+
+ (dylan and bob) or set=1
+
+
+
+ Assuming that the qualifiers ti,
+ au
+ and date are defined we may use:
+
+
+
+ ti=self portrait
+
+ au=(bob dylan and slow train coming)
+
+ date>1980 and (ti=((self portrait)))
+
+
+
+
 CCL Qualifiers
-
+
 Qualifiers are used to direct the search to a particular searchable index, such as title (ti) and author indexes (au). The CCL standard
@@ -340,66 +368,67 @@
- Consider a scenario where the target support ranked searches in the
- title-index. In this case, the user could specify
-
-
-
- ti,ranked=knuth computer
-
-
- and the ranked would map to relation=relevance
- (2=102) and the ti would map to title (1=4).
-
-
-
- A "profile" with a set predefined CCL qualifiers can be read from a
- file. The YAZ client reads its CCL qualifiers from a file named
+ A CCL profile is a set of predefined CCL qualifiers that may be
+ read from a file.
+ The YAZ client reads its CCL qualifiers from a file named
 default.bib. Each line in the file has the form:
 qualifier-name
- type=val
- type=val ...
+ [attributeset,]type=val
+ [attributeset,]type=val ...
 where qualifier-name is the name of the qualifier to be used (eg.
 ti),
- type is a BIB-1 category type and
- val is the corresponding BIB-1 attribute
- value.
- The type can be either numeric or it may be
- either u (use), r (relation),
- p (position), s (structure),
- t (truncation) or c (completeness).
- The qualifier-name term
- has a special meaning.
- The types and values for this definition is used when
- no qualifiers are present.
-
-
-
- Consider the following definition:
-
-
-
- ti u=4 s=1
- au u=1 s=1
- term s=105
-
-
- Two qualifiers are defined, ti and
- au.
- They both set the structure-attribute to phrase (1).
- ti
- sets the use-attribute to 4. au sets the
- use-attribute to 1.
- When no qualifiers are used in the query the structure-attribute is
- set to free-form-text (105).
+ type is attribute type in the attribute
+ set (Bib-1 is used if no attribute set is given) and
+ val is attribute value.
+ The type can be specified as an
+ integer or as it be specified either as a single-letter:
+ u for use,
+ r for relation,p for position,
+ s for structure,t for truncation
+ or c for completeness.
+ The attributes for the special qualifier name term
+ are used when no CCL qualifier is given in a query.
+ CCL profile
+
+ Consider the following definition:
+
+
+
+ ti u=4 s=1
+ au u=1 s=1
+ term s=105
+ ranked r=102
+
+
+ Three qualifiers are defined, ti,
+ au and ranked.
+ ti and au both set
+ structure attribute to phrase (s=1).
+ ti
+ sets the use-attribute to 4. au sets the
+ use-attribute to 1.
+ When no qualifiers are used in the query the structure-attribute is
+ set to free-form-text (105).
+
+
+ You can combine attributes. To Search for "ranked title" you
+ can do
+
+ ti,ranked=knuth computer
+
+ which will use "relation is ranked", "use is title", "structure is
+ phrase".
+
+
+
 CCL API
@@ -600,7 +629,7 @@ struct cql_node {
 struct cql_node *right;
 struct cql_node *modifiers;
 struct cql_node *prefixes;
- } bool;
+ } boolean;
 struct {
 char *name;
 char *value;
@@ -707,7 +736,7 @@ struct cql_node {
 Conversion to PQF (and Z39.50 RPN) is tricky by the fact that the resulting RPN depends on the Z39.50 target capabilities (combinations of supported attributes).
- Furthermore, the CQL and SRW operates on index prefixes
+ In addition, the CQL and SRW operates on index prefixes
 (URI or strings), whereas the RPN uses Object Identifiers for attribute sets.
@@ -721,7 +750,7 @@ cql_transform_t cql_transform_open_fname(const char *fname);
 void cql_transform_close(cql_transform_t ct);
 The first two functions create a tranformation handle from
- either an already open FILE or from a filename.
+ either an already open FILE or from a filename respectively.
 The handle is destroyed by cql_transform_close
@@ -762,7 +791,202 @@ int cql_transform_FILE(cql_transform_t ct,
 open FILE.
- CQL to XCQL conversion
+
+ Specification of CQL to RPN mapping
+
+ The file supplied to functions
+ cql_transform_open_FILE,
+ cql_transform_open_fname follows
+ a structure found in many Unix utilities.
+ It consists of mapping specifications - one per line.
+ Lines starting with # are ignored (comments).
+
+
+ Each line is of the form
+
+ CQL pattern = RPN equivalent
+
+
+
+ An RPN pattern is a simple attribute list. Each attribute pair
+ takes the form:
+
+ [set] type=value
+
+ The attribute set is optional.
+ The type is the attribute type,
+ value the attribute value.
+
+
+ The following CQL patterns are recognized:
+
+
+ qualifier.set.name
+
+
+
+ This pattern is invoked when a CQL qualifier, such as
+ dc.title is converted. set
+ and name is the index set and qualifier
+ name respectively.
+ Typically, the RPN specifies an equivalent use attribute.
+
+
+ For terms not bound by a qualifier the pattern
+ qualifier.srw.serverChoice is used.
+ Here, the prefix srw is defined as
+ http://www.loc.gov/zing/cql/srw-indexes/v1.0/.
+ If this pattern is not defined, the mapping will fail.
+
+
+
+
+ relation.relation
+
+
+
+ This pattern specifies how a CQL relation is mapped to RPN.
+ pattern is name of relation
+ operator. Since = is used as
+ separator between CQL pattern and RPN, CQL relations
+ including = cannot be
+ used directly. To avoid a conflict, the names
+ ge,
+ eq,
+ le,
+ must be used for CQL operators, greater-than-or-equal,
+ equal, less-than-or-equal respectively.
+ The RPN pattern is supposed to include a relation attribute.
+
+
+ For terms not bound by a relation, the pattern
+ relation.scr is used. If the pattern
+ is not defined, the mapping will fail.
+
+
+ The special pattern, relation.* is used
+ when no other relation pattern is matched.
+
+
+
+
+
+ relationModifier.mod
+
+
+
+ This pattern specifies how a CQL relation modifier is mapped to RPN.
+ The RPN pattern is usually a relation attribute.
+
+
+
+
+
+ structure.type
+
+
+
+ This pattern specifies how a CQL structure is mapped to RPN.
+ Note that this CQL pattern is somewhat to similar to
+ CQL pattern relation.
+ The type is a CQL relation.
+
+
+ The pattern, structure.* is used
+ when no other structure pattern is matched.
+ Usually, the RPN equivalent specifies a structure attribute.
+
+
+
+
+
+ position.type
+
+
+
+ This pattern specifies how the anchor (position) of
+ CQL is mapped to RPN.
+ The type is one
+ of first, any,
+ last, firstAndLast.
+
+
+ The pattern, position.* is used
+ when no other position pattern is matched.
+
+
+
+
+
+ set.prefix
+
+
+
+ This specification defines a CQL index set for a given prefix.
+ The value on the right hand side is the URI for the set -
+ not RPN. All prefixes used in
+ qualifier patterns must be defined this way.
+
+
+
+
+
+ CQL to RPN mapping file
+
+ This simple file defines two index sets, three qualifiers and three
+ relations, a position pattern and a default structure.
+
+
+
+
+ With the mappings above, the CQL query
+
+ computer
+
+ is converted to the PQF:
+
+ @attr 1=1016 @attr 2=3 @attr 4=1 @attr 3=3 @attr 6=1 "computer"
+
+ by rules qualifier.srw.serverChoice,
+ relation.scr, structure.*,
+ position.any.
+
+
+ CQL query
+
+ computer^
+
+ is rejected, since position.right is
+ undefined.
+
+
+ CQL query
+
+ >my = "http://www.loc.gov/zing/cql/dc-indexes/v1.0/" my.title = x
+
+ is converted to
+
+ @attr 1=4 @attr 2=3 @attr 4=1 @attr 3=3 @attr 6=1 "x"
+
+
+
+
+ CQL to XCQL conversion
 Conversion from CQL to XCQL is trivial and does not require a mapping to be defined.