X-Git-Url: http://git.indexdata.com/?p=yaz-moved-to-github.git;a=blobdiff_plain;f=doc%2Ftools.xml;h=b0625684ed927ec53fe621c5a4cfdbb418e11585;hp=f82743623405a4a7b7ab56f8ed76491ba68933c1;hb=4f8ea8cfaf2f3d95e4efcf9494526c2b4be43eb8;hpb=6011a7156d007b94abde4bfc3d427a1bd853cd86 diff --git a/doc/tools.xml b/doc/tools.xml index f827436..b062568 100644 --- a/doc/tools.xml +++ b/doc/tools.xml @@ -1,4 +1,4 @@ - + Supporting Tools @@ -16,7 +16,7 @@ Z_RPNQuery structure. Some programmers will prefer to construct the query manually, perhaps using odr_malloc() to simplify memory management. - The &yaz; distribution includes two separate, query-generating tools + The &yaz; distribution includes three separate, query-generating tools that may be of use to you. @@ -131,7 +131,7 @@ top-set ::= [ '@attrset' string ] - query-struct ::= attr-spec | simple | complex | '@term' term-type + query-struct ::= attr-spec | simple | complex | '@term' term-type query attr-spec ::= '@attr' [ string ] string query-struct @@ -173,69 +173,270 @@ The @attr operator is followed by an attribute specification (attr-spec above). The specification consists - of optional an attribute set, an attribute type-value pair and - a sub query. The attribute type-value pair is packed in one string: - an attribute type, a dash, followed by an attribute value. + of an optional attribute set, an attribute type-value pair and + a sub-query. The attribute type-value pair is packed in one string: + an attribute type, an equals sign, and an attribute value, like this: + @attr 1=1003. The type is always an integer but the value may be either an integer or a string (if it doesn't start with a digit character). + A string attribute-value is encoded as a Type-1 ``complex'' + attribute with the list of values containing the single string + specified, and including no semantic indicators. - Z39.50 version 3 defines various encoding of terms. 
- Use the @term operator to indicate the encoding type: - general, numeric, - string (for InternationalString), .. + Version 3 of the Z39.50 specification defines various encodings of terms. + Use @term type + string, + where type is one of: general, + numeric or string + (for InternationalString). If no term type has been given, the general form - is used which is the only encoding allowed in both version 2 - and 3 + is used. This is the only encoding allowed in both versions 2 and 3 of the Z39.50 standard. - - The following are all examples of valid queries in the PQF. - - - - dylan - - "bob dylan" - - @or "dylan" "zimmerman" + + Using Proximity Operators with PQF + + + This is an advanced topic, describing how to construct + queries that make very specific requirements on the + relative location of their operands. + You may wish to skip this section and go straight to + the example PQF queries. + + + + + Most Z39.50 servers do not support proximity searching, or + support only a small subset of the full functionality that + can be expressed using the PQF proximity operator. Be + aware that the ability to express a + query in PQF is no guarantee that any given server will + be able to execute it. + + + + + + The proximity operator @prox is a special + and more restrictive version of the conjunction operator + @and. Its semantics are described in + section 3.7.2 (Proximity) of the Z39.50 standard itself, which + can be read on-line at + + + + In PQF, the proximity operation is represented by a sequence + of the form + +@prox exclusion distance ordered relation which-code unit-code + + in which the meanings of the parameters are as described in + the standard, and they can take the following values: + + exclusion + 0 = false (i.e. the proximity condition specified by the + remaining parameters must be satisfied) or + 1 = true (the proximity condition specified by the + remaining parameters must not be + satisfied). 
+ + distance + An integer specifying the difference between the locations + of the operands: e.g. two adjacent words would have + distance=1 since their locations differ by one unit. + + ordered + 1 = ordered (the operands must occur in the order the + query specifies them) or + 0 = unordered (they may appear in either order). + + relation + Recognised values are + 1 (lessThan), + 2 (lessThanOrEqual), + 3 (equal), + 4 (greaterThanOrEqual), + 5 (greaterThan) and + 6 (notEqual). + + which-code + known + or + k + (the unit-code parameter is taken from the well-known list + of alternatives described below) or + private + or + p + (the unit-code parameter has semantics specific to an + out-of-band agreement such as a profile). + + unit-code + If the which-code parameter is known + then the recognised values are + 1 (character), + 2 (word), + 3 (sentence), + 4 (paragraph), + 5 (section), + 6 (chapter), + 7 (document), + 8 (element), + 9 (subelement), + 10 (elementType) and + 11 (byte). + If which-code is private then the + acceptable values are determined by the profile. + + + (The numeric values of the relation and well-known unit-code + parameters are taken straight from + the ASN.1 of the proximity structure in the standard.) + + - @set Result-1 + PQF queries - @or @and bob dylan @set Result-1 + + PQF queries using simple terms + + + dylan - @attr 1=4 computer + "bob dylan" + + + + + PQF boolean operators + + + @or "dylan" "zimmerman" - @attr 4=1 @and @attr 1=1 "bob dylan" @attr 1=4 "slow train coming" + @and @or dylan zimmerman when - @attr 4=1 @attr 1=4 "self portrait" + @and when @or dylan zimmerman + + + + + PQF references to result sets + + + @set Result-1 - @prox 0 3 1 2 k 2 dylan zimmerman + @and @set seta @set setb + + + + + Attributes for terms + + + @attr 1=4 computer - @and @attr 2=4 @attr gils 1=2038 -114 @attr 2=2 @attr gils 1=2039 -109 + @attr 1=4 @attr 4=1 "self portrait" - @term string "a UTF-8 string, maybe?" 
+ @attrset exp1 @attr 1=1 CategoryList - @attr 1=/book/title computer - + @attr gils 1=2008 Copenhagen + @attr 1=/book/title computer + + + + + PQF Proximity queries + + + @prox 0 3 1 2 k 2 dylan zimmerman + + + Here the parameters 0, 3, 1, 2, k and 2 represent exclusion, + distance, ordered, relation, which-code and unit-code, in that + order. So: + + + exclusion = 0: the proximity condition must hold + + + distance = 3: the terms must be three units apart + + + ordered = 1: they must occur in the order they are specified + + + relation = 2: lessThanOrEqual (to the distance of 3 units) + + + which-code is ``known'', so the standard unit-codes are used + + + unit-code = 2: word. + + + So the whole proximity query means that the words + dylan and zimmerman must + both occur in the record, in that order, differing in position + by three or fewer words (i.e. with two or fewer words between + them). The query would find ``Bob Dylan, aka. Robert + Zimmerman'', but not ``Bob Dylan, born as Robert Zimmerman'' + since the distance in this case is four. + + + + + PQF specification of search term type + + + @term string "a UTF-8 string, maybe?" + + + + + PQF mixed queries + + + @or @and bob dylan @set Result-1 + + @attr 4=1 @and @attr 1=1 "bob dylan" @attr 1=4 "slow train coming" + + @and @attr 2=4 @attr gils 1=2038 -114 @attr 2=2 @attr gils 1=2039 -109 + + + + The last of these examples is a spatial search: in + the GILS attribute set, + access point + 2038 indicates West Bounding Coordinate and + 2039 indicates East Bounding Coordinate, + so the query is for areas extending from -114 degrees + to no more than -109 degrees. + + + + + - Common Command Language + CCL Not all users enjoy typing in prefix query structures and numerical attribute values, even in a minimalistic test client. In the library
In the library - world, the more intuitive Common Command Language (or ISO 8777) has - enjoyed some popularity - especially before the widespread + world, the more intuitive Common Command Language - CCL (ISO 8777) + has enjoyed some popularity - especially before the widespread availability of graphical interfaces. It is still useful in applications where you for some reason or other need to provide a symbolic language for expressing boolean query structures. - The EUROPAGATE - research project working under the Libraries programme + The EUROPAGATE research project working under the Libraries programme of the European Commission's DG XIII has, amongst other useful tools, implemented a general-purpose CCL parser which produces an output structure that can be trivially converted to the internal RPN @@ -245,7 +446,8 @@ license, it is included as a supplement to &yaz;. - CCL Syntax + + CCL Syntax The CCL parser obeys the following grammar for the FIND argument. @@ -293,40 +495,45 @@ -- Proximity operator - - - The following queries are all valid: - - - - dylan - - "bob dylan" - - dylan or zimmerman - - set=1 - - (dylan and bob) or set=1 - - - - Assuming that the qualifiers ti, au - and date are defined we may use: - - - - ti=self portrait - - au=(bob dylan and slow train coming) - - date>1980 and (ti=((self portrait))) - - - + + + CCL queries + + The following queries are all valid: + + + + dylan + + "bob dylan" + + dylan or zimmerman + + set=1 + + (dylan and bob) or set=1 + + + + Assuming that the qualifiers ti, + au + and date are defined we may use: + + + + ti=self portrait + + au=(bob dylan and slow train coming) + + date>1980 and (ti=((self portrait))) + + + + - CCL Qualifiers - + + CCL Qualifiers + Qualifiers are used to direct the search to a particular searchable index, such as title (ti) and author indexes (au). The CCL standard @@ -334,74 +541,357 @@ suggest a few short-hand notations. 
You can customize the CCL parser to support a particular set of qualifiers to reflect the current target profile. Traditionally, a qualifier would map to a particular - use-attribute within the BIB-1 attribute set. However, you could also - define qualifiers that would set, for example, the - structure-attribute. - - - - Consider a scenario where the target support ranked searches in the - title-index. In this case, the user could specify - - - - ti,ranked=knuth computer - - - and the ranked would map to relation=relevance - (2=102) and the ti would map to title (1=4). - - - - A "profile" with a set predefined CCL qualifiers can be read from a - file. The YAZ client reads its CCL qualifiers from a file named - default.bib. Each line in the file has the form: + use-attribute within the BIB-1 attribute set. It is also + possible to set other attributes, such as the structure + attribute. - qualifier-name - type=val - type=val ... - - - - where qualifier-name is the name of the - qualifier to be used (eg. ti), - type is a BIB-1 category type and - val is the corresponding BIB-1 attribute - value. - The type can be either numeric or it may be - either u (use), r (relation), - p (position), s (structure), - t (truncation) or c (completeness). - The qualifier-name term - has a special meaning. - The types and values for this definition is used when - no qualifiers are present. - - - - Consider the following definition: + A CCL profile is a set of predefined CCL qualifiers that may be + read from a file or set in the CCL API. + The YAZ client reads its CCL qualifiers from a file named + default.bib. There are four types of + lines in a CCL profile: qualifier specification, + qualifier alias, comments and directives. + + Qualifier specification + + A qualifier specification is of the form: + + + + qualifier-name + [attributeset,]type=val + [attributeset,]type=val ... + + + + where qualifier-name is the name of the + qualifier to be used (eg. 
ti), + type is the attribute type in the attribute + set (Bib-1 is used if no attribute set is given) and + val is the attribute value. + The type can be specified either as an + integer or as a single letter: + u for use, + r for relation, p for position, + s for structure, t for truncation + or c for completeness. + The attributes for the special qualifier name term + are used when no CCL qualifier is given in a query. + + Common Bib-1 attributes + + + + + + Type + Description + + + + + u=value + + Use attribute (1). Common use attributes are + 1 Personal-name, 4 Title, 7 ISBN, 8 ISSN, 30 Date, + 62 Subject, 1003 Author, 1016 Any. Specify value + as an integer. + + + + + r=value + + Relation attribute (2). Common values are + 1 <, 2 <=, 3 =, 4 >=, 5 >, 6 <>, + 100 phonetic, 101 stem, 102 relevance, 103 always matches. + + + + + p=value + + Position attribute (3). Values: 1 first in field, 2 + first in any subfield, 3 any position in field. + + + + + s=value + + Structure attribute (4). Values: 1 phrase, 2 word, + 3 key, 4 year, 5 date, 6 word list, 100 date (un), + 101 name (norm), 102 name (un), 103 structure, 104 urx, + 105 free-form-text, 106 document-text, 107 local-number, + 108 string, 109 numeric string. + + + + + t=value + + Truncation attribute (5). Values: 1 right, 2 left, + 3 left & right, 100 none, 101 process #, 102 regular-1, + 103 regular-2, 104 CCL. + + + + + c=value + + Completeness attribute (6). Values: 1 incomplete subfield, + 2 complete subfield, 3 complete field. + + + + + +
+
+ + Refer to the complete + list of Bib-1 attributes + + + It is also possible to specify non-numeric attribute values, + which are used in combination with certain types. + The special combinations are: + + + Special attribute combos + + + + + + Name + Description + + + + + s=pw + The structure is set to either word or phrase depending + on the number of tokens in a term (phrase-word). + + + + s=al + Each token in the term is ANDed (and-list). + This does not set the structure at all. + + + + s=ol + Each token in the term is ORed (or-list). + This does not set the structure at all. + + + + r=o + Allows ranges and the operators greater-than, less-than, ... + equals. + This sets the Bib-1 relation attribute accordingly (relation + ordered). A query construct is only treated as a range if + a dash is used and it is surrounded by white space. So + -1980 is treated as term + "-1980" not <= 1980. + If - 1980 is used, however, that is + treated as a range. + + + + r=r + Similar to r=o but assumes that terms + are non-negative (not prefixed with -). + Thus, a dash will always be treated as a range. + The construct 1980-1990 is + treated as a range with r=r but as a + single term "1980-1990" with + r=o. The special attribute + r=r is available in YAZ 2.0.24 or later. + + + + t=l + Allows term to be left-truncated. + If term is of the form ?x, the resulting + Type-1 term is x and truncation is left. + + + + t=r + Allows term to be right-truncated. + If term is of the form x?, the resulting + Type-1 term is x and truncation is right. + + + + t=n + If term does not include ?, the + truncation attribute is set to none (100). + + + + t=b + Allows term to be both left&right truncated. + If term is of the form ?x?, the + resulting term is x and truncation is + set to both left&right. + + + + +
+
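The truncation rules in the table above can be illustrated with a small, self-contained C sketch. This is not YAZ code: `classify_truncation` and the `TRUNC_*` names are hypothetical, and the sketch assumes that `t=l`, `t=r`, `t=b` and `t=n` are all enabled for the qualifier at once. It strips the truncation character from the term and reports the Bib-1 truncation value (1 right, 2 left, 3 left & right, 100 none) that would apply.

```c
#include <stddef.h>
#include <string.h>

/* Bib-1 truncation attribute (type 5) values, as listed in the tables above. */
enum { TRUNC_RIGHT = 1, TRUNC_LEFT = 2, TRUNC_BOTH = 3, TRUNC_NONE = 100 };

/*
 * Hypothetical helper (not part of YAZ). Copies `term` into `out` without
 * a leading and/or trailing truncation character and returns the Bib-1
 * truncation value that applies. Returns -1 if `out` has no room at all.
 */
int classify_truncation(const char *term, char trunc_ch,
                        char *out, size_t outsz)
{
    size_t len = strlen(term);
    size_t start = 0;
    size_t n;
    int left = 0, right = 0;

    if (outsz == 0)
        return -1;
    if (len > 0 && term[0] == trunc_ch) {        /* ?x  or  ?x? */
        left = 1;
        start = 1;
    }
    if (len > start && term[len - 1] == trunc_ch) {  /* x?  or  ?x? */
        right = 1;
        len--;
    }
    n = len - start;
    if (n >= outsz)                  /* truncate the copy rather than overflow */
        n = outsz - 1;
    memcpy(out, term + start, n);
    out[n] = '\0';

    if (left && right)
        return TRUNC_BOTH;
    if (left)
        return TRUNC_LEFT;
    if (right)
        return TRUNC_RIGHT;
    return TRUNC_NONE;
}
```

For example, calling `classify_truncation("?dylan?", '?', buf, sizeof buf)` leaves `dylan` in `buf` and returns 3 (left & right). The truncation character is passed in as a parameter because a profile may choose a different character.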
+ CCL profile + + Consider the following definition: + + + + ti u=4 s=1 + au u=1 s=1 + term s=105 + ranked r=102 + date u=30 r=o + + + ti and au both set + the structure attribute to phrase (s=1). + ti + sets the use-attribute to 4. au sets the + use-attribute to 1. + When no qualifiers are used in the query the structure-attribute is + set to free-form-text (105) (rule for term). + The date sets the relation attribute to + the relation used in the CCL query and sets the use attribute + to 30 (Bib-1 Date). + + + You can combine attributes. To search for "ranked title" you + can do + + ti,ranked=knuth computer + + which will set relation=relevance (r=102), use=title (u=4) and structure=phrase (s=1). + + + The query + + date > 1980 + + is valid. But + + ti > 1980 + + is invalid, because ti does not specify r=o. + + +
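To make the qualifier specification format concrete, here is a minimal C sketch that parses one line such as `ti u=4 s=1` or `date u=30 r=o` into a name and a list of attributes. It is an illustration only and does not use the YAZ CCL API: `parse_qual_spec`, `struct qual_spec`, `struct qual_attr` and the fixed buffer sizes are all assumptions made for the example. Values are kept as text because they may be non-numeric (for instance `o` in `r=o`).

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ATTRS 8

struct qual_attr {
    char set[32];    /* attribute set name; empty if none was given */
    int type;        /* numeric attribute type, e.g. 1 for use */
    char value[32];  /* attribute value kept as text ("4", "o", ...) */
};

struct qual_spec {
    char name[32];
    int n_attrs;
    struct qual_attr attrs[MAX_ATTRS];
};

/* Map a single-letter type to its Bib-1 type number; 0 if unknown. */
int letter_to_type(char c)
{
    switch (c) {
    case 'u': return 1;  /* use */
    case 'r': return 2;  /* relation */
    case 'p': return 3;  /* position */
    case 's': return 4;  /* structure */
    case 't': return 5;  /* truncation */
    case 'c': return 6;  /* completeness */
    default:  return 0;
    }
}

/* Parse "name [set,]type=val ...". Returns 0 on success, -1 on error. */
int parse_qual_spec(const char *line, struct qual_spec *out)
{
    char buf[256];
    char *tok;

    if (strlen(line) >= sizeof buf)
        return -1;
    strcpy(buf, line);
    memset(out, 0, sizeof *out);

    tok = strtok(buf, " \t\r\n");
    if (!tok || strlen(tok) >= sizeof out->name)
        return -1;
    strcpy(out->name, tok);

    while ((tok = strtok(NULL, " \t\r\n")) != NULL) {
        struct qual_attr *a;
        char *type = tok;
        char *comma;
        char *eq = strchr(tok, '=');

        if (!eq || out->n_attrs == MAX_ATTRS)
            return -1;
        *eq = '\0';                       /* split "type" from "val" */
        a = &out->attrs[out->n_attrs];
        comma = strchr(tok, ',');         /* optional "attributeset," prefix */
        if (comma) {
            *comma = '\0';
            if (strlen(tok) >= sizeof a->set)
                return -1;
            strcpy(a->set, tok);
            type = comma + 1;
        }
        if (isdigit((unsigned char) type[0]))
            a->type = atoi(type);
        else if (type[0] != '\0' && type[1] == '\0')
            a->type = letter_to_type(type[0]);
        if (a->type <= 0 || eq[1] == '\0' ||
            strlen(eq + 1) >= sizeof a->value)
            return -1;
        strcpy(a->value, eq + 1);
        out->n_attrs++;
    }
    return out->n_attrs > 0 ? 0 : -1;
}
```

Parsing `date u=30 r=o` with this sketch yields two attributes: type 1 with value `30` and type 2 with value `o`. A line with no `type=val` pair is rejected here, since such a line would not be a qualifier specification.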
+ + Qualifier alias + + A qualifier alias is of the form: + + + q + q1 q2 .. + + + which declares q to + be an alias for q1, + q2... such that the CCL + query q=x is equivalent to + q1=x or q2=x or .... + + - - ti u=4 s=1 - au u=1 s=1 - term s=105 - - - Two qualifiers are defined, ti and - au. - They both set the structure-attribute to phrase (1). - ti - sets the use-attribute to 4. au sets the - use-attribute to 1. - When no qualifiers are used in the query the structure-attribute is - set to free-form-text (105). - + + Comments + + Lines consisting only of white space, and lines that begin with + the character #, are treated as comments. + + + + Directives + + Directive specifications take the form + + @directive value + + + CCL directives + + + + + + + Name + Description + Default + + + + + truncation + Truncation character + ? + + + field + Specifies how multiple fields are to be + combined. There are two modes: or: + multiple qualifier fields are ORed, + merge: attributes for the qualifier + fields are merged and assigned to one term. + + merge + + + case + Specifies whether CCL operators and qualifiers should be + compared with case sensitivity or not. Specify 0 for + case sensitive; 1 for case insensitive. + 0 + + + + and + Specifies token for CCL operator AND. + and + + + + or + Specifies token for CCL operator OR. + or + + + + not + Specifies token for CCL operator NOT. + not + + + + set + Specifies token for CCL operator SET. + set + + + +
+
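The four kinds of profile line described above (qualifier specification, qualifier alias, comment and directive) can be told apart with a very small classifier. The C sketch below is hypothetical and is not how the YAZ parser is implemented: the names `classify_profile_line` and `LINE_*` are invented for the example, and it assumes that leading white space is ignored and that any remaining line containing `=` is a qualifier specification rather than an alias.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical labels for the four kinds of CCL profile line. */
enum line_kind { LINE_COMMENT, LINE_DIRECTIVE, LINE_SPEC, LINE_ALIAS };

/* Classify one profile line using the simple rules stated in the lead-in. */
enum line_kind classify_profile_line(const char *line)
{
    const char *p = line;

    while (*p && isspace((unsigned char) *p))
        p++;                          /* skip leading white space */
    if (*p == '\0' || *p == '#')
        return LINE_COMMENT;          /* blank line or # comment */
    if (*p == '@')
        return LINE_DIRECTIVE;        /* "@directive value" */
    /* "name type=val ..." carries '='; "q q1 q2 .." does not. */
    return strchr(p, '=') ? LINE_SPEC : LINE_ALIAS;
}
```

Under these assumptions, `@case 1` is classified as a directive, `ti u=4 s=1` as a qualifier specification and `any ti au` as an alias.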
- CCL API + + CCL API All public definitions can be found in the header file ccl.h. A profile identifier is of type @@ -464,22 +954,20 @@ struct ccl_rpn_node *ccl_find_str (CCL_bibset bibset, const char *str,
- CQL + CQL - CQL + CQL - Common Query Language - was defined for the - SRW - protocol. + SRU protocol. In many ways CQL has a similar syntax to CCL. The objective of CQL is different. Where CCL aims to be an end-user language, CQL is the protocol - query language for SRW. + query language for SRU. If you are new to CQL, read the - Gentle - Introduction. + Gentle Introduction. @@ -499,17 +987,16 @@ struct ccl_rpn_node *ccl_find_str (CCL_bibset bibset, const char *str, The parser converts a valid CQL query to PQF, thus providing a - way to use CQL for both SRW/SRU servers and Z39.50 targets at the + way to use CQL for both SRU servers and Z39.50 targets at the same time. The parser converts CQL to - - XCQL. + XCQL. XCQL is an XML representation of CQL. - XCQL is part of the SRW specification. However, since SRU + XCQL is part of the SRU specification. However, since SRU supports CQL only, we don't expect XCQL to be widely used. Furthermore, CQL has the advantage over XCQL that it is easy to read. @@ -517,7 +1004,7 @@ struct ccl_rpn_node *ccl_find_str (CCL_bibset bibset, const char *str, - CQL parsing + CQL parsing A CQL parser is represented by the CQL_parser handle. Its contents should be considered &yaz; internal (private). @@ -541,7 +1028,7 @@ int cql_parser_string(CQL_parser cp, const char *str); A CQL query is parsed by the cql_parser_string which takes a query str. If the query was valid (no syntax errors), then zero is returned; - otherwise a non-zero error code is returned. + otherwise -1 is returned to indicate a syntax error. @@ -563,9 +1050,9 @@ int cql_parser_stdio(CQL_parser cp, FILE *f); - CQL tree + CQL tree - The the query string is validl, the CQL parser + If the query string is valid, the CQL parser generates a tree representing the structure of the CQL query.
@@ -583,34 +1070,28 @@ struct cql_node *cql_parser_result(CQL_parser cp); #define CQL_NODE_ST 1 #define CQL_NODE_BOOL 2 -#define CQL_NODE_MOD 3 struct cql_node { int which; union { struct { char *index; + char *index_uri; char *term; char *relation; + char *relation_uri; struct cql_node *modifiers; - struct cql_node *prefixes; } st; struct { char *value; struct cql_node *left; struct cql_node *right; struct cql_node *modifiers; - struct cql_node *prefixes; } boolean; - struct { - char *name; - char *value; - struct cql_node *next; - } mod; } u; }; - There are three kinds of nodes, search term (ST), boolean (BOOL), - and modifier (MOD). + There are two node types: search term (ST) and boolean (BOOL). + A modifier is treated as a search term too. The search term node has five members: @@ -624,6 +1105,12 @@ struct cql_node { + index_uri: index URI for search term + or NULL if none could be resolved for the index. + + + + term: the search term itself. @@ -634,18 +1121,14 @@ struct cql_node { - modifiers: relation modifiers for search - term. The modifiers is a simple linked - list (NULL for last entry). Each relation modifier node - is of type MOD. + relation_uri: relation URI for search term. - prefixes: index prefixes for search - term. The prefixes is a simple linked - list (NULL for last entry). Each prefix node - is of type MOD. + modifiers: relation modifiers for search + term. The modifiers list is itself made up of cql_nodes, + each of type ST. @@ -667,47 +1150,16 @@ struct cql_node { modifiers: proximity arguments. - - - prefixes: index prefixes. - The prefixes is a simple linked - list (NULL for last entry). Each prefix node - is of type MOD. - - - - - - - The modifier node is a "utility" node used for name-value pairs, - such as prefixes, proximity arguements, etc. - - - - name name of mod node. - - - - - value value of mod node. - - - - - next: pointer to next node which is - always a mod node (NULL for last entry).
- - - CQL to PQF conversion + CQL to PQF conversion Conversion to PQF (and Z39.50 RPN) is tricky by the fact that the resulting RPN depends on the Z39.50 target capabilities (combinations of supported attributes). - In addition, the CQL and SRW operates on index prefixes + In addition, CQL and SRU operate on index prefixes (URI or strings), whereas the RPN uses Object Identifiers for attribute sets. @@ -742,8 +1194,30 @@ int cql_transform_buf(cql_transform_t ct, If conversion failed, cql_transform_buf - returns a non-zero error code; otherwise zero is returned - (conversion successful). + returns a non-zero SRU error code; otherwise zero is returned + (conversion successful). The meanings of the numeric error + codes are listed in the SRU specifications at + + + + If conversion fails, more information can be obtained by calling + +int cql_transform_error(cql_transform_t ct, char **addinfop); + + This function returns the most recently returned numeric + error-code and sets the string-pointer at + *addinfop to point to a string containing + additional information about the error that occurred: for + example, if the error code is 15 (``Illegal or unsupported context + set''), the additional information is the name of the requested + context set that was not recognised. + + + The SRU error-codes may be translated into brief human-readable + error messages using + +const char *cql_strerror(int code); + If you wish to be able to produce a PQF result in a different @@ -762,8 +1236,8 @@ int cql_transform_FILE(cql_transform_t ct, open FILE. - - Specification of CQL to RPN mapping + + Specification of CQL to RPN mappings The file supplied to functions cql_transform_open_FILE, @@ -792,26 +1266,37 @@ int cql_transform_FILE(cql_transform_t ct, The following CQL patterns are recognized: - qualifier.set.name + index.set.name - This pattern is invoked when a CQL qualifier, such as + This pattern is invoked when a CQL index, such as dc.title is converted.
set - and name is the index set and qualifier + and name are the context set and index name respectively. Typically, the RPN specifies an equivalent use attribute. - For terms not bound by a qualifier the pattern - qualifier.srw.serverChoice is used. - Here, the prefix srw is defined as - http://www.loc.gov/zing/cql/srw-indexes/v1.0/. + For terms not bound by an index the pattern + index.cql.serverChoice is used. + Here, the prefix cql is defined as + http://www.loc.gov/zing/cql/cql-indexes/v1.0/. If this pattern is not defined, the mapping will fail. + qualifier.set.name + (DEPRECATED) + + + + For backwards compatibility, this is recognised as a synonym of + index.set.name + + + + relation.relation @@ -893,27 +1378,27 @@ int cql_transform_FILE(cql_transform_t ct, - This specification defines a CQL index set for a given prefix. + This specification defines a CQL context set for a given prefix. The value on the right hand side is the URI for the set - not RPN. All prefixes used in - qualifier patterns must be defined this way. + index patterns must be defined this way. - Small CQL to RPN mapping file + CQL to RPN mapping file - This small file defines two index sets, three qualifiers and three + This simple file defines two context sets, three indexes and three relations, a position pattern and a default structure. @attr 1=1016 @attr 2=3 @attr 4=1 @attr 3=3 @attr 6=1 "computer" - by rules qualifier.srw.serverChoice, + by rules index.cql.serverChoice, relation.scr, structure.*, position.any. @@ -956,8 +1441,18 @@ int cql_transform_FILE(cql_transform_t ct, + + CQL to RPN using Bath Profile + + The file etc/pqf.properties has mappings from + the Bath Profile and Dublin Core to RPN. + If YAZ is installed as a package it's usually located + in /usr/share/yaz/etc and part of the + development package, such as libyaz-dev. + + - CQL to XCQL conversion + CQL to XCQL conversion Conversion from CQL to XCQL is trivial and does not require a mapping to be defined. 
@@ -1042,15 +1537,13 @@ typedef struct oident PROTO_Z3950 - PROTO_SR + PROTO_GENERAL - If you don't care about talking to SR-based implementations (few - exist, and they may become fewer still if and when the ISO SR and ANSI - Z39.50 documents are merged into a single standard), you can ignore - this field on incoming packages, and always set it to PROTO_Z3950 - for outgoing packages. + Use PROTO_Z3950 for Z39.50 Object Identifiers, + PROTO_GENERAL for other types (such as + those associated with ILL). @@ -1137,6 +1630,10 @@ typedef struct oident again, corresponding to the specific OIDs defined by the standard. + Refer to the + + Registry of Z39.50 Object Identifiers for the + whole list. @@ -1201,6 +1698,49 @@ typedef struct oident + Three utility functions are provided for translating OIDs' + symbolic names (e.g. Usmarc) into OID structures + (int arrays) and strings containing the OID in dotted notation + (e.g. 1.2.840.10003.9.5.1). They are: + + + + int *oid_name_to_oid(oid_class oclass, const char *name, int *oid); + char *oid_to_dotstring(const int *oid, char *oidbuf); + char *oid_name_to_dotstring(oid_class oclass, const char *name, char *oidbuf); + + + + oid_name_to_oid() + translates the specified symbolic name, + interpreted as being of class oclass. (The + class must be specified as many symbolic names exist within + multiple classes - for example, Zthes is the + symbolic name of an attribute set, a schema and a tag-set.) The + sequence of integers representing the OID is written into the + area oid provided by the caller; it is the + caller's responsibility to ensure that this area is large enough + to contain the translated OID. As a convenience, the address of + the buffer (i.e. the value of oid) is + returned. + + + oid_to_dotstring() + translates the int-array oid into a dotted + string which is written into the area oidbuf + supplied by the caller; it is the caller's responsibility to + ensure that this area is large enough.
The address of the buffer + is returned. + + + oid_name_to_dotstring() + combines the previous two functions to derive a dotted string + representing the OID specified by oclass and + name, writing it into the buffer passed as + oidbuf and returning its address. + + + Finally, the module provides the following utility functions, whose meaning should be obvious: @@ -1238,7 +1778,7 @@ typedef struct oident release the associated memory again. For the structures describing the Z39.50 PDUs and related structures, it is convenient to use the memory-management system of the &odr; subsystem (see - Using ODR). However, in some circumstances + ). However, in some circumstances where you might otherwise benefit from using a simple nibble memory management system, it may be impractical to use odr_malloc() and odr_reset(). @@ -1288,6 +1828,267 @@ typedef struct oident + + Log + + &yaz; has evolved a fairly complex log system which should be useful + for debugging &yaz; itself, for debugging applications that use &yaz;, and for + production use of those applications. + + + The log functions are declared in header yaz/log.h + and implemented in src/log.c. + Due to a name clash with syslog and some math utilities the logging + interface has been modified as of YAZ 2.0.29. The obsolete interface + is still available in header file yaz/log.h. + The key points of the interface are: + + + void yaz_log(int level, const char *fmt, ...); + + void yaz_log_init(int level, const char *prefix, const char *name); + void yaz_log_init_file(const char *fname); + void yaz_log_init_level(int level); + void yaz_log_init_prefix(const char *prefix); + void yaz_log_time_format(const char *fmt); + void yaz_log_init_max_size(int mx); + + int yaz_log_mask_str(const char *str); + int yaz_log_module_level(const char *name); + + + + The reason for the whole log module is the yaz_log + function.
It takes a bitmask indicating the log levels, a + printf-like format string, and a variable number of + arguments to log. + + + + The log level is a bit mask that says at which level(s) + the log entry should be made, and optionally sets some behaviour of the + logging. In the simplest cases, it can be one of YLOG_FATAL, + YLOG_DEBUG, YLOG_WARN, YLOG_LOG. Those can be combined with bits + that modify the way the log entry is written: YLOG_ERRNO, + YLOG_NOTIME, YLOG_FLUSH. + Most of the rest of the bits are deprecated, and should not be used. Use + the dynamic log levels instead. + + + + Applications that use &yaz; should not use YLOG_LOG for ordinary + messages, but should make use of the dynamic loglevel system. This consists + of two parts, defining the loglevel and checking it. + + + + To define the log levels, the (main) program should pass a string to + yaz_log_mask_str to define which log levels are to be + logged. This string should be a comma-separated list of log level names, + and can contain both hard-coded names and dynamic ones. The log level + calculation starts with YLOG_DEFAULT_LEVEL and adds a bit + for each word it meets, unless the word starts with a '-', in which case it + clears the bit. If the string 'none' is found, + all bits are cleared. Typically this string comes from the command-line, + often identified by -v. The + yaz_log_mask_str returns a log level that should be + passed to yaz_log_init_level for it to take effect. + + + + Each module should check which log bits it should use, by calling + yaz_log_module_level with a suitable name for the + module. Any preceding path and extension are stripped from the name, + so it is quite possible to use __FILE__ for it. If the + name has been passed to yaz_log_mask_str, the routine + returns a non-zero bitmask, which should then be used in subsequent calls + to yaz_log.
(It can also be tested, so as to avoid unnecessary calls to + yaz_log, in time-critical places, or when the log entry would take time + to construct.) + + + + Yaz uses the following dynamic log levels: + server, session, request, requestdetail for the server + functionality. + zoom for the zoom client api. + ztest for the simple test server. + malloc, nmem, odr, eventl for internal debugging of yaz itself. + Of course, any program using yaz is welcome to define as many new ones as + it needs. + + + + By default the log is written to stderr, but this can be changed by a call + to yaz_log_init_file or + yaz_log_init. If the log is directed to a file, the + file size is checked at every write, and if it exceeds the limit given in + yaz_log_init_max_size, the log is rotated. The + rotation keeps one old version (with a .1 appended to + the name). The size defaults to 1GB. Setting it to zero will disable the + rotation feature. + + + + A typical yaz-log looks like this + 13:23:14-23/11 yaz-ztest(1) [session] Starting session from tcp:127.0.0.1 (pid=30968) + 13:23:14-23/11 yaz-ztest(1) [request] Init from 'YAZ' (81) (ver 2.0.28) OK + 13:23:17-23/11 yaz-ztest(1) [request] Search Z: @attrset Bib-1 foo OK:7 hits + 13:23:22-23/11 yaz-ztest(1) [request] Present: [1] 2+2 OK 2 records returned + 13:24:13-23/11 yaz-ztest(1) [request] Close OK + + + + The log entries start with a time stamp. This can be omitted by setting the + YLOG_NOTIME bit in the loglevel. This way automated tests + can produce identical log files that are easy to diff. The + format of the time stamp can be set with + yaz_log_time_format, which takes a format string just + like strftime. + + + + Next in a log line comes the prefix, often the name of the program. For + yaz-based servers, it can also contain the session number. Then + come one or more log bits in square brackets, depending on the logging + level set by yaz_log_init_level and the loglevel + passed to yaz_log.
Finally comes the format + string and additional values passed to yaz_log. + + + + The log level YLOG_LOGLVL, enabled by the string + loglevel, will log all the log-level affecting + operations. This can come in handy if you need to know what other log + levels would be useful. Grep the logfile for [loglevel]. + + + + The log system is almost independent of the rest of &yaz;; the only + important dependency is on nmem, and that only for + using the semaphore definition there. + + + + The dynamic log levels and log rotation were introduced in &yaz; 2.0.28. At + the same time, the log bit names were changed from + LOG_something to YLOG_something, + to avoid collision with syslog.h. + + + + + MARC + + + YAZ provides a fast utility that decodes MARC records and + encodes them to a variety of output formats. The MARC records must + be encoded in ISO2709. + + + + /* create handler */ + yaz_marc_t yaz_marc_create(void); + /* destroy */ + void yaz_marc_destroy(yaz_marc_t mt); + + /* set XML mode YAZ_MARC_LINE, YAZ_MARC_SIMPLEXML, ... */ + void yaz_marc_xml(yaz_marc_t mt, int xmlmode); + #define YAZ_MARC_LINE 0 + #define YAZ_MARC_SIMPLEXML 1 + #define YAZ_MARC_OAIMARC 2 + #define YAZ_MARC_MARCXML 3 + #define YAZ_MARC_ISO2709 4 + #define YAZ_MARC_XCHANGE 5 + + /* supply iconv handle for character set conversion .. */ + void yaz_marc_iconv(yaz_marc_t mt, yaz_iconv_t cd); + + /* set debug level, 0=none, 1=more, 2=even more, .. */ + void yaz_marc_debug(yaz_marc_t mt, int level); + + /* decode MARC in buf of size bsize. Returns >0 on success; <=0 on failure. + On success, result in *result with size *rsize. */ + int yaz_marc_decode_buf (yaz_marc_t mt, const char *buf, int bsize, + char **result, int *rsize); + + /* decode MARC in buf of size bsize. Returns >0 on success; <=0 on failure.
+ On success, result in WRBUF */ + int yaz_marc_decode_wrbuf (yaz_marc_t mt, const char *buf, + int bsize, WRBUF wrbuf); +]]> + + + A MARC conversion handle must be created by using + yaz_marc_create and destroyed + by calling yaz_marc_destroy. + + + All other functions operate on a yaz_marc_t handle. + The output is specified by a call to yaz_marc_xml. + The xmlmode must be one of + + + YAZ_MARC_LINE + + + A simple line-by-line format suitable for display but not + recommended for further (machine) processing. + + + + + + YAZ_MARC_MARCXML + + + The resulting record is converted to MARCXML. + + + + + + YAZ_MARC_ISO2709 + + + The resulting record is converted to ISO2709 (MARC). + + + + + + + The actual conversion functions are + yaz_marc_decode_buf and + yaz_marc_decode_wrbuf which decode and encode + a MARC record. The former function operates on simple buffers, the + latter stores the resulting record in a WRBUF handle (WRBUF is a simple string + type). + + + Display of MARC record + + The following program snippet illustrates how the MARC API may + be used to convert a MARC record to the line-by-line format: + + + + + +