Supporting Tools

In support of the service API (primarily the ASN module, which provides the programmatic interface to the Z39.50 APDUs), YAZ contains a collection of tools that support the development of applications.

Query Syntax Parsers

Since the type-1 (RPN) query structure has no direct, useful string representation, every origin application needs to provide some form of mapping from a local query notation or representation to a Z_RPNQuery structure. Some programmers will prefer to construct the query manually, perhaps using odr_malloc() to simplify memory management. The YAZ distribution includes two separate query-generating tools that may be of use to you.

Prefix Query Format

Since RPN, or reverse Polish notation, is really just a fancy way of describing a suffix notation format (operator follows operands), it would seem that the confusion is total when we now introduce a prefix notation for RPN. The reason is one of simple laziness: it's somewhat simpler to interpret a prefix format, and this utility was designed for maximum simplicity, to provide a baseline representation for use in simple test applications and scripting environments (like Tcl). The demonstration client included with YAZ uses the PQF.

The PQF is defined by the pquery module in the YAZ library. The pquery.h file provides the declarations of the functions

    Z_RPNQuery *p_query_rpn (ODR o, oid_proto proto, const char *qbuf);

    Z_AttributesPlusTerm *p_query_scan (ODR o, oid_proto proto,
                                        Odr_oid **attributeSetP, const char *qbuf);

    int p_query_attset (const char *arg);

The function p_query_rpn() takes as arguments an ODR stream (see section The ODR Module) to provide a memory source (the structure created is released on the next call to odr_reset() on the stream), a protocol identifier (one of the constants PROTO_Z3950 and PROTO_SR), and finally a null-terminated string holding the query string.
If the parse went well, p_query_rpn() returns a pointer to a Z_RPNQuery structure which can be placed directly into a Z_SearchRequest.

The p_query_attset() function specifies which attribute set to use if the query doesn't specify one with the @attrset operator. It returns 0 if the argument is a valid attribute set specifier; otherwise it returns -1.

The grammar of the PQF is as follows:

    Query ::= [ AttSet ] QueryStruct.

    AttSet ::= string.

    QueryStruct ::= { Attribute } Simple | Complex.

    Attribute ::= '@attr' AttributeType '=' AttributeValue.

    AttributeType ::= integer.

    AttributeValue ::= integer.

    Complex ::= Operator QueryStruct QueryStruct.

    Operator ::= '@and' | '@or' | '@not' | '@prox' Proximity.

    Simple ::= ResultSet | Term.

    ResultSet ::= '@set' string.

    Term ::= string | '"' string '"'.

    Proximity ::= Exclusion Distance Ordered Relation WhichCode UnitCode.

    Exclusion ::= '1' | '0' | 'void'.

    Distance ::= integer.

    Ordered ::= '1' | '0'.

    Relation ::= integer.

    WhichCode ::= 'known' | 'private' | integer.

    UnitCode ::= integer.

You will note that the syntax above is a fairly faithful representation of RPN, except for the Attribute, which has been moved a step away from the term, allowing you to associate one or more attributes with an entire query structure. The parser will automatically apply the given attributes to each term as required.

The following are all examples of valid queries in the PQF.

    dylan

    "bob dylan"

    @or "dylan" "zimmerman"

    @set Result-1

    @or @and bob dylan @set Result-1

    @attr 4=1 @and @attr 1=1 "bob dylan" @attr 1=4 "slow train coming"

    @attr 4=1 @attr 1=4 "self portrait"

    @prox 0 3 1 2 k 2 dylan zimmerman

Common Command Language

Not all users enjoy typing in prefix query structures and numerical attribute values, even in a minimalistic test client. In the library world, the more intuitive Common Command Language (or ISO 8777) has enjoyed some popularity, especially before the widespread availability of graphical interfaces.
It is still useful in applications where you for some reason or other need to provide a symbolic language for expressing boolean query structures.

The EUROPAGATE research project, working under the Libraries programme of the European Commission's DG XIII, has, amongst other useful tools, implemented a general-purpose CCL parser which produces an output structure that can be trivially converted to the internal RPN representation of YAZ (the Z_RPNQuery structure). Since the CCL utility, along with the rest of the software produced by EUROPAGATE, is made freely available under a liberal license, it is included as a supplement to YAZ.

CCL Syntax

The CCL parser obeys the following grammar for the FIND argument. The syntax is annotated by the lines prefixed by --.

    CCL-Find ::= CCL-Find Op Elements
               | Elements.

    Op ::= "and" | "or" | "not"
    -- The above means that Elements are separated by boolean operators.

    Elements ::= '(' CCL-Find ')'
               | Set
               | Terms
               | Qualifiers Relation Terms
               | Qualifiers Relation '(' CCL-Find ')'
               | Qualifiers '=' string '-' string
    -- Elements is either a recursive definition, a result set reference, a
    -- list of terms, qualifiers followed by terms, qualifiers followed
    -- by a recursive definition or qualifiers in a range (lower - upper).

    Set ::= 'set' = string
    -- Reference to a result set

    Terms ::= Terms Prox Term
            | Term
    -- Proximity of terms.

    Term ::= Term string
           | string
    -- This basically means that a term may include a blank

    Qualifiers ::= Qualifiers ',' string
                 | string
    -- Qualifiers is a list of strings separated by comma

    Relation ::= '=' | '>=' | '<=' | '<>' | '>' | '<'
    -- Relational operators. This really doesn't follow the ISO8777
    -- standard.

    Prox ::= '%' | '!'
    -- Proximity operator

The following queries are all valid:

    dylan

    "bob dylan"

    dylan or zimmerman

    set=1

    (dylan and bob) or set=1

Assuming that the qualifiers ti, au and date are defined, we may use:

    ti=self portrait

    au=(bob dylan and slow train coming)

    date>1980 and (ti=((self portrait)))

CCL Qualifiers

Qualifiers are used to direct the search to a particular searchable index, such as the title (ti) and author (au) indexes. The CCL standard itself doesn't specify a particular set of qualifiers, but it does suggest a few shorthand notations. You can customize the CCL parser to support a particular set of qualifiers to reflect the current target profile. Traditionally, a qualifier would map to a particular use-attribute within the BIB-1 attribute set. However, you could also define qualifiers that would set, for example, the structure-attribute.

Consider a scenario where the target supports ranked searches in the title index. In this case, the user could specify

    ti,ranked=knuth computer

and the ranked would map to structure=free-form-text (4=105) and the ti would map to title (1=4).

A "profile" with a set of predefined CCL qualifiers can be read from a file. The YAZ client reads its CCL qualifiers from a file named default.bib. Each line in the file has the form:

    qualifier-name type=val type=val ...

where qualifier-name is the name of the qualifier to be used (e.g. ti), type is a BIB-1 category type and val is the corresponding BIB-1 attribute value. The type can be either numeric or one of u (use), r (relation), p (position), s (structure), t (truncation) or c (completeness). The qualifier name term has a special meaning: the types and values for this definition are used when no qualifiers are present.

Consider the following definition:

    ti u=4 s=1
    au u=1 s=1
    term s=105

Two qualifiers are defined, ti and au. They both set the structure-attribute to phrase (1). ti sets the use-attribute to 4. au sets the use-attribute to 1.
When no qualifiers are used in the query, the structure-attribute is set to free-form-text (105).

CCL API

All public definitions can be found in the header file ccl.h. A profile identifier is of type CCL_bibset. A profile must be created with a call to the function ccl_qual_mk, which returns a profile handle of type CCL_bibset.

To read a file containing qualifier definitions, the function ccl_qual_file may be convenient. This function takes an already opened FILE pointer as argument along with a CCL_bibset handle.

To parse a simple string with a FIND query, use the function

    struct ccl_rpn_node *ccl_find_str (CCL_bibset bibset, const char *str,
                                       int *error, int *pos);

which takes the CCL profile (bibset) and query (str) as input. Upon successful completion the RPN tree is returned. If an error occurs, such as a syntax error, the integer pointed to by error holds the error code and pos holds the offset inside the query string at which the parsing failed.

An English representation of the error may be obtained by calling the ccl_err_msg function. The error codes are listed in ccl.h.

To convert the CCL RPN tree (type struct ccl_rpn_node *) to the Z_RPNQuery of YAZ, the function ccl_rpn_query must be used. This function, which is part of YAZ, is implemented in yaz-ccl.c. After calling this function the CCL RPN tree is probably no longer needed. The ccl_rpn_delete function destroys the CCL RPN tree.

A CCL profile may be destroyed by calling the ccl_qual_rm function.

The token names for the CCL operators may be changed by setting the globals (all of type char *) ccl_token_and, ccl_token_or, ccl_token_not and ccl_token_set. An operator may have aliases, i.e. there may be more than one name for the operator. To do this, separate each alias with a space character.

Object Identifiers

The basic YAZ representation of an OID is an array of integers, terminated with the value -1.
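As an illustration of this representation, the following is a minimal sketch in plain C of how a -1 terminated integer array can be measured, copied and compared. The sketch_oid_* names are hypothetical and are not part of YAZ; the sketch only assumes the array layout described above and does not depend on any YAZ header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers (the sketch_ prefix is not a YAZ convention).
   They operate on integer arrays terminated by the value -1. */

/* Count the elements that precede the -1 terminator. */
int sketch_oid_len(const int *o)
{
    int n = 0;

    assert(o != NULL);
    while (o[n] >= 0)
        n++;
    return n;
}

/* Copy src into dst, terminator included; dst must be large enough. */
void sketch_oid_cpy(int *dst, const int *src)
{
    assert(dst != NULL && src != NULL);
    while ((*dst++ = *src++) >= 0)
        ;
}

/* Return 0 if both OIDs are equal, otherwise a negative or positive
   value in the manner of strcmp(); an OID that is a proper prefix of
   the other compares as smaller. */
int sketch_oid_cmp(const int *a, const int *b)
{
    assert(a != NULL && b != NULL);
    while (*a >= 0 && *a == *b) {
        a++;
        b++;
    }
    if (*a == *b)
        return 0;            /* both reached the terminator */
    return *a > *b ? 1 : -1;
}
```

For example, with int bib1[] = { 1, 2, 840, 10003, 3, 1, -1 }, sketch_oid_len(bib1) yields 6, and a copy made with sketch_oid_cpy() compares equal to the original.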
The ODR module provides two utility functions to create and copy this type of data element:

    Odr_oid *odr_getoidbystr(ODR o, char *str);

Creates an OID based on a string-based representation using dots (.) to separate elements in the OID.

    Odr_oid *odr_oiddup(ODR odr, Odr_oid *o);

Creates a copy of the OID referenced by the o parameter. Both functions take an ODR stream as parameter. This stream is used to allocate memory for the data elements, which is released on a subsequent call to odr_reset() on that stream.

The OID module provides a higher-level representation of the family of object identifiers which describe the Z39.50 protocol and its related objects. The definition of the module interface is given in the oid.h file.

The interface is mainly based on the oident structure. The definition of this structure looks like this:

    typedef struct oident
    {
        oid_proto proto;
        oid_class oclass;
        oid_value value;
        int oidsuffix[OID_SIZE];
        char *desc;
    } oident;

The proto field takes one of the values

    PROTO_Z3950
    PROTO_SR

If you don't care about talking to SR-based implementations (few exist, and they may become fewer still if and when the ISO SR and ANSI Z39.50 documents are merged into a single standard), you can ignore this field on incoming packages, and always set it to PROTO_Z3950 for outgoing packages.

The oclass field takes one of the values

    CLASS_APPCTX    CLASS_ABSYN     CLASS_ATTSET    CLASS_TRANSYN
    CLASS_DIAGSET   CLASS_RECSYN    CLASS_RESFORM   CLASS_ACCFORM
    CLASS_EXTSERV   CLASS_USERINFO  CLASS_ELEMSPEC  CLASS_VARSET
    CLASS_SCHEMA    CLASS_TAGSET    CLASS_GENERAL

corresponding to the OID classes defined by the Z39.50 standard.
Finally, the value field takes one of the values

    VAL_APDU        VAL_BER         VAL_BASIC_CTX   VAL_BIB1
    VAL_EXP1        VAL_EXT1        VAL_CCL1        VAL_GILS
    VAL_WAIS        VAL_STAS        VAL_DIAG1       VAL_ISO2709
    VAL_UNIMARC     VAL_INTERMARC   VAL_CCF         VAL_USMARC
    VAL_UKMARC      VAL_NORMARC     VAL_LIBRISMARC  VAL_DANMARC
    VAL_FINMARC     VAL_MAB         VAL_CANMARC     VAL_SBN
    VAL_PICAMARC    VAL_AUSMARC     VAL_IBERMARC    VAL_EXPLAIN
    VAL_SUTRS       VAL_OPAC        VAL_SUMMARY     VAL_GRS0
    VAL_GRS1        VAL_EXTENDED    VAL_RESOURCE1   VAL_RESOURCE2
    VAL_PROMPT1     VAL_DES1        VAL_KRB1        VAL_PRESSET
    VAL_PQUERY      VAL_PCQUERY     VAL_ITEMORDER   VAL_DBUPDATE
    VAL_EXPORTSPEC  VAL_EXPORTINV   VAL_NONE        VAL_SETM
    VAL_SETG        VAL_VAR1        VAL_ESPEC1

again, corresponding to the specific OIDs defined by the standard.

The desc field contains a brief, mnemonic name for the OID in question.

The function

    struct oident *oid_getentbyoid(int *o);

takes as argument an OID, and returns a pointer to a static area containing an oident structure. You typically use this function when you receive a PDU containing an OID, and you wish to branch out depending on the specific OID value.

The function

    int *oid_ent_to_oid(struct oident *ent, int *dst);

takes as argument an oident structure, in which the proto, oclass, and value fields are assumed to be set correctly, and returns a pointer to the buffer given by dst, containing the base representation of the corresponding OID. The function returns NULL and leaves the array dst unchanged if the mapping could not take place. The array dst should be at least of size OID_SIZE.

The oid_ent_to_oid() function can be used whenever you need to prepare a PDU containing one or more OIDs. The separation of the protocol element from the remainder of the OID description makes it simple to write applications that can communicate with either Z39.50 or OSI SR-based applications.

The function

    oid_value oid_getvalbyname(const char *name);

takes as argument a mnemonic OID name, and returns the value field of the first entry in the database that contains the given name in its desc field.
Finally, the module provides the following utility functions, whose meaning should be obvious:

    void oid_oidcpy(int *t, int *s);
    void oid_oidcat(int *t, int *s);
    int oid_oidcmp(int *o1, int *o2);
    int oid_oidlen(int *o);

The OID module has been criticized, and perhaps rightly so, for needlessly abstracting the representation of OIDs. Other toolkits use a simple string representation of OIDs with good results. In practice, we have found the interface comfortable and quick to work with, and it is a simple matter (for what it's worth) to create applications compatible with both ISO SR and Z39.50. Finally, the use of the oident database is by no means mandatory. You can easily create your own system for representing OIDs, as long as it is compatible with the low-level integer-array representation of the ODR module.

Nibble Memory

Sometimes when you need to allocate and construct a large, interconnected complex of structures, it can be a bit of a pain to release the associated memory again. For the structures describing the Z39.50 PDUs and related structures, it is convenient to use the memory-management system of the ODR subsystem (see Using ODR). However, in some circumstances where you might otherwise benefit from using a simple nibble memory management system, it may be impractical to use odr_malloc() and odr_reset(). For this purpose, the memory manager which also supports the ODR streams is made available in the NMEM module. The external interface to this module is given in the nmem.h file.

The following prototypes are given:

    NMEM nmem_create(void);
    void nmem_destroy(NMEM n);
    void *nmem_malloc(NMEM n, int size);
    void nmem_reset(NMEM n);
    int nmem_total(NMEM n);
    void nmem_init(void);

The nmem_create() function returns a pointer to a memory control handle, which can be released again by nmem_destroy() when no longer needed. The function nmem_malloc() allocates a block of memory of the requested size.
A call to nmem_reset() or nmem_destroy() will release all memory allocated on the handle since it was created (or since the last call to nmem_reset()). The function nmem_total() returns the number of bytes currently allocated on the handle.

The nibble memory pool is shared amongst threads. POSIX mutexes and WIN32 critical sections are used to keep the module thread-safe. On WIN32, the function nmem_init() initialises the critical section handle and should be called once before any other nmem function is used.
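To make the create / allocate / reset / destroy life cycle concrete, here is a minimal, single-threaded sketch of one common way to build such an allocator: memory is carved out of larger chunks and everything is released in one sweep. It is not the YAZ NMEM implementation. All sketch_* names and the chunk size are assumptions made for illustration, the code takes no locks, and it relies on max_align_t from C11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_CHUNK_SIZE 4096   /* arbitrary size chosen for the sketch */

/* One chunk obtained from malloc(); allocations are carved out of data[]. */
typedef struct sketch_chunk {
    struct sketch_chunk *next;
    size_t used;                 /* bytes of data[] already handed out */
    size_t cap;                  /* usable bytes in data[] */
    max_align_t data[];          /* keeps the payload suitably aligned */
} sketch_chunk;

/* Hypothetical pool handle, playing the role of a memory control handle. */
typedef struct sketch_pool {
    sketch_chunk *chunks;
    size_t total;                /* bytes requested since create/reset */
} sketch_pool;

sketch_pool *sketch_pool_create(void)
{
    sketch_pool *p = malloc(sizeof(*p));

    if (p != NULL) {
        p->chunks = NULL;
        p->total = 0;
    }
    return p;
}

/* Hand out size bytes from the newest chunk, adding a chunk when needed.
   Overflow of very large sizes is not checked in this sketch. */
void *sketch_pool_malloc(sketch_pool *p, size_t size)
{
    const size_t align = sizeof(max_align_t);
    size_t need = (size + align - 1) / align * align;
    sketch_chunk *c;
    void *result;

    assert(p != NULL);
    c = p->chunks;
    if (c == NULL || c->cap - c->used < need) {
        size_t cap = need > SKETCH_CHUNK_SIZE ? need : SKETCH_CHUNK_SIZE;

        c = malloc(sizeof(*c) + cap);
        if (c == NULL)
            return NULL;
        c->next = p->chunks;
        c->used = 0;
        c->cap = cap;
        p->chunks = c;
    }
    result = (char *)c->data + c->used;
    c->used += need;
    p->total += size;
    return result;
}

/* Release every allocation made on the pool; the handle stays usable. */
void sketch_pool_reset(sketch_pool *p)
{
    assert(p != NULL);
    while (p->chunks != NULL) {
        sketch_chunk *next = p->chunks->next;

        free(p->chunks);
        p->chunks = next;
    }
    p->total = 0;
}

size_t sketch_pool_total(const sketch_pool *p)
{
    assert(p != NULL);
    return p->total;
}

/* Release all allocations and the handle itself. */
void sketch_pool_destroy(sketch_pool *p)
{
    if (p != NULL) {
        sketch_pool_reset(p);
        free(p);
    }
}
```

The point of the design is that the caller never frees individual blocks: a linked structure built from many small allocations is discarded with a single reset or destroy call, which is the convenience the NMEM interface above is described as offering.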