X-Git-Url: http://git.indexdata.com/?p=yaz-moved-to-github.git;a=blobdiff_plain;f=doc%2Ftools.xml;h=ec502205785b0434b35539fd7577861754364408;hp=6f565dcc5579faecb781032b2891aaff0f28525f;hb=58b73e42cc56014e9a1f57474884c6bd4e97ccc1;hpb=b05a82226d0840e32772c62285e9eab584ae000a diff --git a/doc/tools.xml b/doc/tools.xml index 6f565dc..ec50220 100644 --- a/doc/tools.xml +++ b/doc/tools.xml @@ -1,4 +1,3 @@ - Supporting Tools @@ -435,17 +434,6 @@ symbolic language for expressing boolean query structures. - - The EUROPAGATE research project working under the Libraries programme - of the European Commission's DG XIII has, amongst other useful tools, - implemented a general-purpose CCL parser which produces an output - structure that can be trivially converted to the internal RPN - representation of &yaz; (The Z_RPNQuery structure). - Since the CCL utility - along with the rest of the software - produced by EUROPAGATE - is made freely available on a liberal - license, it is included as a supplement to &yaz;. - - CCL Syntax @@ -652,7 +640,7 @@ - Refer to the complete + Refer to or the complete list of Bib-1 attributes @@ -691,6 +679,17 @@ + s=ag + Tokens that appears as phrases (with blank in them) gets + structure phrase attached. Tokens that appers as words + gets structure phrase attached. Phrases and words are + ANDed. This is a variant of s=al and s=pw, with the main + difference that words are not split (with operator AND) + but instead kept in one RPN token. This facility appeared + in YAZ 4.2.38. + + + r=o Allows ranges and the operators greather-than, less-than, ... equals. @@ -743,6 +742,24 @@ set to both left&right. + + t=x + Allows masking anywhere in a term, thus fully supporting + # (mask one character) and ? (zero or more of any). + If masking is used, trunction is set to 102 (regexp-1 in term) + and the term is converted accordingly to a regular expression. + + + + t=z + Allows masking anywhere in a term, thus fully supporting + # (mask one character) and ? (zero or more of any). + If masking is used, trunction is set to 104 (Z39.58 in term) + and the term is converted accordingly to Z39.58 masking term - + actually the same truncation as CCL itself. + + + @@ -857,9 +874,9 @@ case Specificies if CCL operatores and qualifiers should be - compared with case sensitivity or not. Specify 0 for - case sensitive; 1 for case insensitive. - 0 + compared with case sensitivity or not. Specify 1 for + case sensitive; 0 for case insensitive. + 1 @@ -1541,293 +1558,178 @@ void cql_to_xml_stdio(struct cql_node *cn, FILE *f); The basic YAZ representation of an OID is an array of integers, - terminated with the value -1. The &odr; module provides two - utility-functions to create and copy this type of data elements: - - - - Odr_oid *odr_getoidbystr(ODR o, char *str); - - - - Creates an OID based on a string-based representation using dots (.) - to separate elements in the OID. - - - - Odr_oid *odr_oiddup(ODR odr, Odr_oid *o); - - - - Creates a copy of the OID referenced by the o - parameter. - Both functions take an &odr; stream as parameter. This stream is used to - allocate memory for the data elements, which is released on a - subsequent call to odr_reset() on that stream. - - - - The OID module provides a higher-level representation of the - family of object identifiers which describe the Z39.50 protocol and its - related objects. The definition of the module interface is given in - the oid.h file. + terminated with the value -1. This integer is of type + Odr_oid. - - The interface is mainly based on the oident structure. - The definition of this structure looks like this: + Fundamental OID operations and the type Odr_oid + are defined in yaz/oid_util.h. - - -typedef struct oident -{ - oid_proto proto; - oid_class oclass; - oid_value value; - int oidsuffix[OID_SIZE]; - char *desc; -} oident; - - - The proto field takes one of the values + An OID can either be declared as a automatic variable or it can + allocated using the memory utilities or ODR/NMEM. It's + guaranteed that an OID can fit in OID_SIZE integers. - - - PROTO_Z3950 - PROTO_GENERAL - - - - Use PROTO_Z3950 for Z39.50 Object Identifers, - PROTO_GENERAL for other types (such as - those associated with ILL). - - - - The oclass field takes one of the values - - - - CLASS_APPCTX - CLASS_ABSYN - CLASS_ATTSET - CLASS_TRANSYN - CLASS_DIAGSET - CLASS_RECSYN - CLASS_RESFORM - CLASS_ACCFORM - CLASS_EXTSERV - CLASS_USERINFO - CLASS_ELEMSPEC - CLASS_VARSET - CLASS_SCHEMA - CLASS_TAGSET - CLASS_GENERAL - - - - corresponding to the OID classes defined by the Z39.50 standard. - - Finally, the value field takes one of the values - - - - VAL_APDU - VAL_BER - VAL_BASIC_CTX - VAL_BIB1 - VAL_EXP1 - VAL_EXT1 - VAL_CCL1 - VAL_GILS - VAL_WAIS - VAL_STAS - VAL_DIAG1 - VAL_ISO2709 - VAL_UNIMARC - VAL_INTERMARC - VAL_CCF - VAL_USMARC - VAL_UKMARC - VAL_NORMARC - VAL_LIBRISMARC - VAL_DANMARC - VAL_FINMARC - VAL_MAB - VAL_CANMARC - VAL_SBN - VAL_PICAMARC - VAL_AUSMARC - VAL_IBERMARC - VAL_EXPLAIN - VAL_SUTRS - VAL_OPAC - VAL_SUMMARY - VAL_GRS0 - VAL_GRS1 - VAL_EXTENDED - VAL_RESOURCE1 - VAL_RESOURCE2 - VAL_PROMPT1 - VAL_DES1 - VAL_KRB1 - VAL_PRESSET - VAL_PQUERY - VAL_PCQUERY - VAL_ITEMORDER - VAL_DBUPDATE - VAL_EXPORTSPEC - VAL_EXPORTINV - VAL_NONE - VAL_SETM - VAL_SETG - VAL_VAR1 - VAL_ESPEC1 - - - - again, corresponding to the specific OIDs defined by the standard. - Refer to the - - Registry of Z39.50 Object Identifiers for the - whole list. - - - - The desc field contains a brief, mnemonic name for the OID in question. - - + Create OID on stack + + We can create an OID for the Bib-1 attribute set with: + + Odr_oid bib1[OID_SIZE]; + bib1[0] = 1; + bib1[1] = 2; + bib1[2] = 840; + bib1[3] = 10003; + bib1[4] = 3; + bib1[5] = 1; + bib1[6] = -1; + + + - The function + And OID may also be filled from a string-based representation using + dots (.). This is achieved by function + + int oid_dotstring_to_oid(const char *name, Odr_oid *oid); + + This functions returns 0 if name could be converted; -1 otherwise. - - - struct oident *oid_getentbyoid(int *o); - - - - takes as argument an OID, and returns a pointer to a static area - containing an oident structure. You typically use - this function when you receive a PDU containing an OID, and you wish - to branch out depending on the specific OID value. + Using oid_oiddotstring_to_oid + + We can fill the Bib-1 attribute set OID easier with: + + Odr_oid bib1[OID_SIZE]; + oid_oiddotstring_to_oid("1.2.840.10003.3.1", bib1); + - + - The function - - + We can also allocate an OID dynamically on a ODR stream with: - int *oid_ent_to_oid(struct oident *ent, int *dst); + Odr_oid *odr_getoidbystr(ODR o, const char *str); - - - Takes as argument an oident structure - in which - the proto, oclass/, and - value fields are assumed to be set correctly - - and returns a pointer to a the buffer as given by dst - containing the base - representation of the corresponding OID. The function returns - NULL and the array dst is unchanged if a mapping couldn't place. - The array dst should be at least of size - OID_SIZE. + This creates an OID from string-based representation using dots. + This function take an &odr; stream as parameter. This stream is used to + allocate memory for the data elements, which is released on a + subsequent call to odr_reset() on that stream. - - The oid_ent_to_oid() function can be used whenever - you need to prepare a PDU containing one or more OIDs. The separation of - the protocol element from the remainder of the - OID-description makes it simple to write applications that can - communicate with either Z39.50 or OSI SR-based applications. - + Using odr_getoidbystr + + We can create a OID for the Bib-1 attribute set with: + + Odr_oid *bib1 = odr_getoidbystr(odr, "1.2.840.10003.3.1"); + + + The function - - - - oid_value oid_getvalbyname(const char *name); - - - - takes as argument a mnemonic OID name, and returns the - /value field of the first entry in the database that - contains the given name in its desc field. + + char *oid_oid_to_dotstring(const Odr_oid *oid, char *oidbuf) + + does the reverse of oid_oiddotstring_to_oid. It + converts an OID to the string-based representation using dots. + The supplied char buffer oidbuf holds the resulting + string and must be at least OID_STR_MAX in size. - Three utility functions are provided for translating OIDs' - symbolic names (e.g. Usmarc into OID structures - (int arrays) and strings containing the OID in dotted notation - (e.g. 1.2.840.10003.9.5.1). They are: + OIDs can be copied with oid_oidcpy which takes + two OID lists as arguments. Alternativly, an OID copy can be allocated + on a ODR stream with: + + Odr_oid *odr_oiddup(ODR odr, const Odr_oid *o); + - - - int *oid_name_to_oid(oid_class oclass, const char *name, int *oid); - char *oid_to_dotstring(const int *oid, char *oidbuf); - char *oid_name_to_dotstring(oid_class oclass, const char *name, char *oidbuf); - - - - oid_name_to_oid() - translates the specified symbolic name, - interpreted as being of class oclass. (The - class must be specified as many symbolic names exist within - multiple classes - for example, Zthes is the - symbolic name of an attribute set, a schema and a tag-set.) The - sequence of integers representing the OID is written into the - area oid provided by the caller; it is the - caller's responsibility to ensure that this area is large enough - to contain the translated OID. As a convenience, the address of - the buffer (i.e. the value of oid) is - returned. - - - oid_to_dotstring() - Translates the int-array oid into a dotted - string which is written into the area oidbuf - supplied by the caller; it is the caller's responsibility to - ensure that this area is large enough. The address of the buffer - is returned. - - - oid_name_to_dotstring() - combines the previous two functions to derive a dotted string - representing the OID specified by oclass and - name, writing it into the buffer passed as - oidbuf and returning its address. - - + - Finally, the module provides the following utility functions, whose - meaning should be obvious: + OIDs can be compared with oid_oidcmp which returns + zero if the two OIDs provided are identical; non-zero otherwise. + + OID database + + From YAZ version 3 and later, the oident system has been replaced + by an OID database. OID database is a misnomer .. the old odient + system was also a database. + + + The OID database is really just a map between named Object Identifiers + (string) and their OID raw equivalents. Most operations either + convert from string to OID or other way around. + + + Unfortunately, whenever we supply a string we must also specify the + OID class. The class is necessary because some + strings correspond to multiple OIDs. An example of such a string is + Bib-1 which may either be an attribute-set + or a diagnostic-set. + + + Applications using the YAZ database should include + yaz/oid_db.h. + + + A YAZ database handle is of type yaz_oid_db_t. + Actually that's a pointer. You need not think deal with that. + YAZ has a built-in database which can be considered "constant" for + most purposes. + We can get hold that by using function yaz_oid_std. + + + All functions with prefix yaz_string_to_oid + converts from class + string to OID. We have variants of this + operation due to different memory allocation strategies. + + + All functions with prefix + yaz_oid_to_string converts from OID to string + + class. + - - void oid_oidcpy(int *t, int *s); - void oid_oidcat(int *t, int *s); - int oid_oidcmp(int *o1, int *o2); - int oid_oidlen(int *o); - + Create OID with YAZ DB + + We can create an OID for the Bib-1 attribute set on the ODR stream + odr with: + + Odr_oid *bib1 = + yaz_string_to_oid_odr(yaz_oid_std(), CLASS_ATTSET, "Bib-1", odr); + + This is more complex than using odr_getoidbystr. + You would only use yaz_string_to_oid_odr when the + string (here Bib-1) is supplied by a user or configuration. + + - + + Standard OIDs + - The OID module has been criticized - and perhaps rightly so - - for needlessly abstracting the - representation of OIDs. Other toolkits use a simple - string-representation of OIDs with good results. In practice, we have - found the interface comfortable and quick to work with, and it is a - simple matter (for what it's worth) to create applications compatible - with both ISO SR and Z39.50. Finally, the use of the - /oident database is by no means mandatory. - You can easily create your own system for representing OIDs, as long - as it is compatible with the low-level integer-array representation - of the ODR module. + All the object identifers in the standard OID database as returned + by yaz_oid_std can referenced directly in a + program as a constant OID. + Each constant OID is prefixed with yaz_oid_ - + followed by OID class (lowercase) - then by OID name (normalized and + lowercase). + + + See for list of all object identifiers + built into YAZ. + These are declared in yaz/oid_std.h but are + included by yaz/oid_db.h as well. - + Use a built-in OID + + We can allocate our own OID filled with the constant OID for + Bib-1 with: + + Odr_oid *bib1 = odr_oiddup(o, yaz_oid_attset_bib1); + + + + - Nibble Memory @@ -1852,9 +1754,9 @@ typedef struct oident NMEM nmem_create(void); void nmem_destroy(NMEM n); - void *nmem_malloc(NMEM n, int size); + void *nmem_malloc(NMEM n, size_t size); void nmem_reset(NMEM n); - int nmem_total(NMEM n); + size_t nmem_total(NMEM n); void nmem_init(void); void nmem_exit(void); @@ -2039,9 +1941,9 @@ typedef struct oident MARC - YAZ provides a fast utility that decodes MARC records and - encodes to a varity of output formats. The MARC records must - be encoded in ISO2709. + YAZ provides a fast utility for working with MARC records. + Early versions of the MARC utility only allowed decoding of ISO2709. + Today the utility may both encode - and decode to a varity of formats. @@ -2059,6 +1961,8 @@ typedef struct oident #define YAZ_MARC_MARCXML 3 #define YAZ_MARC_ISO2709 4 #define YAZ_MARC_XCHANGE 5 + #define YAZ_MARC_CHECK 6 + #define YAZ_MARC_TURBOMARC 7 /* supply iconv handle for character set conversion .. */ void yaz_marc_iconv(yaz_marc_t mt, yaz_iconv_t cd); @@ -2068,15 +1972,22 @@ typedef struct oident /* decode MARC in buf of size bsize. Returns >0 on success; <=0 on failure. On success, result in *result with size *rsize. */ - int yaz_marc_decode_buf (yaz_marc_t mt, const char *buf, int bsize, - char **result, int *rsize); + int yaz_marc_decode_buf(yaz_marc_t mt, const char *buf, int bsize, + const char **result, size_t *rsize); /* decode MARC in buf of size bsize. Returns >0 on success; <=0 on failure. On success, result in WRBUF */ - int yaz_marc_decode_wrbuf (yaz_marc_t mt, const char *buf, - int bsize, WRBUF wrbuf); + int yaz_marc_decode_wrbuf(yaz_marc_t mt, const char *buf, + int bsize, WRBUF wrbuf); ]]> + + + The synopsis is just a basic subset of all functionality. Refer + to the actual header file marcdisp.h for + details. + + A MARC conversion handle must be created by using yaz_marc_create and destroyed @@ -2101,7 +2012,7 @@ typedef struct oident YAZ_MARC_MARCXML - The resulting record is converted to MARCXML. + MARCXML. @@ -2110,10 +2021,41 @@ typedef struct oident YAZ_MARC_ISO2709 - The resulting record is converted to ISO2709 (MARC). + ISO2709 (sometimes just referred to as "MARC"). + + + YAZ_MARC_XCHANGE + + + MarcXchange. + + + + + + YAZ_MARC_CHECK + + + Pseudo format for validation only. Does not generate + any real output except diagnostics. + + + + + + YAZ_MARC_TURBOMARC + + + XML format with same semantics as MARCXML but more compact + and geared towards fast processing with XSLT. Refer to + for more information. + + + + @@ -2127,13 +2069,13 @@ typedef struct oident Display of MARC record - The followint program snippet illustrates how the MARC API may + The following program snippet illustrates how the MARC API may be used to convert a MARC record to the line-by-line format: + + TurboMARC + + TurboMARC is yet another XML encoding of a MARC record. The format + was designed for fast processing with XSLT. + + + Applications like + Pazpar2 uses XSLT to convert an XML encoded MARC record to an internal + representation. This conversion mostly check the tag of a MARC field + to determine the basic rules in the conversion. This check is + costly when that is tag is encoded as an attribute in MARCXML. + By having the tag value as the element instead, makes processing + many times faster (at least for Libxslt). + + + TurboMARC is encoded as follows: + + + Record elements is part of namespace + "http://www.indexdata.com/turbomarc". + + + A record is enclosed in element r. + + + A collection of records is enclosed in element + collection. + + + The leader is encoded as element l with the + leader content as its (text) value. + + + A control field is encoded as element c concatenated + with the tag value of the control field if the tag value + matches the regular expression [a-zA-Z0-9]*. + If the tag value do not match the regular expression + [a-zA-Z0-9]* the control field is encoded + as element c and attribute code + will hold the tag value. + This rule ensure that in the rare cases where a tag value might + result in a non-wellformed XML YAZ encode it as a coded attribute + (as in MARCXML). + + + The control field content is the the text value of this element. + Indicators are encoded as attribute names + i1, i2, etc.. and + corresponding values for each indicator. + + + A data field is encoded as element d concatenated + with the tag value of the data field or using the attribute + code as described in the rules for control fields. + The children of the data field element is subfield elements. + Each subfield element is encoded as s + concatenated with the sub field code. + The text of the subfield element is the contents of the subfield. + Indicators are encoded as attributes for the data field element similar + to the encoding for control fields. + + + + @@ -2229,7 +2236,10 @@ typedef struct oident Defines the name of the retrieval format. This can be any string. For SRU, the value, is equivalent to schema (short-hand); - for Z39.50 it's equivalent to simple element set name. + for Z39.50 it's equivalent to simple element set name. + For YAZ 3.0.24 and later this name may be specified as a glob + expression with operators + * and ?. @@ -2268,7 +2278,7 @@ typedef struct oident The marc element specifies a conversion to - and from ISO2709 encoded MARC and - &marcxml;/MarcXchange. + &acro.marcxml;/MarcXchange. The following attributes may be specified: @@ -2322,7 +2332,7 @@ typedef struct oident The xslt element specifies a conversion - via &xslt;. The following attributes may be specified: + via &acro.xslt;. The following attributes may be specified: stylesheet (REQUIRED) @@ -2342,8 +2352,8 @@ typedef struct oident Retrieval Facility Examples - - MARC21 backend + + MARC21 backend A typical way to use the retrieval facility is to enable XML for servers that only supports ISO2709 encoded MARC21 records. @@ -2398,7 +2408,7 @@ typedef struct oident - + API It should be easy to use the retrieval systems from applications. Refer