   1 <!-- $Id: tools.xml,v 1.16 2003-01-23 20:26:37 adam Exp $ -->
   2  <chapter id="tools"><title>Supporting Tools</title>
   3
   4   <para>
   In support of the service API, and primarily of the ASN module, which
   provides the programmatic interface to the Z39.50 APDUs, &yaz; contains
   a collection of tools that support the development of applications.
   8   </para>
   9
  10   <sect1 id="tools.query"><title>Query Syntax Parsers</title>
  11
  12    <para>
  13     Since the type-1 (RPN) query structure has no direct, useful string
  14     representation, every origin application needs to provide some form of
  15     mapping from a local query notation or representation to a
  16     <token>Z_RPNQuery</token> structure. Some programmers will prefer to
  17     construct the query manually, perhaps using
  18     <function>odr_malloc()</function> to simplify memory management.
  19     The &yaz; distribution includes two separate, query-generating tools
  20     that may be of use to you.
  21    </para>
  22
  23    <sect2 id="PQF"><title>Prefix Query Format</title>
  24
  25     <para>
  26      Since RPN or reverse polish notation is really just a fancy way of
  27      describing a suffix notation format (operator follows operands), it
  28      would seem that the confusion is total when we now introduce a prefix
  29      notation for RPN. The reason is one of simple laziness - it's somewhat
  30      simpler to interpret a prefix format, and this utility was designed
  31      for maximum simplicity, to provide a baseline representation for use
  32      in simple test applications and scripting environments (like Tcl). The
  33      demonstration client included with YAZ uses the PQF.
  34     </para>
  35
  36     <note>
  37      <para>
      The PQF has been adopted by other parties developing Z39.50
      software. It is often referred to as Prefix Query Notation
      (PQN).
  41      </para>
  42     </note>
  43     <para>
     The PQF is defined by the pquery module in the YAZ library.
     There are two sets of functions with similar behavior. The first
     set operates on a PQF parser handle; the second does not. The first
     set is more flexible than the second, which is obsolete and is
     provided only for backwards compatibility.
  49     </para>
  50     <para>
     The functions in the first set all operate on a PQF parser handle:
  52     </para>
  53     <synopsis>
  54      #include &lt;yaz/pquery.h&gt;
  55
  56      YAZ_PQF_Parser yaz_pqf_create (void);
  57
  58      void yaz_pqf_destroy (YAZ_PQF_Parser p);
  59
  60      Z_RPNQuery *yaz_pqf_parse (YAZ_PQF_Parser p, ODR o, const char *qbuf);
  61
  62      Z_AttributesPlusTerm *yaz_pqf_scan (YAZ_PQF_Parser p, ODR o,
  63                           Odr_oid **attributeSetId, const char *qbuf);
  64
  65
  66      int yaz_pqf_error (YAZ_PQF_Parser p, const char **msg, size_t *off);
  67     </synopsis>
  68     <para>
     A PQF parser is created by <function>yaz_pqf_create</function> and
     destroyed by <function>yaz_pqf_destroy</function>.
     The function <function>yaz_pqf_parse</function> parses the query given
     by the string <literal>qbuf</literal>. If parsing succeeds,
     a Z39.50 RPN query is returned, allocated using the ODR stream
     <literal>o</literal>. If parsing fails, a NULL pointer is
     returned.
     The function <function>yaz_pqf_scan</function> takes a scan query in
     <literal>qbuf</literal>. If parsing succeeds, the function
     returns a pointer to the attributes-plus-term structure and sets
     <literal>attributeSetId</literal> to the attribute set for the
     scan request; both are allocated using the ODR stream
     <literal>o</literal>.
     If parsing fails, <function>yaz_pqf_scan</function> returns a NULL
     pointer.
     Error information for bad queries can be obtained by calling
     <function>yaz_pqf_error</function>, which returns an error code,
     sets <literal>*msg</literal> to point to an error description,
     and sets <literal>*off</literal> to the offset within the last
     query where parsing failed.
  88     </para>
  89     <para>
     The second set of functions is declared as follows:
  91     </para>
  92     <synopsis>
  93      #include &lt;yaz/pquery.h&gt;
  94
  95      Z_RPNQuery *p_query_rpn (ODR o, oid_proto proto, const char *qbuf);
  96
  97      Z_AttributesPlusTerm *p_query_scan (ODR o, oid_proto proto,
  98                              Odr_oid **attributeSetP, const char *qbuf);
  99
 100      int p_query_attset (const char *arg);
 101     </synopsis>
 102     <para>
     The function <function>p_query_rpn()</function> takes as arguments an
     &odr; stream (see section <link linkend="odr">The ODR Module</link>)
     to provide a memory source (the structure created is released on
     the next call to <function>odr_reset()</function> on the stream), a
     protocol identifier (one of the constants <token>PROTO_Z3950</token> and
     <token>PROTO_SR</token>), and a null-terminated string holding the
     query. The attribute set is not passed as an argument; the default
     is selected with <function>p_query_attset()</function>.
 110     </para>
 111     <para>
 112      If the parse went well, <function>p_query_rpn()</function> returns a
 113      pointer to a <literal>Z_RPNQuery</literal> structure which can be
 114      placed directly into a <literal>Z_SearchRequest</literal>.
 115      If parsing failed, due to syntax error, a NULL pointer is returned.
 116     </para>
 117     <para>
     The function <function>p_query_attset</function> specifies which
     attribute set to use if the query doesn't specify one with the
     <literal>@attrset</literal> operator.
     It returns 0 if the argument is a
     valid attribute set specifier; otherwise it returns -1.
 123     </para>
 124
 125     <para>
 126      The grammar of the PQF is as follows:
 127     </para>
 128
 129     <literallayout>
 130      query ::= top-set query-struct.
 131
 132      top-set ::= &lsqb; '@attrset' string &rsqb;
 133
 134      query-struct ::= attr-spec | simple | complex | '@term' term-type
 135
 136      attr-spec ::= '@attr' &lsqb; string &rsqb; string query-struct
 137
 138      complex ::= operator query-struct query-struct.
 139
 140      operator ::= '@and' | '@or' | '@not' | '@prox' proximity.
 141
 142      simple ::= result-set | term.
 143
 144      result-set ::= '@set' string.
 145
 146      term ::= string.
 147
 148      proximity ::= exclusion distance ordered relation which-code unit-code.
 149
 150      exclusion ::= '1' | '0' | 'void'.
 151
 152      distance ::= integer.
 153
 154      ordered ::= '1' | '0'.
 155
 156      relation ::= integer.
 157
 158      which-code ::= 'known' | 'private' | integer.
 159
 160      unit-code ::= integer.
 161
 162      term-type ::= 'general' | 'numeric' | 'string' | 'oid' | 'datetime' | 'null'.
 163     </literallayout>
 164
 165     <para>
 166      You will note that the syntax above is a fairly faithful
 167      representation of RPN, except for the Attribute, which has been
 168      moved a step away from the term, allowing you to associate one or more
 169      attributes with an entire query structure. The parser will
 170      automatically apply the given attributes to each term as required.
 171     </para>
 172
 173     <para>
     The @attr operator is followed by an attribute specification
     (<literal>attr-spec</literal> above). The specification consists
     of an optional attribute set, an attribute type-value pair and
     a sub-query. The attribute type-value pair is packed in one string:
     an attribute type, an equals sign, and an attribute value
     (for example <literal>1=4</literal>).
     The type is always an integer, but the value may be either an
     integer or a string (if it doesn't start with a digit character).
 181     </para>
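    <para>
     The packing of the type-value pair can be illustrated with a small,
     self-contained sketch. It is not the code used by the pquery module;
     the names <literal>struct attr_pair</literal> and
     <literal>split_attr</literal> are hypothetical and exist only for
     this illustration.
    </para>
    <screen><![CDATA[
```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical result type for illustration; not part of YAZ. */
struct attr_pair {
    int type;              /* attribute type, always an integer */
    int is_numeric;        /* 1 if the value is an integer, 0 if a string */
    long num_value;        /* valid when is_numeric is 1 */
    const char *str_value; /* points into the input; valid when is_numeric is 0 */
};

/* Split "type=value" as described above.
   Returns 0 on success, -1 if the string is not a type-value pair. */
int split_attr(const char *spec, struct attr_pair *out)
{
    char *end;
    long type;

    if (!isdigit((unsigned char) spec[0]))
        return -1;                  /* the type must be an integer */
    type = strtol(spec, &end, 10);
    if (*end != '=' || end[1] == '\0')
        return -1;                  /* missing '=' or empty value */
    out->type = (int) type;
    end++;                          /* skip the '=' */
    if (isdigit((unsigned char) *end)) {
        /* value starts with a digit: treat it as an integer */
        out->is_numeric = 1;
        out->num_value = strtol(end, NULL, 10);
        out->str_value = NULL;
    } else {
        /* anything else is kept as a string value */
        out->is_numeric = 0;
        out->num_value = 0;
        out->str_value = end;
    }
    return 0;
}
```
]]></screen>
    <para>
     With this sketch, <literal>1=4</literal> yields type 1 with the
     integer value 4, while <literal>1=/book/title</literal> yields type 1
     with the string value <literal>/book/title</literal>.
    </para>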
 182
 183     <para>
     Z39.50 version 3 defines various encodings of terms.
     Use the @term operator to indicate the encoding type:
     <literal>general</literal>, <literal>numeric</literal>,
     <literal>string</literal> (for InternationalString), and so on.
     If no term type is given, the <literal>general</literal> form
     is used, which is the only encoding allowed in both versions 2 and 3
     of the Z39.50 standard.
 191     </para>
 192
 193     <para>
 194      The following are all examples of valid queries in the PQF.
 195     </para>
 196
 197     <screen>
 198      dylan
 199
 200      "bob dylan"
 201
 202      @or "dylan" "zimmerman"
 203
 204      @set Result-1
 205
 206      @or @and bob dylan @set Result-1
 207
 208      @attr 1=4 computer
 209
 210      @attr 4=1 @and @attr 1=1 "bob dylan" @attr 1=4 "slow train coming"
 211
 212      @attr 4=1 @attr 1=4 "self portrait"
 213
 214      @prox 0 3 1 2 k 2 dylan zimmerman
 215
 216      @and @attr 2=4 @attr gils 1=2038 -114 @attr 2=2 @attr gils 1=2039 -109
 217
 218      @term string "a UTF-8 string, maybe?"
 219
 220      @attr 1=/book/title computer
 221     </screen>
 222
 223    </sect2>
 224    <sect2 id="CCL"><title>Common Command Language</title>
 225
 226     <para>
 227      Not all users enjoy typing in prefix query structures and numerical
 228      attribute values, even in a minimalistic test client. In the library
 229      world, the more intuitive Common Command Language (or ISO 8777) has
 230      enjoyed some popularity - especially before the widespread
 231      availability of graphical interfaces. It is still useful in
 232      applications where you for some reason or other need to provide a
 233      symbolic language for expressing boolean query structures.
 234     </para>
 235
 236     <para>
 237      The <ulink url="http://europagate.dtv.dk/">EUROPAGATE</ulink>
 238      research project working under the Libraries programme
 239      of the European Commission's DG XIII has, amongst other useful tools,
 240      implemented a general-purpose CCL parser which produces an output
 241      structure that can be trivially converted to the internal RPN
 242      representation of &yaz; (The <literal>Z_RPNQuery</literal> structure).
 243      Since the CCL utility - along with the rest of the software
     produced by EUROPAGATE - is made freely available under a liberal
 245      license, it is included as a supplement to &yaz;.
 246     </para>
 247
 248     <sect3><title>CCL Syntax</title>
 249
 250      <para>
      The CCL parser obeys the following grammar for the FIND argument.
      The syntax is annotated by lines prefixed with
      <literal>&dash;&dash;</literal>.
 254      </para>
 255
 256      <screen>
 257       CCL-Find ::= CCL-Find Op Elements
 258                 | Elements.
 259
 260       Op ::= "and" | "or" | "not"
 261       -- The above means that Elements are separated by boolean operators.
 262
 263       Elements ::= '(' CCL-Find ')'
 264                 | Set
 265                 | Terms
 266                 | Qualifiers Relation Terms
 267                 | Qualifiers Relation '(' CCL-Find ')'
 268                 | Qualifiers '=' string '-' string
 269       -- Elements is either a recursive definition, a result set reference, a
 270       -- list of terms, qualifiers followed by terms, qualifiers followed
 271       -- by a recursive definition or qualifiers in a range (lower - upper).
 272
 273       Set ::= 'set' = string
 274       -- Reference to a result set
 275
 276       Terms ::= Terms Prox Term
 277              | Term
 278       -- Proximity of terms.
 279
 280       Term ::= Term string
 281             | string
 282       -- This basically means that a term may include a blank
 283
 284       Qualifiers ::= Qualifiers ',' string
 285                   | string
 286       -- Qualifiers is a list of strings separated by comma
 287
 288       Relation ::= '=' | '>=' | '&lt;=' | '&lt;>' | '>' | '&lt;'
 289       -- Relational operators. This really doesn't follow the ISO8777
 290       -- standard.
 291
 292       Prox ::= '%' | '!'
 293       -- Proximity operator
 294
 295      </screen>
 296
 297      <para>
 298       The following queries are all valid:
 299      </para>
 300
 301      <screen>
 302       dylan
 303
 304       "bob dylan"
 305
 306       dylan or zimmerman
 307
 308       set=1
 309
 310       (dylan and bob) or set=1
 311
 312      </screen>
 313      <para>
 314       Assuming that the qualifiers <literal>ti</literal>, <literal>au</literal>
 315       and <literal>date</literal> are defined we may use:
 316      </para>
 317
 318      <screen>
 319       ti=self portrait
 320
 321       au=(bob dylan and slow train coming)
 322
 323       date>1980 and (ti=((self portrait)))
 324
 325      </screen>
 326
 327     </sect3>
 328     <sect3><title>CCL Qualifiers</title>
 329
 330      <para>
 331       Qualifiers are used to direct the search to a particular searchable
 332       index, such as title (ti) and author indexes (au). The CCL standard
 333       itself doesn't specify a particular set of qualifiers, but it does
 334       suggest a few short-hand notations. You can customize the CCL parser
 335       to support a particular set of qualifiers to reflect the current target
 336       profile. Traditionally, a qualifier would map to a particular
 337       use-attribute within the BIB-1 attribute set. However, you could also
 338       define qualifiers that would set, for example, the
 339       structure-attribute.
 340      </para>
 341
 342      <para>
      Consider a scenario where the target supports ranked searches in the
      title index. In this case, the user could specify
 345      </para>
 346
 347      <screen>
 348       ti,ranked=knuth computer
 349      </screen>
 350      <para>
 351       and the <literal>ranked</literal> would map to relation=relevance
 352       (2=102) and the <literal>ti</literal> would map to title (1=4).
 353      </para>
 354
 355      <para>
      A "profile" with a set of predefined CCL qualifiers can be read from a
 357       file. The YAZ client reads its CCL qualifiers from a file named
 358       <filename>default.bib</filename>. Each line in the file has the form:
 359      </para>
 360
 361      <para>
 362       <replaceable>qualifier-name</replaceable>
 363       <replaceable>type</replaceable>=<replaceable>val</replaceable>
 364       <replaceable>type</replaceable>=<replaceable>val</replaceable> ...
 365      </para>
 366
 367      <para>
      where <replaceable>qualifier-name</replaceable> is the name of the
      qualifier to be used (e.g. <literal>ti</literal>),
      <replaceable>type</replaceable> is a BIB-1 category type and
      <replaceable>val</replaceable> is the corresponding BIB-1 attribute
      value.
      The <replaceable>type</replaceable> can be either numeric or one of
      the letters <literal>u</literal> (use), <literal>r</literal> (relation),
      <literal>p</literal> (position), <literal>s</literal> (structure),
      <literal>t</literal> (truncation) or <literal>c</literal> (completeness).
      The <replaceable>qualifier-name</replaceable> <literal>term</literal>
      has a special meaning:
      the types and values of this definition are used when
      <emphasis>no</emphasis> qualifiers are present.
 381      </para>
 382
 383      <para>
 384       Consider the following definition:
 385      </para>
 386
 387      <screen>
 388       ti       u=4 s=1
 389       au       u=1 s=1
 390       term     s=105
 391      </screen>
 392      <para>
 393       Two qualifiers are defined, <literal>ti</literal> and
 394       <literal>au</literal>.
 395       They both set the structure-attribute to phrase (1).
 396       <literal>ti</literal>
 397       sets the use-attribute to 4. <literal>au</literal> sets the
 398       use-attribute to 1.
 399       When no qualifiers are used in the query the structure-attribute is
 400       set to free-form-text (105).
 401      </para>
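     <para>
      To make the line format concrete, here is a minimal sketch that parses
      one profile line into BIB-1 type and value numbers. It is not the
      parser used by <function>ccl_qual_file</function>; the names
      <literal>struct qual_def</literal>, <literal>qual_type</literal> and
      <literal>parse_qual_line</literal> are hypothetical, and the sketch
      assumes that all values are numeric and that the letters map to the
      BIB-1 attribute types 1 (use) through 6 (completeness).
     </para>
     <screen><![CDATA[
```c
#include <stdlib.h>
#include <string.h>

#define MAX_PAIRS 8

/* Hypothetical container for one parsed profile line; not a YAZ type. */
struct qual_def {
    char name[32];
    int n;                  /* number of type=value pairs */
    int type[MAX_PAIRS];    /* BIB-1 attribute type */
    int value[MAX_PAIRS];   /* BIB-1 attribute value */
};

/* Map a type token to a BIB-1 attribute type number; 0 if unknown. */
static int qual_type(const char *tok)
{
    static const char letters[] = "urpstc"; /* use=1 ... completeness=6 */
    const char *p;

    if (tok[0] >= '0' && tok[0] <= '9')
        return atoi(tok);                   /* numeric type */
    if (tok[0] == '\0' || tok[1] != '\0')
        return 0;                           /* must be a single letter */
    p = strchr(letters, tok[0]);
    return p ? (int) (p - letters) + 1 : 0;
}

/* Parse "name type=val type=val ...".
   Returns 0 on success, -1 on a malformed line. */
int parse_qual_line(const char *line, struct qual_def *out)
{
    char buf[256];
    char *tok;

    if (strlen(line) >= sizeof(buf))
        return -1;
    strcpy(buf, line);                      /* strtok modifies its input */
    tok = strtok(buf, " \t\r\n");
    if (!tok || strlen(tok) >= sizeof(out->name))
        return -1;
    strcpy(out->name, tok);
    out->n = 0;
    while ((tok = strtok(NULL, " \t\r\n")) != NULL) {
        char *eq = strchr(tok, '=');
        int type;

        if (!eq || out->n == MAX_PAIRS)
            return -1;
        *eq = '\0';                         /* split "type=val" in place */
        type = qual_type(tok);
        if (type <= 0)
            return -1;
        out->type[out->n] = type;
        out->value[out->n] = atoi(eq + 1);
        out->n++;
    }
    return 0;
}
```
]]></screen>
     <para>
      Applied to the definition above, the line for <literal>ti</literal>
      gives the pairs 1=4 and 4=1, and the line for <literal>term</literal>
      gives the single pair 4=105.
     </para>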
 402
 403     </sect3>
 404     <sect3><title>CCL API</title>
 405      <para>
 406       All public definitions can be found in the header file
 407       <filename>ccl.h</filename>. A profile identifier is of type
 408       <literal>CCL_bibset</literal>. A profile must be created with the call
 409       to the function <function>ccl_qual_mk</function> which returns a profile
 410       handle of type <literal>CCL_bibset</literal>.
 411      </para>
 412
 413      <para>
 414       To read a file containing qualifier definitions the function
 415       <function>ccl_qual_file</function> may be convenient. This function
 416       takes an already opened <literal>FILE</literal> handle pointer as
 417       argument along with a <literal>CCL_bibset</literal> handle.
 418      </para>
 419
 420      <para>
 421       To parse a simple string with a FIND query use the function
 422      </para>
 423      <screen>
 424 struct ccl_rpn_node *ccl_find_str (CCL_bibset bibset, const char *str,
 425                                    int *error, int *pos);
 426      </screen>
 427      <para>
      which takes the CCL profile (<literal>bibset</literal>) and query
      (<literal>str</literal>) as input. Upon successful completion the RPN
      tree is returned. If an error occurs, such as a syntax error, the integer
      pointed to by <literal>error</literal> holds the error code and
      <literal>pos</literal> holds the offset within the query string at which
      parsing failed.
 434      </para>
 435
 436      <para>
 437       An English representation of the error may be obtained by calling
 438       the <literal>ccl_err_msg</literal> function. The error codes are
 439       listed in <filename>ccl.h</filename>.
 440      </para>
 441
 442      <para>
      To convert the CCL RPN tree (type
      <literal>struct ccl_rpn_node *</literal>)
      to the Z_RPNQuery of YAZ, the function <function>ccl_rpn_query</function>
      must be used. This function, which is part of YAZ, is implemented in
      <filename>yaz-ccl.c</filename>.
      After calling this function the CCL RPN tree is probably no longer
      needed. The function <literal>ccl_rpn_delete</literal> destroys the
      CCL RPN tree.
 450      </para>
 451
 452      <para>
 453       A CCL profile may be destroyed by calling the
 454       <function>ccl_qual_rm</function> function.
 455      </para>
 456
 457      <para>
 458       The token names for the CCL operators may be changed by setting the
 459       globals (all type <literal>char *</literal>)
 460       <literal>ccl_token_and</literal>, <literal>ccl_token_or</literal>,
 461       <literal>ccl_token_not</literal> and <literal>ccl_token_set</literal>.
 462       An operator may have aliases, i.e. there may be more than one name for
 463       the operator. To do this, separate each alias with a space character.
 464      </para>
 465     </sect3>
 466    </sect2>
 467    <sect2 id="tools.cql"><title>CQL</title>
 468     <para>
 469      <ulink url="http://www.loc.gov/z3950/agency/zing/cql/">CQL</ulink>
 470       - Common Query Language - was defined for the
 471      <ulink url="http://www.loc.gov/z3950/agency/zing/srw/">SRW</ulink>
 472      protocol.
     In many ways CQL has a similar syntax to CCL, but its
     objective is different. Where CCL aims to be
     an end-user language, CQL is <emphasis>the</emphasis> protocol
     query language for SRW.
 477     </para>
 478     <tip>
 479      <para>
 480       If you are new to CQL, read the
 481       <ulink url="http://zing.z3950.org/cql/intro.html">Gentle
 482        Introduction</ulink>.
 483      </para>
 484     </tip>
 485     <para>
 486      The CQL parser in &yaz; provides the following:
 487      <itemizedlist>
 488       <listitem>
 489        <para>
 490         It parses and validates a CQL query.
 491        </para>
 492       </listitem>
 493       <listitem>
 494        <para>
 495         It generates a C structure that allows you to convert
 496         a CQL query to some other query language, such as SQL.
 497        </para>
 498       </listitem>
 499       <listitem>
 500        <para>
 501         The parser converts a valid CQL query to PQF, thus providing a
 502         way to use CQL for both SRW/SRU servers and Z39.50 targets at the
 503         same time.
 504        </para>
 505       </listitem>
 506       <listitem>
 507        <para>
 508         The parser converts CQL to
 509         <ulink url="http://www.loc.gov/z3950/agency/zing/cql/xcql.html">
 510          XCQL</ulink>.
 511         XCQL is an XML representation of CQL.
 512         XCQL is part of the SRW specification. However, since SRU
 513         supports CQL only, we don't expect XCQL to be widely used.
 514         Furthermore, CQL has the advantage over XCQL that it is
 515         easy to read.
 516        </para>
 517       </listitem>
 518      </itemizedlist>
 519     </para>
 520     <sect3 id="tools.cql.parsing"><title>CQL parsing</title>
 521      <para>
 522       A CQL parser is represented by the <literal>CQL_parser</literal>
 523       handle. Its contents should be considered &yaz; internal (private).
 524       <synopsis>
 525 #include &lt;yaz/cql.h&gt;
 526
 527 typedef struct cql_parser *CQL_parser;
 528
 529 CQL_parser cql_parser_create(void);
 530 void cql_parser_destroy(CQL_parser cp);
 531       </synopsis>
 532      A parser is created by <function>cql_parser_create</function> and
 533      is destroyed by <function>cql_parser_destroy</function>.
 534      </para>
 535      <para>
 536       To parse a CQL query string, the following function
 537       is provided:
 538       <synopsis>
 539 int cql_parser_string(CQL_parser cp, const char *str);
 540       </synopsis>
      The function <function>cql_parser_string</function> parses
      the CQL query given in <parameter>str</parameter>.
      If the query is valid (no syntax errors), zero is returned;
      otherwise a non-zero error code is returned.
 545      </para>
 546      <para>
 547       <synopsis>
 548 int cql_parser_stream(CQL_parser cp,
 549                       int (*getbyte)(void *client_data),
 550                       void (*ungetbyte)(int b, void *client_data),
 551                       void *client_data);
 552
 553 int cql_parser_stdio(CQL_parser cp, FILE *f);
 554       </synopsis>
      The functions <function>cql_parser_stream</function> and
      <function>cql_parser_stdio</function> parse a CQL query
      just like <function>cql_parser_string</function>.
      The only difference is how the CQL query is
      fed to the parser.
      The function <function>cql_parser_stream</function> reads from a generic
      byte stream. The function <function>cql_parser_stdio</function>
      reads from a <literal>FILE</literal> handle which is opened for reading.
 563      </para>
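     <para>
      As an illustration of the generic byte stream, the following sketch
      shows a pair of callbacks that read a query from memory. The names
      <literal>struct mem_stream</literal>, <literal>mem_getbyte</literal>
      and <literal>mem_ungetbyte</literal> are hypothetical, and the sketch
      assumes that the parser treats a return value of 0 from
      <literal>getbyte</literal> as the end of input.
     </para>
     <screen><![CDATA[
```c
#include <stddef.h>

/* Hypothetical client data for reading a query from memory. */
struct mem_stream {
    const char *str;   /* null-terminated query text */
    size_t off;        /* index of the next byte to return */
};

/* Has the shape of the getbyte parameter: returns the next byte,
   or 0 at the end (assumption: 0 signals end of input). */
int mem_getbyte(void *client_data)
{
    struct mem_stream *ms = (struct mem_stream *) client_data;

    if (ms->str[ms->off] == '\0')
        return 0;
    return (unsigned char) ms->str[ms->off++];
}

/* Has the shape of the ungetbyte parameter: steps back one byte so
   that the byte just read is returned again by the next mem_getbyte. */
void mem_ungetbyte(int b, void *client_data)
{
    struct mem_stream *ms = (struct mem_stream *) client_data;

    if (b != 0 && ms->off > 0)
        ms->off--;
}
```
]]></screen>
     <para>
      With &yaz; linked in, such callbacks would be passed as
      <literal>cql_parser_stream(cp, mem_getbyte, mem_ungetbyte, &amp;ms)</literal>,
      where <literal>ms</literal> is a <literal>struct mem_stream</literal>
      initialized with the query text and offset 0.
     </para>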
 564     </sect3>
 565
 566     <sect3 id="tools.cql.tree"><title>CQL tree</title>
 567      <para>
      If the query string is valid, the CQL parser
      generates a tree representing the structure of the
      CQL query.
 571      </para>
 572      <para>
 573       <synopsis>
 574 struct cql_node *cql_parser_result(CQL_parser cp);
 575       </synopsis>
      <function>cql_parser_result</function> returns
      a pointer to the root node of the resulting tree.
 578      </para>
 579      <para>
 580       Each node in a CQL tree is represented by a
 581       <literal>struct cql_node</literal>.
 582       It is defined as follows:
 583       <synopsis>
 584 #define CQL_NODE_ST 1
 585 #define CQL_NODE_BOOL 2
 586 #define CQL_NODE_MOD 3
 587 struct cql_node {
 588     int which;
 589     union {
 590         struct {
 591             char *index;
 592             char *term;
 593             char *relation;
 594             struct cql_node *modifiers;
 595             struct cql_node *prefixes;
 596         } st;
 597         struct {
 598             char *value;
 599             struct cql_node *left;
 600             struct cql_node *right;
 601             struct cql_node *modifiers;
 602             struct cql_node *prefixes;
 603         } bool;
 604         struct {
 605             char *name;
 606             char *value;
 607             struct cql_node *next;
 608         } mod;
 609     } u;
 610 };
 611       </synopsis>
 612       There are three kinds of nodes, search term (ST), boolean (BOOL),
 613       and modifier (MOD).
 614      </para>
 615      <para>
 616       The search term node has five members:
 617       <itemizedlist>
 618        <listitem>
 619         <para>
 620          <literal>index</literal>: index for search term.
 621          If an index is unspecified for a search term,
 622          <literal>index</literal> will be NULL.
 623         </para>
 624        </listitem>
 625        <listitem>
 626         <para>
 627          <literal>term</literal>: the search term itself.
 628         </para>
 629        </listitem>
 630        <listitem>
 631         <para>
 632          <literal>relation</literal>: relation for search term.
 633         </para>
 634        </listitem>
 635        <listitem>
 636         <para>
 637          <literal>modifiers</literal>: relation modifiers for search
 638          term. The <literal>modifiers</literal> is a simple linked
 639          list (NULL for last entry). Each relation modifier node
 640          is of type <literal>MOD</literal>.
 641         </para>
 642        </listitem>
 643        <listitem>
 644         <para>
 645          <literal>prefixes</literal>: index prefixes for search
 646          term. The <literal>prefixes</literal> is a simple linked
 647          list (NULL for last entry). Each prefix node
 648          is of type <literal>MOD</literal>.
 649         </para>
 650        </listitem>
 651       </itemizedlist>
 652      </para>
 653
 654      <para>
      The boolean node represents <literal>and</literal>,
      <literal>or</literal> and <literal>not</literal>, as well as
      proximity.
 658       <itemizedlist>
 659        <listitem>
 660         <para>
         <literal>left</literal> and <literal>right</literal>: the left
         and right operands, respectively.
 663         </para>
 664        </listitem>
 665        <listitem>
 666         <para>
 667          <literal>modifiers</literal>: proximity arguments.
 668         </para>
 669        </listitem>
 670        <listitem>
 671         <para>
 672          <literal>prefixes</literal>: index prefixes.
 673          The <literal>prefixes</literal> is a simple linked
 674          list (NULL for last entry). Each prefix node
 675          is of type <literal>MOD</literal>.
 676         </para>
 677        </listitem>
 678       </itemizedlist>
 679      </para>
 680
 681      <para>
      The modifier node is a "utility" node used for name-value pairs,
      such as prefixes, proximity arguments, etc.
 684       <itemizedlist>
 685        <listitem>
 686         <para>
         <literal>name</literal>: name of mod node.
 688         </para>
 689        </listitem>
 690        <listitem>
 691         <para>
         <literal>value</literal>: value of mod node.
 693         </para>
 694        </listitem>
 695        <listitem>
 696         <para>
 697          <literal>next</literal>: pointer to next node which is
 698          always a mod node (NULL for last entry).
 699         </para>
 700        </listitem>
 701       </itemizedlist>
 702      </para>
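     <para>
      The node layout lends itself to simple recursive traversal. The sketch
      below is self-contained: it uses a local copy of the structure rather
      than <filename>yaz/cql.h</filename>, and it names the boolean member
      <literal>boolean</literal> instead of <literal>bool</literal> so that
      it also compiles with newer C dialects where <literal>bool</literal>
      is reserved. The functions <literal>count_terms</literal> and
      <literal>count_mods</literal> are hypothetical helpers.
     </para>
     <screen><![CDATA[
```c
#define CQL_NODE_ST 1
#define CQL_NODE_BOOL 2
#define CQL_NODE_MOD 3

/* Local copy of the structure shown above, so that the sketch compiles
   without YAZ. The boolean member is renamed from "bool" to "boolean". */
struct cql_node {
    int which;
    union {
        struct {
            char *index;
            char *term;
            char *relation;
            struct cql_node *modifiers;
            struct cql_node *prefixes;
        } st;
        struct {
            char *value;
            struct cql_node *left;
            struct cql_node *right;
            struct cql_node *modifiers;
            struct cql_node *prefixes;
        } boolean;
        struct {
            char *name;
            char *value;
            struct cql_node *next;
        } mod;
    } u;
};

/* Count the search term (ST) nodes in a tree by recursing through
   the left and right operands of boolean nodes. */
int count_terms(const struct cql_node *cn)
{
    if (!cn)
        return 0;
    switch (cn->which) {
    case CQL_NODE_ST:
        return 1;
    case CQL_NODE_BOOL:
        return count_terms(cn->u.boolean.left)
             + count_terms(cn->u.boolean.right);
    default:
        return 0;
    }
}

/* Count the entries of a modifier or prefix list, following the
   next pointers until the NULL that ends the list. */
int count_mods(const struct cql_node *m)
{
    int n = 0;

    for (; m && m->which == CQL_NODE_MOD; m = m->u.mod.next)
        n++;
    return n;
}
```
]]></screen>
     <para>
      For a tree with one boolean node whose operands are two search terms,
      <literal>count_terms</literal> returns 2.
     </para>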
 703
 704     </sect3>
 705     <sect3 id="tools.cql.pqf"><title>CQL to PQF conversion</title>
 706      <para>
      Conversion to PQF (and Z39.50 RPN) is complicated by the fact
      that the resulting RPN depends on the Z39.50 target's
      capabilities (combinations of supported attributes).
      Furthermore, CQL and SRW operate on index prefixes
      (URIs or strings), whereas RPN uses Object Identifiers
      for attribute sets.
 713      </para>
 714      <para>
 715       The CQL library of &yaz; defines a <literal>cql_transform_t</literal>
 716       type. It represents a particular mapping between CQL and RPN.
 717       This handle is created and destroyed by the functions:
 718      <synopsis>
 719 cql_transform_t cql_transform_open_FILE (FILE *f);
 720 cql_transform_t cql_transform_open_fname(const char *fname);
 721 void cql_transform_close(cql_transform_t ct);
 722       </synopsis>
      The first two functions create a transformation handle from
      either an already open FILE or a filename.
 725      </para>
 726      <para>
      The handle is destroyed by <function>cql_transform_close</function>,
      after which the handle must no longer be referenced.
 729      </para>
 730      <para>
 731       When a <literal>cql_transform_t</literal> handle has been created
 732       you can convert to RPN.
 733       <synopsis>
 734 int cql_transform_buf(cql_transform_t ct,
 735                       struct cql_node *cn, char *out, int max);
 736       </synopsis>
      This function converts the CQL tree <literal>cn</literal>
      using handle <literal>ct</literal>.
      For the resulting PQF, you supply a buffer <literal>out</literal>
      which must be able to hold at least <literal>max</literal>
      characters.
 742      </para>
 743      <para>
 744       If conversion failed, <function>cql_transform_buf</function>
 745       returns a non-zero error code; otherwise zero is returned
 746       (conversion successful).
 747      </para>
 748      <para>
 749       If you wish to be able to produce a PQF result in a different
 750       way, there are two alternatives.
 751       <synopsis>
 752 void cql_transform_pr(cql_transform_t ct,
 753                       struct cql_node *cn,
 754                       void (*pr)(const char *buf, void *client_data),
 755                       void *client_data);
 756
 757 int cql_transform_FILE(cql_transform_t ct,
 758                        struct cql_node *cn, FILE *f);
 759       </synopsis>
 760       The former function produces output to a user-defined
 761       output stream. The latter writes the result to an already
 762       open <literal>FILE</literal>.
 763      </para>
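     <para>
      A user-defined output stream is simply a callback plus some client
      data. The following sketch shows one possible callback with the shape
      of the <literal>pr</literal> parameter; it collects the output in a
      fixed-size buffer. The names <literal>struct out_buf</literal> and
      <literal>out_buf_pr</literal> are hypothetical, and the sketch assumes
      that <literal>buf</literal> is a null-terminated string.
     </para>
     <screen><![CDATA[
```c
#include <string.h>

/* Hypothetical client data: a fixed-size buffer that collects output. */
struct out_buf {
    char data[256];
    size_t len;        /* bytes used, excluding the terminating null */
    int truncated;     /* set to 1 if some output did not fit */
};

/* Has the shape of the pr parameter: appends buf to the out_buf
   passed as client_data, truncating if the buffer is full. */
void out_buf_pr(const char *buf, void *client_data)
{
    struct out_buf *ob = (struct out_buf *) client_data;
    size_t n = strlen(buf);
    size_t room = sizeof(ob->data) - 1 - ob->len;

    if (n > room) {
        n = room;
        ob->truncated = 1;
    }
    memcpy(ob->data + ob->len, buf, n);
    ob->len += n;
    ob->data[ob->len] = '\0';
}
```
]]></screen>
     <para>
      With &yaz; linked in, such a callback would be passed as
      <literal>cql_transform_pr(ct, cn, out_buf_pr, &amp;ob)</literal>,
      where <literal>ob</literal> is a zero-initialized
      <literal>struct out_buf</literal>.
     </para>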
 764     </sect3>
 765     <sect3 id="toolq.cql.xcql"><title>CQL to XCQL conversion</title>
 766      <para>
 767       Conversion from CQL to XCQL is trivial and does not
 768       require a mapping to be defined.
      There are three functions to choose from depending on the
 770       way you wish to store the resulting output (XML buffer
 771       containing XCQL).
 772       <synopsis>
 773 int cql_to_xml_buf(struct cql_node *cn, char *out, int max);
 774 void cql_to_xml(struct cql_node *cn,
 775                 void (*pr)(const char *buf, void *client_data),
 776                 void *client_data);
 777 void cql_to_xml_stdio(struct cql_node *cn, FILE *f);
 778       </synopsis>
      The function <function>cql_to_xml_buf</function> converts
      to XCQL and stores the result in a user-supplied buffer of the given
      maximum size.
 782      </para>
 783      <para>
      <function>cql_to_xml</function> writes the result to
      a user-defined output stream.
      <function>cql_to_xml_stdio</function> writes to
      a file.
 788      </para>
 789     </sect3>
 790    </sect2>
 791   </sect1>
 792   <sect1 id="tools.oid"><title>Object Identifiers</title>
 793
 794    <para>
 795     The basic YAZ representation of an OID is an array of integers,
 796     terminated with the value -1. The &odr; module provides two
    utility functions to create and copy data elements of this type:
 798    </para>
 799
 800    <screen>
 801     Odr_oid *odr_getoidbystr(ODR o, char *str);
 802    </screen>
 803
 804    <para>
 805     Creates an OID based on a string-based representation using dots (.)
 806     to separate elements in the OID.
 807    </para>
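   <para>
    To make the dotted notation concrete, the following is a minimal
    sketch of such a conversion into a -1 terminated integer array. It
    is not the &odr; implementation: it writes into a caller-supplied
    array instead of allocating on a stream, and the names
    <literal>my_oid_from_dotted</literal> and
    <literal>MY_OID_SIZE</literal> are hypothetical.
   </para>
   <screen><![CDATA[
```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define MY_OID_SIZE 20   /* hypothetical capacity, -1 terminator included */

/* Parses "1.2.840" into dst as {1, 2, 840, -1}.
   Returns dst on success, NULL on malformed input or too many arcs.
   On failure dst may hold a partial result. Arc values are not
   checked for integer overflow. */
static int *my_oid_from_dotted(const char *str, int *dst)
{
    int n = 0;

    assert(str && dst);
    while (*str)
    {
        int v = 0;

        if (!isdigit((unsigned char) *str) || n >= MY_OID_SIZE - 1)
            return NULL;
        while (isdigit((unsigned char) *str))
            v = v * 10 + (*str++ - '0');
        dst[n++] = v;
        if (*str == '.')
        {
            str++;
            if (!*str)          /* trailing dot */
                return NULL;
        }
        else if (*str)          /* anything but a dot or the end */
            return NULL;
    }
    if (n == 0)                 /* empty string */
        return NULL;
    dst[n] = -1;
    return dst;
}
```
]]></screen>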
 808
 809    <screen>
 810     Odr_oid *odr_oiddup(ODR odr, Odr_oid *o);
 811    </screen>
 812
 813    <para>
 814     Creates a copy of the OID referenced by the <emphasis>o</emphasis>
 815     parameter.
    Both functions take an &odr; stream as a parameter. This stream is used to
 817     allocate memory for the data elements, which is released on a
 818     subsequent call to <function>odr_reset()</function> on that stream.
 819    </para>
 820
 821    <para>
 822     The OID module provides a higher-level representation of the
 823     family of object identifiers which describe the Z39.50 protocol and its
 824     related objects. The definition of the module interface is given in
 825     the <filename>oid.h</filename> file.
 826    </para>
 827
 828    <para>
 829     The interface is mainly based on the <literal>oident</literal> structure.
 830     The definition of this structure looks like this:
 831    </para>
 832
 833    <screen>
 834 typedef struct oident
 835 {
 836     oid_proto proto;
 837     oid_class oclass;
 838     oid_value value;
 839     int oidsuffix[OID_SIZE];
 840     char *desc;
 841 } oident;
 842    </screen>
 843
 844    <para>
 845     The proto field takes one of the values
 846    </para>
 847
 848    <screen>
 849     PROTO_Z3950
 850     PROTO_SR
 851    </screen>
 852
 853    <para>
 854     If you don't care about talking to SR-based implementations (few
 855     exist, and they may become fewer still if and when the ISO SR and ANSI
 856     Z39.50 documents are merged into a single standard), you can ignore
 857     this field on incoming packages, and always set it to PROTO_Z3950
 858     for outgoing packages.
 859    </para>
 860    <para>
 861
 862     The oclass field takes one of the values
 863    </para>
 864
 865    <screen>
 866     CLASS_APPCTX
 867     CLASS_ABSYN
 868     CLASS_ATTSET
 869     CLASS_TRANSYN
 870     CLASS_DIAGSET
 871     CLASS_RECSYN
 872     CLASS_RESFORM
 873     CLASS_ACCFORM
 874     CLASS_EXTSERV
 875     CLASS_USERINFO
 876     CLASS_ELEMSPEC
 877     CLASS_VARSET
 878     CLASS_SCHEMA
 879     CLASS_TAGSET
 880     CLASS_GENERAL
 881    </screen>
 882
 883    <para>
 884     corresponding to the OID classes defined by the Z39.50 standard.
 885
 886     Finally, the value field takes one of the values
 887    </para>
 888
 889    <screen>
 890     VAL_APDU
 891     VAL_BER
 892     VAL_BASIC_CTX
 893     VAL_BIB1
 894     VAL_EXP1
 895     VAL_EXT1
 896     VAL_CCL1
 897     VAL_GILS
 898     VAL_WAIS
 899     VAL_STAS
 900     VAL_DIAG1
 901     VAL_ISO2709
 902     VAL_UNIMARC
 903     VAL_INTERMARC
 904     VAL_CCF
 905     VAL_USMARC
 906     VAL_UKMARC
 907     VAL_NORMARC
 908     VAL_LIBRISMARC
 909     VAL_DANMARC
 910     VAL_FINMARC
 911     VAL_MAB
 912     VAL_CANMARC
 913     VAL_SBN
 914     VAL_PICAMARC
 915     VAL_AUSMARC
 916     VAL_IBERMARC
 917     VAL_EXPLAIN
 918     VAL_SUTRS
 919     VAL_OPAC
 920     VAL_SUMMARY
 921     VAL_GRS0
 922     VAL_GRS1
 923     VAL_EXTENDED
 924     VAL_RESOURCE1
 925     VAL_RESOURCE2
 926     VAL_PROMPT1
 927     VAL_DES1
 928     VAL_KRB1
 929     VAL_PRESSET
 930     VAL_PQUERY
 931     VAL_PCQUERY
 932     VAL_ITEMORDER
 933     VAL_DBUPDATE
 934     VAL_EXPORTSPEC
 935     VAL_EXPORTINV
 936     VAL_NONE
 937     VAL_SETM
 938     VAL_SETG
 939     VAL_VAR1
 940     VAL_ESPEC1
 941    </screen>
 942
 943    <para>
 944     again, corresponding to the specific OIDs defined by the standard.
 945    </para>
 946
 947    <para>
 948     The desc field contains a brief, mnemonic name for the OID in question.
 949    </para>
 950
 951    <para>
 952     The function
 953    </para>
 954
 955    <screen>
 956     struct oident *oid_getentbyoid(int *o);
 957    </screen>
 958
 959    <para>
 960     takes as argument an OID, and returns a pointer to a static area
 961     containing an <literal>oident</literal> structure. You typically use
 962     this function when you receive a PDU containing an OID, and you wish
 963     to branch out depending on the specific OID value.
 964    </para>
 965
 966    <para>
 967     The function
 968    </para>
 969
 970    <screen>
 971     int *oid_ent_to_oid(struct oident *ent, int *dst);
 972    </screen>
 973
 974    <para>
    takes as argument an <literal>oident</literal> structure - in which
    the <literal>proto</literal>, <literal>oclass</literal>, and
    <literal>value</literal> fields are assumed to be set correctly -
    and returns a pointer to the buffer given by <literal>dst</literal>,
    containing the base
    representation of the corresponding OID. The function returns
    NULL, and the array <literal>dst</literal> is left unchanged, if
    the mapping could not take place.
 982     The array <literal>dst</literal> should be at least of size
 983     <literal>OID_SIZE</literal>.
 984    </para>
 985    <para>
 986
 987     The <function>oid_ent_to_oid()</function> function can be used whenever
 988     you need to prepare a PDU containing one or more OIDs. The separation of
 989     the <literal>protocol</literal> element from the remainder of the
 990     OID-description makes it simple to write applications that can
 991     communicate with either Z39.50 or OSI SR-based applications.
 992    </para>
 993
 994    <para>
 995     The function
 996    </para>
 997
 998    <screen>
 999     oid_value oid_getvalbyname(const char *name);
1000    </screen>
1001
1002    <para>
1003     takes as argument a mnemonic OID name, and returns the
    <literal>value</literal> field of the first entry in the database that
1005     contains the given name in its <literal>desc</literal> field.
1006    </para>
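   <para>
    The two lookup directions described above (by OID and by mnemonic
    name) can be pictured with a small table. This is only a sketch
    under stated assumptions: the table holds three well-known Z39.50
    OIDs, the structure <literal>my_ent</literal> and the functions
    <literal>my_getentbyoid</literal> and
    <literal>my_getentbyname</literal> are hypothetical, the mnemonic
    strings are illustrative, and names are matched with an exact,
    case-sensitive comparison, which may differ from the real module.
   </para>
   <screen><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, much-reduced stand-in for the oident database. */
struct my_ent {
    int oid[8];         /* -1 terminated */
    const char *desc;   /* mnemonic name (illustrative spelling) */
};

static const struct my_ent my_table[] = {
    { {1, 2, 840, 10003, 3, 1, -1},   "Bib-1"  },  /* attribute set */
    { {1, 2, 840, 10003, 5, 10, -1},  "USmarc" },  /* record syntax */
    { {1, 2, 840, 10003, 5, 101, -1}, "SUTRS"  }   /* record syntax */
};
#define MY_TABLE_LEN (sizeof(my_table) / sizeof(my_table[0]))

/* Returns 1 if both -1 terminated arrays hold the same arcs. */
static int my_oid_equal(const int *a, const int *b)
{
    assert(a && b);
    while (*a == *b)
    {
        if (*a == -1)
            return 1;
        a++;
        b++;
    }
    return 0;
}

/* Lookup by OID; NULL if the OID is not in the table. */
static const struct my_ent *my_getentbyoid(const int *o)
{
    size_t i;

    for (i = 0; i < MY_TABLE_LEN; i++)
        if (my_oid_equal(my_table[i].oid, o))
            return &my_table[i];
    return NULL;
}

/* Lookup by mnemonic name; returns the first matching entry. */
static const struct my_ent *my_getentbyname(const char *name)
{
    size_t i;

    for (i = 0; i < MY_TABLE_LEN; i++)
        if (strcmp(my_table[i].desc, name) == 0)
            return &my_table[i];
    return NULL;
}
```
]]></screen>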
1007
1008    <para>
1009     Finally, the module provides the following utility functions, whose
1010     meaning should be obvious:
1011    </para>
1012
1013    <screen>
1014     void oid_oidcpy(int *t, int *s);
1015     void oid_oidcat(int *t, int *s);
1016     int oid_oidcmp(int *o1, int *o2);
1017     int oid_oidlen(int *o);
1018    </screen>
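   <para>
    For readers who prefer to see the semantics spelled out, here is a
    sketch of how such helpers behave on -1 terminated arrays, assuming
    they mirror <function>strcpy()</function>,
    <function>strcat()</function>, <function>strcmp()</function> and
    <function>strlen()</function>. The <literal>my_</literal> functions
    are hypothetical re-implementations, not the library code.
   </para>
   <screen><![CDATA[
```c
#include <assert.h>

/* Number of arcs in o, not counting the -1 terminator. */
static int my_oidlen(const int *o)
{
    int n = 0;

    assert(o);
    while (o[n] >= 0)
        n++;
    return n;
}

/* Copies s, terminator included, into t. t must be large enough. */
static void my_oidcpy(int *t, const int *s)
{
    assert(t && s);
    while ((*t++ = *s++) >= 0)
        ;
}

/* Appends s to the end of t. t must have room for both. */
static void my_oidcat(int *t, const int *s)
{
    my_oidcpy(t + my_oidlen(t), s);
}

/* Returns 0 if equal, otherwise 1 or -1 depending on the first
   position at which the arrays differ. */
static int my_oidcmp(const int *a, const int *b)
{
    assert(a && b);
    while (*a == *b && *a >= 0)
    {
        a++;
        b++;
    }
    if (*a == *b)
        return 0;
    return *a > *b ? 1 : -1;
}
```
]]></screen>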
1019
1020    <note>
1021     <para>
1022      The OID module has been criticized - and perhaps rightly so
1023      - for needlessly abstracting the
1024      representation of OIDs. Other toolkits use a simple
1025      string-representation of OIDs with good results. In practice, we have
1026      found the interface comfortable and quick to work with, and it is a
1027      simple matter (for what it's worth) to create applications compatible
1028      with both ISO SR and Z39.50. Finally, the use of the
     <literal>oident</literal> database is by no means mandatory.
1030      You can easily create your own system for representing OIDs, as long
1031      as it is compatible with the low-level integer-array representation
1032      of the ODR module.
1033     </para>
1034    </note>
1035
1036   </sect1>
1037
1038   <sect1 id="tools.nmem"><title>Nibble Memory</title>
1039
1040    <para>
1041     Sometimes when you need to allocate and construct a large,
1042     interconnected complex of structures, it can be a bit of a pain to
1043     release the associated memory again. For the structures describing the
1044     Z39.50 PDUs and related structures, it is convenient to use the
1045     memory-management system of the &odr; subsystem (see
1046     <link linkend="odr-use">Using ODR</link>). However, in some circumstances
1047     where you might otherwise benefit from using a simple nibble memory
1048     management system, it may be impractical to use
1049     <function>odr_malloc()</function> and <function>odr_reset()</function>.
1050     For this purpose, the memory manager which also supports the &odr;
1051     streams is made available in the NMEM module. The external interface
1052     to this module is given in the <filename>nmem.h</filename> file.
1053    </para>
1054
1055    <para>
1056     The following prototypes are given:
1057    </para>
1058
1059    <screen>
1060     NMEM nmem_create(void);
1061     void nmem_destroy(NMEM n);
1062     void *nmem_malloc(NMEM n, int size);
1063     void nmem_reset(NMEM n);
1064     int nmem_total(NMEM n);
1065     void nmem_init(void);
1066     void nmem_exit(void);
1067    </screen>
1068
1069    <para>
1070     The <function>nmem_create()</function> function returns a pointer to a
1071     memory control handle, which can be released again by
1072     <function>nmem_destroy()</function> when no longer needed.
1073     The function <function>nmem_malloc()</function> allocates a block of
1074     memory of the requested size. A call to <function>nmem_reset()</function>
1075     or <function>nmem_destroy()</function> will release all memory allocated
1076     on the handle since it was created (or since the last call to
    <function>nmem_reset()</function>). The function
1078     <function>nmem_total()</function> returns the number of bytes currently
1079     allocated on the handle.
1080    </para>
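   <para>
    The idea behind this interface can be sketched as a simple block
    allocator. The code below is not the NMEM implementation: all
    <literal>my_</literal> names and the block size are hypothetical,
    there is no locking, and a reset simply frees the blocks, whereas a
    real implementation may well keep them for reuse.
   </para>
   <screen><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MY_BLOCK_SIZE 4096   /* hypothetical default block size */

union my_align { long l; double d; void *p; };

struct my_block {
    struct my_block *next;
    char *buf;               /* payload, obtained from malloc() */
    size_t size, used;
};

typedef struct my_nmem {
    struct my_block *blocks; /* newest block first */
    size_t total;            /* bytes requested since create/reset */
} *MY_NMEM;

static MY_NMEM my_nmem_create(void)
{
    MY_NMEM n = (MY_NMEM) malloc(sizeof(*n));

    if (n)
    {
        n->blocks = NULL;
        n->total = 0;
    }
    return n;
}

/* Carves size bytes out of the newest block, adding a block if needed. */
static void *my_nmem_malloc(MY_NMEM n, size_t size)
{
    const size_t align = sizeof(union my_align);
    size_t padded = (size + align - 1) / align * align;
    struct my_block *b;
    void *p;

    assert(n);
    b = n->blocks;
    if (!b || b->size - b->used < padded)
    {
        size_t bs = padded > MY_BLOCK_SIZE ? padded : MY_BLOCK_SIZE;

        b = (struct my_block *) malloc(sizeof(*b));
        if (!b)
            return NULL;
        b->buf = (char *) malloc(bs);
        if (!b->buf)
        {
            free(b);
            return NULL;
        }
        b->size = bs;
        b->used = 0;
        b->next = n->blocks;
        n->blocks = b;
    }
    p = b->buf + b->used;
    b->used += padded;       /* padding keeps later chunks aligned */
    n->total += size;
    return p;
}

/* Releases everything allocated on the handle; the handle stays valid. */
static void my_nmem_reset(MY_NMEM n)
{
    struct my_block *b;

    assert(n);
    while ((b = n->blocks) != NULL)
    {
        n->blocks = b->next;
        free(b->buf);
        free(b);
    }
    n->total = 0;
}

static size_t my_nmem_total(MY_NMEM n)
{
    assert(n);
    return n->total;
}

static void my_nmem_destroy(MY_NMEM n)
{
    if (!n)
        return;
    my_nmem_reset(n);
    free(n);
}
```
]]></screen>
   <para>
    The point of the design is that individual allocations are never
    freed one by one; a whole interconnected structure is released with
    a single reset or destroy call on its handle.
   </para>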
1081
1082    <para>
    The nibble memory pool is shared amongst threads. POSIX
    mutexes and WIN32 critical sections are used to keep the
    module thread-safe. The function <function>nmem_init()</function>
    initializes the nibble memory library, and it is called automatically
    the first time the <literal>YAZ.DLL</literal> is loaded. &yaz; uses
    the function <function>DllMain</function> to achieve this. You should
    <emphasis>not</emphasis> call <function>nmem_init</function> or
    <function>nmem_exit</function> unless you're absolutely sure what
    you're doing. Note that in previous &yaz; versions you had to call
    <function>nmem_init</function> yourself.
1093    </para>
1094
1095   </sect1>
1096  </chapter>
1097
1098  <!-- Keep this comment at the end of the file
1099  Local variables:
1100  mode: sgml
1101  sgml-omittag:t
1102  sgml-shorttag:t
1103  sgml-minimize-attributes:nil
1104  sgml-always-quote-attributes:t
1105  sgml-indent-step:1
1106  sgml-indent-data:t
1107  sgml-parent-document: "yaz.xml"
1108  sgml-local-catalogs: nil
1109  sgml-namecase-general:t
1110  End:
1111  -->