-<!-- $Id: tools.xml,v 1.11 2002-05-30 20:57:31 adam Exp $ -->
+<!-- $Id: tools.xml,v 1.15 2003-01-22 09:43:32 adam Exp $ -->
<chapter id="tools"><title>Supporting Tools</title>
<para>
that may be of use to you.
</para>
- <sect2><title id="PQF">Prefix Query Format</title>
+ <sect2 id="PQF"><title>Prefix Query Format</title>
<para>
Since RPN or reverse polish notation is really just a fancy way of
in simple test applications and scripting environments (like Tcl). The
demonstration client included with YAZ uses the PQF.
</para>
+
+ <note>
+ <para>
+ PQF has been adopted by other parties developing Z39.50
+ software. It is often referred to as Prefix Query Notation
+ (PQN).
+ </para>
+ </note>
<para>
- The PQF is defined by the pquery module in the YAZ library. The
- <filename>pquery.h</filename> file provides the declaration of the
- functions
+ The PQF is defined by the pquery module in the YAZ library.
+ There are two sets of functions with similar behavior. The first
+ set operates on a PQF parser handle; the second does not. The
+ first set is more flexible than the second, which is obsolete
+ and is provided only for backwards compatibility.
</para>
- <screen>
-Z_RPNQuery *p_query_rpn (ODR o, oid_proto proto, const char *qbuf);
+ <para>
+ The functions in the first set all operate on a PQF parser handle:
+ </para>
+ <synopsis>
+ #include &lt;yaz/pquery.h&gt;
-Z_AttributesPlusTerm *p_query_scan (ODR o, oid_proto proto,
- Odr_oid **attributeSetP, const char *qbuf);
+ YAZ_PQF_Parser yaz_pqf_create (void);
-int p_query_attset (const char *arg);
- </screen>
+ void yaz_pqf_destroy (YAZ_PQF_Parser p);
+
+ Z_RPNQuery *yaz_pqf_parse (YAZ_PQF_Parser p, ODR o, const char *qbuf);
+
+ Z_AttributesPlusTerm *yaz_pqf_scan (YAZ_PQF_Parser p, ODR o,
+ Odr_oid **attributeSetId, const char *qbuf);
+
+
+ int yaz_pqf_error (YAZ_PQF_Parser p, const char **msg, size_t *off);
+ </synopsis>
+ <para>
+ A PQF parser is created and destroyed by the functions
+ <function>yaz_pqf_create</function> and
+ <function>yaz_pqf_destroy</function> respectively.
+ The function <function>yaz_pqf_parse</function> parses the query
+ given by the string <literal>qbuf</literal>. If parsing was successful,
+ a Z39.50 RPN Query is returned which is created using ODR stream
+ <literal>o</literal>. If parsing failed, a NULL pointer is
+ returned.
+ The function <function>yaz_pqf_scan</function> takes a scan query in
+ <literal>qbuf</literal>. If parsing was successful, the function
+ returns a pointer to the attributes plus term and sets
+ <literal>attributeSetId</literal> to hold the attribute set for the
+ scan request; both are allocated using ODR stream <literal>o</literal>.
+ If parsing failed, <function>yaz_pqf_scan</function> returns a NULL pointer.
+ Error information for bad queries can be obtained by a call to
+ <function>yaz_pqf_error</function>, which returns an error code,
+ sets <literal>*msg</literal> to point to an error description,
+ and sets <literal>*off</literal> to the offset within the last
+ query where parsing failed.
+ </para>
+ <para>
+ The second set of functions is declared as follows:
+ </para>
+ <synopsis>
+ #include &lt;yaz/pquery.h&gt;
+
+ Z_RPNQuery *p_query_rpn (ODR o, oid_proto proto, const char *qbuf);
+
+ Z_AttributesPlusTerm *p_query_scan (ODR o, oid_proto proto,
+ Odr_oid **attributeSetP, const char *qbuf);
+
+ int p_query_attset (const char *arg);
+ </synopsis>
<para>
The function <function>p_query_rpn()</function> takes as arguments an
&odr; stream (see section <link linkend="odr">The ODR Module</link>)
<para>
If the parse went well, <function>p_query_rpn()</function> returns a
pointer to a <literal>Z_RPNQuery</literal> structure which can be
- placed directly into a <literal>Z_SearchRequest</literal>.
+ placed directly into a <literal>Z_SearchRequest</literal>.
+ If parsing failed due to a syntax error, a NULL pointer is returned.
</para>
<para>
-
The <literal>p_query_attset</literal> specifies which attribute set
to use if the query doesn't specify one by the
<literal>@attrset</literal> operator.
top-set ::= [ '@attrset' string ]
- query-struct ::= attr-spec | simple | complex
+ query-struct ::= attr-spec | simple | complex | '@term' term-type
attr-spec ::= '@attr' [ string ] string query-struct
result-set ::= '@set' string.
- term ::= string
+ term ::= string.
proximity ::= exclusion distance ordered relation which-code unit-code.
which-code ::= 'known' | 'private' | integer.
unit-code ::= integer.
+
+ term-type ::= 'general' | 'numeric' | 'string' | 'oid' | 'datetime' | 'null'.
</literallayout>
<para>
</para>
<para>
+ The @attr operator is followed by an attribute specification
+ (<literal>attr-spec</literal> above). The specification consists
+ of an optional attribute set, an attribute type-value pair and
+ a sub query. The attribute type-value pair is packed in one string:
+ an attribute type, an equals sign, and an attribute value, as in
+ <literal>1=4</literal>.
+ The type is always an integer but the value may be either an
+ integer or a string (if it doesn't start with a digit character).
+ </para>
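+ <para>
+ As an illustration of the packed type-value pair, the following is
+ a minimal sketch of how such a string could be split, following the
+ type=value form used in the example queries in this section (for
+ instance 1=4 and 1=/book/title). The names
+ <literal>struct attr_pair</literal> and
+ <literal>split_attr_pair</literal> are hypothetical helpers written
+ for this example; they are not part of the YAZ library, and the
+ real PQF parser may handle edge cases differently.
+ </para>

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for this sketch; not part of YAZ. */
struct attr_pair {
    int type;              /* attribute type, always an integer */
    int value_is_numeric;  /* 1 if the value starts with a digit */
    long numeric_value;    /* meaningful only when value_is_numeric is 1 */
    const char *value;     /* points at the value text inside the input */
};

/*
 * Split a packed pair such as "1=4" or "1=/book/title".
 * Returns 0 on success, -1 if the input is not of the form type=value.
 */
int split_attr_pair(const char *packed, struct attr_pair *out)
{
    const char *eq;
    char *end;
    long type;

    if (!packed || !out || !isdigit((unsigned char) packed[0]))
        return -1;

    eq = strchr(packed, '=');
    if (!eq || eq[1] == '\0')
        return -1;

    /* The type must consist of digits only, right up to the '='. */
    type = strtol(packed, &end, 10);
    if (end != eq)
        return -1;

    out->type = (int) type;
    out->value = eq + 1;
    /* Per the text: the value is a string if it does not start with a digit. */
    out->value_is_numeric = isdigit((unsigned char) out->value[0]) ? 1 : 0;
    out->numeric_value =
        out->value_is_numeric ? strtol(out->value, NULL, 10) : 0;
    return 0;
}
```

+ <para>
+ With this sketch, 1=4 yields type 1 with the numeric value 4, while
+ 1=/book/title yields type 1 with the string value /book/title.
+ </para>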
+
+ <para>
+ Z39.50 version 3 defines various encodings of terms.
+ Use the @term operator to indicate the encoding type:
+ <literal>general</literal>, <literal>numeric</literal>,
+ <literal>string</literal> (for InternationalString), and so on.
+ If no term type is given, the <literal>general</literal> form
+ is used, which is the only encoding allowed in both version 2 and
+ version 3 of the Z39.50 standard.
+ </para>
+
+ <para>
The following are all examples of valid queries in the PQF.
</para>
@or @and bob dylan @set Result-1
+ @attr 1=4 computer
+
@attr 4=1 @and @attr 1=1 "bob dylan" @attr 1=4 "slow train coming"
@attr 4=1 @attr 1=4 "self portrait"
@prox 0 3 1 2 k 2 dylan zimmerman
@and @attr 2=4 @attr gils 1=2038 -114 @attr 2=2 @attr gils 1=2039 -109
+
+ @term string "a UTF-8 string, maybe?"
+
+ @attr 1=/book/title computer
</screen>
</sect2>
- <sect2><title id="CCL">Common Command Language</title>
+ <sect2 id="CCL"><title>Common Command Language</title>
<para>
Not all users enjoy typing in prefix query structures and numerical
</para>
</sect3>
</sect2>
+ <sect2 id="tools.cql"><title>CQL</title>
+ <para>
+ <ulink url="http://www.loc.gov/z3950/agency/zing/cql/">CQL</ulink>
+ - Common Query Language - was defined for the
+ <ulink url="http://www.loc.gov/z3950/agency/zing/srw/">SRW</ulink>
+ protocol.
+ In many ways CQL has a similar syntax to CCL,
+ but its objective is different. Where CCL aims to be
+ an end-user language, CQL is <emphasis>the</emphasis> protocol
+ query language for SRW. Unlike PQF (Z39.50 Type-1), CQL is easy
+ to read.
+ </para>
+ <tip>
+ <para>
+ If you are new to CQL, read the
+ <ulink url="http://zing.z3950.org/cql/intro.html">Gentle
+ Introduction</ulink>.
+ </para>
+ </tip>
+ <para>
+ The CQL parser in &yaz; provides the following:
+ <itemizedlist>
+ <listitem>
+ <para>
+ It parses and validates a CQL query.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ It generates a C structure that allows you to convert
+ a CQL query to some other query language, such as SQL.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The parser converts a valid CQL query to PQF, thus providing a
+ way to use CQL for both SRW/SRU servers and Z39.50 targets at the
+ same time.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The parser converts CQL to
+ <ulink url="http://www.loc.gov/z3950/agency/zing/cql/xcql.html">
+ XCQL</ulink>.
+ XCQL is an XML representation of CQL.
+ XCQL is part of the SRW specification. However, since SRU
+ supports CQL only, we don't expect XCQL to be widely used.
+ Furthermore, CQL has the advantage over XCQL that it is
+ easy to read.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ <sect3 id="tools.cql.parsing"><title>CQL parsing</title>
+ <para>
+ A CQL parser is represented by the <literal>CQL_parser</literal>
+ handle. Its contents should be considered &yaz; internal (private).
+ <synopsis>
+#include &lt;yaz/cql.h&gt;
+
+typedef struct cql_parser *CQL_parser;
+
+CQL_parser cql_parser_create(void);
+void cql_parser_destroy(CQL_parser cp);
+
+int cql_parser_string(CQL_parser cp, const char *str);
+ </synopsis>
+ A parser is created by <function>cql_parser_create</function> and
+ is destroyed by <function>cql_parser_destroy</function>.
+ </para>
+ <para>
+ A CQL query is parsed by the function
+ <function>cql_parser_string</function>,
+ which takes a query <parameter>str</parameter>.
+ If the query was valid (no syntax errors), then zero is returned;
+ otherwise a non-zero error code is returned.
+ </para>
+ <para>
+ <synopsis>
+int cql_parser_stream(CQL_parser cp,
+ int (*getbyte)(void *client_data),
+ void (*ungetbyte)(int b, void *client_data),
+ void *client_data);
+
+int cql_parser_stdio(CQL_parser cp, FILE *f);
+ </synopsis>
+ The functions <function>cql_parser_stream</function> and
+ <function>cql_parser_stdio</function> parse a CQL query,
+ just like <function>cql_parser_string</function>.
+ The only difference is that the CQL query can be
+ fed to the parser in different ways.
+ The <function>cql_parser_stream</function> uses a generic
+ byte stream as input. The <function>cql_parser_stdio</function>
+ uses a <literal>FILE</literal> handle which is opened for reading.
+ </para>
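+ <para>
+ To illustrate the generic byte stream interface, here is a minimal
+ sketch of a callback pair that reads from a string held in memory.
+ The callback signatures follow the synopsis above; the names
+ <literal>struct string_source</literal>,
+ <literal>string_getbyte</literal> and
+ <literal>string_ungetbyte</literal> are hypothetical and not part
+ of &yaz;. The sketch also assumes that a return value of 0 from
+ the read callback marks the end of the input, which is not stated
+ in this section.
+ </para>

```c
#include <stddef.h>

/* Hypothetical client_data for this sketch: a cursor over a C string. */
struct string_source {
    const char *str;  /* NUL-terminated query text */
    size_t off;       /* index of the next byte to hand out */
};

/*
 * Has the shape int (*getbyte)(void *client_data).
 * Assumption of this sketch: 0 signals end of input.
 */
int string_getbyte(void *client_data)
{
    struct string_source *src = client_data;

    if (src->str[src->off] != '\0')
        return (unsigned char) src->str[src->off++];
    return 0;
}

/*
 * Has the shape void (*ungetbyte)(int b, void *client_data).
 * Steps the cursor back by one byte. Nothing is pushed back for the
 * end-of-input marker, because the cursor never moved past it.
 */
void string_ungetbyte(int b, void *client_data)
{
    struct string_source *src = client_data;

    if (b != 0 && src->off > 0)
        src->off--;
}
```

+ <para>
+ Given a <literal>struct string_source</literal> variable
+ <literal>src</literal>, the pair would be handed to the parser as
+ <literal>cql_parser_stream(cp, string_getbyte, string_ungetbyte, &amp;src)</literal>.
+ </para>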
+ </sect3>
+ <sect3 id="tools.cql.tree"><title>CQL tree</title>
+ <para>
+ We now turn to the tree representation of a valid CQL query.
+ <synopsis>
+#define CQL_NODE_ST 1
+#define CQL_NODE_BOOL 2
+#define CQL_NODE_MOD 3
+struct cql_node {
+ int which;
+ union {
+ struct {
+ char *index;
+ char *term;
+ char *relation;
+ struct cql_node *modifiers;
+ struct cql_node *prefixes;
+ } st;
+ struct {
+ char *value;
+ struct cql_node *left;
+ struct cql_node *right;
+ struct cql_node *modifiers;
+ struct cql_node *prefixes;
+ } bool;
+ struct {
+ char *name;
+ char *value;
+ struct cql_node *next;
+ } mod;
+ } u;
+};
+ </synopsis>
+ There are three kinds of nodes: search term (ST), boolean (BOOL),
+ and modifier (MOD).
+ </para>
+ <para>
+ The search term node has five members:
+ <itemizedlist>
+ <listitem>
+ <para>
+ <literal>index</literal>: index for search term.
+ If an index is unspecified for a search term,
+ <literal>index</literal> will be NULL.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>term</literal>: the search term itself.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>relation</literal>: relation for search term.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>modifiers</literal>: relation modifiers for search
+ term. The <literal>modifiers</literal> member is a simple linked
+ list (NULL for last entry). Each relation modifier node
+ is of type <literal>MOD</literal>.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>prefixes</literal>: index prefixes for search
+ term. The <literal>prefixes</literal> member is a simple linked
+ list (NULL for last entry). Each prefix node
+ is of type <literal>MOD</literal>.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ The boolean node represents <literal>and</literal>,
+ <literal>or</literal> and <literal>not</literal>, as well as
+ proximity.
+ <itemizedlist>
+ <listitem>
+ <para>
+ <literal>left</literal> and <literal>right</literal>: the left
+ and right operands, respectively.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>modifiers</literal>: proximity arguments.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>prefixes</literal>: index prefixes.
+ The <literal>prefixes</literal> member is a simple linked
+ list (NULL for last entry). Each prefix node
+ is of type <literal>MOD</literal>.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ The modifier node is a "utility" node used for name-value pairs,
+ such as prefixes, proximity arguments, etc.
+ <itemizedlist>
+ <listitem>
+ <para>
+ <literal>name</literal>: name of the mod node.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>value</literal>: value of the mod node.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>next</literal>: pointer to next node which is
+ always a mod node (NULL for last entry).
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
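+ <para>
+ The three node kinds can be traversed with a short recursive
+ walk. The following is a minimal sketch, not code from &yaz;: it
+ declares a local copy of the structure shown above so that it
+ compiles on its own, and it names the boolean member
+ <literal>boolean</literal> rather than <literal>bool</literal>
+ (an assumption of this sketch, made so that it also builds where
+ <literal>bool</literal> is a keyword or macro). The helpers
+ <literal>count_mods</literal> and <literal>count_terms</literal>
+ are hypothetical.
+ </para>

```c
#include <stddef.h>

#define CQL_NODE_ST 1
#define CQL_NODE_BOOL 2
#define CQL_NODE_MOD 3

/* Local mirror of the structure above, for illustration only. */
struct cql_node {
    int which;
    union {
        struct {
            char *index;
            char *term;
            char *relation;
            struct cql_node *modifiers;
            struct cql_node *prefixes;
        } st;
        struct {
            char *value;
            struct cql_node *left;
            struct cql_node *right;
            struct cql_node *modifiers;
            struct cql_node *prefixes;
        } boolean;  /* renamed from "bool" for this sketch */
        struct {
            char *name;
            char *value;
            struct cql_node *next;
        } mod;
    } u;
};

/* Count the entries of a modifier or prefix list (NULL ends the list). */
size_t count_mods(const struct cql_node *m)
{
    size_t n = 0;

    for (; m && m->which == CQL_NODE_MOD; m = m->u.mod.next)
        n++;
    return n;
}

/* Count the search term nodes reachable from n through boolean nodes. */
size_t count_terms(const struct cql_node *n)
{
    if (!n)
        return 0;
    switch (n->which) {
    case CQL_NODE_ST:
        return 1;
    case CQL_NODE_BOOL:
        return count_terms(n->u.boolean.left)
            + count_terms(n->u.boolean.right);
    default:
        return 0;  /* MOD nodes carry no search terms */
    }
}
```

+ <para>
+ The same switch on <literal>which</literal> is the natural place
+ to emit output when converting a tree to another query language.
+ </para>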
+
+ </sect3>
+ </sect2>
</sect1>
<sect1 id="tools.oid"><title>Object Identifiers</title>