-# $Id: ZOOM.pod,v 1.22 2005-12-13 16:46:59 mike Exp $
+# $Id: ZOOM.pod,v 1.34 2006-04-12 08:49:20 mike Exp $
use strict;
use warnings;
use ZOOM;
eval {
- $conn = new ZOOM::Connection($host, $port)
+ $conn = new ZOOM::Connection($host, $port,
+ databaseName => "mydb");
$conn->option(preferredRecordSyntax => "usmarc");
$rs = $conn->search_pqf('@attr 1=4 dinosaur');
$n = $rs->size();
C<ZOOM::ScanSet>
and
C<ZOOM::Package>.
-Of these, the Query class is abstract, and has two concrete
+Of these, the Query class is abstract, and has three concrete
subclasses:
-C<ZOOM::Query::CQL>
+C<ZOOM::Query::CQL>,
+C<ZOOM::Query::PQF>
and
-C<ZOOM::Query::PQF>.
+C<ZOOM::Query::CQL2RPN>.
+Finally, it also provides a
+C<ZOOM::Log>
+module which supplies a useful general-purpose logging facility.
Many useful ZOOM applications can be built using only the Connection,
ResultSet, Record and Exception classes, as in the example
code-snippet above.
=head2 ZOOM::event()
-B<Warning.>
-Lark's vomit. Do not read this section.
-
- $which = ZOOM::event([ $conn1, $conn2, $conn3 ]);
+ $connsRef = [ $conn1, $conn2, $conn3 ];
+ $which = ZOOM::event($connsRef);
+ $ev = $connsRef->[$which-1]->last_event()
+ if ($which != 0);
Used only in complex asynchronous applications, this function takes a
reference to a list of Connection objects, waits until an event
into the list; 0 is returned if no event occurs within the longest
timeout specified by the C<timeout> options of all the connections.
-B<Warning.>
-This function is not yet implemented.
+See the section below on asynchronous applications.
=head1 CLASSES
$conn = new ZOOM::Connection("indexdata.dk:210/gils");
$conn = new ZOOM::Connection("tcp:indexdata.dk:210/gils");
$conn = new ZOOM::Connection("http:indexdata.dk:210/gils");
+ $conn = new ZOOM::Connection("indexdata.dk", 210,
+ databaseName => "mydb",
+ preferredRecordSyntax => "marc");
Creates a new Connection object, and immediately connects it to the
specified server. If you want to make a new Connection object but
number of the Z39.50 server to connect to; in the latter case, the
single argument is a YAZ service-specifier string of the form
+When the two-argument form is used (which may be done using a vacuous
+second argument of zero), any number of additional argument pairs may
+be provided, which are interpreted as key-value pairs to be set as
+options after the Connection object is created but before it is
+connected to the server. This is a convenient way to set options,
+including those that must be set before connecting such as
+authentication tokens.
+
=over 4
=item
of records resulting from the search.
Since queries using PQF (Prefix Query Format) are so common, we make
-them a special case by providing a C<search_prefix()> method. This is
+them a special case by providing a C<search_pqf()> method. This is
identical to C<search()> except that it accepts a string containing
the query rather than an object, thereby obviating the need to create
a C<ZOOM::Query::PQF> object. See the documentation of that class for
information about PQF.
-=head4 scan()
+=head4 scan() / scan_pqf()
+
+ $ss = $conn->scan(new ZOOM::Query::CQL('title=dinosaur'));
+ # The next two lines are equivalent
+ $ss = $conn->scan(new ZOOM::Query::PQF('@attr 1=4 dinosaur'));
+ $ss = $conn->scan_pqf('@attr 1=4 dinosaur');
Many Z39.50 servers allow you to browse their indexes to find terms to
search for. This is done using the C<scan> method, which creates and
four words ``Back'', ``Empire'', ``Strikes'' and ``The'', interleaved
with words from other titles in the same index.
-All of this is done by using a single term from the PQF query as the
-C<scan()> argument. (At present, only PQF is supported, although
-there is no reason in principle why CQL and other query syntaxes
-should not be supported in future). The attributes associated with
+All of this is done by using a Query object representing a query of a
+single term as the C<scan()> argument. The attributes associated with
the term indicate which index is to be used, and the term itself
indicates the point in the index at which to start the scan. For
-example, if the argument is C<@attr 1=4 fish>, then
+example, if the argument is the query C<@attr 1=4 fish>, then
=over 4
but overriding this can be useful to get a high-level overview of the
index.
+Since scans using PQF (Prefix Query Format) are so common, we make
+them a special case by providing a C<scan_pqf()> method. This is
+identical to C<scan()> except that it accepts a string containing the
+query rather than an object, thereby obviating the need to create a
+C<ZOOM::Query::PQF> object.
+
=back
=head4 package()
Returns a C<ZOOM::Event> enumerated value indicating the type of the
last event that occurred on the connection. This is used only in
-complex asynchronous applications - see the section below on
-C<ZOOM::Event> for more information.
-
-B<Warning.>
-This method has not been tested.
+complex asynchronous applications - see the sections below on the
+C<ZOOM::Event> enumeration and asynchronous applications.
=head4 destroy()
There is no C<new()> method nor any other explicit constructor. The
only way to create a new ResultSet is by using C<search()> (or
-C<search_prefix()>) on a Connection.
+C<search_pqf()>) on a Connection.
See the description of the C<Result Set> class in the ZOOM Abstract
API at
=head4 render()
- print $rec->render()
+ print $rec->render();
+ print $rec->render("charset=latin1,utf8");
Returns a human-readable representation of the record. Beyond that,
no promises are made: careful programs should not make assumptions
about the format of the returned string.
+If the optional argument is provided, then it is interpreted as in the
+C<get()> method (q.v.)
+
This method is useful mostly for debugging.
=head4 raw()
- use MARC::Record
+ use MARC::Record;
$raw = $rec->raw();
$marc = new_from_usmarc MARC::Record($raw);
+ $trans = $rec->raw("charset=latin1,utf8");
Returns an opaque blob of data that is the raw form of the record.
Exactly what this is, and what you can do with it, varies depending on
module; GRS-1 record will be ... gosh, what an interesting question.
But no-one uses GRS-1 any more, do they?
+If the optional argument is provided, then it is interpreted as in the
+C<get()> method (q.v.)
+
+=head4 get()
+
+ $raw = $rec->get("raw");
+ $rendered = $rec->get("render");
+ $trans = $rec->get("render;charset=latin1,utf8");
+ $trans = $rec->get("render", "charset=latin1,utf8");
+
+This is the underlying method used by C<render()> and C<raw()>, and
+which in turn delegates to the C<ZOOM_record_get()> function of the
+underlying ZOOM-C library. Most applications will find it more
+natural to work with C<render()> and C<raw()>.
+
+C<get()> may be called with either one or two arguments. The
+two-argument form is syntactic sugar: the two arguments are simply
+joined with a semicolon to make a single argument, so the third and
+fourth example invocations above are equivalent. The second argument
+(or portion of the first argument following the semicolon) is used in
+the C<type> argument of C<ZOOM_record_get()>, as described in
+http://www.indexdata.com/yaz/doc/zoom.records.tkl
+This is useful primarily for invoking the character-set transformation
+- in the examples above, from ISO Latin-1 to UTF-8 Unicode.
+
=head4 clone() / destroy()
$rec = $rs->record($i);
finally, destroy the package.
Package options are listed at
-http://indexdata.com/yaz/doc/zoom.ext.html
+http://indexdata.com/yaz/doc/zoom.ext.tkl
The particular options that have meaning are determined by the
top-level operation string specified as the argument to C<send()>.
=head4 send()
- $p->send("createdb");
+ $p->send("create");
Sends a package to the server associated with the Connection that
created it. Problems are reported by throwing an exception. The
C<search()> on a Connection; because PQF is such a common special
case, the shortcut Connection method C<search_pqf()> is provided.
-The following Query subclasses are provided, both of the providing the
+The following Query subclasses are provided, each providing the
same set of methods described below:
=over 4
and in a slight out-of-date but nevertheless useful tutorial at
http://zing.z3950.org/cql/intro.html
+=item ZOOM::Query::CQL2RPN
+
+Implements CQL by compiling it on the client-side into a Z39.50
+Type-1 (RPN) query, and sending that. This provides essentially the
+same functionality as C<ZOOM::Query::CQL>, but it will work against
+any standard Z39.50 server rather than only against the small subset
+that support CQL natively. The drawback is that, because the
+compilation is done on the client side, a configuration file is
+required to direct the mapping of CQL constructs such as index names,
+relations and modifiers into Type-1 query attributes. An example CQL
+configuration file is included in the ZOOM-Perl distribution, in the
+file C<samples/cql/pqf.properties>.
+
=back
See the description of the C<Query> class in the ZOOM Abstract
=head4 new()
- $q = new ZOOM::Query::CQL('title=dinosaur'));
- $q = new ZOOM::Query::PQF('@attr 1=4 dinosaur'));
+ $q = new ZOOM::Query::CQL('title=dinosaur');
+ $q = new ZOOM::Query::PQF('@attr 1=4 dinosaur');
Creates a new query object, compiling the query passed as its argument
according to the rules of the particular query-type being
instantiated. If compilation fails, an exception is thrown.
Otherwise, the query may be passed to the C<Connection> method
-<search()>.
+C<search()>.
+
+ $conn->option(cqlfile => "samples/cql/pqf.properties");
+ $q = new ZOOM::Query::CQL2RPN('title=dinosaur', $conn);
+
+Note that for the C<ZOOM::Query::CQL2RPN> subclass, the Connection
+must also be passed into the constructor. This is used for two
+purposes: first, its C<cqlfile> option is used to find the CQL
+configuration file that directs the translations into RPN; and second,
+if compilation fails, then diagnostic information is cached in the
+Connection and can be retrieved using C<$conn-E<gt>errcode()> and related
+methods.
=head4 sortby()
specification language is the same as the C<yaz> sort-specification
type of the C<ResultSet> method C<sort()>, described above.
-B<It ought to be possible to sort by CQL query, too, but at present
-limitations in the underlying ZOOM-C library make this impossible.>
-
=head4 destroy()
$p->destroy()
C<QUERY_PQF>,
C<SORTBY>,
C<CLONE>,
-C<PACKAGE>
+C<PACKAGE>,
+C<SCANTERM>
and
-C<SCANTERM>,
+C<LOGLEVEL>,
each of which specifies a client-side error. These codes constitute
the C<ZOOM> diagnostic set.
return an indication of the last event that occurred on a particular
connection. It always returns a value drawn from this enumeration,
that is, one of C<NONE>, C<CONNECT>, C<SEND_DATA>, C<RECV_DATA>,
-C<TIMEOUT>, C<UNKNOWN>, C<SEND_APDU>, C<RECV_APDU>, C<RECV_RECORD> or
-C<RECV_SEARCH>.
+C<TIMEOUT>, C<UNKNOWN>, C<SEND_APDU>, C<RECV_APDU>, C<RECV_RECORD>,
+C<RECV_SEARCH> or C<ZEND>.
+
+See the section below on asynchronous applications.
+
+=head1 LOGGING
+
+ ZOOM::Log::init_level(ZOOM::Log::mask_str("zoom,myapp,-warn"));
+ ZOOM::Log::log("myapp", "starting up with pid ", $$);
+
+Logging facilities are provided by a set of functions in the
+C<ZOOM::Log> module. Note that C<ZOOM::Log> is not a class, and it
+is not possible to create C<ZOOM::Log> objects: the API is imperative,
+reflecting that of the underlying YAZ logging facilities. Although
+there are nine logging functions altogether, you can ignore nearly
+all of them: most applications that use logging will begin by calling
+C<mask_str()> and C<init_level()> once each, as above, and will then
+repeatedly call C<log()>.
+
+=head2 mask_str()
+
+ $level = ZOOM::Log::mask_str("zoom,myapp,-warn");
+
+Returns an integer corresponding to the log-level specified by the
+parameter. This is a string of zero or more comma-separated
+module-names, each indicating an individual module to be either added
+to the default log-level or removed from it (for those components
+prefixed by a minus-sign). The names may be those of either standard
+YAZ-logging modules such as C<fatal>, C<debug> and C<warn>, or custom
+modules such as C<myapp> in the example above. The module C<zoom>
+requests logging from the ZOOM module itself, which may be helpful for
+debugging.
+
+Note that calling this function does not in any way change the logging
+state: it merely returns a value. To change the state, this value
+must be passed to C<init_level()>.
+
+=head2 module_level()
+
+ $level = ZOOM::Log::module_level("zoom");
+ ZOOM::Log::log($level, "all systems clear: thrusters invigorated");
+
+Returns the integer corresponding to the single log-level specified as
+the parameter, or zero if that level has not been registered by a
+prior call to C<mask_str()>. Since C<log()> accepts either a numeric
+log-level or a string, there is no reason to call this function; but,
+what the heck, maybe you enjoy that kind of thing. Who are we to
+judge?
+
+=head2 init_level()
+
+ ZOOM::Log::init_level($level);
+
+Initialises the log-level to the specified integer, which is a bitmask
+of values, typically as returned from C<mask_str()>. All subsequent
+calls to C<log()> made with a log-level that matches one of the bits
+in this mask will result in a log-message being emitted. All logging
+can be turned off by calling C<init_level(0)>.
-You almost certainly don't need to know about this. Frankly, I'm not
-sure how to use it myself.
+=head2 init_prefix()
+
+ ZOOM::Log::init_prefix($0);
+
+Initialises a prefix string to be included in all log-messages.
+
+=head2 init_file()
+
+ ZOOM::Log::init_file("/tmp/myapp.log");
+
+Initialises the output file to be used for logging: subsequent
+log-messages are written to the nominated file. If this function is
+not called, log-messages are written to the standard error stream.
+
+=head2 init()
+
+ ZOOM::Log::init($level, $0, "/tmp/myapp.log");
+
+Initialises the log-level, the logging prefix and the logging output
+file in a single operation.
+
+=head2 time_format()
+
+ ZOOM::Log::time_format("%Y-%m-%d %H:%M:%S");
+
+Sets the format in which log-messages' timestamps are emitted, by
+means of a format-string like that used in the C function
+C<strftime()>. The example above emits year, month, day, hours,
+minutes and seconds in big-endian order, such that timestamps can be
+sorted lexicographically.
+
+=head2 init_max_size()
+
+(This doesn't seem to work, so I won't bother describing it.)
+
+=head2 log()
+
+ ZOOM::Log::log(8192, "reducing to warp-factor $wf");
+ ZOOM::Log::log("myapp", "starting up with pid ", $$);
+
+Provided that the first argument, log-level, is among the modules
+previously established by C<init_level()>, this function emits a
+log-message made up of a timestamp, the prefix supplied to
+C<init_prefix()>, if any, and the concatenation of all arguments after
+the first. The message is written to the standard error stream, or
+to the file previously specified by C<init_file()> if this has been
+called.
+
+The log-level argument may be either a numeric value, as returned from
+C<module_level()>, or a string containing the module name.
+
+=head1 ASYNCHRONOUS APPLICATIONS
+
+Although asynchronous applications are conceptually complex, the ZOOM
+support for them is provided through a very simple interface,
+consisting of one option (C<async>), one function (C<ZOOM::event()>),
+one Connection method (C<last_event()>) and one enumeration
+(C<ZOOM::Event>).
+
+The approach is as follows:
+
+=over 4
+
+=item Initialisation
+
+Create several connections to the various servers, each of them having
+the option C<async> set, and with whatever additional options are
+required - e.g. the piggyback retrieval record-count can be set so
+that records will be returned in search responses.
+
+=item Operations
+
+Send searches to the connections, request records, etc.
+
+=item Event harvesting
+
+Repeatedly call C<ZOOM::event()> to discover what responses are being
+received from the servers. Each time this function returns, it
+indicates which of the connections has fired; this connection can then
+be interrogated with the C<last_event()> method to discover what event
+has occurred, and the return value - an element of the C<ZOOM::Event>
+enumeration - can be tested to determine what to do next. For
+example, the C<ZEND> event indicates that no further operations are
+outstanding on the connection, so any fetched records can now be
+immediately obtained.
+
+=back
+
+Here is a very short program (omitting all error-checking!) which
+demonstrates this process. It parallel-searches three servers (or more
+if you add them to the list), displaying the first record in the
+result-set of each server as soon as it becomes available.
+
+ use ZOOM;
+ @servers = ('z3950.loc.gov:7090/Voyager',
+ 'bagel.indexdata.com:210/gils',
+ 'agricola.nal.usda.gov:7190/Voyager');
+ for ($i = 0; $i < @servers; $i++) {
+ $z[$i] = new ZOOM::Connection($servers[$i], 0,
+ async => 1, # asynchronous mode
+ count => 1, # piggyback retrieval count
+ preferredRecordSyntax => "usmarc");
+ $r[$i] = $z[$i]->search_pqf("mineral");
+ }
+ while (($i = ZOOM::event(\@z)) != 0) {
+ $ev = $z[$i-1]->last_event();
+ print("connection ", $i-1, ": ", ZOOM::event_str($ev), "\n");
+ if ($ev == ZOOM::Event::ZEND) {
+ $size = $r[$i-1]->size();
+ print "connection ", $i-1, ": $size hits\n";
+ print $r[$i-1]->record(0)->render()
+ if $size > 0;
+ }
+ }
=head1 SEE ALSO