X-Git-Url: http://git.indexdata.com/?a=blobdiff_plain;f=lib%2FZOOM.pod;h=337f080abab009add5ab598ffeda5c4b0216b83c;hb=5fd91c3a88f5d2d9e8f11f69ec620fba5dc15af0;hp=a57b3b96f57bdb169874c1a2245bafe33bbb51ff;hpb=400789c7f048c737c543e5c5f516bb0666cd1959;p=ZOOM-Perl-moved-to-github.git diff --git a/lib/ZOOM.pod b/lib/ZOOM.pod index a57b3b9..337f080 100644 --- a/lib/ZOOM.pod +++ b/lib/ZOOM.pod @@ -1,4 +1,4 @@ -# $Id: ZOOM.pod,v 1.9 2005-11-17 15:31:06 mike Exp $ +# $Id: ZOOM.pod,v 1.19 2005-12-13 15:30:26 mike Exp $ use strict; use warnings; @@ -247,33 +247,9 @@ Objects of the Connection, ResultSet, ScanSet and Package classes carry with them a set of named options which affect their behaviour in certain ways. See the ZOOM-C options documentation for details: -=over 4 - -=item * - Connection options are listed at http://indexdata.com/yaz/doc/zoom.tkl#zoom.connections -=item * - -ResultSet options are listed at -http://indexdata.com/yaz/doc/zoom.resultsets.tkl -I<### move this obvservation down to the appropriate place> - -=item * - -ScanSet options are listed at -http://indexdata.com/yaz/doc/zoom.scan.tkl -I<### move this obvservation down to the appropriate place> - -=item * - -Package options are listed at -http://indexdata.com/yaz/doc/zoom.ext.html -I<### move this obvservation down to the appropriate place> - -=back - These options are set and fetched using the C method, which may be called with either one or two arguments. In the two-argument form, the option named by the first argument is set to the value of @@ -353,7 +329,33 @@ titles. This takes a bit of getting used to. -I<###> discuss how the values of options affect scanning. +The behaviour of C is affected by the following options, which +may be set on the Connection through which the scan is done: + +=over 4 + +=item number [default: 10] + +Indicates how many terms should be returned in the ScanSet. 
The +number actually returned may be less, if the start-point is near the +end of the index, but will not be greater. + +=item position [default: 1] + +A 1-based index specifying where in the returned list of terms the +seed-term should appear. By default it should be the first term +returned, but C may be set, for example, to zero (requesting +the next terms I the seed-term), or to the same value as +C (requesting the index terms I the seed term). + +=item stepSize [default: 0] + +An integer indicating how many indexed terms are to be skipped between +each one returned in the ScanSet. By default, no terms are skipped, +but overriding this can be useful to get a high-level overview of the +index. + +=back =head4 package() @@ -376,11 +378,214 @@ a Connection that has been Ced. =head2 ZOOM::ResultSet -I<###> + $rs = $conn->search_pqf('@attr 1=4 mineral'); + $n = $rs->size(); + for $i (1 .. $n) { + $rec = $rs->record($i-1); + print $rec->render(); + } + +A ResultSet object represents the set of zero or more records +resulting from a search, and is the means whereby these records can be +retrieved. A ResultSet object may maintain a client-side cache of some, +none or all of the server's records: in general, this is +supposed to be an implementation detail of no interest to a typical +application, although more sophisticated applications do have +facilities for messing with the cache. Most applications will only +need the C, C and C methods. + +There is no C method nor any other explicit constructor. The +only way to create a new ResultSet is by using C (or +C) on a Connection. + +See the description of the C class in the ZOOM Abstract +API at +http://zoom.z3950.org/api/zoom-current.html#3.4 + +=head3 Methods + +=head4 option() + + $rs->option(elementSetName => "f"); + +Allows options to be set into, and read from, a ResultSet, just like +the Connection class's C method. There is no +C method for ResultSet objects. 
+ +ResultSet options are listed at +http://indexdata.com/yaz/doc/zoom.resultsets.tkl + +=head4 size() + + print "Found ", $rs->size(), " records\n"; + +Returns the number of records in the result set. + +=head4 record() / record_immediate() + + $rec = $rs->record(0); + $rec2 = $rs->record_immediate(0); + $rec3 = $rs->record_immediate(1) + or print "second record wasn't in cache\n"; + +The C method returns a C object representing +a record from the result-set, whose position is indicated by the argument +passed in. This is a zero-based index, so that legitimate values +range from zero to C<$rs->size()-1>. + +The C API is identical, but it never invokes a +network operation, merely returning the record from the ResultSet's +cache if it's already there, or an undefined value otherwise. So if +you use this method, B. + +=head4 records() + + $rs->records(0, 10, 0); + for $i (0..9) { + print $rs->record_immediate($i)->render(); + } + + @nextseven = $rs->records(10, 7, 1); + +The C method only fetches records from the cache, +whereas C fetches them from the server if they have not +already been cached; but the ZOOM module has to guess what the most +efficient strategy for this is. It might fetch each record, alone +when asked for: that's optimal in an application that's only +interested in the top hit from each search, but pessimal for one that +wants to display a whole list of results. Conversely, the software's +strategy might be always to ask for blocks of twenty records: +that's great for assembling long lists of things, but wasteful when +only one record is wanted. The problem is that the ZOOM module can't +tell, when you call C<$rs->record()>, what your intention is. + +But you can tell it. The C method fetches a sequence of +records, all in one go. 
It takes three arguments: the first is the +zero-based index of the first record in the sequence, the second is +the number of records to fetch, and the third is a boolean indication +of whether or not to return the retrieved records as well as adding +them to the cache. (You can always pass 1 for this if you like, and +Perl will discard the unused return value, but there is a small +efficiency gain to be had by passing 0.) + +Once the records have been retrieved from the server +(i.e. C has completed without throwing an exception), they +can be fetched much more efficiently using C - or +C, which is then guaranteed to succeed. + +=head4 cache_reset() + + $rs->cache_reset() + +Resets the ResultSet's record cache, so that subsequent invocations of +C will fail. I struggle to imagine a real +scenario where you'd want to do this. + +=head4 sort() + + if ($rs->sort("yaz", "1=4 >i 1=21 >s") < 0) { + die "sort failed"; + } + +Sorts the ResultSet in place (discarding any cached records, as they +will in general be sorted into a different position). There are two +arguments: the first is a string indicating the type of the +sort-specification, and the second is the specification itself. + +The C method returns 0 on success, or -1 if the +sort-specification is invalid. + +At present, the only supported sort-specification type is C. +Such a specification consists of a space-separated sequence of keys, +each of which itself consists of two space-separated words (so that +the total number of words in the sort-specification is even). The two +words making up each key are a field and a set of flags. The field +can take one of two forms: if it contains an C<=> sign, then it is a +BIB-1 I=I pair specifying which field to sort +(e.g. C<1=4> for a title sort); otherwise it is sent for the server to +interpret as best it can. 
The word of flags is made up from one or +more of the following: C<s> for case sensitive, C<i> for case +insensitive; C<E<lt>> for ascending order and C<E<gt>> for descending +order. + +For example, the sort-specification in the code-fragment above will +sort the records in C<$rs> case-insensitively in descending order of +title, with records having equivalent titles sorted case-sensitively +in descending order of subject. (The BIB-1 access points 4 and 21 +represent title and subject respectively.) + +=head4 destroy() + + $rs->destroy() + +Destroys a ResultSet object, freeing its resources. It is an error to +reuse a ResultSet that has been Ced. =head2 ZOOM::Record -I<###> + $rec = $rs->record($i); + print $rec->render(); + $raw = $rec->raw(); + $marc = new_from_usmarc MARC::Record($raw); + print "Record title is: ", $marc->title(), "\n"; + +A Record object represents a record that has been retrieved from the +server. + +There is no C method nor any other explicit constructor. The +only way to create a new Record is by using C (or +C, or C) on a ResultSet. + +In general, records are ``owned'' by the result-sets that they were +retrieved from, so they do not have to be explicitly memory-managed: +they are deallocated (and therefore can no longer be used) when the +result-set is destroyed. + +See the description of the C class in the ZOOM Abstract +API at +http://zoom.z3950.org/api/zoom-current.html#3.5 + +=head3 Methods + +=head4 render() + + print $rec->render() + +Returns a human-readable representation of the record. Beyond that, +no promises are made: careful programs should not make assumptions +about the format of the returned string. + +This method is useful mostly for debugging. + +=head4 raw() + + use MARC::Record; + $raw = $rec->raw(); + $marc = new_from_usmarc MARC::Record($raw); + +Returns an opaque blob of data that is the raw form of the record. +Exactly what this is, and what you can do with it, varies depending on +the record-syntax. 
For example, XML records will be returned as, +well, XML; MARC records will be returned as ISO 2709-encoded blocks +that can be decoded by software such as the fine C +module; GRS-1 records will be ... gosh, what an interesting question. +But no-one uses GRS-1 any more, do they? + +=head4 clone() / destroy() + + $rec = $rs->record($i); + $newrec = $rec->clone(); + $rs->destroy(); + print $newrec->render(); + $newrec->destroy(); + +Usually, it's convenient that Record objects are owned by their +ResultSets and go away when the ResultSet is destroyed; but +occasionally you need a Record to outlive its parent and destroy it +later, explicitly. To do this, C the record, keep the new +Record object that is returned, and C it when it's no +longer needed. This is B situation in which a Record needs to +be destroyed. =head2 ZOOM::Exception @@ -401,24 +606,403 @@ circumstances that do not merit throwing an exception. For this reason, the return values of these methods should be checked. See the individual methods' documentation for details. +An exception carries the following pieces of information: + +=over 4 + +=item error-code + +A numeric code that specifies the type of error. This can be checked +for equality with known values, so that intelligent applications can +take appropriate action. + +=item error-message + +A human-readable message corresponding to the code. This can be +shown to users, but its value should not be tested, as it could vary +in different versions or under different locales. + +=item additional information [optional] + +A string containing information specific to the error-code. For +example, when the error-code is the BIB-1 diagnostic 109 ("Database +unavailable"), the additional information is the name of the database +that the application tried to use. For some error-codes, there is no +additional information at all; for some others, the additional +information is undefined and may just be a human-readable string. 
+ +=item diagnostic set [optional] + +A short string specifying the diagnostic set from which the error-code +was drawn: for example, C for a ZOOM-specific error such as +C ("out of memory"), and C for a Z39.50 +error-code drawn from the BIB-1 diagnostic set. + +=back + +In theory, the error-code should be interpreted in the context of the +diagnostic set from which it is drawn; in practice, nearly all errors +are from either the ZOOM or BIB-1 diagnostic sets, and the codes in +those sets have been chosen so as not to overlap, so the diagnostic +set can usually be ignored. + +See the description of the C class in the ZOOM Abstract +API at +http://zoom.z3950.org/api/zoom-current.html#3.7 + =head3 Methods -I<###> +=head4 new() + + die new ZOOM::Exception($errcode, $errmsg, $addinfo, $diagset); + +Creates and returns a new Exception object with the specified +error-code, error-message, additional information and diagnostic set. +Applications will not in general need to use this, but may find it +useful to simulate ZOOM exceptions. As is usual with Perl, exceptions +are thrown using C. + +=head4 code() / message() / addinfo() / diagset() + + print "Error ", $@->code(), ": ", $@->message(), "\n"; + print "(addinfo '", $@->addinfo(), "', set '", $@->diagset(), "')\n"; + +These methods, of no arguments, return the exception's error-code, +error-message, additional information and diagnostic set respectively. + +=head4 render() + + print $@->render(); + +Returns a human-readable rendition of an exception. The C<""> +operator is overloaded on the Exception class, so that an Exception +used in a string context is automatically rendered. Among other +consequences, this has the useful result that a ZOOM application that +died due to an uncaught exception will emit an informative message +before exiting. =head2 ZOOM::ScanSet -I<###> + $ss = $conn->scan('@attr 1=1003 a'); + $n = $ss->size(); + ($term, $occ) = $ss->term($n-1); + $rs = $conn->search_pqf('@attr 1=1003 "' . $term . 
"'"); + assert($rs->size() == $occ); + +A ScanSet represents a set of candidate search-terms returned from an +index scan. Its sole purpose is to provide access to those term, to +the corresponding display terms, and to the occurrence-counts of the +terms. + +There is no C method nor any other explicit constructor. The +only way to create a new ScanSet is by using C on a +Connection. + +See the description of the C class in the ZOOM Abstract +API at +http://zoom.z3950.org/api/zoom-current.html#3.6 + +=head3 Methods + +=head4 size() + + print "Found ", $ss->size(), " terms\n"; + +Returns the number of terms in the scan set. In general, this will be +the scan-set size requested by the C option in the Connection +on which the scan was performed [default 10], but it may be fewer if +the scan is close to the end of the index. + +=head4 term() / display_term() + + $ss = $conn->scan('@attr 1=1004 whatever'); + ($term, $occurrences) = $ss->term(0); + ($displayTerm, $occurrences2) = $ss->display_term(0); + assert($occurrences == $occurrences2); + if (user_likes_the_look_of($displayTerm)) { + $rs = $conn->search_pqf('@attr 1=4 "' . $term . '"'); + assert($rs->size() == $occurrences); + } + +These methods return the scanned terms themselves. C returns +the term is a form suitable for submitting as part of a query, whereas +C returns it in a form suitable for displaying to a +user. Both versions also return the number of occurrences of the term +in the index, i.e. the number of hits that will be found if the term +is subsequently used in a query. + +In most cases, the term and display term will be identical; however, +they may be different in cases where punctuation or case is +normalised, or where identifiers rather than the original document +terms are indexed. + +=head4 option() + + print "scan status is ", $ss->option("scanStatus"); + +Allows options to be set into, and read from, a ScanSet, just like +the Connection class's C method. 
There is no +C method for ScanSet objects. + +ScanSet options are also described, though not particularly +informatively, at +http://indexdata.com/yaz/doc/zoom.scan.tkl + +=head4 destroy() + + $ss->destroy() + +Destroys a ScanSet object, freeing its resources. It is an error to +reuse a ScanSet that has been Ced. =head2 ZOOM::Package -I<###> + $p = $conn->package(); + $p->option(action => "specialUpdate"); + $p->option(recordIdOpaque => 145); + $p->option(record => content_of("/tmp/record.xml")); + $p->send("update"); + $p->destroy(); + +This class represents an Extended Services Package: an instruction to +the server to do something not covered by the core parts of the Z39.50 +standard (or the equivalent in SRW or SRU). Since the core protocols +are read-only, such requests are often used to make changes to the +database, such as in the record update example above. + +Requesting an extended service is a four-step process: first, create a +package associated with the connection to the relevant database; +second, set options on the package to instruct the server on what to +do; third, send the package (which may result in an exception being +thrown if the server cannot execute the requested operations); and +finally, destroy the package. + +Package options are listed at +http://indexdata.com/yaz/doc/zoom.ext.html + +The particular options that have meaning are determined by the +top-level operation string specified as the argument to C. +For example, when the operation is C (the most commonly used +extended service), the C option may be set to any of +C +(add a new record, failing if that record already exists), +C +(delete a record, failing if it is not in the database), +C +(replace a record, failing if an old version is not already present) +or +C +(add a record, replacing any existing version that may be present). + +For update, the C option should be set to the full text of the +XML record to be added, deleted or replaced. 
Depending on how the server +is configured, it may extract the record's unique ID from the text +(i.e. from a known element such as the C<001> field of a MARCXML +record), or it may require the unique ID to be passed in explicitly using +the C option. + +Extended services packages are B in the ZOOM +Abstract API at +http://zoom.z3950.org/api/zoom-current.html +They will be added in a forthcoming version, and will function much +like those implemented in this module. + +=head3 Methods + +=head4 option() + + $p->option(recordIdOpaque => "46696f6e61"); + +Allows options to be set into, and read from, a Package, just like +the Connection class's C method. There is no +C method for Package objects. + +Package options are listed at +http://indexdata.com/yaz/doc/zoom.ext.tkl + +=head4 send() + + $p->send("create"); + +Sends a package to the server associated with the Connection that +created it. Problems are reported by throwing an exception. The +single parameter indicates the operation that the server is being +requested to perform, and controls the interpretation of the package's +options. Valid operations include: + +=over 4 + +=item itemorder + +Request a copy of a nominated object, e.g. place an ILL request. + +=item create + +Create a new database, the name of which is specified by the +C option. + +=item drop + +Drop an existing database, the name of which is specified by the +C option. + +=item commit + +Commit changes made to the database within a transaction. + +=item update + +Modify the contents of the database by adding, deleting or replacing +records (as described above in the overview of the C +class). + +=item xmlupdate + +I have no idea what this does. + +=back + +Although the module is capable of I all these requests, not +all servers are capable of I them. Refusal is indicated by +throwing an exception. 
Problems may also be caused by lack of +privileges; so C must be used with caution, and is perhaps +best wrapped in a clause that checks for exceptions, like so: + + eval { $p->send("create") }; + if ($@ && $@->isa("ZOOM::Exception")) { + print "Oops! ", $@->message(), "\n"; + return $@->code(); + } + +=head4 destroy() + + $p->destroy() + +Destroys a Package object, freeing its resources. It is an error to +reuse a Package that has been Ced. =head2 ZOOM::Query -I<###> + $q = new ZOOM::Query::CQL("creator=pike and subject=unix"); + $q->sortby("1=4 >i 1=21 >s"); + $rs = $conn->search($q); + $q->destroy(); + +C is a virtual base class from which various concrete +subclasses can be derived. Different subclasses implement different +types of query. The sole purpose of a Query object is to be used in a +C on a Connection; because PQF is such a common special +case, the shortcut Connection method C is provided. + +The following Query subclasses are provided, both of them providing the +same set of methods described below: + +=over 4 + +=item ZOOM::Query::PQF + +Implements Prefix Query Format (PQF), also sometimes known as Prefix +Query Notation (PQN). This esoteric but rigorous and expressive +format is described in the YAZ Manual at +http://indexdata.com/yaz/doc/tools.tkl#PQF + +=item ZOOM::Query::CQL + +Implements the Common Query Language (CQL) of SRU, the Search/Retrieve +URL. CQL is a much friendlier notation than PQF, using a simple infix +notation. The queries are passed ``as is'' to the server rather than +being compiled into a Z39.50 Type-1 query, so only CQL-compliant +servers can support such queries. 
CQL is described at +http://www.loc.gov/standards/sru/cql/ +and in a slightly out-of-date but nevertheless useful tutorial at +http://zing.z3950.org/cql/intro.html + +=back + +See the description of the C class in the ZOOM Abstract +API at +http://zoom.z3950.org/api/zoom-current.html#3.3 + +=head3 Methods + +=head4 new() + + $q = new ZOOM::Query::CQL('title=dinosaur'); + $q = new ZOOM::Query::PQF('@attr 1=4 dinosaur'); + +Creates a new query object, compiling the query passed as its argument +according to the rules of the particular query-type being +instantiated. If compilation fails, an exception is thrown. +Otherwise, the query may be passed to the C method +. + +=head4 sortby() + + $q->sortby("1=4 >i 1=21 >s"); + +Sets a sort specification into the query, so that when a C +is run on the query, the result is automatically sorted. The sort +specification language is the same as the C sort-specification +type of the C method C, described above. + +B + +=head4 destroy() + + $q->destroy() + +Destroys a Query object, freeing its resources. It is an error to +reuse a Query that has been Ced. =head2 ZOOM::Options + $o1 = new ZOOM::Options(); + $o1->option(user => "alf"); + $o2 = new ZOOM::Options(); + $o2->option(password => "fruit"); + $opts = new ZOOM::Options($o1, $o2); + $conn = create ZOOM::Connection($opts); + $conn->connect($host); # Uses the specified username and password + +Several classes of ZOOM objects carry their own sets of options, which +can be manipulated using their C method. Sometimes, +however, it's useful to deal with the option sets directly, and the +C class exists to enable this approach. + +Option sets are B in the ZOOM +Abstract API at +http://zoom.z3950.org/api/zoom-current.html +They are an extension to that specification. 
+ +=head3 Methods + +=head4 new() + +I<###> + +=head4 option() + +I<###> + +=head4 option_binary() + +I<###> + +=head4 bool() / int() + +I<###> + +=head4 set_int() + +I<###> + +=head4 set_callback() + +I<###> + +=head4 destroy() + I<###> =head1 ENUMERATIONS @@ -462,10 +1046,12 @@ C, C and C, -each of which specifies a client-side error. Since errors may also be -diagnosed by the server, and returned to the client, error codes may -also take values from the BIB-1 diagnostic set of Z39.50, listed at -the Z39.50 Maintenance Agency's web-site at +each of which specifies a client-side error. These codes constitute +the C diagnostic set. + +Since errors may also be diagnosed by the server, and returned to the +client, error codes may also take values from the BIB-1 diagnostic set +of Z39.50, listed at the Z39.50 Maintenance Agency's web-site at http://www.loc.gov/z3950/agency/defns/bib1diag.html All error-codes, whether client-side from the C