X-Git-Url: http://git.indexdata.com/?a=blobdiff_plain;f=lib%2FZOOM.pod;h=176f05ca50503b52836fa09868c8db51beee1663;hb=8909986d78c95958d1f63217050afde991109e0a;hp=4fe8a3c8d9aa82b9ad35a889b11de281b7df0b98;hpb=642622d050330937ddd5261707dcc6ba9b039ee7;p=ZOOM-Perl-moved-to-github.git diff --git a/lib/ZOOM.pod b/lib/ZOOM.pod index 4fe8a3c..176f05c 100644 --- a/lib/ZOOM.pod +++ b/lib/ZOOM.pod @@ -1,4 +1,4 @@ -# $Id: ZOOM.pod,v 1.8 2005-11-17 13:32:30 mike Exp $ +# $Id: ZOOM.pod,v 1.11 2005-11-24 17:35:29 mike Exp $ use strict; use warnings; @@ -109,8 +109,7 @@ relevant section of the ZOOM Abstract API. $conn = new ZOOM::Connection("indexdata.dk:210/gils"); print("server is '", $conn->option("serverImplementationName"), "'\n"); $conn->option(preferredRecordSyntax => "usmarc"); - $conn->option_binary(iconBlob => "foo\0bar"); - $rs = $conn->search_pqf('@attr 1=4 mineral');/usr/local/src/mike/records/acc-ounts/cheques- + $rs = $conn->search_pqf('@attr 1=4 mineral'); $ss = $conn->scan('@attr 1=1003 a'); if ($conn->errcode() != 0) { die("somthing went wrong: " . $conn->errmsg()) @@ -239,6 +238,11 @@ See the C for the interpretation of these elements. =head4 option() / option_binary() + print("server is '", $conn->option("serverImplementationName"), "'\n"); + $conn->option(preferredRecordSyntax => "usmarc"); + $conn->option_binary(iconBlob => "foo\0bar"); + die if length($conn->option_binary("iconBlob")) != 7; + Objects of the Connection, ResultSet, ScanSet and Package classes carry with them a set of named options which affect their behaviour in certain ways. 
See the ZOOM-C options documentation for details: @@ -252,21 +256,15 @@ http://indexdata.com/yaz/doc/zoom.tkl#zoom.connections =item * -ResultSet options are listed at -http://indexdata.com/yaz/doc/zoom.resultsets.tkl -I<### move this obvservation down the appropriate place> - -=item * - ScanSet options are listed at http://indexdata.com/yaz/doc/zoom.scan.tkl -I<### move this obvservation down the appropriate place> +I<### move this observation down to the appropriate place> =item * Package options are listed at http://indexdata.com/yaz/doc/zoom.ext.html -I<### move this obvservation down the appropriate place> +I<### move this observation down to the appropriate place> =back @@ -284,15 +282,109 @@ and returned correctly. =head4 search() / search_pqf() -I<###> + $rs = $conn->search(new ZOOM::Query::CQL('title=dinosaur')); + # The next two lines are equivalent + $rs = $conn->search(new ZOOM::Query::PQF('@attr 1=4 dinosaur')); + $rs = $conn->search_pqf('@attr 1=4 dinosaur'); + +The principal purpose of a search-and-retrieve protocol is searching +(and, er, retrieval), so the principal method used on a Connection +object is C<search()>. It accepts a single argument, a C<ZOOM::Query> +object (or, more precisely, an object of a subclass of this class); +and it creates and returns a new ResultSet object representing the set +of records resulting from the search. + +Since queries using PQF (Prefix Query Format) are so common, we make +them a special case by providing a C<search_pqf()> method. This is +identical to C<search()> except that it accepts a string containing +the query rather than an object, thereby obviating the need to create +a C<ZOOM::Query::PQF> object. See the documentation of that class for +information about PQF. =head4 scan() -I<###> +Many Z39.50 servers allow you to browse their indexes to find terms to +search for. This is done using the C<scan()> method, which creates and +returns a new ScanSet object representing the set of terms resulting +from the scan. 
+ +C<scan()> takes a single argument, but it has to work hard: it +specifies both what index to scan for terms, and where in the index to +start scanning. What's more, the specification of what index to scan +includes multiple facets, such as what database fields it's an index +of (author, subject, title, etc.) and whether to scan for whole fields +or single words (e.g. the title ``I<The Empire Strikes Back>'', or the +four words ``Back'', ``Empire'', ``Strikes'' and ``The'', interleaved +with words from other titles in the same index). + +All of this is done by using a single term from the PQF query as the +C<scan()> argument. (At present, only PQF is supported, although +there is no reason in principle why CQL and other query syntaxes +should not be supported in future.) The attributes associated with +the term indicate which index is to be used, and the term itself +indicates the point in the index at which to start the scan. For +example, if the argument is C<@attr 1=4 fish>, then + +=over 4 + +=item @attr 1=4 + +This is the BIB-1 attribute with type 1 (meaning access-point, which +specifies an index), and value 4 (which means ``title''). So the scan +is in the title index. + +=item fish + +Start the scan from the lexicographically earliest term that is equal +to or falls after ``fish''. + +=back + +The argument C<@attr 1=4 @attr 6=3 fish> would behave similarly; but +the BIB-1 attribute 6=3 means completeness=``complete field'', so the +scan would be for complete titles rather than for words occurring in +titles. + +This takes a bit of getting used to. + +The behaviour of C<scan()> is affected by the following options, which +may be set on the Connection through which the scan is done: + +=over 4 + +=item number [default: 10] + +Indicates how many terms should be returned in the ScanSet. The +number actually returned may be less, if the start-point is near the +end of the index, but will not be greater. 
+ +=item position [default: 1] + +A 1-based index specifying where in the returned list of terms the +seed-term should appear. By default it should be the first term +returned, but C<position> may be set, for example, to zero (requesting +the next terms I<after> the seed-term), or to the same value as +C<number> (requesting the index terms I<before> the seed term). + +=item stepSize [default: 0] + +An integer indicating how many indexed terms are to be skipped between +each one returned in the ScanSet. By default, no terms are skipped, +but overriding this can be useful to get a high-level overview of the +index. + +=back =head4 package() -I<###> + $p = $conn->package(); + $o = new ZOOM::Options(); + $o->option(databaseName => "newdb"); + $p = $conn->package($o); + +Creates and returns a new C<ZOOM::Package>, to be used in invoking an +Extended Service. An options block may be passed in. See +the C<ZOOM::Package> documentation. =head4 destroy() @@ -304,11 +396,214 @@ a Connection that has been C<destroy()>ed. =head2 ZOOM::ResultSet -I<###> + $rs = $conn->search_pqf('@attr 1=4 mineral'); + $n = $rs->size(); + for $i (1 .. $n) { + $rec = $rs->record($i-1); + print $rec->render(); + } + +A ResultSet object represents the set of zero or more records +resulting from a search, and is the means whereby these records can be +retrieved. A ResultSet object may maintain a client-side cache of +some, none or all of the server's records: in general, this is +supposed to be an implementation detail of no interest to a typical +application, although more sophisticated applications do have +facilities for messing with the cache. Most applications will only +need the C<size()>, C<record()> and C<destroy()> methods. + +There is no C<new()> method nor any other explicit constructor. The +only way to create a new ResultSet is by using C<search()> (or +C<search_pqf()>) on a Connection. 
+ +See the description of the C<Result Set> class in the ZOOM Abstract +API at +http://zoom.z3950.org/api/zoom-current.html#3.4 + +=head3 Methods + +=head4 option() + + $rs->option(elementSetName => "f"); + +Allows options to be set into, and read from, a ResultSet, just like +the Connection class's C<option()> method. There is no +C<option_binary()> method for ResultSet objects. + +ResultSet options are listed at +http://indexdata.com/yaz/doc/zoom.resultsets.tkl + +=head4 size() + + print "Found ", $rs->size(), " records\n"; + +Returns the number of records in the result set. + +=head4 record(), record_immediate() + + $rec = $rs->record(0); + $rec2 = $rs->record_immediate(0); + $rec3 = $rs->record_immediate(1) + or print "second record wasn't in cache\n"; + +The C<record()> method returns a C<ZOOM::Record> object representing +a record from the result-set, whose position is indicated by the argument +passed in. This is a zero-based index, so that legitimate values +range from zero to C<< $rs->size()-1 >>. + +The C<record_immediate()> API is identical, but it never invokes a +network operation, merely returning the record from the ResultSet's +cache if it's already there, or an undefined value otherwise. So if +you use this method, B<you must always check the return value>. + +=head4 records() + + $rs->records(0, 10, 0); + for $i (0..9) { + print $rs->record_immediate($i)->render(); + } + + @nextseven = $rs->records(10, 7, 1); + +The C<record_immediate()> method only fetches records from the cache, +whereas C<record()> fetches them from the server if they have not +already been cached; but the ZOOM module has to guess what the most +efficient strategy for this is. It might fetch each record alone, +when asked for: that's optimal in an application that's only +interested in the top hit from each search, but pessimal for one that +wants to display a whole list of results. Conversely, the software's +strategy might be always to ask for blocks of twenty records: +that's great for assembling long lists of things, but wasteful when +only one record is wanted. 
The problem is that the ZOOM module can't +tell, when you call C<< $rs->record() >>, what your intention is. + +But you can tell it. The C<records()> method fetches a sequence of +records, all in one go. It takes three arguments: the first is the +zero-based index of the first record in the sequence, the second is +the number of records to fetch, and the third is a boolean indication +of whether or not to return the retrieved records as well as adding +them to the cache. (You can always pass 1 for this if you like, and +Perl will discard the unused return value, but there is a small +efficiency gain to be had by passing 0.) + +Once the records have been retrieved from the server +(i.e. C<records()> has completed without throwing an exception), they +can be fetched much more efficiently using C<record()> or +C<record_immediate()>. + +=head4 cache_reset() + + $rs->cache_reset() + +Resets the ResultSet's record cache, so that subsequent invocations of +C<record_immediate()> will fail. I struggle to imagine a real +scenario where you'd want to do this. + +=head4 sort() + + if ($rs->sort("yaz", "1=4 >i 1=21 >s") < 0) { + die "sort failed"; + } + +Sorts the ResultSet in place (discarding any cached records, as they +will in general be sorted into a different position). There are two +arguments: the first is a string indicating the type of the +sort-specification, and the second is the specification itself. + +The C<sort()> method returns 0 on success, or -1 if the +sort-specification is invalid. + +At present, the only supported sort-specification type is C<yaz>. +Such a specification consists of a space-separated sequence of keys, +each of which itself consists of two space-separated words (so that +the total number of words in the sort-specification is even). The two +words making up each key are a field and a set of flags. The field +can take one of two forms: if it contains an C<=> sign, then it is a +BIB-1 I<attribute>=I<value> pair specifying which field to sort +(e.g. C<1=4> for a title sort); otherwise it is sent for the server to +interpret as best it can. 
The word of flags is made up from one or +more of the following: C<s> for case sensitive, C<i> for case +insensitive; C<E<lt>> for ascending order and C<E<gt>> for descending +order. + +For example, the sort-specification in the code-fragment above will +sort the records in C<$rs> case-insensitively in descending order of +title, with records having equivalent titles sorted case-sensitively +in descending order of subject. (The BIB-1 access points 4 and 21 +represent title and subject respectively.) + +=head4 destroy() + + $rs->destroy() + +Destroys a ResultSet object, freeing its resources. It is an error to +reuse a ResultSet that has been C<destroy()>ed. =head2 ZOOM::Record -I<###> + $rec = $rs->record($i); + print $rec->render(); + $raw = $rec->raw(); + $marc = new_from_usmarc MARC::Record($raw); + print "Record title is: ", $marc->title(), "\n"; + +A Record object represents a record that has been retrieved from the +server. + +There is no C<new()> method nor any other explicit constructor. The +only way to create a new Record is by using C<record()> (or +C<record_immediate()>, or C<records()>) on a ResultSet. + +In general, records are ``owned'' by the result-sets that they were +retrieved from, so they do not have to be explicitly memory-managed: +they are deallocated (and therefore can no longer be used) when the +result-set is destroyed. + +See the description of the C<Record> class in the ZOOM Abstract +API at +http://zoom.z3950.org/api/zoom-current.html#3.4 + +=head3 Methods + +=head4 render() + + print $rec->render() + +Returns a human-readable representation of the record. Beyond that, +no promises are made: careful programs should not make assumptions +about the format of the returned string. + +This method is useful mostly for debugging. + +=head4 raw() + + use MARC::Record; + $raw = $rec->raw(); + $marc = new_from_usmarc MARC::Record($raw); + +Returns an opaque blob of data that is the raw form of the record. +Exactly what this is, and what you can do with it, varies depending on +the record-syntax. 
For example, XML records will be returned as, +well, XML; MARC records will be returned as ISO 2709-encoded blocks +that can be decoded by software such as the fine C<MARC::Record> +module; GRS-1 records will be ... gosh, what an interesting question. +But no-one uses GRS-1 any more, do they? + +=head4 clone(), destroy() + + $rec = $rs->record($i); + $newrec = $rec->clone(); + $rs->destroy(); + print $newrec->render(); + $newrec->destroy(); + +Usually, it's convenient that Record objects are owned by their +ResultSets and go away when the ResultSet is destroyed; but +occasionally you need a Record to outlive its parent and destroy it +later, explicitly. To do this, C<clone()> the record, keep the new +Record object that is returned, and C<destroy()> it when it's no +longer needed. This is B<the only> situation in which a Record needs to +be destroyed. =head2 ZOOM::Exception @@ -426,6 +721,7 @@ http://zoom.z3950.org/api/zoom-current.html The C module, included in the same distribution as this one. The C module, which this one supersedes. +http://perl.z3950.org/ The documentation for the ZOOM-C module of the YAZ Toolkit, which this module is built on. Specifically, its lists of options are useful.