X-Git-Url: http://git.indexdata.com/?a=blobdiff_plain;f=doc%2Fexamples.xml;h=ab8b64a51dfc740f0787ca09602e32c78ac3abc8;hb=25a37c9be836f891281688788a7a1f967ea2b2cb;hp=53f839da4673da6c05f084e3e9870b5168a13d66;hpb=528edc9943ba3311a40b4ab875c0bc9aca5caa87;p=idzebra-moved-to-github.git

diff --git a/doc/examples.xml b/doc/examples.xml
index 53f839d..ab8b64a 100644
--- a/doc/examples.xml
+++ b/doc/examples.xml
@@ -1,5 +1,5 @@
-
+
 Example Configurations
@@ -106,7 +106,7 @@
 $ yaz-client tcp:@:9999
 Connecting...Ok.
- Z> find @attr 1=/GENUS/MEANING @and lizard earthquakes
+ Z> find @attr 1=/GENUS/SPECIES/AUTHOR/@name Wedel
 Number of hits: 1
 Z> format xml
 Z> show 1
@@ -139,9 +139,99 @@
+
- Example 2: Supporting Z39.50 Searches
+ Example 2: Supporting Interoperable Searches
+
+ The problem with the previous example is that you need to know the
+ structure of the documents in order to find them. For example,
+ when we wanted to know the genera for which Matt Wedel is an
+ author, we had to formulate a complex XPath
+ 1=/GENUS/SPECIES/AUTHOR/@name
+ which embodies the knowledge that author names are specified in the
+ name attribute of the
+ <AUTHOR> element,
+ which is inside the
+ <SPECIES> element,
+ which in turn is inside the top-level
+ <GENUS> element.
+
+ This is bad not just because it requires a lot of typing, but more
+ significantly because it ties searching semantics to the physical
+ structure of the searched records. You can't use the same search
+ specification to search two databases if their internal
+ representations are different. Consider an alternative dinosaur
+ database in which the records have author names specified
+ inside an <authorName> element directly
+ inside a top-level <taxon> element: then
+ you'd need to search for them using
+ 1=/taxon/authorName
+
+ How, then, can we build broadcasting Information Retrieval
+ applications that look for records in many different databases?
+ The Z39.50 protocol offers a powerful and general solution to this:
+ abstract ``access points''. In the Z39.50 model, an access point
+ is simply a point at which searches can be directed. Nothing is
+ said about implementation: in a given database, an access point
+ might be implemented as an index, a path into physical records, an
+ algorithm for interrogating relational tables or whatever works.
+ The key point is that the semantics of an access point are fixed
+ and well defined.
+
+ For convenience, access points are gathered into attribute
+ sets. For example, the BIB-1 attribute set is supposed to
+ contain bibliographic access points such as author, title, subject
+ and ISBN; the GEO attribute set contains access points pertaining
+ to geospatial information (bounding box, ###, etc.); the CIMI
+ attribute set contains access points to do with museum collections
+ (provenance, inscriptions, etc.).
+
+ In practice, the BIB-1 attribute set has tended to be a dumping
+ ground for all sorts of access points, so that, for example, it
+ includes some geospatial access points as well as strictly
+ bibliographic ones. Nevertheless, the key point is that this model
+ allows a layer of abstraction over the physical representation of
+ records in databases.
+
+ In the BIB-1 attribute set, an author search is represented by
+ access point 1003. (See
+ )
+ So we need to configure our dinosaur database so that searches for
+ BIB-1 access point 1003 look at the
+ name attribute of the
+ <AUTHOR> element,
+ inside the
+ <SPECIES> element,
+ inside the top-level
+ <GENUS> element.
+
+ This is a two-step process. First, we need to tell Zebra that we
+ want to support the BIB-1 attribute set. Then we need to tell it
+ which elements of its records pertain to access point 1003.
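
The access-point idea described in the added text can be illustrated outside Zebra with a small Python sketch. This is not Zebra's configuration mechanism and does not use any Zebra or YAZ API; it only models the abstraction. The sample records, the `ACCESS_POINTS` table, and the `matches` helper are hypothetical names introduced for illustration. The only details taken from the text are the two record layouts (`/GENUS/SPECIES/AUTHOR/@name` and `/taxon/authorName`) and the BIB-1 author access point 1003.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample records in the two layouts discussed in the text.
GENUS_RECORD = '<GENUS><SPECIES><AUTHOR name="Matt Wedel"/></SPECIES></GENUS>'
TAXON_RECORD = '<taxon><authorName>Matt Wedel</authorName></taxon>'

BIB1_AUTHOR = 1003  # BIB-1 access point for author searches, per the text


def genus_authors(root):
    # Author names live in the name attribute of AUTHOR inside SPECIES.
    return [a.get("name") for a in root.findall("./SPECIES/AUTHOR")]


def taxon_authors(root):
    # Author names live in the text of authorName directly under taxon.
    return [e.text for e in root.findall("./authorName")]


# Illustrative mapping: top-level element name -> access point -> extractor.
# Each database layout supplies its own way of reaching the same access point.
ACCESS_POINTS = {
    "GENUS": {BIB1_AUTHOR: genus_authors},
    "taxon": {BIB1_AUTHOR: taxon_authors},
}


def matches(record_xml, access_point, term):
    """Return True if the term occurs in any value found at the access point."""
    root = ET.fromstring(record_xml)
    extract = ACCESS_POINTS[root.tag][access_point]
    return any(term in (value or "") for value in extract(root))
```

With this table in place, the same search (access point 1003, term `Wedel`) can be run against either record layout without the caller knowing where author names are physically stored, which is the property the text attributes to abstract access points.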