- <idzebra xmlns="http://www.indexdata.dk/zebra/">
- <size>300</size>
- <localnumber>23</localnumber>
- <filename>records/dino.xml</filename>
- </idzebra>
- </Zthes>
- </screen>
- </para>
- <para>
- Now wasn't that nice and easy?
- </para>
- </sect1>
-
-
- <sect1 id="example2">
- <title>Example 2: Supporting Interoperable Searches</title>
-
- <para>
- The problem with the previous example is that you need to know the
- structure of the documents in order to find them. For example,
- when we wanted to find the record for the taxon
- <foreignphrase role="taxon">Sauroposeidon</foreignphrase>,
- we had to formulate a complex XPath
- <literal>/Zthes/termName</literal>
- which embodies the knowledge that taxon names are specified in a
- <literal>&lt;termName&gt;</literal> element inside the top-level
- <literal>&lt;Zthes&gt;</literal> element.
- </para>
- <para>
- This is bad not just because it requires a lot of typing, but more
- significantly because it ties searching semantics to the physical
- structure of the searched records. You can't use the same search
- specification to search two databases if their internal
- representations are different. Consider a different taxonomy
- database in which the records have taxon names specified
- inside a <literal>&lt;name&gt;</literal> element nested within a
- <literal>&lt;identification&gt;</literal> element
- inside a top-level <literal>&lt;taxon&gt;</literal> element: then
- you'd need to search for them using
- <literal>1=/taxon/identification/name</literal>
- </para>
- <para>
- How, then, can we build broadcasting Information Retrieval
- applications that look for records in many different databases?
- The &acro.z3950; protocol offers a powerful and general solution to this:
- abstract ``access points''. In the &acro.z3950; model, an access point
- is simply a point at which searches can be directed. Nothing is
- said about implementation: in a given database, an access point
- might be implemented as an index, a path into physical records, an
- algorithm for interrogating relational tables or whatever works.
- The only important thing is that the semantics of an access
- point is fixed and well defined.
- </para>
- <para>
- For convenience, access points are gathered into <firstterm>attribute
- sets</firstterm>. For example, the &acro.bib1; attribute set is supposed to
- contain bibliographic access points such as author, title, subject
- and ISBN; the GEO attribute set contains access points pertaining
- to geospatial information (bounding coordinates, stratum, latitude
- resolution, etc.); the CIMI
- attribute set contains access points to do with museum collections
- (provenance, inscriptions, etc.)
- </para>
- <para>
- In practice, the &acro.bib1; attribute set has tended to be a dumping
- ground for all sorts of access points, so that, for example, it
- includes some geospatial access points as well as strictly
- bibliographic ones. Nevertheless, this model
- allows a layer of abstraction over the physical representation of
- records in databases.
- </para>
- <para>
- In the &acro.bib1; attribute set, a taxon name is probably best
- interpreted as a title - that is, a phrase that identifies the item
- in question. &acro.bib1; represents title searches by
- access point 4. (See
- <ulink url="&url.z39.50.bib1.semantics;">The &acro.bib1; Attribute
- Set Semantics</ulink>)
- So we need to configure our dinosaur database so that searches for
- &acro.bib1; access point 4 look in the
- <literal>&lt;termName&gt;</literal> element,
- inside the top-level
- <literal>&lt;Zthes&gt;</literal> element.
- </para>
- <para>
- This is a two-step process. First, we need to tell &zebra; that we
- want to support the &acro.bib1; attribute set. Then we need to tell it
- which elements of its record pertain to access point 4.
+ <idzebra xmlns="http://www.indexdata.dk/zebra/">
+ <size>300</size>
+ <localnumber>23</localnumber>
+ <filename>records/dino.xml</filename>
+ </idzebra>
+ </Zthes>
+ </screen>
+ </para>
+ <para>
+ Now wasn't that nice and easy?
+ </para>
+ </sect1>
+
+
+ <sect1 id="example2">
+ <title>Example 2: Supporting Interoperable Searches</title>
+
+ <para>
+ The problem with the previous example is that you need to know the
+ structure of the documents in order to find them. For example,
+ when we wanted to find the record for the taxon
+ <foreignphrase role="taxon">Sauroposeidon</foreignphrase>,
+ we had to formulate a complex XPath
+ <literal>/Zthes/termName</literal>
+ which embodies the knowledge that taxon names are specified in a
+ <literal>&lt;termName&gt;</literal> element inside the top-level
+ <literal>&lt;Zthes&gt;</literal> element.
+ </para>
+ <para>
+ This is bad not just because it requires a lot of typing, but more
+ significantly because it ties searching semantics to the physical
+ structure of the searched records. You can't use the same search
+ specification to search two databases if their internal
+ representations are different. Consider a different taxonomy
+ database in which the records have taxon names specified
+ inside a <literal>&lt;name&gt;</literal> element nested within a
+ <literal>&lt;identification&gt;</literal> element
+ inside a top-level <literal>&lt;taxon&gt;</literal> element: then
+ you'd need to search for them using
+ <literal>1=/taxon/identification/name</literal>.
+ </para>
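+ <para>
+ As a rough sketch of the mismatch, the same search might be issued
+ against the two databases as shown here. This assumes a client that
+ accepts PQF queries, such as <command>yaz-client</command>, and
+ servers that accept an XPath as the value of attribute 1; the
+ second database is the hypothetical one described above, not a
+ real service.
+ <screen>
+ Z&gt; find @attr 1=/Zthes/termName Sauroposeidon
+ Z&gt; find @attr 1=/taxon/identification/name Sauroposeidon
+ </screen>
+ The search term is identical, but the query has to be rewritten for
+ each record structure.
+ </para>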
+ <para>
+ How, then, can we build broadcasting Information Retrieval
+ applications that look for records in many different databases?
+ The &acro.z3950; protocol offers a powerful and general solution to this:
+ abstract <quote>access points</quote>. In the &acro.z3950; model, an access point
+ is simply a point at which searches can be directed. Nothing is
+ said about implementation: in a given database, an access point
+ might be implemented as an index, a path into physical records, an
+ algorithm for interrogating relational tables or whatever works.
+ The only important thing is that the semantics of an access
+ point is fixed and well defined.
+ </para>
+ <para>
+ For convenience, access points are gathered into <firstterm>attribute
+ sets</firstterm>. For example, the &acro.bib1; attribute set is supposed to
+ contain bibliographic access points such as author, title, subject
+ and ISBN; the GEO attribute set contains access points pertaining
+ to geospatial information (bounding coordinates, stratum, latitude
+ resolution, etc.); the CIMI
+ attribute set contains access points to do with museum collections
+ (provenance, inscriptions, etc.).
+ </para>
+ <para>
+ In practice, the &acro.bib1; attribute set has tended to be a dumping
+ ground for all sorts of access points, so that, for example, it
+ includes some geospatial access points as well as strictly
+ bibliographic ones. Nevertheless, this model
+ allows a layer of abstraction over the physical representation of
+ records in databases.
+ </para>
+ <para>
+ In the &acro.bib1; attribute set, a taxon name is probably best
+ interpreted as a title - that is, a phrase that identifies the item
+ in question. &acro.bib1; represents title searches by
+ access point 4. (See
+ <ulink url="&url.z39.50.bib1.semantics;">The &acro.bib1; Attribute
+ Set Semantics</ulink>.)
+ So we need to configure our dinosaur database so that searches for
+ &acro.bib1; access point 4 look in the
+ <literal>&lt;termName&gt;</literal> element,
+ inside the top-level
+ <literal>&lt;Zthes&gt;</literal> element.