X-Git-Url: http://git.indexdata.com/?a=blobdiff_plain;f=doc%2Fexamples.xml;h=f2af44421d6e1a8f9c3908e45df5e493b3153b61;hb=518c06f68ffac6658aa792da45282a165b32ca95;hp=1a08eae6a443535a4ee5c688630b52e4104af324;hpb=8ad5e21914fe3a09f6241a06b25fd7e1bbc1d73e;p=idzebra-moved-to-github.git
diff --git a/doc/examples.xml b/doc/examples.xml
index 1a08eae..f2af444 100644
--- a/doc/examples.xml
+++ b/doc/examples.xml
@@ -1,5 +1,5 @@
-
+
Example Configurations
@@ -19,80 +19,167 @@
- Where to find the default indexing rules (### default.idx)
+ Where to find subsidiary configuration files, including
+ default.idx
+ which specifies the default indexing rules.
- ### Something to do with explain.abs?!
+ What attribute sets to recognise in searches.
- ### Where to find other configuration files, e.g. searches using
- BIB-1 attributes require a bib1.att configuration file (even if
- the access point is actually an XPath expression). These are
- searched for in the working directory unless otherwise
- specified.
+ Policy details such as what record type to expect, what
+ low-level indexing algorithm to use, how to identify potential
+ duplicate records, etc.
+
+ Now let's see what goes in the zebra.cfg file
+ for some example configurations.
+
-
- First Example: Minimal Configuration
+
+ Example 1: XML Indexing And Searching
- This example shows how Zebra can be used, with absolutely minimal
- configuration, to index a body of XML documents, and search them
- using XPath expressions to specify access points.
+ This example shows how Zebra can be used with absolutely minimal
+ configuration to index a body of
+ XML
+ documents, and search them using
+ XPath
+ expressions to specify access points.
- Go to the
- zebra/examples/dinosauricon
- directory. There you will find three significant files:
+ Go to the examples/dinosauricon subdirectory
+ of the distribution archive.
+ There you will find a records subdirectory,
+ which contains some raw XML data to be added to the database: in
+ this case, a single file, genera.xml,
+ which contains information about all the known dinosaur genera as of
+ August 2002.
+
+ Now we need to create the Zebra database. We do this with the
+ Zebra indexer, zebraidx, which is
+ driven by the zebra.cfg configuration file.
+ For our purposes, we don't need any
+ special behaviour - we can use the defaults - so we start with a
+ minimal file that just tells zebraidx where to
+ find the default indexing rules, and how to parse the records:
+
+ profilePath: .:../../tab:../../../yaz/tab
+ recordType: grs.sgml
+
+
+
+ That's all you need for a minimal Zebra configuration. Now you can
+ roll the XML records into the database and build the indexes:
+
+ zebraidx update records
+
+
+
+ Now start the server. Like the indexer, its behaviour is
+ controlled by the
+ zebra.cfg file; and like the indexer, it works
+ just fine with this minimal configuration.
+
+ zebrasrv
+
+ By default, the server listens on TCP port 9999, although
+ this can easily be changed - see
+ .
+
+
+ Now you can use the Z39.50 client program of your choice to execute
+ XPath-based boolean queries and fetch the XML records that satisfy
+ them:
+
+ $ yaz-client tcp:@:9999
+ Connecting...Ok.
+ Z> find @attr 1=/GENUS/MEANING @and lizard earthquakes
+ Number of hits: 1
+ Z> format xml
+ Z> show 1
+ <GENUS name="Sauroposeidon" type="with">
+ <MEANING>lizard Poseidon <LOW>(Greek god of, among other things, earthquakes)</LOW></MEANING>
+ <SPECIES name="proteles">
+ <AUTHOR type="vide" name="Franklin" year="2000"></AUTHOR>
+ <AUTHOR name="Wedel, Cifelli, Sanders"></AUTHOR>
+ </SPECIES>
+ <PLACE name="Oklahoma"></PLACE>
+ <TIME value="Albian"></TIME>
+ <LENGTH value="30" q="1"></LENGTH>
+ <REMAINS content="rib, cervical vertebrae"></REMAINS>
+ <ESSAY>
+ <P> This new <NOMEN name="Brachiosaurus"></NOMEN>-like <LINK content="dinosaur"></LINK>
+ was perhaps the tallest. With its head raised, it stood 60 feet (nearly
+ 20 m) tall. </P>
+ </ESSAY>
-
-
-
- The records subdirectory, which contains the
- raw XML data to be added to the database: in this case, just one
- file, genera.xml, which contains information
- about all the known dinosaur genera as of October 2000.
-
-
-
+ <idzebra xmlns="http://www.indexdata.dk/zebra/">
+ <size>593</size>
+ <localnumber>891</localnumber>
+ <filename>records/genera.xml</filename>
+ </idzebra>
+ </GENUS>
+
+
+
+ Now wasn't that easy?
+
+
+
+
+ Example 2: Supporting Z39.50 Searches
+
+
+ You may have noticed as zebraidx was building
+ the database that it issued a warning, which we ignored at the
+ time:
+
+ $ zebraidx update records
+ 00:45:46-08/10: ../../index/zebraidx(5016) [warn] records/genera.xml:0 Couldn't open GENUS.abs [No such file or directory]
+
+
+
+
+
+
+
- # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.1 2002-08-29 01:16:12 mike Exp $
+ # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.8 2002-10-10 14:27:18 heikki Exp $
# Bare-bones master configuration file for Zebra
- attset: bib1.att
+ profilePath: .:../../tab:../../../yaz/tab
Apart from the comments, which are ignored, all this specifies is
that the server should recognise the attribute set described in
the file called
bib1.att.
+ ### What is an attribute set?
-
The BIB-1 attribute set configuration file,
bib1.att, which is also as short as possible:
-
- # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.1 2002-08-29 01:16:12 mike Exp $
+ # $Header: /home/cvsroot/idis/doc/examples.xml,v 1.8 2002-10-10 14:27:18 heikki Exp $
# Bare-bones BIB-1 attribute set file for Zebra
reference Bib-1
@@ -101,44 +188,87 @@
Bib-1, a name recognised by the system as
referring to a well-known opaque identifier that is transmitted
by clients as part of their searches.
-
+ ### Yeuch! Surely we can say that better!
### Can't we somehow say this trivial thing in the main
configuration file?
-
+-->
-
- That's all you need for a minimal Zebra configuration. Now you can
- roll the XML records into the database and build the indexes:
-
- zebraidx -t grs.sgml update records
-
-
- and start the server which, by default listens on port 9999:
-
- zebrasrv
-
-
-
- Now you can use the Z39.50 client program of your choice to execute
- XPath-based boolean queries and fetch the XML records that satisfy
- them:
-
- Z> open tcp:@:9999
- Connecting...Ok.
- Z> find @attr 1=/GENUS/MEANING @or vertebra jaw
- Number of hits: 2
- Z> format xml
- Z> show 1
- <GENUS name="Anurognathus" type="with" xmlns:idzebra="http://www.indexdata.dk/zebra/"><SPECIES name="ammoni"><AUTHOR name="Doederline" year="1923"></AUTHOR></SPECIES><MEANING>tailless<I>or</I>anuran<LOW>(frog)</LOW>jaw</MEANING><TIME value="Tithonian" section="late"></TIME><PLACE name="Germany"></PLACE><LENGTH wingspan="1" value=".5"></LENGTH><idzebra:size>304</idzebra:size><idzebra:localnumber>70</idzebra:localnumber><idzebra:filename>records/genera.xml</idzebra:filename></GENUS>
-
-
-
+
-
+