Renamed API function zebra_deleleResultSet to zebra_deleteResultSet.

diff --git a/doc/introduction.xml b/doc/introduction.xml
index 3e4d19f..68fcc92 100644
--- a/doc/introduction.xml
+++ b/doc/introduction.xml
@@ -1,15 +1,14 @@
  <chapter id="introduction">
- <!-- $Id: introduction.xml,v 1.19 2002-10-20 14:02:03 mike Exp $ -->
+ <!-- $Id: introduction.xml,v 1.26 2003-10-30 11:11:57 adam Exp $ -->
   <title>Introduction</title>
   
   <sect1>
    <title>Overview</title>
    
    <para>
-   <ulink url="http://indexdata.dk/zebra/">
-     Zebra</ulink>
+   <ulink url="http://indexdata.dk/zebra/">Zebra</ulink>
     is a high-performance, general-purpose structured text
-   indexing and retrieval engine. It reads structured records in a
+   indexing and retrieval engine. It reads records in a
     variety of input formats (eg. email, XML, MARC) and provides access
     to them through a powerful combination of boolean search
     expressions and relevance-ranked free-text queries.
@@ -49,7 +48,7 @@
  
      <listitem>
       <para>
-      Very large databases: files for indexes, etc. can be
+      Very large databases: logical files can be
        automatically partitioned over multiple disks.
       </para>
      </listitem>
@@ -57,7 +56,7 @@
      <listitem>
       <para>
        Arbitrarily complex records.  The internal data format
-      is an structured format conceptually similar to XML or GRS-1,
+      is a structured format conceptually similar to XML or GRS-1,
        which allows lists, nested structured data elements and
        variant forms of data.
       </para>
@@ -236,7 +235,7 @@
    <sect2>
     <title>NLI-Z39.50 - a Natural Language Interface for Libraries</title>
     <para>
-    Fernuniversität Hagen in Germany have developed a natural
+    Fernuniversit&#x00E4;t Hagen in Germany have developed a natural
      language interface for access to library databases.
      <ulink url="http://ki212.fernuni-hagen.de/nli/NLIintro.html"/>
      In order to evaluate this interface for recall and precision, they
@@ -264,11 +263,13 @@
    <sect2>
     <title>ULS (Union List of Serials)</title>
     <para>
-    The M25-Link systems team
-    (<ulink url="http://www.m25lib.ac.uk/M25link/"/>)
-    are involved in a project called ULS to provide a union catalogue
-    for periodicals in 21 member libraries.  They do this with an
-    unusual architecture which they call a
+    The M25 Systems Team
+    has created a union catalogue for the periodicals of the
+    twenty-one constituent libraries of the University of London and
+    the University of Westminster
+    (<ulink url="http://www.m25lib.ac.uk/ULS/"/>).
+    They have achieved this using an
+    unusual architecture, which they describe as a
      ``non-distributed virtual union catalogue''.
     </para>
     <para>
@@ -304,9 +305,45 @@
      which is populated by the Harvest-NG web-crawling software.
     </para>
     <para>
-    For more information, contact John Gilbertson
+    For more information on Liverpool university's intranet search
+    architecture, contact John Gilbertson
      <email>jgilbert@liverpool.ac.uk</email>
     </para>
+   <para>
+    Kang-Jin Lee
+    <email>lee@arco.de</email>,
+    has recently modified the Harvest web indexer to use Zebra as
+    its native repository engine.  His comments on the switch over
+    from the old engine are revealing:
+    <blockquote>
+     <para>
+      The first results after some testing with Zebra are very
+      promising.  The tests were done with around 220,000 SOIF files,
+      which occupies 1.6GB of disk space.
+     </para>
+     <para>
+      Building the index from scratch takes around one hour with Zebra
+      where [old-engine] needs around five hours.  While [old-engine]
+      blocks search requests when updating its index, Zebra can still
+      answer search requests.
+      [...]
+      Zebra supports incremental indexing which will speed up indexing
+      even further.
+     </para>
+     <para>
+      While the search time of [old-engine] varies from some seconds
+      to some minutes depending how expensive the query is, Zebra
+      usually takes around one to three seconds, even for expensive
+      queries.
+      [...]
+      Zebra can search more than 100 times faster than [old-engine]
+      and can process multiple search requests simultaneously
+     </para>
+     <para>
+      I am very happy to see such nice software available under GPL.
+     </para>
+    </blockquote>
+   </para>
    </sect2>
   </sect1>
  
@@ -331,15 +368,14 @@
     announcements from the authors (new
     releases, bug fixes, etc.) and general discussion.  You are welcome
     to seek support there.  Join by sending email to
-   <email>zebra-request@indexdata.dk</email>. Put the word
+   <email>zebra-request@indexdata.dk</email> with the word
     <literal>subscribe</literal> in the body of the message.
    </para>
    <para>
     Third, it's possible to buy a commercial support contract, with
     well defined service levels and response times, from Index Data.
     See
-   <ulink url="http://indexdata.dk/support/?lang=en"/>
-   <!-- ### compare this page with http://indexdata.dk/support2/ -->
+   <ulink url="http://indexdata.dk/support/"/>
     for details.
    </para>
   </sect1>  
@@ -361,20 +397,17 @@
         Improved support for XML in search and retrieval. Eventually,
         the goal is for Zebra to pull double duty as a flexible
         information retrieval engine and high-performance XML
-       repository.
-     </para>
-     <para>
-       ### Partially done.
+       repository.  The recent addition of XPath searching is one
+       example of the kind of enhancement we're working on.
       </para>
      </listitem>
  
      <listitem>
       <para>
-       Access to search engine through SOAP/RPC API to allow the
+       Access to the search engine through SOAP/RPC API to allow the
         construction of applications without requiring Z39.50 tools.
-     </para>
-     <para>
-       ### Partially done, thanks to the new SRW/Z39.50 gateway.
+       This will shortly be available by means of Index Data's
+       SRW-to-Z39.50 gateway, currently in beta test.
       </para>
      </listitem>
  
@@ -389,6 +422,17 @@
  
      <listitem>
       <para>
+       Support for the use of Perl both for access to the Zebra API
+       and for building extension ``plug-ins'' such as input filters.
+       The code for this has been contributed to the source tree by
+       Peter Popovics
+       <email>pop@technomat.hu</email>,
+       and is in the process of being integrated and tested.
+     </para>
+    </listitem>
+
+    <listitem>
+     <para>
         Improved free-text searching. We're first and foremost octet jockeys and
         we're actively looking for organisations or people who'd like
         to contribute experience in relevance ranking and text