From: Mike Taylor
Date: Thu, 29 Aug 2002 01:15:25 +0000 (+0000)
Subject: Smallish mods to introduction. Rephrasing, mostly.
X-Git-Tag: ZEBRA.1.3.2~43
X-Git-Url: http://git.indexdata.com/?p=idzebra-moved-to-github.git;a=commitdiff_plain;h=ca2d3fd0b71d134c00d5b741d48367e7b06263d2

Smallish mods to introduction. Rephrasing, mostly.
---

diff --git a/doc/introduction.xml b/doc/introduction.xml
index b9a68d2..189c4bc 100644
--- a/doc/introduction.xml
+++ b/doc/introduction.xml
@@ -1,5 +1,5 @@
-
+
 Introduction
@@ -52,8 +52,7 @@
 Features
- This is an overview of some of the most important features of the
- system.
+ This is an overview of some of Zebra's most important features:
@@ -61,34 +60,36 @@
- Supports large databases - files for indexes, etc. can be
+ Very large databases: files for indexes, etc. can be
 automatically partitioned over multiple disks.
- Supports arbitrarily complex records - base input format is an
- SGML-like syntax which allows nested (structured) data elements, as
- well as variant forms of data.
+ Arbitrarily complex records. The internal data format
+ is an structured format conceptually similar to XML or GRS-1,
+ which allows nested structured data elements and
+ variant forms of data.
- Robust updating - records can be added and deleted without
- rebuilding the index from scratch.
+ Robust updating - records can be added and deleted ``on the fly''
+ without rebuilding the index from scratch.
+ Registers can be safely updated even while users are accessing
+ the server.
 The update procedure is tolerant to crashes or hard interrupts during register updating - registers can be reconstructed following a crash.
- Registers can be safely updated even while users are accessing
- the server.
- Supports random storage formats. A system of input filters driven by
+ Configurable to understand many input formats.
+ A system of input filters driven by
 regular expressions allows you to easily process most ASCII-based data formats.
 SGML, XML, ISO2709 (MARC), and raw text are also supported.
@@ -97,40 +98,27 @@
- Supports boolean queries as well as relevance-ranking (free-text)
- searching. Right truncation and masking in terms are supported, as
- well as full regular expressions.
+ Searching supports a powerful combination of boolean queries as
+ well as relevance-ranking (free-text) queries. Truncation,
+ masking, full regular expression matching and "approximate
+ matching" (eg. spelling mistakes) are all supported.
- Can import the data into Zebras own storage, or just refer to
- external files (good for building indexes of "live"
- collections).
+ Index-only databases: data can be, and usually is, imported
+ into Zebra's own storage, but Zebra can also refer to
+ external files, building and maintaining indexes of "live"
+ collections.
- Supports multiple concrete syntaxes
- for record exchange (depending on the configuration): GRS-1, SUTRS,
- XML, ISO2709 (*MARC). Records can be mapped between record syntaxes
- and schema on the fly.
-
-
-
-
-
- Supports approximate matching in registers (ie. spelling mistakes,
- etc).
-
-
-
-
-
 Zebra is written in portable C, so it runs on most Unix-like systems
- as well as Windows NT - a binary distribution for Windows NT is available.
+ as well as Windows NT. A binary distribution for Windows NT is
+ available.
@@ -146,7 +134,8 @@
- Protocol facilities: Init, Search, Retrieve, Delete, Browse and Sort.
+ Protocol facilities: Init, Search, Present (retrieval), Delete,
+ Scan (index browsing) and Sort.
@@ -161,6 +150,7 @@
 Named result sets are supported.
+
 Easily configured to support different application profiles, with
@@ -172,16 +162,19 @@
- Complex composition specifications using Espec-1 are partially
- supported (simple element requests only).
+ Complex composition specifications using Espec-1 (partial support).
+ Element sets are defined using the Espec-1 capability,
+ and are specified in configuration files as simple element
+ requests (and, optionally, variant requests).
- Element Set Names are defined using the Espec-1 capability of the
- system, and are given in configuration files as simple element
- requests (and possibly variant requests).
+ Multiple record syntaxes
+ for data retrieval: GRS-1, SUTRS,
+ XML, ISO2709 (MARC), etc. Records can be mapped between record syntaxes
+ and schemas on the fly.
@@ -196,7 +189,8 @@
 Zebra has been deployed in numerous applications, in both the academic and commercial worlds, in application domains as diverse
- as bibliographic information, geospatial, ### (Help, guys!)
+ as bibliographic catalogues, geospatial information, structured
+ vocabulary browsing, ### (Help, guys!)
 Notable applications include the following:
@@ -238,11 +232,11 @@
- Future Work
+ Future Directions
 These are some of the plans that we have for the software in the near
- and far future, approximately ordered after their relative importance.
+ and far future, ordered approximately as we expect to work on them.
@@ -266,9 +260,10 @@
- Finalisation, documentation of the Zebra API. Consider
- exposing the API through SOAP as well (allowing updates,
- database management).
+ Finalisation and documentation of Zebra's C programming
+ API, allowing updates, database management and other functions
+ not readily expressed in Z39.50. We will also consider
+ exposing the API through SOAP.
@@ -287,7 +282,10 @@
 Programmers thrive on user feedback. If you are interested in a facility that you don't see mentioned here, or if there's something
- you think we could do better, please drop us a mail.
+ you think we could do better, please drop us a mail. Better still,
+ implement it and send us the patches.
+
+
+
 If you think it's all really neat, you're welcome to drop us a line saying that, too. You'll find contact info at the end of this file.