X-Git-Url: http://git.indexdata.com/?a=blobdiff_plain;f=doc%2Fintroduction.xml;h=645c0aa635a7e0a4438e86bc5cf6f93a7114071b;hb=4fe772289b1ab968655c27b144d08fc69c113fd9;hp=1fbb3aa43cea4802d5c2e68984b9674664b640af;hpb=a31f9b2d25006c89ae7e9fb5870c0d222ee88a3a;p=idzebra-moved-to-github.git

diff --git a/doc/introduction.xml b/doc/introduction.xml
index 1fbb3aa..645c0aa 100644
--- a/doc/introduction.xml
+++ b/doc/introduction.xml
@@ -1,293 +1,290 @@
-Introduction
-
-
-Overview
-
-
-The Zebra system is a fielded free-text indexing and retrieval engine with a
-Z39.50 frontend. You can use any commercial or freeware Z39.50 client
-to access data stored in Zebra.
-
-
-
-The Zebra server is our first step towards the development of a fully
-configurable, open information system. Eventually, it will be paired
-off with a powerful Z39.50 client to support complex information
-management tasks within almost any application domain. We're making
-the server available now because it's no fun to be in the open
-information retrieval business all by yourself. We want to allow
-people with interesting data to make their things
-available in interesting ways, without having to start out
-by implementing yet another protocol stack from scratch.
-
-
-
-This document is an introduction to the Zebra system. It will tell you
-how to compile the software, and how to prepare your first database.
-It also explains how the server can be configured to give you the
-functionality that you need.
-
-
-
-If you find the software interesting, you should join the support
-mailing-list by sending email to zebra-request@indexdata.dk.
-
-
-
-
-
-Features
-
-
-This is a list of some of the most important features of the
-system.
-
-
-
-
-
-
-
-
-Supports updating - records can be added and deleted without
-rebuilding the index from scratch.
-The update procedure is tolerant to crashes or hard interrupts
-during register updating - registers can be reconstructed following a crash.
-Registers can be safely updated even while users are accessing the server.
-
-
-
-
-
-
-Supports large databases - files for indices, etc. can be
-automatically partitioned over multiple disks.
-
-
-
-
-
-
-Supports arbitrarily complex records - base input format is an
-SGML-like syntax which allows nested (structured) data elements, as
-well as variant forms of data.
-
-
-
-
-
-
-Supports random storage formats. A system of input filters driven by
-regular expressions allows you to easily process most ASCII-based
-data formats. SGML, ISO2709 (MARC), and raw text are also supported.
-
-
-
-
-
-
-Supports boolean queries as well as relevance-ranking (free-text)
-searching. Right truncation and masking in terms are supported, as
-well as full regular expressions.
-
-
-
-
-
-
-Supports multiple concrete syntaxes
-for record exchange (depending on the configuration): GRS-1, SUTRS,
-ISO2709 (*MARC). Records can be mapped between record syntaxes and
-schema on the fly.
-
-
-
-
-
-
-Supports approximate matching in registers (ie. spelling mistakes,
-etc).
-
-
-
-
-
-
-
-
-
-Protocol support:
-
-
-
-
-
-
-
-
-Protocol facilities: Init, Search, Retrieve, Browse and Sort.
-
-
-
-
-
-
-Piggy-backed presents are honored in the search-request.
-
-
-
-
-
-
-Named result sets are supported.
-
-
-
-
-
-
-Easily configured to support different application profiles, with
-tables for attribute sets, tag sets, and abstract syntaxes.
-Additional tables control facilities such as element mappings to
-different schema (eg., GILS-to-USMARC).
-
-
-
-
-
-
-Complex composition specifications using Espec-1 are partially
-supported (simple element requests only).
-
-
-
-
-
-
-Element Set Names are defined using the Espec-1 capability of the
-system, and are given in configuration files as simple element
-requests (and possibly variant requests).
-
-
-
-
-
-
-Some variant support (not fully implemented yet).
-
-
-
-
-
-
-Using the YAZ toolkit for the protocol implementation, the
-server can utilise a plug-in XTI/mOSI implementation (not included) to
-provide SR services over an OSI stack, as well as Z39.50 over TCP/IP.
-
-
-
-
-
-
-Zebra runs on most Unix-like systems as well as Windows NT - a binary
-distribution for Windows NT is forthcoming - so far, the installation
-requires MSVC++ to compile the system (we use version 5.0).
-
-
-
-
-
-
-
-
-
-
-
-Future Work
-
-
-This is a beta-release of the software, to allow you to look at
-it - try it out, and assess whether it can be of use to you.
-
-
-
-These are some of the plans that we have for the software in the near
-and far future, approximately ordered after their relative importance.
-Items marked with an
-asterisk will be implemented before the
-last beta release.
-
-
-
-
-
-
-
-
-*Complete the support for variants.
-
-
-
-
-
-
-*Finalize the data element include facility
-to support multimedia data elements in records.
-
-
-
-
-
-
-Add more sophisticated relevance ranking mechanisms. Add support for soundex
-and stemming. Add relevance feedback support.
-
-
-
-
-
-
-Complete EXPLAIN support.
-
-
-
-
-
-
-Add support for very large records by implementing segmentation and/or
-variant pieces.
-
-
-
-
-
-
-Support the Item Update extended service of the protocol.
-
-
-
-
-
-
-We want to add a management system that allows you to
-control your databases and configuration tables from a graphical
-interface. We'll probably use Tcl/Tk to stay platform-independent.
-
-
-
-
-
-
-
-
-
-Programmers thrive on user feedback. If you are interested in a facility that
-you don't see mentioned here, or if there's something you think we
-could do better, please drop us a mail. If you think it's all really
-neat, you're welcome to drop us a line saying that, too. You'll find
-contact info at the end of this file.
-
-
-
+
+ Introduction
+
+
+ Overview
+
+
+ The
+
+ Zebra
+ system is a fielded free-text indexing and retrieval engine with a
+ Z39.50 front-end. You can use our various toolkits or any commercial
+ or free-ware Z39.50 client to access data stored in Zebra.
+
+
+
+ FIXME - not a "first step" but a part of a complete system! -H
+
+
+
+ The Zebra server is our first step towards the development of a fully
+ configurable, open information system. Eventually, it will be paired
+ off with a powerful Z39.50 client to support complex information
+ management tasks within almost any application domain. We're making
+ the server available now because it's no fun to be in the open
+ information retrieval business all by yourself. We want to allow
+ people with interesting data to make their things
+ available in interesting ways, without having to start out
+ by implementing yet another protocol stack from scratch.
+
+
+
+ This document is an introduction to the Zebra system. It will tell you
+ how to compile the software, and how to prepare your first database.
+ It also explains how the server can be configured to give you the
+ functionality that you need.
+
+
+
+
+ If you find the software interesting, you should visit the
+
+ Zebra web site, where you can join the
+
+ mailing-list
+ by sending email to
+
+
+
+
+
+ Features
+
+
+ This is a list of some of the most important features of the
+ system.
+
+
+
+
+
+
+
+ Supports large databases - files for indices, etc. can be
+ automatically partitioned over multiple disks.
+
+
+
+
+
+ Supports arbitrarily complex records - base input format is an
+ SGML-like syntax which allows nested (structured) data elements, as
+ well as variant forms of data.
+
+
+
+
+
+ Robust updating - records can be added and deleted without
+ rebuilding the index from scratch.
+ The update procedure is tolerant to crashes or hard interrupts
+ during register updating - registers can be reconstructed following
+ a crash.
+ Registers can be safely updated even while users are accessing
+ the server.
+
+
+
+
+
+ Supports random storage formats. A system of input filters driven by
+ regular expressions allows you to easily process most ASCII-based
+ data formats. SGML, XML, ISO2709 (MARC), and raw text are also
+ supported.
+
+
+
+
+
+ Supports boolean queries as well as relevance-ranking (free-text)
+ searching. Right truncation and masking in terms are supported, as
+ well as full regular expressions.
+
+
+
+
+
+ Can import the data into Zebras own storage, or just refer to
+ external files (html pages).
+
+
+
+
+
+ Supports multiple concrete syntaxes
+ for record exchange (depending on the configuration): GRS-1, SUTRS,
+ XML, ISO2709 (*MARC). Records can be mapped between record syntaxes
+ and schema on the fly.
+
+
+
+
+
+ Supports approximate matching in registers (ie. spelling mistakes,
+ etc).
+
+
+
+
+
+ Zebra is written in portable C, so it runs on most Unix-like systems
+ as well as Windows NT - a binary distribution for Windows NT is available.
+
+
+
+
+
+
+
+
+ Protocol support:
+
+
+
+
+
+
+ Protocol facilities: Init, Search, Retrieve, Delete, Browse and Sort.
+ FIXME - Itemupdate. (Remove delete until that time, confuses people) -H
+
+
+
+
+
+ Piggy-backed presents are honored in the search-request.
+
+
+
+
+
+ Named result sets are supported.
+
+
+
+
+ Easily configured to support different application profiles, with
+ tables for attribute sets, tag sets, and abstract syntaxes.
+ Additional tables control facilities such as element mappings to
+ different schema (eg., GILS-to-USMARC).
+
+
+
+
+
+ Complex composition specifications using Espec-1 are partially
+ supported (simple element requests only).
+
+
+
+
+
+ Element Set Names are defined using the Espec-1 capability of the
+ system, and are given in configuration files as simple element
+ requests (and possibly variant requests).
+
+
+
+
+
+ Some variant support (not fully implemented yet).
+ FIXME - Test if complete enough - is it worth mentioning at all -H
+
+
+
+
+
+
+
+
+
+
+ Future Work
+
+
+ These are some of the plans that we have for the software in the near
+ and far future, approximately ordered after their relative importance.
+ Items marked with an
+ asterisk will be implemented before the
+ last beta release.
+ FIXME - What are the current plans?
+
+
+
+
+
+
+ *Complete the support for variants.
+ FIXME - who cares -H
+
+
+
+
+
+ *Finalize the data element include facility
+ to support multimedia data elements in records.
+
+
+
+
+
+ Add more sophisticated relevance ranking mechanisms.
+ Add support for soundex and stemming.
+ Add relevance feedback support.
+
+
+
+
+
+ Complete EXPLAIN support.
+
+
+
+
+
+ Add support for very large records by implementing segmentation and/or
+ variant pieces.
+
+
+
+
+
+ Support the Item Update extended service of the protocol.
+
+
+
+
+
+ We want to add a management system that allows you to
+ control your databases and configuration tables from a graphical
+ interface.
+
+
+
+
+
+
+ Programmers thrive on user feedback. If you are interested in a
+ facility that you don't see mentioned here, or if there's something
+ you think we could do better, please drop us a mail.
+ If you think it's all really neat, you're welcome to drop us a line
+ saying that, too. You'll find contact info at the end of this file.
+
+
+
+