X-Git-Url: http://git.indexdata.com/?a=blobdiff_plain;f=doc%2Fadministration.xml;h=912f216b2f7f75faf396d384feb3e4f542dada19;hb=878a848853ad9e4f63da983476161613e114672d;hp=773edfda23ef3a233529135ba72cb8c6beff4a6d;hpb=3fbdef96a1c730eb52d1ffbd7c90143fb7168f25;p=idzebra-moved-to-github.git diff --git a/doc/administration.xml b/doc/administration.xml index 773edfd..912f216 100644 --- a/doc/administration.xml +++ b/doc/administration.xml @@ -1,5 +1,5 @@ - + Administrating Zebra + + + + + + + The rank-1 algorithm + does not use the static rank + information in the list keys, and will produce the same ordering + with or without static ranking enabled. + + + + + + - Notice that dynamic ranking is not compatible + Dynamic ranking is not compatible with estimated hit sizes, as all documents in - a hit set must be acessed to compute the correct placing in a + a hit set must be accessed to compute the correct placing in a ranking sorted list. Therefore the use attribute setting - @attr 2=102 clashes with - @attr 9=. + @attr 2=102 clashes with + @attr 9=integer. - - It is possible to apply dynamic ranking on parts of the PQF query - allone: - - Z> f @and @attr 2=102 @attr 1=1010 Utah @attr 1=1018 Springer - - searches for all documents which have the term 'Utah' on the - body of text, and which have the term 'Springer' in the publisher - field, and sort them in the order of the relvance ranking made on - the body of text index only. - - - Rank weight is a way to pass a value to a ranking algorithm - so that - one APT has one value - while another as a different one. For - example, we can - search for 'utah' in use attribute set 'title' with weight 30, as - well as in use attribute set 'any' with weight 20. - - Z> f @attr 2=102 @or @attr 9=30 @attr 1=4 utah @attr 9=20 utah - - - + + + + + Dynamically ranking CQL queries - The rank weight feature is experimental. It may change in future - releases of zebra, and is not production mature. 
+ Dynamic ranking can be enabled during server-side CQL + query expansion by adding @attr 2=102 + chunks to the CQL config file. For example + + relationModifier.relevant = 2=102 + + invokes dynamic ranking each time a CQL query of the form + + Z> querytype cql + Z> f alvis.text =/relevant house + + is issued. Dynamic ranking can also be automatically used on + specific CQL indexes by (for example) setting + + index.alvis.text = 1=text 2=102 + + which then invokes dynamic ranking each time a CQL query of the form + + Z> querytype cql + Z> f alvis.text = house + + is issued. - - - - Notice that dynamic ranking can be enabled in - sever side CQL query expansion by adding @attr - 2=102 to the CQL config file. For example - - relationModifier.relevant = 2=102 - - invokes dynamik ranking each time a CQL query of the form - - Z> querytype cql - Z> f alvis.text =/relevant house - - is issued. Dynamic ranking can be enabled on specific CQL indexes - by (for example) setting - - index.alvis.text = 1=text 2=102 - - which then invokes dynamik ranking each time a CQL query of the form - - Z> querytype cql - Z> f alvis.text = house - - is issued. - + + @@ -1144,59 +1368,63 @@ Sorting - Sorting is enabled in the configuration of record indexing. For - example, to enable sorting according to the BIB-1 + Zebra sorts efficiently using special sorting indexes + (type=s), so each sortable index must be known + at indexing time and specified in the configuration of record + indexing. For example, to enable sorting according to the BIB-1 Date/time-added-to-db field, one could add the line xelm /*/@created Date/time-added-to-db:s - to any .abs record indexing config file, or - similarily, one could add an indexing element of the form + to any .abs record-indexing configuration file. + Similarly, one could add an indexing element of the form ]]> - to any alvis indexing rule. + to any alvis-filter indexing stylesheet.
- To trigger a sorting on a pre-defined sorting index of type - s, we can issue a sort with BIB-1 - embedded sort attribute set 7. - The embedded sort is a way to specify sort within a query - thus - removing the need to send a Z39.50 Sort - Request separately. + Sorting can be specified at search time using a query term + carrying the non-standard + BIB-1 attribute-type 7. This removes the + need to send a Z39.50 Sort Request + separately, and can dramatically reduce latency when the client + and server are on separate networks. + The sorting part of the query is separate from the rest of the + query - the actual search specification - and must be combined + with it using OR. - The value after attribute type 7 is - 1 (=ascending), or 2 - (=descending). - The attributes+term (APT) node is separate from the rest of the - PQF query, and must be @or'ed. - The term associated with this attribute is the sorting level, - where - 0 specifies the primary sort key, - 1 the secondary sort key, and so on. + A sorting subquery needs two attributes: an index (such as a + BIB-1 type-1 attribute) specifying which index to sort on, and a + type-7 attribute whose value must be 1 for + ascending sorting, or 2 for descending. The + term associated with the sorting attribute is the priority of + the sort key, where 0 specifies the primary + sort key, 1 the secondary sort key, and so + on.
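The sorting subquery rules above can be sketched as a small Python helper that appends sort keys to a PQF search string. This is only an illustrative sketch: the name pqf_with_sort and its argument layout are hypothetical and not part of Zebra or YAZ, the helper does nothing but string assembly, and it assumes the caller passes BIB-1 use-attribute values for indexes that were configured as sorting indexes (type s) at indexing time.

```python
ASCENDING = 1   # value of attribute type 7 for ascending sorting
DESCENDING = 2  # value of attribute type 7 for descending sorting


def pqf_with_sort(search, sort_keys):
    """Combine a PQF search with sorting subqueries (hypothetical helper).

    search    -- PQF search specification, e.g. "@attr 1=1016 water"
    sort_keys -- sequence of (use_attribute, direction) pairs; the position
                 in the sequence becomes the sort priority (0 = primary key)
    """
    query = search
    for priority, (use_attribute, direction) in enumerate(sort_keys):
        if direction not in (ASCENDING, DESCENDING):
            raise ValueError("direction must be 1 (ascending) or 2 (descending)")
        # The sorting subquery: type-7 attribute, index attribute, and the
        # key priority as the term.
        sort_term = f"@attr 7={direction} @attr 1={use_attribute} {priority}"
        # Each sorting subquery is combined with the rest of the query by OR.
        query = f"@or {query} {sort_term}"
    return query


if __name__ == "__main__":
    # Search for "water", sort by title (1=4) ascending, then by 1=30 descending.
    print(pqf_with_sort("@attr 1=1016 water", [(4, ASCENDING), (30, DESCENDING)]))
```

Each additional key wraps the query built so far in another @or, so key priorities simply follow the order of the list.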
For example, a search for water, sort by title (ascending), is expressed by the PQF query - Z> f @or @attr 1=1016 water @attr 7=1 @attr 1=4 0 + @or @attr 1=1016 water @attr 7=1 @attr 1=4 0 whereas a search for water, sort by title ascending, then date descending would be - Z> f @or @or @attr 1=1016 water @attr 7=1 @attr 1=4 0 @attr 7=2 @attr 1=30 1 + @or @or @attr 1=1016 water @attr 7=1 @attr 1=4 0 @attr 7=2 @attr 1=30 1 Notice the fundamental differences between dynamic - ranking and sorting: there can only - be one ranking function defined and configured, but there can be - specified multiple sorting indexes dynamically at search - time. Ranking does not need to use specific indexes, which means, + ranking and sorting: there can be + only one ranking function defined and configured, but multiple + sorting indexes can be specified dynamically at search + time. Ranking does not need to use specific indexes, so dynamic ranking can be enabled and disabled without - re-indexing. On the other hand, sorting indexes need to be + re-indexing, whereas sorting indexes need to be defined before indexing. @@ -1208,26 +1436,212 @@ Extended Services: Remote Insert, Update and Delete + + Extended services are only supported when accessing the Zebra + server using the Z39.50 + protocol. The SRU protocol does + not support extended services. + + The extended services are not enabled by default in zebra - due to the - fact that they modify the system. - In order to allow anybody to update, use - - perm.anonymous: rw - + fact that they modify the system. Zebra can be configured + to allow anybody to + search, and to allow updates only for a particular admin user in the main zebra configuration file zebra.cfg. - Or, even better, allow only updates for a particular admin user. 
For - user admin, you could use: + For user admin, you could use: + perm.anonymous: r perm.admin: rw passwd: passwordfile - And in passwordfile, specify users and - passwords as colon seperated strings: + And in the password file + passwordfile, you have to specify users and + encrypted passwords as colon-separated strings. + Use a tool like htpasswd + to maintain the encrypted passwords. admin:secret - + + It is essential to configure Zebra to store records internally, + and to support + modifications and deletion of records: + + storeData: 1 + storeKeys: 1 + + The general record type should be set to any record filter which + is able to parse XML records. You may use either of the two + declarations (but not both simultaneously!) + + recordType: grs.xml + # recordType: alvis.filter_alvis_config.xml + + To enable transaction-safe shadow indexing, + which is especially important for this kind of operation, set + + shadow: directoryname: size (e.g. 1000M) + + It is not possible to carry information about record types or + similar to Zebra when using extended services, due to + limitations of the Z39.50 + protocol. Therefore, indexing filters cannot be chosen on a + per-record basis. One and only one general XML indexing filter + must be defined. + + + + + + + Extended services in the Z39.50 protocol + + + The Z39.50 standard allows + servers to accept special binary extended services + protocol packages, which may be used to insert, update and delete + records on servers. These carry control and update + information to the servers, encoded in seven package fields:
Table: Extended services Z39.50 Package Fields

Parameter | Value | Notes
type | 'update' | Must be set to trigger extended services
action | string | Extended service action type with one of four possible values: recordInsert, recordReplace, recordDelete, and specialUpdate
record | XML string | An XML formatted string containing the record
syntax | 'xml' | Only XML record syntax is supported
recordIdOpaque | string | Optional client-supplied, opaque record identifier used under insert operations.
recordIdNumber | positive number | Zebra's internal system number, only for update actions.
databaseName | database identifier | The name of the database to which the extended services should be applied.
+ + + + The action parameter can be any of + recordInsert (will fail if the record already exists), + recordReplace (will fail if the record does not exist), + recordDelete (will fail if the record does not + exist), and + specialUpdate (will insert or update the record + as needed). + + + + During a recordInsert action, the + usual rules for internal record ID generation apply, unless an + optional recordIdNumber Zebra internal ID or a + recordIdOpaque string identifier is assigned. + The default ID generation is + configured using the recordId: setting in + zebra.cfg. + + + + The actions recordReplace and + recordDelete require either the + recordIdNumber parameter, which must be an + existing Zebra internal system ID number, or the optional + recordIdOpaque string parameter. + + + + When retrieving existing + records indexed with GRS indexing filters, the Zebra internal + ID number is returned in the field + /*/id:idzebra/localnumber in the namespace + xmlns:id="http://www.indexdata.dk/zebra/", + where it can be picked up for later record updates or deletes. + + + Records indexed with the alvis filter + have similar means to discover the internal Zebra ID. + + + + The recordIdOpaque string parameter + is a client-supplied, opaque record + identifier, which may be used under + insert, update and delete operations. The + client software is responsible for assigning these to + records. This identifier will + replace zebra's own automagic identifier generation with a unique + mapping from recordIdOpaque to the + Zebra internal recordIdNumber. + The opaque recordIdOpaque string + identifiers + are not visible in retrieval records, nor are they + searchable, so the value of this parameter is + questionable. It serves mostly as a convenient mapping from + application domain string identifiers to Zebra internal IDs. + + +
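The success and failure rules of the four actions described above can be summarised with a toy model. The sketch below is purely illustrative: ToyRegister is a hypothetical in-memory dictionary keyed by a record identifier, it does not talk to Zebra, and it does not reflect how Zebra implements extended services internally.

```python
class ToyRegister:
    """Hypothetical in-memory stand-in for a record store; not Zebra's API."""

    def __init__(self):
        self.records = {}  # maps record identifier -> record content

    def apply(self, action, record_id, record=None):
        """Apply one extended-service style action to the toy store."""
        exists = record_id in self.records
        if action == "recordInsert":
            # Fails if the record already exists.
            if exists:
                raise KeyError(f"record {record_id!r} already exists")
            self.records[record_id] = record
        elif action == "recordReplace":
            # Fails if the record does not exist.
            if not exists:
                raise KeyError(f"record {record_id!r} does not exist")
            self.records[record_id] = record
        elif action == "recordDelete":
            # Fails if the record does not exist.
            if not exists:
                raise KeyError(f"record {record_id!r} does not exist")
            del self.records[record_id]
        elif action == "specialUpdate":
            # Inserts or updates the record as needed; never fails on existence.
            self.records[record_id] = record
        else:
            raise ValueError(f"unknown action {action!r}")


if __name__ == "__main__":
    register = ToyRegister()
    register.apply("recordInsert", "id1234", "<record>first</record>")
    register.apply("specialUpdate", "id1234", "<record>second</record>")
    register.apply("recordDelete", "id1234")
    print(register.records)
```

In this model specialUpdate is the only action that is safe to issue without knowing whether the record is already present, which is why it is convenient for bulk loading.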
+ + + + Extended services from yaz-client + We can now start a yaz-client admin session and create a database: @@ -1241,14 +1655,11 @@ from example/gils/records) and index it: update insert 1 esdd0006.grs + Z> update insert id1234 esdd0006.grs ]]> - The 3rd parameter - 1 here - - is the opaque record ID from Ext update. - It a record ID that we assign to the record - in question. If we do not - assign one, the usual rules for match apply (recordId: from zebra.cfg). + The third parameter - id1234 here - + is the recordIdOpaque package field. Actually, we should have a way to specify "no opaque record id" for @@ -1270,10 +1681,11 @@ - Let's delete the beast: + Let's delete the beast, using the same + recordIdOpaque string parameter: update delete 1 + Z> update delete id1234 No last record (update ignored) Z> update delete 1 esdd0006.grs Got extended services response @@ -1302,8 +1714,14 @@ after each update session in order to write your changes from the shadow to the live register space. + + + + + Extended services from yaz-php + - Extended services are also available from the YAZ client layer. An + Extended services are also available from the YAZ PHP client layer. An example of a YAZ-PHP extended service transaction is given here: - The action parameter can be any of - recordInsert (will fail if the record already exists), - recordReplace (will fail if the record does not exist), - recordDelete (will fail if the record does not - exist), and - specialUpdate (will insert or update the record - as needed). - - - If a record is inserted - using the action recordInsert - one can specify the optional - recordIdOpaque parameter, which is a - client-supplied, opaque record identifier. This identifier will - replace zebra's own automagic identifier generation. - - - When using the action recordReplace or - recordDelete, one must specify the additional - recordIdNumber parameter, which must be an - existing Zebra internal system ID number. 
When retrieving existing - records, the ID number is returned in the field - /*/id:idzebra/localnumber in the namespace - xmlns:id="http://www.indexdata.dk/zebra/", - where it can be picked up for later record updates or deletes. - + +
@@ -1367,74 +1761,6 @@ - - Server Side CQL to PQF Query Translation - - Using the - <cql2rpn>l2rpn.txt</cql2rpn> - YAZ Frontend Virtual - Hosts option, one can configure - the YAZ Frontend CQL-to-PQF - converter, specifying the interpretation of various - CQL - indexes, relations, etc. in terms of Type-1 query attributes. - - - - For example, using server-side CQL-to-PQF conversion, one might - query a zebra server like this: - - querytype cql - Z> find text=(plant and soil) - ]]> - - and - if properly configured - even static relevance ranking can - be performed using CQL query syntax: - - find text = /relevant (plant and soil) - ]]> - - - - - By the way, the same configuration can be used to - search using client-side CQL-to-PQF conversion: - (the only difference is querytype cql2rpn - instead of - querytype cql, and the call specifying a local - conversion file) - - querytype cql2rpn - Z> find text=(plant and soil) - ]]> - - - - - Exhaustive information can be found in the - Section "Specification of CQL to RPN mappings" in the YAZ manual. - - http://www.indexdata.dk/yaz/doc/tools.tkl#tools.cql.map, - and shall therefore not be repeated here. - - - - -