Administrating Zebra


dbaccess accessfile

   Names a file which lists database subscriptions for individual users.
   The access file should consist of lines of the form
   username: dbnames, where dbnames is a list of database names
   separated by '+'. No whitespace is allowed in the database list.
   For example, the line admin: books+periodicals grants the user
   admin access to the databases books and periodicals.


Static and Dynamic Ranking

Zebra internally uses inverted indexes to look up term occurrences in
documents. Multiple queries from different indexes can be combined by
the binary boolean operations AND, OR and/or NOT (which is in fact a
binary AND NOT operation). To ensure fast query execution, all
indexes have to be sorted in the same order.

The indexes are normally sorted by document ID in ascending order,
and any query which does not invoke a special re-ranking function
will therefore retrieve the result set in document ID order.

If one defines the

   staticrank: 1

directive in the main Zebra configuration file, the internal document
keys used for ordering are augmented by a preceding integer which
contains the static rank of a given document, and the index lists are
ordered first by ascending static rank, then by ascending document
ID. This implies that the default rank 0 is the best rank at the
beginning of the list, and max int is the worst static rank.

The experimental alvis filter provides a directive to fetch static
rank information out of the indexed XML records, thus making all hit
sets ordered by ascending static rank and, for documents with the
same static rank, by ascending document ID. See the alvis filter
documentation for the gory details.

If one wants to fiddle with the static rank order, one has to invoke
additional re-ranking/re-ordering using dynamic re-ranking or score
functions. These functions return positive integer scores, where the
highest score is best; the hit sets will therefore be sorted by
descending score (in contrast to the index lists, which are sorted by
ascending rank number and document ID).

Dynamic ranking is enabled in the Zebra configuration file by a
directive like the following (use only one of these at a time!):

   rank: rank-1        # default
   rank: rank-static   # dummy
   rank: zvrank        # TF-IDF like

Notice that rank-1 and zvrank do not use the static rank information
in the list keys, and will produce the same ordering with or without
static ranking enabled.

The dummy rank-static re-ranking/scoring function returns just

   score = max int - staticrank

in order to preserve the ordering of hit sets with and without its
call. Obviously, to combine static and dynamic ranking usefully, one
wants to write a new ranking function, which is left as an exercise
for the reader.
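
For reference, a minimal zebra.cfg fragment that switches on static
rank storage together with the dummy rank-static scorer might look
like the sketch below; only the two directives discussed above are
shown, and the rest of the configuration (record filter, register
paths, etc.) is assumed to be in place:

   # zebra.cfg fragment (sketch only)
   staticrank: 1        # prepend a static rank to each document key
   rank: rank-static    # dummy scorer: score = max int - staticrank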
Extended Services: Remote Insert, Update and Delete

The extended services are not enabled by default in Zebra, because
they modify the system. In order to allow anybody to update, use

   perm.anonymous: rw

in the main Zebra configuration file zebra.cfg. Or, even better,
allow updates only for a particular admin user. For user admin, you
could use:

   perm.admin: rw
   passwd: passwordfile

And in passwordfile, specify users and passwords as colon-separated
strings:

   admin:secret

We can now start a yaz-client admin session and create a database:

   Z> adm-create

Now that the Default database has been created, we can insert an XML
file (esdd0006.grs from example/gils/records) and index it:

   Z> update insert 1 esdd0006.grs

The third parameter - 1 here - is the opaque record ID from the
extended service update. It is a record ID that we assign to the
record in question. If we do not assign one, the usual rules for
record matching apply (recordId: from zebra.cfg).

Actually, we should have a way to specify "no opaque record ID" for
yaz-client's update command; we'll fix that.

The newly inserted record can be searched as usual:

   Z> f utah
   Sent searchRequest.
   Received SearchResponse.
   Search was a success.
   Number of hits: 1, setno 1
   SearchResult-1: term=utah cnt=1
   records returned: 0
   Elapsed: 0.014179

Let's delete the beast:

   Z> update delete 1
   No last record (update ignored)
   Z> update delete 1 esdd0006.grs
   Got extended services response
   Status: done
   Elapsed: 0.072441
   Z> f utah
   Sent searchRequest.
   Received SearchResponse.
   Search was a success.
   Number of hits: 0, setno 2
   SearchResult-1: term=utah cnt=0
   records returned: 0
   Elapsed: 0.013610

If shadow registers are enabled in your zebra.cfg, you must run the
adm-commit command

   Z> adm-commit

after each update session in order to write your changes from the
shadow to the live register space.

Extended services are also available from the YAZ client layer. An
example of a YAZ-PHP extended service transaction is given here (the
connection details and the record body are illustrative):

   // connect to the server; host, user and password are examples only
   $yaz = yaz_connect('localhost:9999',
                      array('user' => 'admin', 'password' => 'secret'));

   // a small XML record to insert (element names are illustrative)
   $record = '<record><title>A fine specimen of a record</title></record>';

   $options = array('action' => 'recordInsert',
                    'syntax' => 'xml',
                    'record' => $record,
                    'databaseName' => 'mydatabase'
                   );

   yaz_es($yaz, 'update', $options);
   yaz_es($yaz, 'commit', array());
   yaz_wait();

   if ($error = yaz_error($yaz))
       echo "$error";

The action parameter can be any of recordInsert (will fail if the
record already exists), recordReplace (will fail if the record does
not exist), recordDelete (will fail if the record does not exist),
and specialUpdate (will insert or update the record as needed).

If a record is inserted using the action recordInsert, one can
specify the optional recordIdOpaque parameter, which is a
client-supplied, opaque record identifier. This identifier will
replace Zebra's own automagic identifier generation.

When using the action recordReplace or recordDelete, one must specify
the additional recordIdNumber parameter, which must be an existing
Zebra internal system ID number. When retrieving existing records,
the ID number is returned in the field /*/id:idzebra/localnumber in
the namespace xmlns:id="http://www.indexdata.dk/zebra/", where it can
be picked up for later record updates or deletes.
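
Building on the PHP example above, the following sketch shows how the
recordIdNumber parameter might be used from YAZ-PHP to replace an
existing record. Everything shown - connection details, database
name, the ID value 2 and the record body - is an illustrative
assumption; the real ID number is the value previously read from
/*/id:idzebra/localnumber.

   // sketch only: replace a record addressed by its Zebra system ID
   $yaz = yaz_connect('localhost:9999',
                      array('user' => 'admin', 'password' => 'secret'));

   $record = '<record><title>An updated specimen of a record</title></record>';

   $options = array('action' => 'recordReplace',
                    'recordIdNumber' => 2,   // assumed ID, see localnumber above
                    'syntax' => 'xml',
                    'record' => $record,
                    'databaseName' => 'mydatabase'
                   );

   yaz_es($yaz, 'update', $options);
   yaz_es($yaz, 'commit', array());
   yaz_wait();

   if ($error = yaz_error($yaz))
       echo "$error";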
YAZ Frontend Virtual Hosts

zebrasrv uses the YAZ server frontend and supports multiple virtual
servers behind multiple listening sockets.

&zebrasrv-virtual;

See the section "Virtual Hosts" in the YAZ manual:
http://www.indexdata.dk/yaz/doc/server.vhosts.tkl


Server Side CQL to PQF Query Translation

Using the

   <cql2rpn>l2rpn.txt</cql2rpn>

YAZ Frontend Virtual Hosts option, one can configure the YAZ frontend
CQL-to-PQF converter, specifying the interpretation of various CQL
indexes, relations, etc. in terms of Type-1 query attributes.

For example, using server-side CQL-to-PQF conversion, one might query
a Zebra server like this:

   Z> querytype cql
   Z> find text=(plant and soil)

and - if properly configured - even static relevance ranking can be
performed using CQL query syntax:

   Z> find text = /relevant (plant and soil)

By the way, the same configuration can be used to search using
client-side CQL-to-PQF conversion; the only difference is
querytype cql2rpn instead of querytype cql, and that the client is
called with a local conversion file:

   Z> querytype cql2rpn
   Z> find text=(plant and soil)

Exhaustive information can be found in the section "Specification of
CQL to RPN mappings" in the YAZ manual,
http://www.indexdata.dk/yaz/doc/tools.tkl#tools.cql.map, and shall
therefore not be repeated here.
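
To give an impression of what such a conversion file contains without
repeating the YAZ documentation, a minimal mapping might look like
the sketch below. Every entry is illustrative only; the index names
and attribute values must be adapted to the attribute set actually
used by your indexing rules.

   # sketch of a CQL-to-PQF conversion file (the file named in <cql2rpn>)
   set.cql                   = info:srw/cql-context-set/1/cql-v1.1
   index.cql.serverChoice    = 1=1016          # unqualified search terms
   index.cql.text            = 1=1016          # the "text" index used above
   relation.eq               = 2=3
   relationModifier.relevant = 2=102           # enables the /relevant modifier
   structure.*               = 4=1
   truncation.right          = 5=1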