From 8ade6bf0476b510499488f499156604172b8d1fc Mon Sep 17 00:00:00 2001
From: Marc Cromme
Date: Thu, 1 Mar 2007 11:21:20 +0000
Subject: [PATCH] removed section on special record retrieval features, which need a rewrite - only commented out. added section on debugging of DOM filter configurations added a bullet point on semantics of DOM filter explaining that records not emerging record and index instructions are discarded, i.e. dropped on the floor. This meets Seb's wishes for the gutenberg collection
---
 doc/recordmodel-domxml.xml | 39 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 36 insertions(+), 3 deletions(-)

diff --git a/doc/recordmodel-domxml.xml b/doc/recordmodel-domxml.xml
index bb5b300..a9b85db 100644
--- a/doc/recordmodel-domxml.xml
+++ b/doc/recordmodel-domxml.xml
@@ -1,5 +1,5 @@
-
+
 &dom; &xml; Record Model and Filter Module
@@ -400,6 +400,19 @@ for details.
+
+
+ &dom; input documents that do not result in both one
+ unique valid
+ record instruction and one or more valid
+ index instructions cannot be searched and
+ found. Therefore,
+ processing of such invalid documents is aborted, and any content of
+ the <extract> and
+ <store> pipelines is discarded.
+ A warning is issued in the logs.
+
+
@@ -651,6 +664,25 @@
+
+ Debugging &dom; Filter Configurations
+
+ It can be very hard to debug a &dom; filter setup due to the many
+ successive &marc; syntax translations, &xml; stream splitting and
+ &xslt; transformations involved. As an aid, you always have the
+ power of the -s command line switch to the
+ zebraidx indexing command at hand:
+
+ zebraidx -s -c zebra.cfg update some_record_stream.xml
+
+ This command line simulates indexing and dumps a lot of debug
+ information in the logs, telling exactly which transformations
+ have been applied, what the documents look like after each
+ transformation, and which record ids and terms are sent to the indexer.
+
+
+
+
+
@@ -683,7 +715,7 @@
-
+
@@ -699,6 +731,7 @@
+ -->