X-Git-Url: http://git.indexdata.com/?p=idzebra-moved-to-github.git;a=blobdiff_plain;f=doc%2Frecordmodel-domxml.xml;h=16d7001e69f849232f0253a9f8285be40b51868b;hp=50876cb1d83e2c4473eaf45cc1b3fb9823c44a91;hb=c3ff843e467932c6027a8b3b2ebda7b44612447e;hpb=97a7adeb9e5059463f039495cc01cfa448463a27 diff --git a/doc/recordmodel-domxml.xml b/doc/recordmodel-domxml.xml index 50876cb..16d7001 100644 --- a/doc/recordmodel-domxml.xml +++ b/doc/recordmodel-domxml.xml @@ -1,6 +1,6 @@ - + &acro.dom; &acro.xml; Record Model and Filter Module - + The record model described in this chapter applies to the fundamental, structured &acro.xml; @@ -10,174 +10,174 @@ releases of the &zebra; Information Server. - - + +
&acro.dom; Record Filter Architecture - - The &acro.dom; &acro.xml; filter uses a standard &acro.dom; &acro.xml; structure as - internal data model, and can therefore parse, index, and display - any &acro.xml; document type. It is well suited to work on - standardized &acro.xml;-based formats such as Dublin Core, MODS, METS, - MARCXML, OAI-PMH, RSS, and performs equally well on any other - non-standard &acro.xml; format. - - - A parser for binary &acro.marc; records based on the ISO2709 library - standard is provided, it transforms these to the internal - &acro.marcxml; &acro.dom; representation. Other binary document parsers - are planned to follow. - + + The &acro.dom; &acro.xml; filter uses a standard &acro.dom; &acro.xml; structure as + internal data model, and can therefore parse, index, and display + any &acro.xml; document type. It is well suited to work on + standardized &acro.xml;-based formats such as Dublin Core, MODS, METS, + MARCXML, OAI-PMH, RSS, and performs equally well on any other + non-standard &acro.xml; format. + + + A parser for binary &acro.marc; records based on the ISO2709 library + standard is provided, it transforms these to the internal + &acro.marcxml; &acro.dom; representation. Other binary document parsers + are planned to follow. + - - The &acro.dom; filter architecture consists of four - different pipelines, each being a chain of arbitrarily many successive - &acro.xslt; transformations of the internal &acro.dom; &acro.xml; - representations of documents. - + + The &acro.dom; filter architecture consists of four + different pipelines, each being a chain of arbitrarily many successive + &acro.xslt; transformations of the internal &acro.dom; &acro.xml; + representations of documents. + -
- &acro.dom; &acro.xml; filter architecture - - - - - - - - - - - [Here there should be a diagram showing the &acro.dom; &acro.xml; - filter architecture, but is seems that your - tool chain has not been able to include the diagram in this - document.] - - - -
- - - - &acro.dom; &acro.xml; filter pipelines overview - - - - Name - When - Description - Input - Output - - - - - - input - first - input parsing and initial - transformations to common &acro.xml; format - Input raw &acro.xml; record buffers, &acro.xml; streams and - binary &acro.marc; buffers - Common &acro.xml; &acro.dom; - - - extract - second - indexing term extraction - transformations - Common &acro.xml; &acro.dom; - Indexing &acro.xml; &acro.dom; - - - store - second - transformations before internal document - storage - Common &acro.xml; &acro.dom; - Storage &acro.xml; &acro.dom; - - - retrieve - third - multiple document retrieve transformations from - storage to different output - formats are possible - Storage &acro.xml; &acro.dom; - Output &acro.xml; syntax in requested formats - - - -
+
+ &acro.dom; &acro.xml; filter architecture + + + + + + + + + + + [Here there should be a diagram showing the &acro.dom; &acro.xml; + filter architecture, but it seems that your + tool chain has not been able to include the diagram in this + document.] + + + +
+ + + + &acro.dom; &acro.xml; filter pipelines overview + + + + Name + When + Description + Input + Output + + + + + + input + first + input parsing and initial + transformations to common &acro.xml; format + Input raw &acro.xml; record buffers, &acro.xml; streams and + binary &acro.marc; buffers + Common &acro.xml; &acro.dom; + + + extract + second + indexing term extraction + transformations + Common &acro.xml; &acro.dom; + Indexing &acro.xml; &acro.dom; + + + store + second + transformations before internal document + storage + Common &acro.xml; &acro.dom; + Storage &acro.xml; &acro.dom; + + + retrieve + third + multiple document retrieve transformations from + storage to different output + formats are possible + Storage &acro.xml; &acro.dom; + Output &acro.xml; syntax in requested formats + + + +
- - The &acro.dom; &acro.xml; filter pipelines use &acro.xslt; (and if supported on - your platform, even &acro.exslt;), it brings thus full &acro.xpath; - support to the indexing, storage and display rules of not only - &acro.xml; documents, but also binary &acro.marc; records. - -
+ + The &acro.dom; &acro.xml; filter pipelines use &acro.xslt; (and if supported on + your platform, even &acro.exslt;), it brings thus full &acro.xpath; + support to the indexing, storage and display rules of not only + &acro.xml; documents, but also binary &acro.marc; records. + + -
- &acro.dom; &acro.xml; filter pipeline configuration +
+ &acro.dom; &acro.xml; filter pipeline configuration The experimental, loadable &acro.dom; &acro.xml;/&acro.xslt; filter module - mod-dom.so + mod-dom.so is invoked by the zebra.cfg configuration statement recordtype.xml: dom.db/filter_dom_conf.xml - In this example the &acro.dom; &acro.xml; filter is configured to work - on all data files with suffix + In this example the &acro.dom; &acro.xml; filter is configured to work + on all data files with suffix *.xml, where the configuration file is found in the path db/filter_dom_conf.xml. The &acro.dom; &acro.xslt; filter configuration file must be valid &acro.xml;. It might look like this: - - - - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + + + + + - ]]> + ]]> - The root &acro.xml; element <dom> and all other &acro.dom; - &acro.xml; filter elements are residing in the namespace - xmlns="http://indexdata.com/zebra-2.0". + The root &acro.xml; element <dom> and all other &acro.dom; + &acro.xml; filter elements are residing in the namespace + xmlns="http://indexdata.com/zebra-2.0". All pipeline definition elements - i.e. the - <input>, - <extract>, - <store>, and - <retrieve> elements - are optional. - Missing pipeline definitions are just interpreted - do-nothing identity pipelines. + <input>, + <extract>, + <store>, and + <retrieve> elements - are optional. + Missing pipeline definitions are just interpreted + do-nothing identity pipelines. - All pipeline definition elements may contain zero or more + All pipeline definition elements may contain zero or more ]]> &acro.xslt; transformation instructions, which are performed sequentially from top to bottom. @@ -188,80 +188,80 @@
- Input pipeline - - The <input> pipeline definition element - may contain either one &acro.xml; Reader definition - ]]>, used to split - an &acro.xml; collection input stream into individual &acro.xml; &acro.dom; - documents at the prescribed element level, - or one &acro.marc; binary - parsing instruction - ]]>, which defines - a conversion to &acro.marcxml; format &acro.dom; trees. The allowed values - of the inputcharset attribute depend on your - local iconv set-up. - - - Both input parsers deliver individual &acro.dom; &acro.xml; documents to the - following chain of zero or more - ]]> - &acro.xslt; transformations. At the end of this pipeline, the documents - are in the common format, used to feed both the - <extract> and + Input pipeline + + The <input> pipeline definition element + may contain either one &acro.xml; Reader definition + ]]>, used to split + an &acro.xml; collection input stream into individual &acro.xml; &acro.dom; + documents at the prescribed element level, + or one &acro.marc; binary + parsing instruction + ]]>, which defines + a conversion to &acro.marcxml; format &acro.dom; trees. The allowed values + of the inputcharset attribute depend on your + local iconv set-up. + + + Both input parsers deliver individual &acro.dom; &acro.xml; documents to the + following chain of zero or more + ]]> + &acro.xslt; transformations. At the end of this pipeline, the documents + are in the common format, used to feed both the + <extract> and <store> pipelines. - +
- Extract pipeline - - The <extract> pipeline takes documents - from any common &acro.dom; &acro.xml; format to the &zebra; specific - indexing &acro.dom; &acro.xml; format. - It may consist of zero ore more - ]]> - &acro.xslt; transformations, and the outcome is handled to the - &zebra; core to drive the process of building the inverted - indexes. See - for - details. - + Extract pipeline + + The <extract> pipeline takes documents + from any common &acro.dom; &acro.xml; format to the &zebra; specific + indexing &acro.dom; &acro.xml; format. + It may consist of zero or more + ]]> + &acro.xslt; transformations, and the outcome is handed to the + &zebra; core to drive the process of building the inverted + indexes. See + for + details. +
- Store pipeline - The <store> pipeline takes documents - from any common &acro.dom; &acro.xml; format to the &zebra; specific - storage &acro.dom; &acro.xml; format. - It may consist of zero ore more - ]]> - &acro.xslt; transformations, and the outcome is handled to the - &zebra; core for deposition into the internal storage system. -
+ Store pipeline + The <store> pipeline takes documents + from any common &acro.dom; &acro.xml; format to the &zebra; specific + storage &acro.dom; &acro.xml; format. + It may consist of zero ore more + ]]> + &acro.xslt; transformations, and the outcome is handled to the + &zebra; core for deposition into the internal storage system. +
- Retrieve pipeline + Retrieve pipeline - Finally, there may be one or more - <retrieve> pipeline definitions, each - of them again consisting of zero or more - ]]> - &acro.xslt; transformations. These are used for document - presentation after search, and take the internal storage &acro.dom; - &acro.xml; to the requested output formats during record present - requests. + Finally, there may be one or more + <retrieve> pipeline definitions, each + of them again consisting of zero or more + ]]> + &acro.xslt; transformations. These are used for document + presentation after search, and take the internal storage &acro.dom; + &acro.xml; to the requested output formats during record present + requests. - The possible multiple + The possible multiple <retrieve> pipeline definitions are distinguished by their unique name - attributes, these are the literal schema or - element set names used in - &acro.srw;, - &acro.sru; and - &acro.z3950; protocol queries. - + attributes, these are the literal schema or + element set names used in + &acro.srw;, + &acro.sru; and + &acro.z3950; protocol queries. +
@@ -277,288 +277,303 @@ namespace xmlns:z="http://indexdata.com/zebra-2.0". -
- Processing-instruction governed indexing format - - The output of the processing instruction driven +
+ Processing-instruction governed indexing format + + The output of the processing instruction driven indexing &acro.xslt; stylesheets must contain - processing instructions named - zebra-2.0. + processing instructions named + zebra-2.0. The output of the &acro.xslt; indexing transformation is then parsed using &acro.dom; methods, and the contained instructions are performed on the elements and their - subtrees directly following the processing instructions. - - - For example, the output of the command - + subtrees directly following the processing instructions. + + + For example, the output of the command + xsltproc dom-index-pi.xsl marc-one.xml - - might look like this: - - - - - - 11224466 - - How to program a computer + + might look like this: + + + + + + 11224466 + + How to program a computer - ]]> - - -
+ ]]> + +
+
-
- Magic element governed indexing format - - The output of the indexing &acro.xslt; stylesheets must contain - certain elements in the magic - xmlns:z="http://indexdata.com/zebra-2.0" - namespace. The output of the &acro.xslt; indexing transformation is then - parsed using &acro.dom; methods, and the contained instructions are - performed on the magic elements and their - subtrees. - - - For example, the output of the command - - xsltproc dom-index-element.xsl marc-one.xml - - might look like this: - - - - 11224466 - - How to program a computer +
+ Magic element governed indexing format + + The output of the indexing &acro.xslt; stylesheets must contain + certain elements in the magic + xmlns:z="http://indexdata.com/zebra-2.0" + namespace. The output of the &acro.xslt; indexing transformation is then + parsed using &acro.dom; methods, and the contained instructions are + performed on the magic elements and their + subtrees. + + + For example, the output of the command + + xsltproc dom-index-element.xsl marc-one.xml + + might look like this: + + + + 11224466 + + How to program a computer - ]]> - - -
+ ]]> +
+
+
-
- Semantics of the indexing formats +
+ Semantics of the indexing formats - - Both indexing formats are defined with equal semantics and - behavior in mind: - + + Both indexing formats are defined with equal semantics and + behavior in mind: + - &zebra; specific instructions are either + &zebra; specific instructions are either processing instructions named zebra-2.0 or elements contained in the namespace xmlns:z="http://indexdata.com/zebra-2.0". - + - There must be exactly one record - instruction, which sets the scope for the following, - possibly nested index instructions. - + There must be exactly one record + instruction, which sets the scope for the following, + possibly nested index instructions. + - - The unique record instruction - may have additional attributes id, - rank and type. - Attribute id is the value of the opaque ID - and may be any string not containing the whitespace character - ' '. - The rank attribute value must be a - non-negative integer. See - . - The type attribute specifies how the record - is to be treated. The following values may be given for - type: - - - insert - - - The record is inserted. If the record already exists, it is - skipped (i.e. not replaced). - - - - - replace - - - The record is replaced. If the record does not already exist, - it is skipped (i.e. not inserted). - - - - - delete - - - The record is deleted. If the record does not already exist, - it is skipped (i.e. nothing is deleted). - - - - - update - - - The record is inserted or replaced depending on whether the - record exists or not. This is the default behavior but may - be effectively changed by "outside" the scope of the DOM - filter by zebraidx commands or extended services updates. - - - - - Note that the value of type is only used to - determine the action if and only if the Zebra indexer is running - in "update" mode (i.e zebraidx update) or if the specialUpdate - action of the - Extended + + The unique record instruction + may have additional attributes id, + rank and type. 
+ Attribute id is the value of the opaque ID + and may be any string not containing the whitespace character + ' '. + The rank attribute value must be a + non-negative integer. See + . + The type attribute specifies how the record + is to be treated. The following values may be given for + type: + + + insert + + + The record is inserted. If the record already exists, it is + skipped (i.e. not replaced). + + + + + replace + + + The record is replaced. If the record does not already exist, + it is skipped (i.e. not inserted). + + + + + delete + + + The record is deleted. If the record does not already exist, + a warning is issued and the rest of the records in + the input stream are skipped. + + + + + update + + + The record is inserted or replaced depending on whether the + record exists or not. This is the default behavior but may + be effectively changed from "outside" the scope of the DOM + filter by zebraidx commands or extended services updates. + + + + + adelete + + + The record is deleted. If the record does not already exist, + it is skipped (i.e. nothing is deleted). + + + + Requires version 2.0.54 or later. + + + + + + Note that the value of type is used to + determine the action only if the Zebra indexer is running + in "update" mode (i.e. zebraidx update) or if the specialUpdate + action of the + Extended Service Update is used. - For this reason a specialUpdate may end up deleting records! - + For this reason a specialUpdate may end up deleting records! + - Multiple and possible nested index - instructions must contain at least one + Multiple and possibly nested index + instructions must contain at least one indexname:indextype - pair, and may contain multiple such pairs separated by the + pair, and may contain multiple such pairs separated by the whitespace character ' '. In each index - pair, the name and the type of the index is separated by a + pair, the name and the type of the index are separated by a colon character ':'. 
- + - + Any index name consisting of ASCII letters, and following the - standard &zebra; rules will do, see + standard &zebra; rules will do, see . - + - + Index types are restricted to the values defined in the standard configuration file default.idx, see - and + and for details. - + - + &acro.dom; input documents which are not resulting in both one - unique valid - record instruction and one or more valid + unique valid + record instruction and one or more valid index instructions can not be searched and found. Therefore, invalid document processing is aborted, and any content of - the <extract> and - <store> pipelines is discarted. - A warning is issued in the logs. - + the <extract> and + <store> pipelines is discarded. + A warning is issued in the logs. + - - - The examples work as follows: - From the original &acro.xml; file - marc-one.xml (or from the &acro.xml; record &acro.dom; of the - same form coming from an <input> - pipeline), - the indexing - pipeline <extract> - produces an indexing &acro.xml; record, which is defined by - the record instruction - &zebra; uses the content of - z:id="11224466" - or - id=11224466 - as internal - record ID, and - in case static ranking is set - the content of - rank=42 - or - z:rank="42" - as static rank. - - + - In these examples, the following literal indexes are constructed: - + The examples work as follows: + From the original &acro.xml; file + marc-one.xml (or from the &acro.xml; record &acro.dom; of the + same form coming from an <input> + pipeline), + the indexing + pipeline <extract> + produces an indexing &acro.xml; record, which is defined by + the record instruction + &zebra; uses the content of + z:id="11224466" + or + id=11224466 + as internal + record ID, and - in case static ranking is set - the content of + rank=42 + or + z:rank="42" + as static rank. 
+ + + + In these examples, the following literal indexes are constructed: + any:w control:0 title:w title:p title:s - - where the indexing type is defined after the - literal ':' character. - Any value from the standard configuration - file default.idx will do. - Finally, any - text() node content recursively contained - inside the <z:index> element, or any - element following a index processing instruction, - will be filtered through the - appropriate char map for character normalization, and will be - inserted in the named indexes. - - - Finally, this example configuration can be queried using &acro.pqf; - queries, either transported by &acro.z3950;, (here using a yaz-client) - - open localhost:9999 - Z> elem dc - Z> form xml - Z> - Z> find @attr 1=control @attr 4=3 11224466 - Z> scan @attr 1=control @attr 4=3 "" - Z> - Z> find @attr 1=title program - Z> scan @attr 1=title "" - Z> - Z> find @attr 1=title @attr 4=2 "How to program a computer" - Z> scan @attr 1=title @attr 4=2 "" - ]]> - - or the proprietary - extensions x-pquery and - x-pScanClause to - &acro.sru;, and &acro.srw; - - - - See for more information on &acro.sru;/&acro.srw; - configuration, and or the &yaz; - &acro.cql; section - for the details or the &yaz; frontend server. - - - Notice that there are no *.abs, - *.est, *.map, or other &acro.grs1; - filter configuration files involves in this process, and that the - literal index names are used during search and retrieval. - - - In case that we want to support the usual - bib-1 &acro.z3950; numeric access points, it is a - good idea to choose string index names defined in the default - configuration file tab/bib1.att, see - - - -
+ + where the indexing type is defined after the + literal ':' character. + Any value from the standard configuration + file default.idx will do. + Finally, any + text() node content recursively contained + inside the <z:index> element, or any + element following an index processing instruction, + will be filtered through the + appropriate char map for character normalization, and will be + inserted in the named indexes. + + + Finally, this example configuration can be queried using &acro.pqf; + queries, either transported by &acro.z3950;, (here using a yaz-client) + + open localhost:9999 + Z> elem dc + Z> form xml + Z> + Z> find @attr 1=control @attr 4=3 11224466 + Z> scan @attr 1=control @attr 4=3 "" + Z> + Z> find @attr 1=title program + Z> scan @attr 1=title "" + Z> + Z> find @attr 1=title @attr 4=2 "How to program a computer" + Z> scan @attr 1=title @attr 4=2 "" + ]]> + + or the proprietary + extensions x-pquery and + x-pScanClause to + &acro.sru;, and &acro.srw; + + + + See for more information on &acro.sru;/&acro.srw; + configuration, and or the &yaz; + &acro.cql; section + for the details of the &yaz; frontend server. + + + Notice that there are no *.abs, + *.est, *.map, or other &acro.grs1; + filter configuration files involved in this process, and that the + literal index names are used during search and retrieval. + + + In case that we want to support the usual + bib-1 &acro.z3950; numeric access points, it is a + good idea to choose string index names defined in the default + configuration file tab/bib1.att, see + + + +
@@ -568,14 +583,14 @@ &acro.dom; Record Model Configuration -
- &acro.dom; Indexing Configuration +
+ &acro.dom; Indexing Configuration As mentioned above, there can be only one indexing pipeline, and configuration of the indexing process is a synonym of writing an &acro.xslt; stylesheet which produces &acro.xml; output containing the - magic processing instructions or elements discussed in - . + magic processing instructions or elements discussed in + . Obviously, there are million of different ways to accomplish this task, and some comments and code snippets are in order to enlighten the wary. @@ -586,11 +601,11 @@ means that the output &acro.xml; structure is taken as starting point of the internal structure of the &acro.xslt; stylesheet, and portions of the input &acro.xml; are pulled out and inserted - into the right spots of the output &acro.xml; structure. + into the right spots of the output &acro.xml; structure. On the other side, push &acro.xslt; stylesheets are recursively calling their template definitions, a process which is commanded - by the input &acro.xml; structure, and is triggered to produce + by the input &acro.xml; structure, and is triggered to produce some output &acro.xml; whenever some special conditions in the input stylesheets are met. The pull type is well-suited for input @@ -599,187 +614,187 @@ push type might be the only possible way to sort out deeply recursive input &acro.xml; formats. - + A pull stylesheet example used to index &acro.oai; harvested records could use some of the following template definitions: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + xmlns:z="http://indexdata.com/zebra-2.0" + xmlns:oai="http://www.openarchives.org/&acro.oai;/2.0/" + xmlns:oai_dc="http://www.openarchives.org/&acro.oai;/2.0/oai_dc/" + xmlns:dc="http://purl.org/dc/elements/1.1/" + version="1.0"> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + ]]> -
+
-
- &acro.dom; Indexing &acro.marcxml; +
+ &acro.dom; Indexing &acro.marcxml; - The &acro.dom; filter allows indexing of both binary &acro.marc; records - and &acro.marcxml; records, depending on its configuration. - A typical &acro.marcxml; record might look like this: - + The &acro.dom; filter allows indexing of both binary &acro.marc; records + and &acro.marcxml; records, depending on its configuration. + A typical &acro.marcxml; record might look like this: + - 42 - 00366nam 22001698a 4500 - 11224466 - DLC - 00000000000000.0 - 910710c19910701nju 00010 eng - - 11224466 - - - DLC - DLC - - - 123-xyz - - - Jack Collins - - - How to program a computer - - - Penguin - - - 8710 - - - p. cm. - - + 42 + 00366nam 22001698a 4500 + 11224466 + DLC + 00000000000000.0 + 910710c19910701nju 00010 eng + + 11224466 + + + DLC + DLC + + + 123-xyz + + + Jack Collins + + + How to program a computer + + + Penguin + + + 8710 + + + p. cm. + + ]]> - + - It is easily possible to make string manipulation in the &acro.dom; - filter. For example, if you want to drop some leading articles - in the indexing of sort fields, you might want to pick out the - &acro.marcxml; indicator attributes to chop of leading substrings. If - the above &acro.xml; example would have an indicator - ind2="8" in the title field - 245, i.e. - + It is easily possible to make string manipulation in the &acro.dom; + filter. For example, if you want to drop some leading articles + in the indexing of sort fields, you might want to pick out the + &acro.marcxml; indicator attributes to chop of leading substrings. If + the above &acro.xml; example would have an indicator + ind2="8" in the title field + 245, i.e. 
+ - How to program a computer - + + How to program a computer + ]]> - - one could write a template taking into account this information - to chop the first 8 characters from the - sorting index title:s like this: - + + one could write a template taking into account this information + to chop the first 8 characters from the + sorting index title:s like this: + - - - 0 - - - - - - - - - - - - - + + + 0 + + + + + + + + + + + + + ]]> - - The output of the above &acro.marcxml; and &acro.xslt; excerpt would then be: - + + The output of the above &acro.marcxml; and &acro.xslt; excerpt would then be: + How to program a computer - program a computer + How to program a computer + program a computer ]]> - - and the record would be sorted in the title index under 'P', not 'H'. + + and the record would be sorted in the title index under 'P', not 'H'. -
+
-
- &acro.dom; Indexing Wizardry +
+ &acro.dom; Indexing Wizardry The names and types of the indexes can be defined in the indexing &acro.xslt; stylesheet dynamically according to - content in the original &acro.xml; records, which has + content in the original &acro.xml; records, which has opportunities for great power and wizardry as well as grande - disaster. + disaster. The following excerpt of a push stylesheet - might + might be a good idea according to your strict control of the &acro.xml; input format (due to rigorous checking against well-defined and tight RelaxNG or &acro.xml; Schema's, for example): - - - - + + + + + ]]> - This template creates indexes which have the name of the working + This template creates indexes which have the name of the working node of any input &acro.xml; file, and assigns a '1' to the index. - The example query - find @attr 1=xyz 1 + The example query + find @attr 1=xyz 1 finds all files which contain at least one xyz &acro.xml; element. In case you can not control which element names the input files contain, you might ask for @@ -787,25 +802,25 @@ One variation over the theme dynamically created - indexes will definitely be unwise: + indexes will definitely be unwise: - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + ]]> Don't be tempted to play too smart tricks with the power of @@ -813,104 +828,104 @@ indexes with unpredictable names, resulting in severe &zebra; index pollution.. -
- -
- Debuggig &acro.dom; Filter Configurations - - It can be very hard to debug a &acro.dom; filter setup due to the many - sucessive &acro.marc; syntax translations, &acro.xml; stream splitting and - &acro.xslt; transformations involved. As an aid, you have always the - power of the -s command line switch to the - zebraidz indexing command at your hand: - - zebraidx -s -c zebra.cfg update some_record_stream.xml - - This command line simulates indexing and dumps a lot of debug - information in the logs, telling exactly which transformations - have been applied, how the documents look like after each - transformation, and which record ids and terms are send to the indexer. - -
+
- + --> - - +
@@ -924,7 +939,7 @@ sgml-always-quote-attributes:t sgml-indent-step:1 sgml-indent-data:t - sgml-parent-document: "zebra.xml" + sgml-parent-document: "idzebra.xml" sgml-local-catalogs: nil sgml-namecase-general:t End: