X-Git-Url: http://git.indexdata.com/?a=blobdiff_plain;f=doc%2Frecordmodel-domxml.xml;h=bf31b74c720b20a8224c259279db677b5cc8d430;hb=27a6d4f7f7425896345ab5a2bfdf35a96c97416e;hp=a9b85db7d726fb7eb9b50cc5679cf01f240fd8f2;hpb=8ade6bf0476b510499488f499156604172b8d1fc;p=idzebra-moved-to-github.git

diff --git a/doc/recordmodel-domxml.xml b/doc/recordmodel-domxml.xml
index a9b85db..bf31b74 100644
--- a/doc/recordmodel-domxml.xml
+++ b/doc/recordmodel-domxml.xml
@@ -1,7 +1,7 @@
 <chapter id="record-model-domxml">
-  <!-- $Id: recordmodel-domxml.xml,v 1.10 2007-03-01 11:21:20 marc Exp $ -->
+  <!-- $Id: recordmodel-domxml.xml,v 1.13 2007-03-21 19:37:00 adam Exp $ -->
   <title>&dom; &xml; Record Model and Filter Module</title>
-
+  
   <para>
    The record model described in this chapter applies to the fundamental,
    structured &xml;
@@ -365,13 +365,66 @@
          </para>
        </listitem>
        <listitem>
-         <para>The unique <literal>record</literal> instruction
-	    may have additional attributes <literal>id</literal> and
-            <literal>rank</literal>, where the value of the opaque ID
-            may be any string not containing the whitespace character 
-            <literal>' '</literal>, and the rank value must be a
+         <para>
+            The unique <literal>record</literal> instruction
+	    may have additional attributes <literal>id</literal>,
+            <literal>rank</literal> and <literal>type</literal>.
+            Attribute <literal>id</literal> is the value of the opaque ID
+            and may be any string not containing the whitespace character 
+            <literal>' '</literal>.
+            The <literal>rank</literal> attribute value must be a
             non-negative integer. See 
-            <xref linkend="administration-ranking"/>
+            <xref linkend="administration-ranking"/>.
+            The <literal>type</literal> attribute specifies how the record
+            is to be treated. The following values may be given for 
+            <literal>type</literal>:
+            <variablelist>
+             <varlistentry>
+              <term><literal>insert</literal></term>
+              <listitem>
+               <para>
+                The record is inserted. If the record already exists, it is
+                skipped (i.e. not replaced).
+               </para>
+              </listitem>
+             </varlistentry>
+             <varlistentry>
+              <term><literal>replace</literal></term>
+              <listitem>
+               <para>
+                The record is replaced. If the record does not already exist,
+                it is skipped (i.e. not inserted).
+               </para>
+              </listitem>
+             </varlistentry>
+             <varlistentry>
+              <term><literal>delete</literal></term>
+              <listitem>
+               <para>
+                The record is deleted. If the record does not already exist,
+                it is skipped (i.e. nothing is deleted).
+               </para>
+              </listitem>
+             </varlistentry>
+             <varlistentry>
+              <term><literal>update</literal></term>
+              <listitem>
+               <para>
+                The record is inserted or replaced depending on whether the
+                record exists or not. This is the default behavior, but it
+                may be overridden from "outside" the scope of the DOM
+                filter by zebraidx commands or extended services updates.
+               </para>
+              </listitem>
+             </varlistentry>
+            </variablelist>
+          Note that the value of <literal>type</literal> determines the
+          action only if the Zebra indexer is running
+          in "update" mode (i.e. zebraidx update) or if the specialUpdate
+          action of the
+          <link linkend="administration-extended-services-z3950">Extended
+          Service Update</link> is used.
+          For this reason a specialUpdate may end up deleting records!
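+          As a minimal sketch, assuming the <literal>z</literal> prefix is
+          bound to the Zebra namespace
+          <literal>http://indexdata.com/zebra-2.0</literal> and using a
+          made-up ID value, a stylesheet could emit a
+          <literal>record</literal> instruction carrying all three
+          attributes like this:
+          <screen>
+          <![CDATA[
+          <!-- hypothetical ID; rank must be a non-negative integer -->
+          <z:record xmlns:z="http://indexdata.com/zebra-2.0"
+                    id="example-0001" rank="42" type="insert">
+            <z:index name="title:w">How to program a computer</z:index>
+          </z:record>
+          ]]>
+          </screen>
+          With <literal>type="insert"</literal>, and the indexer in
+          "update" mode, such a record would be added only if no record
+          with this ID exists yet; otherwise it would be skipped.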
          </para>
        </listitem>
        <listitem>
@@ -415,7 +468,6 @@
        </listitem>
       </itemizedlist>
     </para>
-
     
     <para>The examples work as follows: 
      From the original &xml; file 
@@ -582,7 +634,7 @@
 
          <!-- OAI indexing templates -->
          <xsl:template match="oai:record/oai:header/oai:identifier">
-          <z:index name="oai_identifier;0">
+          <z:index name="oai_identifier:0">
            <xsl:value-of select="."/>
           </z:index>    
          </xsl:template>
@@ -602,9 +654,109 @@
       ]]>
      </screen>
     </para>
+  </section>
+
+
+  <section id="record-model-domxml-index-marc">
+   <title>&dom; Indexing &marcxml;</title>
+    <para>
+      The &dom; filter allows indexing of both binary &marc; records
+      and &marcxml; records, depending on its configuration.
+      A typical &marcxml; record might look like this:
+      <screen>  
+      <![CDATA[
+      <record xmlns="http://www.loc.gov/MARC21/slim">
+       <rank>42</rank>
+       <leader>00366nam  22001698a 4500</leader>
+       <controlfield tag="001">   11224466   </controlfield>
+       <controlfield tag="003">DLC  </controlfield>
+       <controlfield tag="005">00000000000000.0  </controlfield>
+       <controlfield tag="008">910710c19910701nju           00010 eng    </controlfield>
+       <datafield tag="010" ind1=" " ind2=" ">
+         <subfield code="a">   11224466 </subfield>
+       </datafield>
+       <datafield tag="040" ind1=" " ind2=" ">
+         <subfield code="a">DLC</subfield>
+         <subfield code="c">DLC</subfield>
+       </datafield>
+       <datafield tag="050" ind1="0" ind2="0">
+         <subfield code="a">123-xyz</subfield>
+       </datafield>
+       <datafield tag="100" ind1="1" ind2="0">
+         <subfield code="a">Jack Collins</subfield>
+       </datafield>
+       <datafield tag="245" ind1="1" ind2="0">
+         <subfield code="a">How to program a computer</subfield>
+       </datafield>
+       <datafield tag="260" ind1="1" ind2=" ">
+         <subfield code="a">Penguin</subfield>
+       </datafield>
+       <datafield tag="263" ind1=" " ind2=" ">
+         <subfield code="a">8710</subfield>
+       </datafield>
+       <datafield tag="300" ind1=" " ind2=" ">
+         <subfield code="a">p. cm.</subfield>
+       </datafield>
+      </record>
+      ]]>
+      </screen>
+    </para>
+
+    <para>
+      String manipulation is straightforward in the &dom;
+      filter. For example, if you want to drop leading articles
+      when indexing sort fields, you can use the
+      &marcxml; indicator attributes to chop off leading substrings. If
+      the above &xml; example had an indicator
+      <literal>ind2="8"</literal> in the title field
+      <literal>245</literal>, i.e.
+      <screen>  
+      <![CDATA[
+       <datafield tag="245" ind1="1" ind2="8">
+         <subfield code="a">How to program a computer</subfield>
+       </datafield>
+      ]]>
+      </screen>
+      one could write a template that uses this information
+      to start the sorting index <literal>title:s</literal> at character
+      <literal>8</literal>, dropping the seven characters before it:
+      <screen>  
+      <![CDATA[
+      <xsl:template match="m:datafield[@tag='245']">
+        <xsl:variable name="chop">
+          <xsl:choose>
+            <xsl:when test="not(number(@ind2))">0</xsl:when>
+            <xsl:otherwise><xsl:value-of select="number(@ind2)"/></xsl:otherwise>
+          </xsl:choose>
+        </xsl:variable>  
+
+        <z:index name="title:w title:p any:w">
+           <xsl:value-of select="m:subfield[@code='a']"/>
+        </z:index>
+
+        <z:index name="title:s">
+          <xsl:value-of select="substring(m:subfield[@code='a'], $chop)"/>
+        </z:index>
+
+      </xsl:template> 
+      ]]>
+      </screen>
+      The output of the above &marcxml; and &xslt; excerpt would then be:
+      <screen>  
+      <![CDATA[
+        <z:index name="title:w title:p any:w">How to program a computer</z:index>
+        <z:index name="title:s">program a computer</z:index>
+      ]]>
+      </screen>
+      and the record would be sorted in the title index under 'P', not 'H'.
+    </para>
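+
+    <para>
+      In &marc;, the second indicator of field <literal>245</literal>
+      conventionally holds the number of nonfiling characters to skip,
+      whereas the XPath <literal>substring()</literal> function counts
+      positions from 1. For data that follows that convention (an
+      assumption about your records, not something the example above
+      requires), a variant could be sketched as follows; the variable name
+      <literal>skip</literal> and the Zebra namespace binding are
+      illustrative choices:
+      <screen>
+      <![CDATA[
+      <xsl:stylesheet version="1.0"
+        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
+        xmlns:m="http://www.loc.gov/MARC21/slim"
+        xmlns:z="http://indexdata.com/zebra-2.0">
+
+        <!-- suppress text that no template handles explicitly -->
+        <xsl:template match="text()"/>
+
+        <xsl:template match="m:datafield[@tag='245']">
+          <!-- characters to skip; blank or non-numeric ind2 counts as 0 -->
+          <xsl:variable name="skip">
+            <xsl:choose>
+              <xsl:when test="number(@ind2) > 0">
+                <xsl:value-of select="number(@ind2)"/>
+              </xsl:when>
+              <xsl:otherwise>0</xsl:otherwise>
+            </xsl:choose>
+          </xsl:variable>
+
+          <z:index name="title:s">
+            <!-- substring() is 1-based, so start at $skip + 1 -->
+            <xsl:value-of select="substring(m:subfield[@code='a'], $skip + 1)"/>
+          </z:index>
+        </xsl:template>
+
+      </xsl:stylesheet>
+      ]]>
+      </screen>
+      With <literal>ind2="7"</literal> on the title field, the seven
+      characters of <literal>How to </literal> are skipped and the
+      <literal>title:s</literal> value again begins with
+      <literal>program a computer</literal>.
+    </para>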
+  </section>
+
+
+  <section id="record-model-domxml-index-wizzard">
+   <title>&dom; Indexing Wizardry</title>
     <para>
-     Notice also,
-     that the names and types of the indexes can be defined in the
+     The names and types of the indexes can be defined in the
      indexing &xslt; stylesheet <emphasis>dynamically according to
      content in the original &xml; records</emphasis>, which has
      opportunities for great power and wizardry as well as grande