+ <section id="querymodel-pqf-apt-mapping-structuretype">
+ <title>Mapping of &acro.pqf; &acro.apt; structure and completeness to
+ register type</title>
+ <para>
+ In its default configuration, &zebra; internally maintains several
+ different types of registers or indexes, whose tokenization and
+ character normalization rules differ. This reflects the fact that
+ searching fundamentally different tokens such as dates, numbers,
+ bitfields and string-based text requires different rule sets.
+ </para>
+
+ <table id="querymodel-zebra-mapping-structure-types" frame="top">
+ <title>Structure and completeness mapping to register types</title>
+ <tgroup cols="4">
+ <thead>
+ <row>
+ <entry>Structure</entry>
+ <entry>Completeness</entry>
+ <entry>Register type</entry>
+ <entry>Notes</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>
+ phrase (@attr 4=1), word (@attr 4=2),
+ word-list (@attr 4=6),
+ free-form-text (@attr 4=105), or document-text (@attr 4=106)
+ </entry>
+ <entry>Incomplete field (@attr 6=1)</entry>
+ <entry>Word ('w')</entry>
+ <entry>Traditional tokenized and character normalized word index</entry>
+ </row>
+ <row>
+ <entry>
+ phrase (@attr 4=1), word (@attr 4=2),
+ word-list (@attr 4=6),
+ free-form-text (@attr 4=105), or document-text (@attr 4=106)
+ </entry>
+ <entry>Complete field (@attr 6=3)</entry>
+ <entry>Phrase ('p')</entry>
+ <entry>Character normalized, but not tokenized index for phrase
+ matches
+ </entry>
+ </row>
+ <row>
+ <entry>urx (@attr 4=104)</entry>
+ <entry>ignored</entry>
+ <entry>URX/URL ('u')</entry>
+ <entry>Special index for URL web addresses</entry>
+ </row>
+ <row>
+ <entry>numeric (@attr 4=109)</entry>
+ <entry>ignored</entry>
+ <entry>Numeric ('n')</entry>
+ <entry>Special index for numeric values</entry>
+ </row>
+ <row>
+ <entry>key (@attr 4=3)</entry>
+ <entry>ignored</entry>
+ <entry>Null bitmap ('0')</entry>
+ <entry>Used for non-tokenized and non-normalized bit sequences</entry>
+ </row>
+ <row>
+ <entry>year (@attr 4=4)</entry>
+ <entry>ignored</entry>
+ <entry>Year ('y')</entry>
+ <entry>Non-tokenized and non-normalized 4 digit numbers</entry>
+ </row>
+ <row>
+ <entry>date (@attr 4=5)</entry>
+ <entry>ignored</entry>
+ <entry>Date ('d')</entry>
+ <entry>Non-tokenized and non-normalized ISO date strings</entry>
+ </row>
+ <row>
+ <entry>ignored</entry>
+ <entry>ignored</entry>
+ <entry>Sort ('s')</entry>
+ <entry>Used with special sort attribute set (@attr 7=1, @attr 7=2)</entry>
+ </row>
+ <row>
+ <entry>overruled</entry>
+ <entry>overruled</entry>
+ <entry>special</entry>
+ <entry>Internal record ID register, used whenever
+ Relation Always Matches (@attr 2=103) is specified</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ <!-- see in util/zebramap.c -->
+
+ <para>
+ If a <emphasis>Structure</emphasis> attribute of
+ <emphasis>Phrase</emphasis> is used in conjunction with a
+ <emphasis>Completeness</emphasis> attribute of
+ <emphasis>Complete (Sub)field</emphasis>, the term is matched
+ against the contents of the phrase (long word) register, if one
+ exists for the given <emphasis>Use</emphasis> attribute.
+ A phrase register is created for those fields in the
+ &acro.grs1; <filename>*.abs</filename> file that contain a
+ <literal>p</literal>-specifier.
+ <screen>
+ Z> scan @attr 1=Title @attr 4=1 @attr 6=3 beethoven
+ ...
+ bayreuther festspiele (1)
+ * beethoven bibliography database (1)
+ benny carter (1)
+ ...
+ Z> find @attr 1=Title @attr 4=1 @attr 6=3 "beethoven bibliography"
+ ...
+ Number of hits: 0, setno 5
+ ...
+ Z> find @attr 1=Title @attr 4=1 @attr 6=3 "beethoven bibliography database"
+ ...
+ Number of hits: 1, setno 6
+ </screen>
+ </para>
+
+ <para>
+ If <emphasis>Structure</emphasis>=<emphasis>Phrase</emphasis> is
+ used in conjunction with <emphasis>Incomplete Field</emphasis> (the
+ default value for <emphasis>Completeness</emphasis>), the
+ search is directed against the normal word registers. If the term
+ contains multiple words, it matches only if all of the words
+ are found immediately adjacent to each other and in the given order.
+ The word search is performed on those fields that are indexed as
+ type <literal>w</literal> in the &acro.grs1; <filename>*.abs</filename> file.
+ <screen>
+ Z> scan @attr 1=Title @attr 4=1 @attr 6=1 beethoven
+ ...
+ beefheart (1)
+ * beethoven (18)
+ beethovens (7)
+ ...
+ Z> find @attr 1=Title @attr 4=1 @attr 6=1 beethoven
+ ...
+ Number of hits: 18, setno 1
+ ...
+ Z> find @attr 1=Title @attr 4=1 @attr 6=1 "beethoven bibliography"
+ ...
+ Number of hits: 2, setno 2
+ ...
+ </screen>
+ </para>
+
+ <para>
+ If the <emphasis>Structure</emphasis> attribute is
+ <emphasis>Word List</emphasis>,
+ <emphasis>Free-form Text</emphasis>, or
+ <emphasis>Document Text</emphasis>, the term is treated as a
+ natural-language, relevance-ranked query.
+ This search type uses the word register, i.e. those fields
+ that are indexed as type <literal>w</literal> in the
+ &acro.grs1; <filename>*.abs</filename> file.
+ </para>
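+
+ <para>
+ As an illustrative sketch, reusing the <literal>Title</literal>
+ index from the examples above, a <emphasis>Word List</emphasis>
+ query could be written as follows. The search term is arbitrary, and
+ the result is omitted because hit counts depend on the database.
+ <screen>
+ Z> find @attr 1=Title @attr 4=6 "beethoven bibliography"
+ </screen>
+ </para>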
+
+ <para>
+ If the <emphasis>Structure</emphasis> attribute is
+ <emphasis>Numeric String</emphasis>, the term is treated as an integer.
+ The search is performed on those fields that are indexed
+ as type <literal>n</literal> in the &acro.grs1;
+ <filename>*.abs</filename> file.
+ </para>
+
+ <para>
+ If the <emphasis>Structure</emphasis> attribute is
+ <emphasis>URX</emphasis>, the term is treated as a URX (URL) entity.
+ The search is performed on those fields that are indexed as type
+ <literal>u</literal> in the <filename>*.abs</filename> file.
+ </para>
+
+ <para>
+ If the <emphasis>Structure</emphasis> attribute is
+ <emphasis>Local Number</emphasis>, the term is treated as a
+ native &zebra; Record Identifier.
+ </para>
+
+ <para>
+ If the <emphasis>Relation</emphasis> attribute is
+ <emphasis>Equals</emphasis> (default), the term is matched
+ in a normal fashion (modulo truncation and processing of
+ individual words, if required).
+ If <emphasis>Relation</emphasis> is <emphasis>Less Than</emphasis>,
+ <emphasis>Less Than or Equal</emphasis>,
+ <emphasis>Greater Than</emphasis>, or <emphasis>Greater Than or
+ Equal</emphasis>, the term is assumed to be numerical, and a
+ standard regular expression is constructed to match the given
+ expression.
+ If <emphasis>Relation</emphasis> is <emphasis>Relevance</emphasis>,
+ the standard natural-language query processor is invoked.
+ </para>
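+
+ <para>
+ The following queries sketch how ordering relations are written in
+ &acro.pqf;. They assume a hypothetical database in which Bib-1 Use
+ attribute 31 (Date of publication) is indexed in a numeric register.
+ The years are arbitrary and results are omitted.
+ <screen>
+ Z> find @attr 1=31 @attr 4=109 @attr 2=4 1990
+ Z> find @attr 1=31 @attr 4=109 @attr 2=1 2000
+ </screen>
+ The first query uses <emphasis>Greater Than or Equal</emphasis>
+ (@attr 2=4), the second <emphasis>Less Than</emphasis> (@attr 2=1).
+ </para>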
+
+ <para>
+ For the <emphasis>Truncation</emphasis> attribute,
+ <emphasis>No Truncation</emphasis> is the default.
+ <emphasis>Left Truncation</emphasis> is not supported.
+ <emphasis>Process # in search term</emphasis> is supported, as is
+ <emphasis>Regxp-1</emphasis>.
+ <emphasis>Regxp-2</emphasis> enables fault-tolerant (fuzzy)
+ search. By default, a single error (deletion, insertion, or
+ replacement) is accepted when terms are matched against the register
+ contents.
+ </para>
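+
+ <para>
+ A sketch of how the supported truncation values might be written in
+ &acro.pqf;, again reusing the <literal>Title</literal> index from the
+ examples above. The search terms are illustrative only and results
+ are omitted.
+ <screen>
+ Z> find @attr 1=Title @attr 5=101 beeth#ven
+ Z> find @attr 1=Title @attr 5=102 "beeth.*"
+ Z> find @attr 1=Title @attr 5=103 beethoven
+ </screen>
+ Here @attr 5=101 selects <emphasis>Process # in search term</emphasis>,
+ @attr 5=102 selects <emphasis>Regxp-1</emphasis>, and @attr 5=103
+ selects <emphasis>Regxp-2</emphasis>.
+ </para>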
+
+ </section>
+ </section>
+
+ <section id="querymodel-regular">
+ <title>&zebra; Regular Expressions in Truncation Attribute (type = 5)</title>