+ <section id="querymodel-pqf-apt-mapping-accesspoint">
+ <title>Mapping of &acro.pqf; &acro.apt; access points</title>
+ <para>
+ &zebra; understands four fundamentally different types of access
+ points, of which only the
+ <emphasis>numeric use attribute</emphasis> type access points
+ are defined by the <ulink url="&url.z39.50;">&acro.z3950;</ulink>
+ standard.
+ All other access point types are &zebra; specific, and non-portable.
+ </para>
+
+ <table id="querymodel-zebra-mapping-accesspoint-types" frame="top">
+ <title>Access point name mapping</title>
+ <tgroup cols="4">
+ <thead>
+ <row>
+ <entry>Access Point</entry>
+ <entry>Type</entry>
+ <entry>Grammar</entry>
+ <entry>Notes</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>Use attribute</entry>
+ <entry>numeric</entry>
+ <entry>[1-9][0-9]*</entry>
+ <entry>directly mapped to string index name</entry>
+ </row>
+ <row>
+ <entry>String index name</entry>
+ <entry>string</entry>
+ <entry>[a-zA-Z](\-?[a-zA-Z0-9])*</entry>
+ <entry>normalized name is used as internal string index name</entry>
+ </row>
+ <row>
+ <entry>&zebra; internal index name</entry>
+ <entry>zebra</entry>
+ <entry>_[a-zA-Z](_?[a-zA-Z0-9])*</entry>
+ <entry>hardwired internal string index name</entry>
+ </row>
+ <row>
+ <entry>&acro.xpath; special index</entry>
+ <entry>XPath</entry>
+ <entry>/.*</entry>
+ <entry>special xpath search for &acro.grs1; indexed records</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ <para>
+ <literal>Attribute set names</literal> and
+ <literal>string index names</literal> are normalized
+ according to the following rules: all <emphasis>single</emphasis>
+ hyphens <literal>'-'</literal> are stripped, and all upper case
+ letters are folded to lower case.
+ </para>
+
+ <para>
+ <emphasis>Numeric use attributes</emphasis> are mapped
+ to the &zebra; internal
+ string index according to the attribute set definition in use.
+ The default attribute set is &acro.bib1;, and may be
+ omitted in the &acro.pqf; query.
+ </para>
+
+ <para>
+ According to normalization and numeric
+ use attribute mapping, it follows that the following
+ &acro.pqf; queries are considered equivalent (assuming the default
+ configuration has not been altered):
+ <screen>
+ Z> find @attr 1=Body-of-text serenade
+ Z> find @attr 1=bodyoftext serenade
+ Z> find @attr 1=BodyOfText serenade
+ Z> find @attr 1=bO-d-Y-of-tE-x-t serenade
+ Z> find @attr 1=1010 serenade
+ Z> find @attrset bib1 @attr 1=1010 serenade
+ Z> find @attrset Bib1 @attr 1=1010 serenade
+ Z> find @attrset b-I-b-1 @attr 1=1010 serenade
+ </screen>
+ </para>
+
+ <para>
+ The <emphasis>numerical</emphasis>
+ <literal>use attributes (type 1)</literal>
+ are interpreted according to the
+ attribute sets which have been loaded in the
+ <literal>zebra.cfg</literal> file, and are matched against specific
+ fields as specified in the <literal>.abs</literal> file which
+ describes the profile of the records which have been loaded.
+ If no use attribute is provided, a default of
+ &acro.bib1; Use Any (1016) is assumed.
+ The predefined use attribute sets
+ can be reconfigured by tweaking the configuration files
+ <filename>tab/*.att</filename>, and
+ new attribute sets can be defined by adding similar files in the
+ configuration path <literal>profilePath</literal> of the server.
+ </para>
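+
+ <para>
+ As a small illustration of the default described above, the
+ following two queries should be treated alike, assuming the
+ default &acro.bib1; attribute set is loaded and has not been
+ reconfigured:
+ <screen>
+ Z> find serenade
+ Z> find @attr 1=1016 serenade
+ </screen>
+ </para>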
+
+ <para>
+ String indexes can be accessed directly,
+ independently of which attribute set is in use; any attribute set
+ specification is simply ignored. The name normalization described
+ above applies. String index names are defined in the configuration
+ files of the indexing filter in use, for example in the
+ &acro.grs1;
+ <filename>*.abs</filename> configuration files, or in the
+ <literal>alvis</literal> filter &acro.xslt; indexing stylesheets.
+ </para>
+
+ <para>
+ &zebra; internal indexes can be accessed directly,
+ according to the same rules as the user defined
+ string indexes. The only difference is that
+ &zebra; internal index names are hardwired,
+ all uppercase and
+ must start with the character <literal>'_'</literal>.
+ </para>
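+
+ <para>
+ As a sketch, the following query addresses an internal index by
+ name. The index name <literal>_ALLRECORDS</literal> is not
+ introduced in this section and is assumed here for illustration;
+ it is combined with the Relation Always Matches attribute
+ (@attr 2=103) described in the following section:
+ <screen>
+ Z> find @attr 1=_ALLRECORDS @attr 2=103 ""
+ </screen>
+ </para>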
+
+ <para>
+ Finally, &acro.xpath; access points are only
+ available using the &acro.grs1; filter for indexing.
+ These access point names must start with the character
+ <literal>'/'</literal>. They are <emphasis>not
+ normalized</emphasis>, but passed unaltered to the &zebra; internal
+ &acro.xpath; engine. See <xref linkend="querymodel-use-xpath"/>.
+
+ </para>
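+
+ <para>
+ For example, assuming &acro.grs1; indexed records that contain a
+ hypothetical <literal>/record/title</literal> element, an
+ &acro.xpath; access point might be queried like this:
+ <screen>
+ Z> find @attr 1=/record/title serenade
+ </screen>
+ </para>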
+
+
+ </section>
+
+
+ <section id="querymodel-pqf-apt-mapping-structuretype">
+ <title>Mapping of &acro.pqf; &acro.apt; structure and completeness to
+ register type</title>
+ <para>
+ In its default configuration, &zebra; internally maintains several
+ different types of registers or indexes, whose tokenization and
+ character normalization rules differ. This reflects the fact that
+ searching fundamentally different tokens such as dates, numbers,
+ bitfields, and string-based text requires different rule sets.
+ </para>
+
+ <table id="querymodel-zebra-mapping-structure-types" frame="top">
+ <title>Structure and completeness mapping to register types</title>
+ <tgroup cols="4">
+ <thead>
+ <row>
+ <entry>Structure</entry>
+ <entry>Completeness</entry>
+ <entry>Register type</entry>
+ <entry>Notes</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>
+ phrase (@attr 4=1), word (@attr 4=2),
+ word-list (@attr 4=6),
+ free-form-text (@attr 4=105), or document-text (@attr 4=106)
+ </entry>
+ <entry>Incomplete field (@attr 6=1)</entry>
+ <entry>Word ('w')</entry>
+ <entry>Traditional tokenized and character normalized word index</entry>
+ </row>
+ <row>
+ <entry>
+ phrase (@attr 4=1), word (@attr 4=2),
+ word-list (@attr 4=6),
+ free-form-text (@attr 4=105), or document-text (@attr 4=106)
+ </entry>
+ <entry>Complete field (@attr 6=3)</entry>
+ <entry>Phrase ('p')</entry>
+ <entry>Character normalized, but not tokenized index for phrase
+ matches
+ </entry>
+ </row>
+ <row>
+ <entry>urx (@attr 4=104)</entry>
+ <entry>ignored</entry>
+ <entry>URX/URL ('u')</entry>
+ <entry>Special index for URL web addresses</entry>
+ </row>
+ <row>
+ <entry>numeric (@attr 4=109)</entry>
+ <entry>ignored</entry>
+ <entry>Numeric ('n')</entry>
+ <entry>Special index for numbers</entry>
+ </row>
+ <row>
+ <entry>key (@attr 4=3)</entry>
+ <entry>ignored</entry>
+ <entry>Null bitmap ('0')</entry>
+ <entry>Used for non-tokenized and non-normalized bit sequences</entry>
+ </row>
+ <row>
+ <entry>year (@attr 4=4)</entry>
+ <entry>ignored</entry>
+ <entry>Year ('y')</entry>
+ <entry>Non-tokenized and non-normalized 4 digit numbers</entry>
+ </row>
+ <row>
+ <entry>date (@attr 4=5)</entry>
+ <entry>ignored</entry>
+ <entry>Date ('d')</entry>
+ <entry>Non-tokenized and non-normalized ISO date strings</entry>
+ </row>
+ <row>
+ <entry>ignored</entry>
+ <entry>ignored</entry>
+ <entry>Sort ('s')</entry>
+ <entry>Used with special sort attribute set (@attr 7=1, @attr 7=2)</entry>
+ </row>
+ <row>
+ <entry>overruled</entry>
+ <entry>overruled</entry>
+ <entry>special</entry>
+ <entry>Internal record ID register, used whenever
+ Relation Always Matches (@attr 2=103) is specified</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ <!-- see in util/zebramap.c -->
+
+ <para>
+ If a <emphasis>Structure</emphasis> attribute of
+ <emphasis>Phrase</emphasis> is used in conjunction with a
+ <emphasis>Completeness</emphasis> attribute of
+ <emphasis>Complete (Sub)field</emphasis>, the term is matched
+ against the contents of the phrase (long word) register, if one
+ exists for the given <emphasis>Use</emphasis> attribute.
+ A phrase register is created for those fields in the
+ &acro.grs1; <filename>*.abs</filename> file that contain a
+ <literal>p</literal>-specifier.
+ <screen>
+ Z> scan @attr 1=Title @attr 4=1 @attr 6=3 beethoven
+ ...
+ bayreuther festspiele (1)
+ * beethoven bibliography database (1)
+ benny carter (1)
+ ...
+ Z> find @attr 1=Title @attr 4=1 @attr 6=3 "beethoven bibliography"
+ ...
+ Number of hits: 0, setno 5
+ ...
+ Z> find @attr 1=Title @attr 4=1 @attr 6=3 "beethoven bibliography database"
+ ...
+ Number of hits: 1, setno 6
+ </screen>
+ </para>
+
+ <para>
+ If <emphasis>Structure</emphasis>=<emphasis>Phrase</emphasis> is
+ used in conjunction with <emphasis>Incomplete Field</emphasis> (the
+ default value for <emphasis>Completeness</emphasis>), the
+ search is directed against the normal word registers, but if the term
+ contains multiple words, the term will only match if all of the words
+ are found immediately adjacent, and in the given order.
+ The word search is performed on those fields that are indexed as
+ type <literal>w</literal> in the &acro.grs1; <filename>*.abs</filename> file.
+ <screen>
+ Z> scan @attr 1=Title @attr 4=1 @attr 6=1 beethoven
+ ...
+ beefheart (1)
+ * beethoven (18)
+ beethovens (7)
+ ...
+ Z> find @attr 1=Title @attr 4=1 @attr 6=1 beethoven
+ ...
+ Number of hits: 18, setno 1
+ ...
+ Z> find @attr 1=Title @attr 4=1 @attr 6=1 "beethoven bibliography"
+ ...
+ Number of hits: 2, setno 2
+ ...
+ </screen>
+ </para>
+
+ <para>
+ If the <emphasis>Structure</emphasis> attribute is
+ <emphasis>Word List</emphasis>,
+ <emphasis>Free-form Text</emphasis>, or
+ <emphasis>Document Text</emphasis>, the term is treated as a
+ natural-language, relevance-ranked query.
+ This search type uses the word register, i.e. those fields
+ that are indexed as type <literal>w</literal> in the
+ &acro.grs1; <filename>*.abs</filename> file.
+ </para>
+
+ <para>
+ If the <emphasis>Structure</emphasis> attribute is
+ <emphasis>Numeric String</emphasis>, the term is treated as an integer.
+ The search is performed on those fields that are indexed
+ as type <literal>n</literal> in the &acro.grs1;
+ <filename>*.abs</filename> file.
+ </para>
+
+ <para>
+ If the <emphasis>Structure</emphasis> attribute is
+ <emphasis>URX</emphasis>, the term is treated as a URX (URL) entity.
+ The search is performed on those fields that are indexed as type
+ <literal>u</literal> in the <filename>*.abs</filename> file.
+ </para>
+
+ <para>
+ If the <emphasis>Structure</emphasis> attribute is
+ <emphasis>Local Number</emphasis>, the term is treated as a
+ native &zebra; Record Identifier.
+ </para>
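+
+ <para>
+ A minimal sketch, assuming that Local Number corresponds to the
+ &acro.bib1; structure attribute value 107 and that a record with
+ the hypothetical identifier 10 exists:
+ <screen>
+ Z> find @attr 4=107 10
+ </screen>
+ </para>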
+
+ <para>
+ If the <emphasis>Relation</emphasis> attribute is
+ <emphasis>Equals</emphasis> (default), the term is matched
+ in a normal fashion (modulo truncation and processing of
+ individual words, if required).
+ If <emphasis>Relation</emphasis> is <emphasis>Less Than</emphasis>,
+ <emphasis>Less Than or Equal</emphasis>,
+ <emphasis>Greater Than</emphasis>, or <emphasis>Greater Than or
+ Equal</emphasis>, the term is assumed to be numerical, and a
+ standard regular expression is constructed to match the given
+ expression.
+ If <emphasis>Relation</emphasis> is <emphasis>Relevance</emphasis>,
+ the standard natural-language query processor is invoked.
+ </para>
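+
+ <para>
+ The relations above might be used as follows. This is a sketch
+ only: the attribute values are those of the &acro.bib1; relation
+ attribute (type 2), and <literal>Date-of-publication</literal> is
+ assumed to be an index holding numeric values in the loaded profile:
+ <screen>
+ Z> find @attr 1=Date-of-publication @attr 2=1 1990
+ Z> find @attr 1=Date-of-publication @attr 2=4 1990
+ Z> find @attr 1=Title @attr 2=102 "beethoven bibliography"
+ </screen>
+ The first query uses Less Than, the second Greater Than or Equal,
+ and the third Relevance.
+ </para>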
+
+ <para>
+ For the <emphasis>Truncation</emphasis> attribute,
+ <emphasis>No Truncation</emphasis> is the default.
+ <emphasis>Left Truncation</emphasis> is not supported.
+ <emphasis>Process # in search term</emphasis> is supported, as is
+ <emphasis>Regxp-1</emphasis>.
+ <emphasis>Regxp-2</emphasis> enables fault-tolerant (fuzzy)
+ search. By default, a single error (deletion, insertion, or
+ replacement) is accepted when terms are matched against the register
+ contents.
+ </para>
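+
+ <para>
+ As an illustration, assuming the &acro.bib1; truncation attribute
+ values 101 (Process # in search term), 102 (Regxp-1), and
+ 103 (Regxp-2), and reusing the <literal>Title</literal> index from
+ the examples above:
+ <screen>
+ Z> find @attr 1=Title @attr 5=101 beeth#
+ Z> find @attr 1=Title @attr 5=102 beethovens?
+ Z> find @attr 1=Title @attr 5=103 bethoven
+ </screen>
+ The last query is intended to show the fuzzy search: the
+ misspelled term differs from <literal>beethoven</literal> by a
+ single deletion.
+ </para>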
+
+ </section>
+ </section>
+
+ <section id="querymodel-regular">
+ <title>&zebra; Regular Expressions in Truncation Attribute (type = 5)</title>
+