The intermediate, internal representation of the record looks like
this:
<screen><![CDATA[
- <record xmlns="http://www.indexdata.com/pazpar2/1.0"
- mergekey="title The Shining author King, Stephen">
+ <record xmlns="http://www.indexdata.com/pazpar2/1.0"
+ mergekey="title The Shining author King, Stephen">
- <metadata type="title" rank="2">The Shining</metadata>
+ <metadata type="title" rank="2">The Shining</metadata>
- <metadata type="author">King, Stephen</metadata>
+ <metadata type="author">King, Stephen</metadata>
- <metadata type="kind">ebook</metadata>
-
- <!-- ... and so on -->
- </record>
- ]]></screen>
+ <metadata type="kind">ebook</metadata>
+ <!-- ... and so on -->
+ </record>
+]]></screen>
As you can see, there isn't much to it. There are really only a few
important elements to this file.
in the retrieval record ultimately drives merging, sorting, ranking,
the extraction of browse facets, and display, all configurable.
</para>
+
+ <para>
+ Pazpar2 1.6.37 and later can also ingest records that are already
+ clustered. If a database already clusters its records and we want
+ Pazpar2 to keep those clusters, we can generate a
+ <literal>cluster</literal> wrapper element that holds the individual
+ <literal>record</literal> elements.
+ </para>
+ <para>
+ Cluster record example:
+ <screen><![CDATA[
+ <cluster xmlns="http://www.indexdata.com/pazpar2/1.0">
+ <record>
+ <metadata type="title" rank="2">The Shining</metadata>
+ <metadata type="author">King, Stephen</metadata>
+ <metadata type="kind">ebook</metadata>
+ </record>
+ <record>
+ <metadata type="title" rank="2">The Shining</metadata>
+ <metadata type="author">King, Stephen</metadata>
+ <metadata type="kind">audio</metadata>
+ </record>
+ </cluster>
+ ]]></screen>
+ </para>
</section>
<section id="client">
search. You start a new search using the 'search' command. Once the
search has been started, you can follow its progress using the
'stat', 'bytarget', 'termlist', or 'show' commands. Detailed records
- can be fetched using the 'record' command.
+ can be fetched using the 'record' command.
</para>
</section>
tf[i] = tf[i] / cluster_size;
relevance += 100000 * tf[i] / idf[i];
]]></screen>
+ <para>
+ To control the ranking parameters, refer to the
+ <link linkend="service-rank">rank</link> element of the
+ service definition.
+ To control ranking for individual metadata fields, refer to the
+ <link linkend="metadata-rank">rank</link> attribute
+ of the metadata element.
+ </para>
</section> <!-- relevance_ranking -->
<section id="masterkey_connect">