1 <!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook V4.1//EN"
2 "http://www.oasis-open.org/docbook/xml/4.1/docbookx.dtd"
4 <!ENTITY % local SYSTEM "local.ent">
6 <!ENTITY % entities SYSTEM "entities.ent">
8 <!ENTITY % common SYSTEM "common/common.ent">
11 <!-- $Id: pazpar2_conf.xml,v 1.3 2007-01-19 18:28:08 quinn Exp $ -->
12 <refentry id="pazpar2_conf">
14 <productname>Pazpar2</productname>
15 <productnumber>&version;</productnumber>
18 <refentrytitle>Pazpar2 conf</refentrytitle>
19 <manvolnum>5</manvolnum>
23 <refname>pazpar2_conf</refname>
24 <refpurpose>Pazpar2 Configuration</refpurpose>
29 <command>pazpar2.conf</command>
33 <refsect1><title>DESCRIPTION</title>
35 The pazpar2 configuration file, together with any referenced XSLT files,
36 governs pazpar2's behavior as a client and controls the normalization and
37 extraction of data elements from incoming result records, for the
38 purposes of merging, sorting, facet analysis, and display.
42 The file is specified using the option -f on the pazpar2 command line.
43 There is currently no way to reload the configuration file without
44 restarting pazpar2, although this will most likely be added some time
49 <refsect1><title>FORMAT</title>
51 The configuration file is an XML document and must be well-formed. All
52 elements specific to pazpar2 should belong to the namespace
53 "http://www.indexdata.com/pazpar2/1.0" (this is assumed in the
54 following examples). The root element is named 'pazpar2'. Under the
55 root element are a number of elements which group categories of
56 information. The categories are described below.
59 <refsect2 id="config-server"><title>server</title>
61 This section governs overall behavior of the client. The data
62 elements are described below.
64 <variablelist> <!-- level 1 -->
69 Configures the webservice -- this controls how you can connect
70 to pazpar2 from your browser or server-side code. The
71 attributes 'host' and 'port' control the binding of the
72 server. The 'host' attribute can be used to bind the server to
73 a secondary IP address of your system, enabling you to run
74 pazpar2 on port 80 alongside a conventional web server. You
75 can override this setting on the command line using the option -h.
84 If this item is given, pazpar2 will forward all incoming HTTP
85 requests that do not contain the filename 'search.pz2' to the
86 host and port specified using the 'host' and 'port'
87 attributes. This functionality is crucial if you wish to use
88 pazpar2 in conjunction with browser-based code (JS, Flash,
89 applets, etc.) which operates in a security sandbox. Such code
90 can only connect to the same server from which the enclosing
91 HTML page originated. Pazpar2's proxy functionality enables you
92 to host all of the main pages (plus images, CSS, etc.) of your
93 application on a conventional webserver, while efficiently
94 processing webservice requests for metasearch status, results,
104 This nested element controls the behavior of pazpar2 with
105 respect to your data model. In pazpar2, incoming records are
106 normalized, using XSLT, into an internal representation (see
108 id="config-retrievalprofile">retrievalprofile</link> section).
109 The 'service' section controls the further processing and
110 extraction of data from the internal representation, primarily
111 through the 'metadata' sub-element.
114 <variablelist> <!-- Level 2 -->
115 <varlistentry><term>metadata</term>
117 One of these elements is required for every data element in
118 the internal representation of the record (see
119 <xref linkend="data_model"/>). It governs
120 subsequent processing as it pertains to sorting, relevance
121 ranking, merging, and display of data elements. It supports
122 the following attributes:
125 <variablelist> <!-- level 3 -->
126 <varlistentry><term>name</term>
129 This is the name of the data element. It is matched
130 against the 'type' attribute of the 'metadata' element
131 in the normalized record. A warning is produced if
132 metadata elements with an unknown name are found in the
133 normalized record. This name is also used to represent
134 data elements in the records returned by the
135 webservice API, and to name sort lists and browse
141 <varlistentry><term>type</term>
144 The type of data element. This value governs any
145 normalization or special processing that might take
146 place on an element. Possible values are 'generic'
147 (basic string) and 'year' (a range is computed if
148 multiple years are found in the record). Note: This
149 list is likely to grow in the future.
154 <varlistentry><term>brief</term>
157 If this is set to 'yes', then the data element is
158 included in brief records in the webservice API. Note
159 that this only makes sense for metadata elements that
160 are merged (see below). The default value is 'no'.
165 <varlistentry><term>sortkey</term>
168 Specifies that this data element is to be used for
169 sorting. The possible values are 'numeric' (numeric
170 value), 'skiparticle' (string; skip common, leading
171 articles), and 'no' (no sorting). The default value is
177 <varlistentry><term>rank</term>
180 Specifies that this element is to be used to help rank
181 records against the user's query (when ranking is
182 requested). The value is an integer, used as a
183 multiplier against the basic TF*IDF score. A value of
184 1 is the base; higher values give additional weight to
185 elements of this type. The default is '0', which
186 excludes this element from the rank calculation.
191 <varlistentry><term>termlist</term>
194 Specifies that this element is to be used as a
195 termlist, or browse facet. Values are tabulated from
196 incoming records, and a ranked list of the most frequent values (with
197 their associated frequency) is made available to the
198 client through the webservice API. The possible values
199 are 'yes' and 'no' (default).
204 <varlistentry><term>merge</term>
207 This governs whether, and how, elements are extracted
208 from individual records and merged into cluster
209 records. The possible values are: 'unique' (include
210 all unique elements), 'longest' (include only the
211 longest element, by string length), 'range' (calculate a range
212 of values across all matching records), 'all' (include
213 all elements), or 'no' (don't merge; this is the
218 </variablelist> <!-- attributes to metadata -->
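<para>
As a minimal sketch of how these attributes combine, the fragment below
declares three data elements. The element names 'title', 'author', and
'date' and the rank weights are assumptions chosen for illustration, not
names or values required by pazpar2; only the attribute names and their
permitted values are taken from the descriptions above.
</para>
<programlisting><![CDATA[
<service xmlns="http://www.indexdata.com/pazpar2/1.0">
  <!-- Hypothetical title element: included in brief records, sorted
       with leading articles skipped, weighted most heavily in ranking. -->
  <metadata name="title" type="generic" brief="yes"
            sortkey="skiparticle" merge="longest" rank="6"/>

  <!-- Hypothetical author element: every distinct value is kept in the
       cluster record and tabulated as a browse facet. -->
  <metadata name="author" type="generic" brief="yes"
            merge="unique" termlist="yes" rank="2"/>

  <!-- Hypothetical date element: years are merged into a range across
       matching records and sorted numerically. rank is omitted, so the
       default of 0 keeps dates out of the rank calculation. -->
  <metadata name="date" type="year" brief="yes"
            sortkey="numeric" merge="range" termlist="yes"/>
</service>
]]></programlisting>
<para>
Each element with brief set to 'yes' also sets merge to something other
than 'no', since brief records only make sense for merged elements.
</para>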
220 </variablelist> <!-- Data elements in service directive -->
223 </variablelist> <!-- Data elements in server directive -->
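<para>
A minimal sketch of a complete server section follows. It assumes that
the webservice and proxy settings are written as 'listen' and 'proxy'
elements; the host name, the port numbers, and the 'title' data element
are placeholders chosen for illustration, not defaults.
</para>
<programlisting><![CDATA[
<pazpar2 xmlns="http://www.indexdata.com/pazpar2/1.0">
  <server>
    <!-- Assumed element name: bind the webservice to port 9004. -->
    <listen port="9004"/>

    <!-- Assumed element name: requests that do not contain the filename
         'search.pz2' are forwarded to a conventional web server. -->
    <proxy host="localhost" port="80"/>

    <service>
      <!-- Hypothetical data element; see the metadata attributes. -->
      <metadata name="title" brief="yes" sortkey="skiparticle"
                merge="longest" rank="6"/>
    </service>
  </server>
</pazpar2>
]]></programlisting>
<para>
In this arrangement a browser would load the whole application through
port 9004: pazpar2 answers requests containing 'search.pz2' itself and
passes everything else to the web server on port 80, so sandboxed
browser code sees a single origin.
</para>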
226 <refsect2 id="config-queryprofile">
228 At the moment, this directive is ignored; there is one global
229 CCL-mapping file which governs the mapping of queries to Z39.50
230 type-1. This file is located in etc/default.bib. This will change
235 <refsect2 id="config-retrievalprofile">
237 Note: In the present version, there is a single retrieval
238 profile. However, in a future release, it will be possible to
239 associate unique retrieval profiles with different targets, or to
240 generate retrieval profiles using XSLT from the ZeeRex description of
245 The following data elements are recognized for the retrievalprofile
250 <varlistentry><term>requestsyntax</term>
253 This element specifies the request syntax to be used in queries. It only
254 makes sense for Z39.50-type targets.
259 <varlistentry><term>nativesyntax</term>
262 This element specifies the native syntax and encoding of the
263 result records. The default is XML. The following attributes
267 <varlistentry><term>name</term>
270 The name of the syntax. Currently recognized values are
271 'iso2709' (MARC) and 'xml'.
276 <varlistentry><term>format</term>
279 The format, or schema, to be expected. Default is
285 <varlistentry><term>encoding</term>
288 The encoding of the response record. Typical values for
289 MARC records are 'marc8' (general MARC-8), 'marc8s'
290 (MARC-8, but maps to precomposed UTF-8 characters, more
291 suitable for use in web browsers), and 'latin1'.
296 <varlistentry><term>mapto</term>
299 Specifies the flavor of MARCXML to map results to.
300 Default is 'marcxml'. 'marcxchange' is also possible, and
301 useful for Danish DANMARC records.
305 </variablelist> <!-- parameters to nativesyntax directive -->
308 </variablelist> <!-- sub-elements in retrievalprofile -->
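<para>
A minimal sketch of a retrieval profile for a Z39.50 target that returns
MARC records might look as follows. The request syntax value 'marc21' is
an assumption for illustration; the 'format' attribute is left out
because its accepted values are not listed here.
</para>
<programlisting><![CDATA[
<retrievalprofile xmlns="http://www.indexdata.com/pazpar2/1.0">
  <!-- Assumed value: ask the target for MARC 21 records. -->
  <requestsyntax>marc21</requestsyntax>

  <!-- Records arrive as ISO 2709 (MARC) in MARC-8 and are mapped to
       MARCXML. 'marc8s' would map to precomposed characters instead,
       and mapto="marcxchange" would suit Danish DANMARC records. -->
  <nativesyntax name="iso2709" encoding="marc8" mapto="marcxml"/>
</retrievalprofile>
]]></programlisting>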
313 <refsect1><title>OPTIONS</title>
317 <refsect1><title>EXAMPLES</title>
321 <refsect1><title>FILES</title>
326 <!-- Keep this comment at the end of the file
331 sgml-minimize-attributes:nil
332 sgml-always-quote-attributes:t
335 sgml-parent-document:nil
336 sgml-local-catalogs: nil
337 sgml-namecase-general:t