X-Git-Url: http://git.indexdata.com/?a=blobdiff_plain;f=doc%2Fbook.xml;h=e469cbacb11d3bc5778d98291dc063e4358f42a3;hb=dc08032daf548ac1a49ae975fa7325d2f28896f1;hp=69fc0d94726793730fc271f9014e27f8f729c1df;hpb=d1a4fc1ceff8d4110fb58909f102883433d0160f;p=pazpar2-moved-to-github.git diff --git a/doc/book.xml b/doc/book.xml index 69fc0d9..e469cba 100644 --- a/doc/book.xml +++ b/doc/book.xml @@ -6,10 +6,10 @@ %local; %entities; - - %common; + + %idcommon; ]> - + Pazpar2 - User's Guide and Reference @@ -19,6 +19,9 @@ AdamDickmeiss + + MarcCromme + &version; &copyright-year; @@ -28,7 +31,7 @@ Pazpar2 is a high-performance, user interface-independent, data model-independent metasearching - middleware featuring merging, relevance ranking, record sorting, + middle-ware featuring merging, relevance ranking, record sorting, and faceted results. @@ -45,392 +48,504 @@ - + - - Introduction - - Pazpar2 is a stand-alone metasearch client with a webservice API, designed - to be used either from a browser-based client (JavaScript, Flash, Java, - etc.), from from server-side code, or any combination of the two. - Pazpar2 is a highly optimized client designed to - search many resources in parallel. It implements record merging, - relevance-ranking and sorting by arbitrary data content, and facet - analysis for browsing purposes. It is designed to be data model - independent, and is capable of working with MARC, DublinCore, or any - other XML-structured response format -- XSLT is used to normalize and extract - data from retrieval records for display and analysis. It can be used - against any server which supports the Z39.50 protocol. Proprietary - backend modules can be used to support a large number of other protocols - (please contact Index Data for further information about this). - - - Additional functionality such as - user management, attractive displays are expected to be implemented by - applications that use pazpar2. Pazpar2 is user interface independent. 
- Its functionality is exposed through a simple REST-style webservice API, - designed to be simple to use from an Ajax-enbled browser, Flash - animation, Java applet, etc., or from a higher-level server-side language - like PHP or Java. Because session information can be shared between - browser-based logic and your server-side scripting, there is tremendous - flexibility in how you implement your business logic on top of pazpar2. - - - Once you launch a search in pazpar2, the operation continues behind the - scenes. Pazpar2 connects to servers, carries out searches, and - retrieves, deduplicates, and stores results internally. Your application - code may periodically inquire about the status of an ongoing operation, - and ask to see records or other result set facets. Result become - available immediately, and it is easy to build end-user interfaces which - feel extremely responsive, even when searching more than 100 servers - concurrently. - - - Pazpar2 is designed to be highly configurable. Incoming records are - normalized to XML/UTF-8, and then further normalized using XSLT to a - simple internal representation that is suitable for analysis. By - providing XSLT stylesheets for different kinds of result records, you - can tune pazpar2 to work against different kinds of information - retrieval servers. Finally, metadata is extracted, in a configurable - way, from this internal record, to support display, merging, ranking, - result set facets, and sorting. Pazpar2 is not bound to a specific model - of metadata, such as DublinCore or MARC -- by providing the right - configuration, it can work with a number of different kinds of data in - support of many different applications. - - - Pazpar2 is designed to be efficient and scalable. You can set it up to - search several hundred targets in parallel, or you can use it to support - hundreds of concurrent users. 
It is implemented with the same attention - to performance and economy that we use in our indexing engines, so that - you can focus on building your application, without worrying about the - details of metasearch logic. You can devote all of your attention to - usability and let pazpar2 do what it does best -- metasearch. - - - If you wish to connect to commercial or other databases which do not - support open standards, please contact Index Data. We have a licensing - agreement with a third party vendor which will enable pazpar2 to access - thousands of online databases, in addition the vast number of catalogs - and online services that support the Z39.50 protocol. - - - Pazpar2 is our attempt to re-think the traditional paradigms for - implementing and deploying metasearch logic, with an uncompromising - approach to performance, and attempting to make maximum use of the - capabilities of modern browsers. The demo user interface that - accompanies the distribution is but one example. If you think of new - ways of using pazpar2, we hope you'll share them with us, and if we - can provide assistance with regards to training, design, programming, - integration with different backends, hosting, or support, please don't - hesitate to contact us. If you'd like to see functionality in pazpar2 - that is not there today, please don't hesitate to contact us. It may - already be in our development pipeline, or there might be a - possibility for you to help out by sponsoring development time or - code. Either way, get in touch and we will give you straight answers. - - - Enjoy! - - - Pazpar2 is covered by the GNU license version 2. - See for further information. - - - - - Installation + + Introduction + + Pazpar2 is a stand-alone metasearch client with a web-service API, designed + to be used either from a browser-based client (JavaScript, Flash, Java, + etc.), from from server-side code, or any combination of the two. 
+ Pazpar2 is a highly optimized client designed to + search many resources in parallel. It implements record merging, + relevance-ranking and sorting by arbitrary data content, and facet + analysis for browsing purposes. It is designed to be data model + independent, and is capable of working with MARC, DublinCore, or any + other XML-structured response format + -- XSLT is used to normalize and extract + data from retrieval records for display and analysis. It can be used + against any server which supports the + Z39.50 protocol. Proprietary + backend modules can be used to support a large number of other protocols + (please contact Index Data for further information about this). + + + Additional functionality such as + user management, attractive displays are expected to be implemented by + applications that use Pazpar2. Pazpar2 is user interface independent. + Its functionality is exposed through a simple REST-style web-service API, + designed to be simple to use from an Ajax-enabled browser, Flash + animation, Java applet, etc., or from a higher-level server-side language + like PHP or Java. Because session information can be shared between + browser-based logic and your server-side scripting, there is tremendous + flexibility in how you implement your business logic on top of Pazpar2. + + + Once you launch a search in Pazpar2, the operation continues behind the + scenes. Pazpar2 connects to servers, carries out searches, and + retrieves, deduplicates, and stores results internally. Your application + code may periodically inquire about the status of an ongoing operation, + and ask to see records or other result set facets. Result become + available immediately, and it is easy to build end-user interfaces which + feel extremely responsive, even when searching more than 100 servers + concurrently. + + + Pazpar2 is designed to be highly configurable. 
Incoming records are + normalized to XML/UTF-8, and then further normalized using XSLT to a + simple internal representation that is suitable for analysis. By + providing XSLT stylesheets for different kinds of result records, you + can tune Pazpar2 to work against different kinds of information + retrieval servers. Finally, metadata is extracted, in a configurable + way, from this internal record, to support display, merging, ranking, + result set facets, and sorting. Pazpar2 is not bound to a specific model + of metadata, such as DublinCore or MARC -- by providing the right + configuration, it can work with a number of different kinds of data in + support of many different applications. + + + Pazpar2 is designed to be efficient and scalable. You can set it up to + search several hundred targets in parallel, or you can use it to support + hundreds of concurrent users. It is implemented with the same attention + to performance and economy that we use in our indexing engines, so that + you can focus on building your application, without worrying about the + details of metasearch logic. You can devote all of your attention to + usability and let Pazpar2 do what it does best -- metasearch. + + + If you wish to connect to commercial or other databases which do not + support open standards, please contact Index Data. We have a licensing + agreement with a third party vendor which will enable Pazpar2 to access + thousands of online databases, in addition the vast number of catalogs + and online services that support the Z39.50 protocol. + + + Pazpar2 is our attempt to re-think the traditional paradigms for + implementing and deploying metasearch logic, with an uncompromising + approach to performance, and attempting to make maximum use of the + capabilities of modern browsers. The demo user interface that + accompanies the distribution is but one example. 
If you think of new + ways of using Pazpar2, we hope you'll share them with us, and if we + can provide assistance with regards to training, design, programming, + integration with different backends, hosting, or support, please don't + hesitate to contact us. If you'd like to see functionality in Pazpar2 + that is not there today, please don't hesitate to contact us. It may + already be in our development pipeline, or there might be a + possibility for you to help out by sponsoring development time or + code. Either way, get in touch and we will give you straight answers. + + + Enjoy! + + + Pazpar2 is covered by the GNU license version 2. + See for further information. + + + + + Installation + + Pazpar2 depends on the following tools/libraries: + + YAZ + + + The popular Z39.50 toolkit for the C language. YAZ must be + compiled with Libxml2/Libxslt support. + + + + International + Components for Unicode (ICU) + + + ICU provides Unicode support for non-English languages with + character sets outside the range of 7bit ASCII, like + Greek, Russian, German and French. Pazpar2 uses the ICU + Unicode character conversions, Unicode normalization, case + folding and other fundamental operations needed in + tokenization, normalization and ranking of records. + + + Compiling, linking, and usage of the ICU libraries is optional, + but strongly recommended for usage in an international + environment. + + + + + + + In order to compile Pazpar2 an ANSI C compiler is + required. The requirements should be the same as for YAZ. + + +
+ Installation on Unix (from Source) - Pazpar2 depends on the following tools/libraries: - - YAZ - - - The popular Z39.50 toolkit for the C language. YAZ must be - compiled with Libxml2/Libxslt support. - - - - + The latest source code for Pazpar2 is available from + . + Only few systems have none of the required + tools binary packages. If, for example, Libxml2/libXSLT are already + installed as development packages use these. + - In order to compile Pazpar2 an ANSI C compiler is - required. The requirements should be the same as for YAZ. + Ensure that the development libraries + header files are + available on your system before compiling Pazpar2. For installation + of YAZ, refer to the YAZ installation chapter. + + gunzip -c pazpar2-version.tar.gz|tar xf - + cd pazpar2-version + ./configure + make + su + make install + +
-
- Installation on Unix (from Source) - - Here is a quick step-by-step guide on how to compile the - tools that Pazpar2 uses. Only few systems have none of the required - tools binary packages. If, for example, Libxml2/libxslt are already - installed as development packages use these. - - - - Ensure that the development libraries + header files are - available on your system before compiling Pazpar2. For installation - of YAZ, refer to the YAZ installation chapter. - - - gunzip -c pazpar2-version.tar.gz|tar xf - - cd pazpar2-version - ./configure - make - su - make install - -
+
+ Installation on Debian GNU/Linux + + Index Data provides Debian packages for Pazpar2. These are prepared + for Debian versions Etch and Lenny (as of 2007). + These packages are available at + . +
-
- Installation on Debian GNU/Linux - - All dependencies for Pazpar2 are available as - Debian - packages for the sarge (stable in 2005) and etch (testing in 2005) - distributions. - - - The procedures for Debian based systems, such as - Ubuntu is probably similar - +
+ Apache 2 Proxy + + Apache 2 has a + + proxy module + which allows Pazpar2 to become a backend to an Apache 2 + based web service. The Apache 2 proxy must operate in the + Reverse Proxy mode. + + + + On a Debian-based Apache 2 system, the relevant modules can + be enabled with: - apt-get install libyaz-dev + sudo a2enmod proxy_http + + + + Traditionally Pazpar2 interprets URL paths with suffix + /search.pz2. + The + ProxyPass directive of Apache must be used to map a URL path + to the Pazpar2 server (listening port). + + + - With these packages installed, the usual configure + make - procedure can be used for Pazpar2 as outlined in - . + The ProxyPass directive takes a prefix rather than + a suffix as URL path. It is important that the JavaScript code + uses the prefix given for it. -
- + - - Using pazpar2 + + Apache 2 proxy configuration - This chapter provides a general introduction to the use and deployment of pazpar2. + If Pazpar2 is running on port 8004 and the portal is using + search.pz2 inside portal in directory + /myportal/ we could use the following + Apache 2 configuration: + + + ProxyRequests Off + + + AddDefaultCharset off + Order deny,allow + Allow from all + + + ProxyPass /myportal/search.pz2 http://localhost:8004/search.pz2 + ProxyVia Off + + ]]> + +
-
- Pazpar2 and your systems architecture - - Pazpar2 is designed to provide asynchronous, behind-the-scenes - metasearching functionality to your application, exposing this - functionality using a simple webservice API that can be accessed - from any number of development environments. In particular, it is - possible to combine pazpar2 either with your server-side dynamic - website scripting, with scripting or code running in the browser, or - with any combination of the two. Pazpar2 is an excellent tool for - building advanced, Ajax-based user interfaces for metasearch - functionality, but it isn't a requirement -- you can choose to use - pazpar2 entirely as a backend to your regular server-side scripting. - When you do use pazpar2 in conjunction - with browser scripting (JavaScript/Ajax, Flash, applets, etc.), there are - special considerations. - + - - Pazpar2 implements a simple but efficient HTTP server, and it is - designed to interact directly with scripting running in the browser - for the best possible performance, and to limit overhead when - several browser clients generate numerous webservice requests. - However, it is still desirable to use a conventional webserver, - such as Apache, to serve up graphics, HTML documents, and - server-side scripting. Because the security sandbox environment of - most browser-side programming environments only allows communication - with the server from which the enclosing HTML page or object - originated, pazpar2 is designed so that it can act as a transparent - proxy in front of an existing webserver (see for details). In this mode, all regular - HTTP requests are transparently passed through to your webserver, - while pazpar2 only intercepts search-related webservice requests. - + + Using Pazpar2 + + This chapter provides a general introduction to the use and + deployment of Pazpar2. 
+ - - If you want to expose your combined service on port 80, you can - either run your regular webserver on a different port, a different - server, or a different IP address associated with the same server. - +
+ Pazpar2 and your systems architecture + + Pazpar2 is designed to provide asynchronous, behind-the-scenes + metasearching functionality to your application, exposing this + functionality using a simple webservice API that can be accessed + from any number of development environments. In particular, it is + possible to combine Pazpar2 either with your server-side dynamic + website scripting, with scripting or code running in the browser, or + with any combination of the two. Pazpar2 is an excellent tool for + building advanced, Ajax-based user interfaces for metasearch + functionality, but it isn't a requirement -- you can choose to use + Pazpar2 entirely as a backend to your regular server-side scripting. + When you do use Pazpar2 in conjunction + with browser scripting (JavaScript/Ajax, Flash, applets, + etc.), there are special considerations. + - - Sometimes, it may be necessary to implement functionality on your - regular webserver that makes use of search results, for example to - implement data import functionality, emailing results, history - lists, personal citation lists, interlibrary loan functionality - ,etc. Fortunately, it is simple to exchange information between - pazpar2, your browser scripting, and backend server-side scripting. - You can send a session ID and possibly a record ID from your browser - code to your server code, and from there use pazpar2s webservice API - to access result sets or individual records. You could even 'hide' - all of pazpar2s functionality between your own API implemented on - the server-side, and access that from the browser or elsewhere. The - possibilities are just about endless. - -
+ + Pazpar2 implements a simple but efficient HTTP server, and it is + designed to interact directly with scripting running in the browser + for the best possible performance, and to limit overhead when + several browser clients generate numerous webservice requests. + However, it is still desirable to use a conventional webserver, + such as Apache, to serve up graphics, HTML documents, and + server-side scripting. Because the security sandbox environment of + most browser-side programming environments only allows communication + with the server from which the enclosing HTML page or object + originated, Pazpar2 is designed so that it can act as a transparent + proxy in front of an existing webserver (see for details). + In this mode, all regular + HTTP requests are transparently passed through to your webserver, + while Pazpar2 only intercepts search-related webservice requests. + -
- Your data model - - Pazpar2 does not have a preconceived model of what makes up a data - model. There are no assumption that records have specific fields or - that they are organized in any particular way. The only assumption - is that data comes packaged in a form that the software can work - with (presently, that means XML or MARC), and that you can provide - the necessary information to massage it into pazpar2's internal - record abstraction. - + + If you want to expose your combined service on port 80, you can + either run your regular webserver on a different port, a different + server, or a different IP address associated with the same server. + - - Handling retrieval records in pazpar2 is a two-step process. First, - you decide which data elements of the source record you are - interested in, and you specify any desired massaging or combining of - elements using an XSLT stylesheet (MARC records are automatically - normalized to MARCXML before this step). If desired, you can run - multiple XSLT stylesheets in series to accomplish this, but the - output of the last one should be a representation of the record in a - schema that pazpar2 understands. - + + Pazpar2 can also work behind + a reverse Proxy. Refer to ) + for more information. + This allows your existing HTTP server to operate on port 80 as usual. + Pazpar2 can be started on another (internal) port. + - - The intermediate, internal representation of the record looks like - this: - + + Sometimes, it may be necessary to implement functionality on your + regular webserver that makes use of search results, for example to + implement data import functionality, emailing results, history + lists, personal citation lists, interlibrary loan functionality + ,etc. Fortunately, it is simple to exchange information between + Pazpar2, your browser scripting, and backend server-side scripting. 
+ You can send a session ID and possibly a record ID from your browser + code to your server code, and from there use Pazpar2's webservice API + to access result sets or individual records. You could even 'hide' + all of Pazpar2's functionality behind your own API implemented on + the server-side, and access that from the browser or elsewhere. The + possibilities are just about endless. +
- The Shining +
+ Your data model + + Pazpar2 does not have a preconceived model of what makes up a data + model. There are no assumption that records have specific fields or + that they are organized in any particular way. The only assumption + is that data comes packaged in a form that the software can work + with (presently, that means XML or MARC), and that you can provide + the necessary information to massage it into Pazpar2's internal + record abstraction. + - King, Stephen + + Handling retrieval records in Pazpar2 is a two-step process. First, + you decide which data elements of the source record you are + interested in, and you specify any desired massaging or combining of + elements using an XSLT stylesheet (MARC records are automatically + normalized to MARCXML before this step). If desired, you can run + multiple XSLT stylesheets in series to accomplish this, but the + output of the last one should be a representation of the record in a + schema that Pazpar2 understands. + - ebook + + The intermediate, internal representation of the record looks like + this: + - - -]]> + The Shining - As you can see, there isn't much to it. There are really only a few - important elements to this file. - + King, Stephen - - Elements should belong to the namespace - http://www.indexdata.com/pazpar2/1.0. If the root node contains the - attribute 'mergekey', then every record that generates the same - merge key (normalized for case differences, white space, and - truncation) will be joined into a cluster. In other words, you - decide how records are merged. If you don't include a merge key, - records are never merged. The 'metadata' elements provide the meat - of the elements -- the content. the 'type' attribute is used to - match each element against processing rules that determine what - happens to the data element next. - + ebook - - The next processing step is the extraction of metadata from the - intermediate representation of the record. 
This is governed by the - 'metadata' elements in the 'service' section of the configuration - file. See for details. The metadata - in the retrieval record ultimately drives merging, sorting, ranking, - the extraction of browse facets, and display, all configurable. - -
+ + + ]]> -
- Client development overview - - You can use pazpar2 from any environment that allows you to use - webservices. The initial goal of the software was to support - Ajax-based applications, but there literally are no limits to what - you can do. You can use pazpar2 from Javascript, Flash, Java, etc., - on the browser side, and from any development environment on the - server side, and you can pass session tokens and record IDs freely - around between these environments to build sophisticated applications. - Use your imagination. - + As you can see, there isn't much to it. There are really only a few + important elements to this file. + - - The webservice API of pazpar2 is described in detail in . - + + Elements should belong to the namespace + http://www.indexdata.com/pazpar2/1.0. + If the root node contains the + attribute 'mergekey', then every record that generates the same + merge key (normalized for case differences, white space, and + truncation) will be joined into a cluster. In other words, you + decide how records are merged. If you don't include a merge key, + records are never merged. The 'metadata' elements provide the meat + of the elements -- the content. the 'type' attribute is used to + match each element against processing rules that determine what + happens to the data element next. + - - In brief, you use the 'init' command to create a session, a - temporary workspace which carries information about the current - search. You start a new search using the 'search' command. Once the - search has been started, you can follow its progress using the - 'stat', 'bytarget', 'termlist', or 'show' commands. Detailed records - can be fetched using the 'record' command. - -
+ + The next processing step is the extraction of metadata from the + intermediate representation of the record. This is governed by the + 'metadata' elements in the 'service' section of the configuration + file. See for details. The metadata + in the retrieval record ultimately drives merging, sorting, ranking, + the extraction of browse facets, and display, all configurable. + +
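The merging rule described above (records that produce the same normalized merge key are joined into a cluster, and records without a merge key are never merged) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the functions normalize_mergekey and cluster_records, the dictionary-based record shape, and the max_len truncation limit are hypothetical and are not part of Pazpar2, whose actual normalization happens internally.

```python
import re
from collections import defaultdict


def normalize_mergekey(key, max_len=64):
    """Illustrative normalization: trim, collapse white space, fold case, truncate.

    max_len is a made-up limit for this sketch, not a Pazpar2 setting.
    """
    collapsed = re.sub(r"\s+", " ", key.strip())
    return collapsed.casefold()[:max_len]


def cluster_records(records):
    """Group records whose normalized merge keys are equal.

    A record without a merge key is never merged: it forms its own cluster.
    Clusters are returned in order of first appearance.
    """
    by_key = defaultdict(list)
    clusters = []
    for record in records:
        key = record.get("mergekey")
        if not key:
            clusters.append([record])
            continue
        normalized = normalize_mergekey(key)
        if normalized not in by_key:
            # First record with this key: register its (shared) list as a cluster.
            clusters.append(by_key[normalized])
        by_key[normalized].append(record)
    return clusters


if __name__ == "__main__":
    sample = [
        {"mergekey": "title The Shining author King, Stephen", "target": "a"},
        {"mergekey": "title  the shining   author king, stephen ", "target": "b"},
        {"target": "c"},
    ]
    for cluster in cluster_records(sample):
        print([r["target"] for r in cluster])
```

The point of the sketch is the design choice the text describes: the stylesheet author decides what goes into the merge key, and that choice alone determines which records end up in the same cluster.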
-
- Connecting to non-standard resources - - Pazpar2 uses Z39.50 as its switchboard language -- i.e. as far as it - is concerned, all resources speak Z39.50. It is, however, equipped - to handle a broad range of different server behavior, through - configurable query mapping and record normalization. If you develop - configuration, stylesheets, etc., for a new type of resources, we - encourage you to share your work. But you can also use pazpar2 to - connect to hundreds of resources that do not support standard - protocols. - +
+ Client development overview + + You can use Pazpar2 from any environment that allows you to use + webservices. The initial goal of the software was to support + Ajax-based applications, but there literally are no limits to what + you can do. You can use Pazpar2 from Javascript, Flash, Java, etc., + on the browser side, and from any development environment on the + server side, and you can pass session tokens and record IDs freely + around between these environments to build sophisticated applications. + Use your imagination. + - - For a growing number of resources, Z39.50 is all you need. Over the - last few years, a number of commercial, full-text resources have - implemented Z39.50. These can be used through pazpar2 with little or - no effort. Resources that use non-standard record formats will - require a bit of XSLT work, but that's all. - + + The webservice API of Pazpar2 is described in detail in . + - - But what about resources that don't support Z39.50 at all? The NISO - SRU (MXG) protocol is slowly gathering steam. Other resources might - support OpenSearch, private, XML/HTTP-based protocols, or something - else entirely. Some databases exist only as web user interfaces and - will require screen-scraping. Still others exist only as static - files, or perhaps as databases supporting the OAI-PMH protocol. - There is hope! Read on. - + + In brief, you use the 'init' command to create a session, a + temporary workspace which carries information about the current + search. You start a new search using the 'search' command. Once the + search has been started, you can follow its progress using the + 'stat', 'bytarget', 'termlist', or 'show' commands. Detailed records + can be fetched using the 'record' command. + +
- - Index Data continues to advocate the support of open standards. We - work with database vendors to support standards, so you don't have - to worry about programming against non-standard services. We also - provide tools (see SimpleServer) - which make it comparatively easy to build gateways against servers - with non-standard behavior. Again, we encourage you to share any - work you do in this direction. - +
+ Connecting to non-standard resources + + Pazpar2 uses Z39.50 as its switchboard language -- i.e. as far as it + is concerned, all resources speak Z39.50. It is, however, equipped + to handle a broad range of different server behavior, through + configurable query mapping and record normalization. If you develop + configuration, stylesheets, etc., for a new type of resources, we + encourage you to share your work. But you can also use Pazpar2 to + connect to hundreds of resources that do not support standard + protocols. + - - But the bottom line is that working with non-standard resources in - metasearching is really, really hard. If you want to build a - project with pazpar2, and you need access to resources with - non-standard interfaces, we can help. We run gateways to more than - 2,000 popular, commercial databases and other resources, making it simple - to plug them directly into pazpar2. For a small annual fee per - database, we can help you establish connections to your licensed - resources. Meanwhile, you can help! If you build your own - standards-compliant gateways, host them for others, or share the - code! And tell your vendors that they can save everybody money and - increase the appeal of their resources by supporting standards. - + + For a growing number of resources, Z39.50 is all you need. Over the + last few years, a number of commercial, full-text resources have + implemented Z39.50. These can be used through Pazpar2 with little or + no effort. Resources that use non-standard record formats will + require a bit of XSLT work, but that's all. + - - There are those who will ask us why we are using Z39.50 as our - switchboard langyage rather than a different protocol. Basically, - we believe that Z39.50 is presently the most widely implemented - information retrieval protocol that has the level of functionality - required to support a good metasearching experience (structured - searching, structured, well-defined results). 
It is also compact and - efficient, and there is a very broad range of tools available to - implement it. - -
- + + But what about resources that don't support Z39.50 at all? The NISO + SRU (MXG) protocol is slowly gathering steam. Other resources might + support OpenSearch, private, XML/HTTP-based protocols, or something + else entirely. Some databases exist only as web user interfaces and + will require screen-scraping. Still others exist only as static + files, or perhaps as databases supporting the OAI-PMH protocol. + There is hope! Read on. + + + + Index Data continues to advocate the support of open standards. We + work with database vendors to support standards, so you don't have + to worry about programming against non-standard services. We also + provide tools (see SimpleServer) + which make it comparatively easy to build gateways against servers + with non-standard behavior. Again, we encourage you to share any + work you do in this direction. + + + + But the bottom line is that working with non-standard resources in + metasearching is really, really hard. If you want to build a + project with Pazpar2, and you need access to resources with + non-standard interfaces, we can help. We run gateways to more than + 2,000 popular, commercial databases and other resources, + making it simple + to plug them directly into Pazpar2. For a small annual fee per + database, we can help you establish connections to your licensed + resources. Meanwhile, you can help! If you build your own + standards-compliant gateways, host them for others, or share the + code! And tell your vendors that they can save everybody money and + increase the appeal of their resources by supporting standards. + + + + There are those who will ask us why we are using Z39.50 as our + switchboard language rather than a different protocol. Basically, + we believe that Z39.50 is presently the most widely implemented + information retrieval protocol that has the level of functionality + required to support a good metasearching experience (structured + searching, structured, well-defined results). 
It is also compact and + efficient, and there is a very broad range of tools available to + implement it. + +
+ +
+ Unicode Compliance + + Pazpar2 is Unicode compliant and language and locale aware, but it relies + on the character encoding of the targets being specified correctly if + the targets themselves are not UTF-8 based (most aren't). + Just a few badly behaving targets can spoil the search experience + considerably if, for example, Greek, Russian, or other non-7-bit-ASCII + search terms are entered. In these cases some targets return + records irrelevant to the query, and the result screens will be + cluttered with noise. + + + While noise from misbehaving targets cannot be removed, it can + be reduced using truly Unicode-based ranking. This is an + option which is available to the system administrator if ICU + support is compiled into Pazpar2; see + for details. + + + In addition, the ICU tokenization and normalization rules must + be defined in the master configuration file described in + . +
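To give a feel for what Unicode normalization and case folding do to search terms, here is a minimal Python sketch. It uses Python's standard unicodedata module rather than ICU, and the function fold_term is hypothetical: it only illustrates the kind of operation involved and does not reproduce the rules Pazpar2 applies, which are defined in its configuration.

```python
import unicodedata


def fold_term(term):
    """Decompose (NFKD), drop combining marks, and case-fold a term.

    Illustration only: this is not the ICU rule chain used by Pazpar2.
    """
    decomposed = unicodedata.normalize("NFKD", term)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.casefold()


if __name__ == "__main__":
    for term in ("Ελλάδα", "ΕΛΛΑΔΑ", "Café", "Москва"):
        print(term, "->", fold_term(term))
```

With folding of this kind, differently cased or accented spellings of the same word compare equal, which is what makes ranking across targets with mixed conventions workable.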
+ +
Reference @@ -757,7 +872,7 @@ POSSIBILITY OF SUCH DAMAGES. - +