IrTcl User's Guide and Reference
<author><htmlurl url="http://www.indexdata.dk/" name="Index Data">, <tt><htmlurl url="mailto:info@indexdata.dk" name="info@indexdata.dk"></tt>
<date>$Date: 2004-04-26 09:09:06 $
<abstract>
IrTcl version 1.4.2 -- a Tcl extension that allows you to build Z39.50 clients.
</abstract>
<toc>

<sect>Introduction

This document describes the <sf/IrTcl/ information retrieval toolkit, which offers a high-level interface for the development of Z39.50 clients. The toolkit is based on the Tcl/Tk toolkit developed by Prof. John K. Ousterhout at the University of California [ref 1]. The core of Tcl is rather small, but it offers a flexible C API for easy development of Tcl extensions. The most important Tcl extension is probably Tk -- a portable GUI toolkit for X/WIN32/MAC.

To interface with the Z39.50 protocol, <sf/IrTcl/ uses the <bf/YAZ/ toolkit.

<sf/IrTcl/ is usually built as a <it/dynamic/ library (WIN32) or a shared object (Unix), which is loaded dynamically with Tcl's <tt/load/ command. However, <sf/IrTcl/ can also be compiled as a traditional <it/static/ library.

<sect>Compilation and installation

<sect1>UNIX

In order to compile the software on UNIX you need:
<itemize>
<item> An ANSI C compiler such as GNU C.
<item> <htmlurl url="http://www.scriptics.com" name="Tcl">. Versions 8.0 and 8.3 have been tested.
<item> <htmlurl url="http://www.indexdata.dk/yaz/" name="YAZ"> version 1.6 or higher.
</itemize>

Unpack the <sf/IrTcl/ package at the same directory level as <bf/YAZ/. Type:
<tscreen><verb>
 $ ./configure
</verb></tscreen>

This command tries to configure <sf/IrTcl/ for your system and creates a <tt>Makefile</tt>. The <tt>configure</tt> script tries to locate the file <tt/tclConfig.sh/, which should be generated by Tcl's installation script. Configure looks for the Tcl shell on your system in order to locate this file.
For example, if <tt/tclsh/ is located in <tt>/home/joe/bin</tt>, configure will assume that <tt>tclConfig.sh</tt> is installed in <tt>/home/joe/lib</tt>, in which case the prefix is <tt>/home/joe</tt>. If you have more than one Tcl version installed on your system, or if configure cannot find <tt/tclConfig.sh/, you can specify its directory with the option <tt>--with-tclconfig</tt>, for example:
<tscreen><verb>
 $ ./configure --with-tclconfig=/home/joe/lib
</verb></tscreen>

The <sf/IrTcl/ executables are installed in prefix/bin, and libraries and support files are installed in prefix/irtcl.

Compile <sf/IrTcl/ by typing:
<tscreen><verb>
 $ make
</verb></tscreen>

For Tcl versions that support dynamic libraries, the command above will create the shared library, <tt/irtcl.so/, as well as the normal static library, <tt/libirtcl.a/.

For Tcl versions that do not support dynamic libraries, the make command will create a shell with built-in <sf/IrTcl/ support -- a Tcl shell called <tt/ir-tcl/. The traditional static library, <tt/libirtcl.a/, is built as well.

To install the programs and support files type:
<tscreen><verb>
 $ make install
</verb></tscreen>

If you wish to install man pages type:
<tscreen><verb>
 $ make install.man
</verb></tscreen>

Summary of files installed (the directory names refer to the Makefile variables):
<descrip>
<tag><tt>irtcl.so</tt></tag> The <sf/IrTcl/ shared dynamic library. The actual name of this library varies. Installed in <tt>IRTCLDIR</tt>. This file is only generated when using newer versions of Tcl/Tk.
<tag><tt>ir-tcl</tt></tag> The <sf/IrTcl/ shell for Tcl. This program is not needed when using a Tcl that supports shared libraries. Installed in <tt>BINDIR</tt> -- defaults to <tt>/usr/local/bin</tt>.
<tag><tt>client.tcl</tt></tag> A graphical client for Tk. The client is installed as an executable script called <tt>irclient</tt> in <tt>BINDIR</tt>. This client needs a number of files, bitmaps, etc.
The client looks for the files in the current directory; if this fails it looks in the directory <tt>IRTCLDIR</tt> -- defaults to <tt>/usr/local/lib/irtcl</tt>.
<tag><tt>libirtcl.a</tt></tag> The <sf/IrTcl/ library. Installed in <tt>LIBDIR</tt> -- defaults to <tt>/usr/local/lib</tt>.
<tag><tt>ir-tcl.h</tt></tag> The <sf/IrTcl/ header file. Installed in <tt>INCDIR</tt> -- defaults to <tt>/usr/local/include</tt>.
<tag><tt>irtdb.tcl</tt></tag> A setup file with definitions of targets and queries. Read and updated by <tt>client.tcl</tt>. Installed in <tt>IRTCLDIR</tt> -- defaults to <tt>/usr/local/lib/irtcl</tt>.
<tag><tt>formats/*</tt></tag> Display format files written in Tk. Read by <tt>client.tcl</tt>. Installed in <tt>IRTCLDIR</tt> -- defaults to <tt>/usr/local/lib/irtcl</tt>.
<tag><tt>bitmaps/*</tt></tag> Various bitmap files. Read by <tt>client.tcl</tt>. Installed in <tt>IRTCLDIR</tt> -- defaults to <tt>/usr/local/lib/irtcl</tt>.
<tag><tt>LICENSE</tt></tag> LICENSE file. Read by <tt>client.tcl</tt>. Installed in <tt>IRTCLDIR</tt> -- defaults to <tt>/usr/local/lib/irtcl</tt>.
</descrip>

<sf/IrTcl/ can be used either from the Tcl interpreter <tt/tclsh/ or from the statically linked program <tt/ir-tcl/. Using <tt/tclsh/ is the preferred method; in that case you must load the <sf/IrTcl/ library with the <tt/load/ command.

A session with the static, non-dynamic, version goes like this:
<tscreen><verb>
 $ ir-tcl
 %
</verb></tscreen>
and a session with the dynamic version (preferred) goes like this:
<tscreen><verb>
 $ tclsh
 % load ./irtcl.so
 %
</verb></tscreen>

<sect1>WIN32

<sf/IrTcl/ is shipped with a "makefile" for the NMAKE tool, which is part of Microsoft Visual C++.

Start an MS-DOS prompt and switch to the sub directory <tt>WIN</tt>, where the file <tt>makefile</tt> is located. Customize the installation by editing the <tt>makefile</tt> file (for example by using wordpad). The following summarises the most important settings in that file.
<descrip>
<tag><tt>YAZDIR</tt></tag> Specifies where YAZ is located.
<tag><tt>DEBUG</tt></tag> If set to 1, the software is compiled with debugging libraries. If set to 0, the software is compiled with release (non-debugging) libraries.
</descrip>

When satisfied with the settings in the makefile, type
<tscreen><verb>
 nmake
</verb></tscreen>

If compilation was successful, the library <tt>irtcl.dll</tt> is placed in the directory <tt>BIN</tt>.

To start the test client that comes with <sf/IrTcl/, make sure both <tt/YAZ.DLL/ and <tt/IRTCL.DLL/ are in the current directory or in your PATH. Go to the top-level directory of <sf/IrTcl/ and type <tt>wish -f client.tcl</tt>. You might want to make a short-cut to start this.

<sect1>Using Tk

If your Tcl/Tk supports dynamic libraries, you can use the <tt/load/ command from within <tt/wish/ as described in the previous section.

The enclosed script <tt>client.tcl</tt> is a graphical client which demonstrates a user interface for the Z39.50 protocol. The script started out relatively small but has grown since; at present it is about 3000 lines.

To start the client with dynamic library support use:
<tscreen><verb>
 $ wish -f client.tcl
</verb></tscreen>

Note: Substitute the real name of the wish interpreter above; for version 8.0 it is probably called <tt/wish8.0/.

The client lets you define targets and query types within the interface. Hence, you will not need to modify configuration files.

Target-related functions are found in the pull down menu 'Target', which has the following options:
<descrip>
<tag>Connect</tag> Establishes connection to a target.
<tag>Disconnect</tag> Closes a target connection.
<tag>About</tag> Shows implementation Id, implementation Version, etc. for the current target.
<tag>Setup</tag> Pops up a target definition window. You may alter a target definition.
<tag>Setup new</tag> Lets you define a new target.
</descrip>

The term query type refers to a collection of search fields. The pull down menu Options|Query deals with queries.
You may insert/modify/remove query types.

<sect>Overview of the API

Basically, <sf/IrTcl/ is a set of commands introduced to Tcl. When extending Tcl there are two approaches: action-oriented commands and object-oriented commands.

Action-oriented commands manipulate Tcl variables, and each command introduces only one action. The string manipulation commands in Tcl are action-oriented.

Object-oriented commands are added for every declared variable (object). Object-oriented commands usually provide a set of actions (methods) to manipulate the object. The widgets in Tk (X objects) are examples of the object-oriented style.

<sf/IrTcl/ commands are object-oriented. The main reason for this is that the data structures involved in the IR protocol are not easily represented by Tcl data structures. Also, the <sf/IrTcl/ objects tend to exist for a relatively long time. Note that although we use the term object-oriented commands, this does not mean that the programming style is strictly object-oriented. For example, there is no such thing as inheritance.

We are now ready to present the three commands introduced to Tcl by <sf/IrTcl/:
<descrip>
<tag/ir/ The ir object represents a connection to a target. More precisely, it describes a Z-association.
<tag/ir-set/ The ir-set describes a result set, which is conceptually a collection of records returned by the target. The ir-set object may retrieve records from a target by means of the ir object; it may read/write records from/to a local file, or it may be updated with a user-edited record.
<tag/ir-scan/ The scan object represents a list of scan lines retrieved from a target.
</descrip>

<bf/Example/

To create a new IR object called <tt/z-assoc/ write:
<tscreen><verb>
   ir z-assoc
</verb></tscreen>
<bf/End of example/

Each object provides a set of <em/settings/ which may be readable, writeable or both. All settings immediately follow the name of the object. If a value is present, the setting is set to <em/value/; otherwise the current value is returned.
<bf/Example/

We wish to set the preferred-message-size to 18000 on the <tt/z-assoc/ object:
<tscreen><verb>
   z-assoc preferredMessageSize 18000
</verb></tscreen>

To read the current value of preferred-message-size use:
<tscreen><verb>
   z-assoc preferredMessageSize
</verb></tscreen>
<bf/End of example/

One important category of settings consists of those that relate to the event-driven model. When <sf/IrTcl/ receives responses from the target, i.e. init responses, search responses, etc., a <em/callback/ routine is called. Callback routines are represented in Tcl as a list, which is re-interpreted prior to invocation. The method is similar to the one used in Tk to capture X events.

For each Z39.50 request there is a corresponding object action. The most important actions are:
<descrip>
<tag/connect/ Establishes connection with a target.
<tag/init/ Sends an initialize request.
<tag/search/ Sends a search request.
<tag/present/ Sends a present request.
<tag/scan/ Sends a scan request.
</descrip>

<bf/Example/

This example shows a complete connect - init - search - present scenario.

First an IR object, called <tt/z/, is created. The setting <tt/databaseNames/ is set to the database <tt/books/, to which the following searches are directed. A result set <tt/z.1/ is then introduced by the <tt/ir-set/ command, and it is specified that the result set uses <tt/z/ as its association. A callback is then defined and a connection is established to <tt/fake.com/ by the <tt/connect/ action. If the connect succeeds, <tt/connect-response/ is called.

In the Tcl procedure <tt/connect-response/, a callback is defined <em/before/ the init request is executed. The Tcl procedure <tt/init-response/ is called when an init response is returned from the target.

The <tt/init-response/ procedure sets up a <tt/search-response/ callback handler and sends a search request using a query which consists of the single word <tt/science/.
When the <tt/search-response/ procedure is called, it defines a variable <tt/hits/ and sets it to the value of the setting <tt/resultCount/. If <tt/hits/ is positive, a present request is sent -- asking for 5 records from position 1.

Finally, a present response is received and the number of records returned is stored in the variable <tt/ret/.

<tscreen><verb>
  ir z
  z databaseNames books
  ir-set z.1 z
  z callback {connect-response}
  z connect fake.com

  proc connect-response {} {
      z callback {init-response}
      z init
  }

  proc init-response {} {
      z callback {search-response}
      z.1 search science
  }

  proc search-response {} {
      set hits [z.1 resultCount]
      puts "$hits hits"
      if {$hits > 0} {
          z callback {present-response}
          z.1 present 1 5
      }
  }

  proc present-response {} {
      set ret [z.1 numberOfRecordsReturned]
      puts "$ret records returned"
  }
</verb></tscreen>
<bf/End of example/

The previous example program does not handle error conditions. If errors occur in the program, they will be trapped by the Tcl error handler. This is not always appropriate. However, Tcl offers a <tt/catch/ command to support error handling by the program itself.

<sect>Associations

The ir object describes an association with a target. This section covers the connect-init-disconnect actions provided by the ir object. An ir object is created by the <tt/ir/ command, and the created object enters a 'not connected' state, because it is not connected to a target yet.

<sect1>Connect

A connection is established by the <tt/connect/ action, which is immediately followed by a hostname. A number of settings affect the <tt/connect/ action. Obviously, these settings should be set <bf/before/ connecting. The settings are:
<descrip>
<tag><tt>comstack </tt><tt>mosi|tcpip</tt></tag> Comstack type. Note that <tt/mosi/ is no longer supported by <bf/YAZ/.
<tag><tt>protocol </tt><tt>Z39|SR</tt></tag> Protocol type - ANSI/NISO Z39.50 or ISO SR. Note that SR is no longer supported by <bf/YAZ/.
<tag><tt>callback </tt>list</tag> Tcl script called when the connection is established.
<tag><tt>failback </tt>list</tag> Fatal error Tcl script. Called on protocol errors or if the target closes the connection.
</descrip>

If the connect is unsuccessful, either the connect action itself will return an error code or the failback handler is invoked.

In general, the <tt>failback</tt> handler is invoked when serious unrecoverable errors occur when communicating with the target. In this case the <sf/IrTcl/ system shuts down the connection. The <tt>failback</tt> handler might inspect the <tt>failInfo</tt> setting to determine the cause of the failure; it returns two elements. The first is an integer error code; the second is an English description of the error. The error codes and the corresponding messages are:
<descrip>
<tag><tt>0</tt></tag>ok
<tag><tt>1</tt></tag>connect failed
<tag><tt>2</tt></tag>connection closed
<tag><tt>3</tt></tag>connection closed
<tag><tt>4</tt></tag>failed to decode incoming APDU
<tag><tt>5</tt></tag>unknown APDU
</descrip>
Note: in case 2 the connection was closed during a read operation, whereas in case 3 it was closed during a write operation.

<sect1>Init

If the connect operation succeeds, the <tt/init/ action should be used. The init related settings are:
<descrip>
<tag><tt>preferredMessageSize </tt>integer</tag> Preferred-message-size. Default value is 30000.
<tag><tt>maximumRecordSize </tt>integer</tag> Maximum-record-size. Default value is 30000.
<tag><tt>idAuthentication </tt>string ...</tag> Id-authentication. There are three forms. If an empty string is given, Id-authentication is not used. If one non-empty string is given, the 'open' authentication is used. If three strings are specified, the 'id-pass' authentication (version 3 only) is used, in which case the first string is groupId, the second string is userId and the third string is password.
<tag><tt>implementationName </tt>string</tag> Implementation-name of origin system.
<tag><tt>implementationId</tt></tag> Implementation-id of origin system. This setting is read-only.
<tag><tt>implementationVersion</tt></tag> Implementation-version of origin system. This setting is read-only.
<tag><tt>options </tt>list</tag> Options to be negotiated in the init service. The list contains the options that are set. Possible values are <tt>search</tt>, <tt>present</tt>, <tt>delSet</tt>, <tt>resourceReport</tt>, <tt>triggerResourceCtrl</tt>, <tt>resourceCtrl</tt>, <tt>accessCtrl</tt>, <tt>scan</tt>, <tt>sort</tt>, <tt>extendedServices</tt>, <tt>level-1Segmentation</tt>, <tt>level-2Segmentation</tt>, <tt>concurrentOperations</tt> and <tt>namedResultSets</tt>. Currently the default options are: <tt>search</tt>, <tt>present</tt>, <tt>scan</tt> and <tt>namedResultSets</tt>. The <tt>options</tt> setting is set to its default value when an ir object is created and when a <tt>disconnect</tt> action is performed.
<tag><tt>protocolVersion </tt>integer</tag> Protocol version: 2, 3, etc. Default is 2.
<tag><tt>referenceId </tt>string</tag> Reference-id of init operation. If string is empty no reference-id is used.
<tag><tt>initResponse </tt>list</tag> Init-response Tcl script.
<tag><tt>callback </tt>list</tag> General response Tcl script. Only used if <tt>initResponse</tt> is not specified.
</descrip>

The init-response handler should inspect some of the settings shown below:
<descrip>
<tag><tt>initResult </tt>returns boolean</tag> Init response status. True if init operation was successful; false otherwise.
<tag><tt>preferredMessageSize </tt>returns integer</tag> Preferred-message-size after negotiation.
<tag><tt>maximumRecordSize </tt>returns integer</tag> Maximum-record-size after negotiation.
<tag><tt>targetImplementationName </tt>returns string</tag> Implementation-name of target system.
<tag><tt>targetImplementationId </tt>returns string</tag> Implementation-id of target system.
<tag><tt>targetImplementationVersion </tt>returns string</tag> Implementation-version of target system.
<tag><tt>options </tt>returns list</tag> Options after negotiation. The list contains the options that are set.
<tag><tt>protocolVersion </tt>returns integer</tag> Protocol version (2, 3, etc.) after negotiation.
<tag><tt>userInformationField </tt>returns string</tag> User information field.
<tag><tt>referenceId </tt>returns string</tag> Reference-id of init response.
</descrip>

<bf/Example/

Consider a client with the ability to access multiple targets.

We define a list of targets that we wish to connect to. Each item in the list describes the target parameters with the following four components: association-name, comstack-type, protocol-type and a hostname.

The list for the two targets, Library of Congress and Data Research, will be defined as:
<tscreen><verb>
   set targetList { {loc tcpip Z39 z3950.loc.gov:7090}
                    {drs tcpip Z39 dranet.dra.com} }
</verb></tscreen>

The Tcl code below defines, connects and initializes the targets in <tt/targetList/:
<tscreen><verb>
   foreach target $targetList {
       set assoc [lindex $target 0]
       ir $assoc
       $assoc comstack [lindex $target 1]
       $assoc protocol [lindex $target 2]
       $assoc failback [list fail-response $assoc]
       $assoc callback [list connect-response $assoc]
       $assoc connect [lindex $target 3]
   }

   proc connect-response {assoc} {
       $assoc callback [list init-response $assoc]
       $assoc init
   }

   proc fail-response {assoc} {
       puts "$assoc closed connection or protocol error"
   }

   proc init-response {assoc} {
       if {[$assoc initResult]} {
           puts "$assoc initialized ok"
       } else {
           puts "$assoc didn't initialize"
       }
   }
</verb></tscreen>

The variable <tt/target/ is bound to each item in the list of targets in turn. The variable <tt/assoc/ is set to the ir object name. Then, the comstack, protocol and failback are set for the <tt/assoc/ object. The ir object name is passed as an argument to the <tt/fail-response/ and <tt/connect-response/ routines.
Note the use of the Tcl <tt/list/ command, which is necessary here because the argument contains variables (<tt/assoc/) that should be substituted before the handler is defined. After the connect operation, the <tt/init-response/ handler is defined in much the same way as the failback handler. Finally, an init request is executed.

<bf/End of example/

<sect1>Disconnect

To terminate the connection the <tt/disconnect/ action should be used. This action has no parameters. Another connection may be established by a new <tt/connect/ action on the same ir object.

<sect>Result sets

This section covers the queries used by <sf/IrTcl/, and how searches and presents are handled.

A search operation and a result set are described by the ir set object. The ir set object is defined by the <tt/ir-set/ command, which has two parameters. The first is the name of the new ir set object, and the second, which is optional, is the name of an association -- an ir object. The second argument is required if the ir set object should be able to perform searches and presents. However, it is not required if only ``local'' operations are done with the ir set object.

When the ir set object is created, a number of settings are inherited from the ir object, such as the selected databases, query type, etc. Thus, the ir object contains what we could call default settings.

<sect1>Queries

Search requests are sent by the <tt/search/ action, which takes a query as parameter. There are two types of queries, RPN and CCL, controlled by the setting <tt/queryType/. A string representation of the query is used in <sf/IrTcl/, since Tcl has reasonably powerful string manipulation capabilities. The RPN query used in <sf/IrTcl/ is the prefix query notation also used in the <bf/YAZ/ test client.

The CCL query is an uninterpreted octet-string which is parsed by the target. We refer to the standard: ISO 8777. Note that only a few targets actually support the CCL query, and the interpretation of the standard may vary.
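
As a minimal sketch of how the <tt/queryType/ setting selects between the two query types, assume an ir set object <tt/z.1/ that was created on an initialized association, as in the overview example. The CCL string and its qualifier <tt/ti/ are assumptions made for illustration; what a given target accepts depends on that target.
<tscreen><verb>
   # RPN (the default): the query is written in prefix query notation
   z.1 queryType rpn
   z.1 search science

   # Alternative: CCL, where the string is passed on for the target to parse.
   # The qualifier "ti" is hypothetical and target dependent.
   #   z.1 queryType ccl
   #   z.1 search "ti=science"
</verb></tscreen>
The alternative is left commented out so that only one search is issued; in the event-driven model a second request would normally be sent from a response handler.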
The prefix query notation (which is converted to RPN) offers a few operators. They are:
<descrip>
<tag><tt>@attr </tt>list op</tag> The attributes in list are applied to op.
<tag><tt>@and </tt>op1 op2</tag> Boolean <em/and/ on op1 and op2.
<tag><tt>@or </tt>op1 op2</tag> Boolean <em/or/ on op1 and op2.
<tag><tt>@not </tt>op1 op2</tag> Boolean <em/not/ on op1 and op2.
<tag><tt>@prox </tt>list op1 op2</tag> Proximity operation on op1 and op2. Not implemented yet.
<tag><tt>@set </tt>name</tag> Result set reference.
<tag><tt>@attrset </tt>set</tag> Whole query uses the specified attribute set. If this operator is used, it must appear at the beginning of the query.
</descrip>

It is simple to build RPN queries in <sf/IrTcl/. Search terms are sequences of characters, as in:
<tscreen><verb>
   science
</verb></tscreen>

Boolean operators use the prefix notation (instead of the suffix/RPN), as in:
<tscreen><verb>
   @and science technology
</verb></tscreen>

Search terms may be associated with attributes. These attributes are indicated by the <tt/@attr/ operator. Assuming the bib-1 attribute set, we can set the use-attribute (type is 1) to title (value is 4):
<tscreen><verb>
   @attr 1=4 science
</verb></tscreen>

Also, it is possible to apply attributes to a range of search terms. In the query below, both search terms have use=title, but the <tt/tech/ term is right truncated:
<tscreen><verb>
   @attr 1=4 @and @attr 5=1 tech beta
</verb></tscreen>

To search for the DatabaseInfo records from an Explain server, we could use
<tscreen><verb>
   @attrset exp1 @attr 1=1 DatabaseInfo
</verb></tscreen>

<sect1>Search

The settings that affect the search are listed below:
<descrip>
<tag><tt>databaseNames </tt>list</tag> Database-names.
<tag><tt>smallSetUpperBound </tt>integer</tag> Small set upper bound. Default 0.
<tag><tt>largeSetLowerBound </tt>integer</tag> Large set lower bound. Default 1.
<tag><tt>mediumSetPresentNumber </tt>integer</tag> Medium set present number. Default 0.
<tag><tt>replaceIndicator </tt>boolean</tag> Replace-indicator. Default true (1).
<tag><tt>setName </tt>string</tag> Name of result set. Default name of set is <tt/default/.
<tag><tt>queryType rpn|ccl</tt></tag> Query type-1 or query type-2. Default rpn (type-1).
<tag><tt>preferredRecordSyntax </tt>string</tag> Preferred record syntax -- UNIMARC, USMARC, etc.
<tag><tt>smallSetElementSetNames </tt>string</tag> Small-set-element-set names. If string is empty the element set is not set. Default is empty (not set).
<tag><tt>mediumSetElementSetNames </tt>string</tag> Medium-set-element-set names. If string is empty the element set is not set. Default is empty (not set).
<tag><tt>nextResultSetPosition </tt>returns integer</tag> Next result set position.
<tag><tt>referenceId </tt>string</tag> Reference-id. If string is empty no reference-id is used.
<tag><tt>searchResponse </tt>list</tag> Search-response Tcl script.
<tag><tt>callback </tt>list</tag> General response Tcl script. Only used if <tt>searchResponse</tt> is not specified. This setting is valid only for the <tt/ir/ object -- not the <tt/ir-set/ object.
</descrip>

Setting the <tt/databaseNames/ is mandatory. All other settings have reasonable defaults. The search-response handler, specified by either the <tt/callback/ or the <tt/searchResponse/ setting, should read some of the settings shown below:
<descrip>
<tag><tt>searchStatus</tt> returns boolean</tag> Search-status. True if search operation was successful; false otherwise.
<tag><tt>responseStatus </tt>returns list</tag> Response status information.
<tag><tt>resultCount </tt>returns integer</tag> Result-count.
<tag><tt>numberOfRecordsReturned </tt>returns integer</tag> Number of records returned.
<tag><tt>referenceId </tt>returns string</tag> Reference-id of search response.
</descrip>

The <tt/responseStatus/ signals one of three conditions, indicated by the value of the first item in the list:
<descrip>
<tag><tt>NSD</tt></tag> indicates that the target has returned one or more non-surrogate diagnostic messages. The <tt/NSD/ item is followed by a list with all non-surrogate messages. Each non-surrogate message consists of three items. The first item is the error code (integer); the next item is a textual representation of the error code in plain English; the third item is additional information, possibly empty if no additional information was returned by the target.
<tag><tt>DBOSD</tt></tag> indicates a successful operation where the target has returned one or more records. Each record may be either a database record or a surrogate diagnostic.
<tag><tt>OK</tt></tag> indicates a successful operation -- no records are returned from the target.
</descrip>

<bf/Example/

We continue with the multiple-targets example. The <tt/init-response/ procedure will attempt to make searches:
<tscreen><verb>
   proc init-response {assoc} {
       puts "$assoc connected"
       ir-set ${assoc}.1 $assoc
       $assoc.1 queryType rpn
       $assoc.1 databaseNames base-a base-b
       $assoc callback [list search-response $assoc ${assoc}.1]
       $assoc.1 search "@attr 1=4 @and @attr 5=1 tech beta"
   }
</verb></tscreen>

An ir set object is defined and is told the name of the ir object. The name of the ir set object uses the name of the ir object as prefix. Then, the query-type is defined to be RPN, i.e. we will use the prefix query notation later on. Two databases, <tt/base-a/ and <tt/base-b/, are selected. A <tt/search-response/ handler is defined with the ir object and the ir-set object as parameters, and the search is executed.
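
The way such a handler list is built and later re-interpreted can be tried in plain Tcl without any target. The sketch below is only an illustration: the <tt/search-response/ procedure is a stand-in that returns a string, and the association name <tt/loc/ is taken from the target list used earlier.
<tscreen><verb>
   # Stand-in handler for illustration only; a real handler would
   # inspect the ir set object instead of returning a string.
   proc search-response {assoc rset} {
       return "handler got association '$assoc' and result set '$rset'"
   }

   set assoc loc
   # Variables are substituted now, when the list is built
   set handler [list search-response $assoc ${assoc}.1]
   puts $handler

   # Later the list is evaluated as a command with its two arguments
   puts [eval $handler]
</verb></tscreen>
Building the handler with <tt/list/ keeps each argument a separate list element, which matters if a name ever contains spaces or other special characters.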
The first part of the <tt/search-response/ looks like:
<tscreen><verb>
   proc search-response {assoc rset} {
       set status [$rset responseStatus]
       set type [lindex $status 0]
       if {$type == "NSD"} {
           set code [lindex $status 1]
           set msg [lindex $status 2]
           set addinfo [lindex $status 3]
           puts "NSD $code: $msg: $addinfo"
           return
       }
       set hits [$rset resultCount]
       if {$type == "DBOSD"} {
           set ret [$rset numberOfRecordsReturned]
           ...
       }
   }
</verb></tscreen>

The response status is stored in the variable <tt/status/, and the first element indicates the condition. If non-surrogate diagnostics are returned, they are displayed. Otherwise, the search was a success and the number of hits is read. Finally, it is tested whether the search response returned records (database or diagnostic).

Note that we did not inspect the search status (setting <tt/searchStatus/) to determine whether the search was successful, because the standard specifies that one or more non-surrogate diagnostics should be returned by the target in case of errors.

<bf/End of example/

If one or more records are returned from the target, they will be stored in the result set object. A search response that contains records is very similar to a present response. Therefore, some settings are common to both situations.

<sect1>Present

The <tt/present/ action sends a present request. The <tt/present/ is followed by two optional integers. The first integer is the result-set starting position -- defaults to 1. The second integer is the number of records requested -- defaults to 10. The settings which could be modified before a <tt/present/ action are:
<descrip>
<tag><tt>preferredRecordSyntax </tt>string</tag> Preferred record syntax -- UNIMARC, USMARC, etc.
<tag><tt>elementSetNames </tt>string</tag> Element-set names. If string is empty the element set is not set. Default is empty (not set).
<tag><tt>referenceId </tt>string</tag> Reference-id. If string is empty no reference-id is used.
<tag><tt>presentResponse </tt>list</tag> Present-response Tcl script.
<tag><tt>callback </tt>list</tag> General response Tcl script. Only used if <tt>presentResponse</tt> is not specified. This setting is valid only for the <tt/ir/ object -- not the <tt/ir-set/ object.
</descrip>

The present-response handler should inspect the settings shown in the list below. Note that the <tt/responseStatus/ and <tt/numberOfRecordsReturned/ settings were also used in the search-response case.

As in the search-response case, records returned from the target are stored in the result set object.
<descrip>
<tag><tt>presentStatus </tt>returns boolean</tag> Present-status.
<tag><tt>responseStatus </tt>returns list</tag> Response status information.
<tag><tt>numberOfRecordsReturned </tt>returns integer</tag> Number of records returned.
<tag><tt>nextResultSetPosition </tt>returns integer</tag> Next result set position.
<tag><tt>referenceId </tt>returns string</tag> Reference-id of present response.
</descrip>

<sect1>Records

Search responses and present responses may result in one or more records stored in the ir set object if the <tt/responseStatus/ setting indicates database or surrogate diagnostics (<tt/DBOSD/). The individual records, indexed by an integer position offset, should then be inspected.

If element set names have been specified either in the search requests (<tt>smallSetElementSetNames</tt> / <tt>mediumSetElementSetNames</tt>) or in the present requests (<tt>elementSetNames</tt>), the individual records in the ir set object are assigned appropriate element set ids. In this mode, records at a given position are treated as different records as long as they have different element set ids. To inspect records with a particular element set id in subsequent operations, use the <tt>recordElements</tt> setting followed by the id. If you have more than one record at a given position and you do not use <tt>recordElements</tt>, the record selected at the given position is undefined.
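
A minimal sketch of this mode follows. It assumes an ir object <tt/z/ and an ir set object <tt/z.1/ with a completed search, as in the overview example. The element set name <tt/B/ is an assumption, and so is the idea that the element set id given to <tt>recordElements</tt> is the same string as the element set name that was requested.
<tscreen><verb>
   # Request records with an element set name; "B" is an assumed name
   z.1 elementSetNames B
   z callback {present-response}
   z.1 present 1 5

   proc present-response {} {
       # Assumption: the element set id equals the requested name
       z.1 recordElements B
       puts "position 1 holds: [z.1 type 1]"
   }
</verb></tscreen>
Selecting the id before inspecting a position avoids the undefined case in which several records with different ids share that position.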
The action <tt>type</tt> followed by an integer returns information about a given position in an ir set. There are three possibilities:
<descrip>
<tag><tt/SD/</tag> The item is a surrogate diagnostic record.
<tag><em/empty/</tag> There is no record at the specified position.
<tag><tt/DB/</tag> The item is a database record.
</descrip>

To handle the first case, surrogate diagnostic record, the <tt/Diag/ action should be used. It returns three items: error code (integer), text representation in plain English (string), and additional information (string, possibly empty).

In the second case, no record, note that there still might be a record at the position but with an id that differs from that specified by <tt>recordElements</tt>.

In the third case, database record, the <tt/recordType/ action should be used. It returns the record type at the given position. Some record types are:
<tscreen>
UNIMARC INTERMARC CCF USMARC UKMARC NORMARC LIBRISMARC DANMARC FINMARC SUTRS
</tscreen>

<bf/Example/

We continue our search-response example. In the <tt/DBOSD/ case, we should inspect the result set items. Recall that the ir set name was passed to the search-response handler as argument <tt/rset/.
<tscreen><verb>
       if {$type == "DBOSD"} {
           set ret [$rset numberOfRecordsReturned]
           for {set i 1} {$i<=$ret} {incr i} {
               set itype [$rset type $i]
               if {$itype == "SD"} {
                   set diag [$rset Diag $i]
                   set code [lindex $diag 0]
                   set msg [lindex $diag 1]
                   set addinfo [lindex $diag 2]
                   puts "$i: SD $code: $msg: $addinfo"
               } elseif {$itype == "DB"} {
                   set rtype [$rset recordType $i]
                   puts "$i: type is $rtype"
               }
           }
       }
</verb></tscreen>

Each item in the result set is examined. If an item is a surrogate diagnostic, it is displayed; otherwise, if it is a database record, its type is displayed.

<bf/End of example/

<sect1>MARC records

When there is a MARC record at a given position, we want to display it somehow. The action <tt/getMarc/ is what we need.
The <tt/getMarc/ action is followed by a position integer and the type of extraction we want to make: <tt/field/ or <tt/line/. The <tt/field/ and <tt/line/ types are followed by three parameters that serve as extraction masks. They are called tag, indicator and field. If the mask matches a tag/indicator/field of a record, the information is extracted. Two characters have special meaning in masks: the dot (any character) and the star (any number of any character).

The <tt/field/ type returns one or more lists of field information that match the mask specification. Only the content of fields is returned.

The <tt/line/ type, on the other hand, returns a Tcl list that completely describes the layout of the MARC record -- including tags, fields, etc.

The <tt/field/ type is sufficient and efficient when only a small number of fields are extracted, and when no further processing (in Tcl) is necessary. However, if the MARC record is to be edited or altered in any way, the <tt/line/ extraction is more powerful -- only limited by the Tcl language itself.

<bf/Example/

Consider the record below:

<tscreen><verb>
001 11224466
003 DLC
005 00000000000000.0
008 910710c19910701nju 00010 eng
010 $a 11224466
040 $a DLC $c DLC
050 00 $a 123-xyz
100 10 $a Jack Collins
245 10 $a How to program a computer
260 1 $a Penguin
263 $a 8710
300 $a p. cm.
</verb></tscreen>

Assuming this record is at position 1 in ir-set <tt/z.1/, we might extract the title field (245 * a) with the following command:

<tscreen><verb>
z.1 getMarc 1 field 245 * a
</verb></tscreen>

which gives:

<tscreen><verb>
{How to program a computer}
</verb></tscreen>

Using <tt/line/ instead of <tt/field/ gives:

<tscreen><verb>
{245 {10} {{a {How to program a computer}} }}
</verb></tscreen>

If we wish to extract the whole record as a list, we use:

<tscreen><verb>
z.1 getMarc 1 line * * *
</verb></tscreen>

giving:

<tscreen><verb>
{001 {} {{{} { 11224466 }} }}
{003 {} {{{} DLC} }}
{005 {} {{{} 00000000000000.0} }}
{008 {} {{{} {910710c19910701nju 00010 eng }} }}
{010 { } {{a { 11224466 }} }}
{040 { } {{a DLC} {c DLC} }}
{050 {00} {{a 123-xyz} }}
{100 {10} {{a {Jack Collins}} }}
{245 {10} {{a {How to program a computer}} }}
{260 {1 } {{a Penguin} }}
{263 { } {{a 8710} }}
{300 { } {{a {p. cm.}} }}
</verb></tscreen>

<bf/End of example/

<bf/Example/

This example demonstrates how Tcl can be used to examine a MARC record in the list notation. The procedure <tt/extract-format/ extracts fields from a MARC record based on a number of masks. It takes five parameters: <tt/r/, a record in list notation; <tt/tag/, a regular expression to match the record tags; <tt/ind/, a regular expression to match indicators; <tt/field/, a regular expression to match fields; and finally <tt/text/, a regular expression to match the content of a field.

<tscreen><verb>
proc extract-format {r tag ind field text} {
    foreach line $r {
        if {[regexp $tag [lindex $line 0]] && \
                [regexp $ind [lindex $line 1]]} {
            foreach f [lindex $line 2] {
                if {[regexp $field [lindex $f 0]]} {
                    if {[regexp $text [lindex $f 1]]} {
                        puts [lindex $f 1]
                    }
                }
            }
        }
    }
}
</verb></tscreen>

To match <tt/comput/ followed by any number of characters in the 245 fields in the record from the previous example, we could use:

<tscreen><verb>
set r [z.1 getMarc 1 line * * *]
extract-format $r 245 .. . \
comput
</verb></tscreen>

which gives:

<tscreen><verb>
How to program a computer
</verb></tscreen>

<bf/End of example/

The <tt/putMarc/ action does the opposite of <tt/getMarc/. It copies a record in Tcl list notation to an ir set object and is needed if a result set must be updated with a record that was modified in Tcl (for example, one edited by the user).

<sect1>SUTRS

In <sf/IrTcl/ a SUTRS record is treated as one single string. To retrieve a SUTRS record, use the <tt>getSutrs</tt> action followed by an index.

<sect1>XML

In <sf/IrTcl/ an XML record is treated as one single string. To retrieve an XML record, use the <tt>getXml</tt> action followed by an index.

<sect1>GRS-1

A GRS-1 record in <sf/IrTcl/ is represented as a list of elements. Each element specifies a tag as well as data. The data may be a subtree, which is represented as a list, and so on.

The method <tt/getGrs/ is followed by a record index and optional specifiers that select a specific sub-tree.

Each element consists of five items:

<descrip>
<tag>tag-set</tag> Tag set number.
<tag>value-type</tag> Type of tag value. May be either <tt/numeric/ or <tt/string/.
<tag>value</tag> The value itself.
<tag>data-type</tag> May be either <tt/octets/, <tt/numeric/, <tt/ext/, <tt/string/, <tt/bool/, <tt/intUnit/, <tt/empty/, <tt/notRequested/, <tt/diagnostic/ or <tt/subtree/.
<tag>data</tag>The data associated with the element, of the type indicated by data-type. If data-type is <tt/numeric/ or <tt/string/, the data is encoded as a single Tcl token. The data-type <tt/bool/ is encoded as 0 or 1 for false and true respectively. If the data-type is <tt/subtree/, the data is a sub-list. In all other cases, the data is the empty string.
</descrip>

<bf/Example/

Consider the GRS-1 record below as shown by the <bf/YAZ/ client program:

<tscreen><verb>
(1,1) OID: GILS-schema
(1,14) 2
(2,1) UTAH EARTHQUAKE EPICENTERS
      class=4,type=1,value=us
(4,52) UTAH GEOLOGICAL AND MINERAL SURVEY
(3,Local-Subject-Index) APPALACHIAN VALLEY; EARTHQUAKE; EPICENTER
(2,6)
    (1,19) Five files of epicenter data arranged by ...
    (3,Format) DIGITAL DATA SETS
    (3,Data-Category) TERRESTRIAL
    (3,Comments) Data are supplied by the University of Utah ...
(4,70)
    (4,90)
        (2,10) UTAH GEOLOGICAL AND MINERAL SURVEY
        (4,2) 606 BLACK HAWK WAY
        (4,3) SALT LAKE CITY
        (3,State) UT
        (3,Zip-Code) 84108
        (2,16) USA
        (2,14) (801) 581-6831
    (4,7) UTAH EARTHQUAKE EPICENTERS
(4,1) ESDD0006
(1,16) 198903
</verb></tscreen>

The record may be fetched from the result set, <tt/z.1/, at position 1 by using:

<tscreen><verb>
z.1 getGrs 1
</verb></tscreen>

which will return:

<tscreen><verb>
{ 1 numeric 1 oid 1.2.840.10003.13.2 }
{ 1 numeric 14 string 2 }
{ 2 numeric 1 string { UTAH EARTHQUAKE EPICENTERS} }
{ 4 numeric 52 string {UTAH GEOLOGICAL AND MINERAL SURVEY} }
{ 3 string Local-Subject-Index string {APPALACHIAN VALLEY; EARTHQUAKE; EPICENTER} }
{ 2 numeric 6 subtree {
    { 1 numeric 19 string {Five files of epicenter data arranged by ...} }
    { 3 string Format string {DIGITAL DATA SETS} }
    { 3 string Data-Category string TERRESTRIAL }
    { 3 string Comments string {Data are supplied by the University of Utah ...} }
} }
{ 4 numeric 70 subtree {
    { 4 numeric 90 subtree {
        { 2 numeric 10 string {UTAH GEOLOGICAL AND MINERAL SURVEY} }
        { 4 numeric 2 string {606 BLACK HAWK WAY} }
        { 4 numeric 3 string {SALT LAKE CITY} }
        { 3 string State string UT }
        { 3 string Zip-Code string 84108 }
        { 2 numeric 16 string USA }
        { 2 numeric 14 string {(801) 581-6831} }
    } }
    { 4 numeric 7 string {UTAH EARTHQUAKE EPICENTERS} }
} }
{ 4 numeric 1 string ESDD0006 }
{ 1 numeric 16 string 198903 }
</verb></tscreen>

We can choose to get only the path (2,6) by using:

<tscreen><verb>
z.1 getGrs 1 (2,6)
</verb></tscreen>

and we'll get:

<tscreen><verb>
{ 2 numeric 6 subtree {
    { 1 numeric 19 string {Five files of epicenter data arranged by ...} }
    { 3 string Format string {DIGITAL DATA SETS} }
    { 3 string Data-Category string TERRESTRIAL }
    { 3 string Comments string {Data are supplied by the University of Utah ...} }
} }
</verb></tscreen>

To get the well-known (1,19) element within (2,6), we use

<tscreen><verb>
z.1 getGrs 1 (2,6) (1,19)
</verb></tscreen>

and get:

<tscreen><verb>
{ 2 numeric 6 subtree {
    { 1 numeric 19 string {Five files of epicenter data arranged by ...} }
} }
</verb></tscreen>

<bf/End of example/

<sect1>Explain

Explain records are retrieved like other records. The method <tt>getExplain</tt> is followed by an index and an optional Explain record pattern.

The returned record is a canonical representation of the Explain record. An ASN.1 sequence is represented as a list. Each item in the list consists of the name of the element, followed by its value if the value is supplied.

The optional pattern that follows the index after <tt>getExplain</tt> consists of one or more elements that are matched against the elements of the actual record.

<bf/Example/

One of the few targets that support explain is the ATT research server at <tt>z3950.research.att.com</tt>. The targetInfo record was returned by the target and is stored at position 1 in the result set, <tt>z.1</tt>.

To retrieve the whole record we use

<tscreen><verb>
z.1 getExplain 1
</verb></tscreen>

and we get in return

<tscreen><verb>
{targetInfo commonInfo {name {Lucent Technologies Research Server}}
  recentNews icon {namedResultSets 1} {multipleDBsearch 0}
  {maxResultSets 100} {maxResultSize 600000} maxTerms timeoutInterval
  {welcomeMessage {strings { {language eng} {text {Salutations - this is Lucent Technologies experimental Z39.50 server.
No guarentees, but free and unlimited access!}} } } }
  {contactInfo {name {Robert Waldstein}}
    {description {strings { {language eng} {text {Librarian system designer - no legal anythings}} } } }
    {address {strings { {language eng} {text {Room 3D-591 600 Mountain Ave Murray Hill N.J. USA 07974}} } } }
    {email wald@lucent.com} {phone {908 582-6171}} }
  description nicknames
  {usageRest {strings { {language eng} {text {None - as long as nonProfit research}} } } }
  paymentAddr
  {hours {strings { {language eng} {text {Should never be down}} } } }
  dbCombinations addresses commonAccessInfo }
</verb></tscreen>

The <tt>targetInfo</tt> above indicates that the record is really a <tt>targetInfo</tt> record. The <tt>commonInfo</tt>, which is optional, is not supplied by this server. The <tt>name</tt>, however, is supplied, with the value <tt>Lucent Technologies Research Server</tt>.

To retrieve the <tt>contactInfo</tt> from the record above, we can extract the element by using Tcl's list manipulation facilities, for example by doing

<tscreen><verb>
set ti [z.1 getExplain 1]
lindex [lindex $ti 0] 12
</verb></tscreen>

which will return

<tscreen><verb>
contactInfo {name {Robert Waldstein}}
  {description {strings { {language eng} {text {Librarian system designer - no legal anythings}} } } }
  {address {strings { {language eng} {text {Room 3D-591 600 Mountain Ave Murray Hill N.J. USA 07974}} } } }
  {email wald@lucent.com} {phone {908 582-6171}}
</verb></tscreen>

We can extract almost the same thing by doing

<tscreen><verb>
z.1 getExplain 1 targetInfo contactInfo
</verb></tscreen>

which will return

<tscreen><verb>
{name {Robert Waldstein}}
{description {strings { {language eng} {text {Librarian system designer - no legal anythings}} } } }
{address {strings { {language eng} {text {Room 3D-591 600 Mountain Ave Murray Hill N.J.
USA 07974}} } } }
{email wald@lucent.com} {phone {908 582-6171}}
</verb></tscreen>

<bf/End of example/

<sect>Scan

To perform a scan, a scan object must be created by the <tt>ir-scan</tt> command. This command takes two arguments: the name of the scan object and the name of the ir object. The scan object provides one <tt>scan</tt> action, which sends a scan request to the target. The <tt>scan</tt> action is followed by a string describing the starting point of the term list. The format used is a simple subset of the query used in search requests. Only <tt>@attr</tt> specifications and simple terms are allowed.

The settings that affect the scan are:

<descrip>
<tag><tt>stepSize </tt>integer</tag> Step size. Default is 0.
<tag><tt>numberOfTermsRequested </tt>integer</tag> Number of terms requested. Default is 20.
<tag><tt>preferredPositionInResponse </tt>integer</tag> Preferred position in response. Default is 1.
<tag><tt>databaseNames </tt>list</tag> Database names. Note that this setting is not (yet) supported for the scan object. You must set this for the ir object instead.
<tag><tt>referenceId </tt>string</tag> Reference-id. If the string is empty, no reference-id is used.
<tag><tt>scanResponse </tt>list</tag> Scan-response Tcl script.
<tag><tt>callback </tt>list</tag> General response Tcl script. Only used if <tt>scanResponse</tt> is not specified. This setting is valid only for the <tt/ir/ object -- not the <tt/ir-set/ object.
</descrip>

The scan object normally holds one or more scan line entries upon successful completion. The table below summarizes the settings that should be used in a response handler.

<descrip>
<tag><tt>scanStatus </tt>returns integer</tag> Scan status. An integer between 0 and 6.
<tag><tt>numberOfTermsReturned </tt>returns integer</tag> Number of terms returned.
<tag><tt>positionOfTerm </tt>returns integer</tag> An integer describing the position of the term.
<tag><tt>scanLine </tt>returns list</tag> Returns information about the scan line (entry) at the index given by the integer that follows. The first scan line is numbered zero, the second one, and so on. A list is returned by the <tt>scanLine</tt> setting. The first element is <tt>T</tt> if the scan line is a normal term and <tt>SD</tt> if the scan line is a surrogate diagnostic. In the first case (normal), the scan term is the second element in the list and the number of occurrences is the third element. In the other case (surrogate diagnostic), the second element is the diagnostic code, the third is a text representation of the error code, and the fourth is additional information.
<tag><tt>referenceId </tt>returns string</tag> Reference-id of scan response.
</descrip>

<bf/Example/

We will scan for the terms after <tt>science</tt> in the Title index. We will assume that an ir object called <tt>z-assoc</tt> has already been created.

<tscreen><verb>
z-assoc callback {scan-response}
ir-scan z-scan z-assoc
z-scan scan "@attr 1=4 science"

proc scan-response {} {
    set status [z-scan scanStatus]
    if {$status == 0} {
        set no [z-scan numberOfTermsReturned]
        for {set i 0} {$i < $no} {incr i} {
            set line [z-scan scanLine $i]
            set type [lindex $line 0]
            if {$type == "T"} {
                # Normal term: print the term itself
                puts [lindex $line 1]
            } elseif {$type == "SD"} {
                # Surrogate diagnostic: print the diagnostic code
                puts [lindex $line 1]
            }
        }
    }
}
</verb></tscreen>

<bf/End of example/

<sect>License

Copyright © 1995-2004, Index Data ApS.

Permission to use, copy, modify, distribute, and sell this software and its documentation, in whole or in part, for any purpose, is hereby granted, provided that:

1. This copyright and permission notice appear in all copies of the software and its documentation. Notices of copyright or attribution which appear at the beginning of any file must remain unchanged.

2. The names of Index Data or the individual authors may not be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT WARRANTY OF ANY KIND, EXPRESS, IMPLIED, OR OTHERWISE, INCLUDING WITHOUT LIMITATION, ANY WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT SHALL INDEX DATA BE LIABLE FOR ANY SPECIAL, INCIDENTAL, INDIRECT OR CONSEQUENTIAL DAMAGES OF ANY KIND, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER OR NOT ADVISED OF THE POSSIBILITY OF DAMAGE, AND ON ANY THEORY OF LIABILITY, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

<sect>About Index Data

Index Data is a consulting and software-development enterprise that specialises in library and information management systems. Our interests and expertise span a broad range of related fields, and one of our primary, long-term objectives is the development of a powerful information management system with open network interfaces and hypermedia capabilities.

We make this software available free of charge, under a fairly unrestrictive license, as a service to the networking community, and to further the development of quality software for open network communication.

We'll be happy to answer questions about the software, and about ourselves in general.

<tscreen><verb>
Index Data
Kobmagergade 43
1150 Copenhagen K
Denmark
</verb></tscreen>

<tscreen><verb>
Phone: +45 3341 0100
Fax  : +45 3341 0101
Email: info@indexdata.dk
</verb></tscreen>

<sect>References

<descrip>
<tag>1 IrTcl Homepage</tag> <htmlurl url="http://www.indexdata.dk/irtcl/" name="http://www.indexdata.dk/irtcl/">
<tag>2 Ousterhout, John K.:</tag> Tcl and the Tk Toolkit. Addison-Wesley Company Inc (ISBN 0-201-63337-X). The Tcl/Tk toolkit home page is <htmlurl url="http://tcl.activestate.com" name="http://tcl.activestate.com">. The primary download area is <htmlurl url="http://prdownloads.sourceforge.net/tcl/" name="http://prdownloads.sourceforge.net/tcl/">.
<tag>3 Welch, Brent B.:</tag> Practical Programming in Tcl and Tk.
Prentice Hall (ISBN 0-13-616830-2). </descrip> </article>