   1 <!doctype linuxdoc system>
   2
   3 <!--
   4   $Id: ir-tcl.sgml,v 1.9 1995-06-25 10:25:33 adam Exp $
   5 -->
   6
   7 <article>
   8 <title>IrTcl User's Guide and Reference
   9 <author>Index Data, <tt/info@index.ping.dk/
  10 <date>$Revision: 1.9 $
  11 <abstract>
  12 This document describes IrTcl &mdash; an information retrieval toolkit for
  13 Tcl and Tk that provides access to the Z39.50/SR protocol.
  14 </abstract>
  15
  16 <toc>
  17
  18 <sect>Introduction
  19
  20 <p>
  21 This document describes the <sf/IrTcl/ information retrieval toolkit,
  22 which offers a high-level, client interface to the Z39.50 and SR protocols.
  23 The toolkit is based on the Tcl/Tk toolkit developed by Prof. John
  24 K. Ousterhout at the University of California &lsqb;ref 1&rsqb;.
  25 Tcl is a simple, somewhat shell-like, interpreted language. What
  26 makes Tcl attractive is that it also offers a C API, which makes
  27 extensions to the language possible. The most important Tcl extension is
  28 probably Tk, a Motif look-and-feel interface to the X window
  29 system.
  30
  31 To interface with the Z39.50/SR protocols, <sf/IrTcl/ uses <bf/YAZ/.
  32 <bf/YAZ/ offers two transport types: RFC1729/BER on TCP/IP and the mOSI
  33 protocol stack.
  34 However, the mOSI transport is only an option, and hence it is not
  35 needed unless you wish to communicate within an OSI environment.
  36 See &lsqb;ref 2&rsqb; for more information about the XTI/mOSI implementation.
  37
  38 <sf/IrTcl/ provides two system environments:
  39
  40 <itemize>
  41 <item> A simple command line shell &mdash; useful for
  42 testing purposes.
  43 <item> A system which operates within the Tk environment which
  44 makes it very easy to implement GUI clients.
  45 </itemize>
  46
  47 <sect>Compilation and installation
  48
  49 <p>
  50 In order to compile you need:
  51 <itemize>
  52 <item> An ANSI C compiler such as GNU C.
  53 <item> Tcl 7.3.
  54 <item> YAZ version 1.0b or higher.
  55 </itemize>
  56
  57 As an option you may want:
  58 <itemize>
  59 <item> Tk 3.6.
  60 <item> XTI/mOSI.
  61 </itemize>
  62
  63 Newer versions of Tcl and Tk have been released. These packages
  64 will <em/not/ work with <sf/IrTcl/. The <sf/IrTcl/ package will
  65 probably be able to use the newer versions soon. Fortunately this
  66 move will not change the <sf/IrTcl/ API &mdash; only the Tk code of the
  67 test client will be modified.
  68
  69 Unpack the <sf/IrTcl/ package at the same directory level as <bf/YAZ/.
  70
  71 Type:
  72 <tscreen><verb>
  73 $ ./configure
  74 </verb></tscreen>
  75
  76 This command tries to configure <sf/IrTcl/ for your system and creates
  77 a <tt>Makefile</tt>.
  78
  79 If the <tt>configure</tt> command cannot locate Tcl and Tk in the standard
  80 library locations searched by your C compiler, it will guess
  81 that the libraries are located in <tt>/usr/local/lib</tt> and that
  82 the header files are located in <tt>/usr/local/include</tt>.
  83 If this is incorrect you will have to modify the <tt>Makefile</tt> yourself.
  84
  85 Compile <sf/IrTcl/ by typing:
  86 <tscreen><verb>
  87 $ make
  88 </verb></tscreen>
  89
  90 If you don't have Tk you will only be able to create the <tt>ir-tcl</tt>
  91 program and you must type <tt>make ir-tcl</tt> instead.
  92
  93 If successful, this will make <tt>ir-tcl</tt>, <tt>ir-tk</tt> (if
  94 Tk is present) and a library called <tt>libirtcl.a</tt>.
  95
  96 To install the programs and support files type:
  97 <tscreen><verb>
  98 $ make install
  99 </verb></tscreen>
 100
 101 Summary of files installed (the names refer to the Makefile variables):
 102
 103 <descrip>
 104 <tag><tt>ir-tk</tt></tag> The Tk client. Installed in <tt>BINDIR</tt> &mdash;
 105 defaults to <tt>/usr/local/bin</tt>. When ir-tk starts it reads
 106 <tt>client.tcl</tt>. If the file doesn't exist in the current
 107 directory it tries to read it from <tt>IRTCLDIR</tt> - defaults
 108 to <tt>/usr/local/lib/irtcl</tt>.
 109 <tag><tt>ir-tcl</tt></tag> The Tcl client. Installed in <tt>BINDIR</tt> &mdash;
 110 defaults to <tt>/usr/local/bin</tt>.
 111 <tag><tt>libirtcl.a</tt></tag> The <sf/IrTcl/ library.
 112 Installed in <tt>LIBDIR</tt> &mdash; defaults to <tt>/usr/local/lib</tt>.
 113 <tag><tt>ir-tcl.h</tt></tag> The <sf/IrTcl/ header file.
 114 Installed in <tt>INCDIR</tt> &mdash; defaults to <tt>/usr/local/include</tt>.
 115 <tag><tt>client.tcl</tt></tag> A graphical client written in Tk.
 116 Installed in <tt>IRTCLDIR</tt> &mdash; defaults to
 117 <tt>/usr/local/lib/irtcl</tt>.
 118 <tag><tt>clientrc.tcl</tt></tag> A setup file with definitions
 119 of target and queries. Read and updated by <tt>client.tcl</tt>. Installed
 120 in <tt>IRTCLDIR</tt> &mdash; defaults to <tt>/usr/local/lib/irtcl</tt>.
 121 <tag><tt>formats/*</tt></tag> Display format files written
 122 in Tk. Read by <tt>client.tcl</tt>. Installed
 123 in <tt>IRTCLDIR</tt> &mdash; defaults to <tt>/usr/local/lib/irtcl</tt>.
 124 <tag><tt>bitmaps/*</tt></tag> Various bitmap files. Read by
 125 <tt>client.tcl</tt>. Installed
 126 in <tt>IRTCLDIR</tt> &mdash; defaults to <tt>/usr/local/lib/irtcl</tt>.
 127 <tag><tt>LICENSE</tt></tag> LICENSE file. Read by
 128 <tt>client.tcl</tt>. Installed
 129 in <tt>IRTCLDIR</tt> &mdash; defaults to <tt>/usr/local/lib/irtcl</tt>.
 130 </descrip>
 131
 132 <sect1>ir-tcl
 133
 134 <p>
 135 The <tt>ir-tcl</tt> program is a shell like <tt>tclsh</tt> except that
 136 <tt>ir-tcl</tt> features the new set of information retrieval commands.
 137 Normally <tt>ir-tcl</tt> waits on <tt/stdin/ (for you to type commands) and
 138 on socket events (connected to Z39.50/SR targets).
 139 You simply type the Tcl commands line by line. A filename may be specified as
 140 argument to <tt>ir-tcl</tt> in which case the file specified is evaluated
 141 as a script.
 142
 143 <sect1>ir-tk
 144
 145 <p>
 146 <tt>ir-tk</tt> is a program which basically waits for X events and
 147 socket events (associated with Z39.50/SR targets). <tt>ir-tk</tt> normally
 148 tries to evaluate the file named <tt>client.tcl</tt> when it
 149 starts. However, this behaviour may be changed by specifying another
 150 filename with the <tt>-file</tt> option.
 151
 152 The enclosed script <tt>client.tcl</tt> is a graphical client which
 153 demonstrates a user interface for the Z39.50/SR protocols.
 154 The script started out relatively small but has grown since;
 155 at present it is about 3000 lines.
 156
 157 The client lets you define targets and query types within the interface.
 158 Hence, you will not need to modify configuration files.
 159
 160 Functions concerning targets can be found in the pull-down menu 'Target'
 161 with the following options:
 162 <descrip>
 163 <tag>Connect</tag> Establishes connection to a target.
 164 <tag>Disconnect</tag> Closes a target connection.
 165 <tag>About</tag> Shows implementation Id, implementation Version, etc.
 166  for the current target.
 167 <tag>Setup</tag> Pops up a target definition window. You may alter
 168  a target definition.
 169 <tag>Setup new</tag> Lets you define a new target.
 170 </descrip>
 171
 172 The term query type refers to a collection of search fields. The
 173 pull down menu Options|Query deals with queries. You may
 174 insert/modify/remove query types.
 175
 176 <sect>Overview of the API
 177
 178 <p>
 179 Basically, <sf/IrTcl/ is a set of commands introduced to Tcl.
 180 When extending Tcl there are two approaches: action-oriented commands
 181 and object-oriented commands.
 182
 183 Action-oriented commands manipulate
 184 Tcl variables and each command introduces only one action.
 185 The string manipulation commands in Tcl are action oriented.
 186
 187 Object-oriented commands are added for every declared
 188 variable (object). Object-oriented commands usually provide a set of
 189 actions (methods) to manipulate the object.
 190 The widgets in Tk (X objects) are examples of the object-oriented style.
 191
 192 <sf/IrTcl/ commands are object-oriented. The main reason
 193 for this is that the data structures involved in the IR protocol
 194 are not easily represented by Tcl data structures.
 195 Also, the <sf/IrTcl/ objects tend to exist for a relatively long time.
 196 Note that although we use the term object-oriented commands, this
 197 does not mean that the programming style is strictly object-oriented. For
 198 example, there is no such thing as inheritance.
 199
 200 We are now ready to present the three commands introduced to Tcl by
 201 <sf/IrTcl/:
 202
 203 <descrip>
 204 <tag/ir/ The ir object represents a connection to a target. More
 205 precisely it describes a Z-association.
 206 <tag/ir-set/ The ir-set describes a result set, which is
 207 conceptually a collection of records returned by the target.
 208 The ir-set object may retrieve records from a target by means of
 209 the ir object; it may read/write records from/to a local file or it may be
 210 updated with a user-edited record.
 211 <tag/ir-scan/ The scan object represents a list of scan lines
 212 retrieved from a target.
 213 </descrip>
 214
 215 <bf/Example/
 216
 217 To create a new IR object called <tt/z-assoc/ write:
 218 <tscreen><verb>
 219    ir z-assoc
 220 </verb></tscreen>
 221
 222 <bf/End of example/
 223
 224 Each object provides a set of <em/settings/ which may be
 225 readable, writable or both. A setting name immediately follows
 226 the name of the object. If a value follows the setting name, the setting
 227 is set to that <em/value/; otherwise the current value is returned.
 228
 229 <bf/Example/
 230
 231 We wish to set the preferred-message-size to 18000 on the
 232 <tt/z-assoc/ object:
 233
 234 <tscreen><verb>
 235    z-assoc preferredMessageSize 18000
 236 </verb></tscreen>
 237
 238 To read the current value of preferred-message-size use:
 239
 240 <tscreen><verb>
 241    z-assoc preferredMessageSize
 242 </verb></tscreen>
 243 <bf/End of example/
 244
 245 One important category of settings consists of those that relate to the
 246 event-driven model. When <sf/IrTcl/ receives responses from the target, i.e.
 247 init responses, search responses, etc., a <em/callback/ routine
 248 is called. Callback routines are represented in Tcl as
 249 a list, which is re-interpreted prior to invocation.
 250 The method is similar to the one used in Tk to capture X events.
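
The following is a minimal sketch, in plain Tcl, of how such a callback
list can be stored and evaluated later. The procedure <tt/show-response/
and the variable names are hypothetical, and <tt/eval/ merely stands in for
whatever <sf/IrTcl/ does internally when a response arrives.

<tscreen><verb>
# Hypothetical handler, for illustration only.
proc show-response {name kind} {
    return "$name: got $kind response"
}

set assoc z

# Build the callback with the list command, so that $assoc is
# substituted now while the command itself is run later.
set callback [list show-response $assoc init]

# Later on, the stored list is evaluated as a Tcl command.
puts [eval $callback]
</verb></tscreen>

Run in a plain <tt/tclsh/, this prints <tt/z: got init response/.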
 251
 252 For each SR/Z39.50 request there is a corresponding object action. The most
 253 important actions are:
 254 <descrip>
 255 <tag/connect/ Establishes connection with a target
 256 <tag/init/ Sends an initialize request.
 257 <tag/search/ Sends a search request.
 258 <tag/present/ Sends a present request.
 259 <tag/scan/ Sends a scan request.
 260 </descrip>
 261
 262 <bf/Example/
 263
 264 This example shows a complete connect - init - search - present scenario.
 265
 266 First an IR object, called <tt/z/, is created.
 267 Also, a result set <tt/z.1/ is introduced by the <tt/ir-set/ command
 268 and it is specified that the result set uses <tt/z/ as its association.
 269
 270 The setting <tt/databaseNames/ is set to the
 271 database <tt/books/ to which the following searches are directed.
 272 A callback is then defined and a connection is established to
 273 <tt/fake.com/ by the <tt/connect/ action.
 274 If the connect succeeds, the <tt/connect-response/ procedure is called.
 275
 276 In the Tcl procedure <tt/connect-response/, a callback is defined
 277 <em/before/ the init request is executed.
 278 The Tcl procedure <tt/init-response/ is called when an
 279 init response is returned from the target.
 280
 281 The <tt/init-response/ procedure sets up a <tt/search-response/
 282 callback handler and sends a search-request by using a query which
 283 consists of a single word <tt/science/.
 284
 285 When the <tt/search-response/ procedure is called it defines
 286 a variable <tt/hits/ and sets it to the value of the setting
 287 <tt/resultCount/. If <tt/hits/ is positive a present-request is
 288 sent &mdash; asking for 5 records from position 1.
 289
 290 Finally, a present response is received and the number of records
 291 returned is stored in the variable <tt/ret/.
 292
 293 <tscreen><verb>
 294 ir z
 295 ir-set z.1 z
 296 z databaseNames books
 297 z callback {connect-response}
 298 z connect fake.com
 299
 300 proc connect-response {} {
 301     z callback {init-response}
 302     z init
 303 }
 304
 305 proc init-response {} {
 306     z.1 callback {search-response}
 307     z.1 search science
 308 }
 309
 310 proc search-response {} {
 311     set hits [z.1 resultCount]
 312     puts "$hits hits"
 313     if {$hits > 0} {
 314         z.1 callback {present-response}
 315         z.1 present 1 5
 316     }
 317 }
 318
 319 proc present-response {} {
 320     set ret [z.1 numberOfRecordsReturned]
 321     puts "$ret records returned"
 322 }
 323 </verb></tscreen>
 324 <bf/End of example/
 325
 326 The previous example program doesn't care about error conditions.
 327 If errors occur in the program they will be trapped by the Tcl error
 328 handler. This is not always appropriate. However, Tcl offers a
 329 <tt/catch/ command to support error handling by the program itself.
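
As a minimal sketch of this approach, the hypothetical helper
<tt/try-action/ below runs a command under <tt/catch/ and reports a
failure instead of aborting the script. The helper name and the message
format are assumptions made for illustration; only <tt/catch/ itself is
standard Tcl.

<tscreen><verb>
# Hypothetical helper: run a command given as a list and report errors.
# Returns 1 on success and 0 on failure.
proc try-action {cmd} {
    if {[catch $cmd msg]} {
        puts "action failed: $msg"
        return 0
    }
    return 1
}

# An undefined command stands in for a failing action here.
try-action [list no-such-command fake.com]
</verb></tscreen>

With <sf/IrTcl/ loaded, an action such as
<tt>try-action &lsqb;list z connect fake.com&rsqb;</tt> could be wrapped
the same way, assuming the action reports its failure as a Tcl error.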
 330
 331 <sect>Associations
 332
 333 <p>
 334 The ir object describes an association with a target.
 335 This section covers the connect-init-disconnect actions provided
 336 by the ir object.
 337 An ir object is created by the <tt/ir/ command and the
 338 created object enters a 'not connected' state, because it isn't
 339 connected to a target yet.
 340
 341 <sect1>Connect
 342
 343 <p>
 344 A connection is established by the <tt/connect/ action which is
 345 immediately followed by a hostname. A number of settings affect the
 346 <tt/connect/ action. Obviously, these settings should be set
 347 <bf/before/ connecting. The settings are:
 348
 349 <descrip>
 350 <tag><tt>comstack </tt><tt>mosi|tcpip</tt></tag>
 351  Comstack type.
 352 <tag><tt>protocol </tt><tt>Z39|SR</tt></tag>
 353  Protocol type - ANSI/NISO Z39.50 or ISO SR.
 354 <tag><tt>callback </tt><em>list</em></tag>
 355  Tcl script called when the connection is established
 356 <tag><tt>failback </tt><em>list</em></tag>
 357  Fatal error Tcl script. Called on protocol errors or if target
 358  closes connection
 359 </descrip>
 360
 361 If the connect is unsuccessful either the connect action itself
 362 will return an error code or the failback handler is invoked.
 363
 364 <sect1>Init
 365
 366 <p>
 367 If the connect operation succeeds the <tt/init/ action should be used.
 368 The init related settings are:
 369
 370 <descrip>
 371 <tag><tt>preferredMessageSize </tt><em>integer</em></tag>
 372  Preferred-message-size. Default value is 30000.
 373 <tag><tt>maximumRecordSize </tt><em>integer</em></tag>
 374  Maximum-record-size. Default value is 30000.
 375 <tag><tt>idAuthentication </tt><em>string</em> ...</tag>
 376  Id-authentication. There are three forms. If an empty string is
 377  given, the Id-authentication is not used.  If one non-empty string
 378  is given, the 'open' authentication is used. If three strings are
 379  specified, the 'id-pass' authentication (version 3 only)
 380  is used in which case the first string is groupId; the second string
 381  is userId and the third string is password.
 382 <tag><tt>implementationName </tt><em>string</em></tag>
 383  Implementation-name of origin system.
 384 <tag><tt>implementationId</tt></tag>
 385  Implementation-id of origin system. This setting is read-only.
 386 <tag><tt>implementationVersion</tt></tag>
 387  Implementation-version of origin system. This setting is read-only.
 388 <tag><tt>options </tt><em>list</em></tag>
 389  Options to be negotiated in the init service. The list contains
 390  the options that are set. Possible values are <tt>search</tt>,
 391  <tt>present</tt>, <tt>delSet</tt>, <tt>resourceReport</tt>,
 392  <tt>triggerResourceCtrl</tt>, <tt>resourceCtrl</tt>,
 393  <tt>accessCtrl</tt>, <tt>scan</tt>, <tt>sort</tt>,
 394  <tt>extendedServices</tt>, <tt>level-1Segmentation</tt>,
 395  <tt>level-2Segmentation</tt>, <tt>concurrentOperations</tt> and
 396  <tt>namedResultSets</tt>. Currently the default options are:
 397  <tt>search</tt>, <tt>present</tt>, <tt>scan</tt> and
 398  <tt>namedResultSets</tt>. The <tt>options</tt> setting is set to its default
 399  value when an ir object is created and when a <tt>disconnect</tt>
 400  action is performed.
 401 <tag><tt>protocolVersion </tt><em>integer</em></tag>
 402  Protocol version: 2, 3, etc. Default is 2.
 403 <tag><tt>initResponse </tt><em>list</em></tag>
 404  Init-response Tcl script. Note: not implemented - use <tt>callback</tt>
 405  instead.
 406 <tag><tt>callback </tt><em>list</em></tag>
 407  General response Tcl script. Only used if <tt>initResponse</tt>
 408  is not specified.
 409 </descrip>
 410
 411 The init-response handler should inspect some of the settings shown
 412 below:
 413
 414 <descrip>
 415 <tag><tt>initResult </tt><em>boolean</em></tag>
 416  Init response status. True if init operation was successful;
 417  false otherwise.
 418 <tag><tt>preferredMessageSize </tt><em>integer</em></tag>
 419  Preferred-message-size.
 420 <tag><tt>maximumRecordSize </tt><em>integer</em></tag>
 421  Maximum-record-size.
 422 <tag><tt>targetImplementationName </tt><em>string</em></tag>
 423  Implementation-name of target system.
 424 <tag><tt>targetImplementationId </tt><em>string</em></tag>
 425  Implementation-id of target system.
 426 <tag><tt>targetImplementationVersion </tt><em>string</em></tag>
 427  Implementation-version of target system.
 428 <tag><tt>options </tt><em>list</em></tag>
 429  Options negotiated in init. The list contains the options that are set.
 430 <tag><tt>protocolVersion </tt><em>integer</em></tag>
 431  Protocol version: 2, 3, etc.
 432 <tag><tt>userInformationField </tt><em>string</em></tag>
 433  User information field.
 434 </descrip>
 435
 436 <bf/Example/
 437
 438 Consider a client with the ability to access multiple targets.
 439
 440 We define a list of targets that we wish to connect to.
 441 Each item in the list describes the target parameters with
 442 the following four components: association-name, comstack-type,
 443 protocol-type and a hostname.
 444
 445 The list for the two targets: ISO/SR target DANBIB and TCP/Z39.50
 446 target Data Research, will be defined as:
 447 <tscreen><verb>
 448 set targetList { {danbib mosi SR 0103/find2.denet.dk:4500}
 449                  {drs tcpip Z39 dranet.dra.com} }
 450 </verb></tscreen>
 451
 452 The Tcl code below defines, connects and initializes the
 453 targets in <tt/targetList/:
 454
 455 <tscreen><verb>
 456 foreach target $targetList {
 457     set assoc [lindex $target 0]
 458     ir $assoc
 459     $assoc comstack [lindex $target 1]
 460     $assoc protocol [lindex $target 2]
 461     $assoc failback [list fail-response $assoc]
 462     $assoc callback [list connect-response $assoc]
 463     $assoc connect [lindex $target 3]
 464 }
 465
 466 proc connect-response {assoc} {
 467     $assoc callback [list init-response $assoc]
 468     $assoc init
 469 }
 470
 471 proc fail-response {assoc} {
 472     puts "$assoc closed connection or protocol error"
 473 }
 474
 475 proc init-response {assoc} {
 476     if {[$assoc initResult]} {
 477         puts "$assoc initialized ok"
 478     } else {
 479         puts "$assoc didn't initialize"
 480     }
 481 }
 482 </verb></tscreen>
 483
 484 <tt/target/ is bound to each item in the list of targets.
 485 The variable <tt/assoc/ is set to the ir object name.
 486 Then, the comstack, protocol and failback are set for the <tt/assoc/ object.
 487 The ir object name is passed as an argument to the <tt/fail-response/ and
 488 <tt/connect-response/ routines.
 489 Note the use of the Tcl <tt/list/ command which
 490 is necessary here because the argument contains variables
 491 (<tt/assoc/) that should be substituted before the handler is defined.
 492 After the connect operation, the <tt/init-response/ handler
 493 is defined in much the same way as the failback handler.
 494 And, finally, an init request is executed.
 495
 496 <bf/End of example/
 497
 498 <sect1>Disconnect
 499
 500 <p>
 501 To terminate the connection the <tt/disconnect/ action should be used.
 502 This action has no parameters.
 503 Another connection may be established by a new <tt/connect/ action on
 504 the same ir object.
 505
 506 <sect>Result sets
 507
 508 <p>
 509 This section covers the queries used by <sf/IrTcl/, and how searches and
 510 presents are handled.
 511
 512 A search operation and a result set are described by the ir set object.
 513 The ir set object is defined by the <tt/ir-set/ command which
 514 has two parameters. The first is the name of the new ir set object, and
 515 the second, which is optional, is the name of an association, i.e. an ir
 516 object. The second argument is required if the ir set object should be able
 517 to perform searches and presents. However, it is not required if
 518 only ``local'' operations are done with the ir set object.
 519
 520 When the ir set object is created a number of settings are inherited
 521 from the ir object, such as the selected databases, query type,
 522 etc. Thus, the ir object contains what we could call default
 523 settings.
 524
 525 <sect1>Queries
 526
 527 <p>
 528 Search requests are sent by the <tt/search/ action which
 529 takes a query as parameter. There are two types of queries,
 530 RPN and CCL, controlled by the setting <tt/queryType/.
 531 A string representation for the query is used in <sf/IrTcl/ since
 532 Tcl has reasonably powerful string manipulation capabilities.
 533 The RPN query used in <sf/IrTcl/ is the prefix query notation also used in
 534 the <bf/YAZ/ test client.
 535
 536 The CCL query is an uninterpreted octet-string which is parsed by the target.
 537 We refer to the standard: ISO 8777. Note that only a few targets
 538 actually support the CCL query and the interpretation of
 539 the standard may vary.
 540
 541 The prefix query notation (which is converted to RPN) offers a few
 542 operators. They are:
 543
 544 <descrip>
 545 <tag><tt>@attr </tt><em>list op</em></tag>
 546  The attributes in list are applied to op
 547 <tag><tt>@and </tt><em>op1 op2</em></tag>
 548  Boolean <em/and/ on op1 and op2
 549 <tag><tt>@or </tt><em>op1 op2</em></tag>
 550  Boolean <em/or/ on op1 and op2
 551 <tag><tt>@not </tt><em>op1 op2</em></tag>
 552  Boolean <em/and-not/: op1 and not op2
 553 <tag><tt>@prox </tt><em>list op1 op2</em></tag>
 554  Proximity operation on op1 and op2. Not implemented yet.
 555 <tag><tt>@set </tt><em>name</em></tag>
 556  Result set reference
 557 </descrip>
 558
 559 It is simple to build RPN queries in <sf/IrTcl/. Search terms
 560 are sequences of characters, as in:
 561 <tscreen><verb>
 562    science
 563 </verb></tscreen>
 564
 565 Boolean operators use the prefix notation (instead of the suffix/RPN),
 566 as in:
 567 <tscreen><verb>
 568    @and science technology
 569 </verb></tscreen>
 570
 571 Search terms may be associated with attributes. These
 572 attributes are indicated by the <tt/@attr/ operator.
 573 Assuming the bib-1 attribute set, we can set the use-attribute
 574 (type is 1) to title (value is 4):
 575
 576 <tscreen><verb>
 577    @attr 1=4 science
 578 </verb></tscreen>
 579
 580 Also, it is possible to apply attributes to a range of search terms.
 581 In the query below, both search terms have use=title but the <tt/tech/
 582 term is right truncated:
 583
 584 <tscreen><verb>
 585    @attr 1=4 @and @attr 5=1 tech beta
 586 </verb></tscreen>
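
Since queries are plain strings, they can be composed with ordinary Tcl
procedures. The helpers below are a small sketch of that idea; the names
<tt/attr-term/ and <tt/and-query/ are hypothetical and not part of
<sf/IrTcl/.

<tscreen><verb>
# Hypothetical helper: apply one attribute (type=value) to an operand.
proc attr-term {type value op} {
    return "@attr $type=$value $op"
}

# Hypothetical helper: combine two operands with boolean and.
proc and-query {op1 op2} {
    return "@and $op1 $op2"
}

# Title search (use attribute 1=4) on two terms.
set query [and-query [attr-term 1 4 science] [attr-term 1 4 technology]]
puts $query
</verb></tscreen>

This prints <tt/@and @attr 1=4 science @attr 1=4 technology/, a string
that could then be passed to the <tt/search/ action of an ir set object.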
 587
 588 <sect1>Search
 589
 590 <p>
 591 The settings that affect the search are listed below:
 592
 593 <descrip>
 594 <tag><tt>databaseNames </tt><em>list</em></tag>
 595  database-names.
 596 <tag><tt>smallSetUpperBound </tt><em>integer</em></tag>
 597  small set upper bound. Default 0.
 598 <tag><tt>largeSetLowerBound </tt><em>integer</em></tag>
 599  large set lower bound. Default 2.
 600 <tag><tt>mediumSetPresentNumber </tt><em>integer</em></tag>
 601  medium set present number. Default 0.
 602 <tag><tt>replaceIndicator </tt><em>boolean</em></tag>
 603  replace-indicator.
 604 <tag><tt>setName </tt><em>string</em></tag>
 605  name of result set.
 606 <tag><tt>queryType rpn|ccl</tt></tag>
 607  query type-1 or query type-2
 608 <tag><tt>preferredRecordSyntax </tt><em>string</em></tag>
 609  preferred record syntax &mdash; UNIMARC, USMARC, etc.
 610 <tag><tt>smallSetElementSetNames </tt><em>string</em></tag>
 611  small-set-element-set names. Not implemented yet.
 612 <tag><tt>mediumSetElementSetNames </tt><em>string</em></tag>
 613  medium-set-element-set names. Not implemented yet.
 614 <tag><tt>searchResponse </tt><em>list</em></tag>
 615  Search-response Tcl script. Not implemented yet. Use <tt>callback</tt>
 616  instead.
 617 <tag><tt>callback </tt><em>list</em></tag>
 618  General response Tcl script. Only used if searchResponse is not specified
 619 </descrip>
 620
 621 Setting the <tt/databaseNames/ is mandatory. All other settings
 622 have reasonable defaults.
 623 The search-response handler, specified by the <tt/callback/ - or
 624 the <tt/searchResponse/ setting,
 625 should read some of the settings shown below:
 626
 627 <descrip>
 628 <tag><tt>searchStatus </tt><em>boolean</em></tag>
 629  search-status. True if search operation was successful; false
 630  otherwise.
 631 <tag><tt>responseStatus </tt><em>list</em></tag>
 632  response status information.
 633 <tag><tt>resultCount </tt><em>integer</em></tag>
 634  result-count
 635 <tag><tt>numberOfRecordsReturned </tt><em>integer</em></tag>
 636  number of records returned.
 637 </descrip>
 638
 639 The <tt/responseStatus/ signals one of three conditions which
 640 is indicated by the value of the first item in the list:
 641
 642 <descrip>
 643 <tag><tt>NSD</tt></tag> indicates that the target has returned one or
 644 more non-surrogate diagnostic messages. The <tt/NSD/ item is followed by
 645 a list with all non-surrogate messages. Each non-surrogate message consists
 646 of three items. The first item of the three items is the error
 647 code (integer); the next item is a textual representation of the error
 648 code in plain English; the third item is additional information, possibly
 649 empty if no additional information was returned by the target.
 650
 651 <tag><tt>DBOSD</tt></tag> indicates a successful operation where the
 652 target has returned one or more records. Each record may be
 653 either a database record or a surrogate diagnostic.
 654
 655 <tag><tt>OK</tt></tag> indicates a successful operation &mdash; no records are
 656 returned from the target.
 657 </descrip>
 658
 659 <bf/Example/
 660
 661 We continue with the multiple-targets example.
 662 The <tt/init-response/ procedure will attempt to make searches:
 663
 664 <tscreen><verb>
 665 proc init-response {assoc} {
 666     puts "$assoc connected"
 667     ir-set ${assoc}.1 $assoc
 668     $assoc.1 queryType rpn
 669     $assoc.1 databaseNames base-a base-b
 670     $assoc.1 callback [list search-response $assoc ${assoc}.1]
 671     $assoc.1 search "@attr 1=4 @and @attr 5=1 tech beta"
 672 }
 673 </verb></tscreen>
 674
 675 An ir set object is defined and
 676 is told the name of the ir object it belongs to.
 677 The name of the ir set object uses the name of the ir object as a prefix.
 678
 679 Then, the query-type is defined to be RPN, i.e. we will
 680 use the prefix query notation later on.
 681
 682 Two databases, <tt/base-a/ and <tt/base-b/, are selected.
 683
 684 A <tt/search-response/ handler is defined with the
 685 ir object and the ir-set object as parameters and
 686 the search is executed.
 687
 688 The first part of the <tt/search-response/ looks like:
 689 <tscreen><verb>
 690 proc search-response {assoc rset} {
 691     set status [$rset responseStatus]
 692     set type [lindex $status 0]
 693     if {$type == "NSD"} {
 694         set code [lindex $status 1]
 695         set msg [lindex $status 2]
 696         set addinfo [lindex $status 3]
 697         puts "NSD $code: $msg: $addinfo"
 698         return
 699     }
 700     set hits [$rset resultCount]
 701     if {$type == "DBOSD"} {
 702         set ret [$rset numberOfRecordsReturned]
 703         ...
 704     }
 705 }
 706 </verb></tscreen>
 707 The response status is stored in variable <tt/status/ and
 708 the first element indicates the condition.
 709 If non-surrogate diagnostics are returned they are displayed.
 710 Otherwise, the search was a success and the number of hits
 711 is read. Finally, it is tested whether the search response
 712 returned records (database or diagnostic).
 713
 714 Note that we actually didn't inspect the search status (setting
 715 <tt/searchStatus/) to determine whether the search was successful or not,
 716 because the standard specifies that one or more non-surrogate
 717 diagnostics should be returned by the target in case of errors.
 718
 719 <bf/End of example/
 720
 721 If one or more records are returned from the target they
 722 will be stored in the result set object.
 723 In the case in which the search response contains records, it is
 724 very similar to the present response case. Therefore, some settings
 725 are common to both situations.
 726
 727 <sect1>Present
 728
 729 <p>
 730 The <tt/present/ action sends a present request. The <tt/present/ is
 731 followed by two optional integers. The first integer is the
 732 result-set starting position &mdash; defaults to 1. The second integer
 733 is the number of records requested &mdash; defaults to 10.
 734 The settings which could be modified before a <tt/present/
 735 action are:
 736
 737 <descrip>
 738 <tag><tt>preferredRecordSyntax </tt><em>string</em></tag>
 739  preferred record syntax &mdash; UNIMARC, USMARC, etc.
 740 <tag><tt>elementSetElementSetNames </tt><em>string</em></tag>
 741  element-set names
 742 <tag><tt>presentResponse </tt><em>list</em></tag>
 743  Present-response Tcl script. Not implemented yet. Use <tt>callback</tt>
 744  instead.
 745 <tag><tt>callback </tt><em>list</em></tag>
 746  General response Tcl script. Only used if presentResponse is not specified
 747 </descrip>
 748
 749 The present-response handler should inspect the settings
 750 shown in the table below.
 751 Note that <tt/responseStatus/ and <tt/numberOfRecordsReturned/
 752 settings were also used in the search-response case.
 753
 754 As in the search response case, records returned from the
 755 target are stored in the result set object.
 756
 757 <descrip>
 758 <tag><tt>presentStatus </tt><em>boolean</em></tag>
 759  present-status
 760 <tag><tt>responseStatus </tt><em>list</em></tag>
 761  Response status information
 762 <tag><tt>numberOfRecordsReturned </tt><em>integer</em></tag>
 763  number of records returned
 764 <tag><tt>nextResultSetPosition </tt><em>integer</em></tag>
 765  next result set position
 766 </descrip>
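
When a result set holds more records than one present request returns,
the settings above can drive further requests. The following sketch
assumes that <tt/nextResultSetPosition/ is 0, or lies beyond the hit
count, once no more records remain; the helper <tt/next-request/ and its
argument names are hypothetical.

<tscreen><verb>
# Hypothetical helper: compute the arguments of the next present action.
#   next  - value of nextResultSetPosition
#   hits  - hit count read earlier from resultCount
#   chunk - maximum number of records to ask for
# Returns a list {position number}, or an empty list when done.
proc next-request {next hits chunk} {
    if {$next < 1 || $next > $hits} {
        return {}
    }
    set left [expr {$hits - $next + 1}]
    if {$left < $chunk} {
        set chunk $left
    }
    return [list $next $chunk]
}

puts [next-request 6 12 5]
puts [next-request 11 12 5]
</verb></tscreen>

The two calls print <tt/6 5/ and <tt/11 2/. In a present-response handler
the two values of a non-empty result could be passed on to the
<tt/present/ action.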
 767
 768 <sect1>Records
 769
 770 <p>
 771 Search responses and present responses may result in
 772 one or more records stored in the ir set object if
 773 the <tt/responseStatus/ setting indicates database or
 774 surrogate diagnostics (<tt/DBOSD/). The individual
 775 records, indexed by an integer position, should be
 776 inspected.
 777
 778 The action <tt/type/ followed by an integer returns information
 779 about a given position in an ir set. There are three possibilities:
 780
 781 <descrip>
 782 <tag><tt/SD/</tag> The item is a surrogate diagnostic record.
 783 <tag><tt/DB/</tag> The item is a database record.
 784 <tag><em/empty/</tag> There is no record at the specified position.
 785 </descrip>
 786
 787 To handle the first case, surrogate diagnostic record, the
 788 <tt/Diag/ action should be used. It returns three
 789 items: error code (integer), text representation in plain English
 790 (string), and additional information (string, possibly empty).
 791
 792 In the second case, database record, the <tt/recordType/ action should
 793 be used. It returns the record type at the given position.
 794 Some record types are:
 795
 796 <tscreen>
 797 UNIMARC
 798 INTERMARC
 799 CCF
 800 USMARC
 801 UKMARC
 802 NORMARC
 803 LIBRISMARC
 804 DANMARC
 805 FINMARC
 806 SUTRS
 807 </tscreen>

<bf/Example/

We continue our search-response example. In the <tt/DBOSD/ case,
we should inspect the result set items.
Recall that the ir set name was passed to the
search-response handler as the argument <tt/rset/.

<tscreen><verb>
    # $type was set earlier in the search-response handler.
    if {$type == "DBOSD"} {
        set ret [$rset numberOfRecordsReturned]
        for {set i 1} {$i<=$ret} {incr i} {
            set itype [$rset type $i]
            if {$itype == "SD"} {
                # Surrogate diagnostic: code, message, additional info.
                set diag [$rset Diag $i]
                set code [lindex $diag 0]
                set msg [lindex $diag 1]
                set addinfo [lindex $diag 2]
                puts "$i: SD $code: $msg: $addinfo"
            } elseif {$itype == "DB"} {
                # Database record: show its record type.
                set rtype [$rset recordType $i]
                puts "$i: type is $rtype"
            }
        }
    }
</verb></tscreen>
Each item in the result set is examined.
If an item is a diagnostic message, it is displayed; otherwise,
if it is a database record, its type is displayed.

<bf/End of example/

<sect1>MARC records

<p>
When there is a MARC record at a given position, we
want to display it somehow. The action <tt/getMarc/ is what we need.
The <tt/getMarc/ action is followed by a position integer and the type of
extraction we want to make: <tt/field/ or <tt/line/.

The <tt/field/ and <tt/line/ types are followed by three
parameters that serve as extraction masks.
They are called tag, indicator and field.
If the masks match a tag/indicator/field of a record, the information
is extracted. Two characters have special meaning in masks: the
dot (any character) and the star (any number of any characters).
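
<bf/Example/

The wildcard meaning of the dot and the star can be illustrated in
plain Tcl. The sketch below is not the matching code used by
<sf/IrTcl/; it assumes that the dot stands for exactly one character
and translates a mask into a <tt/string match/ pattern. The procedure
name <tt/mask-match/ is made up for this example.

<tscreen><verb>
# Hypothetical helper: return 1 if value matches mask, 0 otherwise.
# Assumption: dot means exactly one character, star any number of characters.
proc mask-match {mask value} {
    # string match uses ? for one character and * for any sequence.
    regsub -all {\.} $mask ? pattern
    return [string match $pattern $value]
}
</verb></tscreen>

Under these assumptions the mask <tt/24./ matches the tag <tt/245/, and
the mask <tt/*/ matches any indicator.

<bf/End of example/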

The <tt/field/ type returns one or more lists of field information
that match the mask specification. Only the content of fields
is returned.

The <tt/line/ type, on the other hand, returns a Tcl list that
completely describes the layout of the MARC record, including
tags, fields, etc.

The <tt/field/ type is sufficient and efficient when only a
small number of fields are extracted and no
further processing (in Tcl) is necessary.

However, if the MARC record is to be edited or altered in any way, the
<tt/line/ extraction is more powerful; it is limited only by the Tcl
language itself.

<bf/Example/

Consider the record below:
<tscreen><verb>
001       11224466
003    DLC
005    00000000000000.0
008    910710c19910701nju           00010 eng
010    $a    11224466
040    $a DLC $c DLC
050 00 $a 123-xyz
100 10 $a Jack Collins
245 10 $a How to program a computer
260 1  $a Penguin
263    $a 8710
300    $a p. cm.
</verb></tscreen>

Assuming this record is at position 1 in the ir set <tt/z.1/, we
might extract the title field (245 * a) with the following command:
<tscreen><verb>
z.1 getMarc 1 field 245 * a
</verb></tscreen>

which gives:
<tscreen><verb>
{How to program a computer}
</verb></tscreen>

Using <tt/line/ instead of <tt/field/ gives:
<tscreen><verb>
{245 {10} {{a {How to program a computer}} }}
</verb></tscreen>

If we wish to extract the whole record as a list, we use:
<tscreen><verb>
z.1 getMarc 1 line * * *
</verb></tscreen>

giving:
<tscreen><verb>
{001 {} {{{} {   11224466 }} }}
{003 {} {{{} DLC} }}
{005 {} {{{} 00000000000000.0} }}
{008 {} {{{} {910710c19910701nju           00010 eng  }} }}
{010 {  } {{a {   11224466 }} }}
{040 {  } {{a DLC} {c DLC} }}
{050 {00} {{a 123-xyz} }}
{100 {10} {{a {Jack Collins}} }}
{245 {10} {{a {How to program a computer}} }}
{260 {1 } {{a Penguin} }}
{263 {  } {{a 8710} }}
{300 {  } {{a {p. cm.}} }}
</verb></tscreen>

<bf/End of example/
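
<bf/Example/

As a further illustration of the list notation, the sketch below turns
a record returned by the <tt/line/ extraction back into one text line
per field. The procedure name <tt/marc-display/ is made up for this
example and is not part of IrTcl; only standard Tcl list commands are
used.

<tscreen><verb>
# Hypothetical helper: format a record in list notation, one line per field.
proc marc-display {r} {
    set lines {}
    foreach line $r {
        # Each line has the form {tag indicator {{subfield content} ...}}.
        set text "[lindex $line 0] [lindex $line 1]"
        foreach f [lindex $line 2] {
            set code [lindex $f 0]
            # Fields without subfields carry an empty subfield code.
            if {[string length $code]} {
                append text " \$$code"
            }
            append text " [lindex $f 1]"
        }
        lappend lines $text
    }
    return [join $lines "\n"]
}
</verb></tscreen>

For a record that holds only the 245 field shown above, the call
<tscreen><verb>
puts [marc-display {{245 {10} {{a {How to program a computer}} }}}]
</verb></tscreen>
prints:
<tscreen><verb>
245 10 $a How to program a computer
</verb></tscreen>

<bf/End of example/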

<bf/Example/

This example demonstrates how Tcl can be used to examine
a MARC record in the list notation.

The procedure <tt/extract-format/ extracts
fields from a MARC record based on a number of masks.
It takes five parameters: <tt/r/, a
record in list notation; <tt/tag/, a regular expression to
match the record tags; <tt/ind/, a regular expression to
match indicators; <tt/field/, a regular expression to
match fields; and finally <tt/text/, a regular expression to
match the content of a field.

<tscreen><verb>
proc extract-format {r tag ind field text} {
    foreach line $r {
        if {[regexp $tag [lindex $line 0]] && \
                [regexp $ind [lindex $line 1]]} {
            foreach f [lindex $line 2] {
                if {[regexp $field [lindex $f 0]]} {
                    if {[regexp $text [lindex $f 1]]} {
                        puts [lindex $f 1]
                    }
                }
            }
        }
    }
}
</verb></tscreen>

To match <tt/comput/ followed by any number of characters in the
245 fields in the record from the previous example, we could use:
<tscreen><verb>
set r [z.1 getMarc 1 line * * *]

extract-format $r 245 .. . comput
</verb></tscreen>
which gives:
<tscreen><verb>
How to program a computer
</verb></tscreen>

<bf/End of example/

The <tt/putMarc/ action does the opposite of <tt/getMarc/. It
copies a record in Tcl list notation to an ir set object and is
needed if a result set must be updated with a record that has been
modified in Tcl (for example, edited by the user).
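
<bf/Example/

A record must be changed in list notation before it can be copied
back to an ir set. The sketch below replaces the content of one
subfield using standard Tcl list commands only. The procedure name
<tt/marc-set-subfield/ is made up for this example and is not part of
IrTcl; the arguments of <tt/putMarc/ are not covered here.

<tscreen><verb>
# Hypothetical helper: return a copy of record r (list notation) in which
# the content of subfield code is replaced in every field with the given tag.
proc marc-set-subfield {r tag code value} {
    set result {}
    foreach line $r {
        # Compare as strings so that tags such as 010 are not read as numbers.
        if {[string compare [lindex $line 0] $tag] == 0} {
            set fields {}
            foreach f [lindex $line 2] {
                if {[string compare [lindex $f 0] $code] == 0} {
                    set f [list $code $value]
                }
                lappend fields $f
            }
            # Rebuild the line, keeping the original indicator.
            set line [list $tag [lindex $line 1] $fields]
        }
        lappend result $line
    }
    return $result
}
</verb></tscreen>

Assuming the ir set <tt/z.1/ from the earlier example, the title could
be shortened as follows:
<tscreen><verb>
set r [z.1 getMarc 1 line * * *]
set r [marc-set-subfield $r 245 a {How to program}]
</verb></tscreen>

<bf/End of example/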

<sect1>SUTRS

<p>
In <sf/IrTcl/ a SUTRS record is treated as one single string. To retrieve
a SUTRS string at a given index, the <tt>getSutrs</tt> action should be used.
The <tt>getSutrs</tt> action is immediately followed by an index.
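
<bf/Example/

The sketch below collects every SUTRS record among the records
returned in a response. It follows the loop of the search-response
example; the procedure name <tt/sutrs-records/ is made up for
illustration, and <tt/rset/ is assumed to be the name of an ir set
object.

<tscreen><verb>
# Hypothetical helper: return a list of the SUTRS strings held by ir set rset.
proc sutrs-records {rset} {
    set result {}
    set ret [$rset numberOfRecordsReturned]
    for {set i 1} {$i <= $ret} {incr i} {
        # Only database records (type DB) have a record type.
        if {[string compare [$rset type $i] DB] == 0} {
            if {[string compare [$rset recordType $i] SUTRS] == 0} {
                lappend result [$rset getSutrs $i]
            }
        }
    }
    return $result
}
</verb></tscreen>

A response handler could then print each element of the returned list
with <tt/puts/.

<bf/End of example/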

<sect>Scan

<p>
To perform a scan, a scan object must be created with the <tt>ir-scan</tt>
command. This command takes two arguments: the name of the scan object and
the name of the ir object. The scan object provides one action, <tt>scan</tt>,
which sends a scan request to the target. The <tt>scan</tt> action
is followed by a string describing the starting point of the term list. The
format used is a simple subset of the query format used in search requests. Only
<tt>@attr</tt> specifications and simple terms are allowed.
The settings that affect the scan are:

<descrip>
<tag><tt>stepSize </tt><em>integer</em></tag>
 Step size. Default is 0.
<tag><tt>numberOfTermsRequested </tt><em>integer</em></tag>
 Number of terms requested. Default is 20.
<tag><tt>preferredPositionInResponse </tt><em>integer</em></tag>
 Preferred position in response. Default is 1.
<tag><tt>databaseNames </tt><em>list</em></tag>
 Database names. Note that this setting is not (yet) supported for
 the scan object. You must set this for the ir object instead.
<tag><tt>callback </tt><em>list</em></tag>
 General response Tcl script. This setting is not (yet) supported for
 the scan object. You must set this for the ir object instead.
</descrip>

The scan object normally holds one or more scan line entries upon
successful completion. The table below summarizes the settings
that should be used in a response handler.

<descrip>
<tag><tt>scanStatus</tt></tag>
 Scan status. An integer between 0 and 6.
<tag><tt>numberOfTermsReturned </tt><em>integer</em></tag>
 Number of terms returned.
<tag><tt>positionOfTerm</tt></tag>
 An integer describing the position of the term.
<tag><tt>scanLine </tt> <em>integer</em></tag>
 Returns information about the scan line (entry) at the
 index specified by the integer. The first scan line is numbered zero,
 the second one, and so on. A list is returned by the <tt>scanLine</tt>
 setting. The first element is <tt>T</tt> if the scan line
 is a normal term and <tt>SD</tt> if the scan line is a surrogate
 diagnostic. In the first case (normal term), the scan term is the second element
 in the list and the number of occurrences is the third element.
 In the other case (surrogate diagnostic), the second element
 is the diagnostic code, the third is a text representation of the error
 code, and the fourth is additional information.
</descrip>

<bf/Example/

We will scan for the terms after <tt>science</tt> in the Title index.
We assume that an ir object called <tt>z-assoc</tt> has already
been created.

<tscreen><verb>
   z-assoc callback {scan-response}
   ir-scan z-scan z-assoc
   z-scan scan "@attr 1=4 science"

   proc scan-response {} {
       set status [z-scan scanStatus]
       if {$status == 0} {
           set no [z-scan numberOfTermsReturned]
           for {set i 0} {$i < $no} {incr i} {
               set line [z-scan scanLine $i]
               set type [lindex $line 0]
               if {$type == "T"} {
                   puts [lindex $line 1]
               } elseif {$type == "SD"} {
                   puts [lindex $line 1]
               }
           }
       }
   }
</verb></tscreen>
<bf/End of example/
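
<bf/Example/

The handler above prints only the second element of each scan line.
As a sketch of how the remaining elements described for
<tt/scanLine/ might be used, the procedure below formats one scan line
as text. The name <tt/format-scan-line/ is made up for this example
and is not part of IrTcl.

<tscreen><verb>
# Hypothetical helper: format one list returned by the scanLine setting.
proc format-scan-line {line} {
    set type [lindex $line 0]
    if {[string compare $type T] == 0} {
        # Normal term: the term and its number of occurrences.
        return "[lindex $line 1] ([lindex $line 2])"
    }
    if {[string compare $type SD] == 0} {
        # Surrogate diagnostic: code, message, additional information.
        return "SD [lindex $line 1]: [lindex $line 2]: [lindex $line 3]"
    }
    # Unknown entry type: nothing to show.
    return ""
}
</verb></tscreen>

Within the loop of the handler, the value returned for each scan line
could be passed to <tt/puts/ in place of the two separate branches.

<bf/End of example/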

<sect>License

<p>
Copyright &copy; 1995, Index Data.

Permission to use, copy, modify, distribute, and sell this software and
its documentation, in whole or in part, for any purpose, is hereby granted,
provided that:

1. This copyright and permission notice appear in all copies of the
software and its documentation. Notices of copyright or attribution
which appear at the beginning of any file must remain unchanged.

2. The names of Index Data or the individual authors may not be used to
endorse or promote products derived from this software without specific
prior written permission.

THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT WARRANTY OF ANY KIND,
EXPRESS, IMPLIED, OR OTHERWISE, INCLUDING WITHOUT LIMITATION, ANY
WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
IN NO EVENT SHALL INDEX DATA BE LIABLE FOR ANY SPECIAL, INCIDENTAL,
INDIRECT OR CONSEQUENTIAL DAMAGES OF ANY KIND, OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER OR
NOT ADVISED OF THE POSSIBILITY OF DAMAGE, AND ON ANY THEORY OF
LIABILITY, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE
OF THIS SOFTWARE.

<sect>About Index Data

<p>
Index Data is a consulting and software-development enterprise that
specialises in library and information management systems. Our
interests and expertise span a broad range of related fields, and one
of our primary, long-term objectives is the development of a powerful
information management
system with open network interfaces and hypermedia capabilities.

We make this software available free of charge, under a fairly unrestrictive
license, as a service to the networking community and to further the
development of quality software for open network communication.

We'll be happy to answer questions about the software, and about ourselves
in general.

<tscreen>
Index Data&nl
Ryesgade 3&nl
2200 K&oslash;benhavn N&nl
DENMARK
</tscreen>

<p>
<tscreen><verb>
Phone: +45 3536 3672
Fax  : +45 3536 0449
Email: info@index.ping.dk
</verb></tscreen>

<sect>References

<p>

<descrip>
<tag>1 Ousterhout, John K.:</tag>
Tcl and the Tk Toolkit. Addison-Wesley Company Inc (ISBN
0-201-63337-X). Source and documentation
can be found in <tt>URL:ftp://ftp.cs.berkeley.edu/pub/tcl</tt>
and mirrors.
<tag>2 Furniss, Peter:</tag>
RFC 1698: Octet Sequences for Upper-Layer OSI to Support
Basic Communications Applications.
</descrip>

</article>