doc/odr.xml

   1 <!-- $Header: /home/cvsroot/yaz/doc/odr.xml,v 1.1 2001-01-04 13:36:24 adam Exp $ -->
   2 <chapter><title id="odr">The ODR Module</title>
   3
   4 <sect1><title>Introduction</title>
   5
   6 <para>
   7 &odr; is the BER-encoding/decoding subsystem of &yaz;. Care has been taken
   8 to isolate &odr; from the rest of the package - specifically from the
   9 transport interface. &odr; may be used in any context where basic
  10 ASN.1/BER representations are used.
  11 </para>
  12
  13 <para>
  14 If you are only interested in writing a Z39.50 implementation based on
  15 the PDUs that are already provided with &yaz;, you only need to concern
  16 yourself with the section on managing ODR streams (section
  17 <link linkend="odr-use">Using ODR</link>). Only if you need to
  18 implement ASN.1 beyond that which has been provided, should you
  19 worry about the second half of the documentation
  20 (section <link linkend="odr-prog">Programming with ODR</link>).
  21 If you use one of the higher-level interfaces, you can skip this
  22 section entirely.
  23 </para>
  24
  25 <para>
  26 This is important, so we'll repeat it for emphasis: <emphasis>You do not
  27 need to read section <link linkend="odr-prog">Programming with ODR</link> to
  28 implement Z39.50 with &yaz;.</emphasis>
  29 </para>
  30
  31 <para>
  32 If you need a part of the protocol that isn't already in &yaz;, you
  33 should contact the authors before going to work on it yourself: We
  34 might already be working on it. Conversely, if you implement a useful
  35 part of the protocol before us, we'd be happy to include it in a
  36 future release.
  37 </para>
  38
  39 </sect1>
  40 <sect1><title id="odr-use">Using ODR</title>
  41
  42 <sect2><title>ODR Streams</title>
  43
  44 <para>
  45 Conceptually, the ODR stream is the source of encoded data in the
  46 decoding mode; when encoding, it is the receptacle for the encoded
  47 data. Before you can use an ODR stream it must be allocated. This is
  48 done with the function
  49 </para>
  50
  51 <synopsis>
  52   ODR odr_createmem(int direction);
  53 </synopsis>
  54
  55 <para>
  56 The <function>odr_createmem()</function> function takes as argument one
  57 of three manifest constants: <literal>ODR_ENCODE</literal>,
  58 <literal>ODR_DECODE</literal>, or <literal>ODR_PRINT</literal>.
  59 An &odr; stream can be in only one mode - it is not possible to change
  60 its mode once it's selected. Typically, your program will allocate
  61 at least two ODR streams - one for decoding, and one for encoding.
  62 </para>
  63
  64 <para>
  65 When you're done with the stream, you can use
  66 </para>
  67
  68 <synopsis>
  69   void odr_destroy(ODR o);
  70 </synopsis>
  71
  72 <para>
  73 to release the resources allocated for the stream.
  74 </para>
  75 </sect2>
  76
  77 <sect2><title id="memory">Memory Management</title>
  78
  79 <para>
  80 Two forms of memory management take place in the &odr; system. The first
  81 has to do with allocating little bits of memory (sometimes quite large
  82 bits of memory, actually) when a protocol package is decoded and turned
  83 into a complex of interlinked structures. This section deals with this
  84 system, and how you can use it for your own purposes. The next section
  85 deals with the memory management which is required when encoding data -
  86 to make sure that a large enough buffer is available to hold the fully
  87 encoded PDU.
  88 </para>
  89
  90 <para>
  91 The &odr; module has its own memory management system, which is
  92 used whenever memory is required. Specifically, it is used to allocate
  93 space for data when decoding incoming PDUs. You can use the memory
  94 system for your own purposes, by using the function
  95 </para>
  96
  97 <synopsis>
  98 void *odr_malloc(ODR o, int size);
  99 </synopsis>
 100
 101 <para>
  102 You can't use the normal <function>free(3)</function> routine to free
 103 memory allocated by this function, and &odr; doesn't provide a parallel
 104 function. Instead, you can call
 105 </para>
 106
 107 <synopsis>
  108   void odr_reset(ODR o);
 109 </synopsis>
 110
 111 <para>
 112 when you are done with the
 113 memory: Everything allocated since the last call to
 114 <function>odr_reset()</function> is released.
 115 The <function>odr_reset()</function> call is also required to clear
 116 up an error condition on a stream.
 117 </para>
 118
 119 <para>
 120 The function
 121 </para>
 122
 123 <synopsis>
 124   int odr_total(ODR o);
 125 </synopsis>
 126
 127 <para>
 128 returns the number of bytes allocated on the stream since the last call to
 129 <function>odr_reset()</function>.
 130 </para>
 131
 132 <para>
 133 The memory subsystem of &odr; is fairly efficient at allocating and
 134 releasing little bits of memory. Rather than managing the individual,
 135 small bits of space, the system maintains a freelist of larger chunks
 136 of memory, which are handed out in small bits. This scheme is
 137 generally known as a <emphasis>nibble memory</emphasis> system.
  138 It is very useful for maintaining short-lived constructions such
 139 as protocol PDUs.
 140 </para>
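
<para>
The idea can be sketched in a few lines of plain C. This is an
illustration only, not the implementation used by &odr;: the names
<function>nibble_alloc()</function> and <function>nibble_reset()</function>,
the chunk size, and the alignment policy are all assumptions made for
the example.
</para>

<programlisting><![CDATA[
#include <stddef.h>
#include <stdlib.h>

#define NIBBLE_CHUNK 4096                  /* assumed default chunk size */
#define NIBBLE_ALIGN sizeof(max_align_t)   /* C11; keeps every block aligned */

typedef struct nibble_chunk {
    struct nibble_chunk *next;
    unsigned char *buf;   /* the large block that is handed out in bits */
    size_t size;          /* capacity of buf */
    size_t used;          /* bytes of buf already handed out */
} nibble_chunk;

typedef struct nibble_mem {
    nibble_chunk *active; /* chunks used since the last reset */
    nibble_chunk *spare;  /* freelist of recycled chunks */
    size_t total;         /* bytes requested since the last reset */
} nibble_mem;

/* Hand out size bytes from the current chunk, fetching a new chunk if needed. */
void *nibble_alloc(nibble_mem *m, size_t size)
{
    size_t need = (size + NIBBLE_ALIGN - 1) / NIBBLE_ALIGN * NIBBLE_ALIGN;
    nibble_chunk *c = m->active;
    void *p;

    if (!c || c->size - c->used < need) {
        /* look for a large enough chunk on the freelist first */
        nibble_chunk **pp = &m->spare;
        while (*pp && (*pp)->size < need)
            pp = &(*pp)->next;
        if (*pp) {
            c = *pp;
            *pp = c->next;
        } else {
            size_t csize = need > NIBBLE_CHUNK ? need : NIBBLE_CHUNK;
            if (!(c = malloc(sizeof(*c))))
                return NULL;
            if (!(c->buf = malloc(csize))) {
                free(c);
                return NULL;
            }
            c->size = csize;
        }
        c->used = 0;
        c->next = m->active;
        m->active = c;
    }
    p = c->buf + c->used;
    c->used += need;
    m->total += size;
    return p;
}

/* Release everything allocated since the last reset in one sweep. */
void nibble_reset(nibble_mem *m)
{
    while (m->active) {
        nibble_chunk *c = m->active;
        m->active = c->next;
        c->next = m->spare;
        m->spare = c;
    }
    m->total = 0;
}
]]></programlisting>

<para>
Note that resetting the pool in this sketch does not return memory to
the system: the chunks move to the spare list and are reused by later
allocations, which is what makes the scheme cheap for short-lived
structures.
</para>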
 141
 142 <para>
 143 If you want to retain a bit of memory beyond the next call to
 144 <function>odr_reset()</function>, you can use the function
 145 </para>
 146
 147 <synopsis>
 148   ODR_MEM odr_extract_mem(ODR o);
 149 </synopsis>
 150
 151 <para>
 152 This function will give you control of the memory recently allocated
 153 on the ODR stream. The memory will live (past calls to
 154 <function>odr_reset()</function>), until you call the function
 155 </para>
 156
 157 <synopsis>
 158   void odr_release_mem(ODR_MEM p);
 159 </synopsis>
 160
 161 <para>
 162 The opaque <literal>ODR_MEM</literal> handle has no other purpose than
 163 referencing the memory block for you until you want to release it.
 164 </para>
 165
 166 <para>
 167 You can use <function>odr_extract_mem()</function> repeatedly between
 168 allocating data, to retain individual control of separate chunks of data.
 169 </para>
 170
 171 </sect2>
 172 <sect2><title>Encoding and Decoding Data</title>
 173
 174 <para>
 175 When encoding data, the ODR stream will write the encoded octet string
 176 in an internal buffer. To retrieve the data, use the function
 177 </para>
 178
 179 <synopsis>
 180   char *odr_getbuf(ODR o, int *len, int *size);
 181 </synopsis>
 182
 183 <para>
  184 The integer pointed to by <literal>len</literal> is set to the length of the encoded
 185 data, and a pointer to that data is returned. <literal>*size</literal>
 186 is set to the size of the buffer (unless <literal>size</literal> is null,
 187 signalling that you are not interested in the size). The next call to
 188 a primitive function using the same &odr; stream will overwrite the
 189 data, unless a different buffer has been supplied using the call
 190 </para>
 191
 192 <synopsis>
 193   void odr_setbuf(ODR o, char *buf, int len, int can_grow);
 194 </synopsis>
 195
 196 <para>
 197 which sets the encoding (or decoding) buffer used by <literal>o</literal> to
 198 <literal>buf</literal>, using the length <literal>len</literal>.
 199 Before a call to an encoding function, you can use
 200 <function>odr_setbuf()</function> to provide the stream with an encoding
 201 buffer of sufficient size (length). The <literal>can_grow</literal>
 202 parameter tells the encoding &odr; stream whether it is allowed to use
  203 <function>realloc(3)</function> to increase the size of the buffer when
 204 necessary. The default condition of a new encoding stream is equivalent
 205 to the results of calling
 206 </para>
 207
 208 <synopsis>
 209 odr_setbuf(stream, 0, 0, 1);
 210 </synopsis>
 211
 212 <para>
 213 In this case, the stream will allocate and reallocate memory as
 214 necessary. The stream reallocates memory by repeatedly doubling the
 215 size of the buffer - the result is that the buffer will typically
 216 reach its maximum, working size with only a small number of reallocation
 217 operations. The memory is freed by the stream when the latter is destroyed,
 218 unless it was assigned by the user with the <literal>can_grow</literal>
 219 parameter set to zero (in this case, you are expected to retain
 220 control of the memory yourself).
 221 </para>
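
<para>
The doubling strategy can be illustrated with a small, self-contained
sketch. The type <literal>grow_buf</literal> and the function
<function>grow_buf_write()</function> are hypothetical and are not part
of &odr;; the initial size of 16 bytes is likewise an assumption.
</para>

<programlisting><![CDATA[
#include <stdlib.h>
#include <string.h>

typedef struct grow_buf {
    char *buf;
    size_t len;      /* bytes written so far */
    size_t size;     /* bytes allocated */
    int can_grow;    /* nonzero: the buffer may be reallocated */
    int reallocs;    /* counts reallocations, for illustration only */
} grow_buf;

/* Append n bytes; returns 0 on success, -1 if the buffer is out of space. */
int grow_buf_write(grow_buf *b, const void *data, size_t n)
{
    if (b->len + n > b->size) {
        size_t nsize = b->size ? b->size : 16;   /* assumed initial size */
        char *nbuf;

        if (!b->can_grow)
            return -1;               /* fixed buffer: no more space */
        while (nsize < b->len + n)
            nsize *= 2;              /* double until the data fits */
        if (!(nbuf = realloc(b->buf, nsize)))
            return -1;
        b->buf = nbuf;               /* the address may have changed */
        b->size = nsize;
        b->reallocs++;
    }
    if (n)
        memcpy(b->buf + b->len, data, n);
    b->len += n;
    return 0;
}
]]></programlisting>

<para>
Appending 1000 bytes one at a time to an empty, growable buffer
triggers only seven reallocations in this sketch (16, 32, and so on up
to 1024 bytes). A fixed buffer with <literal>can_grow</literal> set to
zero fails with -1 once it is full, which is the situation the
<literal>OSPACE</literal> error code describes.
</para>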
 222
 223 <para>
 224 To assume full control of an encoded buffer, you must first call
 225 <function>odr_getbuf()</function> to fetch the buffer and its length.
 226 Next, you should call <function>odr_setbuf()</function> to provide a
 227 different buffer (or a null pointer) to the stream. In the simplest
 228 case, you will reuse the same buffer over and over again, and you
 229 will just need to call <function>odr_getbuf()</function> after each
 230 encoding operation to get the length and address of the buffer.
 231 Note that the stream may reallocate the buffer during an encoding
 232 operation, so it is necessary to retrieve the correct address after
 233 each encoding operation.
 234 </para>
 235
 236 <para>
 237 It is important to realise that the ODR stream will not release this
 238 memory when you call <function>odr_reset()</function>: It will
 239 merely update its internal pointers to prepare for the encoding of a
 240 new data value.
 241 When the stream is released by the <function>odr_destroy()</function>
  242 function, the memory given to it by <function>odr_setbuf()</function> will
 243 be released <emphasis>only</emphasis> if the <literal>can_grow</literal>
 244 parameter to <function>odr_setbuf()</function> was nonzero. The
 245 <literal>can_grow</literal> parameter, in other words, is a way of
 246 signalling who is to own the buffer, you or the ODR stream. If you never call
 247 <function>odr_setbuf()</function> on your encoding stream, which is
 248 typically the case, the buffer allocated by the stream will belong to
 249 the stream by default.
 250 </para>
 251
 252 <para>
 253 When you wish to decode data, you should first call
 254 <function>odr_setbuf()</function>, to tell the decoding stream
 255 where to find the encoded data, and how long the buffer is
 256 (the <literal>can_grow</literal> parameter is ignored by a decoding
 257 stream). After this, you can call the function corresponding to the
  258 data you wish to decode (eg, <function>odr_integer()</function> or
 259 <function>z_APDU()</function>).
 260 </para>
 261
 262 <para>
 263 Examples of encoding/decoding functions:
 264 </para>
 265
 266 <synopsis>
 267   int odr_integer(ODR o, int **p, int optional, const char *name);
 268
 269   int z_APDU(ODR o, Z_APDU **p, int optional, const char *name);
 270 </synopsis>
 271
 272 <para>
 273 If the data is absent (or doesn't match the tag corresponding to the type),
 274 the return value will be either 0 or 1 depending on the
 275 <literal>optional</literal> flag. If <literal>optional</literal>
 276 is 0 and the data is absent, an error flag will be raised in the
 277 stream, and you'll need to call <function>odr_reset()</function> before
 278 you can use the stream again. If <literal>optional</literal> is
  279 nonzero, the pointer <emphasis>pointed to</emphasis> by <literal>p</literal>
 280 will be set to the null value, and the function will return 1.
 281 The <literal>name</literal> argument is used to pretty-print the
 282 tag in question. It may be set to <literal>NULL</literal> if
 283 pretty-printing is not desired.
 284 </para>
 285
 286 <para>
 287 If the data value is found where it's expected, the pointer
 288 <emphasis>pointed to</emphasis> by the <literal>p</literal> argument
 289 will be set to point to the decoded type.
 290 The space for the type will be allocated and owned by the &odr; stream, and
 291 it will live until you call <function>odr_reset()</function> on the
  292 stream. You cannot use <function>free(3)</function> to release the memory.
 293 You can decode several data elements (by repeated calls to
 294 <function>odr_setbuf()</function> and your decoding function), and
 295 new memory will be allocated each time. When you do call
 296 <function>odr_reset()</function>, everything decoded since the
 297 last call to <function>odr_reset()</function> will be released.
 298 </para>
 299
 300 <para>
 301 The use of the double indirection can be a little confusing at first
 302 (its purpose will become clear later on, hopefully),
 303 so an example is in order. We'll encode an integer value, and
 304 immediately decode it again using a different stream. A useless, but
 305 informative operation.
 306 </para>
 307
 308 <programlisting>
 309 void do_nothing_useful(int value)
 310 {
 311     ODR encode, decode;
 312     int *valp, *resvalp;
 313     char *bufferp;
 314     int len;
 315
 316     /* allocate streams */
 317     if (!(encode = odr_createmem(ODR_ENCODE)))
 318         return;
 319     if (!(decode = odr_createmem(ODR_DECODE)))
 320         return;
 321
 322     valp = &amp;value;
 323     if (odr_integer(encode, &amp;valp, 0, 0) == 0)
 324     {
 325         printf("encoding went bad\n");
 326         return;
 327     }
  328     bufferp = odr_getbuf(encode, &amp;len, 0); /* buffer size not needed */
  329     printf("length of encoded data is &percnt;d\n", len);
  330
  331     /* now let's decode the thing again */
  332     odr_setbuf(decode, bufferp, len, 0); /* can_grow is ignored when decoding */
 333     if (odr_integer(decode, &amp;resvalp, 0, 0) == 0)
 334     {
 335         printf("decoding went bad\n");
 336         return;
 337     }
 338     printf("the value is &percnt;d\n", *resvalp);
 339
 340     /* clean up */
 341     odr_destroy(encode);
 342     odr_destroy(decode);
 343 }
 344 </programlisting>
 345
 346 <para>
 347 This looks like a lot of work, offhand. In practice, the &odr; streams
 348 will typically be allocated once, in the beginning of your program (or at the
 349 beginning of a new network session), and the encoding and decoding
 350 will only take place in a few, isolated places in your program, so the
 351 overhead is quite manageable.
 352 </para>
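
<para>
To give a feel for what ends up in the buffer returned by
<function>odr_getbuf()</function> in the example above, the following
sketch produces the BER encoding of an INTEGER by hand: the universal
tag 2, a one-octet length, and the value in two's complement with the
fewest possible octets. The function
<function>my_ber_integer()</function> is hypothetical and written for
this illustration; it is not how &odr; is implemented, and it assumes
8-bit bytes.
</para>

<programlisting><![CDATA[
#include <stddef.h>

/*
 * Encode value as a BER INTEGER into out, which must have room for
 * 2 + sizeof(int) octets. Returns the number of octets written.
 */
size_t my_ber_integer(unsigned char *out, int value)
{
    unsigned int u = (unsigned int)value;
    size_t nbytes = sizeof(int);
    size_t i;

    /* Drop leading octets that merely repeat the sign bit. */
    while (nbytes > 1) {
        unsigned int top9 = (u >> (8 * nbytes - 9)) & 0x1FF;
        if (top9 != 0 && top9 != 0x1FF)
            break;
        nbytes--;
    }
    out[0] = 0x02;                    /* universal, primitive, tag 2 */
    out[1] = (unsigned char)nbytes;   /* short definite length */
    for (i = 0; i < nbytes; i++)      /* most significant octet first */
        out[2 + i] = (unsigned char)(u >> (8 * (nbytes - 1 - i)));
    return 2 + nbytes;
}
]]></programlisting>

<para>
Called with the value 42, the sketch writes the three octets
<literal>02 01 2A</literal>. BER permits other length forms as well, so
this shows one valid encoding rather than a statement about the exact
octets &odr; emits.
</para>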
 353
 354 </sect2>
 355
 356 <sect2><title>Diagnostics</title>
 357
 358 <para>
 359 The encoding/decoding functions all return 0 when an error occurs.
 360 Until you call <function>odr_reset()</function>, you cannot use the
 361 stream again, and any function called will immediately return 0.
 362 </para>
 363
 364 <para>
 365 To provide information to the programmer or administrator, the function
 366 </para>
 367
 368 <synopsis>
 369   void odr_perror(ODR o, char *message);
 370 </synopsis>
 371
 372 <para>
 373 is provided, which prints the <literal>message</literal> argument to
 374 <literal>stderr</literal> along with an error message from the stream.
 375 </para>
 376
 377 <para>
 378 You can also use the function
 379 </para>
 380
 381 <synopsis>
 382   int odr_geterror(ODR o);
 383 </synopsis>
 384
 385 <para>
  386 to get the current error number from the stream. The number will be
 387 one of these constants:
 388 </para>
 389
 390 <table frame="top"><title>ODR Error codes</title>
 391 <tgroup cols="2">
 392 <thead>
 393 <row>
 394 <entry>code</entry>
 395 <entry>Description</entry>
 396 </row>
 397 </thead>
 398 <tbody>
 399 <row>
 400 <entry>OMEMORY</entry><entry>Memory allocation failed.</entry>
 401 </row>
 402
 403 <row>
 404 <entry>OSYSERR</entry><entry>A system- or library call has failed.
 405 The standard diagnostic variable <literal>errno</literal> should be
 406 examined to determine the actual error.</entry>
 407 </row>
 408
 409 <row>
 410 <entry>OSPACE</entry><entry>No more space for encoding.
 411 This will only occur when the user has explicitly provided a
 412 buffer for an encoding stream without allowing the system to
 413 allocate more space.</entry>
 414 </row>
 415
 416 <row>
  417 <entry>OREQUIRED</entry><entry>This is a common protocol error: a
 418 required data element was missing during encoding or decoding.</entry>
 419 </row>
 420
 421 <row>
 422 <entry>OUNEXPECTED</entry><entry>An unexpected data element was
 423 found during decoding.</entry>
 424 </row>
 425
 426 <row><entry>OOTHER</entry><entry>Other error. This is typically an
 427 indication of misuse of the &odr; system by the programmer, and also
 428 that the diagnostic system isn't as good as it should be, yet.</entry>
 429 </row>
 430 </tbody>
 431 </tgroup>
 432 </table>
 433
 434 <para>
 435 The character string array
 436 </para>
 437
 438 <synopsis>
 439   char *odr_errlist&lsqb;&rsqb;
 440 </synopsis>
 441
 442 <para>
 443 can be indexed by the error code to obtain a human-readable
 444 representation of the problem.
 445 </para>
 446
 447 </sect2>
 448 <sect2><title>Summary and Synopsis</title>
 449
 450 <synopsis>
 451 #include &lt;odr.h>
 452
 453 ODR odr_createmem(int direction);
 454
 455 void odr_destroy(ODR o);
 456
 457 void odr_reset(ODR o);
 458
  459 char *odr_getbuf(ODR o, int *len, int *size);
  460
  461 void odr_setbuf(ODR o, char *buf, int len, int can_grow);
 462
 463 void *odr_malloc(ODR o, int size);
 464
 465 ODR_MEM odr_extract_mem(ODR o);
 466
 467 void odr_release_mem(ODR_MEM r);
 468
 469 int odr_geterror(ODR o);
 470
  471 void odr_perror(ODR o, char *message);
 472
 473 extern char *odr_errlist[];
 474 </synopsis>
 475
 476 </sect2>
 477 </sect1>
 478
 479 <sect1><title id="odr-prog">Programming with ODR</title>
 480
 481 <para>
 482 The API of &odr; is designed to reflect the structure of ASN.1, rather
 483 than BER itself. Future releases may be able to represent data in
 484 other external forms.
 485 </para>
 486
 487 <para>
 488 The interface is based loosely on that of the Sun Microsystems XDR routines.
 489 Specifically, each function which corresponds to an ASN.1 primitive
 490 type has a dual function. Depending on the settings of the ODR
 491 stream which is supplied as a parameter, the function may be used
 492 either to encode or decode data. The functions that can be built
 493 using these primitive functions, to represent more complex datatypes, share
 494 this quality. The result is that you only have to enter the definition
 495 for a type once - and you have the functionality of encoding, decoding
 496 (and pretty-printing) all in one unit. The resulting C source code is
 497 quite compact, and is a pretty straightforward representation of the
 498 source ASN.1 specification. Although no ASN.1 compiler is supplied
 499 with &odr; at this time, it shouldn't be too difficult to write one, or
 500 perhaps even to adapt an existing compiler to output &odr; routines
 501 (not surprisingly, writing encoders/decoders using &odr; turns out
 502 to be boring work).
 503 </para>
 504
 505 <para>
 506 In many cases, the model of the XDR functions works quite well in this
 507 role.
 508 In others, it is less elegant. Most of the hassle comes from the optional
  509 SEQUENCE members which don't exist in XDR.
 510 </para>
 511
 512 <sect2><title>The Primitive ASN.1 Types</title>
 513
 514 <para>
 515 ASN.1 defines a number of primitive types (many of which correspond
 516 roughly to primitive types in structured programming languages, such as C).
 517 </para>
 518
 519 <sect3><title>INTEGER</title>
 520
 521 <para>
 522 The &odr; function for encoding or decoding (or printing) the ASN.1
 523 INTEGER type looks like this:
 524 </para>
 525
 526 <synopsis>
 527   int odr_integer(ODR o, int **p, int optional, const char *name);
 528 </synopsis>
 529
 530 <para>
  531 (We don't allow values that can't be contained in a C integer.)
 532 </para>
 533
 534 <para>
 535 This form is typical of the primitive &odr; functions. They are named
 536 after the type of data that they encode or decode. They take an &odr;
  537 stream, an indirect reference to the type in question, an
  538 <literal>optional</literal> flag (corresponding to the OPTIONAL keyword
  539 of ASN.1), and a <literal>name</literal> used for pretty-printing as
  540 parameters. They all return an integer value of either one or zero.
 541 When you use the primitive functions to construct encoders for complex
 542 types of your own, you should follow this model as well. This
 543 ensures that your new types can be reused as elements in yet more
 544 complex types.
 545 </para>
 546
 547 <para>
 548 The <literal>o</literal> parameter should obviously refer to a properly
 549 initialized &odr; stream of the right type (encoding/decoding/printing)
 550 for the operation that you wish to perform.
 551 </para>
 552
 553 <para>
 554 When encoding or printing, the function first looks at
  555 <literal>*p</literal>. If <literal>*p</literal> (the pointer pointed
 556 to by <literal>p</literal>) is a null pointer, this is taken to mean that
 557 the data element is absent. If the <literal>optional</literal> parameter
 558 is nonzero, the function will return one (signifying success) without
 559 any further processing. If the <literal>optional</literal> is zero, an
 560 internal error flag is set in the &odr; stream, and the function will
 561 return 0. No further operations can be carried out on the stream without
 562 a call to the function <function>odr_reset()</function>.
 563 </para>
 564
 565 <para>
 566 If <literal>*p</literal> is not a null pointer, it is expected to
 567 point to an instance of the data type. The data will be subjected to
 568 the encoding rules, and the result will be placed in the buffer held
 569 by the &odr; stream.
 570 </para>
 571
 572 <para>
 573 The other ASN.1 primitives have similar functions that operate in
 574 similar manners:
 575 </para>
 576 </sect3>
 577 <sect3><title>BOOLEAN</title>
 578
 579 <synopsis>
 580   int odr_bool(ODR o, bool_t **p, int optional, const char *name);
 581 </synopsis>
 582
 583 </sect3>
 584 <sect3><title>REAL</title>
 585
 586 <para>
 587 Not defined.
 588 </para>
 589
 590 </sect3>
 591 <sect3><title>NULL</title>
 592
 593 <synopsis>
 594   int odr_null(ODR o, bool_t **p, int optional, const char *name);
 595 </synopsis>
 596
 597 <para>
  598 In this case, the value of <literal>**p</literal> is not important. If <literal>*p</literal>
 599 is different from the null pointer, the null value is present, otherwise
 600 it's absent.
 601 </para>
 602
 603 </sect3>
 604 <sect3><title>OCTET STRING</title>
 605
 606 <synopsis>
 607   typedef struct odr_oct
 608   {
 609       unsigned char *buf;
 610       int len;
 611       int size;
 612   } Odr_oct;
 613
 614   int odr_octetstring(ODR o, Odr_oct **p, int optional, const char *name);
 615 </synopsis>
 616
 617 <para>
 618 The <literal>buf</literal> field should point to the character array
 619 that holds the octetstring. The <literal>len</literal> field holds the
 620 actual length, while the <literal>size</literal> field gives the size
 621 of the allocated array (not of interest to you, in most cases).
 622 The character array need not be null terminated.
 623 </para>
 624
 625 <para>
 626 To make things a little easier, an alternative is given for string
  627 types that are not expected to contain embedded NUL characters (eg.
 628 VisibleString):
 629 </para>
 630
 631 <synopsis>
 632   int odr_cstring(ODR o, char **p, int optional, const char *name);
 633 </synopsis>
 634
 635 <para>
  636 This function encodes or decodes between OCTET STRING representations and
  637 null-terminated C strings.
 638 </para>
 639
 640 <para>
 641 Functions are provided for the derived string types, eg:
 642 </para>
 643
 644 <synopsis>
 645   int odr_visiblestring(ODR o, char **p, int optional, const char *name);
 646 </synopsis>
 647
 648 </sect3>
 649 <sect3><title>BIT STRING</title>
 650
 651 <synopsis>
 652   int odr_bitstring(ODR o, Odr_bitmask **p, int optional, const char *name);
 653 </synopsis>
 654
 655 <para>
 656 The opaque type <literal>Odr_bitmask</literal> is only suitable for
 657 holding relatively brief bit strings, eg. for options fields, etc.
 658 The constant <literal>ODR_BITMASK_SIZE</literal> multiplied by 8
 659 gives the maximum possible number of bits.
 660 </para>
 661
 662 <para>
  663 A set of macros is provided for manipulating the
 664 <literal>Odr_bitmask</literal> type:
 665 </para>
 666
 667 <synopsis>
 668   void ODR_MASK_ZERO(Odr_bitmask *b);
 669
 670   void ODR_MASK_SET(Odr_bitmask *b, int bitno);
 671
 672   void ODR_MASK_CLEAR(Odr_bitmask *b, int bitno);
 673
 674   int ODR_MASK_GET(Odr_bitmask *b, int bitno);
 675 </synopsis>
 676
 677 <para>
  678 The macros are modelled after the manipulation macros that
 679 accompany the <literal>fd_set</literal> type used by the
 680 <function>select(2)</function> call.
 681 <literal>ODR_MASK_ZERO</literal> should always be called first on a
 682 new bitmask, to initialize the bits to zero.
 683 </para>
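
<para>
As an illustration of how macros of this kind can work, here is a
sketch of a fixed-size bit mask in the style of
<literal>fd_set</literal>. The names <literal>my_bitmask</literal>,
<literal>MY_MASK_BYTES</literal>, and <literal>MY_MASK_*</literal> are
hypothetical stand-ins rather than the &odr; definitions. Bit 0 is
stored as the most significant bit of the first octet, which is the
order BER uses for BIT STRING values. The sketch does no range
checking on the bit number.
</para>

<programlisting><![CDATA[
#include <string.h>

#define MY_MASK_BYTES 32   /* assumed capacity: 32 * 8 = 256 bits */

typedef struct my_bitmask {
    unsigned char bits[MY_MASK_BYTES];
} my_bitmask;

/* Clear every bit; call this first on a new mask. */
#define MY_MASK_ZERO(m) \
    memset((m)->bits, 0, sizeof((m)->bits))

/* Set bit n: octet n / 8, counting bits from the most significant end. */
#define MY_MASK_SET(m, n) \
    ((m)->bits[(n) >> 3] |= (unsigned char)(0x80 >> ((n) & 7)))

/* Clear bit n. */
#define MY_MASK_CLEAR(m, n) \
    ((m)->bits[(n) >> 3] &= (unsigned char)~(0x80 >> ((n) & 7)))

/* Yield 1 if bit n is set, 0 otherwise. */
#define MY_MASK_GET(m, n) \
    (((m)->bits[(n) >> 3] >> (7 - ((n) & 7))) & 1)
]]></programlisting>

<para>
With this layout, setting bits 0 and 9 on a zeroed mask leaves the
first octet as <literal>0x80</literal> and the second as
<literal>0x40</literal>.
</para>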
 684 </sect3>
 685
 686 <sect3><title>OBJECT IDENTIFIER</title>
 687
 688 <synopsis>
 689 int odr_oid(ODR o, Odr_oid **p, int optional, const char *name);
 690 </synopsis>
 691
 692 <para>
  693 The C OID representation is simply an array of integers, terminated by
 694 the value -1 (the <literal>Odr_oid</literal> type is synonymous with
 695 the <literal>int</literal> type).
 696 We suggest that you use the OID database module (see section
 697 <link linkend="oid">Object Identifiers</link>) to handle object identifiers
 698 in your application.
 699 </para>
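
<para>
Because the representation is just a terminated array, simple helpers
are easy to write. The two functions below are a sketch with
hypothetical names (<function>my_oid_len()</function> and
<function>my_oid_equal()</function>); they use plain
<literal>int</literal>, which the text above says is synonymous with
<literal>Odr_oid</literal>, and are not part of &odr; or the OID
database module.
</para>

<programlisting><![CDATA[
#include <stddef.h>

/* Number of arcs in a -1 terminated OID; the terminator is not counted. */
size_t my_oid_len(const int *oid)
{
    size_t n = 0;

    while (oid[n] != -1)
        n++;
    return n;
}

/* Returns 1 if both OIDs consist of the same arcs, 0 otherwise. */
int my_oid_equal(const int *a, const int *b)
{
    while (*a != -1 && *a == *b) {
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both reached the terminator */
}
]]></programlisting>

<para>
For example, the Bib-1 attribute set OID 1.2.840.10003.3.1 would be
written as the array <literal>{ 1, 2, 840, 10003, 3, 1, -1 }</literal>,
for which <function>my_oid_len()</function> returns 6.
</para>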
 700
 701 </sect3>
 702 </sect2>
 703 <sect2><title id="tag-prim">Tagging Primitive Types</title>
 704
 705 <para>
 706 The simplest way of tagging a type is to use the
 707 <function>odr_implicit_tag()</function> or
 708 <function>odr_explicit_tag()</function> macros:
 709 </para>
 710
 711 <synopsis>
  712   int odr_implicit_tag(ODR o, Odr_fun fun, void *p, int class, int tag,
  713                        int optional, const char *name);
  714
  715   int odr_explicit_tag(ODR o, Odr_fun fun, void *p, int class, int tag,
  716                        int optional, const char *name);
 717 </synopsis>
 718
 719 <para>
 720 To create a type derived from the integer type by implicit tagging, you
 721 might write:
 722 </para>
 723
 724 <screen>
 725   MyInt ::= &lsqb;210&rsqb; IMPLICIT INTEGER
 726 </screen>
 727
 728 <para>
 729 In the &odr; system, this would be written like:
 730 </para>
 731
 732 <screen>
 733 int myInt(ODR o, int **p, int optional, const char *name)
 734 {
 735     return odr_implicit_tag(o, odr_integer, p,
 736                    ODR_CONTEXT, 210, optional, name);
 737 }
 738 </screen>
 739
 740 <para>
 741 The function <function>myInt()</function> can then be used like any of
  742 the primitive functions provided by &odr;. Note that the
  743 <function>odr_explicit_tag()</function>
  744 and <function>odr_implicit_tag()</function> macros
  745 behave exactly like the functions they are applied to - they
  746 respond to error conditions, etc, in the same manner - they
  747 simply have three extra parameters. The class parameter may
  748 take one of the values: <literal>ODR_CONTEXT</literal>,
  749 <literal>ODR_PRIVATE</literal>, <literal>ODR_UNIVERSAL</literal>, or
  750 <literal>ODR_APPLICATION</literal>.
 751 </para>
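
<para>
To make the effect of a tag concrete, the sketch below computes the BER
identifier octets for a given class and tag number. It is written for
this illustration and is not the &odr; implementation; the function
<function>my_ber_identifier()</function> and the
<literal>MY_*</literal> class constants are hypothetical, with the
class values following the two-bit class field defined by BER.
</para>

<programlisting><![CDATA[
#include <stddef.h>
#include <stdint.h>

enum my_class { MY_UNIVERSAL = 0, MY_APPLICATION = 1, MY_CONTEXT = 2, MY_PRIVATE = 3 };

/*
 * Write the identifier octets for (cls, tag) into out, which must have
 * room for 6 octets. Returns the number of octets written.
 */
size_t my_ber_identifier(unsigned char *out, enum my_class cls,
                         int constructed, uint32_t tag)
{
    /* bits 8-7: class, bit 6: primitive (0) or constructed (1) */
    unsigned char lead =
        (unsigned char)(((unsigned)cls << 6) | (constructed ? 0x20u : 0u));
    size_t n = 0;
    int shift = 28;

    if (tag < 31) {                   /* low tag numbers fit in bits 5-1 */
        out[n++] = (unsigned char)(lead | tag);
        return n;
    }
    out[n++] = (unsigned char)(lead | 0x1F);   /* high-tag-number form */
    while ((tag >> shift) == 0)       /* skip empty leading 7-bit groups */
        shift -= 7;
    for (; shift > 0; shift -= 7)     /* all but the last group set bit 8 */
        out[n++] = (unsigned char)(0x80 | ((tag >> shift) & 0x7F));
    out[n++] = (unsigned char)(tag & 0x7F);
    return n;
}
]]></programlisting>

<para>
For the <literal>MyInt</literal> type above, a call with
<literal>MY_CONTEXT</literal>, primitive form, and tag 210 yields the
three octets <literal>9F 81 52</literal>, where an untagged INTEGER
would start with the single octet <literal>02</literal>.
</para>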
 752
 753 </sect2>
 754 <sect2><title>Constructed Types</title>
 755
 756 <para>
 757 Constructed types are created by combining primitive types. The
 758 &odr; system only implements the SEQUENCE and SEQUENCE OF constructions
 759 (although adding the rest of the container types should be simple
 760 enough, if the need arises).
 761 </para>
 762
 763 <para>
 764 For implementing SEQUENCEs, the functions
 765 </para>
 766
 767 <synopsis>
 768   int odr_sequence_begin(ODR o, void *p, int size, const char *name);
 769   int odr_sequence_end(ODR o);
 770 </synopsis>
 771
 772 <para>
 773 are provided.
 774 </para>
 775
 776 <para>
 777 The <function>odr_sequence_begin()</function> function should be
 778 called in the beginning of a function that implements a SEQUENCE type.
 779 Its parameters are the &odr; stream, a pointer (to a pointer to the type
 780 you're implementing), and the <literal>size</literal> of the type
  781 (typically a C structure). On encoding, it returns 1 if
  782 <literal>*p</literal> is not a null pointer; the <literal>size</literal>
  783 parameter is ignored. On decoding, it returns 1 if the type is found in
 784 the data stream. <literal>size</literal> bytes of memory are allocated,
 785 and <literal>*p</literal> is set to point to this space.
 786 <function>odr_sequence_end()</function> is called at the end of the
 787 complex function. Assume that a type is defined like this:
 788 </para>
 789
 790 <screen>
 791   MySequence ::= SEQUENCE {
 792       intval INTEGER,
 793       boolval BOOLEAN OPTIONAL }
 794 </screen>
 795
 796 <para>
 797 The corresponding &odr; encoder/decoder function and the associated data
 798 structures could be written like this:
 799 </para>
 800
 801 <screen>
 802   typedef struct MySequence
 803   {
 804     int *intval;
 805     bool_t *boolval;
 806   } MySequence;
 807
 808   int mySequence(ODR o, MySequence **p, int optional, const char *name)
 809   {
 810     if (odr_sequence_begin(o, p, sizeof(**p), name) == 0)
 811         return optional &amp;&amp; odr_ok(o);
 812     return
 813         odr_integer(o, &amp;(*p)->intval, 0, "intval") &amp;&amp;
 814         odr_bool(o, &amp;(*p)->boolval, 1, "boolval") &amp;&amp;
 815         odr_sequence_end(o);
 816   }
 817 </screen>
 818
 819 <para>
 820 Note the 1 in the call to <function>odr_bool()</function>, to mark
 821 that the sequence member is optional.
 822 If either of the member types had been tagged, the macros
  823 <function>odr_implicit_tag()</function> or <function>odr_explicit_tag()</function>
 824 could have been used.
 825 The new function can be used exactly like the standard functions provided
 826 with &odr;. It will encode, decode or pretty-print a data value of the
 827 <literal>MySequence</literal> type. We like to name types with an
 828 initial capital, as done in ASN.1 definitions, and to name the
 829 corresponding function with the first character of the name in lower case.
 830 You could, of course, name your structures, types, and functions any way
 831 you please - as long as you're consistent, and your code is easily readable.
 832 <literal>odr_ok</literal> is just that - a predicate that returns the
 833 state of the stream. It is used to ensure that the behaviour of the new
 834 type is compatible with the interface of the primitive types.
 835 </para>
 836
 837 </sect2>
 838 <sect2><title>Tagging Constructed Types</title>
 839
 840 <note>
 841 <para>
  842 See section <link linkend="tag-prim">Tagging Primitive Types</link>
 843 for information on how to tag the primitive types, as well as types
 844 that are already defined.
 845 </para>
 846 </note>
 847
 848 <sect3><title>Implicit Tagging</title>
 849
 850 <para>
 851 Assume the type above had been defined as
 852 </para>
 853
 854 <screen>
 855   MySequence ::= &lsqb;10&rsqb; IMPLICIT SEQUENCE {
 856      intval INTEGER,
 857      boolval BOOLEAN OPTIONAL }
 858 </screen>
 859
 860 <para>
 861 You would implement this in &odr; by calling the function
 862 </para>
 863
 864 <synopsis>
 865   int odr_implicit_settag(ODR o, int class, int tag);
 866 </synopsis>
 867
 868 <para>
 869 which overrides the tag of the type immediately following it. The
  870 macro <function>odr_implicit_tag()</function> works by calling
 871 <function>odr_implicit_settag()</function> immediately
 872 before calling the function pointer argument.
 873 Your type function could look like this:
 874 </para>
 875
 876 <screen>
 877   int mySequence(ODR o, MySequence **p, int optional, const char *name)
 878   {
 879     if (odr_implicit_settag(o, ODR_CONTEXT, 10) == 0 ||
 880         odr_sequence_begin(o, p, sizeof(**p), name) == 0)
 881         return optional &amp;&amp; odr_ok(o);
 882     return
 883         odr_integer(o, &amp;(*p)->intval, 0, "intval") &amp;&amp;
 884         odr_bool(o, &amp;(*p)->boolval, 1, "boolval") &amp;&amp;
  885         odr_sequence_end(o);
  886   }
 887 </screen>
 888
 889 <para>
 890 The definition of the structure <literal>MySequence</literal> would be
 891 the same.
 892 </para>
</sect3>

<sect3><title>Explicit Tagging</title>

<para>
Explicit tagging of constructed types is a little more complicated,
since you are in effect adding a level of construction to the data.
</para>

<para>
Assume the definition:
</para>

<screen>
  MySequence ::= &lsqb;10&rsqb; IMPLICIT SEQUENCE {
      intval INTEGER,
      boolval BOOLEAN OPTIONAL }
</screen>

<para>
Since the new type has an extra level of construction, two new functions
are needed to encapsulate the base type:
</para>

<synopsis>
  int odr_constructed_begin(ODR o, void *p, int class, int tag,
                            const char *name);

  int odr_constructed_end(ODR o);
</synopsis>

<para>
Assume that the IMPLICIT in the type definition above were replaced
with EXPLICIT (or that the IMPLICIT keyword were simply deleted, which
would be equivalent). The structure definition would look the same,
but the function would look like this:
</para>

<screen>
  int mySequence(ODR o, MySequence **p, int optional, const char *name)
  {
    if (odr_constructed_begin(o, p, ODR_CONTEXT, 10, name) == 0)
        return optional &amp;&amp; odr_ok(o);
    if (o->direction == ODR_DECODE)
        *p = odr_malloc(o, sizeof(**p));
    if (odr_sequence_begin(o, p, sizeof(**p), 0) == 0)
    {
      *p = 0; /* this is almost certainly a protocol error */
      return 0;
    }
    return
        odr_integer(o, &amp;(*p)->intval, 0, "intval") &amp;&amp;
        odr_bool(o, &amp;(*p)->boolval, 1, "boolval") &amp;&amp;
        odr_sequence_end(o) &amp;&amp;
        odr_constructed_end(o);
  }
</screen>

<para>
Notice that the interface here gets kind of nasty. The reason is
simple: Explicitly tagged, constructed types are fairly rare in
the protocols that we care about, so the
aesthetic annoyance (not to mention the dangers of a cluttered
interface) is less than the time that would be required to develop a
better interface. Nevertheless, it is far from satisfying, and it's a
point that will be worked on in the future. One option for you would
be to simply apply the <function>odr_explicit()</function> macro to
the first function, and not
have to worry about <function>odr_constructed_*</function> yourself.
Incidentally, as you might have guessed, the
<function>odr_sequence_</function> functions are themselves
implemented using the <function>odr_constructed_</function> functions.
</para>

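<para>
The extra level of construction can be pictured with a toy encoder in
which a begin/end pair writes the outer identifier, reserves a length
octet, and fills that octet in once the contents are complete. This is
only a sketch under stated assumptions and says nothing about how &odr;
is implemented internally: every name below (<literal>Enc</literal>,
<literal>cons_begin</literal>, <literal>cons_end</literal>,
<literal>seq_begin</literal>, <literal>seq_end</literal>,
<literal>prim1</literal>, <literal>encode_explicit</literal>) is
hypothetical, lengths are limited to the short form (at most 127 octets),
and tag numbers must be below 31.
</para>

<screen><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

enum { CLASS_UNIVERSAL = 0, CLASS_CONTEXT = 2, MAX_DEPTH = 8 };

/* Toy encoder state (not an ODR stream). */
typedef struct Enc {
    unsigned char buf[128];
    size_t len;               /* octets written so far */
    size_t open[MAX_DEPTH];   /* offsets of length octets still to be filled */
    int depth;                /* number of open constructed values */
} Enc;

/* Write a constructed identifier octet and reserve one length octet. */
static int cons_begin(Enc *e, int tag_class, int tag)
{
    assert(tag >= 0 && tag < 31);
    if (e->depth == MAX_DEPTH || e->len + 2 > sizeof e->buf)
        return 0;
    e->buf[e->len++] = (unsigned char)((tag_class << 6) | 0x20 | tag);
    e->open[e->depth++] = e->len;   /* remember where the length goes */
    e->buf[e->len++] = 0;
    return 1;
}

/* Fill in the length of the innermost open constructed value. */
static int cons_end(Enc *e)
{
    size_t at, n;
    if (e->depth == 0)
        return 0;
    at = e->open[--e->depth];
    n = e->len - at - 1;
    if (n > 127)
        return 0;                   /* long-form lengths are not handled */
    e->buf[at] = (unsigned char)n;
    return 1;
}

/* A SEQUENCE is a constructed value with universal tag 16. */
static int seq_begin(Enc *e) { return cons_begin(e, CLASS_UNIVERSAL, 16); }
static int seq_end(Enc *e)   { return cons_end(e); }

/* Universal primitive with a one-octet value (INTEGER = 2, BOOLEAN = 1). */
static int prim1(Enc *e, int tag, unsigned char value)
{
    if (e->len + 3 > sizeof e->buf)
        return 0;
    e->buf[e->len++] = (unsigned char)tag;
    e->buf[e->len++] = 1;
    e->buf[e->len++] = value;
    return 1;
}

/* [10] EXPLICIT SEQUENCE { intval INTEGER, boolval BOOLEAN },
   restricted here to 0 <= intval <= 127 so the value fits one octet. */
static int encode_explicit(Enc *e, int intval, int boolval)
{
    e->len = 0;
    e->depth = 0;
    if (intval < 0 || intval > 127)
        return 0;
    return cons_begin(e, CLASS_CONTEXT, 10) &&
           seq_begin(e) &&
           prim1(e, 2, (unsigned char)intval) &&
           prim1(e, 1, boolval ? 0xFF : 0x00) &&
           seq_end(e) &&
           cons_end(e);
}
```
]]></screen>

<para>
Encoding <literal>intval</literal> 5 and <literal>boolval</literal> TRUE
with this sketch produces the ten octets
<literal>AA 08 30 06 02 01 05 01 01 FF</literal>: the explicit tag
(<literal>AA 08</literal>) wraps a complete SEQUENCE
(<literal>30 06</literal>), which in turn holds the two members.
</para>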
</sect3>
</sect2>
<sect2><title>SEQUENCE OF</title>

<para>
To handle sequences (arrays) of a specific type, use the function
</para>

<synopsis>
  int odr_sequence_of(ODR o, int (*fun)(ODR o, void *p, int optional),
                      void *p, int *num, const char *name);
</synopsis>

<para>
The <literal>fun</literal> parameter is a pointer to the decoder/encoder
function of the type. <literal>p</literal> is a pointer to an array of
pointers to your type. <literal>num</literal> points to the number of
elements in the array.
</para>

<para>
Assume a type
</para>

<screen>
  MyArray ::= SEQUENCE OF INTEGER
</screen>

<para>
The C representation might be
</para>

<screen>
  typedef struct MyArray
  {
    int num_elements;
    int **elements;
  } MyArray;
</screen>

<para>
And the function might look like
</para>

<screen>
  int myArray(ODR o, MyArray **p, int optional, const char *name)
  {
    if (o->direction == ODR_DECODE)
        *p = odr_malloc(o, sizeof(**p));
    if (odr_sequence_of(o, odr_integer, &amp;(*p)->elements,
        &amp;(*p)->num_elements, name))
        return 1;
    *p = 0;
    return optional &amp;&amp; odr_ok(o);
  }
</screen>

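<para>
Because <literal>elements</literal> is an array of pointers rather than
an array of values, each element needs its own storage before the
structure is handed to the encoder. The sketch below shows one way to
fill a <literal>MyArray</literal> from a plain C array. The helpers
<literal>my_array_from_ints</literal> and
<literal>my_array_free</literal> are hypothetical, and plain
<function>malloc()</function> is used only to keep the example
self-contained; in an &odr; application you would more likely allocate
with <function>odr_malloc()</function>.
</para>

<screen><![CDATA[
```c
#include <assert.h>
#include <stdlib.h>

typedef struct MyArray
{
    int num_elements;
    int **elements;      /* array of pointers, one per INTEGER */
} MyArray;

/* Release a MyArray built by my_array_from_ints(). */
static void my_array_free(MyArray *a)
{
    int i;
    if (!a)
        return;
    for (i = 0; i < a->num_elements; i++)
        free(a->elements[i]);
    free(a->elements);
    free(a);
}

/* Copy n values into a freshly allocated MyArray.
   Returns NULL if any allocation fails. */
static MyArray *my_array_from_ints(const int *values, int n)
{
    MyArray *a;
    int i;

    assert(n >= 0);
    a = malloc(sizeof(*a));
    if (!a)
        return NULL;
    a->num_elements = 0;
    /* Allocate at least one slot so an empty array is still valid. */
    a->elements = malloc((size_t)(n > 0 ? n : 1) * sizeof(*a->elements));
    if (!a->elements) {
        free(a);
        return NULL;
    }
    for (i = 0; i < n; i++) {
        a->elements[i] = malloc(sizeof(int));
        if (!a->elements[i]) {
            my_array_free(a);    /* frees the elements allocated so far */
            return NULL;
        }
        *a->elements[i] = values[i];
        a->num_elements++;
    }
    return a;
}
```
]]></screen>

<para>
A pointer to a structure built this way is what a type function such as
<function>myArray()</function> above expects to receive when encoding.
</para>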
</sect2>
<sect2><title>CHOICE Types</title>

<para>
The choice type is used fairly often in some ASN.1 definitions, so
some work has gone into streamlining its interface.
</para>

<para>
CHOICE types are handled by the function:
</para>

<synopsis>
  int odr_choice(ODR o, Odr_arm arm&lsqb;&rsqb;, void *p, void *whichp,
                 const char *name);
</synopsis>

<para>
The <literal>arm</literal> array is used to describe each of the possible
types that the CHOICE type may assume. Internally in your application,
the CHOICE type is represented as a discriminated union. That is, a C union
accompanied by an integer (or enum) identifying the active 'arm' of
the union. <literal>whichp</literal> is a pointer to the union
discriminator. When encoding, it is examined to determine the current
type. When decoding, it is set to reference the type that was found in
the input stream.
</para>

<para>
The Odr_arm type is defined thus:
</para>

<screen>
  typedef struct odr_arm
  {
    int tagmode;
    int class;
    int tag;
    int which;
    Odr_fun fun;
    char *name;
  } Odr_arm;
</screen>

<para>
The fields are interpreted as follows:
</para>

<variablelist>
<varlistentry><term>tagmode</term>
 <listitem><para>Either <literal>ODR_IMPLICIT</literal>,
<literal>ODR_EXPLICIT</literal>, or <literal>ODR_NONE</literal> (-1) to mark
no tagging.</para></listitem>
</varlistentry>

<varlistentry><term>which</term>
 <listitem><para>The value of the discriminator that corresponds to
this CHOICE element. Typically, it will be a &num;defined constant, or
an enum member.</para></listitem>
</varlistentry>

<varlistentry><term>fun</term>
 <listitem><para>A pointer to a function that implements the type of
the CHOICE member. It may be either a standard &odr; type or a type
defined by yourself.</para></listitem>
</varlistentry>

<varlistentry><term>name</term>
 <listitem><para>Name of tag.</para></listitem>
</varlistentry>
</variablelist>

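<para>
The two directions described above (the discriminator selects the arm
when encoding, the tag in the input selects the arm when decoding) can
be sketched as two lookups over a table. This is a simplified stand-in
and not the &odr; implementation: <literal>ToyArm</literal>,
<literal>arm_by_which</literal>, <literal>arm_by_tag</literal> and
<literal>my_choice_arms</literal> are hypothetical, the
<literal>fun</literal> pointer is left out, and untagged arms simply
carry the universal tag of their base type. How &odr; itself resolves
untagged arms is not shown here.
</para>

<screen><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for Odr_arm (hypothetical, no type function). */
typedef struct ToyArm {
    int tag_class;      /* 0 = universal, 2 = context-specific */
    int tag;
    int which;          /* discriminator value; -1 ends the table */
    const char *name;
} ToyArm;

/* Encoding direction: the discriminator selects the arm. */
static const ToyArm *arm_by_which(const ToyArm *arms, int which)
{
    for (; arms->which != -1; arms++)
        if (arms->which == which)
            return arms;
    return NULL;
}

/* Decoding direction: the tag found in the input selects the arm. */
static const ToyArm *arm_by_tag(const ToyArm *arms, int tag_class, int tag)
{
    for (; arms->which != -1; arms++)
        if (arms->tag_class == tag_class && arms->tag == tag)
            return arms;
    return NULL;
}

enum { Toy_untagged, Toy_tagged, Toy_other };

/* CHOICE { untagged INTEGER, tagged [99] IMPLICIT INTEGER, other BOOLEAN } */
static const ToyArm my_choice_arms[] = {
    {0, 2,  Toy_untagged, "untagged"},   /* INTEGER: universal 2 */
    {2, 99, Toy_tagged,   "tagged"},     /* [99]: context-specific 99 */
    {0, 1,  Toy_other,    "other"},      /* BOOLEAN: universal 1 */
    {-1, -1, -1, NULL}                   /* end of table */
};
```
]]></screen>

<para>
A decoder built on this table would store the <literal>which</literal>
value of the arm returned by <literal>arm_by_tag()</literal> in the
union discriminator, and an encoder would start from the discriminator
and call <literal>arm_by_which()</literal>.
</para>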
<para>
A handy way to prepare the array for use by the
<function>odr_choice()</function> function is to
define it as a static, initialized array in the beginning of your
decoding/encoding function. Assume the type definition:
</para>

<screen>
  MyChoice ::= CHOICE {
    untagged INTEGER,
    tagged   &lsqb;99&rsqb; IMPLICIT INTEGER,
    other    BOOLEAN
  }
</screen>

<para>
Your C type might look like
</para>

<screen>
  typedef struct MyChoice
  {
      enum
      {
          MyChoice_untagged,
          MyChoice_tagged,
          MyChoice_other
      } which;
      union
      {
          int *untagged;
          int *tagged;
          bool_t *other;
      } u;
  } MyChoice;
</screen>

<para>
And your function could look like this:
</para>

<screen>
int myChoice(ODR o, MyChoice **p, int optional, const char *name)
{
    static Odr_arm arm&lsqb;&rsqb; =
    {
        {-1, -1, -1, MyChoice_untagged, odr_integer, "untagged"},
        {ODR_IMPLICIT, ODR_CONTEXT, 99, MyChoice_tagged, odr_integer,
                                                            "tagged"},
        {-1, -1, -1, MyChoice_other, odr_bool, "other"},
        {-1, -1, -1, -1, 0}    /* end of table */
    };

    /* Decoding: allocate the wrapper structure.
       Encoding: a missing value is acceptable only if optional. */
    if (o->direction == ODR_DECODE)
        *p = odr_malloc(o, sizeof(**p));
    else if (!*p)
        return optional &amp;&amp; odr_ok(o);

    if (odr_choice(o, arm, &amp;(*p)->u, &amp;(*p)->which, name))
        return 1;
    *p = 0;
    return optional &amp;&amp; odr_ok(o);
}
</screen>

<para>
In some cases (say, a non-optional choice which is a member of a sequence),
you can "embed" the union and its discriminator in the structure
belonging to the enclosing type, and you won't need to fiddle with
memory allocation to create a separate structure to wrap the
discriminator and union.
</para>

<para>
The corresponding function is somewhat nicer in the Sun XDR interface.
Most of the complexity of this interface comes from the possibility of
declaring sequence elements (including CHOICEs) optional.
</para>

<para>
The ASN.1 specifications naturally require that each member of a
CHOICE have a distinct tag, so they can be told apart on decoding.
Sometimes it can be useful to define a CHOICE that has multiple types
that share the same tag. In that case you'll need some other mechanism
to tell them apart, perhaps keyed to the context of the CHOICE type.
In effect, we would like to
introduce a level of context-sensitiveness to our ASN.1 specification.
When encoding an internal representation, we have no problem, as long
as each CHOICE member has a distinct discriminator value. For
decoding, we need a way to tell the choice function to look for a
specific arm of the table. The function
</para>

<synopsis>
  void odr_choice_bias(ODR o, int what);
</synopsis>

<para>
provides this functionality. When called, it leaves a notice for the next
call to <function>odr_choice()</function> on the decoding
stream <literal>o</literal> that only the <literal>arm</literal> entry with
a <literal>which</literal> field equal to <literal>what</literal>
should be tried.
</para>

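<para>
The effect of such a bias can be sketched with a toy table in which two
arms share a tag. The code below is an illustration and not the &odr;
implementation: <literal>ToyArm</literal>, <literal>ToyStream</literal>,
<literal>toy_choice_bias</literal> and
<literal>toy_choice_decode</literal> are hypothetical names, and the
sketch assumes, following the description above, that the notice applies
to the next call only and is cleared afterwards.
</para>

<screen><![CDATA[
```c
#include <assert.h>

/* One arm of a toy CHOICE table; which == -1 ends the table. */
typedef struct ToyArm {
    int tag;
    int which;
} ToyArm;

/* Toy decoding stream: only the pending bias is modelled. */
typedef struct ToyStream {
    int bias;           /* -1 means "no bias" */
} ToyStream;

/* Leave a notice for the next toy_choice_decode() call on this stream. */
static void toy_choice_bias(ToyStream *s, int what)
{
    s->bias = what;
}

/* Return the discriminator of the first arm that matches the tag,
   or -1 if none does. If a bias is pending, only the arm whose
   'which' equals the bias is tried; the bias is then cleared. */
static int toy_choice_decode(ToyStream *s, const ToyArm *arms, int tag)
{
    int bias = s->bias;

    s->bias = -1;                      /* the notice covers one call only */
    for (; arms->which != -1; arms++) {
        if (bias != -1 && arms->which != bias)
            continue;                  /* skipped because of the bias */
        if (arms->tag == tag)
            return arms->which;
    }
    return -1;
}
```
]]></screen>

<para>
With two arms that both use tag 4, an unbiased call always picks the
first one; calling <literal>toy_choice_bias(&amp;s, 1)</literal>
beforehand makes the next call pick the arm whose discriminator is 1.
</para>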
<para>
The most important application (perhaps the only one, really) is in
the definition of application-specific EXTERNAL encoders/decoders
which will automatically decode an ANY member given the direct or
indirect reference.
</para>

</sect2>
</sect1>

<sect1><title>Debugging</title>

<para>
The protocol modules are suffering somewhat from a lack of diagnostic
tools at the moment, specifically ways to pretty-print PDUs that
aren't recognized by the system. We'll include something to this end
in a not-too-distant release. In the meantime, what we do when we get
packets we don't understand is to compile the ODR module with
<literal>ODR_DEBUG</literal> defined. This causes the module to dump tracing
information as it processes data units. With this output and the
protocol specification (Z39.50), it is generally fairly easy to see
what goes wrong.
</para>
</sect1>
</chapter>