The ODR Module

Introduction

&odr; is the BER-encoding/decoding subsystem of &yaz;. Care has been taken to isolate &odr; from the rest of the package - specifically from the transport interface. &odr; may be used in any context where basic ASN.1/BER representations are used.

If you are only interested in writing a Z39.50 implementation based on the PDUs that are already provided with &yaz;, you only need to concern yourself with the section on managing ODR streams ("Using ODR"). Only if you need to implement ASN.1 beyond that which has been provided should you worry about the second half of the documentation ("Programming with ODR"). If you use one of the higher-level interfaces, you can skip this section entirely.

This is important, so we'll repeat it for emphasis: You do not need to read "Programming with ODR" to implement Z39.50 with &yaz;.

If you need a part of the protocol that isn't already in &yaz;, you should contact the authors before going to work on it yourself: We might already be working on it. Conversely, if you implement a useful part of the protocol before us, we'd be happy to include it in a future release.

Using ODR

ODR Streams

Conceptually, the ODR stream is the source of encoded data in the decoding mode; when encoding, it is the receptacle for the encoded data. Before you can use an ODR stream it must be allocated. This is done with the function

    ODR odr_createmem(int direction);

The odr_createmem() function takes as argument one of three manifest constants: ODR_ENCODE, ODR_DECODE, or ODR_PRINT. An &odr; stream can be in only one mode - it is not possible to change its mode once it's selected. Typically, your program will allocate at least two ODR streams - one for decoding, and one for encoding.

When you're done with the stream, you can use

    void odr_destroy(ODR o);

to release the resources allocated for the stream.

Memory Management

Two forms of memory management take place in the &odr; system.
The first has to do with allocating little bits of memory (sometimes quite large bits of memory, actually) when a protocol package is decoded and turned into a complex of interlinked structures. This section deals with this system, and how you can use it for your own purposes. The next section deals with the memory management which is required when encoding data - to make sure that a large enough buffer is available to hold the fully encoded PDU.

The &odr; module has its own memory management system, which is used whenever memory is required. Specifically, it is used to allocate space for data when decoding incoming PDUs. You can use the memory system for your own purposes, by using the function

    void *odr_malloc(ODR o, size_t size);

You can't use the normal free(2) routine to free memory allocated by this function, and &odr; doesn't provide a parallel function. Instead, you can call

    void odr_reset(ODR o);

when you are done with the memory: everything allocated since the last call to odr_reset() is released. The odr_reset() call is also required to clear up an error condition on a stream.

The function

    size_t odr_total(ODR o);

returns the number of bytes allocated on the stream since the last call to odr_reset().

The memory subsystem of &odr; is fairly efficient at allocating and releasing little bits of memory. Rather than managing the individual, small bits of space, the system maintains a free-list of larger chunks of memory, which are handed out in small bits. This scheme is generally known as a nibble memory system. It is very useful for maintaining short-lived constructions such as protocol PDUs.

If you want to retain a bit of memory beyond the next call to odr_reset(), you can use the function

    ODR_MEM odr_extract_mem(ODR o);

This function will give you control of the memory recently allocated on the ODR stream.
The memory will live (past calls to odr_reset()), until you call the function

    void odr_release_mem(ODR_MEM p);

The opaque ODR_MEM handle has no other purpose than referencing the memory block for you until you want to release it. You can use odr_extract_mem() repeatedly between allocating data, to retain individual control of separate chunks of data.

Encoding and Decoding Data

When encoding data, the ODR stream will write the encoded octet string in an internal buffer. To retrieve the data, use the function

    char *odr_getbuf(ODR o, int *len, int *size);

The integer pointed to by len is set to the length of the encoded data, and a pointer to that data is returned. *size is set to the size of the buffer (unless size is null, signaling that you are not interested in the size). The next call to a primitive function using the same &odr; stream will overwrite the data, unless a different buffer has been supplied using the call

    void odr_setbuf(ODR o, char *buf, int len, int can_grow);

which sets the encoding (or decoding) buffer used by o to buf, using the length len. Before a call to an encoding function, you can use odr_setbuf() to provide the stream with an encoding buffer of sufficient size (length). The can_grow parameter tells the encoding &odr; stream whether it is allowed to use realloc(2) to increase the size of the buffer when necessary. The default condition of a new encoding stream is equivalent to the results of calling

    odr_setbuf(stream, 0, 0, 1);

In this case, the stream will allocate and reallocate memory as necessary. The stream reallocates memory by repeatedly doubling the size of the buffer - the result is that the buffer will typically reach its maximum, working size with only a small number of reallocation operations. The memory is freed by the stream when the latter is destroyed, unless it was assigned by the user with the can_grow parameter set to zero (in this case, you are expected to retain control of the memory yourself).
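
The doubling strategy and the role of the can_grow flag can be illustrated without &odr; itself. The following is a minimal, self-contained sketch and not the &yaz; implementation: GrowBuf, growbuf_put(), growbuf_demo() and the initial size of 16 bytes are hypothetical names and values chosen for illustration only.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the buffer state kept by an encoding stream. */
typedef struct {
    char  *buf;       /* start of the buffer: NULL or obtained from malloc() */
    size_t len;       /* bytes of encoded data currently in buf */
    size_t size;      /* allocated capacity of buf */
    int    can_grow;  /* nonzero: the sketch may realloc() buf */
    int    reallocs;  /* number of reallocations performed so far */
} GrowBuf;

/* Append n bytes; returns 1 on success, 0 if out of space or memory. */
int growbuf_put(GrowBuf *b, const char *data, size_t n)
{
    if (b->len + n > b->size) {
        size_t newsize = b->size ? b->size : 16;  /* assumed initial size */
        char *nb;

        if (!b->can_grow)
            return 0;              /* fixed buffer is full (compare OSPACE) */
        while (newsize < b->len + n)
            newsize *= 2;          /* double until the data fits */
        nb = realloc(b->buf, newsize);
        if (!nb)
            return 0;              /* allocation failed (compare OMEMORY) */
        b->buf = nb;
        b->size = newsize;
        b->reallocs++;
    }
    memcpy(b->buf + b->len, data, n);
    b->len += n;
    return 1;
}

/* Append count single bytes and report how many reallocations it took. */
int growbuf_demo(size_t count)
{
    GrowBuf b = { NULL, 0, 0, 1, 0 };
    size_t i;
    int n;

    for (i = 0; i < count; i++)
        if (!growbuf_put(&b, "x", 1))
            break;
    n = b.reallocs;
    free(b.buf);
    return n;
}
```

In this sketch, appending 1000 single bytes needs only a handful of reallocations, while a caller-supplied buffer with can_grow set to zero is never reallocated and simply rejects data that does not fit.
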
To assume full control of an encoded buffer, you must first call odr_getbuf() to fetch the buffer and its length. Next, you should call odr_setbuf() to provide a different buffer (or a null pointer) to the stream. In the simplest case, you will reuse the same buffer over and over again, and you will just need to call odr_getbuf() after each encoding operation to get the length and address of the buffer. Note that the stream may reallocate the buffer during an encoding operation, so it is necessary to retrieve the correct address after each encoding operation.

It is important to realize that the ODR stream will not release this memory when you call odr_reset(): It will merely update its internal pointers to prepare for the encoding of a new data value. When the stream is released by the odr_destroy() function, the memory given to it by odr_setbuf() will be released only if the can_grow parameter to odr_setbuf() was nonzero. The can_grow parameter, in other words, is a way of signaling who is to own the buffer, you or the ODR stream. If you never call odr_setbuf() on your encoding stream, which is typically the case, the buffer allocated by the stream will belong to the stream by default.

When you wish to decode data, you should first call odr_setbuf(), to tell the decoding stream where to find the encoded data, and how long the buffer is (the can_grow parameter is ignored by a decoding stream). After this, you can call the function corresponding to the data you wish to decode (e.g., odr_integer() or z_APDU()).

Encoding and decoding functions

    int odr_integer(ODR o, Odr_int **p, int optional, const char *name);
    int z_APDU(ODR o, Z_APDU **p, int optional, const char *name);

If the data is absent (or doesn't match the tag corresponding to the type), the return value will be either 0 or 1 depending on the optional flag.
If optional is 0 and the data is absent, an error flag will be raised in the stream, and you'll need to call odr_reset() before you can use the stream again. If optional is nonzero, the pointer pointed to by p will be set to the null value, and the function will return 1. The name argument is used to pretty-print the tag in question. It may be set to NULL if pretty-printing is not desired.

If the data value is found where it's expected, the pointer pointed to by the p argument will be set to point to the decoded type. The space for the type will be allocated and owned by the &odr; stream, and it will live until you call odr_reset() on the stream. You cannot use free(2) to release the memory. You can decode several data elements (by repeated calls to odr_setbuf() and your decoding function), and new memory will be allocated each time. When you do call odr_reset(), everything decoded since the last call to odr_reset() will be released.

Encoding and decoding of an integer

The use of the double indirection can be a little confusing at first (its purpose will become clear later on, hopefully), so an example is in order. We'll encode an integer value, and immediately decode it again using a different stream. A useless, but informative operation.

This looks like a lot of work, offhand. In practice, the &odr; streams will typically be allocated once, in the beginning of your program (or at the beginning of a new network session), and the encoding and decoding will only take place in a few, isolated places in your program, so the overhead is quite manageable.

Printing

When an ODR stream is created of type ODR_PRINT, the ODR module will print the contents of a PDU in a readable format. By default, output is written to the stderr stream. This behavior can be changed, however, by calling the function

    odr_setprint(ODR o, FILE *file);

before encoders or decoders are invoked.
It is also possible to direct the output to a buffer (or indeed another file), by using the more generic mechanism:

    void odr_set_stream(ODR o, void *handle,
                        void (*stream_write)(ODR o, void *handle, int type,
                                             const char *buf, int len),
                        void (*stream_close)(void *handle));

Here the user provides an opaque handle and two handlers: stream_write for writing, and stream_close, which is supposed to close/free the resources associated with handle. The stream_close handler is optional; if NULL is provided for it, it will not be invoked. The stream_write handler takes the ODR handle, the user-defined handle, a type (ODR_OCTETSTRING or ODR_VISIBLESTRING) which indicates the type of contents being written, and the data buffer with its length.

Another utility useful for diagnostics (error handling) or as part of the printing facilities is:

    const char **odr_get_element_path(ODR o);

which returns a list of current elements that ODR deals with at the moment. For the returned array, say ar, ar[0] is the top level element, ar[n] is the last. The last element has the property that ar[n+1] == NULL.

Element Path for record

For a database record part of a PresentResponse, the array returned by odr_get_element_path() is presentResponse, databaseOrSurDiagnostics, ?, record, ?, databaseRecord. The question mark appears due to unnamed constructions.

Diagnostics

The encoding/decoding functions all return 0 when an error occurs. Until you call odr_reset(), you cannot use the stream again, and any function called will immediately return 0.

To provide information to the programmer or administrator, the function

    void odr_perror(ODR o, const char *message);

is provided, which prints the message argument to stderr along with an error message from the stream.

You can also use the function

    int odr_geterror(ODR o);

to get the current error number from the stream. The number will be one of these constants:

ODR Error codes

    code         Description
    OMEMORY      Memory allocation failed.
    OSYSERR      A system- or library call has failed. The standard diagnostic variable errno should be examined to determine the actual error.
    OSPACE       No more space for encoding. This will only occur when the user has explicitly provided a buffer for an encoding stream without allowing the system to allocate more space.
    OREQUIRED    This is a common protocol error; a required data element was missing during encoding or decoding.
    OUNEXPECTED  An unexpected data element was found during decoding.
    OOTHER       Other error. This is typically an indication of misuse of the &odr; system by the programmer, and also that the diagnostic system isn't as good as it should be, yet.
The character string array char *odr_errlist[] can be indexed by the error code to obtain a human-readable representation of the problem.
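
The "sticky" error behavior described in this section (every call fails until the stream is reset) and the idea of a message table indexed by error code can be sketched in a few lines. This is a self-contained illustration under stated assumptions, not the &odr; implementation: Stream, stream_byte(), stream_reset(), sketch_errlist and the ERR_* codes are hypothetical names invented for the sketch.

```c
#include <stddef.h>

/* Hypothetical error codes for this sketch (not the ODR constants). */
enum { ERR_NONE = 0, ERR_REQUIRED = 1 };

/* Message table indexed by error code, in the spirit of odr_errlist[]. */
const char *const sketch_errlist[] = {
    "No error",
    "Required data element missing"
};

/* Hypothetical decoding stream: a byte buffer, a read position, an error. */
typedef struct {
    const unsigned char *buf;
    size_t len;
    size_t pos;
    int error;
} Stream;

/* Clear the error condition and rewind (compare odr_reset()). */
void stream_reset(Stream *s)
{
    s->pos = 0;
    s->error = ERR_NONE;
}

/* Decode one byte into *out. Returns 1 on success, 0 on failure.
 * If the data is absent and optional is nonzero, *out is set to -1
 * and the call still succeeds. */
int stream_byte(Stream *s, int *out, int optional)
{
    if (s->error)
        return 0;                 /* sticky: fail until stream_reset() */
    if (s->pos >= s->len) {
        if (optional) {
            *out = -1;            /* absent but optional: not an error */
            return 1;
        }
        s->error = ERR_REQUIRED;  /* required element missing */
        return 0;
    }
    *out = s->buf[s->pos++];
    return 1;
}
```

After a failed required read, even a call that would otherwise succeed returns 0 until stream_reset() is called, which mirrors the rule stated above for odr_reset().
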
Summary and Synopsis

    #include <yaz/odr.h>

    ODR odr_createmem(int direction);
    void odr_destroy(ODR o);
    void odr_reset(ODR o);
    char *odr_getbuf(ODR o, int *len, int *size);
    void odr_setbuf(ODR o, char *buf, int len, int can_grow);
    void *odr_malloc(ODR o, size_t size);
    NMEM odr_extract_mem(ODR o);
    int odr_geterror(ODR o);
    void odr_perror(ODR o, const char *message);
    extern char *odr_errlist[];
Programming with ODR

The API of &odr; is designed to reflect the structure of ASN.1, rather than BER itself. Future releases may be able to represent data in other external forms.

There is an ASN.1 tutorial available at this site. This site also has standards for ASN.1 (X.680) and BER (X.690) online.

The ODR interface is based loosely on that of the Sun Microsystems XDR routines. Specifically, each function which corresponds to an ASN.1 primitive type has a dual function. Depending on the settings of the ODR stream which is supplied as a parameter, the function may be used either to encode or decode data. The functions that can be built using these primitive functions, to represent more complex data types, share this quality. The result is that you only have to enter the definition for a type once - and you have the functionality of encoding, decoding (and pretty-printing) all in one unit. The resulting C source code is quite compact, and is a pretty straightforward representation of the source ASN.1 specification.

In many cases, the model of the XDR functions works quite well in this role. In others, it is less elegant. Most of the hassle comes from the optional SEQUENCE members which don't exist in XDR.

The Primitive ASN.1 Types

ASN.1 defines a number of primitive types (many of which correspond roughly to primitive types in structured programming languages, such as C).

INTEGER

The &odr; function for encoding or decoding (or printing) the ASN.1 INTEGER type looks like this:

    int odr_integer(ODR o, Odr_int **p, int optional, const char *name);

The Odr_int is just a simple integer.

This form is typical of the primitive &odr; functions. They are named after the type of data that they encode or decode. They take an &odr; stream, an indirect reference to the type in question, and an optional flag (corresponding to the OPTIONAL keyword of ASN.1) as parameters. They all return an integer value of either one or zero.
When you use the primitive functions to construct encoders for complex types of your own, you should follow this model as well. This ensures that your new types can be reused as elements in yet more complex types.

The o parameter should obviously refer to a properly initialized &odr; stream of the right type (encoding/decoding/printing) for the operation that you wish to perform.

When encoding or printing, the function first looks at *p. If *p (the pointer pointed to by p) is a null pointer, this is taken to mean that the data element is absent. If the optional parameter is nonzero, the function will return one (signifying success) without any further processing. If optional is zero, an internal error flag is set in the &odr; stream, and the function will return 0. No further operations can be carried out on the stream without a call to the function odr_reset().

If *p is not a null pointer, it is expected to point to an instance of the data type. The data will be subjected to the encoding rules, and the result will be placed in the buffer held by the &odr; stream.

The other ASN.1 primitives have similar functions that operate in similar manners:

BOOLEAN

    int odr_bool(ODR o, Odr_bool **p, int optional, const char *name);

REAL

Not defined.

NULL

    int odr_null(ODR o, Odr_null **p, int optional, const char *name);

In this case, the value of **p is not important. If *p is different from the null pointer, the null value is present, otherwise it's absent.

OCTET STRING

    typedef struct odr_oct
    {
        unsigned char *buf;
        int len;
    } Odr_oct;

    int odr_octetstring(ODR o, Odr_oct **p, int optional, const char *name);

The buf field should point to the character array that holds the octet string. The len field holds the actual length. The character array need not be null terminated.

To make things a little easier, an alternative is given for string types that are not expected to contain embedded NUL characters (eg.
VisibleString):

    int odr_cstring(ODR o, char **p, int optional, const char *name);

which encodes or decodes between OCTET STRING representations and null-terminated C strings.

Functions are provided for the derived string types, eg:

    int odr_visiblestring(ODR o, char **p, int optional, const char *name);

BIT STRING

    int odr_bitstring(ODR o, Odr_bitmask **p, int optional, const char *name);

The opaque type Odr_bitmask is only suitable for holding relatively brief bit strings, eg. for options fields, etc. The constant ODR_BITMASK_SIZE multiplied by 8 gives the maximum possible number of bits.

A set of macros is provided for manipulating the Odr_bitmask type:

    void ODR_MASK_ZERO(Odr_bitmask *b);
    void ODR_MASK_SET(Odr_bitmask *b, int bitno);
    void ODR_MASK_CLEAR(Odr_bitmask *b, int bitno);
    int ODR_MASK_GET(Odr_bitmask *b, int bitno);

The macros are modeled after the manipulation functions that accompany the fd_set type used by the select(2) call. ODR_MASK_ZERO should always be called first on a new bitmask, to initialize the bits to zero.

OBJECT IDENTIFIER

    int odr_oid(ODR o, Odr_oid **p, int optional, const char *name);

The C OID representation is simply an array of integers, terminated by the value -1 (the Odr_oid type is synonymous with the short type). We suggest that you use the OID database module (see ) to handle object identifiers in your application.
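
To make the two representations above more concrete, here is a small self-contained sketch of a fixed-size bitmask with zero/set/clear/get operations and of an OID array terminated by -1. It does not use or reproduce the &odr; macros: SketchMask, SKETCH_MASK_SIZE, the mask_* functions and oid_length() are hypothetical, and the bit numbering (bit 0 is the most significant bit of the first byte) is an assumption of the sketch.

```c
#include <string.h>

#define SKETCH_MASK_SIZE 16  /* bytes; hypothetical, compare ODR_BITMASK_SIZE */

typedef struct {
    unsigned char bits[SKETCH_MASK_SIZE];
} SketchMask;

/* Callers must keep bitno in the range 0 .. SKETCH_MASK_SIZE * 8 - 1. */

/* Initialize all bits to zero; call this first on a new mask. */
void mask_zero(SketchMask *m)
{
    memset(m->bits, 0, sizeof m->bits);
}

/* Set bit number bitno (bit 0 = most significant bit of byte 0). */
void mask_set(SketchMask *m, int bitno)
{
    m->bits[bitno >> 3] |= (unsigned char)(0x80 >> (bitno & 7));
}

/* Clear bit number bitno. */
void mask_clear(SketchMask *m, int bitno)
{
    m->bits[bitno >> 3] &= (unsigned char)~(0x80 >> (bitno & 7));
}

/* Return 1 if bit number bitno is set, 0 otherwise. */
int mask_get(const SketchMask *m, int bitno)
{
    return (m->bits[bitno >> 3] >> (7 - (bitno & 7))) & 1;
}

/* Count the arcs of an OID stored as an array terminated by -1. */
int oid_length(const short *oid)
{
    int n = 0;

    while (oid[n] != -1)
        n++;
    return n;
}
```

As with fd_set, the mask must be zeroed before any bit is set or tested; an uninitialized mask has indeterminate contents.
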
Tagging Primitive Types

The simplest way of tagging a type is to use the odr_implicit_tag() or odr_explicit_tag() macros:

    int odr_implicit_tag(ODR o, Odr_fun fun, void *p, int class, int tag,
                         int optional, const char *name);
    int odr_explicit_tag(ODR o, Odr_fun fun, void *p, int class, int tag,
                         int optional, const char *name);

To create a type derived from the integer type by implicit tagging, you might write:

    MyInt ::= [210] IMPLICIT INTEGER

In the &odr; system, this would be written like:

    int myInt(ODR o, Odr_int **p, int optional, const char *name)
    {
        return odr_implicit_tag(o, odr_integer, p,
                                ODR_CONTEXT, 210, optional, name);
    }

The function myInt() can then be used like any of the primitive functions provided by &odr;. Note that the odr_explicit_tag() and odr_implicit_tag() macros behave exactly like the functions they are applied to - they respond to error conditions, etc, in the same manner - they simply have three extra parameters. The class parameter may take one of the values: ODR_CONTEXT, ODR_PRIVATE, ODR_UNIVERSAL, or ODR_APPLICATION.

Constructed Types

Constructed types are created by combining primitive types. The &odr; system only implements the SEQUENCE and SEQUENCE OF constructions (although adding the rest of the container types should be simple enough, if the need arises).

For implementing SEQUENCEs, the functions

    int odr_sequence_begin(ODR o, void *p, int size, const char *name);
    int odr_sequence_end(ODR o);

are provided.

The odr_sequence_begin() function should be called in the beginning of a function that implements a SEQUENCE type. Its parameters are the &odr; stream, a pointer (to a pointer to the type you're implementing), and the size of the type (typically a C structure). On encoding, it returns 1 if *p is not a null pointer. The size parameter is ignored. On decoding, it returns 1 if the type is found in the data stream. size bytes of memory are allocated, and *p is set to point to this space.
odr_sequence_end() is called at the end of the complex function. Assume that a type is defined like this:

    MySequence ::= SEQUENCE {
        intval INTEGER,
        boolval BOOLEAN OPTIONAL
    }

The corresponding &odr; encoder/decoder function and the associated data structures could be written like this:

    typedef struct MySequence
    {
        Odr_int *intval;
        Odr_bool *boolval;
    } MySequence;

    int mySequence(ODR o, MySequence **p, int optional, const char *name)
    {
        if (odr_sequence_begin(o, p, sizeof(**p), name) == 0)
            return optional && odr_ok(o);
        return
            odr_integer(o, &(*p)->intval, 0, "intval") &&
            odr_bool(o, &(*p)->boolval, 1, "boolval") &&
            odr_sequence_end(o);
    }

Note the 1 in the call to odr_bool(), to mark that the sequence member is optional. If either of the member types had been tagged, the macros odr_implicit_tag() or odr_explicit_tag() could have been used. The new function can be used exactly like the standard functions provided with &odr;. It will encode, decode or pretty-print a data value of the MySequence type. We like to name types with an initial capital, as done in ASN.1 definitions, and to name the corresponding function with the first character of the name in lower case. You could, of course, name your structures, types, and functions any way you please - as long as you're consistent, and your code is easily readable. odr_ok is just that - a predicate that returns the state of the stream. It is used to ensure that the behavior of the new type is compatible with the interface of the primitive types.

Tagging Constructed Types

See "Tagging Primitive Types" for information on how to tag the primitive types, as well as types that are already defined.

Implicit Tagging

Assume the type above had been defined as

    MySequence ::= [10] IMPLICIT SEQUENCE {
        intval INTEGER,
        boolval BOOLEAN OPTIONAL
    }

You would implement this in &odr; by calling the function

    int odr_implicit_settag(ODR o, int class, int tag);

which overrides the tag of the type immediately following it.
The macro odr_implicit_tag() works by calling odr_implicit_settag() immediately before calling the function pointer argument. Your type function could look like this:

    int mySequence(ODR o, MySequence **p, int optional, const char *name)
    {
        if (odr_implicit_settag(o, ODR_CONTEXT, 10) == 0 ||
            odr_sequence_begin(o, p, sizeof(**p), name) == 0)
            return optional && odr_ok(o);
        return
            odr_integer(o, &(*p)->intval, 0, "intval") &&
            odr_bool(o, &(*p)->boolval, 1, "boolval") &&
            odr_sequence_end(o);
    }

The definition of the structure MySequence would be the same.

Explicit Tagging

Explicit tagging of constructed types is a little more complicated, since you are in effect adding a level of construction to the data.

Assume the definition:

    MySequence ::= [10] IMPLICIT SEQUENCE {
        intval INTEGER,
        boolval BOOLEAN OPTIONAL
    }

Since the new type has an extra level of construction, two new functions are needed to encapsulate the base type:

    int odr_constructed_begin(ODR o, void *p, int class, int tag,
                              const char *name);
    int odr_constructed_end(ODR o);

Assume that the IMPLICIT in the type definition above were replaced with EXPLICIT (or that the IMPLICIT keyword were simply deleted, which would be equivalent). The structure definition would look the same, but the function would look like this:

    int mySequence(ODR o, MySequence **p, int optional, const char *name)
    {
        if (odr_constructed_begin(o, p, ODR_CONTEXT, 10, name) == 0)
            return optional && odr_ok(o);
        if (o->direction == ODR_DECODE)
            *p = odr_malloc(o, sizeof(**p));
        if (odr_sequence_begin(o, p, sizeof(**p), 0) == 0)
        {
            *p = 0; /* this is almost certainly a protocol error */
            return 0;
        }
        return
            odr_integer(o, &(*p)->intval, 0, "intval") &&
            odr_bool(o, &(*p)->boolval, 1, "boolval") &&
            odr_sequence_end(o) &&
            odr_constructed_end(o);
    }

Notice that the interface here gets kind of nasty.
The reason is simple: Explicitly tagged, constructed types are fairly rare in the protocols that we care about, so the esthetic annoyance (not to mention the dangers of a cluttered interface) is less than the time that would be required to develop a better interface. Nevertheless, it is far from satisfying, and it's a point that will be worked on in the future. One option for you would be to simply apply the odr_explicit_tag() macro to the first function, and not have to worry about odr_constructed_* yourself. Incidentally, as you might have guessed, the odr_sequence_* functions are themselves implemented using the odr_constructed_* functions.

SEQUENCE OF

To handle sequences (arrays) of a specific type, the function

    int odr_sequence_of(ODR o, int (*fun)(ODR o, void *p, int optional),
                        void *p, int *num, const char *name);

is provided. The fun parameter is a pointer to the decoder/encoder function of the type. p is a pointer to an array of pointers to your type. num is the number of elements in the array.

Assume a type

    MyArray ::= SEQUENCE OF INTEGER

The C representation might be

    typedef struct MyArray
    {
        int num_elements;
        Odr_int **elements;
    } MyArray;

And the function might look like

    int myArray(ODR o, MyArray **p, int optional, const char *name)
    {
        if (o->direction == ODR_DECODE)
            *p = odr_malloc(o, sizeof(**p));
        if (odr_sequence_of(o, odr_integer, &(*p)->elements,
                            &(*p)->num_elements, name))
            return 1;
        *p = 0;
        return optional && odr_ok(o);
    }

CHOICE Types

The choice type is used fairly often in some ASN.1 definitions, so some work has gone into streamlining its interface.

CHOICE types are handled by the function:

    int odr_choice(ODR o, Odr_arm arm[], void *p, void *whichp,
                   const char *name);

The arm array is used to describe each of the possible types that the CHOICE type may assume. Internally in your application, the CHOICE type is represented as a discriminated union. That is, a C union accompanied by an integer (or enum) identifying the active 'arm' of the union.
whichp is a pointer to the union discriminator. When encoding, it is examined to determine the current type. When decoding, it is set to reference the type that was found in the input stream.

The Odr_arm type is defined thus:

    typedef struct odr_arm
    {
        int tagmode;
        int class;
        int tag;
        int which;
        Odr_fun fun;
        char *name;
    } Odr_arm;

The interpretation of the fields is:

tagmode
    Either ODR_IMPLICIT, ODR_EXPLICIT, or ODR_NONE (-1) to mark no tagging.
which
    The value of the discriminator that corresponds to this CHOICE element. Typically, it will be a #defined constant, or an enum member.
fun
    A pointer to a function that implements the type of the CHOICE member. It may be either a standard &odr; type or a type defined by yourself.
name
    Name of tag.

A handy way to prepare the array for use by the odr_choice() function is to define it as a static, initialized array in the beginning of your decoding/encoding function. Assume the type definition:

    MyChoice ::= CHOICE {
        untagged INTEGER,
        tagged [99] IMPLICIT INTEGER,
        other BOOLEAN
    }

Your C type might look like

    typedef struct MyChoice
    {
        enum
        {
            MyChoice_untagged,
            MyChoice_tagged,
            MyChoice_other
        } which;
        union
        {
            Odr_int *untagged;
            Odr_int *tagged;
            Odr_bool *other;
        } u;
    } MyChoice;

And your function could look like this:

    int myChoice(ODR o, MyChoice **p, int optional, const char *name)
    {
        static Odr_arm arm[] =
        {
            {-1, -1, -1, MyChoice_untagged, odr_integer, "untagged"},
            {ODR_IMPLICIT, ODR_CONTEXT, 99, MyChoice_tagged, odr_integer,
             "tagged"},
            {-1, -1, -1, MyChoice_other, odr_bool, "other"},
            {-1, -1, -1, -1, 0, 0}
        };

        if (o->direction == ODR_DECODE)
            *p = odr_malloc(o, sizeof(**p));
        else if (!*p)
            return optional && odr_ok(o);

        if (odr_choice(o, arm, &(*p)->u, &(*p)->which, name))
            return 1;
        *p = 0;
        return optional && odr_ok(o);
    }

In some cases (say, a non-optional choice which is a member of a sequence), you can "embed" the union and its discriminator in the structure belonging to the enclosing type, and you won't need to fiddle with memory
allocation to create a separate structure to wrap the discriminator and union.

The corresponding function is somewhat nicer in the Sun XDR interface. Most of the complexity of this interface comes from the possibility of declaring sequence elements (including CHOICEs) optional.

The ASN.1 specification naturally requires that each member of a CHOICE have a distinct tag, so they can be told apart on decoding. Sometimes it can be useful to define a CHOICE that has multiple types that share the same tag. You'll need some other mechanism, perhaps keyed to the context of the CHOICE type. In effect, we would like to introduce a level of context-sensitiveness to our ASN.1 specification. When encoding an internal representation, we have no problem, as long as each CHOICE member has a distinct discriminator value. For decoding, we need a way to tell the choice function to look for a specific arm of the table. The function

    void odr_choice_bias(ODR o, int what);

provides this functionality. When called, it leaves a notice for the next call to odr_choice() on the decoding stream o, saying that only the arm entry with a which field equal to what should be tried.

The most important application (perhaps the only one, really) is in the definition of application-specific EXTERNAL encoders/decoders which will automatically decode an ANY member given the direct or indirect reference.

Debugging

The protocol modules are suffering somewhat from a lack of diagnostic tools at the moment, specifically ways to pretty-print PDUs that aren't recognized by the system. We'll include something to this end in a not-too-distant release. In the meantime, what we do when we get packages we don't understand is to compile the ODR module with ODR_DEBUG defined. This causes the module to dump tracing information as it processes data units. With this output and the protocol specification (Z39.50), it is generally fairly easy to see what goes wrong.
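
The arm-selection rule described for odr_choice() and odr_choice_bias() in the CHOICE Types section can be pictured with a much reduced table lookup. The sketch below is only an illustration of that rule and not the &odr; implementation: SketchArm and select_arm() are hypothetical, each arm carries just a tag and a discriminator, and a bias of -1 is assumed to mean "no bias".

```c
#include <stddef.h>

/* Hypothetical, cut-down arm description: just a tag and a discriminator. */
typedef struct {
    int tag;    /* tag expected in the encoded data */
    int which;  /* discriminator value; -1 terminates the table */
} SketchArm;

/* Return the which value of the first arm whose tag matches, or -1 if
 * no arm matches. If bias is not -1, only the arm whose which value
 * equals bias is tried. */
int select_arm(const SketchArm *arms, int tag, int bias)
{
    size_t i;

    for (i = 0; arms[i].which != -1; i++) {
        if (bias != -1 && arms[i].which != bias)
            continue;            /* skipped because of the bias */
        if (arms[i].tag == tag)
            return arms[i].which;
    }
    return -1;
}
```

With two arms sharing the same tag, an unbiased lookup always finds the first one; setting the bias to the second arm's discriminator is what makes that arm reachable when decoding.
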