Generic server
Introduction
If you aren't into documentation, a good way to learn how the
back end interface works is to look at the backend.h
file. Then, look at the small dummy-server in
ztest/ztest.c. The backend.h
file also makes a good reference, once you've chewed your way through
the prose of this file.
If you have a database system that you would like to make available by
means of Z39.50 or SRU, &yaz; basically offers you two options. You
can use the APIs provided by the &asn;, &odr;, and &comstack;
modules to
create and decode PDUs, and exchange them with a client.
Using this low-level interface gives you access to all fields and
options of the protocol, and you can construct your server as close
to your existing database as you like.
It is also a fairly involved process, requiring
you to set up an event-handling mechanism, protocol state machine,
etc. To simplify server implementation, we have implemented a compact
and simple, but reasonably full-functioned server-frontend that will
handle most of the protocol mechanics, while leaving you to
concentrate on your database interface.
The backend interface was designed in anticipation of a specific
integration task, while still attempting to achieve some degree of
generality. We realize fully that there are points where the
interface can be improved significantly. If you have specific
functions or parameters that you think could be useful, send us a
mail (or better, sign on to the mailing list referred to in the
top-level README file). We will try to fit good suggestions into future
releases, to the extent that it can be done without requiring
too many structural changes in existing applications.
The &yaz; server does not support XCQL.
The Database Frontend
We refer to this software as a generic database frontend. Your
database system is the backend database, and the
interface between the two is called the backend API.
The backend API consists of a small number of function handlers and
structure definitions. You are required to provide the
main() routine for the server (which can be
quite simple), as well as a set of handlers to match each of the
prototypes.
The interface functions that you write can use any mechanism you like
to communicate with your database system: You might link the whole
thing together with your database application and access it by
function calls; you might use IPC to talk to a database server
somewhere; or you might link with third-party software that handles
the communication for you (like a commercial database client library).
At any rate, the handlers will perform the tasks of:
Initialization.
Searching.
Fetching records.
Scanning the database index (optional - if you wish to implement SCAN).
Extended Services (optional).
Result-Set Delete (optional).
Result-Set Sort (optional).
Return Explain for SRU (optional).
(More functions will be added in time to support as much of
Z39.50-1995 as possible.)
The Backend API
The header file that you need to use the interface is in the
include/yaz directory. It's called
backend.h. It will include other files from
the include/yaz directory, so you'll
probably want to use the -I option of your compiler to tell it
where to find the files. After you have run
make in the top-level &yaz; directory,
all you need to do to create your server is link with the
lib/libyaz.la library.
Your main() Routine
As mentioned, your main() routine can be quite brief.
If you want to initialize global parameters, or read global configuration
tables, this is the place to do it. At the end of the routine, you should
call the function
int statserv_main(int argc, char **argv,
bend_initresult *(*bend_init)(bend_initrequest *r),
void (*bend_close)(void *handle));
The third and fourth arguments are pointers to handlers. Handler
bend_init is called whenever the server receives
an Initialize Request, so it serves as a Z39.50 session initializer. The
bend_close handler is called when the session is
closed.
statserv_main will establish listening sockets
according to the parameters given. When connection requests are received,
the event handler will typically fork() and
create a sub-process to handle a new connection.
Alternatively, the server may be set up to create a thread for each
connection.
If you use global variables together with forking, be aware
that they cannot be shared between associations unless you explicitly
disable forking with command-line parameters.
The server provides a mechanism for controlling some of its behavior
without using command-line options. The function
statserv_options_block *statserv_getcontrol(void);
will return a pointer to a struct statserv_options_block
describing the current default settings of the server. The structure
contains these elements:
int dynamic
A boolean value, which determines whether the server
will fork on each incoming request (TRUE), or not (FALSE). Default is
TRUE. This flag is only read by UNIX-based servers (WIN32-based servers
don't fork).
int threads
A boolean value, which determines whether the server
will create a thread on each incoming request (TRUE), or not (FALSE).
Default is FALSE. This flag is only read by UNIX-based servers
that offer POSIX Threads support.
WIN32-based servers always operate in threaded mode.
int inetd
A boolean value, which determines whether the server
will operate under a UNIX INET daemon (inetd). Default is FALSE.
char logfile[ODR_MAXNAME+1]
File for diagnostic output ("": stderr).
char apdufile[ODR_MAXNAME+1]
Name of file for logging incoming and outgoing APDUs
("": don't log APDUs, "-":
stderr).
char default_listen[1024]
Same form as the command-line specification of
listener address. "": no default listener address.
Default is to listen at "tcp:@:9999". You can only
specify one default listener address in this fashion.
enum oid_proto default_proto
Either PROTO_Z3950 or
PROTO_SR.
Default is PROTO_Z3950.
int idle_timeout
Maximum session idle-time, in minutes. Zero indicates
no (infinite) timeout. Default is 15 minutes.
int maxrecordsize
Maximum permissible record (message) size. Default
is 1 MB. This amount of memory will only be allocated if a
client requests a very large number of records in one operation
(or a big record).
Set it to a lower number if you are worried about resource
consumption on your host system.
char configname[ODR_MAXNAME+1]
Passed to the backend when a new connection is received.
char setuid[ODR_MAXNAME+1]
Set user id to the user specified, after binding
the listener addresses.
void (*bend_start)(struct statserv_options_block *p)
Pointer to a function which is called after the
command line options have been parsed - but before the server
starts listening.
For forked UNIX servers this handler is called in the mother
process; for threaded servers this handler is called in the
main thread.
The default value of this pointer is NULL in which case it
isn't invoked by the frontend server.
When the server operates as an NT service this handler is called
whenever the service is started.
void (*bend_stop)(struct statserv_options_block *p)
Pointer to a function which is called whenever the server
has stopped listening for incoming connections. This function pointer
has a default value of NULL in which case it isn't called.
When the server operates as an NT service this handler is called
whenever the service is stopped.
void *handle
User-defined pointer (default value NULL).
This is a per-server handle that can be used to specify "user-data".
Do not confuse this with the session-handle as returned by bend_init.
The pointer returned by statserv_getcontrol points to
a static area. You are allowed to change the contents of the structure,
but the changes will not take effect until you call
void statserv_setcontrol(statserv_options_block *block);
Note that you should generally update this structure before calling
statserv_main().
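As a rough illustration of the update sequence described above, the sketch below changes a few settings before the server starts. So that it compiles without the &yaz; headers, it works on a cut-down stand-in structure (options_sketch) through a hypothetical helper (apply_local_defaults); neither name is part of &yaz;, and the values chosen are arbitrary. The closing comment indicates where the real statserv_getcontrol() and statserv_setcontrol() calls would go.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the few statserv_options_block members used below.
   The real structure is declared in the YAZ backend.h header. */
struct options_sketch {
    int dynamic;               /* fork on each incoming connection? */
    int idle_timeout;          /* minutes; 0 means no timeout */
    char default_listen[1024]; /* default listener address */
};

/* Hypothetical helper: apply site-specific defaults to an options block.
   Returns 0 on success, -1 if the listener address does not fit. */
static int apply_local_defaults(struct options_sketch *opts,
                                const char *listen_addr)
{
    assert(opts != NULL && listen_addr != NULL);
    if (strlen(listen_addr) >= sizeof(opts->default_listen))
        return -1;
    opts->dynamic = 0;       /* do not fork: globals stay shared */
    opts->idle_timeout = 5;  /* arbitrary example value */
    strcpy(opts->default_listen, listen_addr);
    return 0;
}

/* In a real server, main() would follow roughly this order:
 *
 *     statserv_options_block *opts = statserv_getcontrol();
 *     ... modify *opts ...
 *     statserv_setcontrol(opts);
 *     return statserv_main(argc, argv, bend_init, bend_close);
 */
```

Doing the update before statserv_main() matters because the listener is set up inside that call; changes made later would arrive too late for settings such as default_listen.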
The Backend Functions
For each service of the protocol, the backend interface declares one or
two functions. You are required to provide implementations of the
functions representing the services that you wish to implement.
Init
bend_initresult *(*bend_init)(bend_initrequest *r);
This handler is called once for each new connection request, after
a new process/thread has been created, and an Initialize Request has
been received from the client. The pointer to the
bend_init handler is passed in the call to
statserv_main.
This handler is also called when operating in SRU mode - when
a connection has been made (even though SRU does not offer
this service).
Unlike previous versions of YAZ, the bend_init handler also
defines the Z39.50 services that the backend
wishes to support. Pointers to all service handlers,
including search and fetch, must be specified in this handler.
The request and result structures are defined as
typedef struct bend_initrequest
{
/** \brief user/name/password to be read */
Z_IdAuthentication *auth;
/** \brief encoding stream (for results) */
ODR stream;
/** \brief printing stream */
ODR print;
/** \brief decoding stream (use stream for results) */
ODR decode;
/** \brief reference ID */
Z_ReferenceId *referenceId;
/** \brief peer address of client */
char *peer_name;
/** \brief character set and language negotiation
see include/yaz/z-charneg.h
*/
Z_CharSetandLanguageNegotiation *charneg_request;
/** \brief character negotiation response */
Z_External *charneg_response;
/** \brief character set (encoding) for query terms
This is NULL by default. It should be set to the native character
set that the backend assumes for query terms */
char *query_charset;
/** \brief whether query_charset also applies to records
Is 0 (No) by default. Set to 1 (yes) if records are in the same
character set as queries. If in doubt, use 0 (No).
*/
int records_in_same_charset;
char *implementation_id;
char *implementation_name;
char *implementation_version;
/** \brief Z39.50 sort handler */
int (*bend_sort)(void *handle, bend_sort_rr *rr);
/** \brief SRU/Z39.50 search handler */
int (*bend_search)(void *handle, bend_search_rr *rr);
/** \brief SRU/Z39.50 fetch handler */
int (*bend_fetch)(void *handle, bend_fetch_rr *rr);
/** \brief SRU/Z39.50 present handler */
int (*bend_present)(void *handle, bend_present_rr *rr);
/** \brief Z39.50 extended services handler */
int (*bend_esrequest) (void *handle, bend_esrequest_rr *rr);
/** \brief Z39.50 delete result set handler */
int (*bend_delete)(void *handle, bend_delete_rr *rr);
/** \brief Z39.50 scan handler */
int (*bend_scan)(void *handle, bend_scan_rr *rr);
/** \brief Z39.50 segment facility handler */
int (*bend_segment)(void *handle, bend_segment_rr *rr);
/** \brief SRU explain handler */
int (*bend_explain)(void *handle, bend_explain_rr *rr);
/** \brief SRU scan handler */
int (*bend_srw_scan)(void *handle, bend_scan_rr *rr);
/** \brief SRU record update handler */
int (*bend_srw_update)(void *handle, bend_update_rr *rr);
/** \brief whether named result sets are supported (0=disable, 1=enable) */
int named_result_sets;
} bend_initrequest;
typedef struct bend_initresult
{
int errcode; /* 0==OK */
char *errstring; /* system error string or NULL */
void *handle; /* private handle to the backend module */
} bend_initresult;
In general, the server frontend expects that the
bend_*result pointer that you return is valid at
least until the next call to a bend_* function.
This applies to all of the functions described herein. The parameter
structure passed to you in the call belongs to the server frontend, and
you should not make assumptions about its contents after the current
function call has completed. In other words, if you want to retain any
of the contents of a request structure, you should copy them.
The errcode should be zero if the initialization of
the backend went well. Any other value will be interpreted as an error.
The errstring isn't used in the current version, but
one option would be to stick it in the initResponse as a VisibleString.
The handle is the most important parameter. It should
be set to some value that uniquely identifies the current session to
the backend implementation. It is used by the frontend server in any
future calls to a backend function.
The typical use is to set it to point to a dynamically allocated state
structure that is private to your backend module.
The auth member holds the authentication information
part of the Z39.50 Initialize Request. Interpret this if your server
requires authentication.
The members peer_name,
implementation_id,
implementation_name and
implementation_version hold the
DNS name of the client, the ID of the implementor, the name
of the client (Z39.50) implementation, and its version.
The bend_* members are set to NULL when
bend_init is called. Modify the pointers by
setting them to point to backend functions.
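The following is a minimal sketch of an init handler along the lines described above: it allocates a private session state, returns it as the handle, and registers the search and fetch handlers. It uses local stand-in types (names ending in _sketch) reduced to the members it touches, so that it compiles without the &yaz; headers; session_state, my_init, my_search, my_fetch and my_close are hypothetical names. The static result is adequate only for this single-threaded illustration; a real backend might instead allocate the result with odr_malloc() on the request's stream.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for the YAZ request/result types, cut down to what is used. */
typedef struct { int hits; } search_rr_sketch;
typedef struct { int number; } fetch_rr_sketch;

typedef struct {
    int (*bend_search)(void *handle, search_rr_sketch *rr);
    int (*bend_fetch)(void *handle, fetch_rr_sketch *rr);
    int (*bend_scan)(void *handle, void *rr);   /* left NULL: no scan */
} initrequest_sketch;

typedef struct {
    int errcode;      /* 0 == OK */
    char *errstring;  /* system error string or NULL */
    void *handle;     /* private handle to the backend module */
} initresult_sketch;

/* Hypothetical per-session state kept behind the handle. */
struct session_state { int searches_run; };

static int my_search(void *handle, search_rr_sketch *rr)
{
    struct session_state *s = handle;
    s->searches_run++;   /* the handle identifies the session */
    rr->hits = 0;
    return 0;
}

static int my_fetch(void *handle, fetch_rr_sketch *rr)
{
    (void)handle; (void)rr;
    return 0;
}

static initresult_sketch *my_init(initrequest_sketch *q)
{
    /* Must remain valid after we return, hence not an automatic variable. */
    static initresult_sketch result;
    struct session_state *state;

    assert(q != NULL);
    result.errcode = 0;
    result.errstring = NULL;
    result.handle = NULL;

    state = calloc(1, sizeof(*state));
    if (state == NULL) {
        result.errcode = 1;   /* any nonzero value signals failure */
        return &result;
    }
    /* Register only the services this backend supports. */
    q->bend_search = my_search;
    q->bend_fetch = my_fetch;
    result.handle = state;
    return &result;
}

/* Counterpart of bend_close: release the session state. */
static void my_close(void *handle)
{
    free(handle);
}
```

Handlers that are left at NULL (bend_scan here) stay unregistered, so a backend only needs to fill in the members for the services it actually implements.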
Search and Retrieve
We now describe the handlers that are required to support search
and retrieve. You must support two functions: one for search and one
for fetch (retrieval of one record). If desirable, you can provide a
third handler, which is called when a present request is received and
allows you to optimize retrieval of multiple records.
int (*bend_search) (void *handle, bend_search_rr *rr);
typedef struct {
char *setname; /* name to give to this set */
int replace_set; /* replace set, if it already exists */
int num_bases; /* number of databases in list */
char **basenames; /* databases to search */
Z_ReferenceId *referenceId;/* reference ID */
Z_Query *query; /* query structure */
ODR stream; /* encode stream */
ODR decode; /* decode stream */
ODR print; /* print stream */
bend_request request;
bend_association association;
int *fd;
int hits; /* number of hits */
int errcode; /* 0==OK */
char *errstring; /* system error string or NULL */
Z_OtherInformation *search_info; /* additional search info */
char *srw_sortKeys; /* holds SRU/SRW sortKeys info */
char *srw_setname; /* holds SRU/SRW generated resultsetID */
int *srw_setnameIdleTime; /* holds SRU/SRW life-time */
int estimated_hit_count; /* if hit count is estimated */
int partial_resultset; /* if result set is partial */
} bend_search_rr;
The bend_search handler is a fairly close
approximation of the Z39.50 Search Request and Response PDUs.
The setname is the resultSetName from the protocol.
You are required to establish a mapping between the set name and whatever
your backend database likes to use.
Similarly, the replace_set is a boolean value
corresponding to the resultSetIndicator field in the protocol.
basenames is an array of num_bases character
pointers to the database names provided by the client.
The query is the full query structure as defined in
the protocol ASN.1 specification.
It can be either of the possible query types, and it's up to you to
determine if you can handle the provided query type.
Rather than reproduce the C interface here, we'll refer you to the
structure definitions in the file
include/yaz/z-core.h. If you want to look at the
attributeSetId OID of the RPN query, you can either match it against
your own internal tables, or you can use the
OID tools.
The structure contains a number of hits, and an
errcode/errstring pair. If an error occurs
during the search, or if you're unhappy with the request, you should
set the errcode to a value from the BIB-1 diagnostic set. The value
will then be returned to the user in a nonsurrogate diagnostic record
in the response. The errstring, if provided, will
go in the addinfo field. Look at the protocol definition for the
defined error codes, and the suggested uses of the addinfo field.
The bend_search handler is also called when
the frontend server receives an SRU SearchRetrieveRequest.
For SRU, a CQL query is usually provided by the client.
The CQL query is available as part of Z_Query
structure (note that CQL is now part of Z39.50 via an external).
To support CQL in existing implementations that only do Type-1,
we refer to the CQL-to-PQF tool.
To maintain backwards compatibility, the frontend server
of &yaz; always assumes that error codes are BIB-1 diagnostics.
For SRU operation, a BIB-1 diagnostic code is mapped to
an SRU diagnostic.
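To make the hits and errcode/errstring conventions concrete, here is a small sketch of a search handler. It accepts only a database named "Default" (an arbitrary choice for this example) and otherwise reports BIB-1 diagnostic 109 (Database unavailable) with the offending name as additional information. The search_rr_sketch type is a local stand-in containing only the members used, the macro name and my_search are hypothetical, and the hit count is a made-up constant instead of a real query result.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for bend_search_rr, reduced to the members used here. */
typedef struct {
    char *setname;     /* name to give to this set */
    int num_bases;     /* number of databases in list */
    char **basenames;  /* databases to search */
    int hits;          /* number of hits */
    int errcode;       /* 0 == OK */
    char *errstring;   /* additional info or NULL */
} search_rr_sketch;

#define DIAG_DATABASE_UNAVAILABLE 109   /* BIB-1 diagnostic code */

static int my_search(void *handle, search_rr_sketch *rr)
{
    int i;

    (void)handle;
    assert(rr != NULL);
    rr->hits = 0;
    rr->errcode = 0;
    rr->errstring = NULL;

    /* Reject the request if any named database is unknown to us. */
    for (i = 0; i < rr->num_bases; i++) {
        if (strcmp(rr->basenames[i], "Default") != 0) {
            rr->errcode = DIAG_DATABASE_UNAVAILABLE;
            rr->errstring = rr->basenames[i];   /* goes in addinfo */
            return 0;
        }
    }

    /* A real backend would evaluate rr->query here and remember the
       result under rr->setname; this sketch reports a fixed count. */
    rr->hits = 3;
    return 0;
}
```

Because the frontend maps BIB-1 codes to SRU diagnostics, the same handler can serve both protocols without knowing which one the client used.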
int (*bend_fetch) (void *handle, bend_fetch_rr *rr);
typedef struct bend_fetch_rr {
char *setname; /* set name */
int number; /* record number */
Z_ReferenceId *referenceId;/* reference ID */
Odr_oid *request_format; /* format, transfer syntax (OID) */
Z_RecordComposition *comp; /* Formatting instructions */
ODR stream; /* encoding stream - memory source if req */
ODR print; /* printing stream */
char *basename; /* name of database that provided record */
int len; /* length of record or -1 if structured */
char *record; /* record */
int last_in_set; /* is it? */
Odr_oid *output_format; /* response format/syntax (OID) */
int errcode; /* 0==success */
char *errstring; /* system error string or NULL */
int surrogate_flag; /* surrogate diagnostic */
char *schema; /* string record schema input/output */
} bend_fetch_rr;
The frontend server calls the bend_fetch handler
when it needs database records to fulfill a Z39.50 Search Request, a
Z39.50 Present Request or an SRU SearchRetrieveRequest.
The setname is simply the name of the result set
that holds the reference to the desired record.
The number is the offset into the set (with 1
being the first record in the set). The request_format field
is the record format requested by the client.
A value of NULL for request_format indicates that the
client did not request a specific format.
The stream argument is an &odr; stream which
should be used for allocating space for structured data records.
The stream will be reset when all records have been assembled, and
the response package has been transmitted.
For unstructured data, the backend is responsible for maintaining a
static or dynamic buffer for the record between calls.
If an SRU SearchRetrieveRequest is received by the frontend server,
the referenceId is NULL and the
format (transfer syntax) is the OID for XML.
The schema for SRU is stored in both the
Z_RecordComposition
structure and schema (simple string).
In the structure, the basename is the name of the
database that holds the
record. len is the length of the record returned, in
bytes, and record is a pointer to the record.
last_in_set should be nonzero only if the record
returned is the last one in the given result set.
errcode and errstring, if
given, will be interpreted as a global error pertaining to the
set, and will be returned in a non-surrogate-diagnostic.
If you wish to return the error as a surrogate-diagnostic
(local error) you can do this by setting
surrogate_flag to 1 also.
If the len field has the value -1, then
record is assumed to point to a constructed data
type. The format field will be used to determine
which encoder should be used to serialize the data.
If your backend generates structured records, it should use
odr_malloc() on the provided stream for allocating
data: This allows the frontend server to keep track of the record sizes.
The format field is mapped to an object identifier
in the direct reference of the resulting EXTERNAL representation
of the record.
The current version of &yaz; only supports the direct reference mode.
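The fetch conventions above can be sketched as follows for the unstructured case. The handler serves records from a static table, so the record buffer stays valid between calls, and it reports BIB-1 diagnostic 13 (Present request out-of-range) for positions outside the set. fetch_rr_sketch is a local stand-in with only the members used; the table, the macro names and my_fetch are hypothetical. A real handler would also set output_format, which is omitted here because it needs the &yaz; OID types.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for bend_fetch_rr, reduced to the members used here. */
typedef struct {
    char *setname;       /* set name */
    int number;          /* record number, 1 == first record */
    char *basename;      /* name of database that provided record */
    int len;             /* length of record or -1 if structured */
    char *record;        /* record */
    int last_in_set;     /* nonzero for the last record of the set */
    int errcode;         /* 0 == success */
    char *errstring;     /* additional info or NULL */
    int surrogate_flag;  /* 1 == report as surrogate diagnostic */
} fetch_rr_sketch;

#define NUM_RECORDS 3
#define DIAG_PRESENT_OUT_OF_RANGE 13   /* BIB-1 diagnostic code */

/* Static storage: stays valid after the handler returns. */
static char *records[NUM_RECORDS] = {
    "first record", "second record", "third record"
};

static int my_fetch(void *handle, fetch_rr_sketch *rr)
{
    (void)handle;
    assert(rr != NULL);
    rr->errcode = 0;
    rr->errstring = NULL;
    rr->surrogate_flag = 0;

    if (rr->number < 1 || rr->number > NUM_RECORDS) {
        /* Global (non-surrogate) error pertaining to the set. */
        rr->errcode = DIAG_PRESENT_OUT_OF_RANGE;
        return 0;
    }
    rr->basename = "Default";
    rr->record = records[rr->number - 1];   /* number is 1-based */
    rr->len = (int) strlen(rr->record);     /* unstructured: byte length */
    rr->last_in_set = (rr->number == NUM_RECORDS);
    return 0;
}
```

To report a problem with a single record while still letting the rest of the response through, the same handler could set surrogate_flag to 1 alongside errcode.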
int (*bend_present) (void *handle, bend_present_rr *rr);
typedef struct {
char *setname; /* set name */
int start;
int number; /* record number */
Odr_oid *format; /* format, transfer syntax (OID) */
Z_ReferenceId *referenceId;/* reference ID */
Z_RecordComposition *comp; /* Formatting instructions */
ODR stream; /* encoding stream - memory source if required */
ODR print; /* printing stream */
bend_request request;
bend_association association;
int hits; /* number of hits */
int errcode; /* 0==OK */
char *errstring; /* system error string or NULL */
} bend_present_rr;
The bend_present handler is called when
the server receives a Z39.50 Present Request.
The setname,
start and number are the
name of the result set, the start position, and the number of records to
be retrieved, respectively. format and
comp are the preferred transfer syntax and element
specifications of the present request.
Note that this handler serves as a supplement to
bend_fetch and need not be defined in order to
support search and retrieve.
Delete
For backends that support deletion of a result set, only one handler
must be defined.
int (*bend_delete)(void *handle, bend_delete_rr *rr);
typedef struct bend_delete_rr {
int function;
int num_setnames;
char **setnames;
Z_ReferenceId *referenceId;
int delete_status; /* status for the whole operation */
int *statuses; /* status each set - indexed as setnames */
ODR stream;
ODR print;
} bend_delete_rr;
The delete set function definition is rather primitive, mostly because
we have had no practical need for it as of yet. If someone wants
to provide a full delete service, we'd be happy to add the
extra parameters that are required. Are there clients out there
that will actually delete sets they no longer need?
Scan
For servers that wish to offer the scan service one handler
must be defined.
int (*bend_scan)(void *handle, bend_scan_rr *rr);
typedef enum {
BEND_SCAN_SUCCESS, /* ok */
BEND_SCAN_PARTIAL /* not all entries could be found */
} bend_scan_status;
typedef struct bend_scan_rr {
int num_bases; /* number of elements in databaselist */
char **basenames; /* databases to search */
Odr_oid *attributeset;
Z_ReferenceId *referenceId; /* reference ID */
Z_AttributesPlusTerm *term;
ODR stream; /* encoding stream - memory source if required */
ODR print; /* printing stream */
int *step_size; /* step size */
int term_position; /* desired index of term in result list/returned */
int num_entries; /* number of entries requested/returned */
/* scan term entries. The called handler does not have
to allocate this. Size of entries is num_entries (see above) */
struct scan_entry *entries;
bend_scan_status status;
int errcode;
char *errstring;
char *scanClause; /* CQL scan clause */
char *setname; /* Scan in result set (NULL if omitted) */
} bend_scan_rr;
This backend server handles both Z39.50 scan
and SRU scan. To distinguish between SRU (CQL) scan and
Z39.50 scan, a handler must check for a non-NULL value of
scanClause.
If designed today, the choice would be expressed with a union or similar,
but that would break binary compatibility with existing servers.
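A scan handler might branch on scanClause as in the sketch below. scan_rr_sketch is a local stand-in with only the members used (term is a void pointer here instead of the real Z_AttributesPlusTerm pointer), and scan_stats and my_scan are hypothetical names. The handler returns no entries; it only records which kind of scan it was asked to perform.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for bend_scan_rr, reduced to the members used here. */
typedef struct {
    void *term;        /* Z39.50 scan term (stand-in type) */
    char *scanClause;  /* CQL scan clause, non-NULL only for SRU */
    int num_entries;   /* number of entries requested/returned */
    int errcode;       /* 0 == OK */
} scan_rr_sketch;

/* Hypothetical per-session counters kept behind the handle. */
struct scan_stats {
    int sru_scans;
    int z3950_scans;
};

static int my_scan(void *handle, scan_rr_sketch *rr)
{
    struct scan_stats *stats = handle;

    assert(stats != NULL && rr != NULL);
    rr->errcode = 0;

    if (rr->scanClause != NULL) {
        /* SRU scan: the start term is the CQL text in scanClause. */
        stats->sru_scans++;
    } else {
        /* Z39.50 scan: the start term is in rr->term. */
        stats->z3950_scans++;
    }

    /* This sketch has no index to consult, so it returns no entries. */
    rr->num_entries = 0;
    return 0;
}
```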
Application Invocation
The finished application has the following
invocation syntax (by way of statserv_main()):
&gfs-synopsis;
The options are:
&gfs-options;
A listener specification consists of a transport mode followed by a
colon (:) followed by a listener address. The transport mode is
either tcp, unix, or
ssl.
For TCP and SSL, an address has the form
hostname | IP-number [: portnumber]
The port number defaults to 210 (standard Z39.50 port).
For UNIX, the address is the filename of the socket.
For TCP/IP and SSL, the special hostname @
(at sign) is mapped to the address INADDR_ANY,
which causes the server to listen on any local interface.
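The listener syntax above can be illustrated with a small stand-alone parser. This is a sketch written for this text, not the code &yaz; uses: the listener structure and parse_listener are hypothetical, a transport mode is required, only the three modes named above are accepted, and IPv6 literals (which contain colons) are not handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_PORT 210   /* standard Z39.50 port */

struct listener {
    char mode[8];       /* "tcp", "ssl" or "unix" */
    char address[256];  /* host name, IP number, or socket file name */
    int port;           /* -1 for UNIX sockets */
};

/* Parse "mode:address[:port]". Returns 0 on success, -1 on bad input. */
static int parse_listener(const char *spec, struct listener *out)
{
    const char *colon, *rest, *port_sep;
    size_t mode_len, addr_len;

    assert(spec != NULL && out != NULL);
    colon = strchr(spec, ':');
    if (colon == NULL)
        return -1;
    mode_len = (size_t)(colon - spec);
    if (mode_len == 0 || mode_len >= sizeof(out->mode))
        return -1;
    memcpy(out->mode, spec, mode_len);
    out->mode[mode_len] = '\0';
    rest = colon + 1;

    if (strcmp(out->mode, "unix") == 0) {
        /* Everything after the colon is the socket file name. */
        if (*rest == '\0' || strlen(rest) >= sizeof(out->address))
            return -1;
        strcpy(out->address, rest);
        out->port = -1;
        return 0;
    }
    if (strcmp(out->mode, "tcp") != 0 && strcmp(out->mode, "ssl") != 0)
        return -1;

    /* hostname | IP-number [: portnumber] */
    port_sep = strchr(rest, ':');
    addr_len = port_sep ? (size_t)(port_sep - rest) : strlen(rest);
    if (addr_len == 0 || addr_len >= sizeof(out->address))
        return -1;
    memcpy(out->address, rest, addr_len);
    out->address[addr_len] = '\0';

    out->port = DEFAULT_PORT;
    if (port_sep != NULL) {
        char *end;
        long p = strtol(port_sep + 1, &end, 10);
        if (end == port_sep + 1 || *end != '\0' || p < 1 || p > 65535)
            return -1;
        out->port = (int) p;
    }
    return 0;
}
```

With this sketch, "tcp:@:9999" yields mode tcp, address @ and port 9999, while "ssl:localhost" falls back to port 210 and "unix:/tmp/mysocket" carries no port at all.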
Running the GFS on Unix
Assuming the server program is named application and is
started as root, the following will make it listen on port 210.
The server will change identity to nobody
and write its log to /var/log/app.log.
application -l /var/log/app.log -u nobody tcp:@:210
The server will accept Z39.50 requests and offer SRU service on port 210.
Setting up Apache as SRU Frontend
If you use Apache
as your public web server and want to offer HTTP port 80
access to the YAZ server on 210, you can use the
ProxyPass
directive.
If you have virtual host
srw.mydomain you can use the following directives
in Apache's httpd.conf:
<VirtualHost *>
ErrorLog /home/srw/logs/error_log
TransferLog /home/srw/logs/access_log
ProxyPass / http://srw.mydomain:210/
</VirtualHost>
The above is for the Apache 1.3 series.
Running a server with local access only
Servers that are accessed only from the local host should listen
on a UNIX file socket rather than an Internet socket. To listen on
/tmp/mysocket start the server as follows:
application unix:/tmp/mysocket
GFS Configuration and Virtual Hosts
&gfs-virtual;