X-Git-Url: http://git.indexdata.com/?a=blobdiff_plain;f=doc%2Frecordmodel.xml;h=ba72af19db33726646e78904c5f36d299dbf3b7f;hb=8d16f8a75221de7152eceb7d6be4d9e847d99e4f;hp=6ad60ed125af11e8052cea12ffe9904739894244;hpb=25a37c9be836f891281688788a7a1f967ea2b2cb;p=idzebra-moved-to-github.git diff --git a/doc/recordmodel.xml b/doc/recordmodel.xml index 6ad60ed..ba72af1 100644 --- a/doc/recordmodel.xml +++ b/doc/recordmodel.xml @@ -1,5 +1,5 @@ - + The Record Model @@ -33,7 +33,7 @@ - + When records are accessed by the system, they are represented in their local, or native format. This might be SGML or HTML files, @@ -42,7 +42,8 @@ input filter by preparing conversion rules based on regular expressions and possibly augmented by a flexible scripting language (Tcl). - The input filter produces as output an internal representation: + The input filter produces as output an internal representation, + a tree structure. @@ -94,7 +95,7 @@ subsequent sections. Zebra can read structured records in many different formats. How this is done is governed by additional parameters after the - "grs" keyboard, separated by "." characters. + "grs" keyword, separated by "." characters. @@ -220,7 +221,7 @@ Each element is terminated by a closing tag - beginning with </, and containing the same symbolic tag-name as the corresponding opening tag. - The general closing tag - </> - + The general closing tag - </> - terminates the element started by the last opening tag. The structuring of elements is significant. The element Telephone, @@ -259,7 +260,7 @@ - + Variants @@ -460,12 +461,12 @@ - An action is surrounded by curly braces ({...}), and + An action is surrounded by curly braces ({...}), and consists of a sequence of statements. Statements may be separated by newlines or semicolons (;). Within actions, the strings that matched the expressions immediately preceding the action can be referred to as - $0, $1, $2, etc. + $0, $1, $2, etc. @@ -476,13 +477,13 @@ - begin type [parameter ... 
] + begin type [parameter ... ] Begin a new data element. The type is one of the following: - + record @@ -539,7 +540,7 @@ - end [type] + end [type] Close a tagged element. If no parameter is given, @@ -608,8 +609,8 @@ - TITLE "Zen and the Art of Motorcycle Maintenance" ROOT + TITLE "Zen and the Art of Motorcycle Maintenance" AUTHOR "Robert Pirsig" @@ -623,10 +624,10 @@ - TITLE "Zen and the Art of Motorcycle Maintenance" ROOT - FIRST-NAME "Robert" + TITLE "Zen and the Art of Motorcycle Maintenance" AUTHOR + FIRST-NAME "Robert" SURNAME "Pirsig" @@ -691,35 +692,35 @@ Which of the two elements are transmitted to the client by the server depends on the specifications provided by the client, if any. - + In practice, each variant node is associated with a triple of class, type, value, corresponding to the variant mechanism of Z39.50. - + - + Data Elements - + Data nodes have no children (they are always leaf nodes in the record tree). - + - + - + Configuring Your Data Model - + The following sections describe the configuration files that govern the internal management of data records. The system searches for the files @@ -762,14 +763,14 @@ - The Tag set (again, this can consist of several different sets). + The tag set (again, this can consist of several different sets). This is used when reading the records from a file, to recognize the different tags, and when transmitting the record to the client - mapping the tags to their numerical representation, if they are known. - + The variant set which is used in the profile. This provides a @@ -840,8 +841,8 @@ Generally, the files are simple ASCII files, which can be maintained - using any text editor. Blank lines, and lines beginning with a (#) are - ignored. Any characters on a line followed by a (#) are also ignored. + using any text editor. Blank lines, and lines beginning with a (#) are + ignored. Any characters on a line followed by a (#) are also ignored. 
All other lines contain directives, which provide some setting or value to the system. Generally, settings are characterized by a single @@ -853,7 +854,7 @@ - + The Abstract Syntax (.abs) Files @@ -883,12 +884,12 @@ The file may contain the following directives: - + - + - name symbolic-name + name symbolic-name (m) This provides a shorthand name or @@ -897,17 +898,17 @@ - reference OID-name + reference OID-name (m) The reference name of the OID for the profile. The reference names can be found in the util - module of YAZ. + module of YAZ. - attset filename + attset filename (m) The attribute set that is used for @@ -916,7 +917,7 @@ - tagset filename + tagset filename (o) The tag set (if any) that describe @@ -925,7 +926,7 @@ - varset filename + varset filename (o) The variant set used in the profile. @@ -933,25 +934,27 @@ - maptab filename + maptab filename (o,r) This points to a conversion table that might be used if the client asks for the record in a different schema from the native one. - + + - marc filename + marc filename (o) Points to a file containing parameters - for representing the record contents in the ISO2709 syntax. Read the - description of the MARC representation facility below. + for representing the record contents in the ISO2709 syntax. + Read the description of the MARC representation facility below. - + + - esetname name filename + esetname name filename (o,r) Associates the @@ -959,9 +962,10 @@ given in place of the filename, this corresponds to a null mapping for the given element set name. - + + - any tags + any tags (o) This directive specifies a list of attributes @@ -971,49 +975,74 @@ provides an efficient way of supporting free-text searching across all elements. However, it does increase the size of the index significantly. The attributes can be qualified with a structure, as in - the elm directive below. + the elm directive below. 
- + + - elm path name attributes + elm path name attributes (o,r) Adds an element to the abstract record syntax of the schema. - The path follows the + The path follows the syntax which is suggested by the Z39.50 document - that is, a sequence - of tags separated by slashes (/). Each tag is given as a + of tags separated by slashes (/). Each tag is given as a comma-separated pair of tag type and -value surrounded by parenthesis. - The name is the name of the element, and - the attributes + The name is the name of the element, and + the attributes specifies which attributes to use when indexing the element in a comma-separated list. A ! in place of the attribute name is equivalent to specifying an attribute name identical to the element name. A - in place of the attribute name specifies that no indexing is to take place for the given element. - The attributes can be qualified with field - types to specify which + The attributes can be qualified with field + types to specify which character set should govern the indexing procedure for that field. The same data element may be indexed into several different fields, using different character set definitions. See the . - The default field type is "w" for word. + The default field type is w for + word. - + + + + + xelm xpath attributes + + + Specifies indexing for record nodes given by + xpath. Unlike directive + elm, this directive allows you to index attribute + contents. The xpath uses + a syntax similar to XPath. The attributes + have same syntax and meaning as directive elm, except that operator + ! refers to the nodes selected by xpath. + + + + + - encoding encodingname + encoding encodingname This directive specifies character encoding for external records. For records such as XML that specifies encoding within the file via a header this directive is ignored. If neither this directive is given, nor an encoding is set - within external records, ISO-8859-1 encoding is assmed. 
+ within external records, ISO-8859-1 encoding is assumed. - xpath enable/disable + xpath enable/disable If this directive is followed by enable, @@ -1023,6 +1052,102 @@ + + + + systag systemtag element + + + This directive maps system information to an element during + retrieval. This information is dynamically created. The + following system tags are defined + + + size + + + Size of record in bytes. By default this + is mapped to element size. + + + + + + rank + + + Score/rank of record. By default this + is mapped to element rank. + If no score was calculated for the record (non-ranked + searched) search this directive is ignored. + + + + + + sysno + + + Zebra's system number (record ID) for the + record. By default this is mapped to element + localControlNumber. + + + + + If you do not want a particular system tag to be applied, + then set the resulting element to something undefined in the + abs file (such as none). + + + + + + + + systag + systemTag + actualTag + + + + Specifies what information, if any, Zebra should + automatically include in retrieval records for the + ``system fields'' that it supports. + systemTag may + be any of the following: + + + rank + + An integer indicating the relevance-ranking score + assigned to the record. + + + + sysno + + An automatically generated identifier for the record, + unique within this database. It is represented by the + <localControlNumber> element in + XML and the (1,14) tag in GRS-1. + + + + size + + The size, in bytes, of the retrieved record. + + + + + + The actualTag parameter may be + none to indicate that the named element + should be omitted from retrieval records. + + + @@ -1084,7 +1209,7 @@ The Attribute Set (.att) Files - This file type describes the Use elements of + This file type describes the Use elements of an attribute set. It contains the following directives. 
@@ -1092,7 +1217,7 @@ - name symbolic-name + name symbolic-name (m) This provides a shorthand name or @@ -1101,24 +1226,24 @@ - reference OID-name + reference OID-name (m) The reference name of the OID for the attribute set. - The reference names can be found in the util - module of YAZ. + The reference names can be found in the util + module of YAZ. - include filename + include filename (o,r) This directive is used to include another attribute set as a part of the current one. This is used when a new attribute set is defined as an extension to another set. For instance, many new attribute sets are defined as extensions - to the bib-1 set. + to the bib-1 set. This is an important feature of the retrieval system of Z39.50, as it ensures the highest possible level of interoperability, as those access points of your database which are @@ -1128,15 +1253,15 @@ att - att-value att-name [local-value] + att-value att-name [local-value] (o,r) This repeatable directive introduces a new attribute to the set. The attribute value is stored in the index (unless a - local-value is + local-value is given, in which case this is stored). The name is used to refer to the - attribute from the abstract syntax. + attribute from the abstract syntax. @@ -1463,7 +1588,7 @@ simpleElement - path ['variant' variant-request] + path ['variant' variant-request] (o,r) This corresponds to a simple element request @@ -1748,9 +1873,9 @@ - Curly braces {} may be used to enclose ranges of single + Curly braces {} may be used to enclose ranges of single characters (possibly using the escape convention described in the - preceding point), eg. {a-z} to introduce the + preceding point), eg. {a-z} to introduce the standard range of ASCII characters. Note that the interpretation of such a range depends on the concrete representation in your local, physical character set.
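As a rough illustration of the attribute set (.att) directives that the hunks above touch (name, reference, include, att), a small file might look like the sketch below. It is not taken from the patch: the set name, the OID reference name, the included file name, and the attribute values and names are all placeholders. The reference name in particular would have to be replaced by one that YAZ actually knows. The trailing comments rely on the rule stated earlier in the document that any characters following a # on a line are ignored.

```
# example.att: hypothetical attribute set file (all names and values are assumptions)
name example                  # shorthand name for the attribute set (m)
reference Example-attset      # placeholder; must be an OID reference name known to YAZ (m)
include bib1.att              # build on another set (o,r); file name assumed

# att att-value att-name [local-value]
att 2001 distributorName      # no local-value: 2001 is stored in the index
att 2002 indexTerms 21        # local-value given: 21 is stored instead of 2002
```

The att-name in the second column is the name an abstract syntax (.abs) file would use to refer to the attribute, for example in the attribute list of an elm directive.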