From 1ccadd935ed1b004e16537ab77c0653dd9058f99 Mon Sep 17 00:00:00 2001 From: oleg Date: Tue, 11 Mar 2003 13:24:10 +0000 Subject: [PATCH] Rename file. DocBook XML better, I think. --- doc/marc_indexing.sgml | 317 ----------------------------------------------- doc/marc_indexing.xml | 319 ++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 319 insertions(+), 317 deletions(-) delete mode 100644 doc/marc_indexing.sgml create mode 100644 doc/marc_indexing.xml diff --git a/doc/marc_indexing.sgml b/doc/marc_indexing.sgml deleted file mode 100644 index acb20f6..0000000 --- a/doc/marc_indexing.sgml +++ /dev/null @@ -1,317 +0,0 @@ - - - - - - - Indexing of MARC records by Zebra - - Zebra is suitable for distribution of MARC records via Z39.50. We - have a several possibilities to describe the indexing process of MARC records. - This document shows these possibilities. - - - - - - Simple indexing of MARC records -Simple indexing is not described yet. - - - - Extended indexing of MARC records - -Extended indexing of MARC records will help you if you need index a -combination of subfields, or index only a part of the whole field, -or use during indexing process embedded fields of MARC record. - - -Extended indexing of MARC records additionally allows: - - - -to index data in LEADER of MARC record - - - -to index data in control fields (with fixed length) - - - -to use during indexing the values of indicators - - - -to index linked fields for UNIMARC based formats - - - - - -In compare with simple indexing process the extended indexing -may increase (about 2-3 times) the time of indexing process for MARC -records. - - -The index-formula - -At the beginning, we have to define the term index-formula -for MARC records. This term helps to understand the notation of extended indexing of MARC records -by Zebra. Our definition is based on the document "The -table of conformity for Z39.50 use attributes and RUSMARC fields". 
-The document is available only in russian language. - -The index-formula is the combination of subfields presented in such way: - - - 71-00$a, $g, $h ($c){.$b ($c)} , (1) - - -We know that Zebra supports a Bib-1 attribute - right truncation. -In this case, the index-formula (1) consists from -forms, defined in the same way as (1) - - - 71-00$a, $g, $h - 71-00$a, $g - 71-00$a - - -The original MARC record may be without some elements, which included in index-formula. - - -This notation incudes such operands as: - - - - # - It means whitespace character. - - - - - - The position may contain any value, defined by MARC format. - For example, index-formula - - - 70-#1$a, $g , (2) - - -includes - - - 700#1$a, $g - 701#1$a, $g - 702#1$a, $g - - - - - - -{...} -The repeatable elements are defined in figure-brackets {}. For example, -index-formula - - - - 71-00$a, $g, $h ($c){.$b ($c)} , (3) - - -includes - - - 71-00$a, $g, $h ($c). $b ($c) - 71-00$a, $g, $h ($c). $b ($c). $b ($c) - 71-00$a, $g, $h ($c). $b ($c). $b ($c). $b ($c) - - - - - - -All another operands are the same as accepted in MARC world. - - - - - -Notation of <emphasis>index-formula</emphasis> for Zebra - - -Extended indexing overloads path of -elm definition in abstract syntax file of Zebra -(.abs file). It means that names beginning with -"mc-" are interpreted by Zebra as -index-formula. The database index is created and -linked with access point (Bib-1 use attribute) -according to this formula. - -For example, index-formula - - - 71-00$a, $g, $h ($c){.$b ($c)} , (4) - - -in .abs file looks like: - - - mc-71.00_$a,_$g,_$h_(_$c_){.$b_(_$c_)} - - - -The notation of index-formula uses the operands: - - - -_ -It means whitespace character. - - - -. -The position may contain any value, defined by MARC format. 
For example, -index-formula - - - 70-#1$a, $g , (5) - - -matches mc-70._1_$a,_$g_ and includes - - - 700_1_$a,_$g_ - 701_1_$a,_$g_ - 702_1_$a,_$g_ - - - - - -{...} -The repeatable elements are defined in figure-brackets {}. For example, -index-formula - - - 71#00$a, $g, $h ($c) {.$b ($c)} , (6) - - -matches mc-71.00_$a,_$g,_$h_(_$c_){.$b_(_$c_)} and -includes - - - 71.00_$a,_$g,_$h_(_$c_).$b_(_$c_) - 71.00_$a,_$g,_$h_(_$c_).$b_(_$c_).$b_(_$c_) - 71.00_$a,_$g,_$h_(_$c_).$b_(_$c_).$b_(_$c_).$b_(_$c_) - - - - - - -<...> -Embedded index-formula (for linked fields) is between <>. For example, -index-formula - - - 4--#-$170-#1$a, $g ($c) , (7) - - -matches mc-4.._._$1<70._1_$a,_$g_(_$c_)>_ and -includes - - - 463_._$1<70._1_$a,_$g_(_$c_)>_ - - - - - - - - -All another operands are the same as accepted in MARC world. - - - -Examples - - - - - - -indexing LEADER - -You need to use keyword "ldr" to index leader. For example, indexing data from 6th -and 7th position of LEADER - - - elm mc-ldr[6] Record-type ! - elm mc-ldr[7] Bib-level ! - - - - - - -indexing data from control fields - -indexing date (the time added to database) - - - elm mc-008[0-5] Date/time-added-to-db ! - - -or for RUSMARC (this data included in 100th field) - - - elm mc-100___$a[0-7]_ Date/time-added-to-db ! - - - - - - -using indicators while indexing - -For RUSMARC index-formula -70-#1$a, $g matches - - - elm 70._1_$a,_$g_ Author !:w,!:p - - -When Zebra finds a field according to "70." pattern it checks -the indicators. In this case the value of first indicator doesn't mater, but -the value of second one must be whitespace, in another case a field is not -indexed. - - - - - -indexing embedded (linked) fields for UNIMARC based formats - -For RUSMARC index-formula -4--#-$170-#1$a, $g ($c) matches - - - elm mc-4.._._$1<70._1_$a,_$g_(_$c_)>_ Author !:w,!:p - - -Data are extracted from record if the field matches to -"4.._." 
pattern and data in linked field match to embedded -index-formula 70._1_$a,_$g_(_$c_). - - - - - - - - - - - - diff --git a/doc/marc_indexing.xml b/doc/marc_indexing.xml new file mode 100644 index 0000000..62495bd --- /dev/null +++ b/doc/marc_indexing.xml @@ -0,0 +1,319 @@ + + + + + + + Indexing of MARC records by Zebra + + Zebra is suitable for distribution of MARC records via Z39.50. There + are several ways to describe the indexing process of MARC records. + This document presents them. + + + + + + Simple indexing of MARC records +Simple indexing is not described yet. + + + + Extended indexing of MARC records + +Extended indexing of MARC records helps if you need to index a +combination of subfields, to index only part of a field, +or to use embedded fields of a MARC record during indexing. + + +Extended indexing of MARC records additionally allows: + + + +to index data in the LEADER of a MARC record + + + +to index data in control fields (with fixed length) + + + +to use the values of indicators during indexing + + + +to index linked fields for UNIMARC-based formats + + + + + +Compared with simple indexing, extended indexing +may increase the indexing time for MARC records +(by about 2-3 times). + + +The index-formula + +First, we have to define the term index-formula +for MARC records. This term helps in understanding the notation of extended indexing of MARC records +by Zebra. Our definition is based on the document "The +table of conformity for Z39.50 use attributes and RUSMARC fields". +The document is available only in Russian. + +An index-formula is a combination of subfields written in the following way: + + +71-00$a, $g, $h ($c){.$b ($c)} , (1) + + +We know that Zebra supports the Bib-1 right-truncation attribute. 
+In this case, the index-formula (1) consists of +forms defined in the same way as (1): + + +71-00$a, $g, $h +71-00$a, $g +71-00$a + + +The original MARC record may lack some of the elements included in the index-formula. + + +This notation includes such operands as: + + + + # + It means a whitespace character. + + + + - + The position may contain any value defined by the MARC format. + For example, the index-formula + + +70-#1$a, $g , (2) + + +includes + + +700#1$a, $g +701#1$a, $g +702#1$a, $g + + + + + + +{...} +Repeatable elements are enclosed in curly braces {}. For example, the +index-formula + + + +71-00$a, $g, $h ($c){.$b ($c)} , (3) + + +includes + + +71-00$a, $g, $h ($c). $b ($c) +71-00$a, $g, $h ($c). $b ($c). $b ($c) +71-00$a, $g, $h ($c). $b ($c). $b ($c). $b ($c) + + + + + + +All other operands are the same as those accepted in the MARC world. + + + + + +Notation of <emphasis>index-formula</emphasis> for Zebra + + +Extended indexing overloads the path of the +elm definition in the abstract syntax file of Zebra +(.abs file). This means that names beginning with +"mc-" are interpreted by Zebra as an +index-formula. The database index is created and +linked with an access point (Bib-1 use attribute) +according to this formula. + +For example, the index-formula + + +71-00$a, $g, $h ($c){.$b ($c)} , (4) + + +in the .abs file looks like: + + +mc-71.00_$a,_$g,_$h_(_$c_){.$b_(_$c_)} + + + +The notation of the index-formula uses the following operands: + + + +_ +It means a whitespace character. + + + +. +The position may contain any value defined by the MARC format. For example, the +index-formula + + +70-#1$a, $g , (5) + + +matches mc-70._1_$a,_$g_ and includes + + +700_1_$a,_$g_ +701_1_$a,_$g_ +702_1_$a,_$g_ + + + + + +{...} +Repeatable elements are enclosed in curly braces {}. 
For example, the +index-formula + + +71#00$a, $g, $h ($c) {.$b ($c)} , (6) + + +matches mc-71.00_$a,_$g,_$h_(_$c_){.$b_(_$c_)} and +includes + + +71.00_$a,_$g,_$h_(_$c_).$b_(_$c_) +71.00_$a,_$g,_$h_(_$c_).$b_(_$c_).$b_(_$c_) +71.00_$a,_$g,_$h_(_$c_).$b_(_$c_).$b_(_$c_).$b_(_$c_) + + + + + + +<...> +An embedded index-formula (for linked fields) is enclosed in <>. For example, the +index-formula + + +4--#-$170-#1$a, $g ($c) , (7) + + +matches mc-4.._._$1<70._1_$a,_$g_(_$c_)>_ and +includes + + +463_._$1<70._1_$a,_$g_(_$c_)>_ + + + + + + + + +All other operands are the same as those accepted in the MARC world. + + + +Examples + + + + + + +indexing LEADER + +Use the keyword "ldr" to index the leader. For example, to index data from the 6th +and 7th positions of the LEADER: + + +elm mc-ldr[6] Record-type ! +elm mc-ldr[7] Bib-level ! + + + + + + +indexing data from control fields + +indexing the date (the time the record was added to the database) + + +elm mc-008[0-5] Date/time-added-to-db ! + + +or for RUSMARC (this data is included in the 100th field) + + +elm mc-100___$a[0-7]_ Date/time-added-to-db ! + + + + + + +using indicators while indexing + +For RUSMARC, the index-formula +70-#1$a, $g matches + + +elm 70._1_$a,_$g_ Author !:w,!:p + + +When Zebra finds a field matching the "70." pattern, it checks +the indicators. In this case the value of the first indicator doesn't matter, but +the value of the second one must be whitespace; otherwise the field is not +indexed. + + + + + +indexing embedded (linked) fields for UNIMARC-based formats + +For RUSMARC, the index-formula +4--#-$170-#1$a, $g ($c) matches + + +elm mc-4.._._$1<70._1_$a,_$g_(_$c_)>_ Author !:w,!:p + + +Data are extracted from the record if the field matches the +"4.._." pattern and the data in the linked field match the embedded +index-formula 70._1_$a,_$g_(_$c_). + + + + + + + + + + + + -- 1.7.10.4
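The two operand lists in the patched document describe a character-for-character correspondence for the tag and indicator positions: "#" (whitespace) in an index-formula becomes "_" in the .abs notation, and "-" (any value) becomes ".". A minimal sketch of that correspondence is shown here, assuming the first five characters of a formula are the three tag positions followed by the two indicators. The helper name `prefix_to_abs` is hypothetical and is not part of Zebra; the sketch deliberately leaves the subfield part of the formula alone, because the document does not spell out a complete rule for it.

```python
# Hypothetical helper, not part of Zebra: it only illustrates the operand
# correspondence described in the document ("#" -> "_", "-" -> ".").
_FORMULA_TO_ABS = str.maketrans({"#": "_", "-": "."})


def prefix_to_abs(prefix: str) -> str:
    """Translate a 5-character tag+indicator prefix of an index-formula.

    Assumption: the prefix is three tag positions followed by two
    indicator positions, as in "70-#1" or "4--#-".
    """
    if len(prefix) != 5:
        raise ValueError("expected 3 tag positions plus 2 indicator positions")
    return prefix.translate(_FORMULA_TO_ABS)


if __name__ == "__main__":
    # Prefixes taken from formulas (4), (5) and (7) in the document.
    for formula_prefix in ("71-00", "70-#1", "4--#-"):
        print(formula_prefix, "->", "mc-" + prefix_to_abs(formula_prefix))
```

The translated prefixes agree with the beginnings of the `mc-` names quoted in the document (`mc-71.00`, `mc-70._1`, `mc-4.._.`).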