X-Git-Url: http://git.indexdata.com/?p=idzebra-moved-to-github.git;a=blobdiff_plain;f=doc%2Ftutorial.xml;h=e96f9427b467427babea35c48fe83f8063c79d46;hp=6e18a247f47a502714c15b2b6675212dd458fc36;hb=1d5d4f08cb84516d75fcb5e6ed4199b6454cccd6;hpb=bd8bb26da2fffe069a9624f253695419eb5c6475

diff --git a/doc/tutorial.xml b/doc/tutorial.xml
index 6e18a24..e96f942 100644
--- a/doc/tutorial.xml
+++ b/doc/tutorial.xml
@@ -22,8 +22,8 @@
 Additional OAI test records can be downloaded by running a shell
- script (you may want to abort the script when you have waitet
- longer than your coffe brews ..).
+ script (you may want to abort the script when you have waited
+ longer than your coffee brews ..).
 cd data
 ./fetch_OAI_data.sh
@@ -105,7 +105,7 @@
 Searching and retrieving &acro.xml; records is easy. For example,
- you can point your browser to one of the following url's to
+ you can point your browser to one of the following URLs to
 search for the term the. Just point your browser at this link:
- These URL's woun't work unless you have indexed the example data
+ These URLs won't work unless you have indexed the example data
 and started an &zebra; server as outlined in the previous section.
 In case we actually want to retrieve one record, we need to alter
- our URl to the following
+ our URL to the following
 http://localhost:9999/?version=1.1&operation=searchRetrieve&x-pquery=the&startRecord=1&maximumRecords=1&recordSchema=dc
@@ -159,7 +159,7 @@
 conf/oai2dc.xsl, and the zebra schema implemented in
 conf/oai2zebra.xsl.
- The URL's for acessing both are the same, except for the different
+ The URLs for accessing both are the same, except for the different
 value of the recordSchema parameter:
 http://localhost:9999/?version=1.1&operation=searchRetrieve&x-pquery=the&startRecord=1&maximumRecords=1&recordSchema=dc
@@ -208,17 +208,17 @@
 The &acro.oai; indexing example defines many different index names, a study
 of the conf/oai2index.xsl stylesheet reveals the
 following word type indexes (i.e.
 those
- swith suffix :w):
+ with suffix :w):
 any:w
- dc_title:w
- dc_creator:w
- dc_subject:w
- dc_description:w
- dc_contributor:w
- dc_publisher:w
- dc_language:w
- dc_rights:w
+ title:w
+ author:w
+ subject:w
+ description:w
+ contributor:w
+ publisher:w
+ language:w
+ rights:w
 By default, searches do access the any:w index,
 but we can direct searches to any access point by constructing the
@@ -226,9 +226,9 @@
 we use
+ 1=title the&startRecord=1&maximumRecords=1&recordSchema=dc">
 http://localhost:9999/?version=1.1&operation=searchRetrieve&x-pquery=@attr
- 1=dc_title the&startRecord=1&maximumRecords=1&recordSchema=dc
+ 1=title the&startRecord=1&maximumRecords=1&recordSchema=dc
@@ -236,45 +236,45 @@
 Similar we can direct searches to the other indexes defined.
 Or we can create boolean combinations of searches on different indexes.
 In this case we search for the in
- dc_title and for fish in
- dc_description using the query
- @and @attr 1=dc_title the @attr 1=dc_description fish.
+ title and for fish in
+ description using the query
+ @and @attr 1=title the @attr 1=description fish.
 http://localhost:9999/?version=1.1&operation=searchRetrieve&x-pquery=@and
- @attr 1=dc_title the
- @attr 1=dc_description fish&startRecord=1&maximumRecords=1&recordSchema=dc
+ @attr 1=title the
+ @attr 1=description fish&startRecord=1&maximumRecords=1&recordSchema=dc
-
+
 Investigating the content of the indexes
- How doess the magic work? What is inside the indexes? Why is a certain
- record foound by a search, and another not?. The answer is in the
- inverterd indexes. You can easily investigate them using the
+ How does the magic work? What is inside the indexes? Why is a certain
+ record found by a search, and another not?. The answer is in the
+ inverted indexes. You can easily investigate them using the
 special &zebra; schema
 zebra::index::fieldname.
 In this example you
- can see that the dc_title index has both word
+ can see that the title index has both word
 (type :w) and phrase (type :p) indexed fields,
-
- http://localhost:9999/?version=1.1&operation=searchRetrieve&x-pquery=the&startRecord=1&maximumRecords=1&recordSchema=zebra::index::dc_title
+
+ http://localhost:9999/?version=1.1&operation=searchRetrieve&x-pquery=the&startRecord=1&maximumRecords=1&recordSchema=zebra::index::title
 But where in the indexes did the term match for the query occur?
 Easily answered with the special &zebra; schema
- zebra::snippet. The matching terma are
+ zebra::snippet. The matching terms are
 encapsulated by <s> tags.
 http://localhost:9999/?version=1.1&operation=searchRetrieve&x-pquery=the&startRecord=1&maximumRecords=1&recordSchema=zebra::snippet
@@ -286,9 +286,9 @@
 found inside my hit set? Try the special &zebra; schema
 zebra::facet::fieldname:type.
 In this case, we investigate additional search terms for the
- dc_title:w index.
-
- http://localhost:9999/?version=1.1&operation=searchRetrieve&x-pquery=the&startRecord=1&maximumRecords=1&recordSchema=zebra::facet::dc_title:w
+ title:w index.
+
+ http://localhost:9999/?version=1.1&operation=searchRetrieve&x-pquery=the&startRecord=1&maximumRecords=1&recordSchema=zebra::facet::title:w
@@ -296,8 +296,8 @@
 One can ask for multiple facets. Here, we want them from phrase
 indexes of type :p.
-
- http://localhost:9999/?version=1.1&operation=searchRetrieve&x-pquery=the&startRecord=1&maximumRecords=1&recordSchema=zebra::facet::dc_publisher:p,dc_title:p
+
+ http://localhost:9999/?version=1.1&operation=searchRetrieve&x-pquery=the&startRecord=1&maximumRecords=1&recordSchema=zebra::facet::publisher:p,title:p
@@ -310,13 +310,13 @@
 The &acro.sru; specification mandates that the &acro.cql; query
 language is supported and properly configure.
Also, the server - needs to be able to emmit a proper &acro.explain; &acro.xml; + needs to be able to emit a proper &acro.explain; &acro.xml; record, which is used to determine the capabilities of the specific server instance. - In this example configuration we expoit the similarities between + In this example configuration we exploit the similarities between the &acro.explain; record and the &acro.cql; query language configuration, we generate the later from the former using an &acro.xslt; transformation. @@ -326,7 +326,7 @@ - The we are all set to start the &acro.sru;/acro.z3950; server including + We are all set to start the &acro.sru;/acro.z3950; server including &acro.pqf; and &acro.cql; query configuration. It uses the &yaz; frontend server configuration - just type @@ -374,11 +374,11 @@ url="http://localhost:9999/?version=1.1&operation=scan&scanClause=dc.identifier=fish"> http://localhost:9999/?version=1.1&operation=scan&scanClause=dc.identifier=fish - accesses the indexed indentifiers. + accesses the indexed identifiers. 
- In addition, all &zebra; internal special elemen sets or record + In addition, all &zebra; internal special element sets or record schema's of the form zebra:: just work right out of the box elements zebra::facet::any:w Z> show 1+1 - Z> elements zebra::facet::dc_publisher:p,dc_title:p + Z> elements zebra::facet::publisher:p,title:p Z> show 1+1 @@ -478,10 +478,10 @@ Z> find @attr 1=oai_setspec @attr 4=3 7374617475733D756E707562 Z> show 1+1 - Z> find @attr 1=dc_title communication + Z> find @attr 1=title communication Z> show 1+1 - Z> find @attr 1=dc_identifier @attr 4=3 + Z> find @attr 1=identifier @attr 4=3 http://resolver.caltech.edu/CaltechCSTR:1986.5228-tr-86 Z> show 1+1 @@ -498,8 +498,8 @@ Z> scan @attr 1=oai_datestamp @attr 4=3 1 Z> scan @attr 1=oai_setspec @attr 4=3 2000 Z> - Z> scan @attr 1=dc_title communication - Z> scan @attr 1=dc_identifier @attr 4=3 a + Z> scan @attr 1=title communication + Z> scan @attr 1=identifier @attr 4=3 a @@ -555,8 +555,8 @@ Notice that searching and scan on indexes - dc_contributor, dc_language, - dc_rights, and dc_source + contributor, language, + rights, and source might fail, simply because none of the records in the small example set have these fields set, and consequently, these indexes might not been created.
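
Reviewer note: the patch renames the index access points from `dc_title`-style names to plain `title`-style names in every example URL. The sketch below shows one way to build the renamed SRU URLs programmatically so they can be checked against a running server. It assumes the tutorial's server address `http://localhost:9999/` and the query parameters shown in the patched URLs. The helper name `sru_search_url` is hypothetical and not part of Zebra or YAZ.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: the example server from the tutorial, listening on port 9999.
BASE_URL = "http://localhost:9999/"


def sru_search_url(pquery, schema="dc", start=1, maximum=1, base=BASE_URL):
    """Build an SRU searchRetrieve URL carrying a PQF query in x-pquery.

    Hypothetical helper for illustration; parameter names mirror the
    URLs shown in the tutorial text.
    """
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "x-pquery": pquery,
        "startRecord": start,
        "maximumRecords": maximum,
        "recordSchema": schema,
    }
    # urlencode percent-encodes reserved characters such as '@' and '='
    # and turns spaces into '+', so the PQF query survives transport.
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # Search the renamed title index.
    print(sru_search_url("@attr 1=title the"))
    # Boolean combination over two renamed indexes.
    print(sru_search_url("@and @attr 1=title the @attr 1=description fish"))
    # Facets from two phrase indexes, using the special zebra schema.
    print(sru_search_url("the", schema="zebra::facet::publisher:p,title:p"))
```

Unlike the hand-written URLs in the tutorial, the generated URLs are percent-encoded. A conforming HTTP server decodes them back to the same parameter values.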