Metaproxy SPARQL module
Index Data

sparql(3mp) Metaproxy Module

NAME
sparql - Metaproxy Module for accessing a triplestore

DESCRIPTION
This module translates the Z39.50 operations init, search, and present into HTTP requests that access a remote triplestore.

Configuration consists of one or more db elements. Each db element describes how to access a specific database. The db element takes the attributes path (the name of the Z39.50 database) and uri (the HTTP access point of the triplestore). Optionally, the schema for the database may be given with the attribute schema.

Each db element takes these elements:

<prefix/>
    Section that maps prefixes to namespaces for RDF vocabularies. The format is the prefix, followed by a colon, followed by the value.

<form/>
    Selects the SPARQL query form. Should start with one of the query forms SELECT or CONSTRUCT.

<criteria/>
    Section that maps static graph patterns for binding variables, narrowing types, and so on, or any other WHERE clause criteria that are static for the Z39.50/SRU database. The final query conversion logic should be able to deduce which optional criteria to include in the generated SPARQL by analyzing the variables required by the query matching and the display fields.

<index type="attribute"/>
    Section used to declare RPN use attribute strings (indexes) and map them to BIBFRAME graph patterns. Items in this section are expanded during RPN query processing, and placeholders (%s, %d) are substituted with query terms. To map a given CQL index (e.g. the default keyword index) to multiple entity properties, SPARQL constructs like `OPTIONAL` or `UNION` can be used.

<modifier/>
    Optional section that allows you to add solution sequences or modifiers.

SCHEMA

EXAMPLE
Configuration for database "Default" that allows searching works. Only the field (use attribute) "bf.wtitle" is supported.

rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns
bf: http://bibframe.org/vocab/
SELECT ?work ?wtitle
?work a bf:Work
?work bf:workTitle ?wt
?wt bf:titleValue ?wtitle
?wt bf:titleValue %v FILTER(contains(%v, %s))
The matching is done by a simple case-sensitive substring match. There is no deduplication, so if a work has two titles, we get two rows.
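As a rough illustration of how the pieces of this configuration might fit together, the following Python sketch assembles a SPARQL query from the prefixes, the form, the criteria, and one expanded index pattern. It is not the module's actual code. The helper names (`quote_literal`, `expand_index`, `build_query`) are hypothetical, the sketch includes every criterion rather than deducing which ones are needed, and the placeholder handling is an assumption: %s is taken to become the quoted search term and %v a generated variable name such as ?v0.

```python
def quote_literal(term: str) -> str:
    """Render a search term as a double-quoted SPARQL string literal."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def expand_index(pattern: str, term: str, var_no: int = 0) -> str:
    """Fill in an index pattern (assumed placeholder semantics)."""
    # Assumption: %v becomes a generated variable, %s the quoted term.
    # %v is replaced first so that a term containing "%v" is left alone.
    return pattern.replace("%v", f"?v{var_no}").replace("%s", quote_literal(term))


def build_query(prefixes, form, criteria, index_clauses):
    """Join the configured parts into one SPARQL query string."""
    lines = []
    for entry in prefixes:
        # Each prefix entry has the form "name: namespace".
        name, uri = entry.split(":", 1)
        lines.append(f"PREFIX {name.strip()}: <{uri.strip()}>")
    lines.append(form)
    lines.append("WHERE {")
    for clause in list(criteria) + list(index_clauses):
        lines.append(f"  {clause} .")
    lines.append("}")
    return "\n".join(lines)


# Values copied from the "Default" database example; "Hamlet" is a made-up term.
prefixes = [
    "rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns",
    "bf: http://bibframe.org/vocab/",
]
criteria = [
    "?work a bf:Work",
    "?work bf:workTitle ?wt",
    "?wt bf:titleValue ?wtitle",
]
index_clause = expand_index(
    "?wt bf:titleValue %v FILTER(contains(%v, %s))", "Hamlet"
)
query = build_query(prefixes, "SELECT ?work ?wtitle", criteria, [index_clause])
print(query)
```

The point of the sketch is only that the index pattern contributes one extra graph pattern and a FILTER to the WHERE clause, which is why a work with two matching titles yields two rows.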
EXAMPLE
A more complex configuration for database "work". This could be included in the same filter section as the "Default" db above.

rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns
bf: http://bibframe.org/vocab/
SELECT ?work
  (sql:GROUP_DIGEST (?wtitle, ' ; ', 1000, 1)) AS ?title
  (sql:GROUP_DIGEST (?creatorlabel, ' ; ', 1000, 1)) AS ?creator
  (sql:GROUP_DIGEST (?subjectlabel, ' ; ', 1000, 1)) AS ?subject
?work a bf:Work
OPTIONAL { ?work bf:workTitle ?wt . ?wt bf:titleValue ?wtitle }
OPTIONAL { ?work bf:creator ?creator . ?creator bf:label ?creatorlabel }
OPTIONAL { ?work bf:subject ?subject . ?subject bf:label ?subjectlabel }
?wt bf:titleValue %v FILTER(contains(%v, %s))
?creator bf:label %v FILTER(contains(%v, %s))
?subject bf:label %v FILTER(contains(%v, %s))
{ ?work ?op1 ?child . ?child ?op2 %v FILTER(contains(STR(%v), %s)) }
GROUP BY $work
This returns one row for each work. Titles, authors, and subjects are all optional. If they repeat, the repeated values are concatenated into a single field, separated by semicolons. This is done by the GROUP_DIGEST function, which is specific to the Virtuoso back end.

This example supports use attributes 4 (title), 1003 (author), 21 (subject), and 1016 (keyword). The keyword index matches any literal in a triple that refers to the work, so it works for the titleValue in the workTitle, as well as the label in the subject, and whatever else there may be.

As in the preceding example, matching is a simple case-sensitive substring match. More realistic term matching could be done with regular expressions, at the cost of some readability, portability, and performance.
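The grouping described here can be pictured with a small Python sketch. It is only an analogy for the effect the text attributes to GROUP_DIGEST (one row per work, repeated values joined with ' ; '), not a model of how Virtuoso implements it. The sample rows and the `group_rows` helper are invented for illustration, and dropping repeated values within a group is an assumption of this sketch.

```python
def group_rows(rows, key, fields, sep=" ; "):
    """Collapse rows sharing `key` into one row, joining each field's values."""
    grouped = {}
    for row in rows:
        entry = grouped.setdefault(row[key], {f: [] for f in fields})
        for f in fields:
            value = row.get(f)
            # None stands for an unbound OPTIONAL variable; skip it.
            # Assumption: a value already seen in this group is not repeated.
            if value is not None and value not in entry[f]:
                entry[f].append(value)
    return [
        {key: k, **{f: sep.join(values) for f, values in entry.items()}}
        for k, entry in grouped.items()
    ]


# Invented sample data: one work with two titles, one work without a creator.
rows = [
    {"work": "w1", "wtitle": "Hamlet", "creatorlabel": "Shakespeare"},
    {"work": "w1", "wtitle": "The Tragedy of Hamlet", "creatorlabel": "Shakespeare"},
    {"work": "w2", "wtitle": "Untitled", "creatorlabel": None},
]
for row in group_rows(rows, "work", ["wtitle", "creatorlabel"]):
    print(row)
```

Without the grouping step, the first work would appear twice, once per title, as in the "Default" example.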
EXAMPLE
Configuration for database "works". This uses CONSTRUCT to produce RDF.

rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns
bf: http://bibframe.org/vocab/
CONSTRUCT { ?work bf:title ?wtitle . ?work bf:instanceTitle ?title . ?work bf:author ?creator . ?work bf:subject ?subjectlabel }
?work a bf:Work
?work bf:workTitle ?wt
?wt bf:titleValue ?wtitle
?wt bf:titleValue %v FILTER(contains(%v, %s))
?work bf:creator ?creator
?creator bf:label ?creatorlabel
?creator bf:label %v FILTER(contains(%v, %s))
?work bf:subject ?subject
?subject bf:label ?subjectlabel
?subject bf:label %v FILTER(contains(%v, %s))
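To make the CONSTRUCT form more concrete, here is a minimal Python sketch of how a template is instantiated once per query solution under standard SPARQL semantics: a template triple that contains an unbound variable is left out of the result, and the result is a set, so identical triples collapse. The `instantiate` helper and the sample solution are invented for illustration. In the patterns shown for this database nothing appears to bind ?title, so the sketch leaves it unbound and the bf:instanceTitle triple is not produced.

```python
# Template copied from the CONSTRUCT form of the "works" example.
TEMPLATE = [
    ("?work", "bf:title", "?wtitle"),
    ("?work", "bf:instanceTitle", "?title"),
    ("?work", "bf:author", "?creator"),
    ("?work", "bf:subject", "?subjectlabel"),
]


def instantiate(template, solutions):
    """Return the set of triples produced by filling the template per solution."""
    triples = set()
    for binding in solutions:
        for triple in template:
            resolved = []
            for term in triple:
                if term.startswith("?"):
                    term = binding.get(term[1:])  # None when the variable is unbound
                resolved.append(term)
            # A triple with any unbound variable is omitted from the output graph.
            if None not in resolved:
                triples.add(tuple(resolved))
    return triples


# Invented sample solution; the ex: identifiers are hypothetical.
solutions = [
    {
        "work": "ex:w1",
        "wtitle": '"Hamlet"',
        "creator": "ex:p1",
        "subjectlabel": '"Denmark"',
    }
]
for triple in sorted(instantiate(TEMPLATE, solutions)):
    print(" ".join(triple), ".")
```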
EXAMPLE
Configuration for database "instance". Like "work" above, this uses SELECT to return row-based data, this time from the instances. This is not deduplicated, so if an instance has two titles, we get two rows, and if it also has two formats, we get four rows. The DISTINCT in the SELECT only removes rows that are identical in every column.

rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns
bf: http://bibframe.org/vocab/
SELECT DISTINCT ?instance ?title ?format
?instance a bf:Instance
?instance bf:title ?title
?instance bf:title %v FILTER(contains(%v, %s))
?instance bf:format ?format
?instance bf:format %s
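The row counts mentioned for this database follow from a SELECT returning one row per combination of matching values. The following Python sketch reproduces that arithmetic with invented sample data and a hypothetical `select_rows` helper: two titles and two formats give four rows, and dropping exact duplicates leaves all four because no two rows are identical.

```python
from itertools import product


def select_rows(instance, titles, formats, distinct=True):
    """Build one row per (title, format) combination for a single instance."""
    rows = [(instance, title, fmt) for title, fmt in product(titles, formats)]
    if distinct:
        # Drop rows that are identical in every column, keeping first-seen order.
        rows = list(dict.fromkeys(rows))
    return rows


# Invented sample data; "ex:i1" is a hypothetical instance identifier.
rows = select_rows(
    "ex:i1",
    ["Hamlet", "Hamlet, Prince of Denmark"],
    ["print", "ebook"],
)
print(len(rows))  # prints 4
```

Collapsing these into one row per instance would need grouping, as in the "work" database.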
SEE ALSO
metaproxy(1)