From mike@seatbooker.net Tue Oct 29 15:12:09 2002
Envelope-to: mike@miketaylor.org.uk
Date: Tue, 29 Oct 2002 14:11:48 GMT
From: Mike Taylor
To: ZNG@loc.gov
Cc: mike@miketaylor.org.uk
Subject: Again: Grammar Tweaks

Dear Everyone,

I sent this message last Friday, and didn't get a delivery failure
message or anything similar; but there has been absolutely zero
response on-list, which makes me wonder whether it mysteriously
didn't get through.

... or surely it didn't get caught by people's "this message is too
complicated to pay attention to" filters?  :-~

 _/|_    _______________________________________________________________
/o ) \/  Mike Taylor  www.miketaylor.org.uk
)_v__/\  "Conclusion: is left to the reader (see Table 2).
         Acknowledgements: I wrote this paper for money" -- A. A. Chastel,
         _A critical analysis of the explanation of red-shifts by a new
         field_, A&A 53, 67 (1976)

------------------------------- cut here -------------------------------

Well, it looks like the CQL grammar has settled down more or less to
everyone's satisfaction.  So it must be time to throw it all up in
the air again!  :-)

No, I'm joking -- mostly.  I'd like to point out one actual mistake
(I think), suggest one substantive change, and request a few cosmetic
changes.  For anyone who's not got it to hand, the URL for the
grammar is
        http://lcweb.loc.gov/z3950/agency/zing/srwu/cql.html

1. I think it's a mistake that the grammar says:

        prox-qualifiers ::= "/" [ unit ] "/" [ relation ] "/" [ distance ] "/" ordering

(and the similar productions that follow) because that allows

        prox/word/exact/3       <--- "exact" is meaningless here

and -- even worse --

        prox/word/=/stem        <--- a relation-modifier!
                                (This is not only silly, but ambiguous too)

So I think all the occurrences of "relation" in the productions for
prox need to be changed to "order-or-equal-relation".

2. The only thing that I'm suggesting we actually _change_ is the
order of the proximity parameters.  Quick!
Close your eyes and tell me the correct order of relation, ordering,
distance and unit?  See -- you can't do it: no-one can :-)

So, based somewhat on Adam's rather more difficult suggestion of a
couple of days ago, I propose that we change the order to:

        relation/distance/unit/ordering

Rationale: you can read it out loud.  If you want to find two clauses
with the conditions "*more* than *5* *sentences* apart", you would
write ``foo prox/>/5/sentence bar''.

3. Cosmetic changes.

3a. The "/" at the beginning of each of the prox-qualifiers
productions can be moved up into the definition of prox, like this:

        prox ::= "prox" [ "/" prox-qualifiers ]

which yields a slightly simpler, neater (but equivalent) grammar.

3b. The things that the grammar calls "index-name", we have been
calling "qualifiers" (and talking about the "qualifier-sets" that
contain them.)  I think that's a much nicer name than "index-name",
in part because it doesn't carry such a loading of implementation
detail.  Also, remember that the way we've designed things, a
qualifier will typically be implemented by multiple indexes (a word
index and a string index) so I don't want to give misleading
impressions.

3b1. :-) That would mean that, in the name of simplicity, we'd need
to rename "prox-qualifiers" to something like "prox-modifiers" or
"prox-parameters" (which is what we've actually been calling them,
4WIW) and rename "qualifier" to something more suggestive such as
"relation-modifier" (which, again, is what we've been using in
prose.)

3c. (Nearly done, honest.)  I think that "order-or-equal-relation" is
a horrible name and would much prefer to call it something like
"numeric-relation", which better explains its role in, for example,
proximity parameters.
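
To make the proposed relation/distance/unit/ordering order concrete,
here is a minimal sketch in Python of how a prox operator written that
way might be checked.  It is only an illustration under stated
assumptions: the function name parse_prox and the dictionary it
returns are hypothetical, not part of any CQL implementation, and it
handles the prox token alone rather than a whole query.  Trailing
parameters may be omitted and leading ones may be left empty, as in
the prox-parameters alternatives proposed in this message.

```python
import re

# Token sets taken from the proposed grammar in this message.
NUMERIC_RELATIONS = {"<", ">", "<=", ">=", "<>", "="}
UNITS = {"word", "sentence", "paragraph", "element"}
ORDERINGS = {"ordered", "unordered"}

# Field names in the proposed order, each with its validity check.
FIELDS = [
    ("relation", lambda s: s in NUMERIC_RELATIONS),
    ("distance", lambda s: re.fullmatch(r"[0-9]+", s) is not None),
    ("unit", lambda s: s in UNITS),
    ("ordering", lambda s: s in ORDERINGS),
]


def parse_prox(token):
    """Parse 'prox' or 'prox/relation/distance/unit/ordering'.

    Hypothetical helper: returns a dict with one entry per parameter,
    using None for parameters that were omitted or left empty.
    """
    head, slash, rest = token.partition("/")
    if head != "prox":
        raise ValueError("not a prox operator: %r" % token)
    result = {name: None for name, _ in FIELDS}
    if not slash:
        return result  # bare "prox" with no parameters
    parts = rest.split("/")
    # At most four parameters, and the last one given must be non-empty.
    if len(parts) > len(FIELDS) or parts[-1] == "":
        raise ValueError("malformed prox parameters: %r" % token)
    for (name, is_valid), part in zip(FIELDS, parts):
        if part == "":
            continue  # optional parameter left empty
        if not is_valid(part):
            raise ValueError("bad %s: %r" % (name, part))
        result[name] = int(part) if name == "distance" else part
    return result
```

With this sketch, parse_prox("prox/>/5/sentence") reads off as
relation ">", distance 5, unit "sentence", while the two problem cases
from point 1, "prox/word/exact/3" and "prox/word/=/stem", are both
rejected with a ValueError.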
So, putting it all together, here's how I think the grammar should
look:

------------------------------- cut here -------------------------------
cql-query ::= cql-query boolean search-clause | search-clause
boolean ::= "and" | "or" | "not" | prox
search-clause ::= "(" cql-query ")"
                | [ qualifier relation ] term
relation ::= base-relation { "/" relation-modifier }
base-relation ::= numeric-relation | "exact" | "all" | "any"
relation-modifier ::= "relevant" | "fuzzy" | "stem"
numeric-relation ::= "<" | ">" | "<=" | ">=" | "<>" | "="
prox ::= "prox" [ "/" prox-parameters ]
prox-parameters ::= [ numeric-relation ] "/" [ distance ] "/" [ unit ] "/" ordering
                  | [ numeric-relation ] "/" [ distance ] "/" unit
                  | [ numeric-relation ] "/" distance
                  | numeric-relation
unit ::= "word" | "sentence" | "paragraph" | "element"
ordering ::= "ordered" | "unordered"
distance ::= non-negative-integer
qualifier ::= [ qualifier-prefix "." ] qualifier-name
qualifier-prefix ::= identifier
qualifier-name ::= identifier
identifier ::= string
term ::= string | ""string""
string ::= a character string
------------------------------- cut here -------------------------------

Hope this helps, and that none of it is controversial.  I guess it
ought not to be, except maybe the change in the order of proximity
parameters.

 _/|_    _______________________________________________________________
/o ) \/  Mike Taylor  www.miketaylor.org.uk
)_v__/\  The IBM 360 had no stack, and that was stupid, short-sighted
         design.  The Cray 2 has no stack either, but that's elegant
         minimalism.