From: marc
To: Mike Taylor, adam@indexdata.dk, Ralph LeVan
Subject: SRU Server lint/tester
Date: Tue, 23 May 2006 09:43:08 +0200

Mike Taylor wrote:
> Guys,
>
> Ralph's fixed his SRU-lint web-page.  It takes a few minutes to get
> its head around foo.indexdata.dk, but it does manage, and has some
> useful things to say.
>

Hej Ralph

Thanks for a fast fix of your SRU Server tester
http://alcme.oclc.org/srw/SRUServerTester.html
for the Alvis-Zebra SRU server URL
http://foo.indexdata.dk

I think your SRU checker does a nice job. And it did indeed find some
errors in our implementation, so we all can improve. Thanks for that!

I have a couple of small comments:

1) You test scan like this:

http://foo.indexdata.dk?version=1.1&scanClause=rec.id+=+dog&operation=scan&responsePosition=1&maximumTerms=5

I think you should always add responsePosition=3 to the mix, as there
might be indexes (and there are here!) where the term 'dog' comes
lexicographically after the last index entry, and you get a fat empty
version 1.1 scan response. But using

http://foo.indexdata.dk/?version=1.1&operation=scan&scanClause=rec.id+%3D+%28dog%29&responsePosition=3&maximumTerms=5&stylesheet=

gives you a version 1.1 response with two terms, each with a record
count of 1:

FFFEBE6A0D7773AF401A728D5C818AEB  1
FFFF3A78648AC540304B1F50A2C0D644  1

By the way, an empty index is not that useful, but I think it's not
necessarily an error to have one unpopulated index, so using a warning
on your side is a good choice, I feel.

2) Scan relation 'exact'

You try

http://foo.indexdata.dk?version=1.1&scanClause=rec.id+exact+dog&operation=scan&responsePosition=1&maximumTerms=5

with relation 'exact', but my explain never told you that the server
supports relation 'exact'. I think the correct response from my server
should be a fatal diagnostic, and the correct test result on your side
should have been 'diagnostic this-and-that expected'. Unless it is
mandatory that every index supports 'exact', in which case an error
should be reported (I need to look in the specs to be sure ..)

3) Test of searchRetrieve

You are doing a decent job here. I have a suggestion for a slight
improvement: you might want to use your information from a scan in the
following way:

- search for a term _not_ found in the index, and see that there are
  0 hits (and that the response is correct)
- search for a term found in the scan response, and see that the
  number of hits equals the number of hits claimed in the scan response

I know, it's more work to check the numbers, but it's also a nice
sanity check on top of a syntax/protocol check.

4) Huge records: some of the records are insanely huge (up to 5 MB of XML).
For example, you hit one here:

http://foo.indexdata.dk?version=1.1&query=alvis.entity-disease+=+"dominant optic atrophy"&operation=searchRetrieve&maximumRecords=1

You might want to test the size of the XML response message before
doing anything else with it, and report a warning like 'response too
huge to be tested, exceeds X MB of XML'. (I know, I should use another
default schema here to give small records, for example the 'dc' schema.
We have to improve too ..)

5) recordSchema

In these cases, one might want to try the other record schemas to see
if one gets something useful there .. I did not see you testing any
recordSchema of those I did mention, nor testing non-existent record
schemas ..

6) In general, testing with wrong arguments in almost any place one can
is a good idea as well, as the hardest part of SRU/SRW is to get the
diagnostics right .. for example a record position that is too high, a
non-existing relation or index, a non-existing operation .. etc ..
(Yes, I know, this is a huge amount of work ..)

7) Finally, usability: I think it's a nice idea to have this report
formatted in XHTML tables, with nice links to click on for each test
case, so that one can just execute the request which produced the
errors/warnings. This is of course only eye-candy, but also a cheap
usability improvement.

Still, I think you did a very decent job, and created a very useful
service. Thank you for the problems discovered with my service. I have
some programming to do as well ..

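To make points 1 and 3 a bit more concrete, here is a rough Python sketch of the scan request and the scan/search cross-check. It is only an illustration, not how the tester or our server actually works: the function names are made up, XML elements are matched by local name so that no particular namespace is assumed, and fetching the responses over HTTP is left out.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def scan_url(base, index, term, position=3, max_terms=5):
    """Build an SRU 1.1 scan URL (hypothetical helper).

    responsePosition defaults to 3, so terms sorting before the
    requested one are returned even if it lies past the last entry.
    """
    params = {
        "version": "1.1",
        "operation": "scan",
        "scanClause": f"{index} = ({term})",
        "responsePosition": position,
        "maximumTerms": max_terms,
    }
    return base.rstrip("/") + "/?" + urlencode(params)


def _local(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def scan_terms(xml_text):
    """Map each term value in a scan response to its numberOfRecords."""
    counts = {}
    for element in ET.fromstring(xml_text).iter():
        if _local(element.tag) != "term":
            continue
        fields = {_local(c.tag): (c.text or "").strip() for c in element}
        if "value" in fields and "numberOfRecords" in fields:
            counts[fields["value"]] = int(fields["numberOfRecords"])
    return counts


def hit_count(xml_text):
    """Read the top-level numberOfRecords of a searchRetrieve response."""
    for child in ET.fromstring(xml_text):
        if _local(child.tag) == "numberOfRecords":
            return int((child.text or "").strip())
    raise ValueError("response has no numberOfRecords element")


def count_mismatches(scan_counts, search_hits):
    """Return {term: (scan count, search hits)} where the two disagree."""
    return {
        term: (expected, search_hits.get(term))
        for term, expected in scan_counts.items()
        if search_hits.get(term) != expected
    }
```

A tester could call scan_terms() on the scan response, run one searchRetrieve per returned term, collect hit_count() for each, and report anything that count_mismatches() returns as a warning. A term known to be absent from the index can be checked the same way with an expected count of 0.
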
Marc Cromme
Index Data

> ------- start of forwarded message -------
> From: "LeVan,Ralph"
> To: "Mike Taylor"
> Subject: RE: [Adam Dickmeiss: Re: [Tech-alert] SRU Server lint/tester]
> Date: Mon, 22 May 2006 15:06:28 -0400
>
> Let's just pretend I didn't send that last email, okay?
>
> So, lovely Explain records we're having today!
>
> I've fixed the blow up.  You've got a couple of searches that return
> very large records that I think are peculiar.  Your stylesheet can't
> render them and I report that I'm getting an error 400 from them.
>
> Run the test against the server again and let me know if there's
> something more I should be doing.
>
> Thanks, Mike!
>
> Ralph
>
>>-----Original Message-----
>>From: Mike Taylor [mailto:mike@miketaylor.org.uk]
>>Sent: Friday, May 19, 2006 4:29 AM
>>To: LeVan,Ralph
>>Subject: [Adam Dickmeiss: Re: [Tech-alert] SRU Server lint/tester]
>>
>>Hi, Ralph.  FYI, it seems that your SRU lint barfs on our server.
>>
>>------- start of forwarded message -------
>>From: Adam Dickmeiss
>>Sender: tech-alert-bounces@lists.indexdata.dk
>>To: Announcements/discussion of interesting technology
>>Subject: Re: [Tech-alert] SRU Server lint/tester
>>Date: Thu, 18 May 2006 18:37:00 +0200
>>
>>marc wrote:
>>
>>>SRU Server tester
>>>
>>>Try it: surf into
>>>http://alcme.oclc.org/srw/SRUServerTester.html
>>>
>>>and give the Alvis-Zebra SRU server URL
>>>http://foo.indexdata.dk
>>>into the box.
>>>
>>>Have fun!
>>
>>Sort of. I get a big Java runtime exception. Nice.
>>
>>/ Adam
>>
>>>Marc
>>
>>_______________________________________________
>>Tech-alert mailing list
>>Tech-alert@lists.indexdata.dk
>>http://lists.indexdata.dk/cgi-bin/mailman/listinfo/tech-alert
>>------- end of forwarded message -------
>
> ------- end of forwarded message -------

-- 
Marc Cromme, cand. polyt, Ph.D
Senior Developer, Project Manager
Index Data Aps
Købmagergade 43, 2
1150 Copenhagen K.
Denmark
tel: +45 3341 0100
fax: +45 3341 0101
http://www.indexdata.com

INDEX DATA Means Business for Open Source and Open Standards