relevance: avoid false hits for whitespace tokens
author Adam Dickmeiss <adam@indexdata.dk>
Tue, 18 Sep 2012 12:32:29 +0000 (14:32 +0200)
committer Adam Dickmeiss <adam@indexdata.dk>
Tue, 18 Sep 2012 12:32:29 +0000 (14:32 +0200)
For example, "&" could be normalized to an empty string. That empty
string would occur nowhere else, giving it a high inverse document frequency!
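
A minimal sketch of the guard, for illustration only. The struct below is a
hypothetical, reduced stand-in that models just the two fields visible in the
diff (norm_str and next), and entry_match_sketch is an invented name, not the
real word_entry_match:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-in for struct word_entry: only the
   fields that appear in the diff are modelled here. */
struct word_entry {
    const char *norm_str;
    struct word_entry *next;
};

/* Returns 1 if norm_str equals the normalized string of some entry,
   0 otherwise. An empty normalized token never counts as a hit. */
static int entry_match_sketch(const struct word_entry *entries,
                              const char *norm_str)
{
    assert(norm_str != NULL);
    for (; entries; entries = entries->next)
    {
        /* *norm_str is '\0' for an empty token, so the strcmp
           comparison is skipped and no match is recorded. */
        if (*norm_str && !strcmp(norm_str, entries->norm_str))
            return 1;
    }
    return 0;
}
```

With a list holding "water" and "", looking up "" returns 0 here, whereas a
bare strcmp test would have matched the empty entry.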

src/relevance.c

index b9fc0e1..a3ee82d 100644 (file)
@@ -50,7 +50,7 @@ static int word_entry_match(struct word_entry *entries, const char *norm_str,
 {
     for (; entries; entries = entries->next)
     {
-        if (!strcmp(norm_str, entries->norm_str))
+        if (*norm_str && !strcmp(norm_str, entries->norm_str))
         {
             const char *cp = 0;
             int no_read = 0;