[Ferret-talk] Weird analyzer issue with the word 'fly'

Jens Kraemer jk at jkraemer.net
Thu Apr 9 08:40:21 EDT 2009


Hi Max!

On 09.04.2009, at 13:45, Max Williams wrote:
>
> I'm having a problem with some search terms - I narrowed one of them
> down to the inclusion of the word 'fly'.  Can anyone give me any clues
> as to what might be happening, or even how I can investigate?

First of all I'd have a look at what the analyzer does to your query  
terms:

# Run the analyzer from your model over the query and print each token.
ts = Ferret::Analysis::StemmingAnalyzer.new.token_stream(nil, 'flea fly')
while token = ts.next
  puts token
end

For some reason the analyzer turns the word 'fly' into 'fli'. That is
fine in itself, as long as the same thing happens at indexing time.
Next, use the ferret_browser tool to inspect your index and check
whether the term 'fli' really appears there. I doubt it does, because
if it did, everything would work as expected. So I guess we have a
problem with the analysis at indexing time.
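
To make the suspected mismatch concrete, here is a minimal sketch in
plain Ruby. It does not use Ferret at all: the two analyzers, the
one-entry stem table and the tiny inverted index are all made-up
stand-ins, chosen only to show what happens when the query is stemmed
but the indexed text is not.

```ruby
# Toy stand-ins, not Ferret classes: one analyzer stems, the other does not.
STEMS = { 'fly' => 'fli' }.freeze # hypothetical one-entry stem table

def stemming_analyze(text)
  text.downcase.split.map { |word| STEMS.fetch(word, word) }
end

def plain_analyze(text)
  text.downcase.split
end

# Build a tiny inverted index: term => list of document ids.
def build_index(docs, analyzer)
  index = Hash.new { |hash, key| hash[key] = [] }
  docs.each do |id, text|
    analyzer.call(text).uniq.each { |term| index[term] << id }
  end
  index
end

# A document matches only if it contains every analyzed query term.
def search(index, query, analyzer)
  terms = analyzer.call(query)
  return [] if terms.empty?
  terms.map { |term| index.fetch(term, []) }.reduce(:&)
end

DOCS = { 1 => 'Flea Fly', 2 => 'Flea market' }.freeze

# Indexed without stemming, queried with stemming: 'fli' is never found.
mismatched = build_index(DOCS, method(:plain_analyze))
p search(mismatched, 'flea fly', method(:stemming_analyze))

# Same analyzer on both sides: the query matches document 1.
matched = build_index(DOCS, method(:stemming_analyze))
p search(matched, 'flea fly', method(:stemming_analyze))
```

The first search comes back empty because the index only contains
'fly', while the query asks for 'fli'. With the same analyzer on both
sides the terms line up again.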

> My index is set up like this:
>
> acts_as_ferret({ :store_class_name => true,
>                  :analyzer => Ferret::Analysis::StemmingAnalyzer.new,
>                  :fields => {:name =>            { :boost => 2.0 },
>                              ...
>               }})

Now that I look at this a second time, the problem seems quite
obvious :-) The analyzer option needs to be given as part of a
separate :ferret options hash, like this:

acts_as_ferret :store_class_name => true,
               :ferret => { :analyzer => Ferret::Analysis::StemmingAnalyzer.new },
               :fields => { ... }
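
As an illustration of how a misplaced option can go unnoticed, here is
a small sketch. It is not the actual acts_as_ferret code: the method
names, the default value and the list of known keys are hypothetical.
It only assumes that the analyzer is read from the nested :ferret hash
and that an :analyzer key at the top level is never looked at.

```ruby
# Hypothetical sketch, not acts_as_ferret's real implementation.
DEFAULT_ANALYZER = :default_analyzer # placeholder for whatever default applies

# Assumed set of top-level keys, for illustration only.
KNOWN_TOP_LEVEL_KEYS = [:store_class_name, :fields, :ferret].freeze

# The analyzer is looked up only inside the nested :ferret hash.
def resolve_analyzer(options)
  options.fetch(:ferret, {}).fetch(:analyzer, DEFAULT_ANALYZER)
end

# Top-level keys that nothing reads, and that are therefore ignored.
def unused_keys(options)
  options.keys - KNOWN_TOP_LEVEL_KEYS
end

wrong = { :store_class_name => true, :analyzer => :stemming, :fields => {} }
right = { :store_class_name => true,
          :ferret => { :analyzer => :stemming },
          :fields => {} }

p resolve_analyzer(wrong) # falls back to the default
p unused_keys(wrong)      # the misplaced :analyzer key
p resolve_analyzer(right) # picks up the configured analyzer
```

Under these assumptions the first hash indexes with the default
analyzer while queries are stemmed, which is the mismatch described
above.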

Rebuild your index and everything should work as expected.


Cheers,
Jens


--
Jens Krämer
Finkenlust 14, 06449 Aschersleben, Germany
VAT Id DE251962952
http://www.jkraemer.net/ - Blog
http://www.omdb.org/     - The new free film database
