[Brug-talk] Ferret - sphinx - Solr

Peter De Berdt (10-forward) peter at 10-forward.be
Mon Jan 21 04:47:27 EST 2008


On 21 Jan 2008, at 01:03, Peter Vandenabeele wrote:

> We are evaluating different solutions for search in Rails. The
> "logical" choice proposed is Ferret (a Lucene clone for Rails). But
> now I bumped into this recent thread (started Jan 4, 2008):
>
>   http://www.ruby-forum.com/topic/137629#616449
>
> where a number of people mention serious stability problems with
> Ferret. The alternative (with fewer features but far more stable and
> performant) that seems to be proposed is "sphinx". From this recent
> thread, Ferret seems too risky for the first version. I might even
> just use a plain simple SQL "like" or "rlike" or a "fulltext match" to
> start with. Actually, for PostgreSQL tsearch2 gets thumbs up (but we
> settled for MySQL for now ...).
>
> Any hints from local experiences? Thanks in advance ...

We've been using Ferret in development and now in production, and
I must say our experiences are... bad.

In development mode, our indexes got corrupted quite often, which
either returned bad results or crashed the application outright.
Sometimes reindexing the models with a rake task solved the problem,
but at other times we had to delete the index folder by hand to get
things going again.

Then comes production mode. Because of concurrency issues (multiple
Mongrels accessing and updating the index at the same time), you have
to rely on backgroundrb (which is included in the acts_as_ferret
plugin). We're using quite complex indices, with quite a few related
fields indexed along with the main record, as well as multi-model
searches. The problems we face are the following:
• At irregular intervals, errors like this pop up:
    undefined method `to_doc' for #<DRb::DRbUnknown:0xb7468c78>
  Googling around turned up a hacky patch, and the plugin developers
  don't have a clue as to where the problem lies.
• The backgroundrb server just halts (i.e. the process is killed).
• The index gets corrupted anyway, for no apparent reason.
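
To make the concurrency issue mentioned above concrete, here is a
minimal sketch of one generic way to keep several processes from
updating shared on-disk state at the same time: an exclusive file
lock. This is not what Ferret, acts_as_ferret or backgroundrb do
internally. The helper names (with_index_lock, increment_counter,
demo_locked_updates) and the "write.lock" file are made up for
illustration, a plain counter file stands in for the index, and the
demo assumes a Unix-like system where fork is available.

```ruby
require "tmpdir"

# Hypothetical helper: run the block while holding an exclusive lock
# on a lock file inside the index directory.
def with_index_lock(index_dir)
  lock_path = File.join(index_dir, "write.lock")
  File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |lock|
    lock.flock(File::LOCK_EX) # blocks while another process holds the lock
    yield
  end # closing the lock file releases the lock
end

# Stand-in for an index update: a read-modify-write on a counter file.
# Without the lock, concurrent writers could overwrite each other.
def increment_counter(index_dir)
  with_index_lock(index_dir) do
    path = File.join(index_dir, "counter")
    current = File.exist?(path) ? File.read(path).to_i : 0
    File.write(path, (current + 1).to_s)
  end
end

# Start several worker processes (think: several Mongrels) that all
# update the same counter, then return the final value.
def demo_locked_updates(workers = 4, updates_per_worker = 25)
  Dir.mktmpdir do |dir|
    pids = Array.new(workers) do
      fork do
        updates_per_worker.times { increment_counter(dir) }
        exit!(0) # leave the child without running inherited exit hooks
      end
    end
    pids.each { |pid| Process.wait(pid) }
    File.read(File.join(dir, "counter")).to_i
  end
end
```

With the lock in place, demo_locked_updates(4, 25) should account for
every one of the 4 x 25 updates. Funnelling all writes through a
single server process, as the DRb approach does, aims at the same
guarantee by different means.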

Now, the first thing you always suspect is human error, i.e. we
made a mistake somewhere ourselves and need to fix it. Having gone
through every single line of our code, I can honestly say I'm quite
sure this is not the case. That narrows it down to three possible
culprits: Ferret itself, acts_as_ferret or backgroundrb. I guess I
don't have to tell you I'm not particularly happy with this.

Over the last couple of days, I've been looking into other possible
solutions, such as acts_as_solr, acts_as_searchable (using
Hyperestraier), and the MySQL approach with an extra MyISAM table,
the name of which escapes me at the moment. I'm currently leaning
towards Solr, but Sphinx looks mighty tempting. I might run a few
tests with it to see how well it copes. Does anyone have any
real-world experience?

Best regards.


Peter De Berdt

______________________
10-forward
Zwarteweg 28
B-8433  Middelkerke
Mobile : (0473) 38 35 86
info at 10-forward.be
http://www.10-forward.be
______________________



