From alex at liivid.com Sat Nov 3 08:49:17 2007
From: alex at liivid.com (Alex Neth)
Date: Sat, 3 Nov 2007 20:49:17 +0800
Subject: [Ferret-talk] Performance before and after optimization
Message-ID: <39DF135A-146B-477D-90C8-ED1C7A309988@liivid.com>
I have an index with a few hundred thousand records. The index is
generally very fast, with sub-100ms responses. However, if I start
adding records, it gets extremely slow, up to over 2 seconds per
query. This remains true even when I am not currently indexing, until
I optimize the index.
In order to work around this, I index in bulk and immediately
optimize. This is not ideal for the performance of my site.
Unfortunately, contrary to what Dave Balmain seems to say here:
http://osdir.com/ml/lang.ruby.ferret.general/2006-08/msg00037.html ,
the index seems to be locked for reading during optimization.
So I have two questions:
1) Why does the performance degrade so badly after adding just a few
records, unless I optimize the index? Can I avoid this?
2) Can I keep a second index so that it doesn't get locked during
optimization and then switch to the optimized index? Perhaps the index
is not really locked and it is just using all the CPU? (I am using a
single CPU server)?
Thanks for any help.
-Alex
From hongli at plan99.net Sun Nov 4 13:22:49 2007
From: hongli at plan99.net (Hongli Lai)
Date: Sun, 04 Nov 2007 19:22:49 +0100
Subject: [Ferret-talk] Searching different fields based on document
permissions
Message-ID: <472E0DF9.7020209@plan99.net>
I'm currently writing a system that stores user-created documents. Each
user belongs to a specific group, and the system supports multiple
groups. The thing is, my users want to be able to hide pieces of a
document from other groups. So for example, let's say Joe of team A has
written this document:
"Hello all, our secret plan is finally complete! We will
begin our mission of world domination at 12:00 PM tomorrow."
If Jane of team B views this document, she'll only see the text:
"Hello all, our secret plan is finally complete!"
Only other people in team A will be able to see the original message.
So each document essentially has two versions of contents: one without
private information, and one with private information.
My users were very specific about this feature and want it no matter
what. But this poses a problem for searching. Is it possible to tell
Ferret the following?
- Search all documents with the given search terms, but:
* Search in the field content_without_private_information if the
document does not belong to team A.
* Search in the field content_with_private_information if the
document belongs to team A.
I've taken a quick look at the tutorial, and I've purchased the Ferret
book by O'Reilly. But so far I can't seem to find anything that makes
this possible. Is it possible at all? Or are there other possible
alternatives?
From scottd at gmail.com Sun Nov 4 19:39:31 2007
From: scottd at gmail.com (Scott Davies)
Date: Sun, 4 Nov 2007 16:39:31 -0800
Subject: [Ferret-talk] Searching different fields based on document
permissions
In-Reply-To: <472E0DF9.7020209@plan99.net>
References: <472E0DF9.7020209@plan99.net>
Message-ID: <75f591160711041639j37be4fd4h4c13c41e9ee4f499@mail.gmail.com>
It's trivial if you construct your query tree manually, which you'll
probably have to do for your security purposes (as opposed to using
one of the existing query parsers)...the first argument to TermQuery's
constructor is which field to search.
From hongli at plan99.net Mon Nov 5 05:09:28 2007
From: hongli at plan99.net (Hongli Lai)
Date: Mon, 05 Nov 2007 11:09:28 +0100
Subject: [Ferret-talk] Searching different fields based on
document permissions
In-Reply-To: <75f591160711041639j37be4fd4h4c13c41e9ee4f499@mail.gmail.com>
References: <472E0DF9.7020209@plan99.net>
<75f591160711041639j37be4fd4h4c13c41e9ee4f499@mail.gmail.com>
Message-ID: <472EEBD8.4040002@plan99.net>
Scott Davies wrote:
> It's trivial if you construct your query tree manually, which you'll
> probably have to do for your security purposes (as opposed to using
> one of the existing query parsers)...the first argument to TermQuery's
> constructor is which field to search.
I found out that I'll need a query that looks like this:
"(group_id:#{group_id} AND private_content:#{search_term}) OR
(public_content:#{search_term})"
The query parser seems to generate a BooleanQuery at the top-level. I
spent several hours reading the book and the API, but I could not find a
way to generate 'OR' boolean queries. The API only allows :must,
:must_not and :should. How can I construct an OR query like the one above?
From jk at jkraemer.net Mon Nov 5 05:23:48 2007
From: jk at jkraemer.net (Jens Kraemer)
Date: Mon, 5 Nov 2007 11:23:48 +0100
Subject: [Ferret-talk] Searching different fields based on
document permissions
In-Reply-To: <472EEBD8.4040002@plan99.net>
References: <472E0DF9.7020209@plan99.net>
<75f591160711041639j37be4fd4h4c13c41e9ee4f499@mail.gmail.com>
<472EEBD8.4040002@plan99.net>
Message-ID: <20071105102348.GX19167@thunder.jkraemer.net>
On Mon, Nov 05, 2007 at 11:09:28AM +0100, Hongli Lai wrote:
> Scott Davies wrote:
> > It's trivial if you construct your query tree manually, which you'll
> > probably have to do for your security purposes (as opposed to using
> > one of the existing query parsers)...the first argument to TermQuery's
> > constructor is which field to search.
>
> I found out that I'll need a query that looks like this:
>
> "(group_id:#{group_id} AND private_content:#{search_term}) OR
> (public_content:#{search_term})"
>
> The query parser seems to generate a BooleanQuery at the top-level. I
> spent several hours reading the book and the API, but I could not find a
> way to generate 'OR' boolean queries. The API only allows :must,
> :must_not and :should. How can I construct an OR query like the one above?
Ferret creates OR queries by default, so a query string like
'term1 term2' means the same as 'term1 OR term2'.
Using the API, :should is the correct modifier to create ORed boolean
clauses.
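Since OR is available in the query string, the permission-aware query
from earlier in this thread can be assembled as plain text before it is
handed to the parser. The following is only a minimal sketch: the helper
names escape_fql and permission_query are hypothetical, and the set of
escaped characters is an assumption that should be checked against the
QueryParser documentation for the Ferret version in use.

```ruby
# Characters assumed to be special to the query parser (an assumption,
# verify against the QueryParser docs for your Ferret version).
FQL_SPECIAL = /([:()\[\]{}!+"~^\-|<>=*?\\])/

# Hypothetical helper: backslash-escape special characters in user input.
def escape_fql(text)
  text.to_s.gsub(FQL_SPECIAL) { "\\#{$1}" }
end

# Hypothetical helper: build the query string quoted earlier in the thread.
# Private content is only searched for documents of the given group.
def permission_query(group_id, search_term)
  term = escape_fql(search_term)
  "(group_id:#{escape_fql(group_id)} AND private_content:#{term}) " \
    "OR (public_content:#{term})"
end

puts permission_query(7, "plan")
```

Escaping the search term matters here because an unescaped ':' or ')'
typed by a user could otherwise change which field is searched.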
Jens
--
Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database
From jk at jkraemer.net Mon Nov 5 05:29:16 2007
From: jk at jkraemer.net (Jens Kraemer)
Date: Mon, 5 Nov 2007 11:29:16 +0100
Subject: [Ferret-talk] Performance before and after optimization
In-Reply-To: <39DF135A-146B-477D-90C8-ED1C7A309988@liivid.com>
References: <39DF135A-146B-477D-90C8-ED1C7A309988@liivid.com>
Message-ID: <20071105102916.GY19167@thunder.jkraemer.net>
On Sat, Nov 03, 2007 at 08:49:17PM +0800, Alex Neth wrote:
[..]
> 2) Can I keep a second index so that it doesn't get locked during
> optimization and then switch to the optimized index? Perhaps the index
> is not really locked and it is just using all the CPU? (I am using a
> single CPU server)?
If you're already indexing in batches, keeping a second read-only index for
searching is a good idea. rsync is useful to keep the search-index up to
date in this case.
To check if CPU usage is a problem, try lowering the optimizing process'
priority and see how it goes.
Jens
--
Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database
From pjones at pmade.com Mon Nov 5 12:51:38 2007
From: pjones at pmade.com (Peter Jones)
Date: Mon, 5 Nov 2007 10:51:38 -0700
Subject: [Ferret-talk] Unified ferret_start and ferret_stop
Message-ID:
Sorry for the top posting. I posted this to the ruby-forum site on
the 25th of October but it doesn't seem to have made its way to this
mailing list. Here is the original post:
---
I've attached my first set of changes. The attached archive includes a
README file with information about what I've changed and why.
These changes are only for Unix-like operating systems, for now. If
you like the changes I've made, I'll integrate the Windows code from
the various scripts in the script directory.
Let me know if you have any questions.
---
The patches were originally attached to the forum posting. The URL to
the patches is therefore: http://www.ruby-forum.com/attachment/780/patches.tar.gz
Thanks.
--
Peter Jones
pmade inc. - http://pmade.com
From pjones at pmade.com Mon Nov 5 12:46:20 2007
From: pjones at pmade.com (Peter Jones)
Date: Mon, 5 Nov 2007 10:46:20 -0700
Subject: [Ferret-talk] Partial Class Definition if Ferret Server Not Running
Message-ID: <57F9859C-C936-4E05-B0F9-F64173DC0C79@pmade.com>
When using a remote ferret server, if the ferret server is not running
the acts_as_ferret class method will raise an exception. This causes
the model class to only be partially defined, and therefore all use of
that class in the rails application will explode until the rails
process is restarted.
This stems from the fact that ensure_index_exists is called on the
server just before the end of the acts_as_ferret class method. This
brings up a few questions:
1) Why can't remote_index call ensure_index_exists on the fly similar
to how local_index does it? Can't this be done in the server on the
fly? What about rebuilding all indexes in the server using
ensure_index_exists at start up time, instead of being called for each
class during class definition?
2) There seems to be a lot of generic functionality in local_index
that could be moved up to the abstract index, and therefore expand the
functionality of the remote_index class. Are there any reasons this
hasn't been done yet?
Either way, this needs to be corrected because allowing an exception
to be raised during class definition is a very bad thing. I'd be more than
happy to submit a patch if someone points me in the right direction
regarding the correct way to resolve this (in remote_index or
ferret_server).
Having the ferret_server check the indexes when it starts seems to be
the correct idea, instead of having them checked once for each class
in each rails process as it starts. Thanks.
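One way to picture the "on the fly" idea from question 1 is to have the
class macro only record its configuration and postpone the server round
trip until the index is first used. The sketch below is plain Ruby and
does not use the acts_as_ferret code: FakeRemoteServer, LazyIndexCheck,
acts_as_indexed and remote_index are hypothetical names made up for
illustration, and only ensure_index_exists is taken from the description
above.

```ruby
# Stand-in for the remote index server, only here to make the sketch run.
class FakeRemoteServer
  attr_accessor :running
  attr_reader :checked

  def initialize
    @running = false
    @checked = []
  end

  def ensure_index_exists(class_name)
    raise IOError, "ferret server not reachable" unless running
    @checked << class_name
  end
end

# Hypothetical class macro: stores the configuration at class-definition
# time and defers the server call until the index is first needed.
module LazyIndexCheck
  def acts_as_indexed(server)
    @index_server = server
    @index_checked = false
  end

  def remote_index
    unless @index_checked
      @index_server.ensure_index_exists(name)
      @index_checked = true
    end
    @index_server
  end
end

SERVER = FakeRemoteServer.new # the server is down at this point

class Article
  extend LazyIndexCheck
  acts_as_indexed SERVER # does not raise, so the class is fully defined

  def self.search(query)
    remote_index # may raise here, at call time, and can be retried later
    "searching for #{query}"
  end
end
```

With this shape a stopped server produces an error for the individual
search call, and the same class works again once the server is back,
without restarting the process.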
--
Peter Jones
pmade inc. - http://pmade.com
From pjones at pmade.com Mon Nov 5 16:05:37 2007
From: pjones at pmade.com (Peter Jones)
Date: Mon, 5 Nov 2007 14:05:37 -0700
Subject: [Ferret-talk] Segmentation Fault in more_like_this.rb
Message-ID:
I've been seeing some core dumps coming from ferret_server:
acts_as_ferret/lib/more_like_this.rb:170: [BUG] Segmentation fault
ruby 1.8.6 (2007-03-13) [i386-freebsd6]
I'm running the latest build of ferret (0.11.4-rc5). Line 170 in
more_like_this.rb is:
freq = reader.doc_freq(field_name, word)
which is calling into the ferret C code (if I'm reading this correctly).
Is there anything I can do to get you more information, or help track
down this problem?
Thanks.
--
Peter Jones
pmade inc. - http://pmade.com
From ndaniels at mac.com Mon Nov 5 16:11:53 2007
From: ndaniels at mac.com (Noah M. Daniels)
Date: Mon, 5 Nov 2007 16:11:53 -0500
Subject: [Ferret-talk] Strange wildcard problem
Message-ID: <42B8371C-25DE-4707-926A-FC7431F40B2C@mac.com>
Hi,
Apologies for reposting this for those who read this via ruby-forum,
but it didn't make it to the list before, and the list seems more
active...
I'm using ferret (via acts_as_ferret) in a somewhat unorthodox
manner and am having a strange wildcard problem. Before anyone wonders
why we're doing things this way, the answer is basically that it lets
us precompute what would be expensive database queries and store the
results in a simple way (ferret index) prior to pushing the static
data to our production server.
Basically, I've got two (for the sake of simplicity) models, both of
which are indexed on a similar (but separate) non-model field.
However, one of those two models does not seem to get the proper
number of results for a wildcard search:
First of all, there's a non-indexed model called ProductTuple that's
got a supplier_id as well as a product_category_id and
product_material_id as well as some other id fields that aren't really
important here. Thus, a ProductTuple has foreign key relationships to
Suppliers and ProductCategories and ProductMaterials, but for ferret
purposes just think of those foreign keys as what they are - ids (e.g.
integers).
The first model, Supplier, is ferret-indexed on several fields, such
as the supplier name and supplier country, as well as the
'ferret_product_tuples' non-model field. ferret_product_tuples simply
takes all the product tuples for a supplier and concatenates their
product_category_id, product_material_id, etc. with delimiters.
So, for a product tuple with product_category_id 82,
product_material_id 88, and undefined product_technique_id, the
resulting part of the ferret_product_tuple string would look like
x00082_00088_00000x (where we use 00000 to indicate null). The xs are
used as anchors, essentially, as a given supplier's
ferret_product_tuple string might look like 'x00082_00088_00000x
x00000_00081_00013x'.
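The encoding described above can be sketched in a few lines of Ruby. The
helper names tuple_token, tuple_field and tuple_pattern are hypothetical,
and the sketch assumes exactly three ids per tuple (category, material,
technique), which may be fewer than the real tuples carry.

```ruby
# Format one tuple as a fixed-width token such as x00082_00088_00000x.
# nil ids become 00000, matching the convention described above.
def tuple_token(category_id, material_id, technique_id)
  ids = [category_id, material_id, technique_id].map { |id| id || 0 }
  format("x%05d_%05d_%05dx", *ids)
end

# Join the tokens for all tuples of one record with spaces.
def tuple_field(tuples)
  tuples.map { |t| tuple_token(*t) }.join(" ")
end

# Build a wildcard pattern; positions left as nil become ?????.
def tuple_pattern(category_id, material_id, technique_id)
  parts = [category_id, material_id, technique_id].map do |id|
    id ? format("%05d", id) : "?????"
  end
  "x#{parts.join('_')}x"
end

puts tuple_field([[82, 88, nil], [nil, 81, 13]])
puts tuple_pattern(82, nil, nil)
```

Fixed-width, zero-padded ids are what make the single-character '?'
wildcard line up with one id position.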
Now, the ferret query that gets constructed when we do the relevant
queries simply looks like:
'ferret_product_tuple:x00082_?????_?????x'
and this would, in the above instance, match that supplier.
Everything I've described works _perfectly_, EXCEPT...
we also index product_categories on this same string. So product
category #82 would have a bunch of ferret_product_tuple strings that
start out x00082 and have various things in the other positions.
Here's what's strange... a product_category query for
'ferret_product_tuple:x?????_?????_?????x' should return ALL product
categories, right? Yet it only returns six. A product category query
for 'ferret_product_tuple:x?????_00081_?????x' should return all the
product categories that share product_tuples with product_material
#81, but in fact returns only a small number of categories. Yet making
the wildcard match MORE restrictive by substituting
'ferret_product_tuple:x00082_00081_?????x' into that query yields
product_category #82, which is erroneously not included in the 6
results for 'ferret_product_tuple:x?????_00081_?????x'.
So, have I stumbled upon a bug in the wildcard handling? My initial
thought was that the different analyzer I was using for the
product_category index was the culprit, but I changed that analyzer
out to no effect, so I've ruled that out.
Any ideas? Thanks!
From bk at benjaminkrause.com Mon Nov 5 16:23:00 2007
From: bk at benjaminkrause.com (Benjamin Krause)
Date: Mon, 5 Nov 2007 22:23:00 +0100
Subject: [Ferret-talk] Segmentation Fault in more_like_this.rb
In-Reply-To:
References:
Message-ID: <65C9DC2C-9D1F-4A7F-A897-A7034EB10049@benjaminkrause.com>
Peter,
> Is there anything I can do to get you more information, or help track
> down this problem?
Yes, of course.. try to break the error down to a simple test case
and create a ticket at the ferret trac. There are still a few problems
in ferret that need to be addressed, and this might be one of
them. Whenever David gets another chance to fix some ferret
bugs, it would be great to have a test case that helps to
identify the problem.
Ben
From kraemer at webit.de Tue Nov 6 04:06:16 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Tue, 6 Nov 2007 10:06:16 +0100
Subject: [Ferret-talk] Strange wildcard problem
In-Reply-To: <42B8371C-25DE-4707-926A-FC7431F40B2C@mac.com>
References: <42B8371C-25DE-4707-926A-FC7431F40B2C@mac.com>
Message-ID: <20071106090616.GD30619@cordoba.webit.de>
Hi!
Wildcard queries have a built-in upper limit on the number of terms they
search for, which by default is set to 512 (according to
http://ferret.davebalmain.com/api/classes/Ferret/Search/WildcardQuery.html).
So when you query for asdf*, Ferret expands this to all terms in your
index starting with asdf, but stops after collecting 512 terms. It then
retrieves all documents containing these 512 terms, and so misses any
document that matches your query only through a term that wasn't
collected in the first step.
Of course you can set the max_term count to a higher value, but in the
long run this isn't really a solution. If I understand you correctly,
your tuple field right now has a single term for each document, and that
term is different for each document. Splitting up your tuple values into
several different terms could help to reduce the number of terms needed
to fetch for a wild card query.
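The effect of the term limit can be reproduced without Ferret. The
following is a plain-Ruby simulation, not Ferret's implementation:
wildcard_to_regexp and expand_terms are hypothetical helpers, and which
terms survive the cut in the real library may differ from the "first N
in sorted order" rule assumed here.

```ruby
# Convert a wildcard pattern (? = one character, * = any run) to a Regexp.
def wildcard_to_regexp(pattern)
  body = pattern.chars.map do |ch|
    case ch
    when "?" then "."
    when "*" then ".*"
    else Regexp.escape(ch)
    end
  end.join
  Regexp.new("\\A#{body}\\z")
end

# Collect matching terms in index order, stopping at max_terms.
def expand_terms(sorted_terms, pattern, max_terms)
  re = wildcard_to_regexp(pattern)
  sorted_terms.select { |t| t.match?(re) }.first(max_terms)
end

# 1000 distinct terms, one per document, all matching the pattern below.
terms = (0...1000).map { |i| format("x%05d_00081_00000x", i) }.sort

capped = expand_terms(terms, "x?????_00081_?????x", 512)
full   = expand_terms(terms, "x?????_00081_?????x", 5000)

puts capped.size
puts full.size
puts capped.include?("x00900_00081_00000x")
```

With one distinct term per document, every term dropped by the cap is a
document that silently disappears from the result set, which is why a
more specific pattern can find a document that a broader one misses.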
Cheers,
Jens
--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de
Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
From kraemer at webit.de Tue Nov 6 04:06:57 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Tue, 6 Nov 2007 10:06:57 +0100
Subject: [Ferret-talk] Unified ferret_start and ferret_stop
In-Reply-To:
References:
Message-ID: <20071106090657.GE30619@cordoba.webit.de>
Thanks Peter,
I'll have a look at these this evening.
Jens
--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de
Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
From lebreeze at gmail.com Tue Nov 6 07:14:19 2007
From: lebreeze at gmail.com (Levent Ali)
Date: Tue, 6 Nov 2007 13:14:19 +0100
Subject: [Ferret-talk] Question about the Fail over on the ferret server
In-Reply-To:
References:
Message-ID:
I am looking to solve the same issue... Ferret only seems to be able to
use 1 cpu on the machine as well and once it ramps up to near 100% it
comes to a grinding halt...
--
Posted via http://www.ruby-forum.com/.
From lebreeze at gmail.com Tue Nov 6 07:33:19 2007
From: lebreeze at gmail.com (Levent Ali)
Date: Tue, 6 Nov 2007 13:33:19 +0100
Subject: [Ferret-talk] ferret / acts_as_ferret multiple server deployment
In-Reply-To:
References: <80767f27b7712e9a65f94ab6d4c09987@ruby-forum.com>
<20060912183159.GB2233@cordoba.webit.de>
<6e9c57140e75348102f2a5bcaf37a2ce@ruby-forum.com>
<20060912211659.GA29768@cordoba.webit.de>
Message-ID: <953f6162853d2b5652798ce0c39455b7@ruby-forum.com>
David Balmain wrote:
> On 9/13/06, Jens Kraemer wrote:
>>
>> load balancing the indexing to several servers can only be done via
>> segmenting the data across those servers, and merging it when searching.
>> This seems possible but is not implemented in Ferret (yet?)
>
> The start of this is there (ie the MultiSearcher). I just need to
> implement RemoteSearcher. Don't expect it any time soon however as I'm
> a little burnt out at the moment. I'm just going to be cleaning up
> what is currently already built for the time being.
>
> Cheers,
> Dave
Any progress on RemoteSearcher? :)
--
Posted via http://www.ruby-forum.com/.
From ndaniels at mac.com Tue Nov 6 09:00:33 2007
From: ndaniels at mac.com (Noah Daniels)
Date: Tue, 6 Nov 2007 15:00:33 +0100
Subject: [Ferret-talk] Strange wildcard problem
In-Reply-To: <20071106090616.GD30619@cordoba.webit.de>
References: <42B8371C-25DE-4707-926A-FC7431F40B2C@mac.com>
<20071106090616.GD30619@cordoba.webit.de>
Message-ID: <7435d282c6ea369eacfd1c1775bc341d@ruby-forum.com>
Jens Kraemer wrote:
> Hi!
>
> wildcard queries have a built in upper limit of terms they search for,
> which by default is set to 512 (according to
> http://ferret.davebalmain.com/api/classes/Ferret/Search/WildcardQuery.html).
>
> So when you query for asdf*, Ferret expands this to all terms in your
> index starting with asdf, but will stop after collecting 512 terms, then
> go and retrieve all documents containing these 512 terms, obviously
> missing those that would in theory match your query, but do this by
> containing a matching term that wasn't retrieved in the first step.
>
> Of course you can set the max_term count to a higher value, but in the
> long run this isn't really a solution. If I understand you correctly,
> your tuple field right now has a single term for each document, and that
> term is different for each document. Splitting up your tuple values into
> several different terms could help to reduce the number of terms needed
> to fetch for a wild card query.
>
Interesting, thanks. Actually I can't split the tuple values up -- the
requirement is to see those terms occur together in the same tuple, not
just for the same document (there is a difference in this case). So,
I'll try expanding the max_term count to see if that helps; otherwise
I'll have to rethink the solution.
--
Posted via http://www.ruby-forum.com/.
From ndaniels at mac.com Tue Nov 6 11:25:56 2007
From: ndaniels at mac.com (Noah M. Daniels)
Date: Tue, 6 Nov 2007 11:25:56 -0500
Subject: [Ferret-talk] Strange wildcard problem
In-Reply-To: <7435d282c6ea369eacfd1c1775bc341d@ruby-forum.com>
References: <42B8371C-25DE-4707-926A-FC7431F40B2C@mac.com>
<20071106090616.GD30619@cordoba.webit.de>
<7435d282c6ea369eacfd1c1775bc341d@ruby-forum.com>
Message-ID:
On Nov 6, 2007, at 9:00 AM, Noah Daniels wrote:
> Jens Kraemer wrote:
>>
>
> Interesting, thanks. Actually I can't split the tuple values up -- the
> requirement is to see those terms occur together in the same tuple, not
> just for the same document (there is a difference in this case). So,
> I'll try expanding the max_term count to see if that helps; otherwise
> I'll have to rethink the solution.
Jens, many thanks; upping the max_terms (max_clauses seems to be the
same thing) solved the problem beautifully. However, now I'm trying to
get this working with a remote ferret server (using acts_as_ferret)
and not having any luck. Particularly, I can't figure out where to set
max_terms (or Ferret::Search::MultiTermQuery.default_max_terms= ) such
that the remote ferret server will pick it up -- including in the
start script for the remote ferret server. Where can I change this
option so it'll work for a remote server with AAF?
thanks!
From kraemer at webit.de Tue Nov 6 11:35:54 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Tue, 6 Nov 2007 17:35:54 +0100
Subject: [Ferret-talk] Strange wildcard problem
In-Reply-To:
References: <42B8371C-25DE-4707-926A-FC7431F40B2C@mac.com>
<20071106090616.GD30619@cordoba.webit.de>
<7435d282c6ea369eacfd1c1775bc341d@ruby-forum.com>
Message-ID: <20071106163554.GD2040@cordoba.webit.de>
On Tue, Nov 06, 2007 at 11:25:56AM -0500, Noah M. Daniels wrote:
>
>
> On Nov 6, 2007, at 9:00 AM, Noah Daniels wrote:
>
> > Jens Kraemer wrote:
> >>
> >
> > Interesting, thanks. Actually I can't split the tuple values up -- the
> > requirement is to see those terms occur together in the same tuple, not
> > just for the same document (there is a difference in this case). So,
> > I'll try expanding the max_term count to see if that helps; otherwise
> > I'll have to rethink the solution.
>
> Jens, many thanks; upping the max_terms (max_clauses seems to be the
> same thing) solved the problem beautifully. However, now I'm trying to
> get this working with a remote ferret server (using acts_as_ferret)
> and not having any luck. Particularly, I can't figure out where to set
> max_terms (or Ferret::Search::MultiTermQuery.default_max_terms= ) such
> that the remote ferret server will pick it up -- including in the
> start script for the remote ferret server. Where can I change this
> option so it'll work for a remote server with AAF?
Placing it at the end of acts_as_ferret's init.rb should work.
Cheers,
Jens
--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de
Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
From ndaniels at mac.com Tue Nov 6 11:41:24 2007
From: ndaniels at mac.com (Noah M. Daniels)
Date: Tue, 6 Nov 2007 11:41:24 -0500
Subject: [Ferret-talk] Strange wildcard problem
In-Reply-To: <20071106163554.GD2040@cordoba.webit.de>
References: <42B8371C-25DE-4707-926A-FC7431F40B2C@mac.com>
<20071106090616.GD30619@cordoba.webit.de>
<7435d282c6ea369eacfd1c1775bc341d@ruby-forum.com>
<20071106163554.GD2040@cordoba.webit.de>
Message-ID: <355290B8-BD7C-45F3-A1FC-CE6D15ABDBD1@mac.com>
On Nov 6, 2007, at 11:35 AM, Jens Kraemer wrote:
> On Tue, Nov 06, 2007 at 11:25:56AM -0500, Noah M. Daniels wrote:
>>
> Placing it at the end of acts_as_ferret's init.rb should work.
Unfortunately, it doesn't seem to. For a local index, I can just put
this anywhere in code (even in a controller, or in the console) and I
start getting correct results from my query:
Ferret::Search::MultiTermQuery.default_max_terms = 5000
but on my staging server, where a drb ferret server is used, putting
that line in the init.rb doesn't seem to do anything -- in fact, even
putting it into the initialize method of the LocalIndex class doesn't
help! Any ideas?
thanks!
From alex at liivid.com Wed Nov 7 08:55:24 2007
From: alex at liivid.com (Alex Neth)
Date: Wed, 7 Nov 2007 21:55:24 +0800
Subject: [Ferret-talk] Ferret-talk Digest, Vol 25, Issue 2
In-Reply-To:
References:
Message-ID:
> From: Jens Kraemer
> Subject: Re: [Ferret-talk] Performance before and after optimization
> On Sat, Nov 03, 2007 at 08:49:17PM +0800, Alex Neth wrote:
> [..]
>> 2) Can I keep a second index so that it doesn't get locked during
>> optimization and then switch to the optimized index? Perhaps the index
>> is not really locked and it is just using all the CPU? (I am using a
>> single CPU server)?
>
> If you're already indexing in batches, keeping a second read-only index
> for searching is a good idea. rsync is useful to keep the search-index
> up to date in this case.
>
> To check if CPU usage is a problem, try lowering the optimizing
> process' priority and see how it goes.
>
Thanks Jens. Any suggestion on how to get a two index solution
working with acts_as_ferret? I could not find an easy way to change
the index location dynamically. I would love to have a "read-only"
index. It seems like using rsync might be problematic though as the
index might not be in a consistent state throughout the sync.
I don't think it is CPU, but it is definitely locking my site for up
to a minute during optimization, which is very bad.
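One common way around the consistency concern is to copy the optimized
index into a fresh directory and only then repoint a symlink that the
searching side opens. The sketch below is a generic filesystem pattern,
not an acts_as_ferret feature: publish_index, the releases directory and
the "current" link name are all hypothetical, and it assumes a POSIX
filesystem and a searcher that reopens the index after each swap.

```ruby
require "fileutils"

# Copy the freshly optimized index to a new versioned directory, then
# repoint the "current" symlink with a rename, which replaces the old
# link in one step on POSIX filesystems. Readers that open "current"
# see either the old complete copy or the new complete copy.
def publish_index(optimized_dir, release_root)
  FileUtils.mkdir_p(release_root)
  stamp = Time.now.strftime("%Y%m%d%H%M%S%6N")
  release = File.expand_path(File.join(release_root, stamp))
  FileUtils.cp_r(optimized_dir, release)

  current = File.join(release_root, "current")
  tmp_link = "#{current}.tmp"
  FileUtils.rm_f(tmp_link)           # clear a leftover from a failed run
  File.symlink(release, tmp_link)    # build the new link under a temp name
  File.rename(tmp_link, current)     # swap it into place
  release
end
```

Old release directories are left in place by this sketch, so a cleanup
step would be needed once no searcher holds them open any more.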
From jk at jkraemer.net Wed Nov 7 15:05:52 2007
From: jk at jkraemer.net (Jens Kraemer)
Date: Wed, 7 Nov 2007 21:05:52 +0100
Subject: [Ferret-talk] Unified ferret_start and ferret_stop
In-Reply-To:
References:
Message-ID: <20071107200552.GB18363@thunder.jkraemer.net>
Hi Peter,
works like a charm and looks great :-)
Just merged this into trunk.
Cheers,
Jens
On Mon, Nov 05, 2007 at 10:51:38AM -0700, Peter Jones wrote:
> Sorry for the top posting. I posted this to the ruby-forum site on
> the 25th of October but it doesn't seem to have made its way to this
> mailing list. Here is the original post:
> ---
> I've attached my first set of changes. The attached archive includes a
> README file with information about what I've changed and why.
> These changes are only for Unix-like operating systems, for now. If
> you like the changes I've made, I'll integrate the Windows code from
> the various scripts in the script directory.
> Let me know if you have any questions.
> ---
> The patches were originally attached to the forum posting. The URL to
> the patches is therefore: http://www.ruby-forum.com/attachment/780/patches.tar.gz
> Thanks.
> --
> Peter Jones
> pmade inc. - http://pmade.com
>
> _______________________________________________
> Ferret-talk mailing list
> Ferret-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/ferret-talk
>
--
Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database
From mail at stuartsierra.com Wed Nov 7 17:01:45 2007
From: mail at stuartsierra.com (Stuart Sierra)
Date: Wed, 7 Nov 2007 17:01:45 -0500
Subject: [Ferret-talk] Ferret-talk Digest, Vol 25, Issue 2
In-Reply-To:
References:
Message-ID: <314ee0450711071401l97b4be8j5e298d7d24383ea3@mail.gmail.com>
On 11/7/07, Alex Neth wrote:
> Thanks Jens. Any suggestion on how to get a two index solution
> working with acts_as_ferret?
I rolled my own with methods in my model class, something like this:
def self.setup_new_index(location)
  config = aaf_configuration[:ferret].dup
  config.update(:create => true, :auto_flush => false,
                :field_infos => ActsAsFerret::field_infos([self]),
                :path => location)
  index = Ferret::Index::Index.new(config)
  index.logger = Logger.new("#{location}/index.log")
  index
end
def self.build_new_index(location)
  index = setup_new_index(location)
  max = self.maximum(:id)
  start = self.minimum(:id)
  start.step(max, increment) do |n|
    begin
      record = self.find(n)
    rescue ActiveRecord::RecordNotFound
      next
    end
    index << record.to_doc if record and record.ferret_enabled?(true)
  end
end
Then I have a rake task to replace the old index with the new one.
-Stuart Sierra
columbialawtech.org
From mail at stuartsierra.com Wed Nov 7 17:04:39 2007
From: mail at stuartsierra.com (Stuart Sierra)
Date: Wed, 7 Nov 2007 17:04:39 -0500
Subject: [Ferret-talk] Ferret-talk Digest, Vol 25, Issue 2
In-Reply-To: <314ee0450711071401l97b4be8j5e298d7d24383ea3@mail.gmail.com>
References:
<314ee0450711071401l97b4be8j5e298d7d24383ea3@mail.gmail.com>
Message-ID: <314ee0450711071404j4048d4d4ic034ddd5d4676d80@mail.gmail.com>
> On 11/7/07, Alex Neth wrote:
> > Thanks Jens. Any suggestion on how to get a two index solution
> > working with acts_as_ferret?
On 11/7/07, Stuart Sierra wrote:
> I rolled my own with methods in my model class, something like this:
Correction: this line:
> start.step(max, increment) do |n|
should be
> start.upto(max) do |n|
-Stuart Sierra
columbialawtech.org
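The corrected loop walks every id between the smallest and the largest one
and skips ids whose rows no longer exist. A minimal sketch of that pattern
in plain Ruby follows. It does not use ActiveRecord or Ferret: RECORDS,
RecordNotFound, find_record and collect_docs are stand-ins invented for this
example.

```ruby
# Toy stand-in for a table: the rows with ids 3 and 5 have been deleted.
RECORDS = { 1 => "one", 2 => "two", 4 => "four", 6 => "six" }.freeze

# Stand-in for ActiveRecord::RecordNotFound.
class RecordNotFound < StandardError; end

def find_record(id)
  RECORDS.fetch(id) { raise RecordNotFound, "no record with id #{id}" }
end

# Visit every id from first_id to last_id, skipping the gaps.
def collect_docs(first_id, last_id)
  docs = []
  first_id.upto(last_id) do |id|
    begin
      docs << find_record(id)
    rescue RecordNotFound
      next
    end
  end
  docs
end

docs = collect_docs(RECORDS.keys.min, RECORDS.keys.max)
```

Integer#upto(max) visits the same values as step(max, 1), so the correction
only removes the dependency on the undefined increment variable.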
From phedre at gmail.com Fri Nov 9 14:41:17 2007
From: phedre at gmail.com (phedre)
Date: Fri, 9 Nov 2007 14:41:17 -0500
Subject: [Ferret-talk] Problem with stemming and AAF
Message-ID: <5d302a7b0711091141s433c009cpd45cb5db392a244d@mail.gmail.com>
I'm sure I'm missing something completely obvious here, so I hope
someone can point me in the right direction!
I've implemented a basic search with AAF, which works as expected; I'm
running a ferret drb server, and using will_paginate to page results.
The code in my search_controller.rb:
search_text = params[:query] || " "
@products = Product.find_with_ferret(search_text, :page =>
params[:page], :per_page => #$ItemsPerPage, :limit => $ItemsPerPage,
:offset => $offset)
@results_pages = Product.paginate_search(search_text, :page =>
params[:page], :per_page => $ItemsPerPage)
The next step was to implement stemming, which seemed straightforward
enough. I created the stemmed_analyzer.rb file in the lib directory,
as follows:
require 'rubygems'
require 'ferret'
class StemmedAnalyzer < Ferret::Analysis::Analyzer
  include Ferret::Analysis
  def initialize(stop_words = ENGLISH_STOP_WORDS)
    @stop_words = stop_words
  end
  def token_stream(field, str)
    StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)),
                                  @stop_words))
  end
end
And added the call to the analyzer in my model file:
acts_as_ferret( :fields => { :name => { :boost => 1,
:store => :yes },
:product_number => { :boost => 2 },
:description => { :boost => 0,
:store => :yes },
:care => { :boost => -2 },
:manufacturer_name => { :boost => 1,
:store => :yes },
:collection_name => { :boost => 1,
:store => :yes },
:category_name => { :boost => 0 }
},
:remote => true,
:analyzer => StemmedAnalyzer.new )
Straightforward, no errors. But also no results. Searching for chairs
returns only results for that word, not chair or chairs. I know the
actual analyzer works, as when I explicitly call it as follows, it
returns the correct root words to the log files:
search_terms = StemmedAnalyzer.new.token_stream(nil, params[:query])
while token = search_terms.next
puts token
end
Like so: Search for "chairs tables" returns
token["chair":0:6:1]
token["tabl":7:13:1]
but the front end throws up on me with a:
TypeError (wrong argument type DRb::DRbObject (expected Data))
I'm fully confused. I'm sure it's something obvious that I'm just not
seeing, and after beating my head against this for two days, I'm
hoping someone can point it out to me! Or at least get me moving in
the right direction. Thanks for any help!
claudia
--
If you can't be a good example, then you'll just have to be a horrible warning.
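The analyzer above is a chain of stages: tokenize, lowercase, drop stop
words, stem. The plain-Ruby sketch below mimics that chain without Ferret so
the effect of each stage is visible. Everything in it is invented for
illustration: the stop word list is a small sample, and toy_stem only strips
a trailing "s", which is far cruder than the stemmer behind Ferret's
StemFilter (the log output above shows "tabl", where this toy version keeps
"table").

```ruby
# Sample stop words for the example only (not Ferret's ENGLISH_STOP_WORDS).
TOY_STOP_WORDS = %w[a an and the of].freeze

def tokenize(text)
  text.scan(/[[:alnum:]]+/)
end

def lowercase(tokens)
  tokens.map(&:downcase)
end

def remove_stop_words(tokens, stop_words)
  tokens.reject { |token| stop_words.include?(token) }
end

# Toy stemmer: strips one trailing "s" from longer words. NOT a real stemmer.
def toy_stem(tokens)
  tokens.map do |token|
    strip = token.length > 3 && token.end_with?("s") && !token.end_with?("ss")
    strip ? token.chomp("s") : token
  end
end

# Same nesting order as the StemmedAnalyzer chain above.
def analyze(text, stop_words = TOY_STOP_WORDS)
  toy_stem(remove_stop_words(lowercase(tokenize(text)), stop_words))
end

terms = analyze("The Chairs and Tables")
```

A query for "chairs" can only match a document containing "chair" when both
were reduced to the same term, which is why the same analyzer has to be in
effect when the index is built and when it is searched.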
From jk at jkraemer.net Sat Nov 10 03:36:18 2007
From: jk at jkraemer.net (Jens Kraemer)
Date: Sat, 10 Nov 2007 09:36:18 +0100
Subject: [Ferret-talk] Problem with stemming and AAF
In-Reply-To: <5d302a7b0711091141s433c009cpd45cb5db392a244d@mail.gmail.com>
References: <5d302a7b0711091141s433c009cpd45cb5db392a244d@mail.gmail.com>
Message-ID: <20071110083618.GJ2341@thunder.jkraemer.net>
Hi!
the analyzer option belongs to the set of options which aaf directly
passes on to Ferret, and therefore the call has to read:
acts_as_ferret(:fields => { },
:remote => true,
:ferret => {
:analyzer => StemmedAnalyzer
})
Cheers,
Jens
On Fri, Nov 09, 2007 at 02:41:17PM -0500, phedre wrote:
> I'm sure I'm missing something completely obvious here, so I hope
> someone can point me in the right direction!
>
> I've implemented a basic search with AAF, which works as expected; I'm
> running a ferret drb server, and using will_paginate to page results.
> The code in my search_controller.rb:
>
> search_text = params[:query] || " "
> @products = Product.find_with_ferret(search_text, :page =>
> params[:page], :per_page => #$ItemsPerPage, :limit => $ItemsPerPage,
> :offset => $offset)
> @results_pages = Product.paginate_search(search_text, :page =>
> params[:page], :per_page => $ItemsPerPage)
>
>
>
> The next step was to implement stemming, which seemed straightforward
> enough. I created the stemmed_analyzer.rb file in the lib directory,
> as follows:
>
> require 'rubygems'
> require 'ferret'
>
> class StemmedAnalyzer < Ferret::Analysis::Analyzer
> include Ferret::Analysis
> def initialize(stop_words = ENGLISH_STOP_WORDS)
> @stop_words = stop_words
> end
> def token_stream(field, str)
> StemFilter.new(StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)),
> @stop_words))
> end
> end
>
>
> And added the call to the analyzer in my model file:
>
> acts_as_ferret( :fields => { :name => { :boost => 1,
> :store => :yes },
> :product_number => { :boost => 2 },
> :description => { :boost => 0,
> :store => :yes },
> :care => { :boost => -2 },
> :manufacturer_name => { :boost => 1,
> :store => :yes },
> :collection_name => { :boost => 1,
> :store => :yes },
> :category_name => { :boost => 0 }
> },
> :remote => true,
> :analyzer => StemmedAnalyzer.new )
--
Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database
From alex at liivid.com Sat Nov 10 03:29:48 2007
From: alex at liivid.com (Alex Neth)
Date: Sat, 10 Nov 2007 16:29:48 +0800
Subject: [Ferret-talk] Performance before and after optimization
In-Reply-To:
References:
Message-ID: <243B3673-0968-484C-AEBE-2BB66B161E55@liivid.com>
> From: Jens Kraemer
> Subject: Re: [Ferret-talk] Performance before and after optimization
> On Sat, Nov 03, 2007 at 08:49:17PM +0800, Alex Neth wrote:
> [..]
>> 2) Can I keep a second index so that it doesn't get locked during
>> optimization and then switch to the optimized index? Perhaps the
>> index
>> is not really locked and it is just using all the CPU? (I am using a
>> single CPU server)?
>
> If you're already indexing in batches, keeping a second read-only
> index for
> searching is a good idea. rsync is useful to keep the search-index
> up to
> date in this case.
>
> To check if CPU usage is a problem, try lowering the optimizing
> process'
> priority and see how it goes.
>
Thanks Jens. Any suggestion on how to get a two index solution
working with acts_as_ferret? I could not find an easy way to change
the index location dynamically. I would love to have a "read-only"
index. It seems like using rsync might be problematic though as the
index might not be in a consistent state throughout the sync.
It's not the CPU. The index is definitely locked for reading during
optimization. With cheap disk space, I would rather use two indexes,
add new records to the "off" index, optimize it, then switch indexes
- and go back and forth like that.
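The alternating scheme can be sketched on the file system level as two index
directories plus a "current" symlink that searchers open. This is a sketch
under assumptions, not acts_as_ferret code: IndexSlots, the slot names "a"
and "b" and the "current" link are all invented here, and it relies on POSIX
rename semantics to replace the link in one step.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: two index directories under root, one of them live.
class IndexSlots
  def initialize(root)
    @root = root
    %w[a b].each { |slot| FileUtils.mkdir_p(File.join(@root, slot)) }
    point_current_at("a") unless File.symlink?(current_link)
  end

  def current_link
    File.join(@root, "current")
  end

  # Name of the slot that searches are served from.
  def live_slot
    File.basename(File.readlink(current_link))
  end

  # Name of the slot that is free for bulk indexing and optimizing.
  def offline_slot
    live_slot == "a" ? "b" : "a"
  end

  def offline_path
    File.join(@root, offline_slot)
  end

  # Make the freshly optimized offline slot the live one.
  def switch!
    point_current_at(offline_slot)
  end

  private

  def point_current_at(slot)
    tmp = "#{current_link}.tmp"
    FileUtils.rm_f(tmp)
    File.symlink(slot, tmp)          # relative target inside root
    File.rename(tmp, current_link)   # replaces the old link in one step
  end
end

Dir.mktmpdir do |root|
  slots = IndexSlots.new(root)
  # ... add records to slots.offline_path and optimize it, then:
  slots.switch!
end
```

Two caveats apply: a searcher that already has the old directory open keeps
reading it until it reopens the index, and each slot misses the records that
were added to the other one in the previous round, so those have to be
replayed before the next switch.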
From alex at liivid.com Sat Nov 10 05:41:34 2007
From: alex at liivid.com (Alex Neth)
Date: Sat, 10 Nov 2007 18:41:34 +0800
Subject: [Ferret-talk] Ferret-talk Digest, Vol 25, Issue 3
In-Reply-To:
References:
Message-ID: <33AF33E6-109F-4F76-BFC5-504E3BCD527E@liivid.com>
Thanks Stuart. I thought I had read somewhere that rebuild_index
built the index in a different location and then swapped it, but
after looking at the code (in local_index.rb) this doesn't appear to
be the case. That might explain why the ferret server crashes
sometimes when a search takes place during a reindex.
I wouldn't be doing exactly the same thing as this but this does get
me started. I'm concerned about swapping the index files on a live
site though. Seems risky so I'll probably try to update the
ferret_index member in LocalIndex. Looks like that will work.
-Alex
On Nov 10, 2007, at 4:36 PM, ferret-talk-request at rubyforge.org wrote:
>
> Message: 7
> Date: Wed, 7 Nov 2007 17:01:45 -0500
> From: "Stuart Sierra"
> Subject: Re: [Ferret-talk] Ferret-talk Digest, Vol 25, Issue 2
> To: ferret-talk at rubyforge.org
> Message-ID:
> <314ee0450711071401l97b4be8j5e298d7d24383ea3 at mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> On 11/7/07, Alex Neth wrote:
>> Thanks Jens. Any suggestion on how to get a two index solution
>> working with acts_as_ferret?
>
> I rolled my own with methods in my model class, something like this:
>
> def self.setup_new_index(location)
> config = aaf_configuration[:ferret].dup
> config.update(:create => true, :auto_flush => false,
> :field_infos => ActsAsFerret::field_infos([self]),
> :path => location)
> index = Ferret::Index::Index.new(config)
> index.logger = Logger.new("#{location}/index.log")
> index
> end
>
>
> def self.build_new_index(location)
> index = setup_new_index(location)
>
> max = self.maximum(:id)
> start = self.minimum(:id)
>
> start.step(max, increment) do |n|
> begin
> record = self.find(n)
> rescue ActiveRecord::RecordNotFound
> next
> end
> index << record.to_doc if record and record.ferret_enabled?
> (true)
> end
> end
>
> Then I have a rake task to replace the old index with the new one.
>
> -Stuart Sierra
> columbialawtech.org
From jk at jkraemer.net Sun Nov 11 08:34:49 2007
From: jk at jkraemer.net (Jens Kraemer)
Date: Sun, 11 Nov 2007 14:34:49 +0100
Subject: [Ferret-talk] Ferret-talk Digest, Vol 25, Issue 3
In-Reply-To: <33AF33E6-109F-4F76-BFC5-504E3BCD527E@liivid.com>
References:
<33AF33E6-109F-4F76-BFC5-504E3BCD527E@liivid.com>
Message-ID: <20071111133449.GB15113@thunder.jkraemer.net>
On Sat, Nov 10, 2007 at 06:41:34PM +0800, Alex Neth wrote:
> Thanks Stuart. I thought I had read somewhere that rebuild_index
> built the index in a different location and then swapped it, but
> after looking at the code (in local_index.rb) this doesn't appear to
> be the case. That might explain why the ferret server crashes
> sometimes when a search takes place during a reindex.
have a look at the rebuild_index implementation in ferret_server.rb,
that's the one which is used in DRb mode. And yes, it rebuilds the index
in the background while running searches on the old one, so the index
swapping logic in there might be a good starting point for you.
Cheers,
Jens
--
Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database
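The general idea of serving searches from the old index while a new one is
built, and then switching, can be sketched in a few lines. This is not the
code from ferret_server.rb: SwappableIndex and ToyIndex are names invented
for this example, and ToyIndex is a stand-in for a real index object.

```ruby
# Holds a reference to the live index and lets it be replaced atomically.
class SwappableIndex
  def initialize(index)
    @index = index
    @lock = Mutex.new
  end

  # Grab the current reference under the lock, search outside of it so a
  # slow query never blocks the swap.
  def search(query)
    current = @lock.synchronize { @index }
    current.search(query)
  end

  # Install the rebuilt index and hand back the old one for cleanup.
  def swap(new_index)
    @lock.synchronize do
      old = @index
      @index = new_index
      old
    end
  end
end

# Stand-in for a real index: a list of strings with substring search.
ToyIndex = Struct.new(:docs) do
  def search(query)
    docs.select { |doc| doc.include?(query) }
  end
end

live = SwappableIndex.new(ToyIndex.new(["old chair"]))
rebuilt = ToyIndex.new(["old chair", "new chair"])  # built in the background
retired = live.swap(rebuilt)
hits = live.search("chair")
```

Queries that started before the swap finish against the old object, so it
should only be closed or deleted once they are done.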
From lists at kikobu.com Sun Nov 11 10:32:58 2007
From: lists at kikobu.com (Morten)
Date: Sun, 11 Nov 2007 16:32:58 +0100
Subject: [Ferret-talk] undefined method `add'
Message-ID:
We've been running into problems with ferret indexing lately. The
problem is intermittent and sometimes it persists. Just got this after
wiping the index and redeploying:
NoMethodError (undefined method `add' for Solution:Class):
(druby://10.1.65.87:9009)
/data/releases/20071111152414/vendor/rails/activerecord/lib/active_record/base.rb:1238:in
`method_missing'
(druby://10.1.65.87:9009)
/data/releases/20071111152414/vendor/plugins/acts_as_ferret/lib/ferret_server.rb:71:in
`send'
(druby://10.1.65.87:9009)
/data/releases/20071111152414/vendor/plugins/acts_as_ferret/lib/ferret_server.rb:71:in
`method_missing'
/data/releases/20071111152414/vendor/plugins/acts_as_ferret/lib/remote_index.rb:31:in
`<<'
/data/releases/20071111152414/vendor/plugins/acts_as_ferret/lib/instance_methods.rb:73:in
`ferret_create'
I'm running the latest stable version of AAF. Any tips or work arounds
much appreciated.
Morten
From lists at kikobu.com Sun Nov 11 10:09:32 2007
From: lists at kikobu.com (Morten)
Date: Sun, 11 Nov 2007 16:09:32 +0100
Subject: [Ferret-talk] Reducing dependency on remote ferret process
Message-ID:
Hi.
We use FerretDrb for search. If the ferret process is down, our entire
application comes down the moment we try to save a model which is indexed.
Is there a way to decouple this relationship such that we can somehow
resume normal operations despite ferret being down and not index the model?
Thanks.
Morten
From hongli at plan99.net Sun Nov 11 11:19:34 2007
From: hongli at plan99.net (Hongli Lai)
Date: Sun, 11 Nov 2007 17:19:34 +0100
Subject: [Ferret-talk] Reducing dependency on remote ferret process
In-Reply-To:
References:
Message-ID: <47372B96.8010303@plan99.net>
Morten wrote:
> Hi.
>
> We use FerretDrb for search. If the ferret process is down, our entire
> application comes down the moment we try to save a model which is indexed.
>
> Is there a way to decouple this relationship such that we can somehow
> resume normal operations despite ferret being down and not index the model?
>
> Thanks.
>
> Morten
I really don't understand your concern. I could also say "if the web
server process is down, our entire application is down" (assuming you're
talking about a web app). The FerretDrb process shouldn't be down. If
you continue even if it's down, your index will become out of date.
Depending on your data that may or may not be worse than crashing.
From lists at kikobu.com Sun Nov 11 16:17:53 2007
From: lists at kikobu.com (Morten)
Date: Sun, 11 Nov 2007 22:17:53 +0100
Subject: [Ferret-talk] Reducing dependency on remote ferret process
In-Reply-To: <47372B96.8010303@plan99.net>
References: <47372B96.8010303@plan99.net>
Message-ID:
Hongli Lai wrote:
> Morten wrote:
>> Hi.
>>
>> We use FerretDrb for search. If the ferret process is down, our entire
>> application comes down the moment we try to save a model which is indexed.
>>
>> Is there a way to decouple this relationship such that we can somehow
>> resume normal operations despite ferret being down and not index the model?
>>
>> Thanks.
>>
>> Morten
>
> I really don't understand your concern. I could also say "if the web
> server process is down, our entire application is down" (assuming you're
> talking about a web app). The FerretDrb process shouldn't be down. If
> you continue even if it's down, your index will become out of date.
> Depending on your data that may or may not be worse than crashing.
I don't think your comparison is quite fair. Ferret is nice, but it's
not fully matured compared to Apache, MySQL and so on. At least I'm
having more stability issues with it than I've had with the other
processes that I base my work on, which is why I think my concern is
completely valid.
Br,
Morten
From bk at benjaminkrause.com Sun Nov 11 16:36:54 2007
From: bk at benjaminkrause.com (Benjamin Krause)
Date: Sun, 11 Nov 2007 22:36:54 +0100
Subject: [Ferret-talk] Reducing dependency on remote ferret process
In-Reply-To:
References:
Message-ID: <920060E7-4234-40F2-AF2D-5D40ECE30D61@benjaminkrause.com>
Hey ..
unfortunately, no .. not with the current construction.
However, there might be a chance to switch to a
messaging service like ap4r, so your indexing
requests don't get lost.
I think there are some considerations about re-factoring
the drb server, so maybe this dependency might be
dropped in the future..
Cheers
Ben
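The point of putting a queue between the application and the index server is
that a failed indexing request is kept and retried instead of aborting the
save. The sketch below shows only that idea with an in-memory buffer. It does
not use ap4r or acts_as_ferret: IndexUnavailable, BufferedIndexer and
FlakyIndex are hypothetical names for this example.

```ruby
# Stand-in for whatever error a dead index server produces.
class IndexUnavailable < StandardError; end

# Forwards documents to the index and keeps the ones that could not be sent.
class BufferedIndexer
  attr_reader :pending

  def initialize(index)
    @index = index
    @pending = []
  end

  # Returns true if the document was indexed, false if it was buffered.
  def add(doc)
    @index << doc
    true
  rescue IndexUnavailable
    @pending << doc
    false
  end

  # Retry buffered documents in order; stop at the first failure.
  def flush
    until @pending.empty?
      @index << @pending.first
      @pending.shift
    end
    true
  rescue IndexUnavailable
    false
  end
end

# Toy index that can be switched off to simulate a dead server.
class FlakyIndex
  attr_accessor :up
  attr_reader :docs

  def initialize
    @up = true
    @docs = []
  end

  def <<(doc)
    raise IndexUnavailable, "index server is down" unless @up
    @docs << doc
    self
  end
end

index = FlakyIndex.new
indexer = BufferedIndexer.new(index)
index.up = false
indexer.add("doc 1")   # buffered, the caller carries on
index.up = true
indexer.flush          # delivered once the server is back
```

An in-memory buffer is lost when the application process restarts. A
persistent messaging service is what would keep the requests across
restarts.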
On 2007-11-11, at 16:09, Morten wrote:
>
> Hi.
>
> We use FerretDrb for search. If the ferret process is down, our entire
> application comes down the moment we try to save a model which is
> indexed.
>
> Is there a way to decouple this relationship such that we can somehow
> resume normal operations despite ferret being down and not index the
> model?
>
> Thanks.
>
> Morten
Regards
Ben
---
Benjamin Krause
http://www.omdb.org/
bk at benjaminkrause.com
Rails training "Advancing with Rails" with David A. Black
19-22 November 2007, Berlin-Mitte
Details and registration: http://www.railsschulung.de
From julioody at gmail.com Sun Nov 11 17:03:58 2007
From: julioody at gmail.com (Julio Cesar Ody)
Date: Mon, 12 Nov 2007 09:03:58 +1100
Subject: [Ferret-talk] Reducing dependency on remote ferret process
In-Reply-To: <920060E7-4234-40F2-AF2D-5D40ECE30D61@benjaminkrause.com>
References:
<920060E7-4234-40F2-AF2D-5D40ECE30D61@benjaminkrause.com>
Message-ID:
Let me take a wild guess on this one.
On ACTS_AS_FERRET_GEM_ROOT/lib/index.rb
def ferret_create
  if ferret_enabled?
    logger.debug "ferret_create/update: #{self.class.name} : #{self.id}"
    self.class.aaf_index << self
  else
    ferret_enable if @ferret_disabled == :once
  end
  true # signal success to AR
end
Try wrapping "aaf_index<<" like this
begin
  self.class.aaf_index << self
rescue Exception => e
  logger.warn "Error creating/updating document: #{e.inspect}\n#{e.backtrace.join("\n\t")}"
end
Mind I'm just reading the source and writing the code here straight
away. In the event of my theory being right, this would gracefully
handle exceptions related to adding an entry to the index by dropping
a warning in the AAF log file and moving on.
I think this could be an option in config/initializers (Rails 2.0)
perhaps, as in
config.aaf.exception_on_save = true
IMHO, of course.
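A self-contained version of the same log-and-continue idea, runnable without
Rails or Ferret, might look like the sketch below. IndexDown, DownIndex and
safe_index are invented for this example. It rescues StandardError rather
than Exception, which is a deliberate change from the snippet above:
rescuing Exception would also swallow things like interrupts and
out-of-memory errors.

```ruby
require "logger"
require "stringio"

# Stand-in for the error raised when the remote index cannot be reached.
class IndexDown < StandardError; end

# Toy index that always fails, simulating a stopped server.
class DownIndex
  def <<(_record)
    raise IndexDown, "connection refused"
  end
end

# Try to index the record; on failure, log a warning and report false
# instead of letting the exception abort the caller.
def safe_index(index, record, logger)
  index << record
  true
rescue StandardError => e
  logger.warn("Error creating/updating document: #{e.class}: #{e.message}")
  false
end

log = StringIO.new
indexed = safe_index(DownIndex.new, { :id => 1 }, Logger.new(log))
```

The trade-off is the one raised earlier in the thread: the save goes through,
but the index silently falls behind until the missed records are reindexed.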
On Nov 12, 2007 8:36 AM, Benjamin Krause wrote:
> Hey ..
>
> unfortunately, no .. not with the current construction.
> However, there might be a chance to switch to a
> messaging service like ap4r, so your indexing
> requests don't get lost.
>
> I think there are some considerations about re-factoring
> the drb server, so maybe this dependency might be
> dropped in the future..
>
> Cheers
> Ben
>
>
>
>
>
>
> On 2007-11-11, at 16:09, Morten wrote:
>
> Hi.
>
> We use FerretDrb for search. If the ferret process is down, our entire
> application comes down the moment we try to save a model which is indexed.
>
> Is there a way to decouple this relationship such that we can somehow
> resume normal operations despite ferret being down and not index the model?
>
> Thanks.
>
> Morten
>
> Regards
> Ben
> ---
> Benjamin Krause
> http://www.omdb.org/
> bk at benjaminkrause.com
>
> Rails training "Advancing with Rails" with David A. Black
> 19-22 November 2007, Berlin-Mitte
> Details and registration: http://www.railsschulung.de
From lists at kikobu.com Sun Nov 11 17:24:43 2007
From: lists at kikobu.com (Morten)
Date: Sun, 11 Nov 2007 23:24:43 +0100
Subject: [Ferret-talk] undefined method `add'
In-Reply-To:
References:
Message-ID:
Hi,
I *think* I'm getting closer to what's going on with this problem.
Basically, the models that we're experiencing this with, are subclasses
(Rails STI), such that:
class Entry < AR::Base
  acts_as_ferret
end
class Solution < Entry
end
class Notice < Entry
end
The problem may appear intermittently, because the subclassed models
have not been loaded correctly somehow, and thus confusing ferret. If
I reload the page that causes the problem a few times, things usually
begin working.
I suppose one way to do a quick fix would be to explicitly require the
models in one of the initialization files (eg. environment.rb) such that
entry gets required first, and then each of the sub-classes.
I'll see if I can reproduce this outside of production.
Br,
Morten
Morten wrote:
> We've been running into problems with ferret indexing lately. The
> problem is intermittent and some times it persists. Just got this after
> wiping the index and redeploying:
>
> NoMethodError (undefined method `add' for Solution:Class):
> (druby://10.1.65.87:9009)
> /data/releases/20071111152414/vendor/rails/activerecord/lib/active_record/base.rb:1238:in
> `method_missing'
> (druby://10.1.65.87:9009)
> /data/releases/20071111152414/vendor/plugins/acts_as_ferret/lib/ferret_server.rb:71:in
> `send'
> (druby://10.1.65.87:9009)
> /data/releases/20071111152414/vendor/plugins/acts_as_ferret/lib/ferret_server.rb:71:in
> `method_missing'
>
> /data/releases/20071111152414/vendor/plugins/acts_as_ferret/lib/remote_index.rb:31:in
> `<<'
>
> /data/releases/20071111152414/vendor/plugins/acts_as_ferret/lib/instance_methods.rb:73:in
> `ferret_create'
>
>
> I'm running the latest stable version of AAF. Any tips or work arounds
> much appreciated.
>
> Morten
From lists at kikobu.com Mon Nov 12 04:00:16 2007
From: lists at kikobu.com (Morten)
Date: Mon, 12 Nov 2007 10:00:16 +0100
Subject: [Ferret-talk] undefined method `add'
In-Reply-To:
References:
Message-ID:
Well, that wasn't it. It appears to happen for top level classes in the
inheritance hierarchy as well and for classes that are not even subclassed.
It only happens once per class type immediately after restarting the
Ferret backgroundrb process, after which things begin working.
It does not help to require the classes in environment.rb.
Any suggestions?
Thanks.
Morten
Morten wrote:
> Hi,
>
> I *think* I'm getting closer to what's going on with this problem.
>
> Basically, the models that we're experiencing this with, are subclasses
> (Rails STI), such that:
>
> class Entry < AR::Base
> acts_as_ferret
> end
>
> class Solution < Entry
> end
>
> class Notice < Entry
> end
>
> The problem may appear intermittently, because the subclassed models
> have not been loaded correctly somehow, and thus confusing ferret. If
> I reload the page that causes the problem a few times, things usually
> begin working.
>
> I suppose one way to do a quick fix would be to explicitly require the
> models in one of the initialization files (eg. environment.rb) such that
> entry gets required first, and then each of the sub-classes.
>
> I'll see if I can reproduce this outside of production.
>
> Br,
>
> Morten
>
>
>
>
>
> Morten wrote:
>> We've been running into problems with ferret indexing lately. The
>> problem is intermittent and some times it persists. Just got this after
>> wiping the index and redeploying:
>>
>> NoMethodError (undefined method `add' for Solution:Class):
>> (druby://10.1.65.87:9009)
>> /data/releases/20071111152414/vendor/rails/activerecord/lib/active_record/base.rb:1238:in
>> `method_missing'
>> (druby://10.1.65.87:9009)
>> /data/releases/20071111152414/vendor/plugins/acts_as_ferret/lib/ferret_server.rb:71:in
>> `send'
>> (druby://10.1.65.87:9009)
>> /data/releases/20071111152414/vendor/plugins/acts_as_ferret/lib/ferret_server.rb:71:in
>> `method_missing'
>>
>> /data/releases/20071111152414/vendor/plugins/acts_as_ferret/lib/remote_index.rb:31:in
>> `<<'
>>
>> /data/releases/20071111152414/vendor/plugins/acts_as_ferret/lib/instance_methods.rb:73:in
>> `ferret_create'
>>
>>
>> I'm running the latest stable version of AAF. Any tips or work arounds
>> much appreciated.
>>
>> Morten
From lists at kikobu.com Mon Nov 12 08:27:58 2007
From: lists at kikobu.com (Morten)
Date: Mon, 12 Nov 2007 14:27:58 +0100
Subject: [Ferret-talk] undefined method `add'
In-Reply-To:
References:
Message-ID:
On the first request, which breaks, the following gets written to the
ferret_server.log:
call index method: add with [#127, :type=>Solution,
:is_syndicated=>false, :submitter_id=>-1
This means, that the ferret server does not know the type of the object
being indexed on the first request. Is this correct? As far as I can
tell, the object getting sent is a "Ferret::Document" (from the gem). Is
there a way to make AAF always send a hash? Would that make sense?
Morten
From lists at kikobu.com Mon Nov 12 09:22:33 2007
From: lists at kikobu.com (Morten)
Date: Mon, 12 Nov 2007 15:22:33 +0100
Subject: [Ferret-talk] undefined method `add' - work around
In-Reply-To:
References:
Message-ID:
The underlying problem is bad unmarshalling of the Ferret::Document that
gets sent to the DRb server.
In ferret_server.rb:
rescue NoMethodError
@logger.debug "no luck, trying to call class method instead"
Using rescue NoMethodError => e and then include e.message in the debug
output, reveals:
undefined method `to_doc' for #
I'm pretty blank as to why Ferret::Document does not get properly
unmarshalled on the initial request. If I change the add method to
attempt a DRb reload in local_index.rb (line ~139):
def add(record)
  if record.is_a?(DRb::DRbUnknown)
    record = record.reload
    logger.warn("Reloaded DRb::DRbUnknown to #{record.class.name}")
  end
  record = record.to_doc unless Hash === record || Ferret::Document === record
  ferret_index << record
end
Then I do indeed get a Document instance back, ie. I have a work around.
But why does this work around work? Does the unmarshalling process occur
before the relevant classes get loaded in the initial request?
I'll patch up my local AAF to use this work around, but as it does not
solve the actual root problem, I guess it's not interesting as a patch
submission.
Br,
Morten
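As documented for Ruby's standard drb library, DRb wraps a marshalled object
in DRb::DRbUnknown when the receiving side does not know its class yet, and
DRbUnknown#reload tries the unmarshalling again. The mechanism underneath can
be reproduced with Marshal alone, as in the sketch below. ToyDocument is a
stand-in class invented for this example, and removing the constant merely
simulates a process in which the class has not been loaded. Whether lazy
class loading is really what happens in the ferret server here is an
assumption, not something this thread confirms.

```ruby
# Stand-in for a document class that is sent over the wire.
class ToyDocument
  attr_reader :fields

  def initialize(fields)
    @fields = fields
  end
end

payload = Marshal.dump(ToyDocument.new("name" => "gsm"))

# Simulate a receiving process that has not loaded the class yet.
Object.send(:remove_const, :ToyDocument)

first_attempt =
  begin
    Marshal.load(payload)
  rescue ArgumentError => e
    e # unmarshalling fails because the class is unknown
  end

# Later the class gets loaded after all ...
class ToyDocument
  attr_reader :fields
end

# ... and the very same bytes now unmarshal fine.
second_attempt = Marshal.load(payload)
```

This would be consistent with the workaround succeeding on reload: by the
time reload runs, the class may have been loaded in the server process.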
From phedre at gmail.com Mon Nov 12 09:48:34 2007
From: phedre at gmail.com (claudia)
Date: Mon, 12 Nov 2007 09:48:34 -0500
Subject: [Ferret-talk] Problem with stemming and AAF
In-Reply-To: <17665fee0711120645td8cbbdm88c6e11916c02a53@mail.gmail.com>
References: <5d302a7b0711091141s433c009cpd45cb5db392a244d@mail.gmail.com>
<20071110083618.GJ2341@thunder.jkraemer.net>
<17665fee0711120645td8cbbdm88c6e11916c02a53@mail.gmail.com>
Message-ID: <17665fee0711120648i15cd2414h5135bdca3dbaa87d@mail.gmail.com>
Such a simple solution. That's what I get for spending days staring at
the silly thing. Thanks for the help!
claudia
On 10/11/2007, Jens Kraemer wrote:
> the analyzer option belongs to the set of options which aaf directly
> passes on to Ferret, and therefore the call has to read:
> acts_as_ferret(:fields => { },
> :remote => true,
> :ferret => {
> :analyzer => StemmedAnalyzer
> })
From kraemer at webit.de Mon Nov 12 10:28:47 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Mon, 12 Nov 2007 16:28:47 +0100
Subject: [Ferret-talk] undefined method `add' - work around
In-Reply-To:
References:
Message-ID: <20071112152847.GM10556@cordoba.webit.de>
Hi Morten,
glad you could make this work for you.
I'm not sure why you're seeing this strange behaviour, I've never seen
this happen before.
Cheers,
Jens
On Mon, Nov 12, 2007 at 03:22:33PM +0100, Morten wrote:
>
> The underlying problem is bad unmarshalling of the Ferret::Document that
> gets sent to the DRb server.
>
> In ferret_server.rb:
>
> rescue NoMethodError
> @logger.debug "no luck, trying to call class method instead"
>
> Using rescue NoMethodError => e and then include e.message in the debug
> output, reveals:
>
> undefined method `to_doc' for #
>
> I'm pretty blank as to why Ferret::Document does not get properly
> unmarshalled on the initial request. If I change the add method to
> attempt a DRb reload in local_index.rb (line ~139):
>
> def add(record)
> if record.is_a?(DRb::DRbUnknown)
> record = record.reload
> logger.warn("Reloaded DRb::DRbUnknown to #{record.class.name}")
> end
>
> record = record.to_doc unless Hash === record || Ferret::Document ===
> record
> ferret_index << record
> end
>
> Then I do indeed get a Document instance back, ie. I have a work around.
>
> But why does this work around work? Does the unmarshalling process occur
> before the relevant classes get loaded in the initial request?
>
> I'll patch up my local AAF to use this work around, but as it does not
> solve the actual root problem, I guess it's not interesting as a patch
> submission.
>
> Br,
>
> Morten
--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de
Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
From alain.ravet+ferret at gmail.com Tue Nov 13 07:47:04 2007
From: alain.ravet+ferret at gmail.com (Alain Ravet)
Date: Tue, 13 Nov 2007 13:47:04 +0100
Subject: [Ferret-talk] acts_as_ferret : cannot use a customized Analyzer (as
indicated in the AdvancedUsageNotes)
Message-ID:
Hi all,
I cannot make aaf (rev. 220) use my custom analyzer, despite following the
indications @
http://projects.jkraemer.net/acts_as_ferret/wiki/AdvancedUsage
To pinpoint the problem, I created a model + a simple analyzer with 2 stop
words : "fax" and "gsm".
test 1 : model.rebuild_index + model.find_by_contents("fax") # fax is a
stop word.
=> I get a result when I should not.
(note : I delete the index directory => I can see the index is recreated,
index/develop
).
test 2 : insert a 'raise' in the token_stream() method => it's never thrown.
test 3 : use the standard analyzer, to exclude the 2 stop words => same
wrong result.
class AccessPointKind2 < ActiveRecord::Base
  set_table_name "access_point_kinds2"
  acts_as_ferret(
    {:remote => true, :fields => { :name => {:store => :yes}} } ,
    { :analyzer =>
        Ferret::Analysis::StandardAnalyzer.new(["fax","gsm"])
    }
  )
end
Here are the model and the analyzer :
MODEL :
class AccessPointKind2 < ActiveRecord::Base
  set_table_name "access_point_kinds2"
  acts_as_ferret(
    { :remote => true, :fields => { :name => { :store => :yes } } },
    { :analyzer => PlainAsciiAnalyzer.new }
  )
end
ANALYZER
lib : plain_ascii_analyzer.rb
class PlainAsciiAnalyzer < ::Ferret::Analysis::Analyzer
  include ::Ferret::Analysis
  def token_stream(field, str)
    StopFilter.new(
      StandardTokenizer.new(str),
      ["fax", "gsm"]
    )
    # raise   <<<----- is never executed when uncommented !!
  end
end
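For readers less familiar with analyzers, the chain above can be pictured in plain Ruby. This is only a minimal sketch that does not use Ferret at all: `ToyStopAnalyzer` and its regex tokenizer are hypothetical stand-ins for `StandardTokenizer` and `StopFilter`, meant to show what dropping "fax" and "gsm" at analysis time should look like.

```ruby
# Toy model of a tokenizer followed by a stop filter (not the Ferret API).
class ToyStopAnalyzer
  def initialize(stop_words)
    @stop_words = stop_words.map(&:downcase)
  end

  # Split on anything that is not a letter or digit, lowercase the tokens,
  # then drop every token that appears in the stop list.
  def tokens(str)
    str.downcase.scan(/[a-z0-9]+/).reject { |t| @stop_words.include?(t) }
  end
end

analyzer = ToyStopAnalyzer.new(["fax", "gsm"])
p analyzer.tokens("GSM (werk)")   # the stop word is removed, "werk" stays
p analyzer.tokens("fax")          # nothing is left to index
```

If an analyzer like this were really in use, a record named "gsm" would contribute no terms to the index, so a search for "gsm" could not match it.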
In the console, I rebuild the index + search for a stop word => I get
results when I should not :
>> reload!; AccessPointKind2.rebuild_index ;
AccessPointKind2.find_by_contents("gsm").collect &:name
Reloading...
AccessPointKind2 Columns (0.002963) SHOW FIELDS FROM access_point_kinds2
Asked for a remote server ? true, ENV["FERRET_USE_LOCAL_INDEX"] is nil,
looks like we are not the server
Will use remote index server which should be available at
druby://localhost:9010
default field list: [:name]
AccessPointKind2 Load (0.002706) SELECT * FROM access_point_kinds2 WHERE
(access_point_kinds2.id in ('7','12','13','8','2'))
Query: gsm
total hits: 5, results delivered: 5
=> ["gsm", "gsm", "gsm(werk)", "gsm(priv?)", "gsm(priv?)"]
>>
I guess it's obvious, but I cannot see it.
Help.
Thanks in advance.
Alain
From jk at jkraemer.net Wed Nov 14 04:25:37 2007
From: jk at jkraemer.net (Jens Kraemer)
Date: Wed, 14 Nov 2007 10:25:37 +0100
Subject: [Ferret-talk] acts_as_ferret : cannot use a customized Analyzer
(as indicated in the AdvancedUsageNotes)
In-Reply-To:
References:
Message-ID: <20071114092537.GB3558@thunder.jkraemer.net>
Hi,
I just tried and I'm afraid I couldn't reproduce your problem here (with
aaf trunk). I just committed a testcase using StandardAnalyzer with your
stop word list, and it works as intended. I also tried with your
analyzer class from below, same result.
Could you please try the latest aaf from trunk to see if it fixes your
problem?
Cheers,
Jens
--
Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database
From alain.ravet+ferret at gmail.com Wed Nov 14 16:51:25 2007
From: alain.ravet+ferret at gmail.com (Alain Ravet)
Date: Wed, 14 Nov 2007 22:51:25 +0100
Subject: [Ferret-talk] acts_as_ferret : cannot use a customized Analyzer
(as indicated in the AdvancedUsageNotes)
In-Reply-To: <20071114092537.GB3558@thunder.jkraemer.net>
References:
<20071114092537.GB3558@thunder.jkraemer.net>
Message-ID:
Jens,
> I just tried and I'm afraid I couldn't reproduce your problem here (with
> aaf trunk). ...
> Could you please try the latest aaf from trunk to see if it fixes your
> problem?
Same problem after installing the latest version (262) of aaf : the custom
analyzer I pass as an aaf parameter is not used.
As a quick test, I tried using the "No Stop Word" custom analyzer as
documented @
http://projects.jkraemer.net/acts_as_ferret/wiki/AdvancedUsage
on a simple LUT table/model, to no avail.
I tried the new syntax with the same wrong result.
Setup :
* I've installed the latest trunk version of aaf (262)
* killed + restarted a (new) DrB server
$ ./script/ferret_server -e production start
* checked the Ferret version :
$ gem list ferret ==> ferret (0.11.4)
Test :
I created a record where the name is a default stop word
>> Country.find 11
Country Load (0.000388) SELECT * FROM countries WHERE
(countries.`id` = 11)
=> #
model, way 1 :
class Country < ActiveRecord::Base
  acts_as_ferret( { :fields => [:name] },
                  { :analyzer => Ferret::Analysis::StandardAnalyzer.new([]) } )
end
model, way 2 :
class Country < ActiveRecord::Base
  acts_as_ferret(
    :fields => [:name],
    :remote => true,
    :ferret => { :analyzer => Ferret::Analysis::StandardAnalyzer.new([]) }
  )
end
PROBLEM : in both cases it doesn't find any record where the name is 'the'
>> reload! ; Country.rebuild_index ; Country.find_by_contents(" the")
Reloading...
Asked for a remote server ? true, ENV["FERRET_USE_LOCAL_INDEX"] is nil,
looks like we are not the server
Will use remote index server which should be available at
druby://localhost:9010
default field list: [:name]
Query: the
total hits: 0, results delivered: 0
=> #
I tried with my custom analyser (from the previous message), with the same
wrong result.
So, it looks like aaf is not using the custom analyzer I declared in the
model.
It doesn't make any sense to me.
Alain Ravet
From alain.ravet+ferret at gmail.com Wed Nov 14 16:58:08 2007
From: alain.ravet+ferret at gmail.com (Alain Ravet)
Date: Wed, 14 Nov 2007 22:58:08 +0100
Subject: [Ferret-talk] acts_as_ferret : cannot use a customized Analyzer
(as indicated in the AdvancedUsageNotes)
In-Reply-To:
References:
<20071114092537.GB3558@thunder.jkraemer.net>
Message-ID:
remark : some spaces were erroneously inserted before the word "the"
when I formatted the email, and are not present in the real code.
So
> => #
> ..
> >> reload! ; Country.rebuild_index ; Country.find_by_contents(" the")
should read :
> => #
> ..
> >> reload! ; Country.rebuild_index ; Country.find_by_contents("the")
From alain.ravet+ferret at gmail.com Wed Nov 14 18:00:04 2007
From: alain.ravet+ferret at gmail.com (Alain Ravet)
Date: Thu, 15 Nov 2007 00:00:04 +0100
Subject: [Ferret-talk] acts_as_ferret : cannot use a customized Analyzer
(as indicated in the AdvancedUsageNotes)
In-Reply-To: <20071114092537.GB3558@thunder.jkraemer.net>
References:
<20071114092537.GB3558@thunder.jkraemer.net>
Message-ID:
I'm one step further :
- Good : I now know aaf knows about/received the custom analyzer
but
- Bad : the analyzer is not used by aaf ( : it stops on words it should
not stop on)
New test : a "no stop word" analyzer, adapted from the german stemming
analyser @
http://projects.jkraemer.net/acts_as_ferret/wiki/AdvancedUsage
file: model/country.rb
----------------------
class Test2Analyzer < ::Ferret::Analysis::Analyzer
  include Ferret::Analysis
  def initialize(stop_words = [])
    @stop_words = stop_words
  end
  def token_stream(field, str)
    StemFilter.new(StopFilter.new(LowerCaseFilter.new(
      StandardTokenizer.new(str)), @stop_words), 'de')
  end
end
class Country < ActiveRecord::Base
  acts_as_ferret(
    :fields => [:name],
    :remote => true,
    :ferret => { :analyzer => Test2Analyzer.new([]) }
  )
end
0°/ delete the ferret index directory
1°/ restart the console and rebuild the index :
./script/console
>> Country.rebuild_index
Asked for a remote server ? true, ENV["FERRET_USE_LOCAL_INDEX"] is nil,
looks like we are not the server
Will use remote index server which should be available at
druby://localhost:9010
default field list: [:name]
=> nil
2°/ confirm that aaf knows about my "no_stop_words" custom analyzer :
>> puts Country.aaf_index.to_yaml
--- !ruby/object:ActsAsFerret::RemoteIndex
config:
  :fields:
  - :name
  :mysql_fast_batches: true
  :name: countries
  :class_name: Country
  :index_dir: /Users/aravet/aaprojets/newgids/newgids_machine/index/development/country
  :remote: druby://localhost:9010
  :reindex_batch_size: 1000
  :store_class_name: false
  :ferret_fields:
    :name:
      :store: :no
      :term_vector: :with_positions_offsets
      :boost: 1.0
      :index: :yes
      :highlight: :yes
  :single_index: false
  :ferret: &id001
    :key: :id
    :auto_flush: true
    :or_default: false
    :path: /Users/aravet/aaprojets/newgids/newgids_machine/index/development/country
    :create_if_missing: true
    :handle_parse_errors: true
    :analyzer: !ruby/object:Test2Analyzer   <<<<----------- Good
      stop_words: []                        <<<<----------- Good
    :default_field:
    - :name
  :enabled: true
ferret_config: *id001
server: !ruby/object:DRb::DRbObject
  ref:
  uri: druby://localhost:9010
=> nil
3°/ confirm that there is a record with name == "the"
>> Country.find_by_name "the"
Country Load (0.000427) SELECT * FROM countries WHERE (countries.`name`
= 'the') LIMIT 1
=> #
4°/ try to find it with aaf, using "t*"
=> DOES NOT WORK (does not find Country[:name => "the"])
>> Country.find_by_contents "t*"
Query: t*
total hits: 0, results delivered: 0
=> # @total_hits=0, @results=[], @total_pages=0>
5°/ do the same for "f*", which matches a non stop word
=> IT WORKS (finds Country[:name => "Frankrijk"])
>> Country.find_by_contents "f*"
Country Load (0.000420) SELECT * FROM countries WHERE (countries.id in
('2'))
Query: f*
total hits: 1, results delivered: 1
=> # @total_hits=1, @results=[#], total_pages1
So, aaf (rev 262)
* associates the right custom analyzer with the model,
* but doesn't seem to use it when finding_by_contents (? and rebuilding the
index ??)
Alain
From hongli at plan99.net Wed Nov 14 18:24:25 2007
From: hongli at plan99.net (Hongli Lai)
Date: Thu, 15 Nov 2007 00:24:25 +0100
Subject: [Ferret-talk] acts_as_ferret : cannot use a customized Analyzer
(as indicated in the AdvancedUsageNotes)
In-Reply-To:
References: <20071114092537.GB3558@thunder.jkraemer.net>
Message-ID: <473B83A9.4050509@plan99.net>
Alain Ravet wrote:
> class Country < ActiveRecord::Base
> acts_as_ferret(
> :fields => [:name] ,
> :remote => true,
> :ferret => {:analyzer => Test2Analyzer.new([]) }
> )
> end
Try this:
acts_as_ferret({ :fields => [:name], :remote => true },
{ :analyzer => Test2Analyzer.new([]) })
From kraemer at webit.de Thu Nov 15 04:07:11 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 15 Nov 2007 10:07:11 +0100
Subject: [Ferret-talk] acts_as_ferret : cannot use a customized Analyzer
(as indicated in the AdvancedUsageNotes)
In-Reply-To: <473B83A9.4050509@plan99.net>
References:
<20071114092537.GB3558@thunder.jkraemer.net>
<473B83A9.4050509@plan99.net>
Message-ID: <20071115090711.GX10556@cordoba.webit.de>
On Thu, Nov 15, 2007 at 12:24:25AM +0100, Hongli Lai wrote:
> Alain Ravet wrote:
> > class Country < ActiveRecord::Base
> > acts_as_ferret(
> > :fields => [:name] ,
> > :remote => true,
> > :ferret => {:analyzer => Test2Analyzer.new([]) }
> > )
> > end
>
> Try this:
>
> acts_as_ferret({ :fields => [:name], :remote => true },
> { :analyzer => Test2Analyzer.new([]) })
this won't help, these are both valid ways to call acts_as_ferret. The
:ferret syntax is the preferred one, however.
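To illustrate why the two call styles can be interchangeable, here is a minimal sketch of how a plugin could fold both into the same pair of hashes. `normalize_aaf_options` is a hypothetical helper written for this note, not the actual acts_as_ferret code, and the precedence rule in it is an assumption.

```ruby
# Hypothetical helper: accepts either
#   (options, ferret_options)               two-hash style
#   (options with a nested :ferret hash)    single-hash style
# and returns [options, ferret_options] in both cases.
def normalize_aaf_options(options = {}, ferret_options = {})
  options = options.dup
  nested = options.delete(:ferret) || {}
  # Assumption: settings passed as the second argument win over nested ones.
  [options, nested.merge(ferret_options)]
end

analyzer = Object.new   # stand-in for Test2Analyzer.new([])

two_hashes = normalize_aaf_options({ :fields => [:name], :remote => true },
                                   { :analyzer => analyzer })
one_hash   = normalize_aaf_options(:fields => [:name], :remote => true,
                                   :ferret => { :analyzer => analyzer })
p two_hashes == one_hash
```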
Jens
--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de
Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
From kraemer at webit.de Thu Nov 15 04:13:18 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 15 Nov 2007 10:13:18 +0100
Subject: [Ferret-talk] acts_as_ferret : cannot use a customized Analyzer
(as indicated in the AdvancedUsageNotes)
In-Reply-To:
References:
<20071114092537.GB3558@thunder.jkraemer.net>
Message-ID: <20071115091318.GY10556@cordoba.webit.de>
Hi Alain,
could you please check the index created by aaf with plain ferret and
your custom analyzer to see if your queries deliver the expected results
then?
That way we should be able to find out if the problem is with indexing
or searching through aaf.
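The reasoning behind that check can be pictured with a toy model. The sketch below is plain Ruby, not Ferret; `ToyIndex` and its methods are hypothetical. It assumes that prefix queries such as "t*" are matched against the stored terms without going through the stop filter, which should be verified against the Ferret query parser before relying on it.

```ruby
# Toy inverted index (not the Ferret API). The stop list is applied at index
# time; prefix queries are matched against the stored terms as-is.
class ToyIndex
  def initialize(stop_words)
    @stop_words = stop_words
    @postings = Hash.new { |h, k| h[k] = [] }   # term => [doc ids]
  end

  def add(id, text)
    text.downcase.scan(/[a-z0-9]+/).each do |term|
      next if @stop_words.include?(term)        # stop words never reach the index
      @postings[term] << id
    end
  end

  # Return the ids of documents having at least one term with the given prefix.
  def prefix_search(prefix)
    @postings.select { |term, _| term.start_with?(prefix) }.values.flatten.uniq.sort
  end
end

docs = { 2 => "Frankrijk", 11 => "the" }

with_default_stops = ToyIndex.new(["the", "a", "an"])
without_stops      = ToyIndex.new([])
docs.each do |id, name|
  with_default_stops.add(id, name)
  without_stops.add(id, name)
end

p with_default_stops.prefix_search("t")   # "the" was dropped while indexing
p without_stops.prefix_search("t")        # "the" is a stored term here
```

In this model, a "t*" query that returns nothing points at index time: the term was never stored, so the analyzer used by the process that actually built the index is the one to inspect.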
Jens
--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de
Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
From ssmoot at gmail.com Thu Nov 15 09:37:00 2007
From: ssmoot at gmail.com (Sam Smoot)
Date: Thu, 15 Nov 2007 08:37:00 -0600
Subject: [Ferret-talk] Ferret/AAF Stability?
Message-ID:
Hello. I'm the author of DataMapper (http://datamapper.org), and am
trying to choose what Full-Text-Indexing engine/plugin I want to
include by default. I was hoping you guys could help. :-)
Sphinx comes highly recommended, but without live index updates, it
just doesn't seem practical for most of my work.
I'm most experienced with Solr, but the whole HTTP::Request and
general complexity of it is off-putting.
I haven't used Ferret in an application yet, but I love what I see so
far. The ability to have an in-process server in development, and the
clean Ruby API are big wins for me. But I've heard a lot of scary
things about corrupted indexes, even when using the DRb server. Is
this just FUD? Are there any unresolved issues revolving around
corrupted indexes? Can I afford to use Ferret in big applications for
Fortune-500 clients? (I know that sounds... pompous really, but it's a
genuine concern.)
Any advice you could offer would be greatly appreciated.
I've also read a few messages about serializing index requests/updates
to Ferret through message-queues. Are there any decent
guides/blog-posts on this topic?
Thanks, -Sam
From eimorton at gmail.com Thu Nov 15 10:44:39 2007
From: eimorton at gmail.com (Erik Morton)
Date: Thu, 15 Nov 2007 10:44:39 -0500
Subject: [Ferret-talk] Ferret/AAF Stability?
In-Reply-To:
References:
Message-ID: <8A33A7ED-7B45-4C3C-B130-443FE3D6D179@gmail.com>
We have several 3GB indexes with approximately 1 million documents in
each of them. Here are some quick notes, feel free to reach out with
other questions:
* no corruption problems that weren't our fault.
* there was an issue with large index files (> ~2GB) that was patched,
but I'm honestly not sure if it is in the trunk, as the ferret trac/
svn is frequently MIA (which is a concern of course)
* the code is clear and fairly easy to follow. AAF is very easy to
follow.
* I've been very happy with performance of the actual indexing/
searching, however you need to watch out for the processes that are
actually doing the synchronization for writes. DRB is a bottleneck for
us right now, though our volume isn't high enough that I'd call it a
real problem yet.
* for moderately high-volume sites you'll want to consider batching
index updates "offline", though for large indexes make sure that you
have enough IO capacity to optimize the index. We host on EC2 and the
$.1/hour instances simply do not have anywhere near the IO capacity to
optimize a large index without having _every other process_ waiting
for IO. I haven't tested the larger instance types yet.
* we love how easy and efficient it is to combine many indexes into
one. We index tens of thousands of websites in parallel and then
combine 100 or so indexes into one index very quickly.
* the mailing list is great. Jens is on top of things, very receptive
to new ideas and takes *very* good care of AAF. Haven't seen Dave
Balmain in a while.
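As a rough illustration of the batching point above, here is a minimal sketch. `BatchIndexer` is a hypothetical helper, not part of Ferret or AAF; it only assumes that the index object responds to `<<`, which a Ferret index does, and an Array is used as a stand-in here.

```ruby
# Collects documents and hands them to the index in batches.
# `index` is any object responding to `<<`; BatchIndexer is a made-up name.
class BatchIndexer
  def initialize(index, batch_size = 1000)
    @index = index
    @batch_size = batch_size
    @pending = []
  end

  def add(doc)
    @pending << doc
    flush if @pending.size >= @batch_size
  end

  # Write all pending documents in one go and return how many were written.
  def flush
    count = @pending.size
    @pending.each { |doc| @index << doc }
    @pending.clear
    count
  end
end

index = []                       # stand-in for a real index
batcher = BatchIndexer.new(index, 3)
7.times { |i| batcher.add(:id => i) }
batcher.flush                    # write the remainder
p index.size
```

In a real setup the flush would presumably run from an offline job, with an optimize scheduled afterwards only when there is enough IO headroom.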
Overall we are happy. There are times when search accuracy questions
come up, and frequently the problem is that we are not effectively
parsing queries or using the right analyzer for the problem at hand,
so RTFM (http://www.oreilly.com/catalog/9780596527853/).
That's all I can think of now...
Erik
From bk at benjaminkrause.com Thu Nov 15 13:41:42 2007
From: bk at benjaminkrause.com (Benjamin Krause)
Date: Thu, 15 Nov 2007 19:41:42 +0100
Subject: [Ferret-talk] Ferret/AAF Stability?
In-Reply-To:
References:
Message-ID: <73691E3A-A60B-4055-A610-3CCBA56C4622@benjaminkrause.com>
Hey ..
> I haven't used Ferret in an application yet, but I love what I see so
> far. The ability to have an in-process server in development, and the
> clean Ruby API are big wins for me. But I've heard a lot of scary
> things about corrupted indexes, even when using the DRb server. Is
> this just FUD? Are there any unresolved issues revolving around
> corrupted indexes? Can I afford to use Ferret in big applications for
> Fortune-500 clients? (I know that sounds... pompous really, but it's a
> genuine concern.)
We've been using ferret on omdb.org for 14 months without any problems.
There are a few things you might want to work around (Erik pointed
some out). If you expect a huge amount of index updates, you need
to think about a few infrastructural problems, because right now, AAF
does not allow you to cluster indexing servers. but i know there is a
solution for that :)
If you just have a huge amount of search queries, there is no need
to worry.. i would not suggest using AAF's ferret server for searching,
though .. but it's quite easy to do the searching in each mongrel, so
no concern here either.
i guess we need more information about the data you want to index
to give more detailed advice.
> I've also read a few messages about serializing index requests/updates
> to Ferret through message-queues. Are there any decent
> guides/blog-posts on this topic?
yes, that's currently being worked on .. so there will be some guides
later on :)
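Until such a guide exists, the basic idea can be sketched with Ruby's built-in `Queue`. This is only an in-process illustration with made-up names; a real setup would presumably use an external message queue, and the array below stands in for the index.

```ruby
require 'thread'

# Runs `producer_count` producer threads that each enqueue `per_producer`
# updates, while a single writer thread applies them one at a time.
# The returned array stands in for the index; with Ferret the writer thread
# would own the one index/writer object instead.
def run_serialized_updates(producer_count, per_producer)
  index = []
  queue = Queue.new

  writer = Thread.new do
    while (doc = queue.pop) != :stop
      index << doc               # only this thread ever touches the index
    end
  end

  producers = (0...producer_count).map do |n|
    Thread.new do
      per_producer.times { |i| queue << { :id => "#{n}-#{i}" } }
    end
  end

  producers.each(&:join)
  queue << :stop                 # sentinel: no more work
  writer.join
  index
end

p run_serialized_updates(4, 25).size
```

The point of the design is that any number of producers can enqueue updates, but only one consumer ever writes to the index.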
Cheers
Ben
---
Benjamin Krause
http://www.omdb.org/
bk at benjaminkrause.com
Rails-Schulung "Advancing with Rails" mit David A. Black
19.11.-22.11.2007, Berlin-Mitte
Details u. Anmeldung: http://www.railsschulung.de
From john at digitalpulp.com Thu Nov 15 14:00:16 2007
From: john at digitalpulp.com (John Bachir)
Date: Thu, 15 Nov 2007 14:00:16 -0500
Subject: [Ferret-talk] Ferret/AAF Stability?
In-Reply-To: <73691E3A-A60B-4055-A610-3CCBA56C4622@benjaminkrause.com>
References:
<73691E3A-A60B-4055-A610-3CCBA56C4622@benjaminkrause.com>
Message-ID: <4AFD0835-1E8C-43B1-BB8C-325D011CCF6F@digitalpulp.com>
On Nov 15, 2007, at 1:41 PM, Benjamin Krause wrote:
> i would not suggest usings AAF's ferret server for searching,
> though .. but it's quite easy to do the searching in each mongrel, so
> not concern here either.
I'm confused... what does "searching" mean in this context? :)
John
From jjm at codewell.com Thu Nov 15 13:39:46 2007
From: jjm at codewell.com (Jeff Mallatt)
Date: Thu, 15 Nov 2007 13:39:46 -0500
Subject: [Ferret-talk] indexing runs out of memory
Message-ID: <7.0.1.0.2.20071115133259.03837958@codewell.com>
I'm using Ferret to index a whole bunch of stuff at once: thousands
of documents that produce an index which grows to about
1.25 GB. While the indexer is running, I watch the memory use of the
Ruby process grow steadily until it, too, is up to about 1.25 GB -- at
which point the process crashes, printing:
[FATAL] failed to allocate memory
Does anyone else have any experience with this mode of
failure? Should I not try to create the index all at once, but
rather do a few documents, then close the index, then re-open it, then
do a few more? Or is a 1.25 GB index simply too big to try to create
on my machine?
TIA
From bk at benjaminkrause.com Thu Nov 15 15:04:34 2007
From: bk at benjaminkrause.com (Benjamin Krause)
Date: Thu, 15 Nov 2007 21:04:34 +0100
Subject: [Ferret-talk] Ferret/AAF Stability?
In-Reply-To: <4AFD0835-1E8C-43B1-BB8C-325D011CCF6F@digitalpulp.com>
References:
<73691E3A-A60B-4055-A610-3CCBA56C4622@benjaminkrause.com>
<4AFD0835-1E8C-43B1-BB8C-325D011CCF6F@digitalpulp.com>
Message-ID: <1DBFE5E9-9565-44AB-B7E7-0010848B869B@benjaminkrause.com>
John,
> On Nov 15, 2007, at 1:41 PM, Benjamin Krause wrote:
>> i would not suggest usings AAF's ferret server for searching,
>> though .. but it's quite easy to do the searching in each mongrel, so
>> not concern here either.
>
> I'm confused... what does "searching" mean in this context? :)
If you're using AAF, you should use the ferret drb server to index
your objects. however, using the ferret server means that whenever
someone searches (if you're using Model.find_by_contents),
the search will be forwarded to the ferret server.
The ferret server will process the search request and send
the response back to the mongrel. This overhead isn't
necessary, as mongrel could use a local index to do the
search. there is no need to bother the ferret server.
so, indexing (aka updating, creating, saving, whatever) should
use the ferret server, but searching (using find_by_contents)
will also use the ferret server if you're using standard AAF, even
though it's not really necessary and could result in a bottleneck.
don't get me wrong. it is totally fine to use standard AAF, unless
you're having huge amounts of searches or livesearches. I would
not recommend using a custom ferret solution, unless you
expect a problem or already have one :)
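The split described above can be sketched as follows. `SplitIndexClient` is a hypothetical class, not something AAF provides; the writer stands in for a DRb proxy to the ferret server and the searcher for a read-only index opened inside each mongrel.

```ruby
# Hypothetical client that sends writes to one object (e.g. a DRb proxy for
# the ferret server) and runs searches against another (e.g. a read-only
# index opened inside the mongrel process).
class SplitIndexClient
  def initialize(remote_writer, local_searcher)
    @remote_writer = remote_writer
    @local_searcher = local_searcher
  end

  def add(doc)
    @remote_writer << doc          # serialized through the single server
  end

  def search(term)
    @local_searcher.call(term)     # no round trip to the server
  end
end

shared_store = []                  # stands in for the index files on disk
searcher = lambda { |term| shared_store.select { |d| d[:name].include?(term) } }
client = SplitIndexClient.new(shared_store, searcher)

client.add(:name => "Frankrijk")
p client.search("Frank")
```

One caveat this toy hides: a local reader typically has to be reopened or refreshed before it sees documents written after it was opened.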
Cheers
Ben
---
Benjamin Krause
http://www.omdb.org/
bk at benjaminkrause.com
Rails-Schulung "Advancing with Rails" mit David A. Black
19.11.-22.11.2007, Berlin-Mitte
Details u. Anmeldung: http://www.railsschulung.de
From bk at benjaminkrause.com Thu Nov 15 15:12:44 2007
From: bk at benjaminkrause.com (Benjamin Krause)
Date: Thu, 15 Nov 2007 21:12:44 +0100
Subject: [Ferret-talk] indexing runs out of memory
In-Reply-To: <7.0.1.0.2.20071115133259.03837958@codewell.com>
References: <7.0.1.0.2.20071115133259.03837958@codewell.com>
Message-ID: <38BBC4CB-F693-499B-82E9-8C1F03B3A7E3@benjaminkrause.com>
Jeff,
On 2007-11-15, at 19:39, Jeff Mallatt wrote:
> [FATAL] failed to allocate memory
Yes, closing and reopening the IndexWriter might
help.
There have been reports on this list of ferret indexes of 3
gigs or more .. so i don't think this is a
general problem.
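A minimal sketch of the close-and-reopen approach, under stated assumptions: `index_in_slices` and `FakeWriter` are made up for illustration, and `open_writer` is a hypothetical factory that, with Ferret, might wrap something like Ferret::Index::IndexWriter.new(:path => ...). Whether this actually cures the memory growth is untested here.

```ruby
# Index documents in slices, opening a fresh writer for each slice and
# closing it afterwards so that whatever it buffered can be released.
def index_in_slices(docs, slice_size, open_writer)
  docs.each_slice(slice_size) do |slice|
    writer = open_writer.call
    begin
      slice.each { |doc| writer << doc }
    ensure
      writer.close               # close even if adding a document fails
    end
  end
end

# Stand-in writer that records what happened, for trying the helper out.
class FakeWriter
  attr_reader :docs, :closed

  def initialize
    @docs = []
    @closed = false
  end

  def <<(doc)
    @docs << doc
    self
  end

  def close
    @closed = true
  end
end

writers = []
factory = lambda { w = FakeWriter.new; writers << w; w }
index_in_slices((1..10).to_a, 4, factory)
p writers.map { |w| w.docs.size }
```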
Ben
From aquajags at yahoo.com Fri Nov 16 01:56:12 2007
From: aquajags at yahoo.com (Jagdish rao)
Date: Thu, 15 Nov 2007 22:56:12 -0800 (PST)
Subject: [Ferret-talk] problem with searching plurals (with apostrophe)
Message-ID: <784388.66109.qm@web60416.mail.yahoo.com>
hello guys,
i am using acts_as_ferret plugin(0.4.1 Latest) with ferret gem(0.11.4 Latest)
on rails 1.2.5 and ruby 1.8.6(UBUNTU Gutsy)
i have this
:Stores Model
acts_as_ferret :fields => { :name          => { :boost => 2,   :store => :yes },
                            :short_desc    => { :boost => 1.5, :store => :yes },
                            :tag_list      => { :boost => 1 },
                            :name_for_sort => { :index => :untokenized } }
and i search using this code in my Stores controller
@products = Store.find_by_contents params[:q].to_s.upcase+"*"
for e.g i have a Stores with name as
"benhank's coffee outlet"
when i search for "benhank" i get the store as expected.
but when i search with the param "benhank's" or "benhanks" ---
i don't get any results.
at least i should have got a result for the search with "benhank's",
which is actually what is entered in the :name field
how can i get this done? please help
i have been trying to understand how to use the analysers and tokenisers
but couldn't get through. also looking at wildcardquery and fuzzy things
jags
From kraemer at webit.de Fri Nov 16 04:31:53 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Fri, 16 Nov 2007 10:31:53 +0100
Subject: [Ferret-talk] problem with searching plurals (with apostrophe)
In-Reply-To: <784388.66109.qm@web60416.mail.yahoo.com>
References: <784388.66109.qm@web60416.mail.yahoo.com>
Message-ID: <20071116093153.GE10556@cordoba.webit.de>
Hi,
your problem is pretty much analyzer and tokenization related, so you
really should understand what happens there.
Have you tried the Ferret Short Cut PDF book from O'Reilly?
In general you'll need a stemming analyzer to strip plural endings from
words. Regarding the "'" - it's a question of the tokenizer you're using
whether the 's ending is considered to be part of the word it follows,
or 's' is interpreted as a term of its own.
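To make the idea concrete, here is a toy normalizer in plain Ruby. It is not a real stemmer and not the Ferret API; `normalize_term` and `normalize_text` are hypothetical names. The only point is that the same normalization has to be applied to the indexed text and to the query.

```ruby
# Naive normalizer applied to both indexed names and user queries
# (a toy, not a real stemmer and not the Ferret API).
def normalize_term(word)
  w = word.downcase
  w = w.sub(/'s\z/, "")                   # possessive: benhank's -> benhank
  w = w.sub(/s\z/, "") if w.length > 3    # crude plural: benhanks -> benhank
  w
end

# Split text into words (keeping apostrophes) and normalize each one.
def normalize_text(text)
  text.scan(/[[:alnum:]']+/).map { |w| normalize_term(w) }
end

indexed = normalize_text("benhank's coffee outlet")
%w[benhank benhank's benhanks].each do |q|
  puts "#{q}: #{indexed.include?(normalize_term(q))}"
end
```

The crude plural rule will mangle words that merely end in "s"; a proper stemming filter in the analyzer is the more robust route.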
Cheers,
Jens
On Thu, Nov 15, 2007 at 10:56:12PM -0800, Jagdish rao wrote:
>
>
> hello guys,
>
> i am using acts_as_ferret plugin(0.4.1 Latest) with ferret gem(0.11.4 Latest)
> on rails 1.2.5 and ruby 1.8.6(UBUNTU Gutsy)
> i have this
> :Stores Model
> acts_as_ferret :fields => {:name => { :boost => 2 ,:store => :yes},
> :short_desc => { :boost => 1.5,:store =>
> :yes },
> :tag_list => {:boost => 1 },
> :name_for_sort => {:index => :untokenized}
> }
>
> and i search using this code in my Stores controller
>
> @products = Store.find_by_contents params[:q].to_s.upcase+"*"
>
> for e.g i have a Stores with name as
> "benhank's coffee outlet"
>
> when i search for "benhank" i get the resultant store as expected.
> but when i search with param as "benhank's" or "benhanks" ---
> i dont get anyresults.
> atleast i shud have got the result for search with "benhank's"
> which is actually what is entered in :name field
>
> how can i get this done. pls help
> i have been trying to understand to use the analysers and tokenisers
> but couldn't get through.also looking at wildcardquery and fuzzy things
>
> thanks
> jags
> _______________________________________________
> Ferret-talk mailing list
> Ferret-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/ferret-talk
--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de
Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
From scottd at gmail.com Fri Nov 16 05:56:26 2007
From: scottd at gmail.com (Scott Davies)
Date: Fri, 16 Nov 2007 02:56:26 -0800
Subject: [Ferret-talk] Multithreading / multiprocessing woes
Message-ID: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
I've been running some multithreaded tests on Ferret. Using a single
Ferret::Index::Index inside a DRb server, it definitely behaves for me
as if all readers are locked out of the index when writing is going on
in that index, not just optimization -- at least when segment merging
happens, which is when the writes take the longest and you can
therefore least afford to lock out all reads. This is very easy to
notice when you add, say, your 100,000th document to the index, and
that one write takes over 5 seconds to complete because it triggers a
bunch of incremental segment-merging, and all queries to the index
stall in the meantime. Or when you add your millionth document, which
can stall all reads for over a minute. :-(
When I try to use an IndexReader in a separate process, things are
even worse. The IndexReader doesn't see any updates to the index
since it was created. Not too surprising, but if I try creating a new
IndexReader for every query, and have the Index in the other writing
process turn on auto_flush, then the reading process crashes after a
few (generally fewer than 100) queries, in one of at least two
different ways selected apparently at random:
Failure Mode #1:
script/ferret_speedtest2_reader:30:in `initialize': IO Error occured
at :93 in xraise (IOError)
Error occured in index.c:901 - sis_find_segments_file
Error reading the segment infos. Store listing was
from script/ferret_speedtest2_reader:30:in `new'
from script/ferret_speedtest2_reader:30:in `run_test_query'
[Yes, there really are two blank lines after "Store listing was".]
Failure Mode #2:
script/ferret_speedtest2_reader:30:in `initialize': IO Error occured
at :93 in xraise (IOError)
Error occured in fs_store.c:127 - fs_each
doing 'each' in
/Users/scott/dev/ruby/timetracker/tmp/ferret_speedtest_index:
from script/ferret_speedtest2_reader:30:in `new'
from script/ferret_speedtest2_reader:30:in `run_test_query'
Meanwhile, if I try eliminating this second failure mode by explicitly
calling close on the IndexReader
before I throw it away, the close immediately crashes with:
script/ferret_speedtest2_reader:45: [BUG] Bus Error
ruby 1.8.6 (2007-03-13) [i686-darwin8.8.5]
Abort trap
Given the combination of problems above, I'm at a loss to understand
how to use Ferret on a live website that requires reasonably fast
turnaround between a user submitting data and the user being able to
search over that data, unless either (1) the site only gets a few
thousand new index entries per day and the site can be taken down for
a few minutes daily to optimize the index, or (2) it's OK for the
entire site to periodically stall on all queries for seconds or even
minutes whenever segment-merging happens to kick in.
Do all Ferret users just suck it up and live with one of these
limitations, or am I missing something and/or just getting "lucky"
with the errors above?
For reference, the system being used here is a Mac running Leopard,
although I doubt that matters...
From bk at benjaminkrause.com Fri Nov 16 07:12:34 2007
From: bk at benjaminkrause.com (Benjamin Krause)
Date: Fri, 16 Nov 2007 13:12:34 +0100
Subject: [Ferret-talk] Multithreading / multiprocessing woes
In-Reply-To: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
References: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
Message-ID:
Scott,
> Do all Ferret users just suck it up and live with one of these
> limitations, or am I missing something and/or just getting "lucky"
> with the errors above?
The limitations you're talking about are known and will be fixed
in the near future. The trick is to have one read-only and one
write-only index. This is currently being worked on. If you need
a fix right now, you need to do it yourself, but you can take a look
at omdb's code and how it's done there:
http://bugs.omdb.org/browser/branches/2007.1/lib/omdb/ferret/lib/util.rb
(see the switch code)
If you don't need a fix right now, I'm sure AAF will come up with
a solution for that in the near future (aka probably not this year).
On a side note, for the "too many open files" error, see:
http://ferret.davebalmain.com/api/classes/Ferret/Index/IndexWriter.html
(use_compound_file, you may have set this to false) or simply increase
the number of open files. On omdb we're running with 32k :-)
rails at homer.omdb.org ~ $ ulimit -n
32768
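For anyone who wants to check the same limit from inside the Ruby process rather than from the shell, a small sketch using the standard `Process.getrlimit` call follows. The helper name `open_file_limits` is made up for this example.

```ruby
# Returns the [soft, hard] open-file limits of the current process, i.e. the
# value `ulimit -n` reports for the shell that started it.
def open_file_limits
  Process.getrlimit(Process::RLIMIT_NOFILE)
end

soft, hard = open_file_limits
puts "open files: soft=#{soft} hard=#{hard}"
```

The soft limit is the one a process actually runs into; it can be raised up to the hard limit without special privileges.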
Cheers
Ben
From pjones at pmade.com Fri Nov 16 11:24:01 2007
From: pjones at pmade.com (Peter Jones)
Date: Fri, 16 Nov 2007 09:24:01 -0700
Subject: [Ferret-talk] Reducing dependency on remote ferret process
In-Reply-To:
References:
Message-ID:
Morten,
If you're still looking at how to solve this, here is what I did.
This is just a hack, but I didn't really have a choice: this coupling
was killing my entire application stack.
--- act_methods.rb (revision 1534)
+++ act_methods.rb (working copy)
@@ -185,9 +185,10 @@
end
logger.info "default field list: #{aaf_configuration[:ferret][:default_field].inspect}"
- if options[:remote]
- aaf_index.ensure_index_exists
- end
+ # FIXME fix and send a patch to the AAF team
+ # if options[:remote]
+ # aaf_index.ensure_index_exists
+ # end
end
--
Peter Jones - 303-669-2637
pmade inc. - http://pmade.com
On Nov 11, 2007, at 08:09, Morten wrote:
> We use FerretDrb for search. If the ferret process is down, our entire
> application comes down the moment we try to save a model which is
> indexed.
>
> Is there a way to decouple this relationship such that we can somehow
> resume normal operations despite ferret being down and not index the
> model?
From mail at stuartsierra.com Fri Nov 16 12:19:10 2007
From: mail at stuartsierra.com (Stuart Sierra)
Date: Fri, 16 Nov 2007 12:19:10 -0500
Subject: [Ferret-talk] Ferret/AAF Stability?
In-Reply-To:
References:
Message-ID: <314ee0450711160919y7eaad1cl16ef0bc349b09dee@mail.gmail.com>
On Nov 15, 2007 9:37 AM, Sam Smoot wrote:
> Hello. I'm the author of DataMapper (http://datamapper.org), and am
> trying to choose what Full-Text-Indexing engine/plugin I want to
> include by default. I was hoping you guys could help. :-)
>
> Sphinx comes highly recommended, but without live index updates, it
> just doesn't seem practical for most of my work.
>
> I'm most experienced with Solr, but the whole HTTP::Request and
> general complexity of it is off-putting.
For a different perspective: I'm in the middle of switching from
Ferret to Solr. I like Ferret a lot, and still use it on several
sites, but I had some problems with one large site:
1. the patches for large-index support are still in development;
2. each update to Ferret requires rebuilding the index;
3. Ferret doesn't yet support compressed indexes.
My other reason for switching is that Rails' ActiveRecord is not
well-suited to storing large documents, which made acts_as_ferret less
compelling.
I was nervous about tackling Solr, but I've found it quite easy to
use, and the built-in caching and multithreading make it fast.
I think Ferret is adequate for most search tasks, but if (like me)
you're building a dedicated search engine, Solr is currently a
stronger candidate.
-Stuart Sierra
From scottd at gmail.com Fri Nov 16 15:35:36 2007
From: scottd at gmail.com (Scott Davies)
Date: Fri, 16 Nov 2007 12:35:36 -0800
Subject: [Ferret-talk] Multithreading / multiprocessing woes
In-Reply-To:
References: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
Message-ID: <75f591160711161235g692622c5p4b7fb907e9596cdb@mail.gmail.com>
Hi Ben --
Thanks much for the quick and helpful reply! Unfortunately, the
solution you're using on omdb looks suspect to me, for the same reason
that Alex Neth brought up a few days ago on this list: to my knowledge
there's no guarantee that rsync will produce a coherent snapshot of
the source directory as it was at any one particular instant in time.
In fact, I don't see how rsync could both always terminate in finite
time and provide such a guarantee, except on exotic filesystems that
provide, say, atomic snapshots with copy-on-write capabilities.
(Sigh...sometimes I miss the Google File System.) In which case you'd
have to disable your site during the rsync in order to prevent
corruption, which basically boils down to the "must take site offline
daily for a few minutes to deal with this problem" limitation. I'm
guessing the rsync is faster than an index optimization, so I guess
this might at least cut down on the amount of time the site has to be
down, but still...wah.
Am I a fool for wondering whether it might ultimately be less painful
to try an index server that runs Lucene under a JRuby process?
From bk at benjaminkrause.com Fri Nov 16 17:40:03 2007
From: bk at benjaminkrause.com (Benjamin Krause)
Date: Fri, 16 Nov 2007 23:40:03 +0100
Subject: [Ferret-talk] Multithreading / multiprocessing woes
In-Reply-To: <75f591160711161235g692622c5p4b7fb907e9596cdb@mail.gmail.com>
References: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
<75f591160711161235g692622c5p4b7fb907e9596cdb@mail.gmail.com>
Message-ID: <0F759925-5885-4389-BD81-95E28665AA90@benjaminkrause.com>
Scott,
we're using two directories for Ferret, not one. One
index is the passive index. It is not used for searches,
but new indexing requests will be added to that index,
so let's call it the indexing-index.
All mongrels will use the second directory, let's call it
the searching-index. Both indexes are almost identical;
I'll explain the differences.
All our indexing requests are queued. So whenever
you want to index something, it will be placed in the
queue and added to the indexing-index. After a
certain number of queue items have been added to the index,
we stop indexing. The queue will be halted.
New requests can be added to the queue, but nothing will be
added to the indexing-index.
Now we rsync the indexing-index to all machines.
Remember, searching is still done in the searching-index,
which is outdated, but we don't mind about that :)
After the rsync is complete, we switch both directories,
so the indexing-index becomes the searching-index and
vice versa. Actually we're just switching symlinks, so
this will take almost no time. And even if one of the
mongrels still has a filehandle to the old index open,
nothing will happen: it is still using the outdated index,
but the next request will use the new index. After that,
the new indexing-index will be synced from the
searching-index. As the searching-index is read-only,
there is no risk of corrupting something during the
sync.
Now we resume processing the queue, until we've
added our certain number of queue entries or the queue
is empty.
The downside is that the searching-index is outdated,
but by not more than a couple of minutes (about 2 minutes
on omdb). We haven't had a single corrupted index since.
There is no downtime whatsoever, and the rsync snapshot
will always be coherent.
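The symlink switch described above could look roughly like the following sketch. This is not omdb's actual code (that lives in the util.rb linked earlier in the thread); the directory names and the `switch_symlink` helper are assumptions made for illustration. It relies on the POSIX guarantee that renaming over an existing name replaces it atomically.

```ruby
require "fileutils"
require "tmpdir"

# Repoints the symlink at +link+ to +target+ by creating a temporary symlink
# and renaming it over the old one. rename(2) replaces the old link atomically
# on POSIX filesystems, so a reader sees either the old or the new target.
def switch_symlink(link, target)
  tmp = "#{link}.tmp.#{Process.pid}"
  File.unlink(tmp) if File.symlink?(tmp)
  File.symlink(target, tmp)
  File.rename(tmp, link)
end

# Hypothetical layout: two index directories plus the link that searches use.
Dir.mktmpdir do |root|
  index_a = File.join(root, "index_a")
  index_b = File.join(root, "index_b")
  FileUtils.mkdir_p([index_a, index_b])
  searching = File.join(root, "searching-index")

  switch_symlink(searching, index_a)
  puts File.readlink(searching)
  switch_symlink(searching, index_b)
  puts File.readlink(searching)
end
```

A process that already opened files through the old link keeps reading the old directory until it reopens the index, which matches the behaviour described for the mongrels.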
Cheers
Ben
From scottd at gmail.com Fri Nov 16 19:37:26 2007
From: scottd at gmail.com (Scott Davies)
Date: Fri, 16 Nov 2007 16:37:26 -0800
Subject: [Ferret-talk] Multithreading / multiprocessing woes
In-Reply-To: <0F759925-5885-4389-BD81-95E28665AA90@benjaminkrause.com>
References: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
<75f591160711161235g692622c5p4b7fb907e9596cdb@mail.gmail.com>
<0F759925-5885-4389-BD81-95E28665AA90@benjaminkrause.com>
Message-ID: <75f591160711161637p74fded32h39d58fc56f29a341@mail.gmail.com>
Ben --
Thanks for the detailed explanation! Yes, that does make sense. If I
understand it correctly, though, something won't show up in a search
until at least one index switch happens after it's been submitted,
which means we're talking about a minute or so on average (not just
worst-case) from submission to search result, even if the switches are
being done constantly (given that each switch takes about two
minutes). For my site, I'm really hoping that most content will show
up within a second or so of its submission. That simply can't happen
if I'm not updating the same index I'm doing searches with. I'd be OK
with the turnaround *occasionally* being a minute -- say, while an
index optimization or particularly large segment merge happens. But
so far it looks to me like the choices with Ferret are either:
(1) The *average* time from submission to search result is on the
order of minutes. However, searches are always reasonably fast.
(Your approach.)
(2) The average time from submission to search result is less than a
second. However, the *worst-case* times can be minutes, and now all
*searches* stall over those minutes as well, which is Bad. If you
don't get more than a few thousand submissions per day, you can at
least schedule these outages as nightly index optimizations, but
you'll have the outages one way or another. (All "same index used for
reading + writing" approaches.)
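The "minute or so on average" estimate for the two-index approach can be checked with a few lines of Ruby. The sketch assumes, hypothetically, that submissions arrive evenly spread over a switch cycle and become searchable at the first switch after they arrive; any extra queueing or sync time would only add to the result.

```ruby
# Average delay until a submission becomes searchable when the searching index
# is refreshed only at the end of each switch cycle. Assumption (not from the
# thread): arrivals are spread evenly over the cycle.
def average_visibility_delay(cycle_seconds, samples = 1000)
  # Take arrival offsets at the midpoints of equal slices of the cycle.
  delays = (0...samples).map do |i|
    arrival = (i + 0.5) * cycle_seconds / samples
    cycle_seconds - arrival
  end
  delays.inject(0.0) { |sum, d| sum + d } / samples
end

puts average_visibility_delay(120.0) # roughly 60 seconds for a 2-minute cycle
```

So with a two-minute cycle the expected wait is about half the cycle length, while the worst case is the full two minutes.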
I don't think either of these choices is very good for the particular
site I have in mind (at least if I'm being optimistic enough about its
chances of "taking off" to worry about the possibility of many
thousands of submissions / day). Am I correct in my summarization of
the two choices with Ferret here, or have I missed something?
Anyhow, thanks again! If those two options are in fact what I have, I
think I'll run some tests with Lucene/JRuby to see whether that
provides a third option as far as performance goes, and report back
what sort of issues come up. (My guess is that it'll be moderately
painful to set up and that the average throughput will be worse than
Ferret's, but that an average submission-to-search-result turnaround
time of a second or two will be achievable without the site
necessarily going completely down for minutes every now and then.
We'll see.)
-- Scott
From erik at ehatchersolutions.com Fri Nov 16 16:13:15 2007
From: erik at ehatchersolutions.com (Erik Hatcher)
Date: Fri, 16 Nov 2007 16:13:15 -0500
Subject: [Ferret-talk] Multithreading / multiprocessing woes
In-Reply-To: <75f591160711161235g692622c5p4b7fb907e9596cdb@mail.gmail.com>
References: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
<75f591160711161235g692622c5p4b7fb907e9596cdb@mail.gmail.com>
Message-ID: <8C5DFF2D-440E-4A5A-8D25-5CA965938D02@ehatchersolutions.com>
On Nov 16, 2007, at 3:35 PM, Scott Davies wrote:
> Am I a fool for wondering whether it might ultimately be less painful
> to try an index server that runs Lucene under a JRuby process?
Or, rather, an index server that runs Solr accessed with a pure Ruby,
solr-ruby, API (which works with MRI or JRuby)? :)
Erik
From scottd at gmail.com Sat Nov 17 05:12:46 2007
From: scottd at gmail.com (Scott Davies)
Date: Sat, 17 Nov 2007 02:12:46 -0800
Subject: [Ferret-talk] Multithreading / multiprocessing woes
In-Reply-To: <8C5DFF2D-440E-4A5A-8D25-5CA965938D02@ehatchersolutions.com>
References: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
<75f591160711161235g692622c5p4b7fb907e9596cdb@mail.gmail.com>
<8C5DFF2D-440E-4A5A-8D25-5CA965938D02@ehatchersolutions.com>
Message-ID: <75f591160711170212q1d4d6475v3e830a64ff4c3dc2@mail.gmail.com>
Hmmm...I'd first heard of Solr only a couple of days ago, and I hadn't
been aware of a Ruby API to it until you mentioned it.
Interesting...thanks!
From jk at jkraemer.net Sat Nov 17 07:39:26 2007
From: jk at jkraemer.net (Jens Kraemer)
Date: Sat, 17 Nov 2007 13:39:26 +0100
Subject: [Ferret-talk] Ferret/AAF Stability?
In-Reply-To: <314ee0450711160919y7eaad1cl16ef0bc349b09dee@mail.gmail.com>
References:
<314ee0450711160919y7eaad1cl16ef0bc349b09dee@mail.gmail.com>
Message-ID: <20071117123925.GO3558@thunder.jkraemer.net>
Hi!
On Fri, Nov 16, 2007 at 12:19:10PM -0500, Stuart Sierra wrote:
[..]
> For a different perspective: I'm in the middle of switching from
> Ferret to Solr. I like Ferret a lot, and still use it on several
> sites, but I had some problems with one large site:
>
> 1. the patches for large-index support are still in development;
Let's hope Dave reads this ;-) However there are several sites I know of
with Index sizes > several GB, so they seem to be working well enough.
> 2. each update to Ferret requires rebuilding the index;
This is certainly annoying, but I'd consider it normal for a library
that has developed that fast. I think Dave has had very good reasons for each
of the changes he made to the index format. Plus I don't think *every*
release had a new index format ;-)
> 3. Ferret doesn't yet support compressed indexes.
At least from the docs it looks like it does, see
http://ferret.davebalmain.com/api/classes/Ferret/Index/FieldInfo.html .
I didn't ever try this out however.
> My other reason for switching is that Rails' ActiveRecord is not
> well-suited to storing large documents, which made acts_as_ferret less
> compelling.
That's a good point, and we plan to make aaf independent from
active_record in the future.
> I was nervous about tackling Solr, but I've found it quite easy to
> use, and the built-in caching and multithreading make it fast.
numbers, please :-)
> I think Ferret is adequate for most search tasks, but if (like me)
> you're building a dedicated search engine, Solr is currently a
> stronger candidate.
Well, as Solr uses Lucene internally, the mechanics and performance
characteristics naturally can't be that different from Ferret. Maybe
Ferret has a bug or two and non-working inter-process locking (which
doesn't matter when you think about building a dedicated search server
like Solr is, since it's only one process), but the general internal
handling of the index is the same, i.e. you can also only have one
Writer open to a Lucene index at a time, and Searchers won't see index
changes until re-opened, too.
Having said that, if my application's main concern were search, I
most probably wouldn't choose any pre-cooked solution like aaf or Solr,
but build exactly the thing I need from scratch, basing it either on
Lucene or Ferret. But maybe that's just me ;-)
Cheers,
Jens
--
Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database
From jk at jkraemer.net Sat Nov 17 16:50:57 2007
From: jk at jkraemer.net (Jens Kraemer)
Date: Sat, 17 Nov 2007 22:50:57 +0100
Subject: [Ferret-talk] Multithreading / multiprocessing woes
In-Reply-To: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
References: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
Message-ID: <20071117215057.GP3558@thunder.jkraemer.net>
Hi!
On Fri, Nov 16, 2007 at 02:56:26AM -0800, Scott Davies wrote:
> I've been running some multithreaded tests on Ferret. Using a single
> Ferret::Index::Index inside a DRb server, it definitely behaves for me
> as if all readers are locked out of the index when writing is going on
> in that index, not just optimization -- at least when segment merging
> happens, which is when the writes take the longest and you can
> therefore least afford to lock out all reads. This is very easy to
> notice when you add, say, your 100,000th document to the index, and
> that one write takes over 5 seconds to complete because it triggers a
> bunch of incremental segment-merging, and all queries to the index
> stall in the meantime. Or when you add your millionth document, which
> can stall all reads for over a minute. :-(
Don't get me wrong, but how often do you think you'll add your millionth
document to the index?
And even if you really do index a million documents per week - I
wouldn't exactly call it bad performance if one or two search requests
*per week* take a minute to complete, while all others are completed in
less than a second...
Having said that, the problem with blocking searches might be possible
to solve by not using Ferret's Index class for searching/indexing, but
using the lower level APIs (Searcher and IndexWriter) and doing manual
synchronization (inside *one* process). I didn't feel the need to
implement this for aaf (yet ;-), since I think it's already fast enough
to not be the bottleneck in most real world usage scenarios (say -
typical Rails apps using aaf for full text search).
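One possible shape for such manual synchronization inside a single process is sketched below. The message does not spell out a scheme, so this is only an assumption: searches read an immutable snapshot, and the writer prepares the next snapshot without holding the lock that readers need, taking that lock only for the final swap. Plain Ruby arrays stand in for the index; no Ferret classes are used, and `SnapshotIndex` is a made-up name.

```ruby
require "thread"

# Hypothetical illustration, not Ferret code: readers never wait for a slow
# write, only for the very short reference swap.
class SnapshotIndex
  def initialize
    @swap_lock = Mutex.new   # guards only the snapshot reference
    @write_lock = Mutex.new  # allows one writer at a time
    @snapshot = [].freeze
  end

  def add(doc)
    @write_lock.synchronize do
      updated = (current + [doc]).freeze  # slow part, no reader lock held
      @swap_lock.synchronize { @snapshot = updated }
    end
  end

  def search(term)
    current.select { |doc| doc.include?(term) }
  end

  private

  def current
    @swap_lock.synchronize { @snapshot }
  end
end

index = SnapshotIndex.new
writer = Thread.new { 100.times { |i| index.add("document #{i}") } }
reader = Thread.new { 100.times { index.search("document") } }
[writer, reader].each(&:join)
puts index.search("document 4").size
```

The trade-off is the same one discussed in this thread: a search may run against a snapshot that is slightly behind the latest write, in exchange for never stalling behind it.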
> When I try to use an IndexReader in a separate process, things are
> even worse. The IndexReader doesn't see any updates to the index
> since it was created. Not too surprising, but if I try creating a new
> IndexReader for every query, and have the Index in the other writing
> process turn on auto_flush, then the reading process crashes after a
> few (generally fewer than 100) queries, in one of at least two
> different ways selected apparently at random:
[..]
Stick to the one-process-per-index rule to be on the safe side.
> Given the combination of problems above, I'm at a loss to understand
> how to use Ferret on a live website that requires reasonably fast
> turnaround between a user submitting data and the user being able to
> search over that data, unless either (1) the site only gets a few
> thousand new index entries per day and the site can be taken down for
> a few minutes daily to optimize the index, or (2) it's OK for the
> entire site to periodically stall on all queries for seconds or even
> minutes whenever segment-merging happens to kick in.
I wouldn't set the limit at a few thousand new documents per day, and
optimizing daily is only useful if you have lots of document
deletions per day.
Cheers,
Jens
PS: If you happen to benchmark Solr against aaf's DRb server, be sure to
let us know your findings :-)
--
Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database
From ndaniels at mac.com Sat Nov 17 18:27:29 2007
From: ndaniels at mac.com (Noah M. Daniels)
Date: Sat, 17 Nov 2007 18:27:29 -0500
Subject: [Ferret-talk] crash while building index
Message-ID:
Hi,
I'm trying to reindex a model (I'm using acts_as_ferret) after having
added (via metaprogramming) a large number of fields (several hundred)
to the index.
It keeps crashing when trying to rebuild the index (the crash log is
below, from ferret_server.out) but it only seems to crash on Linux
(Ubuntu server 7.04, x86-64) whereas it's fine on my OS X laptop
(10.5.1). This is with ferret 0.11.4 in both cases.
Any thoughts? Is there a hard field limit in ferret?
*** glibc detected *** ruby: realloc(): invalid next size: 0x000000000232ffc0 ***
======= Backtrace: =========
/lib/libc.so.6[0x2ae17c1a549d]
/lib/libc.so.6(realloc+0x124)[0x2ae17c1a74e4]
/usr/lib/libruby1.8.so.1.8(ruby_xrealloc+0x5c)[0x2ae17b5baf8c]
/usr/lib/ruby/gems/1.8/gems/ferret-0.11.4/lib/ferret_ext.so(mp_alloc+0xb6)[0x2ae18094c886]
/usr/lib/ruby/gems/1.8/gems/ferret-0.11.4/lib/ferret_ext.so(dw_get_fld_inv+0xf7)[0x2ae1809732b7]
/usr/lib/ruby/gems/1.8/gems/ferret-0.11.4/lib/ferret_ext.so(dw_add_doc+0x86)[0x2ae18097a146]
/usr/lib/ruby/gems/1.8/gems/ferret-0.11.4/lib/ferret_ext.so(iw_add_doc+0x24)[0x2ae18097a284]
/usr/lib/ruby/gems/1.8/gems/ferret-0.11.4/lib/ferret_ext.so[0x2ae1809384a3]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5a40ce]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5a45d8]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ac541]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5af52e]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ac6f0]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5af52e]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ac6f0]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ad207]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5a40ea]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5a45d8]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ac67f]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ad8dd]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5acfb1]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5a40ea]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5a45d8]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ac541]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ad8dd]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5a40ea]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5a45d8]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ac541]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5af52e]
/usr/lib/libruby1.8.so.1.8(rb_ary_each+0x23)[0x2ae17b58a853]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5a40ce]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5a45d8]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ac541]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ad8dd]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5af52e]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ac6f0]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5a40ea]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5a45d8]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ac67f]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ad8dd]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ac44c]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5aaa57]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5af52e]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ac6f0]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5af52e]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5cdd53]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5a40ce]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5a45d8]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ac541]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ad8dd]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5af52e]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ac6f0]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5acfb1]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ad207]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5a40ea]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5a45d8]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ac541]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ae19f]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5abb23]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5aaa57]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5acfb1]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5ad207]
/usr/lib/libruby1.8.so.1.8[0x2ae17b5a40ea]
======= Memory map: ========
00400000-00401000 r-xp 00000000 fe:00 6148859 /usr/bin/ruby1.8
00600000-00601000 rw-p 00000000 fe:00 6148859 /usr/bin/ruby1.8
00601000-02cb4000 rw-p 00601000 00:00 0 [heap]
40000000-40001000 ---p 40000000 00:00 0
40001000-40801000 rw-p 40001000 00:00 0
2aaaaaaac000-2aaaaaaaf000 rw-p 2aaaaaaac000 00:00 0
2aaaaaaaf000-2aaaaaaea000 r--p 00000000 fe:00 6178429 /usr/lib/locale/en_US.utf8/LC_CTYPE
2aaaaaaea000-2aaaaaaf1000 r--s 00000000 fe:00 6146048 /usr/lib/gconv/gconv-modules.cache
2aaaaaaf1000-2aaaaaaf9000 r-xp 00000000 fe:00 6294630 /usr/lib/ruby/gems/1.8/gems/postgres-0.7.1/postgres.so
2aaaaaaf9000-2aaaaacf8000 ---p 00008000 fe:00 6294630 /usr/lib/ruby/gems/1.8/gems/postgres-0.7.1/postgres.so
2aaaaacf8000-2aaaaacf9000 rw-p 00007000 fe:00 6294630 /usr/lib/ruby/gems/1.8/gems/postgres-0.7.1/postgres.so
2aaaaacf9000-2aaaaacfd000 rw-p 2aaaaacf9000 00:00 0
2aaaaacff000-2aaaaad1e000 r-xp 00000000 fe:00 6149409 /usr/lib/libpq.so.5.0
2aaaaad1e000-2aaaaaf1e000 ---p 0001f000 fe:00 6149409 /usr/lib/libpq.so.5.0
2aaaaaf1e000-2aaaaaf20000 rw-p 0001f000 fe:00 6149409 /usr/lib/libpq.so.5.0
2aaaaaf20000-2aaaaaf34000 r-xp 00000000 fe:00 29212699 /lib/libnsl-2.5.so
2aaaaaf34000-2aaaab134000 ---p 00014000 fe:00 29212699 /lib/libnsl-2.5.so
2aaaab134000-2aaaab136000 rw-p 00014000 fe:00 29212699 /lib/libnsl-2.5.so
2aaaab136000-2aaaab138000 rw-p 2aaaab136000 00:00 0
2aaaab138000-2aaaab1bb000 r-xp 00000000 fe:00 6148718 /usr/lib/libkrb5.so.3.2
2aaaab1bb000-2aaaab3ba000 ---p 00083000 fe:00 6148718 /usr/lib/libkrb5.so.3.2
2aaaab3ba000-2aaaab3be000 rw-p 00082000 fe:00 6148718 /usr/lib/libkrb5.so.3.2
2aaaab3be000-2aaaab3c0000 r-xp 00000000 fe:00 29212682 /lib/libcom_err.so.2.1
2aaaab3c0000-2aaaab5bf000 ---p 00002000 fe:00 29212682 /lib/libcom_err.so.2.1
2aaaab5bf000-2aaaab5c0000 rw-p 00001000 fe:00 29212682 /lib/libcom_err.so.2.1
2aaaab5c0000-2aaaab5e3000 r-xp 00000000 fe:00 6148715 /usr/lib/libk5crypto.so.3.0
2aaaab5e3000-2aaaab7e2000 ---p 00023000 fe:00 6148715 /usr/lib/libk5crypto.so.3.0
2aaaab7e2000-2aaaab7e4000 rw-p 00022000 fe:00 6148715 /usr/lib/libk5crypto.so.3.0
2aaaab7e4000-2aaaab7f5000 r-xp 00000000 fe:00 29212708 /lib/libresolv-2.5.so
2aaaab7f5000-2aaaab9f5000 ---p 00011000 fe:00 29212708 /lib/libresolv-2.5.so
2aaaab9f5000-2aaaab9f7000 rw-p 00011000 fe:00 29212708 /lib/libresolv-2.5.so
2aaaab9f7000-2aaaab9f9000 rw-p 2aaaab9f7000 00:00 0
2aaaab9f9000-2aaaab9fd000 r-xp 00000000 fe:00 6148719 /usr/lib/libkrb5support.so.0.0
2aaaab9fd000-2aaaabbfc000 ---p 00004000 fe:00 6148719 /usr/lib/libkrb5support.so.0.0
2aaaabbfc000-2aaaabbfd000 rw-p 00003000 fe:00 6148719 /usr/lib/libkrb5support.so.0.0
2aaaabbfd000-2aaaabc04000 r-xp 00000000 fe:00 29212700 /lib/libnss_compat-2.5.so
2aaaabc04000-2aaaabe04000 ---p 00007000 fe:00 29212700 /lib/libnss_compat-2.5.so
2aaaabe04000-2aaaabe06000 rw-p 00007000 fe:00 29212700 /lib/libnss_compat-2.5.so
2aaaabe06000-2aaaabe10000 r-xp 00000000 fe:00 29212704 /lib/libnss_nis-2.5.so
2aaaabe10000-2aaaac00f000 ---p 0000a000 fe:00 29212704 /lib/libnss_nis-2.5.so
2aaaac00f000-2aaaac011000 rw-p 00009000 fe:00 29212704 /lib/libnss_nis-2.5.so
2aaaac011000-2aaaac415000 rw-p 2aaaac011000 00:00 0
2aaaac41b000-2aaaac428000 r-xp 00000000 fe:00 29212686 /lib/libgcc_s.so.1
2aaaac428000-2aaaac628000 ---p 0000d000 fe:00 29212686 /lib/libgcc_s.so.1
2aaaac628000-2aaaac629000 rw-p 0000d000 fe:00 29212686 /lib/libgcc_s.so.1
2aaab0000000-2aaab0021000 rw-p 2aaab0000000 00:00 0
2aaab0021000-2aaab4000000 ---p 2aaab0021000 00:00 0
2ae17b353000-2ae17b36f000 r-xp 00000000 fe:00 29212687 /lib/ld-2.5.so
2ae17b36f000-2ae17b3d4000 rw-p 2ae17b36f000 00:00 0
2ae17b3d5000-2ae17b488000 rw-p 2ae17b3d5000 00:00 0
2ae17b56e000-2ae17b570000 rw-p 0001b000 fe:00
29212687 /lib/ld-2.5.so
2ae17b570000-2ae17b63d000 r-xp 00000000 fe:00
6148857 /usr/lib/libruby1.8.so.1.8.5
2ae17b63d000-2ae17b83c000 ---p 000cd000 fe:00
6148857 /usr/lib/libruby1.8.so.1.8.5
2ae17b83c000-2ae17b841000 rw-p 000cc000 fe:00
6148857 /usr/lib/libruby1.8.so.1.8.5
2ae17b841000-2ae17b85e000 rw-p 2ae17b841000 00:00 0
2ae17b85e000-2ae17b873000 r-xp 00000000 fe:00
29212707 /lib/libpthread-2.5.so
2ae17b873000-2ae17ba73000 ---p 00015000 fe:00
29212707 /lib/libpthread-2.5.so
2ae17ba73000-2ae17ba75000 rw-p 00015000 fe:00
29212707 /lib/libpthread-2.5.so
2ae17ba75000-2ae17ba79000 rw-p 2ae17ba75000 00:00 0
2ae17ba79000-2ae17ba7b000 r-xp 00000000 fe:00
29212696 /lib/libdl-2.5.so
2ae17ba7b000-2ae17bc7b000 ---p 00002000 fe:00
29212696 /lib/libdl-2.5.so
2ae17bc7b000-2ae17bc7d000 rw-p 00002000 fe:00
29212696 /lib/libdl-2.5.so
2ae17bc7d000-2ae17bc82000 r-xp 00000000 fe:00
29212695 /lib/libcrypt-2.5.so
2ae17bc82000-2ae17be81000 ---p 00005000 fe:00
29212695 /lib/libcrypt-2.5.so
2ae17be81000-2ae17be83000 rw-p 00004000 fe:00
29212695 /lib/libcrypt-2.5.so
2ae17be83000-2ae17beb2000 rw-p 2ae17be83000 00:00 0
2ae17beb2000-2ae17bf33000 r-xp 00000000 fe:00
29212697 /lib/libm-2.5.so
2ae17bf33000-2ae17c132000 ---p 00081000 fe:00
29212697 /lib/libm-2.5.so
2ae17c132000-2ae17c134000 rw-p 00080000 fe:00
29212697 /lib/libm-2.5.so
2ae17c134000-2ae17c27b000 r-xp 00000000 fe:00
29212693 /lib/libc-2.5.so
2ae17c27b000-2ae17c47b000 ---p 00147000 fe:00
29212693 /lib/libc-2.5.so
2ae17c47b000-2ae17c47e000 r--p 00147000 fe:00
29212693 /lib/libc-2.5.so
2ae17c47e000-2ae17c480000 rw-p 0014a000 fe:00
29212693 /lib/libc-2.5.so
2ae17c480000-2ae17c487000 rw-p 2ae17c480000 00:00 0
2ae17c487000-2ae17c492000 r-xp 00000000 fe:00
6209867 /usr/lib/ruby/1.8/x86_64-linux/socket.so
2ae17c492000-2ae17c691000 ---p 0000b000 fe:00
6209867 /usr/lib/ruby/1.8/x86_64-linux/socket.so
2ae17c691000-2ae17c692000 rw-p 0000a000 fe:00
6209867 /usr/lib/ruby/1.8/x86_64-linux/socket.so
2ae17c692000-2ae17c7cf000 rw-p 2ae17c692000 00:00 0
2ae17c7cf000-2ae17c7d4000 r-xp 00000000 fe:00
6209868 /usr/lib/ruby/1.8/x86_64-linux/stringio.so
2ae17c7d4000-2ae17c9d3000 ---p 00005000 fe:00
6209868 /usr/lib/ruby/1.8/x86_64-linux/stringio.so
2ae17c9d3000-2ae17c9d4000 rw-p 00004000 fe:00
6209868 /usr/lib/ruby/1.8/x86_64-linux/stringio.so
2ae17c9d4000-2ae17c9f0000 r-xp 00000000 fe:00
6209870 /usr/lib/ruby/1.8/x86_64-linux/syck.so
2ae17c9f0000-2ae17cbef000 ---p 0001c000 fe:00
6209870 /usr/lib/ruby/1.8/x86_64-linux/syck.so
2ae17cbef000-2ae17cbf1000 rw-p 0001b000 fe:00
6209870 /usr/lib/ruby/1.8/x86_64-linux/syck.so
2ae17cbf1000-2ae17cbfa000 r-xp 00000000 fe:00
6209872 /usr/lib/ruby/1.8/x86_64-linux/zlib.so
2ae17cbfa000-2ae17cdf9000 ---p 00009000 fe:00
6209872 /usr/lib/ruby/1.8/x86_64-linux/zlib.so
2ae17cdf9000-2ae17cdfa000 rw-p 00008000 fe:00
6209872 /usr/lib/ruby/1.8/x86_64-linux/zlib.so
2ae17ce00000-2ae17ce16000 r-xp 00000000 fe:00
6146044 /usr/lib/libz.so.1.2.3
2ae17ce16000-2ae17d015000 ---p 00016000 fe:00
6146044 /usr/lib/libz.so.1.2.3
2ae17d015000-2ae17d016000 rw-p 00015000 fe:00
6146044 /usr/lib/libz.so.1.2.3
2ae17d016000-2ae17d01a000 r-xp 00000000 fe:00
6209853 /usr/lib/ruby/1.8/x86_64-linux/digest/sha2.so
2ae17d01a000-2ae17d219000 ---p 00004000 fe:00
6209853 /usr/lib/ruby/1.8/x86_64-linux/digest/sha2.so
2ae17d219000-2ae17d21a000 rw-p 00003000 fe:00
6209853 /usr/lib/ruby/1.8/x86_64-linux/digest/sha2.so
2ae17d21a000-2ae17d21c000 r-xp 00000000 fe:00
6209848 /usr/lib/ruby/1.8/x86_64-linux/digest.so
2ae17d21c000-2ae17d41b000 ---p 00002000 fe:00
6209848 /usr/lib/ruby/1.8/x86_64-linux/digest.so
2ae17d41b000-2ae17d41c000 rw-p 00001000 fe:00
6209848 /usr/lib/ruby/1.8/x86_64-linux/digest.so
2ae17d41c000-2ae17d457000 r-xp 00000000 fe:00
6212089 /usr/lib/ruby/1.8/x86_64-linux/openssl.so
2ae17d457000-2ae17d656000 ---p 0003b000 fe:00
6212089 /usr/lib/ruby/1.8/x86_64-linux/openssl.so
2ae17d656000-2ae17d659000 rw-p 0003a000 fe:00
6212089 /usr/lib/ruby/1.8/x86_64-linux/openssl.so
2ae17d65f000-2ae17d6a1000 r-xp 00000000 fe:00
6149328 /usr/lib/libssl.so.0.9.8
2ae17d6a1000-2ae17d8a1000 ---p 00042000 fe:00
6149328 /usr/lib/libssl.so.0.9.8
2ae17d8a1000-2ae17d8a7000 rw-p 00042000 fe:00
6149328 /usr/lib/libssl.so.0.9.8
2ae17d8a7000-2ae17d9fc000 r-xp 00000000 fe:00
6149327 /usr/lib/libcrypto.so.0.9.8
2ae17d9fc000-2ae17dbfc000 ---p 00155000 fe:00
6149327 /usr/lib/libcrypto.so.0.9.8
2ae17dbfc000-2ae17dc1f000 rw-p 00155000 fe:00
6149327 /usr/lib/libcrypto.so.0.9.8
2ae17dc1f000-2ae17dc22000 rw-p 2ae17dc1f000 00:00 0
2ae17dc22000-2ae17dc23000 r-xp 00000000 fe:00
6209857 /usr/lib/ruby/1.8/x86_64-linux/fcntl.so
2ae17dc23000-2ae17de22000 ---p 00001000 fe:00
6209857 /usr/lib/ruby/1.8/x86_64-linux/fcntl.so
2ae17de22000-2ae17de23000 rw-p 00000000 fe:00
6209857 /usr/lib/ruby/1.8/x86_64-linux/fcntl.so
2ae17de23000-2ae17e05d000 rw-p 2ae17de23000 00:00 0
2ae17e05d000-2ae17e061000 r-xp 00000000 fe:00
6209869 /usr/lib/ruby/1.8/x86_64-linux/strscan.so
2ae17e061000-2ae17e261000 ---p 00004000 fe:00
6209869 /usr/lib/ruby/1.8/x86_64-linux/strscan.so
2ae17e261000-2ae17e262000 rw-p 00004000 fe:00
6209869 /usr/lib/ruby/1.8/x86_64-linux/strscan.so
2ae17e262000-2ae17e26d000 r-xp 00000000 fe:00
6209846 /usr/lib/ruby/1.8/x86_64-linux/bigdecimal.so
2ae17e26d000-2ae17e46c000 ---p 0000b000 fe:00
6209846 /usr/lib/ruby/1.8/x86_64-linux/bigdecimal.so
2ae17e46c000-2ae17e46d000 rw-p 0000a000 fe:00
6209846 /usr/lib/ruby/1.8/x86_64-linux/bigdecimal.so
2ae17e46d000-2ae17e86f000 rw-p 2ae17e46d000 00:00 0
2ae17e86f000-2ae17e8ab000 r-xp 00000000 fe:00
6209861 /usr/lib/ruby/1.8/x86_64-linux/nkf.so
2ae17e8ab000-2ae17eaab000 ---p 0003c000 fe:00
6209861 /usr/lib/ruby/1.8/x86_64-linux/nkf.so
2ae17eaab000-2ae17eaaf000 rw-p 0003c000 fe:00
6209861 /usr/lib/ruby/1.8/x86_64-linux/nkf.so
2ae17eaaf000-2ae17eab0000 rw-p 2ae17eaaf000 00:00 0
2ae17eab1000-2ae17f1e7000 rw-p 2ae17eab1000 00:00 0
2ae17f1e7000-2ae17f1e9000 r-xp 00000000 fe:00
6209856 /usr/lib/ruby/1.8/x86_64-linux/etc.so
2ae17f1e9000-2ae17f3e9000 ---p 00002000 fe:00
6209856 /usr/lib/ruby/1.8/x86_64-linux/etc.so
2ae17f3e9000-2ae17f3ea000 rw-p 00002000 fe:00
6209856 /usr/lib/ruby/1.8/x86_64-linux/etc.so
2ae17f3ea000-2ae17f3ec000 r-xp 00000000 fe:00
6209850 /usr/lib/ruby/1.8/x86_64-linux/digest/md5.so
2ae17f3ec000-2ae17f5eb000 ---p 00002000 fe:00
6209850 /usr/lib/ruby/1.8/x86_64-linux/digest/md5.so
2ae17f5eb000-2ae17f5ec000 rw-p 00001000 fe:00
6209850 /usr/lib/ruby/1.8/x86_64-linux/digest/md5.so
2ae17f5ec000-2ae17f5ef000 r-xp 00000000 fe:00
6209864 /usr/lib/ruby/1.8/x86_64-linux/racc/cparse.so
2ae17f5ef000-2ae17f7ef000 ---p 00003000 fe:00
6209864 /usr/lib/ruby/1.8/x86_64-linux/racc/cparse.so
2ae17f7ef000-2ae17f7f0000 rw-p 00003000 fe:00
6209864 /usr/lib/ruby/1.8/x86_64-linux/racc/cparse.so
2ae17f7f0000-2ae17f7f4000 r-xp 00000000 fe:00
6209858 /usr/lib/ruby/1.8/x86_64-linux/iconv.so
2ae17f7f4000-2ae17f9f3000 ---p 00004000 fe:00
6209858 /usr/lib/ruby/1.8/x86_64-linux/iconv.so
2ae17f9f3000-2ae17f9f4000 rw-p 00003000 fe:00
6209858 /usr/lib/ruby/1.8/x86_64-linux/iconv.so
2ae17f9f4000-2ae17f9f5000 rw-p 2ae17f9f4000 00:00 0
2ae17f9f5000-2ae17f9f8000 r-xp 00000000 fe:00
6209852 /usr/lib/ruby/1.8/x86_64-linux/digest/sha1.so
2ae17f9f8000-2ae17fbf8000 ---p 00003000 fe:00
6209852 /usr/lib/ruby/1.8/x86_64-linux/digest/sha1.so
2ae17fbf8000-2ae17fbf9000 rw-p 00003000 fe:00
6209852 /usr/lib/ruby/1.8/x86_64-linux/digest/sha1.so
2ae17fbfa000-2ae1808f4000 rw-p 2ae17fbfa000 00:00 0
2ae1808f4000-2ae180997000 r-xp 00000000 fe:00
7297883 /usr/lib/ruby/gems/1.8/gems/ferret-0.11.4/
lib/ferret_ext.so
2ae180997000-2ae180b96000 ---p 000a3000 fe:00
7297883 /usr/lib/ruby/gems/1.8/gems/ferret-0.11.4/
lib/ferret_ext.so
2ae180b96000-2ae180bb7000 rw-p 000a2000 fe:00
7297883 /usr/lib/ruby/gems/1.8/gems/ferret-0.11.4/
lib/ferret_ext.so
2ae180bb7000-2ae180bb8000 rw-p 2ae180bb7000 00:00 0
2ae180bb8000-2ae180bbe000 r-xp 00000000 fe:00
6508818 /usr/lib/ruby/gems/1.8/gems/amatch-0.2.3/
ext/amatch.so
2ae180bbe000-2ae180dbd000 ---p 00006000 fe:00
6508818 /usr/lib/ruby/gems/1.8/gems/amatch-0.2.3/
ext/amatch.so
2ae180dbd000-2ae180dbe000 rw-p 00005000 fe:00
6508818 /usr/lib/ruby/gems/1.8/gems/amatch-0.2.3/
ext/amatch.so
2ae180dbe000-2ae180dc0000 r-xp 00000000 fe:00
17203639 /var/www/webroot/panjiva.com/admin/releases/
20071117220121/vendor/ruby_inline/.ruby_inline/Inline_String_7dae.so
2ae180dc0000-2ae180fbf000 ---p 00002000 fe:00
17203639 /var/www/webroot/panjiva.com/admin/releases/
20071117220121/vendor/ruby_inline/.ruby_inline/Inline_String_7dae.so
2ae180fbf000-2ae180fc0000 rw-p 00001000 fe:00
17203639 /var/www/webroot/panjiva.com/admin/releases/
20071117220121/vendor/ruby_inline/.ruby_inline/Inline_String_7dae.so
2ae180fc0000-2ae180fc8000 rw-p 2ae180fc0000 00:00 0
2ae180fc8000-2ae180fd2000 r-xp 00000000 fe:00
29212702 /lib/libnss_files-2.5.so
2ae180fd2000-2ae1811d1000 ---p 0000a000 fe:00
29212702 /lib/libnss_files-2.5.so
2ae1811d1000-2ae1811d3000 rw-p 00009000 fe:00
29212702 /lib/libnss_files-2.5.so
7fff2f6dc000-7fff2f757000 rw-p 7fff2f6dc000 00:00
0 [stack]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00
0 [vdso]
From jk at jkraemer.net Sun Nov 18 04:53:14 2007
From: jk at jkraemer.net (Jens Kraemer)
Date: Sun, 18 Nov 2007 10:53:14 +0100
Subject: [Ferret-talk] crash while building index
In-Reply-To:
References:
Message-ID: <20071118095314.GQ3558@thunder.jkraemer.net>
On Sat, Nov 17, 2007 at 06:27:29PM -0500, Noah M. Daniels wrote:
> Hi,
>
> I'm trying to reindex a model (I'm using acts_as_ferret) after having
> added (via metaprogramming) a large number of fields (several hundred)
> to the index.
>
> It keeps crashing when trying to rebuild the index (the crash log is
> below, from ferret_server.out) but it only seems to crash on Linux
> (Ubuntu server 7.04, x86-64) whereas it's fine on my OS X laptop
> (10.5.1). This is with ferret 0.11.4 in both cases.
>
> Any thoughts? Is there a hard field limit in ferret?
>
>
> *** glibc detected *** ruby: realloc(): invalid next size:
> 0x000000000232ffc0 ***
> ======= Backtrace: =========
[..]
Looks strange - maybe a problem with Ubuntu's 64bit libs? Can you
try to provide a simple script reproducing this behaviour?
Cheers,
Jens
--
Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database
From erik at ehatchersolutions.com Sun Nov 18 05:24:15 2007
From: erik at ehatchersolutions.com (Erik Hatcher)
Date: Sun, 18 Nov 2007 05:24:15 -0500
Subject: [Ferret-talk] Ferret/AAF Stability?
In-Reply-To: <20071117123925.GO3558@thunder.jkraemer.net>
References:
<314ee0450711160919y7eaad1cl16ef0bc349b09dee@mail.gmail.com>
<20071117123925.GO3558@thunder.jkraemer.net>
Message-ID: <5513C9F2-E3AE-4B0F-8D48-4B0FD8E9965F@ehatchersolutions.com>
On Nov 17, 2007, at 7:39 AM, Jens Kraemer wrote:
>> I think Ferret is adequate for most search tasks, but if (like me)
>> you're building a dedicated search engine, Solr is currently a
>> stronger candidate.
>
> Well, As Solr uses Lucene internally, the mechanics and performance
> characteristics naturally can't be that different from Ferret. Maybe
> Ferret has a bug or two and a non-working inter-process locking (which
> doesn't matter when you think about building a dedicated search server
> like Solr is, since it's only one process), but the general internal
> handling of the index is the same, i.e. you can also only have one
> Writer open to a Lucene index at a time, and Searchers won't see index
> changes until re-opened, too.
That's all true. However, Solr manages all the IndexWriter/
IndexSearcher stuff for you quite transparently (which I guess is
comparable to Ferret + DRb, eh?). Because it is a single point of
access to the index, it takes care of the single writer situation,
and also handles warming IndexSearchers before coming online so that
caches are built and a search on an updated index is as fast as it
was before being updated.
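The single-writer and warm-before-swap behaviour described above can be sketched in plain Ruby. This is only an illustration of the pattern under stated assumptions, not Solr's or Ferret's actual API: `SnapshotSearcher`, `SearchService` and `warm_terms` are hypothetical names, and the "index" is just an in-memory array of strings.

```ruby
# Hypothetical stand-in for an index searcher: it sees a frozen snapshot of
# the documents taken when it was opened, and caches results per term.
class SnapshotSearcher
  def initialize(docs)
    @docs = docs.dup.freeze
    @cache = {}
  end

  def search(term)
    @cache[term] ||= @docs.select { |doc| doc.include?(term) }
  end
end

# Hypothetical single point of access to the index.
class SearchService
  def initialize(warm_terms)
    @docs = []
    @warm_terms = warm_terms
    @write_lock = Mutex.new
    @searcher = SnapshotSearcher.new(@docs)
  end

  # Single writer: every update goes through the same lock.
  def add(doc)
    @write_lock.synchronize { @docs << doc }
  end

  # Open a new searcher, warm its cache with typical queries, and only
  # then swap it in, so readers never hit a cold searcher.
  def commit
    fresh = @write_lock.synchronize { SnapshotSearcher.new(@docs) }
    @warm_terms.each { |term| fresh.search(term) }
    @searcher = fresh
  end

  # Readers keep using the old searcher until commit swaps the reference.
  def search(term)
    @searcher.search(term)
  end
end
```

A document added with `add` stays invisible to `search` until `commit` is called, which mirrors the "Searchers won't see index changes until re-opened" behaviour quoted above.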
> Having that said, if my application's main concern would be search, I
> most probably wouldn't choose any pre-cooked solution like aaf or
> Solr,
> but build exactly the thing I need from scratch, basing it either on
> Lucene or Ferret. But maybe that's just me ;-)
You'd be reinventing a lot of wheels doing that, with IndexWriter
synchronization, IndexSearcher warming, caching, and much more.
Erik
From andreas.korth at gmail.com Sun Nov 18 10:05:23 2007
From: andreas.korth at gmail.com (Andreas Korth)
Date: Sun, 18 Nov 2007 16:05:23 +0100
Subject: [Ferret-talk] Multithreading / multiprocessing woes
In-Reply-To: <8C5DFF2D-440E-4A5A-8D25-5CA965938D02@ehatchersolutions.com>
References: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
<75f591160711161235g692622c5p4b7fb907e9596cdb@mail.gmail.com>
<8C5DFF2D-440E-4A5A-8D25-5CA965938D02@ehatchersolutions.com>
Message-ID: <323C44A8-07A5-4716-AC8C-3C4B7F221A83@gmail.com>
Hi everyone!
This is a very interesting thread, because it raises the question as
to whether Ferret is something you would want to use in a production
environment - or not.
I've been using Ferret in two applications and my experiences were
quite disappointing. I chose Ferret because it's fast and it's got a
Ruby API. Everything else about it is just annoying and potentially
hazardous.
What worries me most is the fact that Ferret is effectively an
abandoned project. The original author, who is the sole owner of the
code, hasn't been posting to this list for about six months. He hasn't
introduced any improvements in about the same period of time and many
bugs still remain unfixed. New bugs can't be submitted (let alone
patches) because the project Trac is offline.
There is no other component in my applications which behaves as badly
as Ferret. If you don't treat it _very_ carefully it will throw
segfaults as if this was an established way of indicating an error
condition.
The ActsAsFerret plugin _does_ treat ferret quite carefully and it's
the only reason why many people are able to use Ferret at all.
However, AAF is one approach and for some applications it might not be
the right one. Especially if you want to put multiple models in one
index - it's possible, but not really a flexible solution.
The most sensitive point of Ferret is concurrency and many people
actually use Ferret in distributed environments (which is usually a
Rails app that scales across several machines). AAF introduces a DRb
server to work around this problem, but with many concurrent read/
write requests, performance quickly degrades.
With the advent of JRuby, a myriad of Java-based solutions is now
accessible to Ruby developers, including many full-text indices. There
are very mature solutions readily available for production use and
many next-generation search engines currently in development.
For the next application that needs full text search, I'm most
definitely not going to use Ferret. I agree with Erik and will give
Solr a shot.
I would like to encourage everyone who is already using another full
text index for Ruby/Rails to share his/her experiences on this list,
because I have the feeling that many people would like to get rid of
Ferret for exactly the same reasons I've pointed out above.
Andy
On 16.11.2007, at 22:13, Erik Hatcher wrote:
>
> On Nov 16, 2007, at 3:35 PM, Scott Davies wrote:
>> Am I a fool for wondering whether it might ultimately be less painful
>> to try an index server that runs Lucene under a JRuby process?
>
> Or, rather, an index server that runs Solr accessed with a pure Ruby,
> solr-ruby, API (which works with MRI or JRuby)? :)
>
> Erik
>
From casey at nerdle.com Sun Nov 18 10:24:34 2007
From: casey at nerdle.com (casey at nerdle.com)
Date: Sun, 18 Nov 2007 10:24:34 -0500 (EST)
Subject: [Ferret-talk] Multithreading / multiprocessing woes
In-Reply-To: <323C44A8-07A5-4716-AC8C-3C4B7F221A83@gmail.com>
References: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
<75f591160711161235g692622c5p4b7fb907e9596cdb@mail.gmail.com>
<8C5DFF2D-440E-4A5A-8D25-5CA965938D02@ehatchersolutions.com>
<323C44A8-07A5-4716-AC8C-3C4B7F221A83@gmail.com>
Message-ID:
Andy,
You asked about other full text indexes for Ruby/Rails. I am using both
AAF/Ferret and Sphinx in my app.
I haven't had any problems with Ferret or acts_as_ferret so far. I am
using the DRb server and it is being hit with 200,000-250,000 requests a day
from dozens of clients (Mongrel instances). My index isn't huge - it is
about 600 MB.
I'm using Sphinx (http://www.sphinxsearch.com/) wherever I don't need
realtime updates. A large portion of my site requires search indexes to
be always up-to-date but in many places, I can live with an index that may
be 5 minutes old. Sphinx trades realtime indexing for performance - both
search and indexing are blazingly fast. Sphinx comes with a server
component that speaks a simple protocol, and there are several Rails
plugins available.
Sphinx (and acts_as_sphinx or whatever plugin you choose) and
acts_as_ferret are very different animals, but I'm very pleased with the
combination.
Casey
From jk at jkraemer.net Sun Nov 18 12:51:04 2007
From: jk at jkraemer.net (Jens Kraemer)
Date: Sun, 18 Nov 2007 18:51:04 +0100
Subject: [Ferret-talk] Multithreading / multiprocessing woes
In-Reply-To:
References: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
<75f591160711161235g692622c5p4b7fb907e9596cdb@mail.gmail.com>
<8C5DFF2D-440E-4A5A-8D25-5CA965938D02@ehatchersolutions.com>
<323C44A8-07A5-4716-AC8C-3C4B7F221A83@gmail.com>
Message-ID: <20071118175104.GS3558@thunder.jkraemer.net>
Hi!
On Sun, Nov 18, 2007 at 10:24:34AM -0500, casey at nerdle.com wrote:
> Andy,
>
> You asked about other full text indexes for Ruby/Rails. I am using both
> AAF/Ferret and Sphinx in my app.
>
> I haven't had any problems with Ferret or acts_as_ferret so far. I am
> using the DRb server and it is being hit with 200-250,000 requests a day
> from dozens of clients (Mongrel instances). My index isn't huge - it is
> about 600 MB.
Ah, glad to see somebody for whom everything just works standing up and
telling the world :-)
On Sun, 18 Nov 2007, Andreas Korth wrote:
[..]
> >
> > What worries me most is the fact that Ferret is effectively an
> > abandoned project. The original author, who is the sole owner of the
> > code, hasn't been posting to this list for about six months. He hasn't
> > introduced any improvements in about the same period of time and many
> > bugs still remain unfixed. New bugs can't be submitted (let alone
> > patches) because the project Trac is offline.
Trac has been back online for days, and Ferret even got a new logo :-) I
wouldn't call it abandoned, it's just stabilizing.
> > There is no other component in my applications which behaves as badly
> > as Ferret. If you don't treat it _very_ carefully it will throw
> > segfaults as if this was an established way of indicating an error
> > condition.
> >
> > The ActsAsFerret plugin _does_ treat ferret quite carefully and it's
> > the only reason why many people are able to use Ferret at all.
> > However, AAF is one approach and for some applications it might not be
> > the right one. Especially if you want to put multiple models in one
> > index - it's possible, but not really a flexible solution.
Well, even if aaf doesn't fit your needs, you might at least have a look
at it if you want to know how to treat your Ferret well :-) I admit it
isn't always an easy library to deal with, but with a proper set of unit
tests it's entirely possible and no headache at all. Imho.
> > The most sensitive point of Ferret is concurrency and many people
> > actually use Ferret in distributed environments (which is usually a
> > Rails app that scales across several machines). AAF introduces a DRb
> > server to work around this problem, but with many concurrent read/
> > write requests, performance quickly degrades.
AAF's DRb server can handle some serious load as it is now, but for sure
there's much room for improvement. However, I haven't received many
complaints from people actually *having* this problem in real-life
applications yet. Most of the time this is brought up as some kind of
'what if' problem. Somebody did a speed comparison of Solr and aaf/DRb a
while back, where aaf was at least as fast as Solr, even with its
admittedly naive DRb server.
I'm not saying this was a representative benchmark or anything, but these
are the only numbers I know of...
So please, from now on, anybody who feels like blaming aaf's DRb for being
slow, *please* show us some numbers and the test process which led to
these numbers.
Ideally you'd also show us the numbers for any solution you've found to
be faster at solving the same problem. Thanks.
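As a starting point for producing such numbers, here is a minimal timing sketch in plain Ruby. It is an assumption-laden illustration, not part of aaf or Ferret: `time_once` and `measure` are hypothetical helpers, and the block passed to `measure` stands in for whatever search call (local, DRb, Solr) is being compared.

```ruby
# Time a single call with the monotonic clock; returns seconds as a Float.
def time_once
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Run the block `runs` times and summarise the latencies in milliseconds.
def measure(runs:, &work)
  timings = Array.new(runs) { time_once(&work) }.sort
  p95_index = (runs * 0.95).ceil - 1
  {
    runs: runs,
    mean_ms: (timings.sum / runs * 1000).round(3),
    p95_ms: (timings[p95_index] * 1000).round(3),
    max_ms: (timings.last * 1000).round(3)
  }
end

# Example with a dummy workload; replace the block with the real query.
stats = measure(runs: 50) { (1..10_000).reduce(:+) }
puts stats.inspect
```

Reporting the number of runs, the query mix, the index size and the number of concurrent clients alongside these figures is what makes a comparison between two setups reproducible.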
> > With the advent of JRuby, a myriad of Java-based solutions is now
> > accessible to Ruby developers, including many full-text indices. There
> > are very mature solutions readily available for production use and
> > many next-generation search engines currently in development.
For sure. I'm excited by these possibilities as well.
Cheers,
Jens
--
Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database
From marvin at rectangular.com Sun Nov 18 12:29:31 2007
From: marvin at rectangular.com (Marvin Humphrey)
Date: Sun, 18 Nov 2007 09:29:31 -0800
Subject: [Ferret-talk] Multithreading / multiprocessing woes
In-Reply-To: <323C44A8-07A5-4716-AC8C-3C4B7F221A83@gmail.com>
References: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
<75f591160711161235g692622c5p4b7fb907e9596cdb@mail.gmail.com>
<8C5DFF2D-440E-4A5A-8D25-5CA965938D02@ehatchersolutions.com>
<323C44A8-07A5-4716-AC8C-3C4B7F221A83@gmail.com>
Message-ID:
On Nov 18, 2007, at 7:05 AM, Andreas Korth wrote:
> What worries me most is the fact that Ferret is effectively an
> abandoned project. The original author, who is the sole owner of the
> code, hasn't been posting to this list for about six months. He hasn't
> introduced any improvements in about the same period of time and many
> bugs still remain unfixed.
I have a large fraction of the expertise needed to maintain the C
part of the Ferret code base, FWIW. What I'm missing is significant
Ruby expertise, which I wouldn't mind accumulating. :)
If what's needed is C-level bug fixing, I can probably help out.
> New bugs can't be submitted (let alone
> patches) because the project Trac is offline.
I know it's been down before, but it looks like it's up for me now.
Also, I see a commit from Dave bumping the version to 0.11.5 yesterday.
The C code base that I am currently working on, which has a
foundation designed by Dave and me to be shared by multiple host
languages, is going to wind up having Ruby bindings eventually. It
will either happen as part of the Lucy project, or independently.
In the meantime, perhaps I can contribute to Ferret in a caretaker/
troubleshooter role. Dave gave me commit access to the repository a
while ago, and I just verified that I still have it.
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
From andreas.korth at gmail.com Sun Nov 18 14:50:04 2007
From: andreas.korth at gmail.com (Andreas Korth)
Date: Sun, 18 Nov 2007 20:50:04 +0100
Subject: [Ferret-talk] Multithreading / multiprocessing woes
In-Reply-To: <20071118175104.GS3558@thunder.jkraemer.net>
References: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
<75f591160711161235g692622c5p4b7fb907e9596cdb@mail.gmail.com>
<8C5DFF2D-440E-4A5A-8D25-5CA965938D02@ehatchersolutions.com>
<323C44A8-07A5-4716-AC8C-3C4B7F221A83@gmail.com>
<20071118175104.GS3558@thunder.jkraemer.net>
Message-ID: <2868A3E0-B0C3-46D8-97F0-541888C5C979@gmail.com>
On 18.11.2007, at 18:51, Jens Kraemer wrote:
> Trac is online again for days, and Ferret even got a new logo :-) I
> wouldn't call it abandoned, it's just stabilizing.
Yes, I noticed that. I should have checked before posting. However, a
project site that is frequently down for extended periods of time is
not exactly building up trust :)
> AAf's DRb server can handle some serious load as it is now, but for
> sure
> there's much room for improvement. However I didn't receive many
> complaints from people actually *having* this problem in real life
> applications yet. Most of the time this is brought up as some kind of
> 'what if' problem.
My apologies for implying that AAF is part of the problem. It
certainly isn't. I made the mistake of mixing up my concerns about Ferret
with comments on AAF. What I actually meant to say is that AAF is one
viable way to deal with some of Ferret's shortcomings.
The fact that in the Rails community AAF is almost synonymous with
Ferret speaks for your plugin and I'm not in a position to question
that.
> So please from now on, anybody feeling to blame aaf's DRb as slow,
> *please* show us some numbers and the test process which led to
> these numbers.
Again, I didn't mean to blame AAF here.
To be more precise: Ferret is pretty damn fast. The problem is its
extremely sensitive API which exposes problems from the C
implementation to the Ruby developer. I don't know of any way to catch
a segfault in Ruby, and even if I did, there's little I can do about
it from Rubyland.
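One general way to contain a crash in native code is to run the risky call in a forked child process and inspect its exit status from the parent. The sketch below assumes a Unix-like system where `fork` is available; `run_isolated` is a hypothetical helper, not part of Ferret or AAF, and the example simulates a crash with SIGKILL rather than a real segfault.

```ruby
# Run the given block in a child process so that a crash there cannot take
# the parent down. Returns [:ok, value], [:error, message] or
# [:crashed, signal_number]. The block's return value must be Marshal-able.
def run_isolated
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    payload =
      begin
        [:ok, yield]
      rescue StandardError => e
        [:error, e.message]
      end
    writer.write(Marshal.dump(payload))
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end
  writer.close
  data = reader.read # returns at EOF, i.e. when the child closes the pipe or dies
  reader.close
  _, status = Process.wait2(pid)
  return [:crashed, status.termsig] if status.signaled?
  return [:crashed, nil] if data.empty?

  Marshal.load(data)
end

# Normal result comes back through the pipe.
p run_isolated { 21 * 2 }

# Simulated crash: the child kills itself, the parent survives and sees it.
p run_isolated { Process.kill("KILL", Process.pid); sleep }
```

This only protects the calling process. It does nothing for an index that was left half-written by the crash, which is the concern about transactional updates raised in this message.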
Without transactional index updates, such behavior is intolerable,
unless you can afford to rebuild your index several times a day. This
leaves us to build another Ruby API on top of Ferret's in order to
compensate for these imperfections.
I wrote a custom solution with a focus on reliability. But with all
the infrastructure built around Ferret (DRb server, transactions,
queuing), the overall indexing performance wasn't that great anymore:
Remote indexing with 10 concurrent clients was 8-9 times slower than
local indexing.
Maybe AAF is faster, but since the implementations are different,
there's no point in comparing them directly.
Andy
From andreas.korth at gmail.com Sun Nov 18 14:56:27 2007
From: andreas.korth at gmail.com (Andreas Korth)
Date: Sun, 18 Nov 2007 20:56:27 +0100
Subject: [Ferret-talk] Multithreading / multiprocessing woes
In-Reply-To:
References: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
<75f591160711161235g692622c5p4b7fb907e9596cdb@mail.gmail.com>
<8C5DFF2D-440E-4A5A-8D25-5CA965938D02@ehatchersolutions.com>
<323C44A8-07A5-4716-AC8C-3C4B7F221A83@gmail.com>
Message-ID: <3DAD6E50-FDEB-4B5F-9248-CBD16BF51174@gmail.com>
On 18.11.2007, at 16:24, casey at nerdle.com wrote:
> I'm using Sphinx (http://www.sphinxsearch.com/) wherever I don't need
> realtime updates. A large portion of my site requires search
> indexes to
> be always up-to-date but in many places, I can live with an index
> that may
> be 5 minutes old. Sphinx trades realtime indexing for performance -
> both
> search and indexing speed is blazingly fast. Sphinx comes with a
> server
> component that speaks a simple protocol and there are several rails
> plugins available.
Thanks, Casey. I'll take a look at Sphinx. Since I'm primarily
concerned about index consistency and don't mind short delays either,
it sounds like a pretty good alternative.
Cheers,
Andy
From erik at ehatchersolutions.com Sun Nov 18 04:29:36 2007
From: erik at ehatchersolutions.com (Erik Hatcher)
Date: Sun, 18 Nov 2007 04:29:36 -0500
Subject: [Ferret-talk] Multithreading / multiprocessing woes
In-Reply-To: <75f591160711170212q1d4d6475v3e830a64ff4c3dc2@mail.gmail.com>
References: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
<75f591160711161235g692622c5p4b7fb907e9596cdb@mail.gmail.com>
<8C5DFF2D-440E-4A5A-8D25-5CA965938D02@ehatchersolutions.com>
<75f591160711170212q1d4d6475v3e830a64ff4c3dc2@mail.gmail.com>
Message-ID:
On Nov 17, 2007, at 5:12 AM, Scott Davies wrote:
> Hmmm...I'd first heard of Solr only a couple of days ago, and I hadn't
> been aware of a Ruby API to it until you mentioned it.
> Interesting...thanks!
I've honestly given fairly little of my time to Ferret, though I have
tinkered with it some and it is mighty fine!
Believe you me, I don't want to steal any thunder from Ferret. And
I've not compared/contrasted them much myself. Truth be told, I'm
still a Java dude. Since Lucene and Solr are written in Java and
excel at what they are designed to do, and since I'm already gulping
the Apache Kool-Aid, I really dig Solr.
I've presented solr+ruby a couple of times now, once at RailsConf and
then again a few weeks ago at rubyconf.
RailsConf:
rubyconf:
acts_as_solr as it exists today is sub-optimal compared to
acts_as_ferret. I'm quite admittedly not much into relational
databases so I have only tinkered in this area myself.
Erik
From julioody at gmail.com Sun Nov 18 19:45:31 2007
From: julioody at gmail.com (Julio Cesar Ody)
Date: Mon, 19 Nov 2007 11:45:31 +1100
Subject: [Ferret-talk] Multithreading / multiprocessing woes
In-Reply-To:
References: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
<75f591160711161235g692622c5p4b7fb907e9596cdb@mail.gmail.com>
<8C5DFF2D-440E-4A5A-8D25-5CA965938D02@ehatchersolutions.com>
<75f591160711170212q1d4d6475v3e830a64ff4c3dc2@mail.gmail.com>
Message-ID:
Great. For my own curiosity, and maybe people here share some of it:
Is it possible to write your own custom analyzers for Solr? If so, how
easy is it? Can one do that in Ruby or do I have to write it in Java?
I personally think that's one of the greatest things about Ferret. So
far I haven't bothered looking into Sphinx or Solr precisely because,
from a glance, I couldn't find a way to customize anything in detail
like I can do with Ferret. I assume there is a way...
Thing is, reading through the Ferret booklet (the one from O'Reilly),
you get a glimpse of how easy it is to build custom solutions using
it. So while it's kind of sad that the lead developer has been
distant from the project in the last few months (?), I have to say,
there's hardly any matching how easy it is to work with.
From alex at liivid.com Sun Nov 18 19:33:23 2007
From: alex at liivid.com (Alex Neth)
Date: Mon, 19 Nov 2007 08:33:23 +0800
Subject: [Ferret-talk] My AAF tweaks
In-Reply-To:
References:
Message-ID:
I have had to fix a few issues with AAF in order to get it working
well for myself in a production environment. I'm using the latest
"release" version which is 0.4.1:
1) When there is no index in place, every request starts a new
rebuild. While there is some code in place to allow this to happen
during testing, personally I see no reason to even test for this,
although maybe it's necessary for those using multiple indexes (?).
At the very least, the same index should never be built twice at the
same time I hope, so I put in rudimentary code that just locks until
the index is complete on a request.
The real problem is that all the rebuilds use the same FERRET_INDEX/
rebuild path. With two reindexes running and replacing files
underneath each other, the DRb server dumps core and CPU load goes
through the roof. This may be the reason for a lot of stability
complaints, as I think a lot of people just remove their index
instead of calling rebuild.
2) Performance degrades when I index articles until I call optimize
on the index. Optimize can take many seconds and seems to lock all
access via the DRb server. I added logic to use a separate index for
modifications (adds/deletes) and optimizations. It required
significant hacking of the AAF plug-in. I basically have a writable
index that is then copied each time to a new read-only index
location, followed by changing the index_dir in AAF to the new
read-only index. This prevents the slowdown during indexing, but
everything still seems to lock during optimizations. I have a faster
server now, so optimizations only take around 6 seconds. I may have
to use a separate DRb server to do the optimizations. I am not sure
where this locking occurs. I would like to see aaf take care of this
locking issue somehow.
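
The "never build the same index twice at the same time" idea from
point 1 can be sketched with an advisory file lock. This is not the
patch described in this message; with_rebuild_lock and the lock file
path are hypothetical names, and the only real API used is Ruby's
File#flock. Unlike the blocking approach described above, this
variant refuses a second rebuild instead of waiting for the first.

```ruby
# Hypothetical helper (not part of AAF): run the block only if no other
# process currently holds the rebuild lock.
# Returns true if the block ran, false if a rebuild was already in progress.
def with_rebuild_lock(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0644) do |f|
    # LOCK_NB makes flock return false instead of blocking when the lock is taken.
    return false unless f.flock(File::LOCK_EX | File::LOCK_NB)
    begin
      yield
      true
    ensure
      f.flock(File::LOCK_UN)
    end
  end
end
```

Dropping File::LOCK_NB turns this into the "lock until the index is
complete" behaviour: the second caller then simply waits. flock is
advisory and its semantics differ between platforms, so treat this as
a starting point only.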
None of the above are ready to be checked in publicly but I'd be
happy to send a patch if someone wants to base some work on it.
Other than these issues, aaf/ferret have been excellent, and
basically "just worked". I am able to handle around 40 requests per
second, including rendering a results page. I haven't finished
performance testing as that is more than enough performance for me
right now.
-Alex
From mail at stuartsierra.com Sun Nov 18 21:59:22 2007
From: mail at stuartsierra.com (Stuart Sierra)
Date: Sun, 18 Nov 2007 21:59:22 -0500
Subject: [Ferret-talk] Ferret/AAF Stability?
In-Reply-To: <20071117123925.GO3558@thunder.jkraemer.net>
References:
<314ee0450711160919y7eaad1cl16ef0bc349b09dee@mail.gmail.com>
<20071117123925.GO3558@thunder.jkraemer.net>
Message-ID: <314ee0450711181859y5711b63dje9a2e183204d4b37@mail.gmail.com>
On Nov 17, 2007 7:39 AM, Jens Kraemer wrote:
> > 3. Ferret doesn't yet support compressed indexes.
>
> At least from the docs it looks like it does, see
> http://ferret.davebalmain.com/api/classes/Ferret/Index/FieldInfo.html .
> I didn't ever try this out however.
Yes, it's in the API, but there's no code for it yet.
> > I was nervous about tackling Solr, but I've found it quite easy to
> > use, and the built-in caching and multithreading make it fast.
>
> numbers, please :-)
I make no claim that it's faster than Ferret, but it's fast enough.
> Having that said, if my application's main concern would be search, I
> most probably wouldn't choose any pre-cooked solution like aaf or Solr,
> but build exactly the thing I need from scratch, basing it either on
> Lucene or Ferret. But maybe that's just me ;-)
I'd like to do that, but I lack sufficient time and skill. :) In the
mean time, I'm hoping Solr will let me offer an open search API to my
users without too much extra effort on my part. We'll see how it
goes; I may end up back on Ferret at some point.
-Stuart
From tvollmer at codemart.de Tue Nov 20 07:17:30 2007
From: tvollmer at codemart.de (Till Vollmer)
Date: Tue, 20 Nov 2007 13:17:30 +0100
Subject: [Ferret-talk] Compound search / grouping
Message-ID:
Hi,
Following problem:
We have a tree structure with children and a root element (recursively)
stored in one table (imagine a threaded forum).
Each of the children has a title which should be indexed by ferret.
Now we want to make a search that returns only the root and searches all
items.
So if one node has "expensive" and another node has "car" I want to enter
"expensive car" in search and still find the root of all children (and
only once!)
Also paging should work as well.
Any clues how to achieve that?
Regards
Till
--
Posted via http://www.ruby-forum.com/.
From cstrom at mdlogix.com Tue Nov 20 07:52:29 2007
From: cstrom at mdlogix.com (Chris Strom)
Date: Tue, 20 Nov 2007 07:52:29 -0500
Subject: [Ferret-talk] Compound search / grouping
In-Reply-To:
References:
Message-ID: <20071120125229.GC5312@jaynestown.users.mdlogix.com>
On Tue, Nov 20, 2007 at 01:17:30PM +0100, Till Vollmer wrote:
> Hi,
>
> Following problem:
>
> We have a tree structure with children and a root element (recursivly)
> stored in one table (imagine a threaded forum).
>
> Each of the children has a title which should be indexed by ferret.
>
> Now we want to make a search that returns only the root and searches all
> items.
>
>
> So if one node has "expensive" and nother node has "car" I want to enter
> "expensive car" in search and still find the root of all children (and
> only once!)
>
> Also paging should work as well.
>
> Any clues how to achieve that?
An instance method in the root class to the effect of
children_titles_with_spaces would get you this. That method would return
"expensive car" given your simple, two-node example, which would be
indexable with the normal analyzer.
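
A minimal sketch of such a method, using a plain Struct in place of
the real model. Node, its fields and the descendants helper are
assumptions for illustration; only the method name
children_titles_with_spaces comes from the suggestion above.

```ruby
# Hypothetical stand-in for the ActiveRecord model: just a title and a list
# of child nodes.
Node = Struct.new(:title, :children) do
  # Titles of every node below this one, depth first, joined with spaces.
  def children_titles_with_spaces
    descendants.map(&:title).join(" ")
  end

  # All nodes below this one, at any depth.
  def descendants
    children.flat_map { |child| [child, *child.descendants] }
  end
end

root = Node.new("Thread", [
  Node.new("expensive", [Node.new("car", [])]),
])
puts root.children_titles_with_spaces  # prints "expensive car"
```

Indexing that string as a field of the root means a query for
"expensive car" matches the root document once, however the two words
are spread over the children.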
-Chris
From tvollmer at codemart.de Tue Nov 20 08:01:39 2007
From: tvollmer at codemart.de (Till Vollmer)
Date: Tue, 20 Nov 2007 14:01:39 +0100
Subject: [Ferret-talk] Compound search / grouping
In-Reply-To: <20071120125229.GC5312@jaynestown.users.mdlogix.com>
References:
<20071120125229.GC5312@jaynestown.users.mdlogix.com>
Message-ID:
Hi,
Thank you for the clue.
OK, like a virtual attribute. That works technically, but there is a
downside: how often is it called? Our tree has e.g. 200 children.
Does this mean that all the children are collected every time one of
them changes (and is indexed)?
Any other ideas?
Regards
Till
Codemart GmbH
Till Vollmer
Managing Director
Tel: +49 (0)89 1213 5359
Mob: + 49 (0)160 718 7403
Fax: +49 (0)89 1892 1347
Yahoo ID: till_vollmer
Skype: till_vollmer
www.codemart.de
till.vollmer at codemart.de
From cstrom at mdlogix.com Tue Nov 20 08:36:09 2007
From: cstrom at mdlogix.com (Chris Strom)
Date: Tue, 20 Nov 2007 08:36:09 -0500
Subject: [Ferret-talk] Compound search / grouping
In-Reply-To:
References:
<20071120125229.GC5312@jaynestown.users.mdlogix.com>
Message-ID: <20071120133609.GD5312@jaynestown.users.mdlogix.com>
If you are using acts_as_ferret, it will never get called. The
acts_as_ferret declaration would go on the root class. Updates to the
child classes would not trigger an aaf index update in the root class.
If you want real-time index updates, you would have to add an
after_save callback to the child class that forces an aaf update in the
root class.
If real-time updates are not too important, then you could dump the child
updates into a queue that performs bulk updates. This would minimize the
number of times this method gets called.
If you're worried about 200+ SQL calls, don't perform the join in ruby, do
it via SQL using CONCAT and "Advanced Attribute" as described in AWDWR,
19.3.
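
The queue idea could look roughly like this. ReindexQueue is a
hypothetical class, not something aaf provides: the child's
after_save callback would call enqueue with the id of its root, and a
periodic job would call drain and reindex each root once. It lives in
one process only, so a mongrel cluster would need a shared store
(a database table, for instance) instead.

```ruby
require "set"

# Hypothetical in-process queue: collects the ids of roots whose children
# changed, so each root is reindexed once per batch instead of once per save.
class ReindexQueue
  def initialize
    @root_ids = Set.new
    @mutex = Mutex.new
  end

  # Meant to be called from the child's after_save callback.
  def enqueue(root_id)
    @mutex.synchronize { @root_ids << root_id }
  end

  # Empties the queue, yields each distinct root id once, and returns
  # the number of ids processed.
  def drain
    batch = @mutex.synchronize do
      ids = @root_ids.to_a
      @root_ids.clear
      ids
    end
    batch.each { |id| yield id }
    batch.size
  end
end
```

With 200 children saved in a burst, the root's title string is then
rebuilt once per drain rather than 200 times.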
-Chris
From tvollmer at codemart.de Tue Nov 20 09:08:52 2007
From: tvollmer at codemart.de (Till Vollmer)
Date: Tue, 20 Nov 2007 15:08:52 +0100
Subject: [Ferret-talk] Compound search / grouping
In-Reply-To: <20071120133609.GD5312@jaynestown.users.mdlogix.com>
References:
<20071120125229.GC5312@jaynestown.users.mdlogix.com>
<20071120133609.GD5312@jaynestown.users.mdlogix.com>
Message-ID:
Hi,
Thank you for your answer.
The root and the nodes are in the same table for us.
Is there no "group_by" or something for ferret? That would probably
do the trick.
Regards
Till
From cstrom at mdlogix.com Tue Nov 20 09:45:57 2007
From: cstrom at mdlogix.com (Chris Strom)
Date: Tue, 20 Nov 2007 09:45:57 -0500
Subject: [Ferret-talk] Compound search / grouping
In-Reply-To:
References:
<20071120125229.GC5312@jaynestown.users.mdlogix.com>
<20071120133609.GD5312@jaynestown.users.mdlogix.com>
Message-ID: <20071120144557.GE5312@jaynestown.users.mdlogix.com>
If it's all the same class, then don't index children:
  def ferret_enabled?(is_rebuild = false)
    @ferret_disabled.nil? && self.root?
  end
Only root nodes would get indexed with the above method (and an
appropriate root? definition) in place.
AFAIK, there is no group_by concept in aaf. At the same time, I don't
think it's really necessary. The above, combined with a single method
definition for children_titles_with_spaces should get you exactly what
you're looking to do.
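
For illustration, here is one possible root? definition next to that
check, written against a plain Ruby class so it runs on its own. Post
and the parent_id column are assumptions; in a real app the class
would be the ActiveRecord model carrying the acts_as_ferret
declaration.

```ruby
# Plain-Ruby stand-in for the model; parent_id is an assumed column name.
class Post
  attr_reader :parent_id

  def initialize(parent_id = nil)
    @parent_id = parent_id
    @ferret_disabled = nil
  end

  # A node without a parent is a root.
  def root?
    parent_id.nil?
  end

  # Same check as in the method above: only roots are indexed.
  def ferret_enabled?(is_rebuild = false)
    @ferret_disabled.nil? && root?
  end
end

puts Post.new.ferret_enabled?      # prints "true"  (root)
puts Post.new(42).ferret_enabled?  # prints "false" (child of 42)
```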
-Chris
From smaloff at veer.com Tue Nov 20 12:07:00 2007
From: smaloff at veer.com (Sheldon Maloff)
Date: Tue, 20 Nov 2007 18:07:00 +0100
Subject: [Ferret-talk] Question on Deploying a Ferret DRb server
Message-ID:
(Sorry if some people see this twice. I originally posted this question
from ruby-forum.com, but didn't realize that the Ferret forum was a
mirror and that I actually wasn't a member.)
Anyway,
I've read all the documentation I could find, and read most of this
forum, but I'm still a little confused on running a ferret DRb server.
All the examples seem to be from the point of view of running the DRb
server from within the context of a RoR web site. I'd like to consider
the following scenario:
Server 1: front-end web server + mongrel cluster
Server 2: Ferret DRb server
Server 3: MySQL database
My question is related to Server 2. Exactly what is it that I have to
deploy to that computer to have a Ferret DRb server? I understand that
on Server 1, ferret_server.yml should have a production entry that
points to Server 2, and that all the models on Server 1 need :remote =>
true. But what lives on server 2? Do I just deploy the models folder,
the config folder and the scripts folder? Or do I deploy an entire copy
of the web site code?
I haven't found the answer to that in anything I've read, so I thought
I'd ask here.
Thanks,
Sheldon Maloff
--
Posted via http://www.ruby-forum.com/.
From kraemer at webit.de Tue Nov 20 13:13:46 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Tue, 20 Nov 2007 19:13:46 +0100
Subject: [Ferret-talk] Question on Deploying a Ferret DRb server
In-Reply-To:
References:
Message-ID: <20071120181346.GN29982@cordoba.webit.de>
Hi!
On Tue, Nov 20, 2007 at 06:07:00PM +0100, Sheldon Maloff wrote:
> (Sorry if some people see this twice. I originally posted this question
> from ruby-forum.com, but didn't realize that the Ferret forum was a
> mirror and that I actually wasn't a member.)
>
> Anyway,
>
> I've read all the documentation I could find, and read most of this
> forum, but I'm a still a little confused on running a ferret DRb server.
>
> All the examples seem to be from the point of view of running the DRb
> server from within the context of a RoR web site. I'd like to consider
> the following scenario:
>
> Server 1: front-end web server + mongrel cluster
> Server 2: Ferret DRb server
> Server 3: MySQL database
>
> My question is related to Server 2? Exactly what is it that I have to
> deploy to that computer to have a Ferret DRb server? I understand that
> on Server 1, ferret_server.yml should have a production entry that
> points to Server 2, and that all the models on Server 1 need :remote =>
> :true. But what lives on server 2? Do I just deploy the models folder,
> the config folder and the scripts folder? Or do I deploy an entire copy
> of the web site code?
The DRb server needs at least access to your model classes, I think
the easiest way is to just deploy a copy of your whole Rails app to
the DRb server.
Of course you won't need your views there, but I'd find it easier to
just deploy the whole app to a second place with capistrano than
manually ripping the parts needed for DRb off.
Btw, you don't need :remote => true anymore with the current release of
aaf. Just configure your ferret_server.yml for production environment.
Cheers,
Jens
--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de
Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
From smaloff at veer.com Tue Nov 20 13:31:20 2007
From: smaloff at veer.com (Sheldon Maloff)
Date: Tue, 20 Nov 2007 19:31:20 +0100
Subject: [Ferret-talk] Question on Deploying a Ferret DRb server
In-Reply-To: <20071120181346.GN29982@cordoba.webit.de>
References:
<20071120181346.GN29982@cordoba.webit.de>
Message-ID:
Jens Kraemer wrote:
> Of course you won't need your views there, but I'd find it easier to
> just deploy the whole app to a second place with capistrano than
> manually ripping the parts needed for DRb off.
Thanks Jens. That's what I understood by reading everything I could. I
just wanted confirmation that that was the recommended practice.
Keep up the excellent work on AAF.
Cheers,
Sheldon Maloff
--
Posted via http://www.ruby-forum.com/.
From scottd at gmail.com Wed Nov 21 14:53:51 2007
From: scottd at gmail.com (Scott Davies)
Date: Wed, 21 Nov 2007 11:53:51 -0800
Subject: [Ferret-talk] Multithreading / multiprocessing woes
In-Reply-To:
References: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
<75f591160711161235g692622c5p4b7fb907e9596cdb@mail.gmail.com>
<8C5DFF2D-440E-4A5A-8D25-5CA965938D02@ehatchersolutions.com>
<75f591160711170212q1d4d6475v3e830a64ff4c3dc2@mail.gmail.com>
Message-ID: <75f591160711211153w460068acr931b7b0408368da6@mail.gmail.com>
For the record, while Lucene is pretty well-behaved as far as I can
tell, DRb running under JRuby is not. When hit with multiple request
streams simultaneously, DRb under JRuby 1.0.2 very quickly falls over
and stops responding to all queries. DRb under JRuby 1.1b1 *almost*
works, but every now and then JRuby will freak out and for a few
requests things will fail in very strange ways. (Attempts to
construct Java objects will fail with exceptions such as "undefined
method `constructors' for nil:NilClass" or "undefined method
`java_class' for Class:Class"; sometimes looking up a class will
fail...)
On the plus side, I do get the impression that JRuby development is
pretty active, and I see some concurrency bugs listed as high-priority
for JRuby 1.1, some of which have already been patched in the trunk.
My guess is that JRuby+Lucene+DRb will be a fine choice in a few
months...it was actually pretty painless to set up, even with MRI Ruby
RoR clients talking to a JRuby indexing server. (I have a simple
metaprogramming hack that lets the client specify a sequence of code
to execute on the server side, where the specification looks *almost*
like normal Ruby code; this effectively lets me easily construct
gnarly Lucene query trees in MRI Ruby clients that know nothing about
Lucene or Java. I actually initially came up with this hack to work
around Ferret's "query trees and filters don't marshal" issue.)
JRuby's not ready for serious use in scenarios with concurrency just
yet, though.
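
One way such a record-and-replay hack could work, as a rough sketch
and not the actual code referred to above: the client builds a chain
of method calls on a recorder object, the recorded calls are plain
arrays that marshal without trouble, and the server replays them
against a real object. CallRecorder, __calls__ and replay are
hypothetical names.

```ruby
# Records method calls instead of running them. The result is plain arrays
# of symbols and arguments, which Marshal (and therefore DRb) can transfer.
class CallRecorder < BasicObject
  def initialize
    @calls = []
  end

  # Every unknown method is appended to the script; returning self
  # lets the client chain calls as in normal Ruby code.
  def method_missing(name, *args)
    @calls << [name, args]
    self
  end

  # Returns the recorded [name, args] pairs.
  def __calls__
    @calls
  end
end

# Server side: replay the recorded chain against a real object.
def replay(target, calls)
  calls.reduce(target) { |obj, (name, args)| obj.public_send(name, *args) }
end

script = CallRecorder.new.upcase.center(9, "*").__calls__
wire = Marshal.dump(script)               # what would cross the DRb boundary
puts replay("car", Marshal.load(wire))    # prints "***CAR***"
```

A String stands in here for the server-side query builder. Replaying
arbitrary method names sent by a client is risky, so a real server
would want to check each name against a list of allowed methods
first.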
Meanwhile, I'm hoping to avoid Solr because it seems (1) kind of
complicated for what I'd actually get out of it in my particular
application, (2) not particularly well-documented given its size, and
(3) likely to get in my way when I want to do anything low-level and
gnarly with Lucene.
I guess I'll continue limping along with Ferret for the moment and
hope the concurrency issues get worked out soonish. Has anyone
actually decided specifically to make Ferret bulletproof in the face
of concurrency over the next few months, or is it probably just not
going to happen? If it doesn't, I suspect Ferret will probably fall
by the wayside as more Ruby people jump ship for Lucene-based
solutions. Which would be a shame, because Ferret does hold a lot of
promise...indexing is hard, and Ferret is *almost* a great solution.
(Too bad the last 20% is usually 80% of the work...)
-- Scott
On Nov 18, 2007 4:45 PM, Julio Cesar Ody wrote:
> Great. For my own curiosity, and maybe people here share some of it:
>
> Is it possible to write your own custom analyzers for Solr? If so, how
> easy it is? Can one do that in Ruby or do I have to write it in Java?
>
> I personally think that's one of the greatest things about Ferret. So
> far I haven't bothered looking into Sphinx or Solr precisely because,
> from a glance, I couldn't find a way to customize anything in detail
> like I can do with Ferret. I assume there is a way...
>
> Thing is, reading through the Ferret booklet (the one from OReilly),
> you get a glimpse of how easy it is to build custom solutions using
> it. So whereas it's kind of sad that the lead developer has been
> distant from the project in the last few months (?), I have to say,
> there's hardly matching how easy it is to work with it.
>
>
>
>
> On Nov 18, 2007 8:29 PM, Erik Hatcher wrote:
> >
> > On Nov 17, 2007, at 5:12 AM, Scott Davies wrote:
> > > Hmmm...I'd first heard of Solr only a couple of days ago, and I hadn't
> > > been aware of a Ruby API to it until you mentioned it.
> > > Interesting...thanks!
> >
> > I've honestly given fairly little of my time to Ferret, though I have
> > tinkered with it some and it is mighty fine!
> >
> > Believe you me, I don't want to steal any thunder from Ferret. And
> > I've not compared/contrasted them much myself. Truth be told I'm
> > still a Java dude, and knowing that Lucene and Solr are in Java,
> > excel at what they are designed to do and already gulping the Apache
> > cool-ade I really dig Solr.
> >
> > I've presented solr+ruby a couple of times now, once at RailsConf and
> > then again a few weeks ago at rubyconf.
> >
> > RailsConf:
> >
> >
> > rubyconf:
> >
> >
> > acts_as_solr as it exists today is sub-optimal compared to
> > acts_as_ferret. I'm quite admittedly not much into relational
> > databases so I have only tinkered in this area myself.
> >
> > Erik
> >
> > _______________________________________________
> > Ferret-talk mailing list
> > Ferret-talk at rubyforge.org
> > http://rubyforge.org/mailman/listinfo/ferret-talk
> >
From erik at ehatchersolutions.com Wed Nov 21 15:24:51 2007
From: erik at ehatchersolutions.com (Erik Hatcher)
Date: Wed, 21 Nov 2007 15:24:51 -0500
Subject: [Ferret-talk] Multithreading / multiprocessing woes
In-Reply-To: <75f591160711211153w460068acr931b7b0408368da6@mail.gmail.com>
References: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
<75f591160711161235g692622c5p4b7fb907e9596cdb@mail.gmail.com>
<8C5DFF2D-440E-4A5A-8D25-5CA965938D02@ehatchersolutions.com>
<75f591160711170212q1d4d6475v3e830a64ff4c3dc2@mail.gmail.com>
<75f591160711211153w460068acr931b7b0408368da6@mail.gmail.com>
Message-ID: <40DFBD5A-8552-4639-8599-979BABA927F6@ehatchersolutions.com>
On Nov 21, 2007, at 2:53 PM, Scott Davies wrote:
> My guess is that JRuby+Lucene+DRb will be a fine choice in a few
> months...
Definitely not a bad choice. However I still implore you to give
Solr another chance. More on that....
> Meanwhile, I'm hoping to avoid Solr because it seems (1) kind of
> complicated for what I'd actually get out of it in my particular
> application
How so? It's a "search server" with the same goals that I imagine
you'd have for the JRuby+Lucene+DRb combination.
It's not really complicated, especially with the solr-ruby library.
Add documents, delete them, query for them. Leverage highlighting
and more-like-this features, dismax querying, etc.
> , (2) not particularly well-documented given its size
Wow. Have you seen the Solr wiki? http://wiki.apache.org/solr -
there are nooks and crannies documented on that wiki that go well
beyond what I'd consider good documentation.
By all means point me to areas that aren't documented that you need
to know (off list) and I'll get those taken care of.
> (3) likely to get in my way when I want to do anything low-level and
> gnarly with Lucene.
Maybe, but not much in your way. You'd have to wrap your low-level
mojo inside some Solr API perhaps, but not even if we're just talking
about custom analyzers or similarity implementation.
> Which would be a shame, because Ferret does hold a lot of
> promise..
Hear, hear! I definitely extend major kudos to Dave and the other
Ferret contributors. Great stuff.
Erik
From scottd at gmail.com Wed Nov 21 17:04:51 2007
From: scottd at gmail.com (Scott Davies)
Date: Wed, 21 Nov 2007 14:04:51 -0800
Subject: [Ferret-talk] Multithreading / multiprocessing woes
In-Reply-To: <40DFBD5A-8552-4639-8599-979BABA927F6@ehatchersolutions.com>
References: <75f591160711160256i68802afcg8af5ae4e95c67636@mail.gmail.com>
<75f591160711161235g692622c5p4b7fb907e9596cdb@mail.gmail.com>
<8C5DFF2D-440E-4A5A-8D25-5CA965938D02@ehatchersolutions.com>
<75f591160711170212q1d4d6475v3e830a64ff4c3dc2@mail.gmail.com>
<75f591160711211153w460068acr931b7b0408368da6@mail.gmail.com>
<40DFBD5A-8552-4639-8599-979BABA927F6@ehatchersolutions.com>
Message-ID: <75f591160711211404r68c831f5p85a107b240f1b86b@mail.gmail.com>
On Nov 21, 2007 12:24 PM, Erik Hatcher wrote:
>
> How so? It's a "search server" with the same goals that I imagine
> you'd have for the JRuby+Lucene+DRb combination.
It's a bit more than I need right out of the gate, what with the
caching, replication, faceted search, etc. Of course, that might not
be a problem if it uses sensible configuration defaults I can safely
ignore to start with.
> It's not really complicated, especially with the solr-ruby library.
> Add documents, delete them, query for them. Leverage highlighting
> and more-like-this features, dismax querying, etc.
My particular application does enough weird things that, for the most
part, I'd prefer unfettered access to the low-level Lucene APIs. (For
example, my application uses a lot of gnarly query trees involving
filters and ranges, and I'm not sure whether those are easily
transmitted through the Solr APIs. Then I have "run all of these
queries against each of the documents in this specific set and tell me
which document/query pairs match in one fell swoop" routines, in which
case it might be a good idea to copy the documents into a temporary
RAM index to run the queries against.)
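The "copy the documents into a temporary RAM index and run every query against it" idea can be sketched roughly as follows. This is a minimal illustration in plain Ruby, not Ferret's or Lucene's API: `TinyRamIndex` and `matching_pairs` are hypothetical names, and the queries are simplified to "all of these terms must appear" rather than the filter and range trees mentioned above.

```ruby
require 'set'

# Hypothetical stand-in for a temporary RAM index: term => set of doc ids.
# Plain Ruby for illustration only; this is not Ferret's or Lucene's API.
class TinyRamIndex
  def initialize
    @postings = Hash.new { |hash, term| hash[term] = Set.new }
  end

  # Split the text into lowercase word tokens and record the doc id for each.
  def add(doc_id, text)
    text.downcase.scan(/\w+/).each { |term| @postings[term] << doc_id }
  end

  # Doc ids that contain every term in the list (a simple AND query).
  def docs_matching_all(terms)
    sets = terms.map { |term| @postings.fetch(term.downcase, Set.new) }
    return Set.new if sets.empty?
    sets.inject { |acc, set| acc & set }
  end
end

# Build the throwaway index once, run every query against it, and return
# the sorted [doc_id, query_name] pairs that matched.
def matching_pairs(docs, queries)
  index = TinyRamIndex.new
  docs.each { |doc_id, text| index.add(doc_id, text) }
  pairs = []
  queries.each do |name, terms|
    index.docs_matching_all(terms).each { |doc_id| pairs << [doc_id, name] }
  end
  pairs.sort
end
```

The point of the design is that the document set is indexed once and each query then costs only a few set intersections, instead of re-scanning every document for every query.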
>
> > , (2) not particularly well-documented given its size
>
> Wow. Have you seen the Solr wiki? http://wiki.apache.org/solr -
> there are nooks and crannies documented on that wiki that go well
> beyond what I'd consider good documentation.
>
> By all means point me to areas that aren't documented that you need
> to know (off list) and I'll get those taken care of.
Wikis are fine for looking up details when you already mostly know
what you're doing, but they're not nearly as useful when you're in the
earlier stages trying to get the big "What does this system look like
and how does it work?" picture and evaluate initial plans of attack.
Ferret and Lucene both have entire *books* written about them that are
*excellent* for those purposes. (They're not free-as-in-beer, but are
well worth the cost.) By comparison, Solr has a very simple "here is
how you get a straightforward app off the ground" tutorial that says
little about how Solr is actually organized, and then you're basically
left staring at a Wiki page with a thousand bullet points and no clear
path to big-picture enlightenment. And given the choice between (1)
using a lower-level system that's been very well-documented in a
well-organized explanatory fashion and (2) using a slightly
higher-level system I still haven't acquired a mental "big picture"
for, I generally find (1) more productive.
This isn't a criticism of Solr's documentation nearly as much as a
hearty "Book-style documentation is useful, and, holy crap, Ferret and
Lucene actually HAVE IT. Woohoo!", plus an added bonus testament to
my own laziness.
> > (3) likely to get in my way when I want to do anything low-level and
> > gnarly with Lucene.
>
> Maybe, but not much in your way. You'd have to wrap your low-level
> mojo inside some Solr API perhaps, but not even if we're just talking
> about custom analyzers or similarity implementation.
Yeah, my guess is that if I sit down and figure out how Solr is laid
out, adding APIs to do what I want won't be too hard. Might still be
kind of tedious implementing all the necessary marshaling, though.
-- Scott
From me at benjaminarai.com Sat Nov 24 17:30:56 2007
From: me at benjaminarai.com (Benjamin Arai)
Date: Sat, 24 Nov 2007 14:30:56 -0800
Subject: [Ferret-talk] Getting a Lucene.net index readable by Ferret
Message-ID: <4748A620.2070608@benjaminarai.com>
Hi,
What would it take to get a Lucene.net index readable by Ferret? I
know that there has been discussion on this before, but I am trying to
figure out the actual amount of work (cost) that would be required to
get this done. Any help would be greatly appreciated.
Benjamin
From jk at jkraemer.net Mon Nov 26 16:11:26 2007
From: jk at jkraemer.net (Jens Krämer)
Date: Mon, 26 Nov 2007 22:11:26 +0100
Subject: [Ferret-talk] search not working after upgrade
In-Reply-To:
References: <3d7046ce8dad1b0bc671415dc978587a@ruby-forum.com>
Message-ID: <3d32bbb7d2da1727835043216042065a@ruby-forum.com>
Izit Izit wrote:
> Correction on my previous post.
>
> The correct way to do it is:
>
> Product.find_by_contents("*",{},:conditions =>search_conditions,:include
> => [:supplier],:order =>"products.id" )
>
> Leave out the :limit=>:all that is put in by default.
Exactly - I tried to make aaf a bit more clever by letting it assume
:limit => :all whenever sql conditions are given, but messed it up
somehow ;-)
It's fixed in trunk
(http://projects.jkraemer.net/acts_as_ferret/changeset/286), or just
apply the attached patch.
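For readers without the patch at hand, the intended behaviour described above (assume :limit => :all whenever SQL conditions are given, unless the caller set a limit) might look roughly like this. It is a sketch only: `ferret_options_for` is a hypothetical helper and is not taken from changeset 286 or from acts_as_ferret's source.

```ruby
# Hypothetical helper, NOT acts_as_ferret's actual code: work out the
# Ferret-side options from the caller's options and the SQL find options.
def ferret_options_for(options, find_options)
  merged = options.dup
  # When SQL conditions are present and the caller gave no explicit limit,
  # default to :all as described in the message above.
  if find_options[:conditions] && !merged.key?(:limit)
    merged[:limit] = :all
  end
  merged
end
```

Under that rule a call such as the one quoted above would not need to pass :limit at all, and an explicit :limit from the caller would still be respected.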
Btw, this whole thread hasn't come through to the mailing list (yet?), I
discovered it by pure chance. Please subscribe to the ferret mailing
list (http://rubyforge.org/mail/?group_id=1028) and post there directly
to make sure your posting gets actually read.
Cheers,
Jens
Attachments:
http://www.ruby-forum.com/attachment/1044/fix_limit_all.diff
--
Posted via http://www.ruby-forum.com/.
From john at digitalpulp.com Tue Nov 27 21:11:35 2007
From: john at digitalpulp.com (John Bachir)
Date: Tue, 27 Nov 2007 21:11:35 -0500
Subject: [Ferret-talk] flakey web-list interface (was: search not working
after upgrade)
In-Reply-To: <3d32bbb7d2da1727835043216042065a@ruby-forum.com>
References: <3d7046ce8dad1b0bc671415dc978587a@ruby-forum.com>
<3d32bbb7d2da1727835043216042065a@ruby-forum.com>
Message-ID:
On Nov 26, 2007, at 4:11 PM, Jens Krämer wrote:
> Btw, this whole thread hasn't come through to the mailing list
> (yet?), I
> discovered it by pure chance. Please subscribe to the ferret mailing
> list (http://rubyforge.org/mail/?group_id=1028) and post there
> directly
> to make sure your posting gets actually read.
Jens-
I see this happen a lot on rubyforge -- is it because it only brings
email in from the web interface when the poster is subscribed? Or is
it just flakey software? Do you have any insight into how we might be
able to get rubyforge to either address or document this issue?
John
From kraemer at webit.de Wed Nov 28 04:14:40 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Wed, 28 Nov 2007 10:14:40 +0100
Subject: [Ferret-talk] flakey web-list interface (was: search
not working after upgrade)
In-Reply-To:
References: <3d7046ce8dad1b0bc671415dc978587a@ruby-forum.com>
<3d32bbb7d2da1727835043216042065a@ruby-forum.com>
Message-ID: <20071128091440.GI5751@cordoba.webit.de>
On Tue, Nov 27, 2007 at 09:11:35PM -0500, John Bachir wrote:
>
> On Nov 26, 2007, at 4:11 PM, Jens Krämer wrote:
> > Btw, this whole thread hasn't come through to the mailing list
> > (yet?), I
> > discovered it by pure chance. Please subscribe to the ferret mailing
> > list (http://rubyforge.org/mail/?group_id=1028) and post there
> > directly
> > to make sure your posting gets actually read.
>
> Jens-
>
> I see this happen a lot on rubyforge -- is it because it only brings
> email in from the web interface when the poster is subscribed? Or is
> it just flakey software? Do you have any insight into how we might be
> able to get rubyforge to either address or document this issue?
I'm not sure why this happens, maybe some spam prevention kicks in, or
it's the way you said, that it only accepts messages from people
subscribed to the mailing list. I'll try and ask Andreas Schwarz,
the creator of ruby-forum.com, about this.
Cheers,
Jens
--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de
Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
From tvollmer at codemart.de Wed Nov 28 04:25:35 2007
From: tvollmer at codemart.de (Till Vollmer)
Date: Wed, 28 Nov 2007 10:25:35 +0100
Subject: [Ferret-talk] Ferret on Mac OS X Leopard
In-Reply-To: <20071128091440.GI5751@cordoba.webit.de>
References: <3d7046ce8dad1b0bc671415dc978587a@ruby-forum.com>
<3d32bbb7d2da1727835043216042065a@ruby-forum.com>
<20071128091440.GI5751@cordoba.webit.de>
Message-ID: <9ED901E2-1E96-4614-AEF2-F67E10B75E54@codemart.de>
Hello,
I have some major problems on installing ferret on Leopard. While I
know it's already installed when you install Leopard I want to do it
manually as I am not using the installed version of ruby (since I
migrated from Tiger).
When the native extensions are compiled I get some linker problems.
Can anyone reproduce that?
I have an older version of ferret installed (which I installed while
being on Tiger) and this works fine, but I want to upgrade.
Regards
Till
From andreas.korth at gmail.com Wed Nov 28 07:36:05 2007
From: andreas.korth at gmail.com (Andreas Korth)
Date: Wed, 28 Nov 2007 13:36:05 +0100
Subject: [Ferret-talk] Ferret on Mac OS X Leopard
In-Reply-To: <9ED901E2-1E96-4614-AEF2-F67E10B75E54@codemart.de>
References: <3d7046ce8dad1b0bc671415dc978587a@ruby-forum.com>
<3d32bbb7d2da1727835043216042065a@ruby-forum.com>
<20071128091440.GI5751@cordoba.webit.de>
<9ED901E2-1E96-4614-AEF2-F67E10B75E54@codemart.de>
Message-ID:
On 28.11.2007, at 10:25, Till Vollmer wrote:
> I have some major problems on installing ferret on Leopard.
Interestingly, the words "major problems" and "Leopard" coincide a lot
lately.
Here's my advice:
- get rid of the probably buggiest piece of software that apple has
ever shipped
- go back to tiger and stay there for at least a couple of major updates
- oh and if you made the same mistake and updated tiger instead of a
fresh install, I'm very sorry for you ;)
> When the native extensions are compiled I get some linker problems.
If you can't resist the temptation of leopard's amazingly great
feature set
- make sure you install the latest ruby version as well as any other
package via macports
- have apple's developer tools installed in advance (includes gcc +
build tools)
- a talisman might probably be helpful
Best of luck,
Andy
From ndaniels at mac.com Wed Nov 28 13:36:24 2007
From: ndaniels at mac.com (Noah M. Daniels)
Date: Wed, 28 Nov 2007 13:36:24 -0500
Subject: [Ferret-talk] Ferret on Mac OS X Leopard
In-Reply-To:
References: <3d7046ce8dad1b0bc671415dc978587a@ruby-forum.com>
<3d32bbb7d2da1727835043216042065a@ruby-forum.com>
<20071128091440.GI5751@cordoba.webit.de>
<9ED901E2-1E96-4614-AEF2-F67E10B75E54@codemart.de>
Message-ID: <9952A4BA-F039-4738-9AF6-0186D6DCBAAF@mac.com>
On Nov 28, 2007, at 7:36 AM, Andreas Korth wrote:
>
> On 28.11.2007, at 10:25, Till Vollmer wrote:
>
>> I have some major problems on installing ferret on Leopard.
>
> Interestingly, the words "major problems" and "Leopard" coincide a lot
> lately.
>
> Here's my advice:
>
> - get rid of the probably buggiest piece of software that apple has
> ever shipped
>
> If you can't resist the temptation of leopard's amazingly great
> feature set
>
> - make sure you install the latest ruby version as well as any other
> package via macports
> - have apple's developer tools installed in advance (includes gcc +
> build tools)
> - a talisman might probably be helpful
For what it's worth, I've had zero problems with the Apple-supplied
ruby, rails, etc... I updated some gems but ferret is running great
(better than on our Linux servers, in fact -- no end of problems on
Ubuntu 7.04 x64). I would recommend sticking with the Apple-supplied
ruby; for once they've gotten it right, and everything seems to work
beautifully.
From andreas.korth at gmail.com Thu Nov 29 07:01:21 2007
From: andreas.korth at gmail.com (Andreas Korth)
Date: Thu, 29 Nov 2007 13:01:21 +0100
Subject: [Ferret-talk] Ferret on Mac OS X Leopard
In-Reply-To: <9952A4BA-F039-4738-9AF6-0186D6DCBAAF@mac.com>
References: <3d7046ce8dad1b0bc671415dc978587a@ruby-forum.com>
<3d32bbb7d2da1727835043216042065a@ruby-forum.com>
<20071128091440.GI5751@cordoba.webit.de>
<9ED901E2-1E96-4614-AEF2-F67E10B75E54@codemart.de>
<9952A4BA-F039-4738-9AF6-0186D6DCBAAF@mac.com>
Message-ID: <4136BE39-04ED-4B11-B6C5-6472E17FDA05@gmail.com>
On 28.11.2007, at 19:36, Noah M. Daniels wrote:
>> Interestingly, the words "major problems" and "Leopard" coincide a
>> lot
>> lately.
> For what it's worth, I've had zero problems with the Apple-supplied
> ruby, rails, etc...
Frankly, knowing that everything works well for others isn't worth
much to people who _are_ having problems ;)
But it appears to be a common reaction, especially in the Apple
community.
> I would recommend sticking with the Apple-supplied
> ruby; for once they've gotten it right, and everything seems to work
> beautifully.
You must have a different understanding of 'getting it right'. Here's
what I got when I entered 'ruby -v' or 'gems' into the console of a
fresh 10.5 install:
-bash: ruby: command not found
After getting it to work eventually, a 'gem update --system' just
wrecked the whole Ruby installation. At that point I just gave up and
installed Ruby/Gems and Rails via Macports.
One thing I'd really like to know is how one is supposed to update the
Ruby/Rails packages which shipped with Leopard. I had no chance to
check, but are they still shipping Rails 1.1.2? I bet that Apple isn't
going to update Ruby during the whole lifetime of Leopard. Anything
else would be a big surprise.
So here goes my advice again: Use Macports. Do not use whatever Apple
ships.
Cheers,
Andy
From ndaniels at mac.com Thu Nov 29 11:58:03 2007
From: ndaniels at mac.com (Noah M. Daniels)
Date: Thu, 29 Nov 2007 11:58:03 -0500
Subject: [Ferret-talk] Ferret on Mac OS X Leopard
In-Reply-To: <4136BE39-04ED-4B11-B6C5-6472E17FDA05@gmail.com>
References: <3d7046ce8dad1b0bc671415dc978587a@ruby-forum.com>
<3d32bbb7d2da1727835043216042065a@ruby-forum.com>
<20071128091440.GI5751@cordoba.webit.de>
<9ED901E2-1E96-4614-AEF2-F67E10B75E54@codemart.de>
<9952A4BA-F039-4738-9AF6-0186D6DCBAAF@mac.com>
<4136BE39-04ED-4B11-B6C5-6472E17FDA05@gmail.com>
Message-ID:
On Nov 29, 2007, at 7:01 AM, Andreas Korth wrote:
>
> On 28.11.2007, at 19:36, Noah M. Daniels wrote:
>
>>> Interestingly, the words "major problems" and "Leopard" coincide a
>>> lot
>>> lately.
>
>> For what it's worth, I've had zero problems with the Apple-supplied
>> ruby, rails, etc...
>
> Frankly, knowing that everything works well for others isn't worth
> much to people who _are_ having problems ;)
>
> But it appears to be a common reaction, especially in the Apple
> community.
Point taken, but I was responding to the poster's statement that
they'd avoided the apple-supplied Ruby.
>
>
> You must have a different understanding of 'getting it right'.
> Here's what I got when I entered 'ruby -v' or 'gems' into the
> console of a fresh 10.5 install:
>
> -bash: ruby: command not found
>
That's very strange, but just sounds like a path issue.
> After getting it to work eventually, a 'gem update --system' just
> wrecked the whole Ruby installation. At that point I just gave up
> and installed Ruby/Gems and Rails via Macports.
>
Yes, this one is a known issue. See these links:
http://discussions.apple.com/thread.jspa?threadID=1200950&tstart=0
http://discussions.apple.com/thread.jspa?threadID=1202925&tstart=0
> One thing I'd really like to know is how one is supposed to update
> the Ruby/Rails packages which shipped with Leopard. I had no chance
> to check, but are they still shipping Rails 1.1.2? I bet that Apple
> isn't going to update Ruby during the whole lifetime of Leopard.
> Anything else would be a big surprise.
>
It comes with rails 1.2.3, and gem update rails updates it to 1.2.5
(well, now 1.2.6) just fine.
> So here goes my advice again: Use Macports. Do not use whatever
> Apple ships.
I disagree in the most friendly way possible :)
From marvin at rectangular.com Thu Nov 29 12:02:26 2007
From: marvin at rectangular.com (Marvin Humphrey)
Date: Thu, 29 Nov 2007 09:02:26 -0800
Subject: [Ferret-talk] Ferret on Mac OS X Leopard
In-Reply-To: <4136BE39-04ED-4B11-B6C5-6472E17FDA05@gmail.com>
References: <3d7046ce8dad1b0bc671415dc978587a@ruby-forum.com>
<3d32bbb7d2da1727835043216042065a@ruby-forum.com>
<20071128091440.GI5751@cordoba.webit.de>
<9ED901E2-1E96-4614-AEF2-F67E10B75E54@codemart.de>
<9952A4BA-F039-4738-9AF6-0186D6DCBAAF@mac.com>
<4136BE39-04ED-4B11-B6C5-6472E17FDA05@gmail.com>
Message-ID: <931A7E6F-5D4E-4F92-BA5E-4296BF1F5446@rectangular.com>
On Nov 29, 2007, at 4:01 AM, Andreas Korth wrote:
> Here's what I got when I entered 'ruby -v' or 'gems' into the
> console of a fresh 10.5 install:
>
> -bash: ruby: command not found
Curious. Did you install the developer tools? Here's what I get in
Terminal with a fresh install of 10.5 and XCode 3.0:
/Users/marvin/ $ ruby -v
ruby 1.8.6 (2007-06-07 patchlevel 36) [universal-darwin9.0]
/Users/marvin/ $ which ruby
/usr/bin/ruby
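When more than one Ruby is installed (Apple's in /usr/bin, a MacPorts or source build elsewhere), it can help to list every `ruby` on the search path in lookup order instead of only the first hit that `which` reports. A small sketch, assuming some working Ruby is available to run it; `executables_on_path` is a hypothetical helper name, not a standard library method.

```ruby
# List every executable called `name` on the given search path, in lookup
# order; the first entry is the one the shell would normally run.
def executables_on_path(name, path = ENV['PATH'].to_s)
  dirs = path.split(File::PATH_SEPARATOR).reject(&:empty?)
  candidates = dirs.map { |dir| File.join(dir, name) }
  candidates.select { |file| File.file?(file) && File.executable?(file) }
end
```

Calling `puts executables_on_path('ruby')` or `puts executables_on_path('gem')` prints one path per line; an empty result corresponds to the "command not found" case.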
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
From lebreeze at gmail.com Fri Nov 30 07:48:10 2007
From: lebreeze at gmail.com (Levent Ali)
Date: Fri, 30 Nov 2007 12:48:10 +0000
Subject: [Ferret-talk] Cannot install ferret gem on Leopard
Message-ID: <76685bc50711300448p646c8124q6f5a42ce28ff6946@mail.gmail.com>
I have 0.11.3 installed
When I try 0.11.6 or 0.11.5 I get the following output
Building native extensions. This could take a while...
ERROR: While executing gem ... (Gem::Installer::ExtensionBuildError)
ERROR: Failed to build gem native extension.
ruby extconf.rb install ferret
creating Makefile
make
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c analysis.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c api.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c array.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c bitvector.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c compound_io.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c document.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c except.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c ferret.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c filter.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c fs_store.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c global.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c hash.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c hashset.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c helper.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c index.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c libstemmer.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c mempool.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c multimapper.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c posh.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
priorityqueue.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c q_boolean.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
q_const_score.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
q_filtered_query.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c q_fuzzy.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c q_match_all.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
q_multi_term.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c q_parser.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c q_phrase.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c q_prefix.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c q_range.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c q_span.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c q_term.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c q_wildcard.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c r_analysis.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c r_index.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c r_qparser.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c r_search.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c r_store.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c r_utils.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c ram_store.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c search.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c similarity.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c sort.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_ISO_8859_1_danish.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_ISO_8859_1_dutch.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_ISO_8859_1_english.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_ISO_8859_1_finnish.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_ISO_8859_1_french.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_ISO_8859_1_german.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_ISO_8859_1_italian.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_ISO_8859_1_norwegian.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_ISO_8859_1_porter.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_ISO_8859_1_portuguese.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_ISO_8859_1_spanish.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_ISO_8859_1_swedish.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_KOI8_R_russian.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_UTF_8_danish.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_UTF_8_dutch.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_UTF_8_english.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_UTF_8_finnish.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_UTF_8_french.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_UTF_8_german.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_UTF_8_italian.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_UTF_8_norwegian.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_UTF_8_porter.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_UTF_8_portuguese.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_UTF_8_russian.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_UTF_8_spanish.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
stem_UTF_8_swedish.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c stopwords.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c store.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c
term_vectors.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3
-I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I. -fno-common -g -O2
-fno-common -pipe -fno-common -D_FILE_OFFSET_BITS=64 -c utilities.c
cc -dynamic -bundle -undefined suppress -flat_namespace
-L"/usr/local/lib" -o ferret_ext.bundle analysis.o api.o array.o
bitvector.o compound_io.o document.o except.o ferret.o filter.o
fs_store.o global.o hash.o hashset.o helper.o index.o libstemmer.o
mempool.o multimapper.o posh.o priorityqueue.o q_boolean.o
q_const_score.o q_filtered_query.o q_fuzzy.o q_match_all.o
q_multi_term.o q_parser.o q_phrase.o q_prefix.o q_range.o q_span.o
q_term.o q_wildcard.o r_analysis.o r_index.o r_qparser.o r_search.o
r_store.o r_utils.o ram_store.o search.o similarity.o sort.o
stem_ISO_8859_1_danish.o stem_ISO_8859_1_dutch.o
stem_ISO_8859_1_english.o stem_ISO_8859_1_finnish.o
stem_ISO_8859_1_french.o stem_ISO_8859_1_german.o
stem_ISO_8859_1_italian.o stem_ISO_8859_1_norwegian.o
stem_ISO_8859_1_porter.o stem_ISO_8859_1_portuguese.o
stem_ISO_8859_1_spanish.o stem_ISO_8859_1_swedish.o
stem_KOI8_R_russian.o stem_UTF_8_danish.o stem_UTF_8_dutch.o
stem_UTF_8_english.o stem_UTF_8_finnish.o stem_UTF_8_french.o
stem_UTF_8_german.o stem_UTF_8_italian.o stem_UTF_8_norwegian.o
stem_UTF_8_porter.o stem_UTF_8_portuguese.o stem_UTF_8_russian.o
stem_UTF_8_spanish.o stem_UTF_8_swedish.o stopwords.o store.o
term_vectors.o utilities.o -lruby -lpthread -ldl -lobjc
/usr/bin/ld: /usr/lib/gcc/i686-apple-darwin8/4.0.1/../../../libpthread.dylib
unknown flags (type) of section 6 (__TEXT,__dof_plockstat) in load
command 0
/usr/bin/ld: /usr/lib/gcc/i686-apple-darwin8/4.0.1/../../../libdl.dylib
unknown flags (type) of section 6 (__TEXT,__dof_plockstat) in load
command 0
/usr/bin/ld: /usr/lib/gcc/i686-apple-darwin8/4.0.1/../../../libobjc.dylib
load command 9 unknown cmd field
/usr/bin/ld: /usr/lib/gcc/i686-apple-darwin8/4.0.1/../../../libSystem.dylib
unknown flags (type) of section 6 (__TEXT,__dof_plockstat) in load
command 0
/usr/bin/ld: /usr/lib/libSystem.B.dylib unknown flags (type) of
section 6 (__TEXT,__dof_plockstat) in load command 0
collect2: ld returned 1 exit status
make: *** [ferret_ext.bundle] Error 1