From adamjroth at gmail.com Wed Mar 1 00:50:46 2006
From: adamjroth at gmail.com (Adam Roth)
Date: Wed, 1 Mar 2006 06:50:46 +0100
Subject: [Ferret-talk] Multiple Models w/ acts_as_ferret
In-Reply-To:
References:
Message-ID: <3810d363d9653e8ed930f83ff448bb26@ruby-forum.com>

David Balmain wrote:
> One other thing. When you say the blog index is being updated when you
> modify comments, does that mean only the blog index ever gets updated?
> Does the comments index remain empty?

I used the second block of code on the wiki and it now works with 1
index... however, I can't seem to get any results back. Editing any of
the models forces the index to update, so I know something is going on.
Hopefully you guys can answer a few questions:

1. Are all the "fields" of a model indexed? When I used that
acts_as_ferret code from the svn repository, I had to specify the
fields. And despite the problems, I was also getting results back.

2. Is there a way to limit the results that you get back? (IE, limit to
10, or pass a limit/offset for paging).

3. Is there a way I can "dump" the information Ferret has indexed so
that I can see if the correct data is there? I'd like to figure out why
no results are coming back from:

def test_ferret
  @results = Comment.find_by_contents( params['query'] )
  render_text @results.inspect
end

Thanks again.
Adam

-- 
Posted via http://www.ruby-forum.com/.

From atomgiant at gmail.com Wed Mar 1 07:12:41 2006
From: atomgiant at gmail.com (Tom Davies)
Date: Wed, 1 Mar 2006 07:12:41 -0500
Subject: [Ferret-talk] Most Popular Searches
In-Reply-To:
References:
Message-ID:

Wow, thanks for taking the time to put that together Dave. That looks
very promising. I appreciate it. If I have a chance to do performance
tests, I will report back to this list.

Tom

On 2/28/06, David Balmain wrote:
> On 2/28/06, Tom Davies wrote:
> > Hi,
> >
> > I have an index where each document contains an untokenized 'url'
> > field. I would like to query the index for the most popular urls. In
> > SQL I would do this via a Group By clause. Is there anything in
> > Ferret that will do something similar?
> >
> > I found this discussion that proposed a solution involving TermEnums:
> >
> > http://www.gossamer-threads.com/lists/lucene/java-user/32272#32272
> >
> > But I noticed the IndexReader.terms and IndexReader.term_docs are not
> > implemented. Is that solution the way to go? Would an index-only
> > solution perform a lot faster than a pure database solution using a
> > group by clause?
>
> Hi Tom,
>
> Those methods are implemented. Just not in IndexReader. They're
> implemented in SegmentReader and MultiReader. IndexReader is an
> abstract class. Whenever you call IndexReader#open you'll get either a
> SegmentReader or a MultiReader.
>
> Anyway, if you want to run searches on all documents with the url
> field you could use a filter like this;
>
> module Ferret::Search
>   # A Filter that restricts search results to only those documents with a
>   # certain field called @group_name.
>   class GroupFilter < Filter
>     include Ferret::Index
>
>     def initialize(group_name)
>       @group_name = group_name
>     end
>
>     # Returns a BitVector with true for documents which should be permitted in
>     # search results, and false for those that should not.
>     def bits(reader)
>       bits = Ferret::Utils::BitVector.new()
>       term_enum = reader.terms_from(Term.new(@group_name, ""))
>
>       begin
>         if (term_enum.term() == nil)
>           return bits
>         end
>         term_docs = reader.term_docs
>         begin
>           begin
>             term = term_enum.term()
>             break if (term.nil? or term.field != @group_name)
>
>             term_docs.seek(term_enum)
>             while term_docs.next?
>               bits.set(term_docs.doc)
>             end
>           end while term_enum.next?
>         ensure
>           term_docs.close()
>         end
>       ensure
>         term_enum.close()
>       end
>
>       return bits
>     end
>   end
> end
>
> Or perhaps you only want the 10 most popular urls and you'd like to
> create the filter like this;
>
> filter = GroupFilter.new("url", ["url1", "url2", ..., "url10"])
>
> This filter might look something like this;
>
> module Ferret::Search
>   # A Filter that restricts search results to only those documents with a
>   # certain field called @field_name with values in the @values array.
>   class GroupFilter < Filter
>     include Ferret::Index
>
>     def initialize(field_name, values)
>       @field_name = field_name
>       @values = values
>     end
>
>     # Returns a BitVector with true for documents which should be permitted in
>     # search results, and false for those that should not.
>     def bits(reader)
>       bits = Ferret::Utils::BitVector.new()
>       term_enum = reader.terms_from(Term.new(@field_name, ""))
>
>       begin
>         if (term_enum.term() == nil)
>           return bits
>         end
>         term_docs = reader.term_docs
>         begin
>           begin
>             term = term_enum.term()
>             break if (term.nil? or term.field != @field_name)
>
>             if @values.index(term.text)
>               term_docs.seek(term_enum)
>               while term_docs.next?
>                 bits.set(term_docs.doc)
>               end
>             end
>           end while term_enum.next?
>         ensure
>           term_docs.close()
>         end
>       ensure
>         term_enum.close()
>       end
>
>       return bits
>     end
>   end
> end
>
> WARNING:: I haven't tested any of this code. Also, I don't know how it
> would perform compared to using a group_by on the database itself
> although I'd be happy to hear about any performance tests you might
> do. I hope this helps.
>
> Cheers,
> Dave
>
> >
> > Any feedback is appreciated.
> >
> > Tom
> >
> > _______________________________________________
> > Ferret-talk mailing list
> > Ferret-talk at rubyforge.org
> > http://rubyforge.org/mailman/listinfo/ferret-talk

From atomgiant at gmail.com Wed Mar 1 08:46:37 2006
From: atomgiant at gmail.com (Tom Davies)
Date: Wed, 1 Mar 2006 08:46:37 -0500
Subject: [Ferret-talk] Updating Index Is Very Slow
Message-ID:

Hi,

I am experiencing very poor performance when updating my index. For
example, to update the index for 10 documents, it is taking 3 to 4
seconds. My index is currently very small... with probably less than
100 docs in it.

I have created my index as follows:

GIFT_INDEX = Index::Index.new(:path => "#{index_dir}/gift", :key =>
'id', :auto_flush => true)

and I have an after_save filter in my model as follows:

def update_index
  INDEX << self.to_doc
end

Is there anything I can do to improve this performance?

Thanks,
Tom

From dbalmain.ml at gmail.com Wed Mar 1 08:58:04 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Wed, 1 Mar 2006 22:58:04 +0900
Subject: [Ferret-talk] Updating Index Is Very Slow
In-Reply-To:
References:
Message-ID:

On 3/1/06, Tom Davies wrote:
> Hi,
>
> I am experiencing very poor performance when updating my index. For
> example, to update the index for 10 documents, it is taking 3 to 4
> seconds. My index is currently very small... with probably less than
> 100 docs in it.
>
> I have created my index as follows:
>
> GIFT_INDEX = Index::Index.new(:path => "#{index_dir}/gift", :key =>
> 'id', :auto_flush => true)
>
> and I have an after_save filter in my model as follows:
>
> def update_index
>   INDEX << self.to_doc
> end
>
> Is there anything I can do to improve this performance?

Hi Tom,

That sounds very slow. How large are the documents?

The first thing you can do is turn off auto_flush.
That should substantially speed things up. If you only have one thread
you won't need auto_flush. If you have more than one thread then I'd
suggest having a dedicated indexing thread (and again you won't need
auto_flush). If things are still too slow after that, I'm nearly
finished with the C rewrite of ferret. A linux version should be out
some time next week. This will be at least 10 times as fast.

Hope that helps,

Cheers,
Dave

From garypelliott at gmail.com Wed Mar 1 09:18:58 2006
From: garypelliott at gmail.com (Gary Elliott)
Date: Wed, 1 Mar 2006 09:18:58 -0500
Subject: [Ferret-talk] Updating Index Is Very Slow
In-Reply-To:
References:
Message-ID:

That's great, David. Is the C version of ferret similar enough to the
current version so that someone could develop on windows or osx with
the current version of ferret and deploy on linux with cFerret?

From kraemer at webit.de Wed Mar 1 09:23:09 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Wed, 1 Mar 2006 15:23:09 +0100
Subject: [Ferret-talk] Multiple Models w/ acts_as_ferret
In-Reply-To: <3810d363d9653e8ed930f83ff448bb26@ruby-forum.com>
References: <3810d363d9653e8ed930f83ff448bb26@ruby-forum.com>
Message-ID: <20060301142309.GA19880@cordoba.webit.de>

On Wed, Mar 01, 2006 at 06:50:46AM +0100, Adam Roth wrote:
>
> 3. Is there a way I can "dump" the information Ferret has indexed so
> that I can see if the correct data is there? I'd like to figure out why
> no results are coming back from:
>
> def test_ferret
>   @results = Comment.find_by_contents( params['query'] )
>   render_text @results.inspect
> end

You can use Luke (http://www.getopt.org/luke/) to inspect an existing
ferret index.

I'll try to reproduce and fix the problems you had concerning multiple
indexes/classes with the acts_as_ferret version from the svn repository.

Jens

-- 
webit!
Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From dbalmain.ml at gmail.com Wed Mar 1 09:52:36 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Wed, 1 Mar 2006 23:52:36 +0900
Subject: [Ferret-talk] Updating Index Is Very Slow
In-Reply-To:
References:
Message-ID:

On 3/1/06, Gary Elliott wrote:
> That's great, David. Is the C version of ferret similar enough to the
> current version so that someone could develop on windows or osx with
> the current version of ferret and deploy on linux with cFerret?
>

It's similar enough that it won't take long to port but it won't be
exactly the same. The pure ruby version is being phased out, but I do
plan to have a windows version of cferret. Hopefully a keen windows
developer will lend a hand. Unfortunately I don't have the microsoft C
compiler.

From atomgiant at gmail.com Wed Mar 1 09:57:23 2006
From: atomgiant at gmail.com (Tom Davies)
Date: Wed, 1 Mar 2006 09:57:23 -0500
Subject: [Ferret-talk] Updating Index Is Very Slow
In-Reply-To:
References:
Message-ID:

Hi Dave,

That is good news about the Linux version. I tried turning off
autoflush, but that did not appear to make a very noticeable
difference.

I just added the following benchmark code around the index update:

def update_index
  Gift.benchmark("updating index") do
    INDEX << self.to_doc
  end
end

I am attaching a trace of running this against 28 docs, which is the
entire index. The documents are not large, as you will see from the
trace. Each index update is averaging around .4 seconds. Does that
seem acceptable? I am hoping this turns out to be some sort of
configuration error.

Tom

On 3/1/06, Gary Elliott wrote:
> That's great, David. Is the C version of ferret similar enough to the
> current version so that someone could develop on windows or osx with
> the current version of ferret and deploy on linux with cFerret?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ferret.log
Type: application/octet-stream
Size: 23845 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/ferret-talk/attachments/20060301/41c1b379/ferret-0001.obj

From pannalal at usa.net Wed Mar 1 10:19:04 2006
From: pannalal at usa.net (P L Patodia)
Date: Wed, 1 Mar 2006 16:19:04 +0100
Subject: [Ferret-talk] Sorting the Result
Message-ID:

The document describes
search(query, options)
sort: An array of SortFields describing how to sort the results.

I have created an index with two fields: 'file' and 'content'

When I give the SortField name as 'file' while searching, it results in
an error.

The exact command given by me:
index.search_each("sleepless AND dreams", :num_docs => 100,
                  :sort => 'file') do |doc, score|
  ....
end

My question is how do I give the field name for sorting.

Thanks and Regards,

P L Patodia

-- 
Posted via http://www.ruby-forum.com/.

From atomgiant at gmail.com Wed Mar 1 10:22:26 2006
From: atomgiant at gmail.com (Tom Davies)
Date: Wed, 1 Mar 2006 10:22:26 -0500
Subject: [Ferret-talk] Sorting the Result
In-Reply-To:
References:
Message-ID:

I believe your sort field needs to be a Ferret::Search::SortField

Here is how I am sorting (with two fields). You might be able to do
it without the array and just passing in a Ferret::Search::SortField
to :sort. I haven't tried that though.

sort_fields = []
sort_fields << Ferret::Search::SortField.new('created_at')
sort_fields << Ferret::Search::SortField.new('url')
INDEX.search_each(query, {:sort => sort_fields}) do |doc, score|

Tom

From lmarlow at yahoo.com Wed Mar 1 12:18:10 2006
From: lmarlow at yahoo.com (Lee Marlow)
Date: Wed, 1 Mar 2006 10:18:10 -0700
Subject: [Ferret-talk] Multiple Models w/ acts_as_ferret
In-Reply-To: <20060301142309.GA19880@cordoba.webit.de>
References: <3810d363d9653e8ed930f83ff448bb26@ruby-forum.com> <20060301142309.GA19880@cordoba.webit.de>
Message-ID: <7968d7490603010918g179433c9i72b15bf81b2a8d5c@mail.gmail.com>

We have also been using our own version of acts_as_ferret, put
together from the wiki and a version that was on the rails mailing
list. We added a simple rake task for rebuilding the index and
pagination. We kept the one index for all models as I expect Ferret
to be fast enough to handle it and maybe we'll want to query across
models one day.

I tried to check out the code from
https://svn.jkraemer.net/svn/projects/ferret-demo/trunk/vendor/plugins/acts_as_ferret/
but it is requiring a username and password, even though I can access
it fine through the web. I was hoping I could send you some diffs for
some of the things we've done. Is there a guest account I could use
to check out?

Thanks
-Lee

From kraemer at webit.de Wed Mar 1 13:07:02 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Wed, 1 Mar 2006 19:07:02 +0100
Subject: [Ferret-talk] Multiple Models w/ acts_as_ferret
In-Reply-To: <7968d7490603010918g179433c9i72b15bf81b2a8d5c@mail.gmail.com>
References: <3810d363d9653e8ed930f83ff448bb26@ruby-forum.com> <20060301142309.GA19880@cordoba.webit.de> <7968d7490603010918g179433c9i72b15bf81b2a8d5c@mail.gmail.com>
Message-ID: <20060301180702.GA26099@cordoba.webit.de>

On Wed, Mar 01, 2006 at 10:18:10AM -0700, Lee Marlow wrote:
> We have also been using our own version of acts_as_ferret, put
> together from the wiki and a version that was on the rails mailing
> list. We added a simple rake task for rebuilding the index and
> pagination. We kept the one index for all models as I expect Ferret
> to be fast enough to handle it and maybe we'll want to query across
> models one day.
>
> I tried to check out the code from
> https://svn.jkraemer.net/svn/projects/ferret-demo/trunk/vendor/plugins/acts_as_ferret/
> but it is requiring a username and password, even though I can access
> it fine through the web. I was hoping I could send you some diffs for
> some of the things we've done. Is there a guest account I could use
> to check out?

Oops, my mistake, read-only access via svn should work now without any
authentication. Patches are welcome of course :-)

btw: I just fixed the issue reported in this thread.

Jens

-- 
webit!
Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From kraemer at webit.de Wed Mar 1 17:13:35 2006
From: kraemer at webit.de (Jens Kraemer)
Date: Wed, 1 Mar 2006 23:13:35 +0100
Subject: [Ferret-talk] Multiple Models w/ acts_as_ferret
In-Reply-To: <3810d363d9653e8ed930f83ff448bb26@ruby-forum.com>
References: <3810d363d9653e8ed930f83ff448bb26@ruby-forum.com>
Message-ID: <20060301221335.GB26099@cordoba.webit.de>

On Wed, Mar 01, 2006 at 06:50:46AM +0100, Adam Roth wrote:
[..]
> 1. Are all the "fields" of a model indexed? When I used that
> acts_as_ferret code from the svn repository, I had to specify the
> fields. And despite the problems, I was also getting results back.

You have to specify the fields when using the second code snippet from
the wiki, too. If no fields are specified, only the id and the class
name will be indexed.

> 2. Is there a way to limit the results that you get back? (IE, limit to
> 10, or pass a limit/offset for paging).

I just added this to the svn version of the plugin. You can have a look
at test/unit/content_test.rb from the demo project
(https://svn.jkraemer.net/svn/projects/ferret-demo/trunk/test/unit/content_test.rb)
for an example.

As the other issue concerning multiple indexes is now fixed, too, you
could give the svn version another try.

Regards,
Jens

-- 
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

From dbalmain.ml at gmail.com Wed Mar 1 20:52:56 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Thu, 2 Mar 2006 10:52:56 +0900
Subject: [Ferret-talk] Updating Index Is Very Slow
In-Reply-To:
References:
Message-ID:

Hi Tom,

It does seem rather slow. The unit tests are adding a few hundred
documents and they're finishing in under ten seconds on my machine.
I'd suggest you don't worry about it too much though as the new
version of Ferret will solve any performance problems you're having.

Cheers,
Dave

From dbalmain.ml at gmail.com Wed Mar 1 20:59:09 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Thu, 2 Mar 2006 10:59:09 +0900
Subject: [Ferret-talk] Sorting the Result
In-Reply-To:
References:
Message-ID:

On 3/2/06, Tom Davies wrote:
> I believe your sort field needs to be a Ferret::Search::SortField
>
> Here is how I am sorting (with two fields). You might be able to do
> it without the array and just passing in a Ferret::Search::SortField
> to :sort. I haven't tried that though.
>
> sort_fields = []
> sort_fields << Ferret::Search::SortField.new('created_at')
> sort_fields << Ferret::Search::SortField.new('url')
> INDEX.search_each(query, {:sort => sort_fields}) do |doc, score|
>
> Tom

You may also want to specify the sort type.
For example;

include Ferret::Search
sort_fields = []
sort_fields << SortField.new('created_at',
                             :sort_type => SortField::SortType::INTEGER)
sort_fields << SortField.new('url',
                             :sort_type => SortField::SortType::STRING)
INDEX.search_each(query, {:sort => sort_fields}) do |doc, score|

Although, now that I think about it, it would be nice if you could
just pass the search method a string or array of strings. Expect to
see this functionality in the future.

Cheers,
Dave

From adamjroth at gmail.com Thu Mar 2 12:49:25 2006
From: adamjroth at gmail.com (aroth)
Date: Thu, 2 Mar 2006 18:49:25 +0100
Subject: [Ferret-talk] Multiple Models w/ acts_as_ferret
In-Reply-To: <20060301221335.GB26099@cordoba.webit.de>
References: <3810d363d9653e8ed930f83ff448bb26@ruby-forum.com> <20060301221335.GB26099@cordoba.webit.de>
Message-ID: <0e87693024511ad94389be18eec826f6@ruby-forum.com>

Jens,

Your fix works great. Thank you. Another quick question. I notice the
"rebuild_index.rb" in the plugin dir. If I try to run this directly, it
doesn't want to run (it complains about not being able to require
'ferret'). Anyway, I'd like to know how I can use Ferret or
acts_as_ferret to index all of my existing content based on the
fields/models I have declared as 'acts_as_ferret'. Right now, they are
added to the index after any CRUD operation -- is there a way to force
this outside of the scope of the web?

Thanks
Adam

-- 
Posted via http://www.ruby-forum.com/.

From lmarlow at yahoo.com Thu Mar 2 18:23:12 2006
From: lmarlow at yahoo.com (Lee Marlow)
Date: Thu, 2 Mar 2006 16:23:12 -0700
Subject: [Ferret-talk] Multiple Models w/ acts_as_ferret
In-Reply-To: <0e87693024511ad94389be18eec826f6@ruby-forum.com>
References: <3810d363d9653e8ed930f83ff448bb26@ruby-forum.com> <20060301221335.GB26099@cordoba.webit.de> <0e87693024511ad94389be18eec826f6@ruby-forum.com>
Message-ID: <7968d7490603021523s38e2eea0y9a088c0d896ec21d@mail.gmail.com>

Here is our rake task which uses a slightly different version of
acts_as_ferret. It will try to load up all models in app/models and
call ferret_update on each instance.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: indexer.rake Type: application/octet-stream Size: 1476 bytes Desc: not available Url : http://rubyforge.org/pipermail/ferret-talk/attachments/20060302/7b274620/indexer-0001.obj From f at andreas-s.net Fri Mar 3 06:43:48 2006 From: f at andreas-s.net (Andreas S.) Date: Fri, 3 Mar 2006 12:43:48 +0100 Subject: [Ferret-talk] Updating Index Is Very Slow In-Reply-To: References: Message-ID: David Balmain wrote: > I'd suggest you don't worry about it too much though as the new > version of Ferret will solve any performance problems you're having. I'm looking forward to trying the new version, as I have quite a lot of problems with the current version on ruby-forum.com. -- Posted via http://www.ruby-forum.com/. From garypelliott at gmail.com Fri Mar 3 08:32:56 2006 From: garypelliott at gmail.com (Gary Elliott) Date: Fri, 3 Mar 2006 08:32:56 -0500 Subject: [Ferret-talk] Updating Index Is Very Slow In-Reply-To: References: Message-ID: Andreas, do you just mean performance problems or other bugs as well? On 3/3/06, Andreas S. wrote: > David Balmain wrote: > > I'd suggest you don't worry about it too much though as the new > > version of Ferret will solve any performance problems you're having. > > I'm looking forward to trying the new version, as I have quite a lot of > problems with the current version on ruby-forum.com. > > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > From f at andreas-s.net Fri Mar 3 09:43:50 2006 From: f at andreas-s.net (Andreas S.) Date: Fri, 3 Mar 2006 15:43:50 +0100 Subject: [Ferret-talk] Updating Index Is Very Slow In-Reply-To: References: Message-ID: Gary Elliott wrote: > Andreas, do you just mean performance problems or other bugs as well? Other bugs as well, occasional crashes, stale lockfiles, exploding index file size (see bugtracker on Ferret website). 
-- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Fri Mar 3 12:54:15 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Sat, 4 Mar 2006 02:54:15 +0900 Subject: [Ferret-talk] Updating Index Is Very Slow In-Reply-To: References: Message-ID: On 3/3/06, Andreas S. wrote: > David Balmain wrote: > > I'd suggest you don't worry about it too much though as the new > > version of Ferret will solve any performance problems you're having. > > I'm looking forward to trying the new version, as I have quite a lot of > problems with the current version on ruby-forum.com. Once the new version is out I'll be able to be a lot more responsive when it comes to fixing bugs and addressing other problems. I should warn though that the new version is an alpha release and it's going to take some work before it's even as stable as the current ruby version. It'll get there though. > > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > From atomgiant at gmail.com Sun Mar 5 15:47:38 2006 From: atomgiant at gmail.com (Tom Davies) Date: Sun, 5 Mar 2006 15:47:38 -0500 Subject: [Ferret-talk] How to make a single Writer Message-ID: I am using lighttpd with two procs and occasionally the .lock file will not be properly removed by Ferret at which point my application will end up throwing nothing but 500 errors. Therefore, I have decided to go with a single writer thread... which is probably a better long term solution anyways. I would like some feedback on the best way to structure this. My app is hosted on TextDrive, so drb (distributed ruby) is not allowed. The only other solution I can come up with is to write all pending updates to a shared file. This could involve either: 1) serialize each object using something like YAML to a file and then deserializing them by the writer during updates. 
2) just write the ids that need to be updated in the index and then read each object fresh from the database using its id when updating the index. I am leaning towards solution 2, as it is easier to implement, should be faster to write and read from the intermediate file and will be easier to remove duplicate index updates. The only drawback to 2 is it will require one additional database read for every index update... but this could be minimized by batch reading with a where id in (...). Also, both 1 and 2 will require a lockfile for managing concurrent access to the intermediate file. I am thinking of just using this lockfile library: http://raa.ruby-lang.org/project/lockfile/ Does anyone have any experience with this? Thanks, Tom From alex at blackkettle.org Sun Mar 5 16:07:32 2006 From: alex at blackkettle.org (Alex Young) Date: Sun, 05 Mar 2006 21:07:32 +0000 Subject: [Ferret-talk] How to make a single Writer In-Reply-To: References: Message-ID: <440B5314.8040901@blackkettle.org> Tom Davies wrote: > 2) just write the ids that need to be updated in the index and then > read each object fresh from the database using its id when updating > the index. > > I am leaning towards solution 2, as it is easier to implement, should > be faster to write and read from the intermediate file and will be > easier to remove duplicate index updates. The only drawback to 2 is > it will require one additional database read for every index update... > but this could be minimized by batch reading with a where id in (...). Why not add a needs_indexing column to your object table? That way, not only do you not have to care about concurrent intermediate file access (because the DB takes care of that for you), but you can also do all your pending database reads at once, if that's appropriate. If you've got a single writer thread, it can write the flag back either on all once it's done, or on each as it goes. It seems much simpler all round to me... 
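A minimal sketch of the flag approach in plain Ruby, assuming a boolean needs_indexing value on each record. Record, FakeIndex and index_pending are hypothetical names standing in for the model table, the search index and the single writer; nothing here is Ferret or ActiveRecord API.

```ruby
# Hypothetical stand-in for a database row. In a Rails app this would be a
# model whose table has a boolean needs_indexing column.
Record = Struct.new(:id, :body, :needs_indexing)

# Hypothetical stand-in for the search index: maps record id => indexed text.
class FakeIndex
  attr_reader :docs

  def initialize
    @docs = {}
  end

  def update(record)
    @docs[record.id] = record.body
  end
end

# Single-writer pass: index every flagged record and clear each flag as it
# goes. Returns the ids that were indexed in this pass.
def index_pending(records, index)
  pending = records.select(&:needs_indexing)
  pending.each do |record|
    index.update(record)
    record.needs_indexing = false
  end
  pending.map(&:id)
end

records = [
  Record.new(1, "first post", true),
  Record.new(2, "second post", false),
  Record.new(3, "third post", true)
]
index = FakeIndex.new
indexed_ids = index_pending(records, index)
```

Clearing the flag per record means rows already indexed are not redone if the writer stops partway through; clearing all flags in one step at the end costs fewer writes but repeats the whole batch after a failure.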
Of course, if you don't want to change your object table schema, then you could create a separate table specifically for this. -- Alex From atomgiant at gmail.com Sun Mar 5 16:40:34 2006 From: atomgiant at gmail.com (Tom Davies) Date: Sun, 5 Mar 2006 16:40:34 -0500 Subject: [Ferret-talk] How to make a single Writer In-Reply-To: <440B5314.8040901@blackkettle.org> References: <440B5314.8040901@blackkettle.org> Message-ID: That is an excellent idea Alex. Not sure why I didn't think of that :) Basically, your concept is like adding a dirty flag to my table. I like this approach much better. However, for my particular case, I will modify it slightly to just use the existing updated_at columns that I have for each of my models that need indexing. Then my index writer won't have to lock the model database tables to reset the dirty flag. It will just keep track of the last time it updated the index. Thanks for finding a much simpler solution. That .lock file way was making me nervous :) Tom On 3/5/06, Alex Young wrote: > Tom Davies wrote: > > 2) just write the ids that need to be updated in the index and then > > read each object fresh from the database using its id when updating > > the index. > > > > I am leaning towards solution 2, as it is easier to implement, should > > be faster to write and read from the intermediate file and will be > > easier to remove duplicate index updates. The only drawback to 2 is > > it will require one additional database read for every index update... > > but this could be minimized by batch reading with a where id in (...). > Why not add a needs_indexing column to your object table? That way, not > only do you not have to care about concurrent intermediate file access > (because the DB takes care of that for you), but you can also do all > your pending database reads at once, if that's appropriate. If you've > got a single writer thread, it can write the flag back either on all > once it's done, or on each as it goes. 
It seems much simpler all round > to me... Of course, if you don't want to change your object table > schema, then you could create a separate table specifically for this. > > -- > Alex > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > From alex at blackkettle.org Sun Mar 5 17:49:48 2006 From: alex at blackkettle.org (Alex Young) Date: Sun, 05 Mar 2006 22:49:48 +0000 Subject: [Ferret-talk] How to make a single Writer In-Reply-To: References: <440B5314.8040901@blackkettle.org> Message-ID: <440B6B0C.3060107@blackkettle.org> Tom Davies wrote: > Basically, your concept is like adding a dirty flag to my table. Pretty much - it's dirty within a specific context. > I like this approach much better. However, for my particular case, I > will modify it slightly to just use the existing updated_at columns > that I have for each of my models that need indexing. Then my index > writer won't have to lock the model database tables to reset the dirty > flag. It will just keep track of the last time it updated the index. Sounds good. Just remember to record the *start* of the write, not the end - otherwise you'll get records being marked as updated while your write's happening, and they'll get missed by the next update. > Thanks for finding a much simpler solution. That .lock file way was > making me nervous :) No worries :-) -- Alex From carl at youngbloods.org Mon Mar 6 11:30:15 2006 From: carl at youngbloods.org (Carl Youngblood) Date: Mon, 6 Mar 2006 08:30:15 -0800 Subject: [Ferret-talk] C version of ferret? Message-ID: Hey Dave, I understand you've been very busy lately, but I was really excited when you said before Christmas sometime that you were soon to release a fast C version of ferret. Is that still in the works? Do you have even a rough ETA? I have a rails site that would greatly benefit from it. 
Thanks, Carl -------------- next part -------------- An HTML attachment was scrubbed... URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20060306/b79c9b81/attachment.htm From dbalmain.ml at gmail.com Mon Mar 6 20:21:54 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Tue, 7 Mar 2006 10:21:54 +0900 Subject: [Ferret-talk] C version of ferret? In-Reply-To: References: Message-ID: Hey Carl, Yes, it is still in the works. It's taking a little longer than I expected. I should have a very rough release out this week. Probably Friday. How long it will be before it becomes stable will depend on the community. I need lots of people testing. Cheers, Dave On 3/7/06, Carl Youngblood wrote: > Hey Dave, I understand you've been very busy lately, but I was really > excited when you said before Christmas sometime that you were soon to > release a fast C version of ferret. Is that still in the works? Do you > have even a rough ETA? I have a rails site that would greatly benefit from > it. 
> > Thanks, > Carl > > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > > From sconover at gmail.com Wed Mar 8 11:37:19 2006 From: sconover at gmail.com (Steve Conover) Date: Wed, 8 Mar 2006 08:37:19 -0800 Subject: [Ferret-talk] indexing a document object fails Message-ID: Hi, I'm trying out the example (more or less) straight from the tutorial: doc = Document.new doc << Field.new("id", "a", Field::Store::NO, Field::Index::UNTOKENIZED) doc << Field.new("title", "b", Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("data", "c", Field::Store::YES, Field::Index::TOKENIZED) doc << Field.new("image", "d", Field::Store::YES, Field::Index::NO) index << doc And I get: Exception: Unknown document type Ferret::Document::Document C:/dev/workspace/fred/config/../vendor/ferret/index/index.rb:259:in `<<' C:/dev/workspace/fred/config/../vendor/ferret/index/index.rb:238:in `synchronize' C:/dev/workspace/fred/config/../vendor/ferret/index/index.rb:238:in `<<' C:\dev\workspace\fred/test/unit/user_test.rb:56:in `test_index_document_sanity_check' Which appears to be caused because elsif doc.is_a?(Document) is expecting a Ferret::Document rather than Ferret::Document::Document. When I change this line to elsif doc.is_a?(Document::Document) I get past the indexing part, and am able to retrieve the document...i.e. index.search_each("*") do |score_doc, score| p index.doc(score_doc) end which results in #[#], "image"=>[#], "data"=>[#]}, @boost=1.0> Also, what's the significance of score_doc? It appears to be just the doc id. Am I missing something? Thanks, Steve From g.melhorn at web.de Thu Mar 9 09:13:01 2006 From: g.melhorn at web.de (Gregor Melhorn) Date: Thu, 9 Mar 2006 15:13:01 +0100 Subject: [Ferret-talk] Missing fields in search result Message-ID: Hello ferret users, I have a problem with ferret dropping stored fields in the index. 
Not all fields I want to store get stored, so they can be searched, but can't be retrieved in a search. Index creation: INDEX = Index::Index.new(:path => '/home/gregor/wisa/index', :analyzer => Analysis::WhiteSpaceAnalyzer.new) SR = Index::IndexSearcher(:path => '/home/gregor/wisa/index', :analyzer => Analysis::WhiteSpaceAnalyzer.new) Storing: # initial creation of a lucene index def create_index # our central INDEX index = FerretConfig::INDEX # get all Companies, iterate over and index them companies = Company.find(:all) for company in companies doc = Document.new doc << Field.new("id", company.id, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("country", company.company_group.address.country.name, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("name", company.name, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("zip", company.address.zip, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("city", company.address.city, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("street", company.address.street, Field::Store::YES, Field::Index::TOKENIZED) doc << Field.new("sectorid", company.sector_id, Field::Store::NO, Field::Index::UNTOKENIZED) doc << Field.new("parentsectorid", (company.sector.parent_id || ""), Field::Store::NO, Field::Index::UNTOKENIZED) doc << Field.new("sector", company.sector.name, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("parentsector", company.sector.breadcrumb.first.name, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("districtid", company.district_id, Field::Store::NO, Field::Index::UNTOKENIZED) doc << Field.new("district", company.district.name, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("sales", sprintf('%010.4f', company.sales), Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("employees", sprintf('%010d', company.employees), Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("codesector", 
company.code_sector.code, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("codesectorparent", company.code_sector.parent, Field::Store::NO, Field::Index::UNTOKENIZED) doc << Field.new("codesectorname", company.code_sector.name, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("cooperation", company.cooperation.to_i.to_s, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("lastname", company.contact.lastname, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("firstname", company.contact.firstname, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("position", company.contact.position, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("title", company.contact.title, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("salutation", company.contact.salutation, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("mail", company.contact.mail, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("fax", company.fax, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("phone", company.phone, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("url", company.url, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("comments", company.comments, Field::Store::YES, Field::Index::TOKENIZED) doc << Field.new("products", company.products, Field::Store::YES, Field::Index::TOKENIZED) doc << Field.new("state", company.state.name, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("investorname", company.company_group.name, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("investorcity", company.company_group.address.city, Field::Store::YES, Field::Index::UNTOKENIZED) doc << Field.new("investorstreet", company.company_group.address.street, Field::Store::YES, Field::Index::TOKENIZED) doc << Field.new("investorzip", company.company_group.address.zip, Field::Store::YES, Field::Index::UNTOKENIZED) index << doc logger.info 
"URL: " + doc["url"] # gets url, so doc is ok end # don't close the index, but be sure to write it to the fs index.optimize() logger.info "URL 1: " + index[0]["url"] # passed, prints out correctly logger.info "URL 2: " + index[1]["url"] # passed, prints out correctly redirect_to(:action => 'new_index') end Searching: if (@conditions != conditions) @records = Array.new sr = FerretConfig::SR sr.reader = Index::IndexReader.open(Store::FSDirectory.get_directory("/home/gregor/wisa/index")) qp = Ferret::QueryParser.new("id", { :analyzer => Ferret::Analysis::WhiteSpaceAnalyzer.new(), :wild_lower => false}) @records = sr.search(qp.parse(conditions)) end Search results are correct, but many fields are missing and can't be accessed. The only fields in the results are city, name, zip, country, mail, firstname, lastname, id, phone, fax, street, sector, parentsector All others are missing.... Hope someone can help me soon, this is getting me crazy.. :-/ Best regards Gregor -- Posted via http://www.ruby-forum.com/. From g.melhorn at web.de Thu Mar 9 12:02:56 2006 From: g.melhorn at web.de (Gregor Melhorn) Date: Thu, 9 Mar 2006 18:02:56 +0100 Subject: [Ferret-talk] Missing fields in search result In-Reply-To: References: Message-ID: <606933917ff355fe16c4a83fab26421d@ruby-forum.com> resolved: restarting webrick was all I needed to do (apart from some other stupid errors in the code..) *bangingheadontable* -- Posted via http://www.ruby-forum.com/. From testing at testing.testing Thu Mar 9 12:25:02 2006 From: testing at testing.testing (Someone) Date: Thu, 9 Mar 2006 18:25:02 +0100 Subject: [Ferret-talk] Test Message-ID: <60a40f42da95818249be4f8b5f291af6@ruby-forum.com> Test Html code as there is no
    Preview
button -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Thu Mar 9 21:00:46 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Fri, 10 Mar 2006 11:00:46 +0900 Subject: [Ferret-talk] indexing a document object fails In-Reply-To: References: Message-ID: On 3/9/06, Steve Conover wrote: > Hi, > > I'm trying out the example (more or less) straight from the tutorial: > > doc = Document.new > doc << Field.new("id", "a", Field::Store::NO, > Field::Index::UNTOKENIZED) > doc << Field.new("title", "b", Field::Store::YES, Field::Index::UNTOKENIZED) > doc << Field.new("data", "c", Field::Store::YES, Field::Index::TOKENIZED) > doc << Field.new("image", "d", Field::Store::YES, Field::Index::NO) > index << doc > > And I get: > > Exception: Unknown document type Ferret::Document::Document > C:/dev/workspace/fred/config/../vendor/ferret/index/index.rb:259:in `<<' > C:/dev/workspace/fred/config/../vendor/ferret/index/index.rb:238:in > `synchronize' > C:/dev/workspace/fred/config/../vendor/ferret/index/index.rb:238:in `<<' > C:\dev\workspace\fred/test/unit/user_test.rb:56:in > `test_index_document_sanity_check' > > Which appears to be caused because > > elsif doc.is_a?(Document) > > is expecting a Ferret::Document rather than > Ferret::Document::Document. When I change this line to > > elsif doc.is_a?(Document::Document) Funny, I can't duplicate this but I'll change it anyway as it won't hurt. > Also, what's the significance of score_doc? It appears to be just the > doc id. Am I missing something? No, you're not missing anything. It's just the document id. Just to be clear, this is the internal id used by ferret to access the document, not any id that you add to the document yourself. 
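To make the distinction concrete, here is a toy model in plain Ruby. It is not Ferret and does not use Ferret's API: ToyIndex, add, doc and search_each are hypothetical names, and the assumption that the internal number is just the position in an array is purely for illustration.

```ruby
# Toy index, NOT Ferret: documents live in an array and the internal document
# number is simply the array position.
class ToyIndex
  def initialize
    @docs = []
  end

  # Adds a document (a Hash of field name => value) and returns the internal
  # number assigned to it.
  def add(doc)
    @docs << doc
    @docs.size - 1
  end

  # Fetches a stored document by its internal number.
  def doc(doc_num)
    @docs[doc_num]
  end

  # Yields the internal number of each document whose field equals value.
  def search_each(field, value)
    @docs.each_with_index do |doc, doc_num|
      yield doc_num if doc[field] == value
    end
  end
end

index = ToyIndex.new
index.add("id" => "a", "title" => "b")
index.add("id" => "z", "title" => "b")

# Collect [internal number, user-supplied "id" field] for every hit.
hits = []
index.search_each("title", "b") do |doc_num|
  hits << [doc_num, index.doc(doc_num)["id"]]
end
```

The number handed to the block is only a handle for fetching the document; the "id" field you stored yourself is ordinary document data and has to be read from the fetched document.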
> > Thanks, > Steve > > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > From dbalmain.ml at gmail.com Mon Mar 13 22:57:19 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Tue, 14 Mar 2006 12:57:19 +0900 Subject: [Ferret-talk] cFerret nearing completion Message-ID: Hey folks, Some good news. I've finished cFerret and it's ruby bindings to the point where I can run all of the unit tests. I still have to work out how I'm going to package and release it but it shouldn't be long now. If you can't wait you might like to try it from the subversion repository. It'll probably only work on linux at the moment and you might have to mess with the make file a little. As for performance, cFerret seems to be about 10 to 20 times faster so all this work has been worth it. Cheers, Dave From tom.styles at nottscc.gov.uk Wed Mar 15 08:01:56 2006 From: tom.styles at nottscc.gov.uk (Tom Styles) Date: Wed, 15 Mar 2006 14:01:56 +0100 Subject: [Ferret-talk] ActiveRecord::RecordNotFound in search results act_as Ferret Message-ID: <35440623eb6f223f4d3011fc936e2e3b@ruby-forum.com> Hello, I've installed the Ferret gem and also got the act_as_ferret code from the wiki. I've set up my model "Branch" to act as ferret using the code below. acts_as_ferret :options => {:fields => ['name', 'body_text', 'address']} I've also set up a ferret_controller with the code below def find if params[:search_terms] @branch_results = Branch.find_by_contents(params[:search_terms]) render_text @branch_results.inspect end end I've done some updates and the index files seem to be being generated ok. If I go to mywebapp/ferret/find?search_terms=gamston where "gamston" is the name of one of the branches I get --- [] as the result. If I go to mywebapp/ferret/find?search_terms=* I get ActiveRecord::RecordNotFound in Ferret#find Couldn't find Branch with ID=5 There are no Branches with an ID of 5. 
Does anyone have any ideas whats going on with my system. I tried my best to trouble shoot the issue but to no avail. Any help is appreciated. Tom Styles -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Wed Mar 15 08:15:27 2006 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 15 Mar 2006 14:15:27 +0100 Subject: [Ferret-talk] ActiveRecord::RecordNotFound in search results act_as Ferret In-Reply-To: <35440623eb6f223f4d3011fc936e2e3b@ruby-forum.com> References: <35440623eb6f223f4d3011fc936e2e3b@ruby-forum.com> Message-ID: <20060315131527.GI10401@cordoba.webit.de> Hi! On Wed, Mar 15, 2006 at 02:01:56PM +0100, Tom Styles wrote: > Hello, > > I've installed the Ferret gem and also got the act_as_ferret code from > the wiki. > I've set up my model "Branch" to act as ferret using the code below. > > acts_as_ferret :options => {:fields => ['name', 'body_text', 'address']} I'd say that must read acts_as_ferret :fields => ['name', 'body_text', 'address'] What Version of acts_as_ferret are you using ? You should really use the one from the svn repository as described there: http://projects.jkraemer.net/acts_as_ferret/ The plugin repository is svn://projects.jkraemer.net/acts_as_ferret/trunk/plugin/acts_as_ferret Kasper Weibel and me are actively working on the plugin there, the Code in the Wiki is more or less outdated. There also is a small rails project utilizing the plugin, under svn://projects.jkraemer.net/acts_as_ferret/trunk/demo Jens -- webit! 
Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From tom.styles at nottscc.gov.uk Wed Mar 15 08:32:28 2006 From: tom.styles at nottscc.gov.uk (Tom Styles) Date: Wed, 15 Mar 2006 14:32:28 +0100 Subject: [Ferret-talk] ActiveRecord::RecordNotFound in search results act_as Fe In-Reply-To: <20060315131527.GI10401@cordoba.webit.de> References: <35440623eb6f223f4d3011fc936e2e3b@ruby-forum.com> <20060315131527.GI10401@cordoba.webit.de> Message-ID: <872c69cfa65e499b884c4385ece0b163@ruby-forum.com> Jens Kraemer wrote: > You should really use the one from the svn repository as described > there: http://projects.jkraemer.net/acts_as_ferret/ > The plugin repository is > svn://projects.jkraemer.net/acts_as_ferret/trunk/plugin/acts_as_ferret Thanks Jens, I'm having a lot of difficulty connecting to the svn because of our firewall. Are there any other ways in which the files can be downloaded. Thanks Tom -- Posted via http://www.ruby-forum.com/. From tom.styles at nottscc.gov.uk Wed Mar 15 08:37:17 2006 From: tom.styles at nottscc.gov.uk (Tom Styles) Date: Wed, 15 Mar 2006 14:37:17 +0100 Subject: [Ferret-talk] ActiveRecord::RecordNotFound in search results act_as Fe In-Reply-To: <872c69cfa65e499b884c4385ece0b163@ruby-forum.com> References: <35440623eb6f223f4d3011fc936e2e3b@ruby-forum.com> <20060315131527.GI10401@cordoba.webit.de> <872c69cfa65e499b884c4385ece0b163@ruby-forum.com> Message-ID: <3b6fe42bea4d703fa4d19d8edc88c824@ruby-forum.com> Tom Styles wrote: > Jens Kraemer wrote: >> You should really use the one from the svn repository as described >> there: http://projects.jkraemer.net/acts_as_ferret/ >> The plugin repository is >> svn://projects.jkraemer.net/acts_as_ferret/trunk/plugin/acts_as_ferret > > Thanks Jens, > I'm having a lot of difficulty connecting to the svn because of our > firewall. 
Are there any other ways in which the files can be downloaded. > > Thanks > Tom Ignore that, I've worked it out now. http://projects.jkraemer.net/acts_as_ferret/browser/trunk/plugin/acts_as_ferret/lib/acts_as_ferret.rb Thanks Tom -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Wed Mar 15 08:41:19 2006 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 15 Mar 2006 14:41:19 +0100 Subject: [Ferret-talk] ActiveRecord::RecordNotFound in search results act_as Fe In-Reply-To: <872c69cfa65e499b884c4385ece0b163@ruby-forum.com> References: <35440623eb6f223f4d3011fc936e2e3b@ruby-forum.com> <20060315131527.GI10401@cordoba.webit.de> <872c69cfa65e499b884c4385ece0b163@ruby-forum.com> Message-ID: <20060315134119.GJ10401@cordoba.webit.de> On Wed, Mar 15, 2006 at 02:32:28PM +0100, Tom Styles wrote: > Jens Kraemer wrote: > > You should really use the one from the svn repository as described > > there: http://projects.jkraemer.net/acts_as_ferret/ > > The plugin repository is > > svn://projects.jkraemer.net/acts_as_ferret/trunk/plugin/acts_as_ferret > > Thanks Jens, > I'm having a lot of difficulty connecting to the svn because of our > firewall. Are there any other ways in which the files can be downloaded. I attached an archive containing the current trunk to the wiki front page at http://projects.jkraemer.net/acts_as_ferret/wiki . Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From JanPrill at blauton.de Thu Mar 16 14:21:18 2006 From: JanPrill at blauton.de (Jan Prill) Date: Thu, 16 Mar 2006 20:21:18 +0100 Subject: [Ferret-talk] cFerret nearing completion In-Reply-To: References: Message-ID: <6756f739219e74b04fe63b040cbba63b@ruby-forum.com> Hey David, this is great news and your work is greatly appreciated. 
I've had some issues on gentoo compiling cferret and therefore added a ticket (and a small patch for the makefile) on ferret.davebalmain.com. Just thought I should add this here if someone is searching for gentoo on ruby-forum.com... and show my appreciation along the lines. I'm really looking forward to do searching on ruby with lucene-syntax as fast (or nearly as fast) as on java... Best Regards Jan Prill David Balmain wrote: > Hey folks, > > Some good news. I've finished cFerret and it's ruby bindings to the > point where I can run all of the unit tests. I still have to work out > how I'm going to package and release it but it shouldn't be long now. > If you can't wait you might like to try it from the subversion > repository. It'll probably only work on linux at the moment and you > might have to mess with the make file a little. > > As for performance, cFerret seems to be about 10 to 20 times faster so > all this work has been worth it. > > Cheers, > Dave -- Posted via http://www.ruby-forum.com/. From alex at blackkettle.org Thu Mar 16 14:46:50 2006 From: alex at blackkettle.org (Alex Young) Date: Thu, 16 Mar 2006 19:46:50 +0000 Subject: [Ferret-talk] cFerret nearing completion In-Reply-To: References: Message-ID: <4419C0AA.700@blackkettle.org> David Balmain wrote: > Hey folks, > > Some good news. I've finished cFerret and it's ruby bindings to the > point where I can run all of the unit tests. Yay! However, I just checked it out, and a 'make' in the project root gives: > 1) Failure: > test_new_binary_field(FieldTest) [./ruby/test/unit/../unit/document/tc_field.rb:96]: > <"stored/uncompressed,binary,"> expected but was > <"stored/uncompressed,binary,">. Is this something you're expecting, or do you want platform details? 
-- Alex From tom.styles at nottscc.gov.uk Fri Mar 17 04:21:09 2006 From: tom.styles at nottscc.gov.uk (Tom Styles) Date: Fri, 17 Mar 2006 10:21:09 +0100 Subject: [Ferret-talk] Fuzzy searching using act_as_ferret Message-ID: Hello, My Ferret integration has gone quite well. I'm now returning all the results I need from two models using "id_multi_search" and combining the results in the view using a couple of partials. Is there any way that I can turn on fuzzy searching? Would fuzzy searching pick up basic spelling mistakes such as "Bnadit" instead of "Bandit" my experience with search technology is quite limited. Cheers Tom, Nottingham UK -- Posted via http://www.ruby-forum.com/. From etienne.durand at mail.com Fri Mar 17 04:30:51 2006 From: etienne.durand at mail.com (Jean-Etienne Durand) Date: Fri, 17 Mar 2006 10:30:51 +0100 Subject: [Ferret-talk] Fuzzy searching using act_as_ferret In-Reply-To: References: Message-ID: <441A81CB.5070609@mail.com> Tom, It depends on how you build your query. Textually, the syntax is: +(body:word~0.5) if you want to search for 'word' in field 'body'. A more advanced way would to parse your query and then build manually your ferret query. See http://ferret.davebalmain.com/api/classes/Ferret/Search/FuzzyQuery.html for more help. Jean-Etienne Tom Styles wrote: > Hello, > > My Ferret integration has gone quite well. I'm now returning all the > results I need from two models using "id_multi_search" and combining the > results in the view using a couple of partials. > > Is there any way that I can turn on fuzzy searching? > > Would fuzzy searching pick up basic spelling mistakes such as "Bnadit" > instead of "Bandit" my experience with search technology is quite > limited. 
> > Cheers > Tom, Nottingham UK > From dbalmain.ml at gmail.com Sat Mar 18 22:53:08 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Sun, 19 Mar 2006 12:53:08 +0900 Subject: [Ferret-talk] cFerret nearing completion In-Reply-To: <4419C0AA.700@blackkettle.org> References: <4419C0AA.700@blackkettle.org> Message-ID: On 3/17/06, Alex Young wrote: > David Balmain wrote: > > Hey folks, > > > > Some good news. I've finished cFerret and it's ruby bindings to the > > point where I can run all of the unit tests. > Yay! > > However, I just checked it out, and a 'make' in the project root gives: > > > 1) Failure: > > test_new_binary_field(FieldTest) [./ruby/test/unit/../unit/document/tc_field.rb:96]: > > <"stored/uncompressed,binary,"> expected but was > > <"stored/uncompressed,binary,">. > > Is this something you're expecting, or do you want platform details? Sorry, I'm in the middle of moving the ruby bindings from the cferret repository to the ferret repository. cferret will just contain cferret so this problem will be fixed shortly. > > -- > Alex > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > From alex at blackkettle.org Sun Mar 19 01:52:49 2006 From: alex at blackkettle.org (Alex Young) Date: Sun, 19 Mar 2006 06:52:49 +0000 Subject: [Ferret-talk] cFerret nearing completion In-Reply-To: References: <4419C0AA.700@blackkettle.org> Message-ID: <441CFFC1.2020305@blackkettle.org> David Balmain wrote: > On 3/17/06, Alex Young wrote: >>> 1) Failure: >>>test_new_binary_field(FieldTest) [./ruby/test/unit/../unit/document/tc_field.rb:96]: >>><"stored/uncompressed,binary,"> expected but was >>><"stored/uncompressed,binary,">. >> >>Is this something you're expecting, or do you want platform details? > > > Sorry, I'm in the middle of moving the ruby bindings from the cferret > repository to the ferret repository. 
cferret will just contain cferret > so this problem will be fixed shortly. Ah - no worries. I just wasn't sure if you wanted field test data yet :-) -- Alex From dbalmain.ml at gmail.com Sun Mar 19 07:11:43 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Sun, 19 Mar 2006 21:11:43 +0900 Subject: [Ferret-talk] [ANN] Ferret 0.9.0-alpha (port of Apache Lucene to pure ruby) Message-ID: Hi Folks, I've just released version 0.9.0. This latest version of Ferret is an alpha release. I have removed the old c extension and Ferret is now running on a fully ported C library. This has allowed some huge performance improvements both with regard to memory and CPU usage. There will probably be a few portability issues to start with. It has been developed on Linux so it should work fine there. Windows and Mac users beware. Also, the current version doesn't allow you to extend Ferret. For example, you can't write your own analyzer or filter. This will be rectified in the near future. http://ferret.davebalmain.com/trac/ Dave Balmain == Description Ferret is a full port of the Apache Lucene searching and indexing library. It's available as a gem so try it out! To get started quickly read the quick start at the project homepage; http://ferret.davebalmain.com/api http://ferret.davebalmain.com/api/files/TUTORIAL.html == Changes * currently this version isn't very extendable. For example, you can't write your own Analyzer, Filter or Query. * changed Token#term_text to Token#text * changed Token#position_increment to Term#pos_inc * changed order of args to Token.new. Now Term.new(text, start_offset, end_offset, pos_inc=1, type="text"). NOTE: type does nothing. * changed TermVectorOffsetInfo#start_offset to TermVectorOffsetInfo#start * changed TermVectorOffsetInfo#end_offset to TermVectorOffsetInfo#end * added :id_field option to Index::Index class. 
From kraemer at webit.de Sat Mar 25 08:33:48 2006 From: kraemer at webit.de (Jens Kraemer) Date: Sat, 25 Mar 2006 14:33:48 +0100 Subject: [Ferret-talk] [ANN] RDig - ferret-based website crawler/indexer Message-ID: <20060325133348.GA28424@cordoba.webit.de> Hi! RDig is a small tool to build a Ferret index for the contents of a website or intranet. It contains a simple HTTP crawler and some support for extracting textual content from the fetched pages. I built this to implement a site-wide search for a recent project that combined a Rails application with lots of static html files generated by a CMS. Any feedback is very welcome! Rubyforge project page: http://rubyforge.org/projects/rdig RDocs: http://rdig.rubyforge.org/ `gem install rdig` should work once the gem has reached the rubyforge mirrors. Jens -- webit! Gesellschaft für neue Medien mbH www.webit.de Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de Schnorrstraße 76 Tel +49 351 46766 0 D-01069 Dresden Fax +49 351 46766 66 From JanPrill at blauton.de Sat Mar 25 10:30:27 2006 From: JanPrill at blauton.de (Jan Prill) Date: Sat, 25 Mar 2006 16:30:27 +0100 Subject: [Ferret-talk] [ANN] RDig - ferret-based website crawler/indexer In-Reply-To: <20060325133348.GA28424@cordoba.webit.de> References: <20060325133348.GA28424@cordoba.webit.de> Message-ID: <562a35c10603250730x13fa21a1iff5663703791ee6d@mail.gmail.com> Hi, Jens, great stuff. Just installed it and made a short test as described in the readme. It works as announced. Thanks for sharing this! The crawler has problems with frames but this is a quite common problem. I've had to configure it to the main content frame. You'll probably know nutch. But here is a pointer anyway: http://lucene.apache.org/nutch/ just if you're in search for some inspiration. Nutch is a great tool for webcrawling. I've used it and it worked great... Best Regards Jan Prill On 3/25/06, Jens Kraemer wrote: > > Hi!
> > RDig is a small tool to build a Ferret index for the contents of a > website or intranet. It contains a simple HTTP crawler and some support > for extracting textual content from the fetched pages. > > I built this to implement a site-wide search for a recent project > that combined a Rails application with lots of static html files > generated by a CMS. > > Any feedback is very welcome! > > Rubyforge project page: http://rubyforge.org/projects/rdig > RDocs: http://rdig.rubyforge.org/ > > `gem install rdig` should work once the gem has reached the rubyforge > mirrors. > > > Jens > > -- > webit! Gesellschaft für neue Medien mbH www.webit.de > Dipl.-Wirtschaftsingenieur Jens Krämer kraemer at webit.de > Schnorrstraße 76 Tel +49 351 46766 0 > D-01069 Dresden Fax +49 351 46766 66 From josh.nug at gmail.com Mon Mar 27 12:22:52 2006 From: josh.nug at gmail.com (Josh Di) Date: Mon, 27 Mar 2006 19:22:52 +0200 Subject: [Ferret-talk] cFerret nearing completion In-Reply-To: <441CFFC1.2020305@blackkettle.org> References: <4419C0AA.700@blackkettle.org> <441CFFC1.2020305@blackkettle.org> Message-ID: <3a737d00e7ae9b5ce3ac568493b07b1e@ruby-forum.com> Hi David, thank you for the new release. On to the details. How do I compile the latest svn checkout? If I do the normal procedure "setup.rb config && setup.rb setup && setup.rb install", it won't compile because files are missing in ext/. By trial and error I found that "rake package" copies those files to ext/. But now "setup.rb setup" gives 3 compile errors for except.c. The first one is: except.c:8: error: `THREAD_ONCE_INIT' undeclared here (not in a function) I found that constant in no file in the checkout.
Where is it? On the other hand building the latest release from http://ferret.davebalmain.com/trac/wiki/DownloadStable works. So I played around with that one and have added/changed some things on which I would like to hear your opinion. * I made QueryParser's "clean_string" callable via Ruby. So that one can override the method. For that it must be called in frt_qp_parse() via rb_funcall(). Problem is: qp_parse() is also directly called from C (index_get_query), so in this case "clean_string" will not be called. * The current StandardAnalyzer does not parse UTF-8 strings correctly. So I made a quick hack and copy-and-pasted your old SA-implementation with Regular Expression to C. Is this of interest? I then would add the stuff I didnt need (handling of acronyms) and send you the diffs. * I needed .reader on IndexSearcher . This should be in the main branch too, right? Best regards josh -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Mon Mar 27 20:02:01 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Tue, 28 Mar 2006 10:02:01 +0900 Subject: [Ferret-talk] cFerret nearing completion In-Reply-To: <3a737d00e7ae9b5ce3ac568493b07b1e@ruby-forum.com> References: <4419C0AA.700@blackkettle.org> <441CFFC1.2020305@blackkettle.org> <3a737d00e7ae9b5ce3ac568493b07b1e@ruby-forum.com> Message-ID: On 3/28/06, Josh Di wrote: > Hi David, > > thank you for the new release. > > To the details. How do I compile the latest svn checkout? If I do the > normal procedure "setup.rb config && setub.rb setup && setup.rb > install", it won't compile because files are missing in ext/. Sorry, I forgot to check some files in. It should work now. Run rake ext to copy all the files into the ext directory and build the extension. Then setup.rb should work correctly. > > So I played around with that one and have added/changed some things on > which I would like to hear your opinion. > > * I made QueryParser's "clean_string" callable via Ruby. 
So that one can > override the method. For that it must be called in frt_qp_parse() via > rb_funcall(). Problem is: qp_parse() is also directly called from C > (index_get_query), so in this case "clean_string" will not be called. I've added a :clean_string attribute to Index and QueryParser. So: index = Index::Index.new(:clean_string => false) will create an Index that uses a QueryParser that doesn't call the clean_string function. This way you can clean the string yourself before you even pass it to the search method. I think this makes the most sense. > * The current StandardAnalyzer does not parse UTF-8 strings correctly. > So I made a quick hack and copy-and-pasted your old SA-implementation > with Regular Expression to C. Is this of interest? I then would add the > stuff I didn't need (handling of acronyms) and send you the diffs. I'd definitely like to see this. Send me a patch or the code or whatever. > * I needed .reader on IndexSearcher . This should be in the main branch > too, right? I've added a reader attribute to IndexSearcher. You may like to look at what I changed and compare it to the way you did it. The memory management between C and ruby can be quite confusing. > > Best regards > > josh > > -- > Posted via http://www.ruby-forum.com/. From onurturgay at labristeknoloji.com Tue Mar 28 03:48:00 2006 From: onurturgay at labristeknoloji.com (Onur Turgay) Date: Tue, 28 Mar 2006 11:48:00 +0300 Subject: [Ferret-talk] [ANN] Ferret 0.9.0-alpha (port of Apache Lucene to pure ruby) In-Reply-To: References: Message-ID: hi david, I installed 0.9.0 on a very busy webserver (100k pagevisits/day) and it's working flawlessly (at least it seems so :) ). But I have a major problem. Now Ferret doesn't index or search Unicode Turkish characters.
I was using StandardAnalyzer in 0.3.2 and it was working fine, because the \w+ RegExp was somehow working with the Turkish charset (UTF-8) (under normal conditions it shouldn't; I was lucky, I think :) ). Is there a way I can make Ferret work with Unicode again, or should I stick to 0.3.2? Thanks in advance, and thanks for the great work. onur David Balmain wrote: > Hi Folks, > > I've just released version 0.9.0. This latest version of Ferret is an > alpha release. I have removed the old c extension and Ferret is now > running on a fully ported C library. This has allowed some huge > performance improvements both with regard to memory and CPU usage. > > There will probably be a few portability issues to start with. It has > been developed on Linux so it should work fine there. Windows and Mac > users beware. > > Also, the current version doesn't allow you to extend Ferret. For > example, you can't write your own analyzer or filter. This will be > rectified in the near future. > > http://ferret.davebalmain.com/trac/ > > Dave Balmain > > == Description > > Ferret is a full port of the Apache Lucene searching and indexing > library. It's available as a gem so try it out! To get started quickly > read the quick start at the project homepage; > > http://ferret.davebalmain.com/api > http://ferret.davebalmain.com/api/files/TUTORIAL.html > > == Changes > > * currently this version isn't very extendable. For example, > you can't write your own Analyzer, Filter or Query. > * changed Token#term_text to Token#text > * changed Token#position_increment to Term#pos_inc > * changed order of args to Token.new. Now Term.new(text, start_offset, > end_offset, pos_inc=1, type="text"). NOTE: type does nothing. > * changed TermVectorOffsetInfo#start_offset to TermVectorOffsetInfo#start > * changed TermVectorOffsetInfo#end_offset to TermVectorOffsetInfo#end > * added :id_field option to Index::Index class.
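On the \w+ question in Onur's message: the following is a minimal sketch in plain Ruby, not Ferret's analyzer code, of why a \w+ tokenizer can split Turkish words. It assumes a current Ruby interpreter, where \w matches only ASCII word characters and \p{L} matches any Unicode letter; it says nothing about how Ferret 0.3.2 or 0.9.0 tokenize internally. The method names and the sample string are made up for illustration.

```ruby
# Illustration only: plain Ruby regular expressions, not Ferret's analyzer code.

# Tokenize with \w+, which matches only ASCII word characters [a-zA-Z0-9_]
# in current Ruby versions.
def ascii_word_tokens(text)
  text.scan(/\w+/)
end

# Tokenize with \p{L}+, which matches runs of Unicode letters.
def unicode_letter_tokens(text)
  text.scan(/\p{L}+/)
end

sample = "çağrı işlemi"
p ascii_word_tokens(sample)      # the non-ASCII letters break the words apart
p unicode_letter_tokens(sample)  # => ["çağrı", "işlemi"]
```

If a tokenizer has to be written by hand, a Unicode-aware class such as \p{L} or [[:alpha:]] is probably the safer choice for UTF-8 text than \w.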
From f at andreas-s.net Wed Mar 29 08:12:02 2006 From: f at andreas-s.net (Andreas S.) Date: Wed, 29 Mar 2006 15:12:02 +0200 Subject: [Ferret-talk] Problems with Ferret 0.9.0 Message-ID: Hi, I upgraded from 0.3.2 to 0.9.0, and now my old search code doesn't work anymore. I get a lot of ArgumentErrors, for example: "query.add_clause(Search::BooleanClause.new(query_parser.parse(term), Search::BooleanClause::Occur::MUST))" raises: ArgumentError (wrong number of arguments (2 for 0)) "index_searcher.search_each(query)" raises: ArgumentError (wrong number of arguments (1 for 2)) These shouldn't happen according to the Api doc. You can see the code here: http://rforum.andreas-s.net/trac/file/trunk/app/models/search_ferret.rb Andreas -- Posted via http://www.ruby-forum.com/. From alainravet-spam2004 at yahoo.com Wed Mar 29 08:23:11 2006 From: alainravet-spam2004 at yahoo.com (Alain Ravet) Date: Wed, 29 Mar 2006 15:23:11 +0200 Subject: [Ferret-talk] EdgeRails: "undefined method `weight' for # Hi all, I was playing with the sample project found on the Wiki at http://wiki.rubyonrails.com/rails/pages/HowToIntegrateFerretWithRails , and everything was working fine, ... till I moved to EdgeRails : undefined method `weight' for # (full error thread below) Any idea? 
Alain /Users/aravet/Desktop/Locomotive/Bundles/RubyonRails-1.0-Min.locobundle/Contents/Resources/unix/lib/ruby/site_ruby/1.8/ferret/search/index_searcher.rb:107:in `search' /Users/aravet/Desktop/Locomotive/Bundles/RubyonRails-1.0-Min.locobundle/Contents/Resources/unix/lib/ruby/site_ruby/1.8/ferret/index/index.rb:622:in `do_search' /Users/aravet/Desktop/Locomotive/Bundles/RubyonRails-1.0-Min.locobundle/Contents/Resources/unix/lib/ruby/site_ruby/1.8/ferret/index/index.rb:317:in `search_each' /Users/aravet/Desktop/Locomotive/Bundles/RubyonRails-1.0-Min.locobundle/Contents/Resources/unix/lib/ruby/1.8/monitor.rb:229:in `synchronize' /Users/aravet/Desktop/Locomotive/Bundles/RubyonRails-1.0-Min.locobundle/Contents/Resources/unix/lib/ruby/site_ruby/1.8/ferret/index/index.rb:316:in `search_each' #{RAILS_ROOT}/app/models/result.rb:27:in `search_index' #{RAILS_ROOT}/app/models/result.rb:15:in `count' #{RAILS_ROOT}/app/controllers/search_controller.rb:31:in `get_results' -- Posted via http://www.ruby-forum.com/. From alainravet-spam2004 at yahoo.com Wed Mar 29 09:06:29 2006 From: alainravet-spam2004 at yahoo.com (Alain Ravet) Date: Wed, 29 Mar 2006 16:06:29 +0200 Subject: [Ferret-talk] EdgeRails: "undefined method `weight' for # References: Message-ID: Additional info: I also tried without the C extension, without success. For the 1st unsuccessful attempt, I just installed the gem $ gem install ferret. For the 2nd attempt, I uninstalled the gem $ gem uninstall ferret. and installed from the zip file, AND skipped the C extension compilation, as indicated on Trac $ ruby setup.rb config $ ruby setup.rb install I've looked in the ferret code and noticed that a weight method can be found in the Query class, which doesn't extend Hash. I'm puzzled by the error message and more than annoyed, as my project runs on EdgeRails. Alain -- Posted via http://www.ruby-forum.com/.
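David Balmain's reply later in this thread attributes the `weight` error to a Hash reaching search_each where a String was expected. The sketch below tries to show that mechanism in plain Ruby. SketchQuery and sketch_search are hypothetical stand-ins invented for this illustration; they are not Ferret classes or methods, and the real search_each may differ in detail.

```ruby
# Hypothetical stand-ins; these are NOT Ferret's classes or methods.
class SketchQuery
  def initialize(text)
    @text = text
  end

  # Stand-in for the weight a real query object would compute.
  def weight
    "weight(#{@text})"
  end
end

def sketch_search(query)
  # A String is converted to a query object; anything else is used as-is.
  query = SketchQuery.new(query) if query.is_a?(String)
  query.weight
end

puts sketch_search("ferret") # a String is converted first, so this works

begin
  # A Hash is not converted, and Hash has no weight method.
  sketch_search({ "query" => "ferret" })
rescue NoMethodError => e
  puts e.message
end
```

Under these assumptions, checking what is actually handed to the search call (for example, the whole params hash rather than params['query']) would be a reasonable first step.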
From JanPrill at blauton.de Wed Mar 29 09:21:40 2006 From: JanPrill at blauton.de (Jan Prill) Date: Wed, 29 Mar 2006 16:21:40 +0200 Subject: [Ferret-talk] EdgeRails: "undefined method `weight' for # References: Message-ID: <562a35c10603290621y53749d54jbde89d7a1e72005d@mail.gmail.com> Hi, Alain, I originally posted this HowTo to the RoR wiki, but I don't have the time right now even for the small contribution of keeping that page up to date, so I wonder whether it is about time to delete the sample project. So much has happened in Ferret as well as in Rails that it's no wonder something no longer works. My advice right now: don't be disappointed or annoyed, but have a look at the Ferret wiki http://ferret.davebalmain.com and at the integration efforts under acts_as_ferret. There is even an svn tree for integrating Ferret with Rails. You should find its location on the Ferret wiki and in the ferret-talk archives on http://www.ruby-forum.com. And maybe, as you go along, you'll have the time to update the Rails wiki as well. Sorry for the inconvenience, but I have exams in a few months and am studying all the time. All things IT aren't really happening for me right now... Best Regards Jan Prill On 3/29/06, Alain Ravet wrote: > > additional info: I also tried without the c extension, without success. > > > For the 1st unsuccessful attempt, I just installed the gem > > $ gem install ferret. > > > For the 2nd attempt, > I uninstalled the gem > $ gem uninstall ferret. > and installed from the zip file, AND skipped the c extension > compilation, as indicated on Trac > $ ruby setup.rb config > $ ruby setup.rb install > > > I've looked in the ferret code and noticed that a weight method can be > found in the Query class, that doesn't extend Hash. I'm puzzled by the > error message and more than annoyed, as my project runs on EdgeRails. > > > Alain > > -- > Posted via http://www.ruby-forum.com/.
> _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20060329/3c19d609/attachment.htm From alainravet-spam2004 at yahoo.com Thu Mar 30 02:16:26 2006 From: alainravet-spam2004 at yahoo.com (Alain Ravet) Date: Thu, 30 Mar 2006 09:16:26 +0200 Subject: [Ferret-talk] EdgeRails: "undefined method `weight' for # References: <562a35c10603290621y53749d54jbde89d7a1e72005d@mail.gmail.com> Message-ID: I started from scratch on EdgeRails, and it worked. It's strange though, that changing the Rails version has an impact on Ferret (that doesn't require Rails). Alain -- Posted via http://www.ruby-forum.com/. From dbalmain.ml at gmail.com Thu Mar 30 11:58:51 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Fri, 31 Mar 2006 01:58:51 +0900 Subject: [Ferret-talk] Problems with Ferret 0.9.0 In-Reply-To: References: Message-ID: Hi Andreas, Thanks for bringing this to my attention. I hadn't added BooleanClauses yet. They'll be in the next version. In the mean time, I recommend you use the BooleanQuery#add_query method. Cheers, Dave On 3/29/06, Andreas S. wrote: > Hi, > > I upgraded from 0.3.2 to 0.9.0, and now my old search code doesn't work > anymore. I get a lot of ArgumentErrors, for example: > > "query.add_clause(Search::BooleanClause.new(query_parser.parse(term), > Search::BooleanClause::Occur::MUST))" > raises: > ArgumentError (wrong number of arguments (2 for 0)) > > "index_searcher.search_each(query)" > raises: > ArgumentError (wrong number of arguments (1 for 2)) > > These shouldn't happen according to the Api doc. > > You can see the code here: > http://rforum.andreas-s.net/trac/file/trunk/app/models/search_ferret.rb > > Andreas > > -- > Posted via http://www.ruby-forum.com/. 
> _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > From dbalmain.ml at gmail.com Thu Mar 30 12:29:36 2006 From: dbalmain.ml at gmail.com (David Balmain) Date: Fri, 31 Mar 2006 02:29:36 +0900 Subject: [Ferret-talk] EdgeRails: "undefined method `weight' for # References: <562a35c10603290621y53749d54jbde89d7a1e72005d@mail.gmail.com> Message-ID: Hi Alain, I'm glad the problem seems to have fixed itself. Basically what must have been happening was that a Hash was being passed to the search_each method instead of a String. The search_each method knows how to convert a String to a Query but a Hash will cause the error you saw. I don't know exactly why changing the Rails version would do something like that but I've had a similar problem in the past. That's why it's EdgeRails. :D Expect the next couple of versions of ferret to be even less stable. Cheers, Dave On 3/30/06, Alain Ravet wrote: > > I started from scratch on EdgeRails, and it worked. > It's strange though, that changing the Rails version has an impact on > Ferret (that doesn't require Rails). > > Alain > > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > From etienne.durand at woa.hu Fri Mar 31 16:03:12 2006 From: etienne.durand at woa.hu (etienne.durand at woa.hu) Date: Fri, 31 Mar 2006 23:03:12 +0200 Subject: [Ferret-talk] undefined method `<=>' for :id:Symbol Message-ID: <9c5f400c2ea80223977a3b0b50299359@ruby-forum.com> Upgrading to 0.9.0, I have the following error. Anybody? 
c:/ruby/lib/ruby/gems/1.8/gems/ferret-0.9.0/lib/ferret/index/term.rb:35:in `<=>': undefined method `<=>' for :id:Symbol (NoMethodError) from c:/ruby/lib/ruby/gems/1.8/gems/ferret-0.9.0/lib/ferret/index/term_infos_io.rb:263:in `get_index_offset' from c:/ruby/lib/ruby/gems/1.8/gems/ferret-0.9.0/lib/ferret/index/term_infos_io.rb:162:in `get_term_info' from c:/ruby/lib/ruby/gems/1.8/gems/ferret-0.9.0/lib/ferret/index/segment_reader.rb:176:in `doc_freq' from c:/ruby/lib/ruby/gems/1.8/gems/ferret-0.9.0/lib/ferret/search/index_searcher.rb:47:in `doc_freq' from c:/ruby/lib/ruby/gems/1.8/gems/ferret-0.9.0/lib/ferret/search/term_query.rb:13:in `initialize' from c:/ruby/lib/ruby/gems/1.8/gems/ferret-0.9.0/lib/ferret/search/term_query.rb:99:in `create_weight' from c:/ruby/lib/ruby/gems/1.8/gems/ferret-0.9.0/lib/ferret/search/query.rb:51:in `weight' from c:/ruby/lib/ruby/gems/1.8/gems/ferret-0.9.0/lib/ferret/search/index_searcher.rb:151:in `search_each' ... 23 levels... from c:/ruby/lib/ruby/gems/1.8/gems/activerecord-1.14.0/lib/active_record/base.rb:1850:in `each_with_index' from C:/tmp/agora/trunk/db/content/import.rb:27 from C:/tmp/agora/trunk/db/content/import.rb:20 from C:/tmp/agora/trunk/db/content/import.rb:13 -- Posted via http://www.ruby-forum.com/.