From bigmac928 at gmail.com Tue Feb 7 13:23:42 2006
From: bigmac928 at gmail.com (big mac)
Date: Wed, 8 Feb 2006 02:23:42 +0800
Subject: [Ferret-talk] setting of :key to :id in cFerret
Message-ID: <2687b7810602071023m105712fas9a5afee532e0f69@mail.gmail.com>

Hi Dave,

I've been reading the post below from December 2005. Is it possible to set :key to :id in cFerret, as suggested below?

Thanks,
Mac

On 12/3/05, Carl Youngblood <carl at youngbloods.org> wrote:
> I seem to be getting the same document multiple times in my search
> results. I'm wondering if this is because by default a document is
> placed in the search results every time the word you're looking for
> shows up. Is that the way it works?

Hi Carl,

This means the document has been placed in the index more than once. Sounds to me like you are adding an object to the index every time it is updated. You could try setting :key to :id. This will make sure that :id is unique in the index. That is, every time you add an existing document, the document is replaced.

    index = Index::Index.new(:key => :id)

Alternatively you could handle the deletes yourself.

Hope this helps.
Dave

> Thanks,
> Carl
>
> _______________________________________________
> Ferret-talk mailing list
> Ferret-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/ferret-talk

From dbalmain.ml at gmail.com Tue Feb 7 21:44:10 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Wed, 8 Feb 2006 11:44:10 +0900
Subject: [Ferret-talk] setting of :key to :id in cFerret
In-Reply-To: <2687b7810602071023m105712fas9a5afee532e0f69@mail.gmail.com>
References: <2687b7810602071023m105712fas9a5afee532e0f69@mail.gmail.com>

Hi Mac,

I was planning on handling this in the Ruby bindings rather than in the C code. If you don't mind me asking, what are you using cFerret for?

Dave

From chrisroos at revieworld.com Thu Feb 9 06:13:43 2006
From: chrisroos at revieworld.com (Chris Roos)
Date: Thu, 9 Feb 2006 12:13:43 +0100
Subject: [Ferret-talk] Finding related items (like latent semantic indexing)

I've been trying to use Classifier::LSI to provide a means of finding 'related items', where each item is a one line description of a product.

Although on small samples the Classifier works great, it completely baulks on my current dataset of 3000 items.

I've started to look at ferret this morning, following a post on the ruby mailing list. I'd guess that the Fuzzy Query would be the thing that I need, although it doesn't appear to be as comprehensive as the LSI stuff in classifier (I realise they are doing different things).

I'm really just after any thoughts anyone might have.

Thanks in advance,

Chris

--
Posted via http://www.ruby-forum.com/.

From dbalmain.ml at gmail.com Thu Feb 9 07:42:22 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Thu, 9 Feb 2006 21:42:22 +0900
Subject: [Ferret-talk] Finding related items (like latent semantic indexing)

Hi Chris,

I plan on adding a "More Like This" function to Ferret but I'm really swamped (doing other stuff on Ferret) at the moment. If you want to have a go at implementing it yourself you could have a look at the way it's done in Lucene. It's not too much work but it could take you a while to get your head around the Ferret internals, and the current Ferret codebase is soon to be obsolete. Sorry I can't be of more help.

Cheers,
Dave

From dbalmain.ml at gmail.com Thu Feb 9 10:09:51 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Fri, 10 Feb 2006 00:09:51 +0900
Subject: [Ferret-talk] Finding related items (like latent semantic indexing)

Hi Chris,

I just noticed that you are indexing one line product descriptions. What I'd suggest doing (I believe this is how the Lucene MoreLikeThis query works) is just taking the description of your start product and using that as the query. So if the description is;

    "apple ipod nano 4Gb black"

then your query will be;

    "description:(apple ipod nano 4Gb black)"

Hope that helps,
Dave

From JFraumeni at tufts-nemc.org Fri Feb 10 17:00:50 2006
From: JFraumeni at tufts-nemc.org (Fraumeni, James)
Date: Fri, 10 Feb 2006 17:00:50 -0500
Subject: [Ferret-talk] Ferret Trampling Namespace

Greetings.

Apologies if I missed something like this in the archives or internet, but I'm having issues using Ferret with Rails. Specifically, I have a class "Weight" in my application (it happens to be a model). My app runs perfectly fine until it first executes "require 'ferret'". At that point, my definition of the class "Weight" disappears and I get "uninitialized constant Weight" errors.

I see that Ferret also has a Weight class (Ferret::Search::Weight) so I assume that Ferret isn't playing nicely with namespaces, although I assume that it's caused by some interaction between Rails and Ferret as it does not occur in IRB.

Has anyone else come across an issue of Ferret trampling namespace? I have tried putting the indexing code in a number of places within my app, and "load"ing Ferret rather than "require"ing it, but loading tends to fail to load Ferret.

Thanks for any insight. I am a Ruby and Rails nuby and am using Win32 (temporarily), Ruby 1.8.2, Ferret 0.3.2, and Rails 1.0.0.

James

--
James Fraumeni
Center for the Evaluation of Value and Risk in Health
Institute for Clinical Research and Health Policy Studies
750 Washington St.
Tufts-New England Medical Center, #063
Boston, MA 02111
(617) 636-2577
jfraumeni at tufts-nemc.org

**********************
Confidentiality Notice
**********************
The information transmitted in this e-mail is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged information.
Any review, retransmission, dissemination or other use of or taking of any action in reliance upon this information by persons or entities other than the intended recipient is prohibited.
If you received this e-mail in error, please contact the sender and delete the e-mail and any attached material immediately. Thank you.

From atomgiant at gmail.com Thu Feb 16 09:38:44 2006
From: atomgiant at gmail.com (Tom Davies)
Date: Thu, 16 Feb 2006 09:38:44 -0500
Subject: [Ferret-talk] Ferret with relative index paths

Hi,

I have ferret working fine on my Dev machine using a relative index path as follows:

    USER_INDEX = Index::Index.new(:path => "indexes/user", :key => 'id', :auto_flush => true)

And the indexes/user directory is located directly off the root of my project tree.

But when I migrate this same code to my shared TextDrive account, Ferret cannot find the index directory and throws this exception:

    /usr/local/lib/ruby/gems/1.8/gems/ferret-0.3.2/lib/ferret/store/fs_store.rb:41:in `initialize': There is no directory: indexes/user. Use create = true to create one (RuntimeError)

Do I have to use absolute paths? Or is it perhaps looking somewhere else for the root of its relative paths in different environments?

Thanks,
Tom

From JFraumeni at tufts-nemc.org Thu Feb 16 11:24:10 2006
From: JFraumeni at tufts-nemc.org (Fraumeni, James)
Date: Thu, 16 Feb 2006 11:24:10 -0500
Subject: [Ferret-talk] Ferret with relative index paths

I'm a relative nuby to Ruby, but my inclination would be to add a call to Dir::pwd() right before you open the index to see what Ruby is using as its current working directory (and then figure out why). One fix would be to use Dir::chdir() to change the current working directory, although that doesn't seem to me to be a very robust solution.

James

--
James Fraumeni
Center for the Evaluation of Value and Risk in Health
Institute for Clinical Research and Health Policy Studies
750 Washington St.
Tufts-New England Medical Center, #063
Boston, MA 02111
jfraumeni at tufts-nemc.org
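
For the relative-path problem in this thread, one option is to build the index path from a fixed application root instead of relying on the process's current working directory. The following is a minimal sketch, not tested against Ferret: APP_ROOT and index_path are hypothetical names introduced for illustration, and in a Rails application the RAILS_ROOT constant would normally take the place of APP_ROOT.

```ruby
# Sketch: derive an absolute index path that does not depend on Dir.pwd.
# APP_ROOT and index_path are hypothetical names, not part of Ferret or Rails.
APP_ROOT = File.expand_path(File.dirname(__FILE__))

# Join the path parts under the given root and normalise to an absolute path.
def index_path(root, *parts)
  File.expand_path(File.join(root, *parts))
end

USER_INDEX_PATH = index_path(APP_ROOT, "indexes", "user")
puts USER_INDEX_PATH

# The resulting string would then be given as the :path option, as in the
# Index::Index.new call quoted in the thread (assumption: not run here).
```

Because the root is resolved once to an absolute path, a later change of working directory (for example a server that starts in the public directory) should no longer affect where the index is looked up.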

From atomgiant at gmail.com Thu Feb 16 17:09:55 2006
From: atomgiant at gmail.com (Tom Davies)
Date: Thu, 16 Feb 2006 17:09:55 -0500
Subject: [Ferret-talk] Ferret with relative index paths

OK, it was due to the setup at TextDrive. The current directory was the public directory rather than the project root. Thanks for the idea, James. It looks like I am going to have to configure my dev vs production independently.

Tom

From alainravet-spam2004 at yahoo.com Fri Feb 17 13:04:21 2006
From: alainravet-spam2004 at yahoo.com (Alain Ravet)
Date: Fri, 17 Feb 2006 19:04:21 +0100
Subject: [Ferret-talk] plz register this mailing-list with gmane
Message-ID: <67bdfa095ebec4a9d57681d3c7bc8b02@ruby-forum.com>

Thanks in advance.

Alain

--
Posted via http://www.ruby-forum.com/.

From alex at blackkettle.org Fri Feb 17 13:28:27 2006
From: alex at blackkettle.org (Alex Young)
Date: Fri, 17 Feb 2006 18:28:27 +0000
Subject: [Ferret-talk] IndexReader NotImplemented
Message-ID: <43F615CB.90004@blackkettle.org>

Hi there,

Sorry if this has come up before, but I couldn't see it obviously addressed anywhere. There are a few methods in IndexReader that raise NotImplementedErrors. I'm specifically interested in get_term_vector, but there are a number of others.
Is there anything specific holding these back, or would patches to implement them be accepted?

Thanks,
--
Alex

From dbalmain.ml at gmail.com Fri Feb 17 20:57:32 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Sat, 18 Feb 2006 10:57:32 +0900
Subject: [Ferret-talk] IndexReader NotImplemented
In-Reply-To: <43F615CB.90004@blackkettle.org>
References: <43F615CB.90004@blackkettle.org>

Hi Alex,

These methods are implemented by SegmentReader and MultiReader. IndexReader is kind of an abstract class. When you open an IndexReader you'll get either a SegmentReader or a MultiReader so go ahead and use those methods. They'll be implemented.

Cheers,
Dave

PS: make sure you use the IndexReader#open method to create an IndexReader.

From alainravet-spam2004 at yahoo.com Mon Feb 20 09:14:15 2006
From: alainravet-spam2004 at yahoo.com (Alain Ravet)
Date: Mon, 20 Feb 2006 15:14:15 +0100
Subject: [Ferret-talk] plz register this mailing-list with gmane
In-Reply-To: <67bdfa095ebec4a9d57681d3c7bc8b02@ruby-forum.com>
References: <67bdfa095ebec4a9d57681d3c7bc8b02@ruby-forum.com>

Anybody?
It's free, it takes 30 seconds once and the whole community benefits from it.

Newsgroups are really convenient. They are threaded, serve archives automatically and don't force you to distribute your email address.

TIA

Alain

--
Posted via http://www.ruby-forum.com/.

From jennyw at dangerousideas.com Tue Feb 21 23:37:51 2006
From: jennyw at dangerousideas.com (jennyw)
Date: Tue, 21 Feb 2006 20:37:51 -0800
Subject: [Ferret-talk] plz register this mailing-list with gmane
References: <67bdfa095ebec4a9d57681d3c7bc8b02@ruby-forum.com>
Message-ID: <43FBEA9F.40904@dangerousideas.com>

Alain Ravet wrote:
> Anybody?
> It's free, it takes 30 seconds once and the whole community benefits
> from it.

For what it's worth, I think it'd be a nice idea. For the time being, though, you can also read this group at ruby-forum.com.

Jen

From bigliu at gmail.com Wed Feb 22 18:32:25 2006
From: bigliu at gmail.com (Jerry Liu)
Date: Thu, 23 Feb 2006 00:32:25 +0100
Subject: [Ferret-talk] Chinese search support
Message-ID: <66ad2f5ed69c3c5c1d70adb57c90ffbb@ruby-forum.com>

I need to decide whether our site will go with Java or Ruby on Rails. The major factor is whether Ferret supports Lucene's ChineseAnalyzer or CJKAnalyzer.

Can anybody shed some light on Ferret's Chinese search support?

Really appreciate it.

--
Posted via http://www.ruby-forum.com/.

From dbalmain.ml at gmail.com Thu Feb 23 01:54:33 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Thu, 23 Feb 2006 15:54:33 +0900
Subject: [Ferret-talk] Chinese search support
In-Reply-To: <66ad2f5ed69c3c5c1d70adb57c90ffbb@ruby-forum.com>
References: <66ad2f5ed69c3c5c1d70adb57c90ffbb@ruby-forum.com>

Hi Jerry,

Basically you'll have to write an analyzer that matches Chinese tokens (words). If you can write a regular expression in Ruby that matches Chinese tokens then it's very simple to write an Analyzer for Ferret. I haven't looked at the CJKAnalyzer in Lucene but I can't imagine it would be too hard to port to Ruby.

Cheers,
Dave

From erik at ehatchersolutions.com Thu Feb 23 03:18:37 2006
From: erik at ehatchersolutions.com (Erik Hatcher)
Date: Thu, 23 Feb 2006 03:18:37 -0500
Subject: [Ferret-talk] Chinese search support
References: <66ad2f5ed69c3c5c1d70adb57c90ffbb@ruby-forum.com>

There is nothing fancy about the CJKAnalyzer: it chunks characters into pairs. So the phrase ??? would be tokenized into two tokens [??] [??].

Erik

From adamjroth at gmail.com Tue Feb 28 00:58:11 2006
From: adamjroth at gmail.com (A. Roth)
Date: Tue, 28 Feb 2006 06:58:11 +0100
Subject: [Ferret-talk] Multiple Models w/ acts_as_ferret

I have multiple models all with:

    acts_as_ferret :fields => [...]

(models = profiles, blogs, comments)

When I restart the server and perform any CRUD operation on one of the above models, the index is created/updated. If I then go and perform any CRUD operation on ANOTHER model, the index from that first model is being updated.

Any ideas? Can acts_as_ferret handle this?

Thanks
Adam

--
Posted via http://www.ruby-forum.com/.

From dbalmain.ml at gmail.com Tue Feb 28 06:06:42 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Tue, 28 Feb 2006 20:06:42 +0900
Subject: [Ferret-talk] Multiple Models w/ acts_as_ferret

On 2/28/06, A. Roth wrote:
> I have multiple models all with:
>
> acts_as_ferret :fields => [...]
>
> (models = profiles, blogs, comments )
>
> When I restart the server and perform any crud operation on one of the
> above models, the index is created/updated. If I then go and perform
> any crud operation on ANOTHER model, ...the index from that first model
> is being updated.
>
> Any ideas? Can acts_as_ferret handle this?

Hi Adam,

I'm assuming here that you used the acts_as_ferret code on the lower part of this page;

http://ferret.davebalmain.com/trac/wiki/FerretOnRails

There should only be one index for all models, not one index each. Each document in the index contains a ferret_class field which will contain the name of the model so searches on a specific model will only find documents for that model. I hope that makes sense. Basically there should be one index that gets updated whenever any of the models are updated.

Cheers,
Dave

From atomgiant at gmail.com Tue Feb 28 08:12:09 2006
From: atomgiant at gmail.com (Tom Davies)
Date: Tue, 28 Feb 2006 08:12:09 -0500
Subject: [Ferret-talk] Most Popular Searches

Hi,

I have an index where each document contains an untokenized 'url' field. I would like to query the index for the most popular urls. In SQL I would do this via a Group By clause. Is there anything in Ferret that will do something similar?

I found this discussion that proposed a solution involving TermEnums:

http://www.gossamer-threads.com/lists/lucene/java-user/32272#32272

But I noticed the IndexReader.terms and IndexReader.term_docs are not implemented. Is that solution the way to go? Would an index-only solution perform a lot faster than a pure database solution using a group by clause?

Any feedback is appreciated.

Tom

From adamjroth at gmail.com Tue Feb 28 10:11:20 2006
From: adamjroth at gmail.com (Adam Roth)
Date: Tue, 28 Feb 2006 16:11:20 +0100
Subject: [Ferret-talk] Multiple Models w/ acts_as_ferret

Hi David,

Yes, that makes sense. Why, however, are there directories created for each model in RAILS_ROOT/index/RAILS_ENV if only one is needed? The first model hit will be the index that is written to for all models (like I had mentioned). Is that correct?

For example:

- I start up the server
- A "blog" page is hit and updated
- The index in RAILS_ROOT/index/Development/Blog is updated
- A "comment" page is updated
- The index in RAILS_ROOT/index/Development/Blog is updated again, despite there being a /Development/Comment dir.

I'm probably missing something. I would appreciate it if you could fill in my understanding based on my comments above.

Thank you in advance.

Adam

--
Posted via http://www.ruby-forum.com/.

From adamjroth at gmail.com Tue Feb 28 17:21:13 2006
From: adamjroth at gmail.com (aroth)
Date: Tue, 28 Feb 2006 23:21:13 +0100
Subject: [Ferret-talk] Multiple Models w/ acts_as_ferret

Also, I was using the version found here:

https://svn.jkraemer.net/svn/projects/ferret-demo/trunk/vendor/plugins/acts_as_ferret/

His changelog mentions how the index per model structure... should I be using the one on the wiki page?

Adam

--
Posted via http://www.ruby-forum.com/.

From dbalmain.ml at gmail.com Tue Feb 28 18:35:37 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Wed, 1 Mar 2006 08:35:37 +0900
Subject: [Ferret-talk] Multiple Models w/ acts_as_ferret

Hi Adam,

That version is supposed to work with separate indexes. I had a quick look at the code but I'm not sure what is wrong. Perhaps you could drop Jens an email about it. Or you could try the other acts_as_ferret plugin. Sorry I can't be of more help.

Cheers,
Dave

From dbalmain.ml at gmail.com Tue Feb 28 18:37:15 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Wed, 1 Mar 2006 08:37:15 +0900
Subject: [Ferret-talk] Multiple Models w/ acts_as_ferret

One other thing. When you say the blog index is being updated when you modify comments, does that mean only the blog index ever gets updated? Does the comments index remain empty?

From tlockney at oddpost.com Tue Feb 28 18:39:11 2006
From: tlockney at oddpost.com (Thomas Lockney)
Date: Wed, 1 Mar 2006 00:39:11 +0100
Subject: [Ferret-talk] Multiple Models w/ acts_as_ferret

aroth wrote:
> Also, I was using the version found here:
>
> https://svn.jkraemer.net/svn/projects/ferret-demo/trunk/vendor/plugins/acts_as_ferret/

Actually, I believe he merged bits and pieces from both versions on the wiki along with some of his own changes. I haven't looked closely at the code yet, but from an initial glance it looks like it incorporates my additions pretty cleanly.

I'm not quite certain why you would be seeing the behavior described. Could you post some snippets from your code (like the acts_as_ferret lines from each model and any configuration from environment.rb)?

Jens just recently set up this SVN repository and gave Kasper and me access to it for further updates. This will be the source for future versions of the plugin. One of us will hopefully be updating the wiki soon to reflect this.

Thomas

--
Posted via http://www.ruby-forum.com/.

From dbalmain.ml at gmail.com Tue Feb 28 18:58:45 2006
From: dbalmain.ml at gmail.com (David Balmain)
Date: Wed, 1 Mar 2006 08:58:45 +0900
Subject: [Ferret-talk] Most Popular Searches

On 2/28/06, Tom Davies wrote:
> Hi,
>
> I have an index where each document contains an untokenized 'url'
> field. I would like to query the index for the most popular urls. In
> SQL I would do this via a Group By clause. Is there anything in
> Ferret that will do something similar?
>
> I found this discussion that proposed a solution involving TermEnums:
>
> http://www.gossamer-threads.com/lists/lucene/java-user/32272#32272
>
> But I noticed the IndexReader.terms and IndexReader.term_docs are not
> implemented. Is that solution the way to go? Would an index-only
> solution perform a lot faster than a pure database solution using a
> group by clause?

Hi Tom,

Those methods are implemented. Just not in IndexReader. They're implemented in SegmentReader and MultiReader. IndexReader is an abstract class. Whenever you call IndexReader#open you'll get either a SegmentReader or a MultiReader.

Anyway, if you want to run searches on all documents with the url field you could use a filter like this;

    module Ferret::Search
      # A Filter that restricts search results to only those documents with a
      # certain field called @group_name.
      class GroupFilter < Filter
        include Ferret::Index

        def initialize(group_name)
          @group_name = group_name
        end

        # Returns a BitVector with true for documents which should be permitted in
        # search results, and false for those that should not.
        def bits(reader)
          bits = Ferret::Utils::BitVector.new()
          term_enum = reader.terms_from(Term.new(@group_name, ""))
          begin
            if (term_enum.term() == nil)
              return bits
            end
            term_docs = reader.term_docs
            begin
              begin
                term = term_enum.term()
                break if (term.nil? or term.field != @group_name)
                term_docs.seek(term_enum)
                while term_docs.next?
                  bits.set(term_docs.doc)
                end
              end while term_enum.next?
            ensure
              term_docs.close()
            end
          ensure
            term_enum.close()
          end
          return bits
        end
      end
    end

Or perhaps you only want the 10 most popular urls and you'd like to create the filter like this;

    filter = Filter.new("url", ["url1", "url2", ..., "url10"])

This filter might look something like this;

    module Ferret::Search
      # A Filter that restricts search results to only those documents with a
      # certain field called @field_name with values in the @values array.
      class GroupFilter < Filter
        include Ferret::Index

        def initialize(field_name, values)
          @field_name = field_name
          @values = values
        end

        # Returns a BitVector with true for documents which should be permitted in
        # search results, and false for those that should not.
        def bits(reader)
          bits = Ferret::Utils::BitVector.new()
          term_enum = reader.terms_from(Term.new(@field_name, ""))
          begin
            if (term_enum.term() == nil)
              return bits
            end
            term_docs = reader.term_docs
            begin
              begin
                term = term_enum.term()
                break if (term.nil? or term.field != @field_name)
                if @values.index(term.text)
                  term_docs.seek(term_enum)
                  while term_docs.next?
                    bits.set(term_docs.doc)
                  end
                end
              end while term_enum.next?
            ensure
              term_docs.close()
            end
          ensure
            term_enum.close()
          end
          return bits
        end
      end
    end

WARNING: I haven't tested any of this code. Also, I don't know how it would perform compared to using a group_by on the database itself, although I'd be happy to hear about any performance tests you might do. I hope this helps.

Cheers,
Dave
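
The question that opened this thread was how to find the most popular urls, the equivalent of a GROUP BY in SQL. As a baseline for any performance comparison, the following is a minimal sketch of that counting step in plain Ruby. It does not call Ferret at all: most_popular and the sample urls are hypothetical, and how the values would be read out of an index is left as an assumption.

```ruby
# Sketch: the GROUP BY / ORDER BY COUNT(*) DESC idea in plain Ruby.
# most_popular and the sample data are hypothetical; nothing here uses Ferret.
def most_popular(values, limit = 10)
  counts = Hash.new(0)
  values.each { |value| counts[value] += 1 }
  # Sort by descending count, then by value so that ties have a fixed order.
  counts.sort_by { |value, count| [-count, value] }.first(limit)
end

urls = [
  "http://a.example/", "http://b.example/", "http://a.example/",
  "http://c.example/", "http://a.example/", "http://b.example/"
]

most_popular(urls, 2).each do |url, count|
  puts "#{count} #{url}"
end
```

The urls returned by such a count could then be handed, as the values array, to the second GroupFilter in the message above.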