From myowntribe at yahoo.com Wed Aug 1 03:50:24 2007
From: myowntribe at yahoo.com (Cass Amino)
Date: Wed, 1 Aug 2007 09:50:24 +0200
Subject: [Ferret-talk] How to perform search on multiple models and controllers
Message-ID: <6e5fed8aa615758cd58825d5b711d9a6@ruby-forum.com>

Hi

I am using acts_as_ferret and it works great on individual models. For
example, I can search separately for fields indexed in the BasicProfile
model and fields indexed in the WorkProfile model.

I would like to know if it is possible to run a single search across all
the fields in these two models?

Cheers
Cass
--
Posted via http://www.ruby-forum.com/.

From kraemer at webit.de Wed Aug 1 04:35:45 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Wed, 1 Aug 2007 10:35:45 +0200
Subject: [Ferret-talk] How to perform search on multiple models and controllers
In-Reply-To: <6e5fed8aa615758cd58825d5b711d9a6@ruby-forum.com>
References: <6e5fed8aa615758cd58825d5b711d9a6@ruby-forum.com>
Message-ID: <20070801083545.GE6829@cordoba.webit.de>

On Wed, Aug 01, 2007 at 09:50:24AM +0200, Cass Amino wrote:
> Hi
>
> I am using acts_as_ferret and it works great on individual models. For
> example, I can search separately for fields indexed in the BasicProfile
> model and fields indexed in the WorkProfile model.
>
> I would like to know if it is possible to run a single search across all
> the fields in these two models?

check out multi_search

Jens

--
Jens Krämer
webit!
Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

From eimorton at gmail.com Wed Aug 1 08:45:26 2007
From: eimorton at gmail.com (Erik Morton)
Date: Wed, 1 Aug 2007 08:45:26 -0400
Subject: [Ferret-talk] RDig and AAF playing together
In-Reply-To: <20070730133550.GA30135@cordoba.webit.de>
References: <0B92FEA3-EF29-4FA2-A383-492C69CEE0E3@gmail.com>
    <20070730073833.GA2198@cordoba.webit.de>
    <034C55C4-B496-42B6-A3BC-F4E9564C61DD@gmail.com>
    <20070730125812.GE2198@cordoba.webit.de>
    <3B2EA537-079A-4FD3-BDE8-05482F0E202F@gmail.com>
    <20070730133550.GA30135@cordoba.webit.de>
Message-ID: <88CBED60-B1BE-438B-B2C1-B28B8D049479@gmail.com>

If I create an IndexReader like so:

ir = IndexReader.new([index1, index2])

How can I get the "sub readers" for the two indexes? From the RDocs I
only see the ability to call ir.latest?, which results in the segfault.

Thanks again.

Erik

On Jul 30, 2007, at 9:35 AM, Jens Kraemer wrote:

> On Mon, Jul 30, 2007 at 09:18:33AM -0400, Erik Morton wrote:
>> It's strange, I'm actually getting the Bus Error anytime I call
>> latest? on RDig's index reader. The index is no longer being rebuilt.
>> It's interesting because the following lines were commented out of my
>> version of RDig:
>> # if @ferret_searcher and !@ferret_searcher.reader.latest?
>> #   # reopen searcher
>> #   @ferret_searcher.close
>> #   @ferret_searcher = nil
>> # end
>> So this has obviously happened before. I must have commented these
>> lines out myself :-/
>>
>> On linux I get the following:
>> >> RDig.searcher.ferret_searcher.reader.latest?
>> (irb):5: [BUG] Segmentation fault
>> ruby 1.8.4 (2005-12-24) [i386-linux]
>
> Ah yes :-)
>
> If your reader looks at two sub-readers for different indexes (as it
> seems to do, if I got your first mail right) you'll have to call
> latest? on
> each of the sub readers to get around this.
I do the same in
> acts_as_ferret's MultiIndex class.
>
> Jens

From kraemer at webit.de Wed Aug 1 09:09:39 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Wed, 1 Aug 2007 15:09:39 +0200
Subject: [Ferret-talk] RDig and AAF playing together
In-Reply-To: <88CBED60-B1BE-438B-B2C1-B28B8D049479@gmail.com>
References: <0B92FEA3-EF29-4FA2-A383-492C69CEE0E3@gmail.com>
    <20070730073833.GA2198@cordoba.webit.de>
    <034C55C4-B496-42B6-A3BC-F4E9564C61DD@gmail.com>
    <20070730125812.GE2198@cordoba.webit.de>
    <3B2EA537-079A-4FD3-BDE8-05482F0E202F@gmail.com>
    <20070730133550.GA30135@cordoba.webit.de>
    <88CBED60-B1BE-438B-B2C1-B28B8D049479@gmail.com>
Message-ID: <20070801130939.GI6829@cordoba.webit.de>

On Wed, Aug 01, 2007 at 08:45:26AM -0400, Erik Morton wrote:
> If I create an IndexReader like so:
>
> ir = IndexReader.new([index1, index2])
>
> How can I get the "sub readers" for the two indexes? From the RDocs I
> only see the ability to call ir.latest?, which results in the segfault.

First create two separate readers for your indexes:

reader1 = IndexReader.new(index1)
reader2 = IndexReader.new(index2)

Then build your joint reader from them:

ir = IndexReader.new([reader1, reader2])

Now you can easily use

reader1.latest? && reader2.latest?

to determine if your ir instance needs some refreshing.

Jens

--
Jens Krämer
webit!
Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

From isha.kakodkar at gmail.com Fri Aug 3 09:48:46 2007
From: isha.kakodkar at gmail.com (isha kakodkar)
Date: Fri, 3 Aug 2007 06:48:46 -0700
Subject: [Ferret-talk] Group by clause
In-Reply-To: <20070731093141.GG30135@cordoba.webit.de>
References: <87b412ce0707310212s3a072a9bv6d1711611e7da907@mail.gmail.com>
    <20070731093141.GG30135@cordoba.webit.de>
Message-ID: <87b412ce0708030648j2925d2ddu2830468d74398474@mail.gmail.com>

OK, and how do I run a find_id_by_contents search with a where-style
condition that supports "or" and "and"?

When I search with a query string like this:

Messages.full_text_search("searching, user_id:#{user_id} name:#{name}",
  {:page => (params[:page]||1), :sort => s})

it searches for all ids where user_id is the given value and name is the
given value. But if I want an "or" between the two conditions, how do I
do it?

On 7/31/07, Jens Kraemer wrote:
>
> On Tue, Jul 31, 2007 at 02:12:23AM -0700, isha kakodkar wrote:
> > Hi
> > Does acts_as_ferret support a :group clause?
> > For e.g any rails options like :select or :group etc? or is it that it
> > supports only few of such options?Like it supports :include
>
> you can put whatever AR options you like into the second argument hash
> of find_by_contents.
>
> Jens
From kraemer at webit.de Fri Aug 3 10:16:58 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Fri, 3 Aug 2007 16:16:58 +0200
Subject: [Ferret-talk] Group by clause
In-Reply-To: <87b412ce0708030648j2925d2ddu2830468d74398474@mail.gmail.com>
References: <87b412ce0707310212s3a072a9bv6d1711611e7da907@mail.gmail.com>
    <20070731093141.GG30135@cordoba.webit.de>
    <87b412ce0708030648j2925d2ddu2830468d74398474@mail.gmail.com>
Message-ID: <20070803141658.GA8232@cordoba.webit.de>

On Fri, Aug 03, 2007 at 06:48:46AM -0700, isha kakodkar wrote:
> ok...and how do i achieve find_id_by_contents search with where condition
> but support of "or" and "and"?
>
> when i search a query string and provide
> Messages.full_text_search("searching, user_id:#{user_id} name:#{name}",
> {:page =>(params[:page]||1), :sort => s})
> It searches for all ids where user_id is the given value and name is the
> given value.
> But if i want to achieve a "or" between the 2 conditions how do i do it?

By specifying 'OR' between them in your query :-)

You can also change the default behaviour to OR queries with the
:or_default Ferret configuration variable.

Jens

--
Jens Krämer
webit!
Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

From isha.kakodkar at gmail.com Fri Aug 3 10:39:46 2007
From: isha.kakodkar at gmail.com (isha kakodkar)
Date: Fri, 3 Aug 2007 07:39:46 -0700
Subject: [Ferret-talk] Group by clause
In-Reply-To: <20070803141658.GA8232@cordoba.webit.de>
References: <87b412ce0707310212s3a072a9bv6d1711611e7da907@mail.gmail.com>
    <20070731093141.GG30135@cordoba.webit.de>
    <87b412ce0708030648j2925d2ddu2830468d74398474@mail.gmail.com>
    <20070803141658.GA8232@cordoba.webit.de>
Message-ID: <87b412ce0708030739u66567329l55e66968a3ce3735@mail.gmail.com>

I tried it and didn't get the result set, but I guess I messed something
up. Thanks.

I have one more question: which SQL-style conditions can I specify with
find_id_by_contents? OR, AND... what about an 'IN' clause? Or do I have
to do that with find_by_contents, when the SQL is fired?

On 8/3/07, Jens Kraemer wrote:
>
> On Fri, Aug 03, 2007 at 06:48:46AM -0700, isha kakodkar wrote:
> > ok...and how do i achieve find_id_by_contents search with where
> condition
> > but support of "or" and "and"?
> >
> > when i search a query string and provide
> > Messages.full_text_search("searching, user_id:#{user_id} name:#{name}",
> > {:page =>(params[:page]||1), :sort => s})
> > It searches for all ids where user_id is the given value and name is the
> > given value.
> > But if i want to achieve a "or" between the 2 conditions how do i do
> it?
>
> By specifying 'OR' between them in your query :-)
>
> You can also change the default behaviour to OR queries with the
> :or_default Ferret configuration variable.
>
> Jens
>
> --
> Jens Krämer
> webit!
Gesellschaft für neue Medien mbH
> Schnorrstraße 76 | 01069 Dresden
> Telefon +49 351 46766-0 | Telefax +49 351 46766-66
> kraemer at webit.de | www.webit.de
>
> Amtsgericht Dresden | HRB 15422
> GF Sven Haubold, Hagen Malessa

From stone1549 at gmail.com Fri Aug 3 11:12:59 2007
From: stone1549 at gmail.com (Joe Smith)
Date: Fri, 3 Aug 2007 17:12:59 +0200
Subject: [Ferret-talk] can't search for OR (as in the state)
Message-ID: <29b5ebfa3996859911e37ca055de4d1b@ruby-forum.com>

I'm trying to search a Model by the state field using Acts As Ferret.
The query for this is '+state:NY' (substitute state abbreviation for
NY). This works fine; however, '+state:OR' returns nothing, though just
'portland' will pull up matches within that state.

I'm pretty sure it's reading OR as an or conditional instead of a state.
Any way to escape it to fix this issue?

From bill.burcham at gmail.com Fri Aug 3 13:02:10 2007
From: bill.burcham at gmail.com (Bill Burcham)
Date: Fri, 3 Aug 2007 19:02:10 +0200
Subject: [Ferret-talk] StandardTokenizer Doesn't Support token_stream method
Message-ID: <016bd4464104f39d0a6d4e7f79960de2@ruby-forum.com>

According to the Analyzer doc and the StandardTokenizer doc:

http://ferret.davebalmain.com/api/classes/Ferret/Analysis/Analyzer.html
http://ferret.davebalmain.com/api/classes/Ferret/Analysis/StandardTokenizer.html

I ought to be able to construct a StandardTokenizer like this:

t = StandardTokenizer.new( true) # true to downcase tokens

and then later:

stream = token_stream( ignored_field_name, some_string)

To create a new TokenStream from some_string.
This approach would be valuable for my application since I am analyzing
many short strings, so I'm thinking that not having to rebuild my 5-deep
analyzer chain for each small string will be a nice savings.

Unfortunately, StandardTokenizer#initialize does not work as advertised.
It takes a string, not a boolean. So it does not support the reuse model
from the documentation cited above.

If you have a look at the "source" link on the StandardTokenizer
documentation for "new":

http://ferret.davebalmain.com/api/classes/Ferret/Analysis/StandardTokenizer.html#

You'll see that the rdoc comment apparently lies :) The formal parameter
that should hold "lower" is named "rstr". Fishy.

A quick look indicates that WhiteSpaceTokenizer has a similar mismatch
with its documentation.

Is there an idiomatic way to reuse analyzer chains?

From andreas.korth at gmx.net Fri Aug 3 14:54:58 2007
From: andreas.korth at gmx.net (Andreas Korth)
Date: Fri, 3 Aug 2007 20:54:58 +0200
Subject: [Ferret-talk] can't search for OR (as in the state)
References: <59E547E4-2ED5-4F12-9132-04E04DBEBEFC@gmx.net>
Message-ID: <2188D5DF-DB87-4A95-AD1C-DA5A8F584FEE@gmx.net>

On 03.08.2007, at 17:12, Joe Smith wrote:

> I'm trying to search a Model by the state field using Acts As Ferret.
> The query for this is '+state:NY' (substitute state abbreviation for
> NY). This works find however '+state:OR' returns nothing, though just
> 'portland' will pull up matches within that state.

Ferret::Analysis::FULL_ENGLISH_STOP_WORDS.include?('or')
=> true

> I'm pretty sure it's reading OR as an or conditional instead of a
> state.
> Anyway to escape it to fix this issue?

Index the state field untokenized:

field_infos.add_field(:state, :index => :untokenized, ...)
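
To illustrate why a tokenized state field loses "OR" while an
untokenized one keeps it, here is a toy sketch in plain Ruby. It is not
Ferret's analyzer: STOP_WORDS is a small made-up subset of a stop-word
list, and tokenized_terms / untokenized_terms are hypothetical helper
names used only for this illustration.

```ruby
# Toy model of a lowercasing analyzer with a stop-word filter.
# STOP_WORDS is a tiny hypothetical subset, not Ferret's actual list.
STOP_WORDS = %w[a an and or the of].freeze

# Tokenized field: split into words, downcase, drop stop words.
def tokenized_terms(value)
  value.downcase.scan(/\w+/).reject { |term| STOP_WORDS.include?(term) }
end

# Untokenized field: the whole value is kept as one exact term.
def untokenized_terms(value)
  [value]
end

p tokenized_terms("NY")    # "ny" survives analysis
p tokenized_terms("OR")    # "or" is a stop word, so nothing is indexed
p untokenized_terms("OR")  # the exact value is kept
```

Whether the query side needs adjusting as well depends on the analyzer
used at search time, which this sketch does not model.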

Cheers,
Andy

From john at digitalpulp.com Fri Aug 3 16:17:55 2007
From: john at digitalpulp.com (John Bachir)
Date: Fri, 3 Aug 2007 16:17:55 -0400
Subject: [Ferret-talk] "no such file to load -- ferret_extensions"
Message-ID: <54F92792-3602-4511-A1AA-16D7003DFF6A@digitalpulp.com>

I get this error when stopping the DRb server via capistrano.

"no such file to load -- ferret_extensions"

I see that lib/ferret_extensions is a newly required file in 0.4.1...
any reason ferret can't see it? The Rails app server starts fine.

thanks
j

From john at digitalpulp.com Fri Aug 3 16:27:07 2007
From: john at digitalpulp.com (John Bachir)
Date: Fri, 3 Aug 2007 16:27:07 -0400
Subject: [Ferret-talk] "no such file to load -- ferret_extensions"
In-Reply-To: <54F92792-3602-4511-A1AA-16D7003DFF6A@digitalpulp.com>
References: <54F92792-3602-4511-A1AA-16D7003DFF6A@digitalpulp.com>
Message-ID:

On Aug 3, 2007, at 4:17 PM, John Bachir wrote:

> I get this error when stopping the DRb server via capistrano.
>
> "no such file to load -- ferret_extensions"
>

UGH, figured it out. It was the only new file in the new version, and I
had forgotten to add it to my repository.

From vincent at undefinedrange.com Sat Aug 4 01:17:52 2007
From: vincent at undefinedrange.com (Vincent Woo)
Date: Sat, 4 Aug 2007 07:17:52 +0200
Subject: [Ferret-talk] Can't find using instance method as field
Message-ID: <7cc99508dde1eeeda35a4d5fdda45bf3@ruby-forum.com>

Hi,

I just started experimenting with acts_as_ferret. It's working great if
I'm searching through database fields through a setup like:

acts_as_ferret :fields => ['title']
Page.find_by_contents('Blue')

But if I try searching through an instance method, I get no results
back. Anybody know why?

acts_as_ferret :fields => ['extra']
def extra
  'Example'
end

# after deleting the index directory, running "reload!"
in the console:
>> Page.find_by_contents('Example')
=> #

It is reindexing since the development.log has many lines like:

creating doc for class: Page, id: 9
Adding field extra with value 'Example' to index
creating doc for class: Page, id: 10
Adding field extra with value 'Example' to index

From ror_dave at yahoo.com Sat Aug 4 09:59:27 2007
From: ror_dave at yahoo.com (dave developer)
Date: Sat, 4 Aug 2007 06:59:27 -0700 (PDT)
Subject: [Ferret-talk] Running two ferret servers for two different applications on the same box
Message-ID: <376059.4242.qm@web63203.mail.re1.yahoo.com>

Hi!

I have a situation where we want to set up two different rails
applications on the same server, and both have ferret-related
functionality that needs to be implemented. How does one set up the
ferret_server.yml file and then start each ferret server so that each of
the two applications talks to its own running ferret server?

I've tried changing the ports in the ferret_server.yml file, but I can't
seem to start the second server using its port when I run
vendor/plugins/acts_as_ferret/script/ferret_start. I get the 'bind
error, address already in use'.

Any help would be appreciated.

Thanks!
Dave
From vincent at undefinedrange.com Sat Aug 4 15:05:45 2007
From: vincent at undefinedrange.com (Vincent Woo)
Date: Sat, 4 Aug 2007 21:05:45 +0200
Subject: [Ferret-talk] Can't find using instance method as field
In-Reply-To: <7cc99508dde1eeeda35a4d5fdda45bf3@ruby-forum.com>
References: <7cc99508dde1eeeda35a4d5fdda45bf3@ruby-forum.com>
Message-ID: <6f14ad2416811668dd2ef93786018754@ruby-forum.com>

It seems that I needed to exit and reenter the irb console before
changes are noticed.

From eimorton at gmail.com Sun Aug 5 13:17:05 2007
From: eimorton at gmail.com (Erik Morton)
Date: Sun, 5 Aug 2007 13:17:05 -0400
Subject: [Ferret-talk] IO Errors on deleting documents with Ferret
Message-ID: <9C148DF7-6469-4027-8772-136F296B9E71@gmail.com>

I have a large index (~6GB, ~1 million docs) that was built by RDig. I
wrote a script to iterate through the index to clear out some duplicate
information to try to reduce the size of the index.

clients.each {|client|
  docs = RDig.searcher.search("+supplier_id:#{client.id}")
  docs.each {|doc|
    data = doc[:data].dup #the contents of the web page
    new_results = {}
    new_results[:client_id] = client.id
    new_results[:data] = data
    index.delete doc[:doc_id]
    index << new_results
  }
}

I've run a similar script before with no issues. However today I
received the following error after 30 minutes or so:

/usr/lib/ruby/site_ruby/1.8/ferret/index.rb:726:in `initialize': IO Error occured at :93 in xraise (IOError)
Error occured in index.c:901 - sis_find_segments_file
Error reading the segment infos.
Store listing was
from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:726:in `ensure_reader_open'
from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:434:in `delete'
from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:8:in `synchrolock'
from /usr/lib/ruby/1.8/monitor.rb:229:in `synchronize'
from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:8:in `synchrolock'
from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:428:in `delete'

Despite the error the index appears not to be corrupted, so I ran the
script again for fun. The following error occurred after approximately
20 minutes:

/usr/lib/ruby/site_ruby/1.8/ferret/index.rb:723:in `close': IO Error occured at :93 in xraise (IOError)
Error occured in fs_store.c:264 - fs_new_output
couldn't create OutStream /mnt/apps/search/current/../../shared/indexes/final/_a4kx.prx:
from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:723:in `ensure_reader_open'
from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:434:in `delete'
from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:8:in `synchrolock'
from /usr/lib/ruby/1.8/monitor.rb:229:in `synchronize'
from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:8:in `synchrolock'
from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:428:in `delete'

Here are the contents of the index directory:

[root@files]# ls -Al ../../shared/indexes/final/
total 5628324
-rw------- 1 initiate initiate 5713121647 Jul 31 14:22 _5d3s.cfs
-rw------- 1 root root 115159 Aug 5 12:55 _5d3s_2yyy.del
-rw------- 1 root root 22937900 Aug 5 11:28 _7tgc.cfs
-rw------- 1 root root 11475 Aug 5 12:55 _7tgc_sx7.del
-rw------- 1 root root 2220338 Aug 5 11:38 _820z.cfs
-rw------- 1 root root 2311840 Aug 5 11:47 _8alm.cfs
-rw------- 1 root root 2261887 Aug 5 11:56 _8j69.cfs
-rw------- 1 root root 2089120 Aug 5 12:05 _8rqw.cfs
-rw------- 1 root root 2244470 Aug 5 12:14 _90bj.cfs
-rw------- 1 root root 2249160 Aug 5 12:22 _98w6.cfs
-rw------- 1 root root 2231091 Aug 5 12:31 _9hgt.cfs
-rw------- 1 root root 2244881 Aug 5 12:40 _9q1g.cfs
-rw------- 1 root root
2273703 Aug 5 12:48 _9ym3.cfs
-rw------- 1 root root 235566 Aug 5 12:49 _9zgy.cfs
-rw------- 1 root root 220959 Aug 5 12:50 _a0bt.cfs
-rw------- 1 root root 229074 Aug 5 12:51 _a16o.cfs
-rw------- 1 root root 202310 Aug 5 12:52 _a21j.cfs
-rw------- 1 root root 135823 Aug 5 12:53 _a2we.cfs
-rw------- 1 root root 132935 Aug 5 12:54 _a3r9.cfs
-rw------- 1 root root 14190 Aug 5 12:54 _a3uc.cfs
-rw------- 1 root root 13868 Aug 5 12:54 _a3xf.cfs
-rw------- 1 root root 13758 Aug 5 12:54 _a40i.cfs
-rw------- 1 root root 14912 Aug 5 12:54 _a43l.cfs
-rw------- 1 root root 13750 Aug 5 12:54 _a46o.cfs
-rw------- 1 root root 14170 Aug 5 12:54 _a49r.cfs
-rw------- 1 root root 13764 Aug 5 12:55 _a4cu.cfs
-rw------- 1 root root 13719 Aug 5 12:55 _a4fx.cfs
-rw------- 1 root root 13115 Aug 5 12:55 _a4j0.cfs
-rw------- 1 root root 1826 Aug 5 12:55 _a4jb.cfs
-rw------- 1 root root 1935 Aug 5 12:55 _a4jm.cfs
-rw------- 1 root root 1739 Aug 5 12:55 _a4jx.cfs
-rw------- 1 root root 1865 Aug 5 12:55 _a4k8.cfs
-rw------- 1 root root 2072 Aug 5 12:55 _a4kj.cfs
-rw------- 1 root root 1733 Aug 5 12:55 _a4ku.cfs
-rw------- 1 root root 378 Aug 5 12:55 _a4kv.cfs
-rw------- 1 root root 462 Aug 5 12:55 _a4kw.cfs
-rw------- 1 root root 128 Aug 5 12:55 _a4kx.fdt
-rw------- 1 root root 0 Aug 5 12:55 _a4kx.fdx
-rw------- 1 root root 0 Aug 5 12:55 _a4kx.frq
-rw------- 1 root root 0 Aug 5 12:55 _a4kx.tfx
-rw------- 1 root root 0 Aug 5 12:55 _a4kx.tis
-rw------- 1 root root 0 Aug 5 12:55 _a4kx.tix
-rw------- 1 root root 0 Aug 5 12:55 ferret-write.lck
-rw------- 1 initiate initiate 16 Aug 5 12:55 segments
-rw------- 1 root root 1142 Aug 5 12:55 segments_isfj

Here's my platform:

Linux xenU #1 SMP Thu Nov 30 13:48:50 SAST 2006 i686 athlon i386 GNU/Linux

I'm using ruby 1.8.4 and Ferret 0.11.4, which has been hacked to add in
better large file support.

Does anyone have any idea what's going on? Many thanks in advance.

Erik

From bk at benjaminkrause.com Mon Aug 6 03:32:37 2007
From: bk at benjaminkrause.com (Benjamin Krause)
Date: Mon, 6 Aug 2007 09:32:37 +0200
Subject: [Ferret-talk] IO Errors on deleting documents with Ferret
In-Reply-To: <9C148DF7-6469-4027-8772-136F296B9E71@gmail.com>
References: <9C148DF7-6469-4027-8772-136F296B9E71@gmail.com>
Message-ID: <5FA0DC0D-1210-4DBF-8BFE-98B66E629E22@benjaminkrause.com>

On 2007-08-05, at 7:17 PM, Erik Morton wrote:

> Error occured in index.c:901 - sis_find_segments_file
> Error reading the segment infos. Store listing was
>
> couldn't create OutStream /mnt/apps/search/current/../../shared/indexes/final/_a4kx.prx:

Hey ..

Both errors might have the same cause: too many open files. I had
similar errors some months ago and raised my open files limit to 32k,
and I haven't had an error since.

rails@omdb.org ~ $ ulimit -n 32768

Benjamin

From starburger234 at yahoo.de Tue Aug 7 14:23:44 2007
From: starburger234 at yahoo.de (Star Burger)
Date: Tue, 7 Aug 2007 20:23:44 +0200
Subject: [Ferret-talk] Varying case sensitivity
Message-ID: <3864066bfb8124a78692e5c08cf9fb82@ruby-forum.com>

Hi all,

I'm using ferret 11.4 together with acts_as_ferret and I've indexed the
geonames.org country files. These files contain worldwide locations in
UTF-8, each with all its different spellings.

Model definition is like this:

class location
  acts_as_ferret :fields => {:location_names => {}}, :single_index => true
  ...
end

The instance method location_names returns a string containing all the
different, UTF-8 coded spellings for this location.

Problem:

Sometimes the search is case sensitive and sometimes not. E.g. it finds
"stuttgart" and "Stuttgart". It finds "München" but does NOT find
"münchen". It only finds "Überlingen" and not "überlingen".

My feeling is that for locations with "special characters" it behaves
case sensitive...

My goal is not to be case sensitive.

Thanks for your help,

Starburger

From starburger234 at yahoo.de Tue Aug 7 14:34:03 2007
From: starburger234 at yahoo.de (Star Burger)
Date: Tue, 7 Aug 2007 20:34:03 +0200
Subject: [Ferret-talk] Varying case sensitivity
In-Reply-To: <3864066bfb8124a78692e5c08cf9fb82@ruby-forum.com>
References: <3864066bfb8124a78692e5c08cf9fb82@ruby-forum.com>
Message-ID: <7b90fc11b6987c31d3702ce034036a23@ruby-forum.com>

BTW my locale settings in environment.rb are

ENV['LANG'] = 'de_DE.UTF-8@euro'
ENV['LC_TIME'] = 'C'
require 'acts_as_ferret'

From ferret-talk at stuartsierra.com Tue Aug 7 16:54:22 2007
From: ferret-talk at stuartsierra.com (Stuart Sierra)
Date: Tue, 07 Aug 2007 16:54:22 -0400
Subject: [Ferret-talk] Varying case sensitivity
In-Reply-To: <3864066bfb8124a78692e5c08cf9fb82@ruby-forum.com>
References: <3864066bfb8124a78692e5c08cf9fb82@ruby-forum.com>
Message-ID: <46B8DBFE.4060909@stuartsierra.com>

Reply is below the quote.

Star Burger wrote:
> Hi all,
>
> I'm using ferret 11.4 together with acts_as_ferret and I've indexed the
> geonames.org country files. These files contain worldwide locations in
> UTF-8 with all their different spellings each.
>
> Model definition is like this:
>
> class location
> acts_as_ferret :fields => {:location_names => {}}, :single_index =>
> true
> ...
> end
>
> The instance method location_names returns a string containing all the
> different, UTF-8 coded spellings for this location.
>
>
> Problem:
>
> Sometimes the search is case sensitive and sometimes not. E.g. it finds
> "stuttgart" and "Stuttgart". It finds "München" but does NOT find
> "münchen". It only finds "Überlingen" and not "überlingen".
>
> My feeling is that for locations with "special characters" it behaves
> case sensitive...
>
> My goal is not to be case sensitive.
>
> Thanks for your help,
>
> Starburger

Star Burger wrote:
> BTW my locale settings in environment.rb are
>
> ENV['LANG'] = 'de_DE.UTF-8@euro'
> ENV['LC_TIME'] = 'C'
> require 'acts_as_ferret'

Ferret's LowerCaseFilter (which converts tokens and queries to lower
case) uses the C function towlower() [1] to convert multi-byte
characters (e.g. UTF-8 characters with accents) to lower case. Maybe the
Ferret code does not inherit the correct locale from environment.rb? I'm
not sure how to fix this; perhaps someone else does.

[1]: http://www.opengroup.org/pubs/online/7908799/xsh/towlower.html

-Stuart

From mrj at bigpond.net.au Tue Aug 7 19:09:52 2007
From: mrj at bigpond.net.au (Mark Reginald James)
Date: Wed, 08 Aug 2007 09:09:52 +1000
Subject: [Ferret-talk] :store => :yes doesn't work in some cases
In-Reply-To: <8b21e2d4ed474feec60ccfa9636f1165@ruby-forum.com>
References: <3718a6efdb1006fc97cfc328b27f2023@ruby-forum.com>
    <20070620094917.GF22469@cordoba.webit.de>
    <8b21e2d4ed474feec60ccfa9636f1165@ruby-forum.com>
Message-ID:

Jesse Grosjean wrote:
> Thanks I was missing that part. Now it's working, or almost. The one
> trouble spot is that when I highlight my results the models do get
> loaded. I guess this is because FerretResult doesn't have a highlight
> method, and that causes it to load the underlying model.
>
> To get around this I've changed from:
>
> result.highlight(...)
>
> to:
>
> result.model.aaf_index.highlight(result.id, result.model.name...)
>
> After doing that the database is no longer hit when displaying
> highlighted ferret results, but to do that I needed to add
> "attr_accessor :model" to ActsAsFerret > ResultAttributes. Is there a
> better way to do that? If not could you add that attr_accessor to ferret
> proper?
>
> It might also make sense to add the highlight method to
> FerretResult to avoid the model load, but I'd still like some way to
> access the model class without requiring a model load since I need that
> to generate the right link to the original page for each result.

I just posted a ticket that implements this, preventing use of the
highlight method from triggering unnecessary loading of the real AR
record:

http://projects.jkraemer.net/acts_as_ferret/ticket/161

--
We develop, watch us RoR, in numbers too big to ignore.

From isha.kakodkar at gmail.com Wed Aug 8 04:48:50 2007
From: isha.kakodkar at gmail.com (isha kakodkar)
Date: Wed, 8 Aug 2007 01:48:50 -0700
Subject: [Ferret-talk] Group by clause
In-Reply-To: <87b412ce0708030739u66567329l55e66968a3ce3735@mail.gmail.com>
References: <87b412ce0707310212s3a072a9bv6d1711611e7da907@mail.gmail.com>
    <20070731093141.GG30135@cordoba.webit.de>
    <87b412ce0708030648j2925d2ddu2830468d74398474@mail.gmail.com>
    <20070803141658.GA8232@cordoba.webit.de>
    <87b412ce0708030739u66567329l55e66968a3ce3735@mail.gmail.com>
Message-ID: <87b412ce0708080148i2e50cbadxf401f12ffc5f3237@mail.gmail.com>

Hi,

When providing conditions for an index search, do I have to give the
exact value, or can I specify a condition like "id>4"? I tried but
couldn't get it working. Can you please help?

On 8/3/07, isha kakodkar wrote:
>
> I tried it and dint get the resultset..But i messed up something i
> guess..Thanks.
> I had something more to it also...
> Which all sql conditions i can specify with find_id_by_contents?OR,AND...
> what about 'IN' clause?
> Or do i have to do that with find_by_contents when the sql is getting
> fired...
From marcel.scherf at gmail.com Wed Aug 8 05:40:36 2007
From: marcel.scherf at gmail.com (Marcel Scherf)
Date: Wed, 8 Aug 2007 11:40:36 +0200
Subject: [Ferret-talk] Escaping special characters before highlighting
Message-ID: <2fa77eeb6dd09ac7e3a711049af8d43e@ruby-forum.com>

Hi all.

I need to perform some operation on the results of a ferret search
before it gets highlighted. Is there a way to do this?
This is my code which calls the highlighting inside a Rails View: <%= result.ferret_highlight(@query, :field => :ferret_text, :pre_tag => '', :post_tag => '') %> -- Posted via http://www.ruby-forum.com/. From henke at mac.se Wed Aug 8 05:22:21 2007 From: henke at mac.se (Henrik Zagerholm) Date: Wed, 8 Aug 2007 11:22:21 +0200 Subject: [Ferret-talk] Highlighting broken in TRUNK Message-ID: <3CE321C2-2822-4CE4-88EF-773B33A1DB20@mac.se> Hello list, As there is still some large file support bugs in the 0.11.4 release I had to download the trunk and apply a patch sent in by kyle http:// ferret.davebalmain.com/trac/ticket/215. The problem is now that the highlighting doesn't work. Somehow it combines excerpt_length with num_excerpts so if you have and excerpt_length of 50 and num_excerpts of 5 and you get only one hit when searching the excerpts is either the whole text or 50*5 = 250. So everytime you get a result with less hits than num_excerpts the highlighting goes crazy. Comments, Fixes, Patches are all welcome :) Cheers, Henrik From isha.kakodkar at gmail.com Wed Aug 8 10:19:25 2007 From: isha.kakodkar at gmail.com (isha kakodkar) Date: Wed, 8 Aug 2007 07:19:25 -0700 Subject: [Ferret-talk] indexing only the changed values In-Reply-To: <87b412ce0707302355w7ffce9afs4a45b3eb8631a359@mail.gmail.com> References: <87b412ce0707300532m16ca253eie34f24d3deb7a52d@mail.gmail.com> <5B51C217-620D-4EC8-9691-99F1F868176C@benjaminkrause.com> <87b412ce0707302355w7ffce9afs4a45b3eb8631a359@mail.gmail.com> Message-ID: <87b412ce0708080719j187ead21rd2a36b8ae191cb58@mail.gmail.com> hi,i needed some help on specifying conditions for index searching on filesystem. Along with the main query string,i can give the indexed_field:value,so that it searches the query string at that indexed row. But can i specify a indexed_field > value? or i have to give the exact value? On 7/30/07, isha kakodkar wrote: > > ok thanks... 
> Can i specify :group clause while searching....I want to apply a groupBy > clause in sql to get the resultset and index it... > Can i do it with Ferret? > > > On 7/30/07, Benjamin Krause wrote: > > > > Hey.. > > > > Ferret does not allow you to update information in > > the index. You can either add information or remove > > information. There is no SQL-like 'UPDATE' > > statement. Ferret or AAF - whatever you use - need > > all information, if you want to index something. And > > AAF does not keep track of changes in the model, > > so it will not allow you to add a flag like > > :only_index_dependencies_if_model_really_changed :) > > > > So if you want to avoid your extra queries, build some > > caching inside your model classes. This has nothing to > > do with Ferret. E.g. try storing your data in a memcache > > with a short TTL or think of any other caching mechanism > > that will work for your model, regardless of Ferret. > > > > > > Ben > > > > > > > > On 2007-07-30, at 2:32 PM, isha kakodkar wrote: > > > > > Hi all, > > > i have model A which has a field indexed from model B. model A > > > belongs to model B. > > > So whenever i insert a row in model 'A', a query is fired to the > > > field from model 'B' even though the data was not changed for the > > > field in model B. > > > Can i somehow avoid these extra queries,or rather query the data > > > and index it,only if the data has been changed>? > > > e.g model A { > > > message > > > chat Name > > > } > > > > > > > > > model B{ > > > chat Name > > > } > > > > > > > > > For 1 chatName there are 10 messages.so i want to index chatName > > > once per 10 messages and not 10 times.How do i do this? 
> > > _______________________________________________ > > > Ferret-talk mailing list > > > Ferret-talk at rubyforge.org > > > http://rubyforge.org/mailman/listinfo/ferret-talk > > > > _______________________________________________ > > Ferret-talk mailing list > > Ferret-talk at rubyforge.org > > http://rubyforge.org/mailman/listinfo/ferret-talk > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20070808/c7c1e437/attachment.html From ferret-talk at stuartsierra.com Wed Aug 8 11:33:33 2007 From: ferret-talk at stuartsierra.com (Stuart Sierra) Date: Wed, 08 Aug 2007 11:33:33 -0400 Subject: [Ferret-talk] Highlighting broken in TRUNK In-Reply-To: <3CE321C2-2822-4CE4-88EF-773B33A1DB20@mac.se> References: <3CE321C2-2822-4CE4-88EF-773B33A1DB20@mac.se> Message-ID: <46B9E24D.30201@stuartsierra.com> Henrik Zagerholm wrote: > As there is still some large file support bugs in the 0.11.4 release > I had to download the trunk and apply a patch sent in by kyle http:// > ferret.davebalmain.com/trac/ticket/215. > > The problem is now that the highlighting doesn't work. > > Somehow it combines excerpt_length with num_excerpts so if you have > and excerpt_length of 50 and num_excerpts of 5 and you get only one > hit when searching the excerpts is either the whole text or 50*5 = 250. > > So everytime you get a result with less hits than num_excerpts the > highlighting goes crazy. I've had the same problem with the large file patch. I couldn't figure out a fix, but I worked around it by trimming the excerpts to the last 150 characters: highlights.collect do |excerpt| if excerpt.length <= 150 excerpt else excerpt[-150..-1] end end This is ugly, and it chops up words, but it seems to work. I wondered if this might have something to do with Ferret's low-level IO routines: is_read_u64, os_read_u64, is_read_u32, etc. 
Maybe the wrong ones are being used for reading and writing the offsets in the segment file, now that Kyle's patch has changed the offsets from type int to type offset_t. -Stuart Sierra From casey at nerdle.com Wed Aug 8 13:36:51 2007 From: casey at nerdle.com (Casey Forbes) Date: Wed, 8 Aug 2007 13:36:51 -0400 (EDT) Subject: [Ferret-talk] Filtering out low scoring matches with acts_as_ferret (fwd) Message-ID: Hi all, I'm using fuzzy~ queries which can return very poor matches. What is the most elegant way to filter out scores below some threshold? I know that I can do my own thing with find_id_by_contents but I'd like to filter out less relevant results at a lower level so that I can use :limit/:offset for pagination and all that. What am I missing? Thanks, Casey From cpjolicoeur at gmail.com Wed Aug 8 16:16:30 2007 From: cpjolicoeur at gmail.com (Craig Jolicoeur) Date: Wed, 8 Aug 2007 22:16:30 +0200 Subject: [Ferret-talk] issues with index for table with over 18 million records Message-ID: <76b3f8d0ba00908cc6f11a82364190b4@ruby-forum.com> I have a MySQL table with over 18 million records in it. We are indexing about 10 fields in this table with ferret. I am having problems with the initial building of the index. I created a rake task to run the "Model.rebuild_index" command in the background. That process ran fine for about 2.5 days before it just suddenly stopped. The log/ferret_index.log file says it got to about 28% before ending. I'm not sure if the process died because of something on my server or because of something related to ferret. It appears that it will take close to 10 days for the full index to be build with rebuild_index? Is this normal for a table of this size? Also, is there a way to start where the index ended and update from there instead of having to rebuild the entire index from scratch? I got about 28% of the way through so would like to not have to waste the 2.5 days to rebuild that part again trying to get the full index 100% built. 
Also, is there a way that I can non-destructive rebuild the index since it didnt complete 100%? Meaning, can I rebuild it without overwriting what is already there? That way I can keep what I have to be searched while the rebuild takes place and then move that over the old index? I'm not running ferret as a Drb server so I dont know if I can. Also, is there a faster or better way that I can/should be building the index? Will I have an issue with the index file sizes with a DB this size? -- Posted via http://www.ruby-forum.com/. From eimorton at gmail.com Wed Aug 8 16:50:19 2007 From: eimorton at gmail.com (Erik Morton) Date: Wed, 8 Aug 2007 16:50:19 -0400 Subject: [Ferret-talk] issues with index for table with over 18 million records In-Reply-To: <76b3f8d0ba00908cc6f11a82364190b4@ruby-forum.com> References: <76b3f8d0ba00908cc6f11a82364190b4@ruby-forum.com> Message-ID: <65132D70-09EE-4A2D-B607-341DC8EA4B50@gmail.com> We have a 1 million record index that is about 6GB in size. We build it in parallel w/out AAF so it's hard to comment on the speed of your index build. However I will say that I did need to manually patch Ferret to better handle large indexes. 
Here is the diff:

--- /usr/local/lib/ruby/gems/1.8/gems/ferret-0.11.4/ext/index.c
+++ index.c
@@ -1375,7 +1375,7 @@
     lazy_doc = lazy_doc_new(stored_cnt, fdt_in);
 
     for (i = 0; i < stored_cnt; i++) {
-        int start = 0, end, data_cnt;
+        off_t start = 0, end, data_cnt;
         field_num = is_read_vint(fdt_in);
         fi = fr->fis->fields[field_num];
         data_cnt = is_read_vint(fdt_in);
@@ -1449,7 +1449,7 @@
         if (store_offsets) {
             int num_positions = tv->offset_cnt = is_read_vint(fdt_in);
             Offset *offsets = tv->offsets = ALLOC_N(Offset, num_positions);
-            int offset = 0;
+            off_t offset = 0;
             for (i = 0; i < num_positions; i++) {
                 offsets[i].start = offset += is_read_vint(fdt_in);
                 offsets[i].end = offset += is_read_vint(fdt_in);
@@ -1683,8 +1683,8 @@
             int last_end = 0;
             os_write_vint(fdt_out, offset_count); /* write shared prefix length */
             for (i = 0; i < offset_count; i++) {
-                int start = offsets[i].start;
-                int end = offsets[i].end;
+                off_t start = offsets[i].start;
+                off_t end = offsets[i].end;
                 os_write_vint(fdt_out, start - last_end);
                 os_write_vint(fdt_out, end - start);
                 last_end = end;
@@ -4799,7 +4799,7 @@
  *
  ****************************************************************************/
 
-Offset *offset_new(int start, int end)
+Offset *offset_new(off_t start, off_t end)
 {
     Offset *offset = ALLOC(Offset);
     offset->start = start;

On Aug 8, 2007, at 4:16 PM, Craig Jolicoeur wrote:

> I have a MySQL table with over 18 million records in it. We are
> indexing about 10 fields in this table with ferret.
>
> I am having problems with the initial building of the index. I
> created
> a rake task to run the "Model.rebuild_index" command in the
> background.
> That process ran fine for about 2.5 days before it just suddenly
> stopped. The log/ferret_index.log file says it got to about 28%
> before
> ending. I'm not sure if the process died because of something on my
> server or because of something related to ferret.
> > It appears that it will take close to 10 days for the full index to be > build with rebuild_index? Is this normal for a table of this size? > Also, is there a way to start where the index ended and update from > there instead of having to rebuild the entire index from scratch? > I got > about 28% of the way through so would like to not have to waste the > 2.5 > days to rebuild that part again trying to get the full index 100% > built. > > Also, is there a way that I can non-destructive rebuild the index > since > it didnt complete 100%? Meaning, can I rebuild it without overwriting > what is already there? That way I can keep what I have to be searched > while the rebuild takes place and then move that over the old index? > I'm not running ferret as a Drb server so I dont know if I can. > > Also, is there a faster or better way that I can/should be building > the > index? Will I have an issue with the index file sizes with a DB this > size? > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk From cpjolicoeur at gmail.com Wed Aug 8 16:53:57 2007 From: cpjolicoeur at gmail.com (Craig Jolicoeur) Date: Wed, 8 Aug 2007 22:53:57 +0200 Subject: [Ferret-talk] issues with index for table with over 18 million records In-Reply-To: <65132D70-09EE-4A2D-B607-341DC8EA4B50@gmail.com> References: <76b3f8d0ba00908cc6f11a82364190b4@ruby-forum.com> <65132D70-09EE-4A2D-B607-341DC8EA4B50@gmail.com> Message-ID: Erik Morton wrote: > We have a 1 million record index that is about 6GB in size. We build > it in parallel w/out AAF so it's hard to comment on the speed of your > index build. However I will say that I did need to manually patch > Ferret to better handle large indexes. > Erik, What issues did you find that caused you to patch the ferret code? ALso, you say you build the index in parallel w/out AAF; how do you do that? 
Not sure I'm following how to do that so if you can explain, I'd appreciate it. -- Posted via http://www.ruby-forum.com/. From eimorton at gmail.com Wed Aug 8 17:16:30 2007 From: eimorton at gmail.com (Erik Morton) Date: Wed, 8 Aug 2007 17:16:30 -0400 Subject: [Ferret-talk] issues with index for table with over 18 million records In-Reply-To: References: <76b3f8d0ba00908cc6f11a82364190b4@ruby-forum.com> <65132D70-09EE-4A2D-B607-341DC8EA4B50@gmail.com> Message-ID: <19E23F34-BEB7-44DF-B419-7463B5E4AA37@gmail.com> We had to patch it because we were getting seemingly random errors while searching a 2GB+ index. This the trac ticket: http:// ferret.davebalmain.com/trac/ticket/215. The patch I included changes some ints to off_t's, which solved the problem. As far as I know this patch was never applied to the trunk. We build our index using a modified version of RDig. We basically run up to 80 EC2 servers in parallel to create 80 separate indexes, which we later combine into a single index. You could follow a similar route and still have AAF mange the index after it is built. You'd need to make sure that the documents created by RDig/whatever have the same fields that AAF expects. Erik On Aug 8, 2007, at 4:53 PM, Craig Jolicoeur wrote: > Erik Morton wrote: >> We have a 1 million record index that is about 6GB in size. We build >> it in parallel w/out AAF so it's hard to comment on the speed of your >> index build. However I will say that I did need to manually patch >> Ferret to better handle large indexes. >> > > > Erik, > > What issues did you find that caused you to patch the ferret code? > > ALso, you say you build the index in parallel w/out AAF; how do you do > that? Not sure I'm following how to do that so if you can explain, > I'd > appreciate it. > -- > Posted via http://www.ruby-forum.com/. 
> _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk From allenmacyoung at gmail.com Thu Aug 9 05:08:16 2007 From: allenmacyoung at gmail.com (Allen Young) Date: Thu, 9 Aug 2007 11:08:16 +0200 Subject: [Ferret-talk] no such file to load -- acts_as_ferret (MissingSourceFile) Message-ID: <151eb9ce20d44b1bd1e7d71f85db2268@ruby-forum.com> Hi I've been using acts_as_ferret for a few weeks on my ibook and everything has been going just fine. But when I migrated my project onto windows xp sp2, acts_as_ferret seemed to become broken. I just put my project into a svn repository from ibook and checked it out into a windows pc. I got ruby1.8.5, rails1.2.3, ferret0.11.4 installed as gems. After I checked out the project, I got no such file to load -- acts_as_ferret (MissingSourceFile) error while trying to start server. I've tried to delete the acts_as_ferret plugin and reinstall it. The reinstallation is successful, but the error is still there. Does anyone has any ideas about this? Thanks a lot. Allen -- Posted via http://www.ruby-forum.com/. From andreas.korth at gmx.net Thu Aug 9 05:12:55 2007 From: andreas.korth at gmx.net (Andreas Korth) Date: Thu, 9 Aug 2007 11:12:55 +0200 Subject: [Ferret-talk] Filtering out low scoring matches with acts_as_ferret (fwd) In-Reply-To: References: Message-ID: On 08.08.2007, at 19:36, Casey Forbes wrote: > I'm using fuzzy~ queries which can return very poor matches. What > is the > most elegant way to filter out scores below some threshold? I know > that I > can do my own thing with find_id_by_contents but I'd like to filter > out > less relevant results at a lower level so that I can > use :limit/:offset > for pagination and all that. > > What am I missing? You can specify a fuzziness factor after the ~. A higher factor will reduce fuzziness and thus return less but more accurate results. 
Example: fuzzy~0.8 Cheers, Andy From flo at andersground.net Thu Aug 9 05:16:30 2007 From: flo at andersground.net (Florian Gilcher) Date: Thu, 09 Aug 2007 11:16:30 +0200 Subject: [Ferret-talk] Ferret - current status? In-Reply-To: <20070726073112.GQ26963@thunder.jkraemer.net> References: <69c1b1491fcaa44e7799dd5e1639674d@ruby-forum.com> <20070726073112.GQ26963@thunder.jkraemer.net> Message-ID: <46BADB6E.50606@andersground.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 *bump* Are there any news yet? Greetings Florian Gilcher Jens Kraemer wrote: > On Thu, Jul 26, 2007 at 08:43:36AM +0200, Ed Ed wrote: >> Hi guys, >> >> Having committed a fairly large project to ferret I'm a little concerned >> that ferret svn has been essentially unavailable for weeks (pretty much >> every time I try I get "can't connect") and more so now that >> davebalmain.com has gone off the air. >> >> Without meaning to pry, does anyone know whether existing problems in >> ferret are likely to get fixed? (I can get crashes quite easily with a >> combination of highlighting and proximity searches over longish >> distances) and whether, or if, there will be another release? > > I just mailed Dave with pretty much the same questions - I'll keep you > up to date about this. > > Jens > > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (Darwin) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGutts8RlGMqQ8m7oRAtdIAKCbGeJj8m8HQtlqvGnz84337NcUkgCZAaeC PfUBxmAdmtkX4sf/eVpjwkU= =hrRZ -----END PGP SIGNATURE----- From ferret-talk at stuartsierra.com Thu Aug 9 09:47:42 2007 From: ferret-talk at stuartsierra.com (Stuart Sierra) Date: Thu, 09 Aug 2007 09:47:42 -0400 Subject: [Ferret-talk] Ferret - current status? In-Reply-To: <46BADB6E.50606@andersground.net> References: <69c1b1491fcaa44e7799dd5e1639674d@ruby-forum.com> <20070726073112.GQ26963@thunder.jkraemer.net> <46BADB6E.50606@andersground.net> Message-ID: <46BB1AFE.8080701@stuartsierra.com> Reply is below the quotes. 
On Thu, Jul 26, 2007 at 08:43:36AM +0200, Ed Ed wrote: >>> Hi guys, >>> >>> Having committed a fairly large project to ferret I'm a little concerned >>> that ferret svn has been essentially unavailable for weeks (pretty much >>> every time I try I get "can't connect") and more so now that >>> davebalmain.com has gone off the air. >>> >>> Without meaning to pry, does anyone know whether existing problems in >>> ferret are likely to get fixed? (I can get crashes quite easily with a >>> combination of highlighting and proximity searches over longish >>> distances) and whether, or if, there will be another release? Jens Kraemer wrote: >> I just mailed Dave with pretty much the same questions - I'll keep you >> up to date about this. >> >> Jens Florian Gilcher wrote: > *bump* > > Are there any news yet? > > Greetings > Florian Gilcher ferret.davebalmain.com is up and running again; last Wiki commit was May 2, 2007; last SVN commit (770) was April 18, 2007. -Stuart Sierra From john at digitalpulp.com Thu Aug 9 13:59:12 2007 From: john at digitalpulp.com (John Bachir) Date: Thu, 9 Aug 2007 13:59:12 -0400 Subject: [Ferret-talk] no such file to load -- acts_as_ferret (MissingSourceFile) In-Reply-To: <151eb9ce20d44b1bd1e7d71f85db2268@ruby-forum.com> References: <151eb9ce20d44b1bd1e7d71f85db2268@ruby-forum.com> Message-ID: <70D3844C-EA3F-4844-8110-FBE058C78B36@digitalpulp.com> On Aug 9, 2007, at 5:08 AM, Allen Young wrote: > After I checked out the project, I got no such file > to load -- acts_as_ferret (MissingSourceFile) error while trying to > start server. > Does the error say which file is missing? 
From allenmacyoung at gmail.com Thu Aug 9 22:52:37 2007 From: allenmacyoung at gmail.com (Allen Young) Date: Fri, 10 Aug 2007 04:52:37 +0200 Subject: [Ferret-talk] no such file to load -- acts_as_ferret (MissingSourceFil In-Reply-To: <70D3844C-EA3F-4844-8110-FBE058C78B36@digitalpulp.com> References: <151eb9ce20d44b1bd1e7d71f85db2268@ruby-forum.com> <70D3844C-EA3F-4844-8110-FBE058C78B36@digitalpulp.com> Message-ID: John Bachir wrote: > Does the error say which file is missing? No... I've deleted the plugin and installed acts_as_ferret as a gem. Now it's OK. But I still don't understand my installing as a plugin doesn't work. The weired thing is that sometimes it works and sometimes it doesn't. -- Posted via http://www.ruby-forum.com/. From allenmacyoung at gmail.com Thu Aug 9 23:21:38 2007 From: allenmacyoung at gmail.com (Allen Young) Date: Fri, 10 Aug 2007 05:21:38 +0200 Subject: [Ferret-talk] Different ferret fields for instances of the same model? Message-ID: <7466078d3badeecb7cb6397b44870384@ruby-forum.com> Hi all, So far as I know, while using acts_as_ferret, we should add the following declaration in the ActiveRecord model which is going to be indexed: acts_as_ferret({:fields => @@ferrect_fields}) in which @@ferrect_fields is a hash containing all the field to be indexed. This is pretty much for some simple situations. But I got a more complex situation that I want to define the fields to be indexed for every instance of the same model. 
My requirements are something like this:

Suppose we have a model "Product". In "Product" I've declared a polymorphic relationship with the models "Property1" and "Property2", as the following code shows:

  class Product < ActiveRecord::Base
    belongs_to :property, :polymorphic => true
    @@ferret_fields = {...}
    acts_as_ferret({:fields => @@ferret_fields})
  end

  class Property1 < ActiveRecord::Base
    has_one :product, :as => :property
  end

  class Property2 < ActiveRecord::Base
    has_one :product, :as => :property
  end

Now I want to provide full text search capability for "Product", and it's obvious that "Product" should contain its "property" when it is indexed. So I should define a "ferret_fields" class method in "Property1" and "Property2" to collect all their fields, and dynamically define the corresponding methods in "Product". The code is something like this:

  class Property1 < ActiveRecord::Base
    has_one :product, :as => :property
    def self.ferret_fields
      # return a hash containing all the fields to be indexed in aaf's format
    end
  end

  class Property2 < ActiveRecord::Base
    has_one :product, :as => :property
    def self.ferret_fields
      # return a hash containing all the fields to be indexed in aaf's format
    end
  end

  class Product < ActiveRecord::Base
    belongs_to :property, :polymorphic => true
    @@ferret_fields = {...}
    @@ferret_fields.merge!(Property1.ferret_fields)
    @@ferret_fields.merge!(Property2.ferret_fields)
    acts_as_ferret({:fields => @@ferret_fields})
    Property1.ferret_fields.keys.each do |field|
      define_method("#{field}") do
        result = property.send("#{field}")
      end
    end
    Property2.ferret_fields.keys.each do |field|
      define_method("#{field}") do
        result = property.send("#{field}")
      end
    end
  end

But there are two problems in the above code:

1. If the property object in a product object is a "Property1", property.send("#{field}") in the "Property2" block will cause a method missing error, and vice versa.

2.
Say "Property1" has 500 fields as well as "Property2", each product will be indexed using 1000 fields while only at most 500 fields contains value. How can I solve these problems and meet my requirements? Any ideas about this? -- Posted via http://www.ruby-forum.com/. From henke at mac.se Sat Aug 11 06:02:31 2007 From: henke at mac.se (Henrik Zagerholm) Date: Sat, 11 Aug 2007 12:02:31 +0200 Subject: [Ferret-talk] Highlighting broken in TRUNK In-Reply-To: <46B9E24D.30201@stuartsierra.com> References: <3CE321C2-2822-4CE4-88EF-773B33A1DB20@mac.se> <46B9E24D.30201@stuartsierra.com> Message-ID: <9F9D18F0-4D5F-489C-B3AE-9CBBCBC048AB@mac.se> 8 aug 2007 kl. 17:33 skrev Stuart Sierra: > Henrik Zagerholm wrote: >> As there is still some large file support bugs in the 0.11.4 release >> I had to download the trunk and apply a patch sent in by kyle http:// >> ferret.davebalmain.com/trac/ticket/215. >> >> The problem is now that the highlighting doesn't work. >> >> Somehow it combines excerpt_length with num_excerpts so if you have >> and excerpt_length of 50 and num_excerpts of 5 and you get only one >> hit when searching the excerpts is either the whole text or 50*5 = >> 250. >> >> So everytime you get a result with less hits than num_excerpts the >> highlighting goes crazy. > > I've had the same problem with the large file patch. I couldn't > figure > out a fix, but I worked around it by trimming the excerpts to the last > 150 characters: > > highlights.collect do |excerpt| > if excerpt.length <= 150 > excerpt > else > excerpt[-150..-1] > end > end > > This is ugly, and it chops up words, but it seems to work. > > I wondered if this might have something to do with Ferret's low- > level IO > routines: is_read_u64, os_read_u64, is_read_u32, etc. Maybe the wrong > ones are being used for reading and writing the offsets in the segment > file, now that Kyle's patch has changed the offsets from type int to > type offset_t. Thanks for the info. 
Now I know I'm not the only one experience this problem. =) Thanks for code sharing. I'll probably do something similar until this is fixed. It would be nice to get Dave's input on this but it seems he disappeared from the face of the earth. Thanks, Henrik > > -Stuart Sierra > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk From starburger234 at yahoo.de Sat Aug 11 06:34:14 2007 From: starburger234 at yahoo.de (Star Burger) Date: Sat, 11 Aug 2007 12:34:14 +0200 Subject: [Ferret-talk] IO Error: Error reading the segment infos. Message-ID: <332953e930fc40978b618d6a56afcfea@ruby-forum.com> Hi all, I'm using ferret and acts_as_ferret on Win XP. Trying to index a location table with over 2,5 Mio. rows in UTF-8 I'm getting the error: "IO Error: IO Error occured: Error reading the segment infos. Store listing was ..." The error occurs after some hours of running. The index file system looks like this after the abort: 11.08.2007 12:30 . 11.08.2007 12:30 .. 
11.08.2007  12:30              0 files.txt
11.08.2007  12:21             16 segments
11.08.2007  12:21            135 segments_2dqf
11.08.2007  10:55    233.012.389 _16v6.cfs
11.08.2007  11:11    293.991.425 _1fft.cfs
11.08.2007  11:29    328.026.860 _1o0g.cfs
11.08.2007  11:46    318.527.256 _1wl3.cfs
11.08.2007  11:56     49.647.656 _255q.cfs
11.08.2007  11:57     29.317.235 _260l.cfs
11.08.2007  11:58     29.952.194 _26vg.cfs
11.08.2007  12:00     27.239.514 _27qb.cfs
11.08.2007  12:01     32.785.402 _28l6.cfs
11.08.2007  12:03     30.472.931 _29g1.cfs
11.08.2007  12:05     30.355.039 _2aaw.cfs
11.08.2007  12:06      9.738.666 _2b5r.cfs
11.08.2007  12:07        201.027 _2c0m.cfs
11.08.2007  12:08        322.501 _2cvh.cfs
11.08.2007  12:08         11.905 _2cyk.cfs
11.08.2007  12:08         12.506 _2d1n.cfs
11.08.2007  12:08         11.981 _2d4q.cfs
11.08.2007  12:08         12.066 _2d7t.cfs
11.08.2007  12:08         12.097 _2daw.cfs
11.08.2007  12:08         11.827 _2ddz.cfs
11.08.2007  12:08         51.846 _2dh2.cfs
11.08.2007  12:08         12.621 _2dk5.cfs
11.08.2007  12:08         12.494 _2dn8.cfs
11.08.2007  12:08          1.567 _2dnj.cfs
11.08.2007  12:08          1.456 _2dnu.cfs
11.08.2007  12:08          1.502 _2do5.cfs
11.08.2007  12:08          1.490 _2dog.cfs
11.08.2007  12:08          1.552 _2dor.cfs
11.08.2007  12:08          1.627 _2dp2.cfs
11.08.2007  12:08          1.510 _2dpd.cfs
11.08.2007  12:08          1.446 _2dpo.cfs
11.08.2007  12:08          1.579 _2dpz.cfs
11.08.2007  12:08            423 _2dq0.cfs
11.08.2007  12:08            421 _2dq1.cfs
11.08.2007  12:08            425 _2dq2.cfs
11.08.2007  12:08            448 _2dq3.cfs
11.08.2007  12:08            423 _2dq4.cfs
11.08.2007  12:08            436 _2dq5.cfs
11.08.2007  12:08            427 _2dq6.cfs
11.08.2007  12:08            457 _2dq7.cfs
11.08.2007  12:08            565 _2dq8.cfs
11.08.2007  12:09            484 _2dq9.cfs
11.08.2007  12:09          1.668 _2dqa.cfs
11.08.2007  12:09         12.629 _2dqb.cfs
11.08.2007  12:09        160.061 _2dqc.cfs
11.08.2007  12:09    190.520.197 _2dqd.cfs
11.08.2007  12:21  2.543.445.730 _2dqe.cfs
11.08.2007  09:56    338.646.539 _8km.cfs
11.08.2007  10:12    302.287.475 _h59.cfs
11.08.2007  10:28    322.882.388 _ppw.cfs
11.08.2007  10:41    166.010.866 _yaj.cfs
              54 file(s)   5.277.725.380 bytes
               2 dir(s)    4.181.032.960 bytes free

The file _2dqe.cfs has repeatedly the same size after
the error. How can I deal with this? Will this be different on Linux? Thanks, Starburger -- Posted via http://www.ruby-forum.com/. From starburger234 at yahoo.de Sat Aug 11 06:36:31 2007 From: starburger234 at yahoo.de (Star Burger) Date: Sat, 11 Aug 2007 12:36:31 +0200 Subject: [Ferret-talk] IO Error: Error reading the segment infos. In-Reply-To: <332953e930fc40978b618d6a56afcfea@ruby-forum.com> References: <332953e930fc40978b618d6a56afcfea@ruby-forum.com> Message-ID: <385dda91457c4e84a4416a871cbb4cba@ruby-forum.com> Just to mention that I tried to build the index with Location.rebuild_index from the scratch using Acts_as_ferret when the error happened. Starburger -- Posted via http://www.ruby-forum.com/. From jk at jkraemer.net Mon Aug 13 06:49:45 2007 From: jk at jkraemer.net (Jens Kraemer) Date: Mon, 13 Aug 2007 12:49:45 +0200 Subject: [Ferret-talk] Different ferret fields for instances of the same model? In-Reply-To: <7466078d3badeecb7cb6397b44870384@ruby-forum.com> References: <7466078d3badeecb7cb6397b44870384@ruby-forum.com> Message-ID: <20070813104944.GJ28854@thunder.jkraemer.net> Hi Allen! comments inline On Fri, Aug 10, 2007 at 05:21:38AM +0200, Allen Young wrote: > Suppose we have a model "Product", in "Product" I've declared a > polymorphic relationship with model "Property1" and "Property2", the > following code will show this: > > class Product < ActiveRecord::Base > belongs_to :property, :polymorphic => true > @@ferret_fields = {...} > acts_as_ferret({:fields => @@ferret_fields}) > end > [..] > > Now I want to provide full text search capability for "Product" and it's > obvious that "Product" should contains its "property" while being > indexed. So I should define "ferret_fields" class method in "Property1" > and "Property2" to collect all their fields and dynamically define the > corresponding method in "Product". The code is something like this: > [..] 
>
> class Product < ActiveRecord::Base
>   belongs_to :property, :polymorphic => true
>   @@ferret_fields = {...}
>   @@ferret_fields.merge!(Property1.ferret_fields)
>   @@ferret_fields.merge!(Property2.ferret_fields)
>   acts_as_ferret({:fields => @@ferret_fields})
>   Property1.ferret_fields.keys.each do |field|
>     define_method("#{field}") do
>       result = property.send("#{field}")
>     end
>   end
>   Property2.ferret_fields.keys.each do |field|
>     define_method("#{field}") do
>       result = property.send("#{field}")
>     end
>   end
> end
>
> But there are two problems in the above code:
>
> 1. If the property object in a product object is "Property1",
> property.send("#{field}") in "Property2"'s block will cause a method
> missing error, vice versa.

I'd just rescue that and return nil for the field:

  Property2.ferret_fields.keys.each do |field|
    define_method("#{field}") do
      result = property.send("#{field}") rescue nil
    end
  end

> 2. Say "Property1" has 500 fields as well as "Property2", each product
> will be indexed using 1000 fields while only at most 500 fields contains
> value.

Do you really need to be able to run queries against each single one of these 1000 fields? If not, you could concatenate their values into a single large :properties field.
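
(Editorial sketch, not part of the original message.) The single combined field suggested above might look roughly like the following. This is plain Ruby with no acts_as_ferret calls; PropertyStub, ProductStub, indexed_attributes and properties are hypothetical names standing in for the models in the thread, and it assumes the real model would list :properties among its indexed fields so that the indexer calls the method of that name.

```ruby
# Hypothetical stand-in for Property1/Property2 from the thread.
PropertyStub = Struct.new(:color, :weight, :notes) do
  # Assumed helper: names of the attributes whose values should be searchable.
  def self.indexed_attributes
    [:color, :weight, :notes]
  end
end

# Hypothetical stand-in for Product; not an ActiveRecord model.
class ProductStub
  attr_reader :property

  def initialize(property)
    @property = property
  end

  # One combined text field instead of one index field per property attribute.
  # Returns an empty string when there is no property, so indexing never raises.
  def properties
    return "" if property.nil?
    values = property.class.indexed_attributes.map { |name| property.public_send(name) }
    values.compact.map(&:to_s).reject(&:empty?).join(" ")
  end
end

product = ProductStub.new(PropertyStub.new("red", 12, nil))
puts product.properties
```

The trade-off is that queries can no longer target an individual property attribute by field name; they can only match against the combined text.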
Cheers, Jens -- Jens Kr?mer http://www.jkraemer.net/ - Blog http://www.omdb.org/ - The new free film database From jk at jkraemer.net Mon Aug 13 06:59:55 2007 From: jk at jkraemer.net (Jens Kraemer) Date: Mon, 13 Aug 2007 12:59:55 +0200 Subject: [Ferret-talk] indexing only the changed values In-Reply-To: <87b412ce0708080719j187ead21rd2a36b8ae191cb58@mail.gmail.com> References: <87b412ce0707300532m16ca253eie34f24d3deb7a52d@mail.gmail.com> <5B51C217-620D-4EC8-9691-99F1F868176C@benjaminkrause.com> <87b412ce0707302355w7ffce9afs4a45b3eb8631a359@mail.gmail.com> <87b412ce0708080719j187ead21rd2a36b8ae191cb58@mail.gmail.com> Message-ID: <20070813105955.GK28854@thunder.jkraemer.net> On Wed, Aug 08, 2007 at 07:19:25AM -0700, isha kakodkar wrote: > hi,i needed some help on specifying conditions for index searching on > filesystem. > Along with the main query string,i can give the indexed_field:value,so that > it searches the query string at that indexed row. > But can i specify a indexed_field > value? > or i have to give the exact value? Check out RangeQueries. http://ferret.davebalmain.com/api/classes/Ferret/QueryParser.html Jens -- Jens Kr?mer http://www.jkraemer.net/ - Blog http://www.omdb.org/ - The new free film database From ed.temp.01 at gmail.com Mon Aug 13 07:43:39 2007 From: ed.temp.01 at gmail.com (Ed --) Date: Mon, 13 Aug 2007 13:43:39 +0200 Subject: [Ferret-talk] Ferret - current status? In-Reply-To: <46BB1AFE.8080701@stuartsierra.com> References: <69c1b1491fcaa44e7799dd5e1639674d@ruby-forum.com> <20070726073112.GQ26963@thunder.jkraemer.net> <46BADB6E.50606@andersground.net> <46BB1AFE.8080701@stuartsierra.com> Message-ID: Stuart Sierra wrote: > > ferret.davebalmain.com is up and running again; last Wiki commit was May > 2, 2007; last SVN commit (770) was April 18, 2007. Just did an svn update and it is now up to rev 775. No Changelogs and difficult to guess what has changed from trac -- Posted via http://www.ruby-forum.com/. 
From kraemer at webit.de Mon Aug 13 08:26:36 2007
From: kraemer at webit.de (Jens Kraemer)
Date: Mon, 13 Aug 2007 14:26:36 +0200
Subject: [Ferret-talk] Ferret - current status?
In-Reply-To:
References: <69c1b1491fcaa44e7799dd5e1639674d@ruby-forum.com> <20070726073112.GQ26963@thunder.jkraemer.net> <46BADB6E.50606@andersground.net> <46BB1AFE.8080701@stuartsierra.com>
Message-ID: <20070813122636.GA25837@cordoba.webit.de>

On Mon, Aug 13, 2007 at 01:43:39PM +0200, Ed -- wrote:
> Stuart Sierra wrote:
> >
> > ferret.davebalmain.com is up and running again; last Wiki commit was May
> > 2, 2007; last SVN commit (770) was April 18, 2007.
>
> Just did an svn update and it is now up to rev 775. No Changelogs and
> difficult to guess what has changed from trac

http://ferret.davebalmain.com/trac/timeline

To me this looks like really good news :-)

Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

From ferret-talk at stuartsierra.com Mon Aug 13 16:47:31 2007
From: ferret-talk at stuartsierra.com (Stuart Sierra)
Date: Mon, 13 Aug 2007 16:47:31 -0400
Subject: [Ferret-talk] Ferret - current status?
In-Reply-To: <20070813122636.GA25837@cordoba.webit.de>
References: <69c1b1491fcaa44e7799dd5e1639674d@ruby-forum.com> <20070726073112.GQ26963@thunder.jkraemer.net> <46BADB6E.50606@andersground.net> <46BB1AFE.8080701@stuartsierra.com> <20070813122636.GA25837@cordoba.webit.de>
Message-ID: <46C0C363.3060405@stuartsierra.com>

Jens Kraemer wrote:
> http://ferret.davebalmain.com/trac/timeline
>
> To me this looks like really good news :-)

Good news indeed. Kudos and thanks to Mr. Balmain!
-Stuart

From chad at zulu.net Mon Aug 13 20:24:54 2007
From: chad at zulu.net (Chad Thatcher)
Date: Tue, 14 Aug 2007 02:24:54 +0200
Subject: [Ferret-talk] Should "a" match "ä" in ferret?
Message-ID: <5c48e6add5f1cd9a0f777eaad93a659a@ruby-forum.com>

Hi all,

I have indexed a huge amount of data with text from several European
languages. In the index are values like Georg Friedrich Händel.

I would like a search phrase like "Georg Friedrich Handel" to find
records with the real spelling of Händel but it doesn't seem to work.

Can anyone give me an idea of what I need to do to make this happen? I'm
a bit lost here and can't seem to find anything on Google to help out. I
have an idea that it might be a locale issue but I'm not sure.

Thanks,

Chad.
--
Posted via http://www.ruby-forum.com/.

From julioody at gmail.com Mon Aug 13 21:22:03 2007
From: julioody at gmail.com (Julio Cesar Ody)
Date: Tue, 14 Aug 2007 11:22:03 +1000
Subject: [Ferret-talk] Should "a" match "ä" in ferret?
In-Reply-To: <5c48e6add5f1cd9a0f777eaad93a659a@ruby-forum.com>
References: <5c48e6add5f1cd9a0f777eaad93a659a@ruby-forum.com>
Message-ID:

I'm 100% sure this has been asked before on this list, but I know it's
not exactly trivial to search for. I'd say give the archives a try.

http://rubyforge.org/pipermail/ferret-talk/

On 8/14/07, Chad Thatcher wrote:
> Hi all,
>
> I have indexed a huge amount of data with text from several European
> languages. In the index are values like Georg Friedrich Händel.
>
> I would like a search phrase like "Georg Friedrich Handel" to find
> records with the real spelling of Händel but it doesn't seem to work.
>
> Can anyone give me an idea of what I need to do to make this happen? I'm
> a bit lost here and can't seem to find anything on Google to help out. I
> have an idea that it might be a locale issue but I'm not sure.
>
> Thanks,
>
> Chad.
> --
> Posted via http://www.ruby-forum.com/.
> _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk From myowntribe at yahoo.com Mon Aug 13 22:32:03 2007 From: myowntribe at yahoo.com (Cass Amino) Date: Tue, 14 Aug 2007 04:32:03 +0200 Subject: [Ferret-talk] problem searching dynamic strings with acts_as_ferret Message-ID: Hi Jens, I am using acts_as_ferret and multi_search method in it, when I search for Cass I get the result, but I want to get Cass even if I type Cas or Ca or just C. How can we use the LIKE operator for this serach in acts_as_ferret? I am listing the method here: def search_names @users = User.available_users @user = User.find(session[:user_id]) @query = params[:query] || '' @total, @users = User.multi_search(@query, [ WorkProfile ], :page => (params[:page]||1)) @pages = pages_for(@total) unless @query.blank? @results = User.find_by_contents @query end end Could you please help me with this... Cheers Cass -- Posted via http://www.ruby-forum.com/. From allenmacyoung at gmail.com Mon Aug 13 22:51:59 2007 From: allenmacyoung at gmail.com (Allen Young) Date: Tue, 14 Aug 2007 04:51:59 +0200 Subject: [Ferret-talk] Different ferret fields for instances of the same model? In-Reply-To: <20070813104944.GJ28854@thunder.jkraemer.net> References: <7466078d3badeecb7cb6397b44870384@ruby-forum.com> <20070813104944.GJ28854@thunder.jkraemer.net> Message-ID: <1594558ae32d8e31f283e502a33741e2@ruby-forum.com> Jens Kraemer wrote: > I'd just rescue that and return nil for the field: > > Property2.ferret_fields.keys.each do |field| > define_method("#{field}") do > result = property.send("#{field}") rescue nil > end > end > For now, I'm rescuing the "NoMethodError" which works fine as well. >> 2. Say "Property1" has 500 fields as well as "Property2", each product >> will be indexed using 1000 fields while only at most 500 fields contains >> value. 
> > Do you really need to be able to run queries against each single one of > these 1000 fields? If not, you could concatenate their values into a > single large :properties field. I'm afraid so... because many of these fields are number fields and users will want to search these fields like "field1 is bigger than 1.5 and smaller than 0.7". -- Posted via http://www.ruby-forum.com/. From pulkit.bosco05 at gmail.com Tue Aug 14 03:36:18 2007 From: pulkit.bosco05 at gmail.com (Pulkit Bhuwalka) Date: Tue, 14 Aug 2007 13:06:18 +0530 Subject: [Ferret-talk] how to use index as model in rails Message-ID: <80c925a90708140036q700ac1daqebe552bd6be09110@mail.gmail.com> Hi all, I've used ruby and ferret for the past one month to index resumes. Now I need a web front-end to the application and the obvious choice turns out to be rails. But as I'm not interested in using a database and intend to just use the index, I'm not sure as to how to put it into the MVC framework as the model implicitly takes a database. Is there a way it can take an index instead by using acts_as_ferret. Basically, all the functionality I want is to take some input and provide it to the back-end ruby code which does the parsing, and some index related work, which may be done either at the front-end or preferably, at the back-end. It's the usage of rails I'm not sure about as I've never used it before. Thanks a lot, Pulkit -- The dumb's song to the deaf is exactly what music is to us. The best things in the world are free. Pulkit Bhuwalka Dept. Of Information Science BMS College Of Engineering, Bangalore -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20070814/a0be07ed/attachment.html From bk at benjaminkrause.com Tue Aug 14 03:55:08 2007 From: bk at benjaminkrause.com (Benjamin Krause) Date: Tue, 14 Aug 2007 09:55:08 +0200 (CEST) Subject: [Ferret-talk] how to use index as model in rails Message-ID: <55551.212.227.62.4.1187078108.squirrel@orkland.homeunix.org> Pulkit, > I've used ruby and ferret for the past one month to index resumes. Now I > need a web front-end to the application and the obvious choice turns out > to > be rails. But as I'm not interested in using a database and intend to just > use the index, I'm not sure as to how to put it into the MVC framework as > the model implicitly takes a database. Is there a way it can take an index > instead by using acts_as_ferret. There are other web frameworks (like http://code.whytheluckystiff.net/shoes/) so you don't necessarily need to use rails. However, even in Rails you model classes must not inherit from ActiveRecord. It's just that all tutorials, screencasts, etc. focus on db-model classes. Its perfectly fine to create a non-db model class (based on Object), that acts as a wrapper for ferret. acts_as_ferret is just a bridge between you db objects and ferret. so if you do not have db objects (and therefore don't inherit from AR), you can't use acts_as_ferret (at least not without some non-trivial changes). I would suggest to build a ferret-model-object that handels all search requests. Ben From jk at jkraemer.net Tue Aug 14 04:07:15 2007 From: jk at jkraemer.net (Jens Kraemer) Date: Tue, 14 Aug 2007 10:07:15 +0200 Subject: [Ferret-talk] Different ferret fields for instances of the same model? 
In-Reply-To: <1594558ae32d8e31f283e502a33741e2@ruby-forum.com> References: <7466078d3badeecb7cb6397b44870384@ruby-forum.com> <20070813104944.GJ28854@thunder.jkraemer.net> <1594558ae32d8e31f283e502a33741e2@ruby-forum.com> Message-ID: <20070814080715.GM28854@thunder.jkraemer.net> On Tue, Aug 14, 2007 at 04:51:59AM +0200, Allen Young wrote: > Jens Kraemer wrote: > > I'd just rescue that and return nil for the field: > > > > Property2.ferret_fields.keys.each do |field| > > define_method("#{field}") do > > result = property.send("#{field}") rescue nil > > end > > end > > > For now, I'm rescuing the "NoMethodError" which works fine as well. > > >> 2. Say "Property1" has 500 fields as well as "Property2", each product > >> will be indexed using 1000 fields while only at most 500 fields contains > >> value. > > > > Do you really need to be able to run queries against each single one of > > these 1000 fields? If not, you could concatenate their values into a > > single large :properties field. > I'm afraid so... because many of these fields are number fields and > users will want to search these fields like "field1 is bigger than 1.5 > and smaller than 0.7". Ok. To avoid the 500 nil fields per record you could override the to_doc instance method in your model and only add the relevant fields depending on the type of your property. The original implementation is in acts_as_ferret's instance_methods.rb . 
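
A rough sketch of the "only add the relevant fields" idea, assuming nothing about acts_as_ferret internals: a plain Hash stands in for the Ferret document, and the Struct classes, ALL_FIELDS and the field names are hypothetical. The real to_doc in instance_methods.rb will look different; this only shows the selection logic.

```ruby
# Hypothetical property models with disjoint field sets.
Property1 = Struct.new(:width)
Property2 = Struct.new(:weight)

class Product
  # Assumed union of the field names of both property types.
  ALL_FIELDS = [:width, :weight]

  attr_reader :id, :property

  def initialize(id, property)
    @id = id
    @property = property
  end

  # Stand-in for to_doc: builds a plain Hash and only adds the fields
  # that the attached property object actually responds to, so no nil
  # placeholders are created for the other property type.
  def to_doc
    doc = { :id => id }
    ALL_FIELDS.each do |field|
      doc[field] = property.send(field) if property.respond_to?(field)
    end
    doc
  end
end

p Product.new(1, Property2.new(80)).to_doc
```

Checking respond_to? up front also avoids the NoMethodError that the rescue in the earlier snippet was working around.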
Jens

--
Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database

From jk at jkraemer.net Tue Aug 14 04:24:03 2007
From: jk at jkraemer.net (Jens Kraemer)
Date: Tue, 14 Aug 2007 10:24:03 +0200
Subject: [Ferret-talk] Should "a" match "ä" in ferret?
In-Reply-To: <5c48e6add5f1cd9a0f777eaad93a659a@ruby-forum.com>
References: <5c48e6add5f1cd9a0f777eaad93a659a@ruby-forum.com>
Message-ID: <20070814082403.GN28854@thunder.jkraemer.net>

On Tue, Aug 14, 2007 at 02:24:54AM +0200, Chad Thatcher wrote:
> Hi all,
>
> I have indexed a huge amount of data with text from several european
> languages. In the index are values like Georg Friedrich Händel.
>
> I would like a search phrase like "Georg Friedrich Handel" to find
> records with the real spelling of Händel but it doesn't seem to work.
>
> Can anyone give me an idea of what I need to do to make this happen. A
> bit lost here and can't seem to find anything on google to help out. I
> have an idea that it might be a locale issue but not sure.

To achieve this, simply replace all occurrences of 'ä' by 'a' in both
indexed content and queries. MappingFilter [1] is your friend :-)

Cheers,
Jens

[1] http://ferret.davebalmain.com/api/classes/Ferret/Analysis/MappingFilter.html

--
Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database

From amaddox02 at gmail.com Tue Aug 14 05:45:34 2007
From: amaddox02 at gmail.com (Adam Maddox)
Date: Tue, 14 Aug 2007 11:45:34 +0200
Subject: [Ferret-talk] Ferret::FileNotFoundError - delete
In-Reply-To: <1a473520179c4aeed8cd55038152beee@ruby-forum.com>
References: <9f16b21a2c67c88d5d2c5ce39fa3102e@ruby-forum.com> <20070706100027.GN11808@cordoba.webit.de> <98a4c3f95edf5cd2e30ec0d0e9169660@ruby-forum.com> <1a473520179c4aeed8cd55038152beee@ruby-forum.com>
Message-ID:

OK, that broke it... I've put the ferret gem back in.
--
Posted via http://www.ruby-forum.com/.
From amaddox02 at gmail.com Tue Aug 14 05:40:17 2007 From: amaddox02 at gmail.com (Adam Maddox) Date: Tue, 14 Aug 2007 11:40:17 +0200 Subject: [Ferret-talk] Ferret::FileNotFoundError - delete In-Reply-To: References: <9f16b21a2c67c88d5d2c5ce39fa3102e@ruby-forum.com> <20070706100027.GN11808@cordoba.webit.de> <98a4c3f95edf5cd2e30ec0d0e9169660@ruby-forum.com> Message-ID: <1a473520179c4aeed8cd55038152beee@ruby-forum.com> Ikai Lan wrote: > Any luck with this? I am seeing this problem too with 0.11.4/Ubuntu > 7.04/Ruby 1.8.5. > > I downgraded to 0.11.3 ... we'll see if this helps. > > Ikai I am also having this problem: A Ferret::FileNotFoundError occurred in releases#update: File Not Found Error occured at :117 in xpop_context Error occured in fs_store.c:329 - fs_open_input tried to open ".....///...../_1dx_1o.del" but it doesn't exist: I have remvoed the ferret gem and installed the acts_as_ferret one... hopefully this will fix it. -- Posted via http://www.ruby-forum.com/. From bk at benjaminkrause.com Tue Aug 14 06:07:54 2007 From: bk at benjaminkrause.com (Benjamin Krause) Date: Tue, 14 Aug 2007 12:07:54 +0200 (CEST) Subject: [Ferret-talk] Ferret::FileNotFoundError - delete In-Reply-To: <1a473520179c4aeed8cd55038152beee@ruby-forum.com> References: <9f16b21a2c67c88d5d2c5ce39fa3102e@ruby-forum.com> <20070706100027.GN11808@cordoba.webit.de> <98a4c3f95edf5cd2e30ec0d0e9169660@ruby-forum.com> <1a473520179c4aeed8cd55038152beee@ruby-forum.com> Message-ID: <56698.212.227.62.4.1187086074.squirrel@orkland.homeunix.org> Adam, > I am also having this problem: > A Ferret::FileNotFoundError occurred in releases#update: > I have remvoed the ferret gem and installed the acts_as_ferret one... > hopefully this will fix it. most probably this will not fix your problem. 1st of all acts_as_ferret depends on the ferret gem, and secondly these FileNotFoundErrors presumably occure, because you're not using a drb server for indexing. 
see http://projects.jkraemer.net/acts_as_ferret/wiki/DrbServer Ben From bk at benjaminkrause.com Tue Aug 14 07:42:50 2007 From: bk at benjaminkrause.com (Benjamin Krause) Date: Tue, 14 Aug 2007 13:42:50 +0200 Subject: [Ferret-talk] =?iso-8859-1?q?Should_=22a=22_match_=22=E4=22_in_fe?= =?iso-8859-1?q?rret=3F?= In-Reply-To: <5c48e6add5f1cd9a0f777eaad93a659a@ruby-forum.com> References: <5c48e6add5f1cd9a0f777eaad93a659a@ruby-forum.com> Message-ID: <19CF0D15-FEAD-4BC9-B047-D674219D74DA@benjaminkrause.com> Chad, you should use a mapping filter to transform special characters like german umlauts into its ascii counterpiece. take a look out this analyzer, maybe it helps .. http://bugs.omdb.org/browser/branches/2007.1/lib/omdb/ferret/analysis.rb Ben From ferret-talk at stuartsierra.com Tue Aug 14 09:55:06 2007 From: ferret-talk at stuartsierra.com (Stuart Sierra) Date: Tue, 14 Aug 2007 09:55:06 -0400 Subject: [Ferret-talk] how to use index as model in rails In-Reply-To: <55551.212.227.62.4.1187078108.squirrel@orkland.homeunix.org> References: <55551.212.227.62.4.1187078108.squirrel@orkland.homeunix.org> Message-ID: <46C1B43A.5030000@stuartsierra.com> Pulkit wrote: >> I've used ruby and ferret for the past one month to index resumes. >> Now I need a web front-end to the application and the obvious >> choice turns out to be rails. But as I'm not interested in using a >> database and intend to just use the index, I'm not sure as to how >> to put it into the MVC framework as the model implicitly takes a >> database. Is there a way it can take an index instead by using >> acts_as_ferret. Benjamin Krause wrote: > There are other web frameworks (like > http://code.whytheluckystiff.net/shoes/) so you don't necessarily > need to use rails. However, even in Rails you model classes must not > inherit from ActiveRecord. It's just that all tutorials, screencasts, > etc. focus on db-model classes. 
Its perfectly fine to create a non-db > model class (based on Object), that acts as a wrapper for ferret. > > acts_as_ferret is just a bridge between you db objects and ferret. so > if you do not have db objects (and therefore don't inherit from AR), > you can't use acts_as_ferret (at least not without some non-trivial > changes). > > I would suggest to build a ferret-model-object that handels all > search requests. Stuart Sierra replies: I did something similar to this with a model class that stored everything in XML files and used Ferret for searching. To make it easier to use with Rails, I imitated some of the methods of ActiveRecord::Base, like find(). It was a bit cumbersome, but it worked. The only problem is, as this list demonstrates, Ferret indexes aren't always the most reliable place to store your data. I'd advise keeping a permanent copy in files or a database somewhere so you can rebuild the index if it gets corrupted or when the Ferret version changes. -S From bk at benjaminkrause.com Tue Aug 14 10:13:16 2007 From: bk at benjaminkrause.com (Benjamin Krause) Date: Tue, 14 Aug 2007 16:13:16 +0200 (CEST) Subject: [Ferret-talk] how to use index as model in rails In-Reply-To: <46C1B43A.5030000@stuartsierra.com> References: <55551.212.227.62.4.1187078108.squirrel@orkland.homeunix.org> <46C1B43A.5030000@stuartsierra.com> Message-ID: <40854.212.227.62.4.1187100796.squirrel@orkland.homeunix.org> > The only problem is, as this list demonstrates, Ferret indexes aren't > always the most reliable place to store your data. I'd advise keeping a > permanent copy in files or a database somewhere so you can rebuild the > index if it gets corrupted or when the Ferret version changes. i agree to that.. it might not be a problem, if you have a static index that never changes. but as soon as your index evolves, you will find yourself in a situation where an index rebuild is necessary. 
Ben From pulkit.bosco05 at gmail.com Tue Aug 14 10:34:01 2007 From: pulkit.bosco05 at gmail.com (Pulkit Bhuwalka) Date: Tue, 14 Aug 2007 20:04:01 +0530 Subject: [Ferret-talk] how to use index as model in rails In-Reply-To: <40854.212.227.62.4.1187100796.squirrel@orkland.homeunix.org> References: <55551.212.227.62.4.1187078108.squirrel@orkland.homeunix.org> <46C1B43A.5030000@stuartsierra.com> <40854.212.227.62.4.1187100796.squirrel@orkland.homeunix.org> Message-ID: <80c925a90708140734h15660cb4o7fa13736ee71e6d3@mail.gmail.com> Hi, The permanent data is going to be stored separately but as data is parsed into fragments before indexing, it isn't a very good idea to rebuild the index...However, i was looking at rails just to provide the front-end to the indexing and searching that would go on in the backend, which doesn't really justify it's usage but i thought it would probably give more flexibility dealing with the index..I am hoping to make a model based on an index so it can communicate easily with the core program and work it out from there, but it'll need some reading as I only started rails today.. Thanks a lot, Pulkit On 8/14/07, Benjamin Krause wrote: > > > > The only problem is, as this list demonstrates, Ferret indexes aren't > > always the most reliable place to store your data. I'd advise keeping a > > permanent copy in files or a database somewhere so you can rebuild the > > index if it gets corrupted or when the Ferret version changes. > > i agree to that.. it might not be a problem, if you have a static index > that never changes. but as soon as your index evolves, you will find > yourself in a situation where an index rebuild is necessary. > > Ben > > > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > -- The dumb's song to the deaf is exactly what music is to us. The best things in the world are free. 
A foolish dreamer who knows reality is more vague than a dream and that a song's "realer" than the same makes reality. Pulkit Bhuwalka Dept. Of Information Science BMS College Of Engineering, Bangalore -------------- next part -------------- An HTML attachment was scrubbed... URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20070814/b21eb20e/attachment.html From plynchnlm at gmail.com Wed Aug 15 21:15:20 2007 From: plynchnlm at gmail.com (Paul Lynch) Date: Thu, 16 Aug 2007 03:15:20 +0200 Subject: [Ferret-talk] Getting distinct result lists Message-ID: <624485a973728d5095c0df0ff1245e4c@ruby-forum.com> Is there a way in Ferret to do something akin to SQL's "SELECT DISTINCT" query? I have a table with multiple columns, and I would like to search just a couple of them (but index others for other kinds of searches on the table). The two columns I want to search do not have unique values, because the other columns finish specifying the record. Is there a way to avoid picking up the duplicates in the result list? Thanks, --Paul -- Posted via http://www.ruby-forum.com/. From bk at benjaminkrause.com Thu Aug 16 03:40:25 2007 From: bk at benjaminkrause.com (Benjamin Krause) Date: Thu, 16 Aug 2007 09:40:25 +0200 Subject: [Ferret-talk] Getting distinct result lists In-Reply-To: <624485a973728d5095c0df0ff1245e4c@ruby-forum.com> References: <624485a973728d5095c0df0ff1245e4c@ruby-forum.com> Message-ID: <041CAD20-C509-495F-9A33-A8ED58B98C94@benjaminkrause.com> On 2007-08-16, at 3:15 AM, Paul Lynch wrote: > Is there a way in Ferret to do something akin to SQL's "SELECT > DISTINCT" > query? I have a table with multiple columns, and I would like to > search just a couple of them (but index others for other kinds of > searches on the table). The two columns I want to search do not have > unique values, because the other columns finish specifying the record. > Is there a way to avoid picking up the duplicates in the result list? Paul, i think i don't get your point.. 
If you search for something in ferret, you will never get duplicates in your result set (meaning ferret documents). However, your columns (fields in ferret terms) might have the same value for several documents. So you just want to get the different field values for your search, but not the actual ferret documents? Maybe you can give us a small example of your data, searches, actual results and what you would like to have as a result :-) Ben From plynchnlm at gmail.com Thu Aug 16 13:23:14 2007 From: plynchnlm at gmail.com (Paul Lynch) Date: Thu, 16 Aug 2007 19:23:14 +0200 Subject: [Ferret-talk] Getting distinct result lists In-Reply-To: <041CAD20-C509-495F-9A33-A8ED58B98C94@benjaminkrause.com> References: <624485a973728d5095c0df0ff1245e4c@ruby-forum.com> <041CAD20-C509-495F-9A33-A8ED58B98C94@benjaminkrause.com> Message-ID: <5a042fd5882847f04f39c1de7dd6c769@ruby-forum.com> Here's an example. Suppose you have a table "UsedCars" that looks like: Color Make Age ------------------- Green Saturn New Green Saturn Old Red Saturn New Purple Toyota New Purple Toyota Old Blue Yugo Ancient The users searches on "Make", and the returned data to the user is a combination of Color and Make. The user just wants to see the unique values, i.e. Green Saturn Red Saturn Purple Toyota Blue Yugo (or whichever of those match the query). I could get rid of the duplicates (e.g. the two Green Saturns) after doing the query, but the table is large, has a lot of duplicates, and I'm limiting size of the return list, so if I get rid of duplicates then I may need to make a second query to get more if it turns out that after getting rid of the duplicates my list is too short. (Potentially, I could be left with just one item to show.) In SQL, I could get the list like: select distinct Color, Make from UsedCars limit 15 Is there a good way of doing that in ferret? If not, would writing a filter be a good approach? 
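
For illustration, the "de-duplicate after the query and fetch more if the list is too short" approach described above could look roughly like this. It is only a sketch: fetch_page is a hypothetical stand-in for a paged index search, and the rows are the sample table from this message.

```ruby
require 'set'

# The UsedCars sample rows: color, make, age.
ROWS = [
  %w[Green Saturn New],
  %w[Green Saturn Old],
  %w[Red Saturn New],
  %w[Purple Toyota New],
  %w[Purple Toyota Old],
  %w[Blue Yugo Ancient]
]

# Hypothetical stand-in for one paged search against the index.
def fetch_page(offset, limit)
  ROWS[offset, limit] || []
end

# Keeps requesting pages until `wanted` distinct [color, make] pairs
# have been collected or the hits run out.
def distinct_color_make(wanted, page_size)
  seen = Set.new
  result = []
  offset = 0
  loop do
    page = fetch_page(offset, page_size)
    break if page.empty?
    page.each do |color, make, _age|
      pair = [color, make]
      result << pair if seen.add?(pair) # add? returns nil for a repeat
      return result if result.size == wanted
    end
    offset += page_size
  end
  result
end

p distinct_color_make(3, 2)
```

The cost is the one already mentioned: with many duplicates this may take several round trips before the list is full.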
Thanks, --Paul Benjamin Krause wrote: > On 2007-08-16, at 3:15 AM, Paul Lynch wrote: > >> Is there a way in Ferret to do something akin to SQL's "SELECT >> DISTINCT" >> query? I have a table with multiple columns, and I would like to >> search just a couple of them (but index others for other kinds of >> searches on the table). The two columns I want to search do not have >> unique values, because the other columns finish specifying the record. >> Is there a way to avoid picking up the duplicates in the result list? > > Paul, > > i think i don't get your point.. If you search for something in ferret, > you will never get duplicates in your result set (meaning ferret > documents). However, your columns (fields in ferret terms) might > have the same value for several documents. > > So you just want to get the different field values for your search, > but not the actual ferret documents? > > Maybe you can give us a small example of your data, searches, > actual results and what you would like to have as a result :-) > > Ben -- Posted via http://www.ruby-forum.com/. From joshovest at gmail.com Thu Aug 16 18:48:47 2007 From: joshovest at gmail.com (Josh West) Date: Fri, 17 Aug 2007 00:48:47 +0200 Subject: [Ferret-talk] SVN seems to be down Message-ID: <68c371ac932fe967245727f4146a6edc@ruby-forum.com> I've tried throughout the day to install the plugin from SVN and the connection keeps getting refused. On a side note, I tried instead to install the gem rather than the plugin and it seems to only run correctly for the root user. Am I doing something wrong? Sorry if this is a stupid question, but I'm pretty new to Rails and even newer to AAF. Thanks, Josh -- Posted via http://www.ruby-forum.com/. 
From ferret-talk at stuartsierra.com Fri Aug 17 11:28:39 2007 From: ferret-talk at stuartsierra.com (Stuart Sierra) Date: Fri, 17 Aug 2007 11:28:39 -0400 Subject: [Ferret-talk] SVN seems to be down In-Reply-To: <68c371ac932fe967245727f4146a6edc@ruby-forum.com> References: <68c371ac932fe967245727f4146a6edc@ruby-forum.com> Message-ID: <46C5BEA7.3050304@stuartsierra.com> Josh West wrote: > On a side note, I tried instead to install the gem rather than the > plugin and it seems to only run correctly for the root user. Am I doing > something wrong? Sorry if this is a stupid question, but I'm pretty new > to Rails and even newer to AAF. What is only running for the root user? The 'gem install' command? That's normal if you want to install the gem for system-wide use. -Stuart From jk at jkraemer.net Fri Aug 17 11:46:51 2007 From: jk at jkraemer.net (Jens Kraemer) Date: Fri, 17 Aug 2007 17:46:51 +0200 Subject: [Ferret-talk] Getting distinct result lists In-Reply-To: <5a042fd5882847f04f39c1de7dd6c769@ruby-forum.com> References: <624485a973728d5095c0df0ff1245e4c@ruby-forum.com> <041CAD20-C509-495F-9A33-A8ED58B98C94@benjaminkrause.com> <5a042fd5882847f04f39c1de7dd6c769@ruby-forum.com> Message-ID: <20070817154651.GL27074@thunder.jkraemer.net> Hi! On Thu, Aug 16, 2007 at 07:23:14PM +0200, Paul Lynch wrote: > Here's an example. Suppose you have a table "UsedCars" that looks like: > > Color Make Age > ------------------- > Green Saturn New > Green Saturn Old > Red Saturn New > Purple Toyota New > Purple Toyota Old > Blue Yugo Ancient > > The users searches on "Make", and the returned data to the user is a > combination of Color and Make. The user just wants to see the unique > values, i.e. > > Green Saturn > Red Saturn > Purple Toyota > Blue Yugo > > (or whichever of those match the query). I could get rid of the > duplicates (e.g. 
the two Green Saturns) after doing the query, but the > table is large, has a lot of duplicates, and I'm limiting size of the > return list, so if I get rid of duplicates then I may need to make a > second query to get more if it turns out that after getting rid of the > duplicates my list is too short. (Potentially, I could be left with > just one item to show.) > > In SQL, I could get the list like: > select distinct Color, Make from UsedCars limit 15 > > Is there a good way of doing that in ferret? None that I know of - imho the best way to do this would be to use Ferret for just fetching the IDs of all matching records and then use these IDs with sql like above to let the database do it's job. > If not, would writing a filter be a good approach? I don't think a Filter would be a good way to solve this problem - Filter's don't have a clue about the query you're running, they just operate on bit vectors indicating which documents may appear in a query result, and which may not. Cheers, Jens -- Jens Kr?mer http://www.jkraemer.net/ - Blog http://www.omdb.org/ - The new free film database From jk at jkraemer.net Fri Aug 17 11:49:02 2007 From: jk at jkraemer.net (Jens Kraemer) Date: Fri, 17 Aug 2007 17:49:02 +0200 Subject: [Ferret-talk] SVN seems to be down In-Reply-To: <68c371ac932fe967245727f4146a6edc@ruby-forum.com> References: <68c371ac932fe967245727f4146a6edc@ruby-forum.com> Message-ID: <20070817154902.GM27074@thunder.jkraemer.net> On Fri, Aug 17, 2007 at 12:48:47AM +0200, Josh West wrote: > I've tried throughout the day to install the plugin from SVN and the > connection keeps getting refused. AAF's svn repo is up. Most often the problem is that firewalls block outgoing access to strange ports like the ones the svn:// protocol is using. 
Jens

--
Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database

From joshovest at gmail.com Fri Aug 17 12:31:53 2007
From: joshovest at gmail.com (Josh West)
Date: Fri, 17 Aug 2007 18:31:53 +0200
Subject: [Ferret-talk] SVN seems to be down
In-Reply-To: <46C5BEA7.3050304@stuartsierra.com>
References: <68c371ac932fe967245727f4146a6edc@ruby-forum.com> <46C5BEA7.3050304@stuartsierra.com>
Message-ID: <0e7c38326205e5b32cf0ef8a5dc1c847@ruby-forum.com>

Thanks, I figured it out. "Root" was the only user with access to the
index. I fixed that and everything works fine.
--
Posted via http://www.ruby-forum.com/.

From isha.kakodkar at gmail.com Sat Aug 18 09:49:10 2007
From: isha.kakodkar at gmail.com (isha kakodkar)
Date: Sat, 18 Aug 2007 19:19:10 +0530
Subject: [Ferret-talk] Problem with multi-search
Message-ID: <87b412ce0708180649r529bbe72k1e8d20cf74271f5b@mail.gmail.com>

Hi all,

I have indexes on 2 models, e.g. the Message model:

  acts_as_ferret(:store_class_name => true,
                 :fields => {:message => {:store => :compressed},
                             :felix_user_id => {:index => :untokenized_omit_norms,
                                                :store => :no}})

and the Chat model:

  acts_as_ferret(:store_class_name => true,
                 :fields => {:name => {:store => :no}})

where a message belongs to a chat.

When I search for a keyword, it could be present in a message or in a
chat name, so I'm doing a multi-search on the Message and Chat models.
But no results come back: the keyword is indexed but doesn't show up.

I'm using the paginating_ferret_find method:

  def paginating_ferret_multi_search(options)
    count = Chat.multi_search(options[:q], [Message],
                              {:limit => :all, :lazy => false}).total_hits
    PagingEnumerator.new(options[:page_size], count.total_hits, false,
                         options[:current], 1) do |page|
      offset = (options[:current].to_i - 1) * options[:page_size]
      limit = options[:page_size]
      Chat.multi_search(options[:q], [Message],
                        {:offset => offset, :limit => limit})
    end
  end

Can anyone please help with what I am missing?
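
One detail that stands out in the helper above: count already holds the value of total_hits, and the enumerator is then given count.total_hits. Whether that explains the missing results is a guess, but the count/offset arithmetic can be checked in isolation. The sketch below uses a made-up fake_multi_search and SearchResults struct in place of the real multi_search call, and assumes total_hits is a plain integer.

```ruby
# Hypothetical stand-in for a search result object.
SearchResults = Struct.new(:records, :total_hits)

DATA = %w[a b c d e]

# Made-up replacement for the real search call: returns one slice of
# DATA plus the overall hit count.
def fake_multi_search(offset: 0, limit: DATA.size)
  SearchResults.new(DATA[offset, limit] || [], DATA.size)
end

# Returns the total hit count and the records for one 1-based page.
def page_of_results(current_page, page_size)
  count = fake_multi_search.total_hits # already an Integer here
  offset = (current_page.to_i - 1) * page_size
  results = fake_multi_search(offset: offset, limit: page_size)
  [count, results.records]
end

count, records = page_of_results(2, 2)
p count
p records
```

In this stand-in, calling total_hits on count a second time would raise NoMethodError, since count is an Integer; passing count itself is what the sketch does.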
-------------- next part -------------- An HTML attachment was scrubbed... URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20070818/62d0f0e1/attachment.html From scottd at gmail.com Sat Aug 18 19:26:42 2007 From: scottd at gmail.com (Scott Davies) Date: Sun, 19 Aug 2007 01:26:42 +0200 Subject: [Ferret-talk] range queries in phrases/spans? Message-ID: Hi -- I'm developing a system that lets users associate arbitrary [key, numeric value] pairs (among other forms of data) with "documents". Question: is there any reasonable way to get Ferret to index and do range queries over such beasts? Since the keys are arbitrary and there are potentially more of them than there are documents, I'd assume that adding a specific "field" to Ferret for each key isn't going to work. Therefore, it seems like I'd need to somehow encode all the [key, numeric value] pairs for a given document inside, say, a single "numeric attributes" field. But then I'm not sure how I'd form a query to search for a specific range of values for a specific key. I don't see any way to construct a phrase query that says, for example, "the word 'weight' followed immediately by a number between 160 and 180"...range queries don't seem to be allowed as components of a phrase (or span) query. Meanwhile, boolean queries obviously have problems with false positives coming up where the right key is present but it's actually some other [key, value] pair that's providing the value in the right numeric range. (I suppose it's possible I could post-process the results with a filter of some sort to weed out the false positives, but that seems just a wee bit hacky and potentially very inefficient.) Is there actually no way in Ferret to construct a query that will work properly here, or am I missing something? (Yes, I could try encoding the key and the numeric value in a single token and use a range query like (:lower=>"weight_160", :upper=>"weight_180"), but that would cause problems with multi-word keys (e.g. 
"my weight") that the user might want to be able to find with only some subset of the key's words (e.g. "weight"). And yes, I'm aware that there are tricks involved in indexing numeric values so that they work properly in range queries, but I have that part figured out, so no worries there...) Thanks, -- Scott -- Posted via http://www.ruby-forum.com/. From mindsarray at gmail.com Sun Aug 19 02:24:59 2007 From: mindsarray at gmail.com (Anurag Jain) Date: Sun, 19 Aug 2007 08:24:59 +0200 Subject: [Ferret-talk] SVN installation problem in ferret Message-ID: <0b0ab9620fe28242561075f164436b62@ruby-forum.com> Hello, I am not able to run this command given in the tutorial http://projects.jkraemer.net/acts_as_ferret/ *Inside your Rails project* Please use script/plugin install svn://projects.jkraemer.net/acts_as_ferret/tags/stable/acts_as_ferret gem is installed. i have added the desired line in environment.rb as well but while running this particular command of svn://.. nothing actually happening i again come back to prompt....also i cant see any thing happening in my plugin directory. I am a newbie for ROR. but i have installed other plugin with http://..in there repository link ..like authenticate or tagging. but ferret i am not able to. -- Posted via http://www.ruby-forum.com/. From jk at jkraemer.net Sun Aug 19 13:40:22 2007 From: jk at jkraemer.net (Jens Kraemer) Date: Sun, 19 Aug 2007 19:40:22 +0200 Subject: [Ferret-talk] SVN installation problem in ferret In-Reply-To: <0b0ab9620fe28242561075f164436b62@ruby-forum.com> References: <0b0ab9620fe28242561075f164436b62@ruby-forum.com> Message-ID: <20070819174021.GQ27074@thunder.jkraemer.net> please either install the acts_as_ferret gem (wiill need the require in environment.rb) OR the plugin via script/plugin install... when the latter doesn't work for you, chances are some firewall is blocking access to the svn:// protocol ports (google knows which these are). 
Cheers, Jens On Sun, Aug 19, 2007 at 08:24:59AM +0200, Anurag Jain wrote: > Hello, > > I am not able to run this command given in the tutorial > > http://projects.jkraemer.net/acts_as_ferret/ > > *Inside your Rails project* > > Please use > > script/plugin install > svn://projects.jkraemer.net/acts_as_ferret/tags/stable/acts_as_ferret > > > gem is installed. > > i have added the desired line in environment.rb as well > > but while running this particular command of svn://.. nothing actually > happening > > i again come back to prompt....also i cant see any thing happening in my > plugin directory. > > I am a newbie for ROR. but i have installed other plugin with > http://..in there repository link ..like authenticate or tagging. > > but ferret i am not able to. > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > -- Jens Kr?mer http://www.jkraemer.net/ - Blog http://www.omdb.org/ - The new free film database From isha.kakodkar at gmail.com Mon Aug 20 01:21:00 2007 From: isha.kakodkar at gmail.com (isha kakodkar) Date: Mon, 20 Aug 2007 10:51:00 +0530 Subject: [Ferret-talk] providing ferret filters on multi_search Message-ID: <87b412ce0708192221g237ef109p898add9f2fa6f49a@mail.gmail.com> Hi, I want to do a multiple model search.It works if i give the plain text(query) that i am searching for. But i also want to specify some filters with the query and they are separate for both the models that i am searching. How do i achieve this?Can anyone please help? Thanks Isha -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20070820/bcb50f38/attachment.html From mark.johnson at shinetech.com Mon Aug 20 03:38:02 2007 From: mark.johnson at shinetech.com (Mark Johnson) Date: Mon, 20 Aug 2007 09:38:02 +0200 Subject: [Ferret-talk] can't stop stop_words Message-ID: <7f633f5f41571f80f13f08c727cbf9bc@ruby-forum.com> I have looked at the documentation and done some searching, but I can't seem to stop the STOP_WORDS from cutting out common words. I am using acts_as_ferret and I have add the following to my code: STOP_WORDS = [] acts_as_ferret({ :fields => { :name => { :boost => 10 }, :project_client_company_id => { :boost => 0 } } }, {:analyzer => Ferret::Analysis::StandardAnalyzer.new(STOP_WORDS)}) Regardless, words like 'into' are not being indexed (I have looked at the index files). I have been re-indexing, so it isn't a problem like that. If anyone can point out what I am doing wrong that would be great. -- Posted via http://www.ruby-forum.com/. From doug.arogos at gmail.com Mon Aug 20 12:07:04 2007 From: doug.arogos at gmail.com (Doug Smith) Date: Mon, 20 Aug 2007 09:07:04 -0700 Subject: [Ferret-talk] can't stop stop_words In-Reply-To: <7f633f5f41571f80f13f08c727cbf9bc@ruby-forum.com> References: <7f633f5f41571f80f13f08c727cbf9bc@ruby-forum.com> Message-ID: <42d8808f0708200907l446483e0r8966792693212c4e@mail.gmail.com> You're close: here's what works for me. Note the ":ferret => " key: acts_as_ferret({:fields => {:name => {:boost => 10, :store => :yes}, :summary => {:boost => 2, :store => :yes}, :published => {:boost => 1}, :published_on => {:boost => 1}}, :ferret => { :analyzer => Ferret::Analysis::StandardAnalyzer.new([]) } } ) Thanks, Doug On 8/20/07, Mark Johnson wrote: > I have looked at the documentation and done some searching, but I can't > seem to stop the STOP_WORDS from cutting out common words. 
I am using > acts_as_ferret and I have add the following to my code: > > STOP_WORDS = [] > > acts_as_ferret({ :fields => { :name => { :boost > => 10 }, > :project_client_company_id => { :boost > => 0 } > } > }, > {:analyzer => > Ferret::Analysis::StandardAnalyzer.new(STOP_WORDS)}) > > Regardless, words like 'into' are not being indexed (I have looked at > the index files). I have been re-indexing, so it isn't a problem like > that. > > If anyone can point out what I am doing wrong that would be great. > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > From mark.johnson at shinetech.com Mon Aug 20 22:05:22 2007 From: mark.johnson at shinetech.com (Mark Johnson) Date: Tue, 21 Aug 2007 04:05:22 +0200 Subject: [Ferret-talk] can't stop stop_words In-Reply-To: <42d8808f0708200907l446483e0r8966792693212c4e@mail.gmail.com> References: <7f633f5f41571f80f13f08c727cbf9bc@ruby-forum.com> <42d8808f0708200907l446483e0r8966792693212c4e@mail.gmail.com> Message-ID: <6ae78807e74297b323c57dcc85061db4@ruby-forum.com> Doug Smith wrote: > You're close: here's what works for me. Note the ":ferret => " key: > > acts_as_ferret({:fields => {:name => {:boost => 10, > :store => :yes}, > :summary => {:boost => 2, > :store => :yes}, > :published => {:boost => 1}, > :published_on => {:boost => 1}}, > :ferret => { :analyzer => > Ferret::Analysis::StandardAnalyzer.new([]) } > } ) > > Thanks, > > Doug Great - that works! -- Posted via http://www.ruby-forum.com/. 
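To illustrate what passing an empty stop-word list changes, here is a toy stand-in for an analyzer in plain Ruby. It is not Ferret's StandardAnalyzer and makes no claim about how Ferret tokenizes; the function name `analyze` and the sample stop-word list are invented for this example.

```ruby
# Toy stand-in for an analyzer (an assumption for illustration only):
# lower-case the text, split it into runs of letters, drop stop words.
def analyze(text, stop_words)
  text.downcase.scan(/[a-z]+/).reject { |token| stop_words.include?(token) }
end

# Hypothetical stop-word list, much shorter than any real one.
SAMPLE_STOP_WORDS = %w[a an and into of the to]

puts analyze("Walk into the room", SAMPLE_STOP_WORDS).inspect
puts analyze("Walk into the room", []).inspect
```

With the sample list, "into" and "the" never reach the index, which matches the symptom described in the thread; with an empty list every token is kept.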
From mindsarray at gmail.com Tue Aug 21 05:05:06 2007 From: mindsarray at gmail.com (Anurag Jain) Date: Tue, 21 Aug 2007 11:05:06 +0200 Subject: [Ferret-talk] SVN installation problem in ferret In-Reply-To: <20070819174021.GQ27074@thunder.jkraemer.net> References: <0b0ab9620fe28242561075f164436b62@ruby-forum.com> <20070819174021.GQ27074@thunder.jkraemer.net> Message-ID: Thanks Jens, i am able to do it using gem. Jens Kraemer wrote: > please either install the acts_as_ferret gem (wiill need the require in > environment.rb) OR the plugin via script/plugin install... > > when the latter doesn't work for you, chances are some firewall is > blocking access to the svn:// protocol ports (google knows which these > are). > > Cheers, > Jens > > On Sun, Aug 19, 2007 at 08:24:59AM +0200, Anurag Jain wrote: >> script/plugin install >> i again come back to prompt....also i cant see any thing happening in my >> Ferret-talk at rubyforge.org >> http://rubyforge.org/mailman/listinfo/ferret-talk >> > > -- > Jens Kr?mer > http://www.jkraemer.net/ - Blog > http://www.omdb.org/ - The new free film database -- Posted via http://www.ruby-forum.com/. From isha.kakodkar at gmail.com Tue Aug 21 06:37:42 2007 From: isha.kakodkar at gmail.com (isha kakodkar) Date: Tue, 21 Aug 2007 16:07:42 +0530 Subject: [Ferret-talk] Sorting with multi-search Message-ID: <87b412ce0708210337s1c0a55f6k89279ea291d7a1aa@mail.gmail.com> Hi, Im doing a multi-model search.Does sorting work on multi-model?My code works fine if its searching a single-model.But doesnt work for multi-model. DO i have to do anything different for multi-search sort? Can anyone please help? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20070821/9fbfbc0b/attachment.html From kylenordman at gmail.com Wed Aug 22 19:50:49 2007 From: kylenordman at gmail.com (Kyle Nord) Date: Thu, 23 Aug 2007 01:50:49 +0200 Subject: [Ferret-talk] AAF - 0.4.1 - Ferret 0.11.4 Message-ID: <0fd20b9bf641a973ad4352a1d85fd403@ruby-forum.com> Has anyone ever seen this error before? "./script/console production Loading production environment. /usr/local/lib/ruby/gems/1.8/gems/acts_as_ferret-0.4.1/lib/ferret_server.rb:123: warning: parenthesize argument(s) for future version /usr/local/lib/ruby/gems/1.8/gems/actionwebservice-1.2.3/lib/action_web_service/container/action_controller_container.rb:74:in `require_web_service_api':NameError: neither ApplicationApi or ApplicationAPI found " Ferret DRB server starts up without a hitch, but when trying to access it via remote on application servers, it seems to give out and crash the site =). The weird thing is, this just starting happening. So I attempt to roll everything back to an older version, and the error is still there. I assume it's a conflict with a GEM then. Does anyone have any idea if DRB server does not react well to some gems, maybe rmagick, georuby etc? thanks! -- Posted via http://www.ruby-forum.com/. From kylenordman at gmail.com Wed Aug 22 21:04:07 2007 From: kylenordman at gmail.com (Kyle Nord) Date: Thu, 23 Aug 2007 03:04:07 +0200 Subject: [Ferret-talk] AAF - 0.4.1 - Ferret 0.11.4 In-Reply-To: <0fd20b9bf641a973ad4352a1d85fd403@ruby-forum.com> References: <0fd20b9bf641a973ad4352a1d85fd403@ruby-forum.com> Message-ID: <9694ebe1a55cdd335ecfc69a56c7b5ac@ruby-forum.com> Directly enabled to mogilefs-client, and the setup we have. Either that or some oddness. Ignore. =) Kyle Nord wrote: > Has anyone ever seen this error before? > > "./script/console production > Loading production environment. >...... -- Posted via http://www.ruby-forum.com/. 
From nappin713 at yahoo.com Wed Aug 22 21:57:39 2007 From: nappin713 at yahoo.com (Raymond O'Connor) Date: Thu, 23 Aug 2007 03:57:39 +0200 Subject: [Ferret-talk] custom sort routine Message-ID: <784142de39d96d1f4e47d5fb0699991d@ruby-forum.com> is it possible to write a custom sort routine for ferret? I use ferret right now to index all my products. One of the variables in these product documents is the product popularity, where 1 = best selling production, 2 = 2nd best, etc.. Right now, I'm just sorting by the popularity column in my search results, although this doesn't always provide "good" results, neither does just sorting by document relevance. I'd like some combination of the two to sort by. Is it possible to do this efficiently with ferret? any help would be appreciated, thanks, ray -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Thu Aug 23 04:12:27 2007 From: kraemer at webit.de (Jens Kraemer) Date: Thu, 23 Aug 2007 10:12:27 +0200 Subject: [Ferret-talk] custom sort routine In-Reply-To: <784142de39d96d1f4e47d5fb0699991d@ruby-forum.com> References: <784142de39d96d1f4e47d5fb0699991d@ruby-forum.com> Message-ID: <20070823081227.GB16680@cordoba.webit.de> On Thu, Aug 23, 2007 at 03:57:39AM +0200, Raymond O'Connor wrote: > is it possible to write a custom sort routine for ferret? > > I use ferret right now to index all my products. One of the variables > in these product documents is the product popularity, where 1 = best > selling production, 2 = 2nd best, etc.. > > Right now, I'm just sorting by the popularity column in my search > results, although this doesn't always provide "good" results, neither > does just sorting by document relevance. I'd like some combination of > the two to sort by. Is it possible to do this efficiently with ferret? Adding boosted ORed clauses that query for specific popularities might work: (query AND popularity:1)^3 OR (query AND popularity:2)^2 OR query Jens -- Jens Kr?mer webit! 
Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From nappin713 at yahoo.com Thu Aug 23 04:50:30 2007 From: nappin713 at yahoo.com (Raymond O'Connor) Date: Thu, 23 Aug 2007 10:50:30 +0200 Subject: [Ferret-talk] custom sort routine In-Reply-To: <20070823081227.GB16680@cordoba.webit.de> References: <784142de39d96d1f4e47d5fb0699991d@ruby-forum.com> <20070823081227.GB16680@cordoba.webit.de> Message-ID: <96fe44fcff9db39d843c8cada3d6fc67@ruby-forum.com> Thanks for the help. I'm not sure I fully understand your query, but I dont think that would work because my popularity variable basically ranges from 1 to ~1.5 million (where a product with popularity 1 is the best selling product and a popularity with 1.5 million is the worst selling product). Its similar to an Amazon sales rank if you're familiar with that. Correct me if i'm misunderstanding what you're suggesting though. Thanks, Ray Jens Kraemer wrote: > On Thu, Aug 23, 2007 at 03:57:39AM +0200, Raymond O'Connor wrote: >> is it possible to write a custom sort routine for ferret? >> >> I use ferret right now to index all my products. One of the variables >> in these product documents is the product popularity, where 1 = best >> selling production, 2 = 2nd best, etc.. >> >> Right now, I'm just sorting by the popularity column in my search >> results, although this doesn't always provide "good" results, neither >> does just sorting by document relevance. I'd like some combination of >> the two to sort by. Is it possible to do this efficiently with ferret? > > Adding boosted ORed clauses that query for specific popularities might > work: > > (query AND popularity:1)^3 OR (query AND popularity:2)^2 OR query > > > Jens > > > -- > Jens Kr?mer > webit! 
Gesellschaft für neue Medien mbH > Schnorrstraße 76 | 01069 Dresden > Telefon +49 351 46766-0 | Telefax +49 351 46766-66 > kraemer at webit.de | www.webit.de > > Amtsgericht Dresden | HRB 15422 > GF Sven Haubold, Hagen Malessa -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Thu Aug 23 04:56:49 2007 From: kraemer at webit.de (Jens Kraemer) Date: Thu, 23 Aug 2007 10:56:49 +0200 Subject: [Ferret-talk] custom sort routine In-Reply-To: <96fe44fcff9db39d843c8cada3d6fc67@ruby-forum.com> References: <784142de39d96d1f4e47d5fb0699991d@ruby-forum.com> <20070823081227.GB16680@cordoba.webit.de> <96fe44fcff9db39d843c8cada3d6fc67@ruby-forum.com> Message-ID: <20070823085649.GC16680@cordoba.webit.de> On Thu, Aug 23, 2007 at 10:50:30AM +0200, Raymond O'Connor wrote: > Thanks for the help. > I'm not sure I fully understand your query, but I dont think that would > work because my popularity variable basically ranges from 1 to ~1.5 > million (where a product with popularity 1 is the best selling product > and a popularity with 1.5 million is the worst selling product). Its > similar to an Amazon sales rank if you're familiar with that. In this case you could use RangeQueries on the popularity field instead, or add a new field that is set according to the popularity, e.g. on a scale from 1 to 10. The idea of my example was to let more popular products match the more highly boosted subqueries, which should then lead to a higher Ferret score. Jens -- Jens Krämer webit!
Gesellschaft für neue Medien mbH Schnorrstraße 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From steve.tooke at gmail.com Thu Aug 23 07:12:23 2007 From: steve.tooke at gmail.com (Steve Tooke) Date: Thu, 23 Aug 2007 13:12:23 +0200 Subject: [Ferret-talk] AAF: find_by_contents on AR Association Total Hits Message-ID: <171d91a5cf5d3fd06b8f65fcb7d2eb78@ruby-forum.com> I seem to be getting some behaviour that's unexpected (for me anyway) when using find_by_contents on an ActiveRecord has_many association. The results that are returned are only the records that belong to the model returned, but the total_hits that are being returned appear to be for the whole table. e.g. class Book < ActiveRecord::Base has_many :pages end class Page < ActiveRecord::Base belongs_to :book acts_as_ferret :fields => [:content] end b1 = Book.create(:title => "Book One") b1.pages.create(:content => "The cat sat on the mat.") b1.pages.create(:content => "The cat went for a walk in the country.") b2 = Book.create(:title => "Book Two") b2.pages.create(:content => "Once upon a time in a country quite far away.") b2.pages.create(:content => "There lived a man with his cat.") results = b1.pages.find_by_contents("cat") results.total_hits #=> 3 results.results.length #=> 2 b1.pages.total_hits #=> 3 Is this the expected behaviour? Is there any way to get total_hits to report the total_hits only for the pages belonging to b1? It will make pagination very difficult otherwise. Cheers, Steve -- Posted via http://www.ruby-forum.com/. From golak.sarangi at gmail.com Thu Aug 23 07:31:57 2007 From: golak.sarangi at gmail.com (golak Sarangi) Date: Thu, 23 Aug 2007 17:01:57 +0530 Subject: [Ferret-talk] Language support in ferret Message-ID: <3854b3a40708230431v12df9758pfd47dcbb2e56a91@mail.gmail.com> Hi, I am using ferret 0.10.9. I have indexed a whole set of data using the standard tokenizer and stem filter.
It's stemming well for English text. But when I enter any non-English character the whole application crashes, although the index doesn't get corrupted. Instead of crashing it should at least show no results. Am I missing something? -------------- next part -------------- An HTML attachment was scrubbed... URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20070823/09971ecf/attachment.html From lmlynek at gmail.com Thu Aug 23 08:32:10 2007 From: lmlynek at gmail.com (Mlynco Mlynco) Date: Thu, 23 Aug 2007 14:32:10 +0200 Subject: [Ferret-talk] scoring problem in acts_as_ferret Message-ID: Hi, I am using acts_as_ferret and have a problem with scoring. I would like to organize it in such a way that, if any of the searched terms matches, I get a score of 1.0 as a result. I will explain it with an example. I have in the index: a) "one two three four" b) "one two three" c) "one two" d) "one" When I search for "one" I would like to get a score of 1.0 for all of the indexed elements. When I search for "one two" I would like to get a score of 1.0 for a), b), c). Thanks for any help! -- Posted via http://www.ruby-forum.com/. From andreas.korth at gmail.com Thu Aug 23 11:36:07 2007 From: andreas.korth at gmail.com (Andreas Korth) Date: Thu, 23 Aug 2007 17:36:07 +0200 Subject: [Ferret-talk] scoring problem in acts_as_ferret In-Reply-To: References: Message-ID: <4AE78143-101B-4710-BEA9-C8D773C5051F@gmx.net> On 23.08.2007, at 14:32, Mlynco Mlynco wrote: > I am using acts_as_ferret and have a problem with scoring. I would > like > to organize it in such way that, if any of the searched terms fits, I > get 1.0 score as a result. I will explain it on the example. Sounds to me like the wrong approach. You won't get Ferret to score a document with 1.0 if there are more terms in the document than you search for. > I have in index: > > a) "one two three four" > b) "one two three" > c) "one two" > d) "one" > > When I search for "one" I would like to get 1.0 score for all of > indexed > elements.
When I search for "one two" I get 1.0 score for a),b),c). Question is: What do you actually want to achieve? Why do you want the documents to be scored this way? I'm sure there's a better way to do it. You might want to check out phrase queries (i.e. using quotes) and boolean operators and see if a combination of them might work for you. Cheers, Andreas From lmlynek at gmail.com Thu Aug 23 12:22:52 2007 From: lmlynek at gmail.com (Mlynco Mlynco) Date: Thu, 23 Aug 2007 18:22:52 +0200 Subject: [Ferret-talk] scoring problem in acts_as_ferret In-Reply-To: <4AE78143-101B-4710-BEA9-C8D773C5051F@gmx.net> References: <4AE78143-101B-4710-BEA9-C8D773C5051F@gmx.net> Message-ID: <4be62cae86336e2c0a3f23246dbc0a94@ruby-forum.com> Andreas Korth wrote: > On 23.08.2007, at 14:32, Mlynco Mlynco wrote: > >> I am using acts_as_ferret and have a problem with scoring. I would >> like >> to organize it in such way that, if any of the searched terms fits, I >> get 1.0 score as a result. I will explain it on the example. > > Sounds to me like the wrong approach. You won't get Ferret to score a > document with 1.0 if there are more terms in the document than you > search for. > >> I have in index: >> >> a) "one two three four" >> b) "one two three" >> c) "one two" >> d) "one" >> >> When I search for "one" I would like to get 1.0 score for all of >> indexed >> elements. When I search for "one two" I get 1.0 score for a),b),c). > > Question is: What do you actually want to achieve? Why do you want > the documents to be scored this way? I'm sure there's a better way to > do it. > > You might want to check out phrase queries (i.e. using quotes) and > boolean operators and see if a combination of them might work for you. > > Cheers, > Andreas Andreas, Probably you are right. What I don't what to do is to decrease the score for elements which have some additional terms. So, ie I am looking for a recipe. 
I have indexed some of them with such ingredients: recipe1: "beef" recipe2: "onion beef chicken" recipe3: "onion beef chicken tomato" Looking for a "beef" I wouldn't like to "punish" recipe2 and recipe3 because they are richer, I would like to treat them in the same way. Actually the score does not have to be 1.0, but I would like it to be the same for all recipes mentioned above. Any suggestions? Thanks a lot, mlynco -- Posted via http://www.ruby-forum.com/. From jk at jkraemer.net Thu Aug 23 14:02:46 2007 From: jk at jkraemer.net (Jens Kraemer) Date: Thu, 23 Aug 2007 20:02:46 +0200 Subject: [Ferret-talk] scoring problem in acts_as_ferret In-Reply-To: <4be62cae86336e2c0a3f23246dbc0a94@ruby-forum.com> References: <4AE78143-101B-4710-BEA9-C8D773C5051F@gmx.net> <4be62cae86336e2c0a3f23246dbc0a94@ruby-forum.com> Message-ID: <20070823180246.GB16215@thunder.jkraemer.net> On Thu, Aug 23, 2007 at 06:22:52PM +0200, Mlynco Mlynco wrote: [..] > > So, ie I am looking for a recipe. I have indexed some of them with such > ingredients: > > recipe1: "beef" > recipe2: "onion beef chicken" > recipe3: "onion beef chicken tomato" > > Looking for a "beef" I wouldn't like to "punish" recipe2 and recipe3 > because they are richer, I would like to treat them in the same way. > Actually the score does not have to be 1.0, but I would like it to be > the same for all recipes mentioned above. > > Any suggestions? have a look at ConstantScoreQuery. It allows to turn a Filter into a query where all results score equally. To get this Filter, you could use the QueryFilter class, which allows you to turn your original query into a Filter. 
query = ConstantScoreQuery.new(QueryFilter.new(TermQuery.new(:ingredients, 'beef'))) Looks like there should be an easier way to achieve this, though ;-) Jens -- Jens Krämer http://www.jkraemer.net/ - Blog http://www.omdb.org/ - The new free film database From jha at aughey.com Thu Aug 23 14:28:02 2007 From: jha at aughey.com (John Aughey) Date: Thu, 23 Aug 2007 13:28:02 -0500 Subject: [Ferret-talk] Key-Value extraction methods Message-ID: <7168ce3a0708231128i71b4141fs28c57fd123949fee@mail.gmail.com> I'm going to be using Ferret in a database that is more object-based than relational. Ferret will provide most of the necessary querying capability for extracting objects out of the database. But anyway, I was curious as to what methods the indexer uses to extract key-value (or field-content) pairs out of the Document. I've seen regular Ruby Hash objects given as well as raw ActiveRecord objects. I see two obvious ways to get the values - "keys" and "each do |key,value|" The reason I ask is I'll likely be giving the indexer some abstract object and I want to make sure I provide the interface it is expecting. Thank you John Aughey -------------- next part -------------- An HTML attachment was scrubbed... URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20070823/68723242/attachment.html From otmar.tschendel at philips.com Thu Aug 23 17:00:11 2007 From: otmar.tschendel at philips.com (Otmar Tschendel) Date: Thu, 23 Aug 2007 23:00:11 +0200 Subject: [Ferret-talk] index all but search in some fields Message-ID: <0026329b57de3d96c3c1535e391243ff@ruby-forum.com> Hi, I'd like to index most of my model fields, but limit the search to a (changing) subset of these fields. -- Posted via http://www.ruby-forum.com/.
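One way to search a changing subset of indexed fields is to prefix the query string with the field names joined by `|`, the `field1|field2|anotherfield:query_string` form that comes up in the replies to this question. The sketch below only builds such a string in plain Ruby. The helper name `scoped_query` and the sample field names are hypothetical, and escaping of special characters in the user's input is deliberately left out.

```ruby
# Hypothetical helper: restrict a query string to the given fields using
# the "field1|field2:query" form of the query syntax.
def scoped_query(fields, query_string)
  return query_string if fields.nil? || fields.empty?  # no restriction requested
  "#{fields.join('|')}:#{query_string}"
end

puts scoped_query([:title, :summary], "ferret")
puts scoped_query([], "ferret")
```

The resulting string would be handed to the search call in place of the bare query, so the set of fields can change per request without touching the index definition.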
From samuelgiffney at gmail.com Thu Aug 23 17:51:42 2007 From: samuelgiffney at gmail.com (Sam Giffney) Date: Thu, 23 Aug 2007 23:51:42 +0200 Subject: [Ferret-talk] custom sort routine In-Reply-To: <784142de39d96d1f4e47d5fb0699991d@ruby-forum.com> References: <784142de39d96d1f4e47d5fb0699991d@ruby-forum.com> Message-ID: <9a3e4677712b25830c4d621973f02449@ruby-forum.com> Raymond O'Connor wrote: > Right now, I'm just sorting by the popularity column in my search > results, although this doesn't always provide "good" results, neither > does just sorting by document relevance. I'd like some combination of > the two to sort by. Is it possible to do this efficiently with ferret? Another option, although requiring a bit more work for the index, would be to boost each product by a dynamic value (appropriate to a normalised popularity perhaps) at index build time. Then you could just search and popularity would automatically be utilised. Sam -- Posted via http://www.ruby-forum.com/. From jha at aughey.com Thu Aug 23 20:33:28 2007 From: jha at aughey.com (John Aughey) Date: Thu, 23 Aug 2007 19:33:28 -0500 Subject: [Ferret-talk] index all but search in some fields In-Reply-To: <0026329b57de3d96c3c1535e391243ff@ruby-forum.com> References: <0026329b57de3d96c3c1535e391243ff@ruby-forum.com> Message-ID: <7168ce3a0708231733g2c7597ceg16d0f021f710eaab@mail.gmail.com> In QueryParser you can specify a :fields => [allyourfieldsyouwanttoquery] option to choose which fields you want to be searchable. I believe this is what you want to do. John On 8/23/07, Otmar Tschendel wrote: > > Hi, > > i like to index most of my model fields, but limit the search only to a > (changing) subset of this fields. > -- > Posted via http://www.ruby-forum.com/. > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20070823/dd7e8bb6/attachment.html From jk at jkraemer.net Fri Aug 24 02:48:07 2007 From: jk at jkraemer.net (Jens Kraemer) Date: Fri, 24 Aug 2007 08:48:07 +0200 Subject: [Ferret-talk] custom sort routine In-Reply-To: <9a3e4677712b25830c4d621973f02449@ruby-forum.com> References: <784142de39d96d1f4e47d5fb0699991d@ruby-forum.com> <9a3e4677712b25830c4d621973f02449@ruby-forum.com> Message-ID: <20070824064807.GB16504@thunder.jkraemer.net> On Thu, Aug 23, 2007 at 11:51:42PM +0200, Sam Giffney wrote: > Raymond O'Connor wrote: > > Right now, I'm just sorting by the popularity column in my search > > results, although this doesn't always provide "good" results, neither > > does just sorting by document relevance. I'd like some combination of > > the two to sort by. Is it possible to do this efficiently with ferret? > > Another option, although requiring a bit more work for the index, would > be to boost each product by a dynamic value (appropriate to a normalised > popularity perhaps) at index build time. Then you could just search and > popularity would automatically be utilised. cool, didn't think of this - sounds better to me than constructing the complex queries I suggested :-) Jens -- Jens Kr?mer http://www.jkraemer.net/ - Blog http://www.omdb.org/ - The new free film database From jk at jkraemer.net Fri Aug 24 02:52:13 2007 From: jk at jkraemer.net (Jens Kraemer) Date: Fri, 24 Aug 2007 08:52:13 +0200 Subject: [Ferret-talk] index all but search in some fields In-Reply-To: <7168ce3a0708231733g2c7597ceg16d0f021f710eaab@mail.gmail.com> References: <0026329b57de3d96c3c1535e391243ff@ruby-forum.com> <7168ce3a0708231733g2c7597ceg16d0f021f710eaab@mail.gmail.com> Message-ID: <20070824065213.GC16504@thunder.jkraemer.net> On Thu, Aug 23, 2007 at 07:33:28PM -0500, John Aughey wrote: > In QueryParser you can specify a :fields => [allyourfieldsyouwanttoquery] > option to choose which fields you want to be searchable. 
I believe this is > what you want to do. > > John > > On 8/23/07, Otmar Tschendel wrote: > > > > Hi, > > > > i like to index most of my model fields, but limit the search only to a > > (changing) subset of this fields. you can also specify a list of fields in your queries: field1|field2|anotherfield:query_string Jens -- Jens Kr?mer http://www.jkraemer.net/ - Blog http://www.omdb.org/ - The new free film database From otmar.tschendel at philips.com Fri Aug 24 03:38:17 2007 From: otmar.tschendel at philips.com (Otmar Tschendel) Date: Fri, 24 Aug 2007 09:38:17 +0200 Subject: [Ferret-talk] index all but search in some fields In-Reply-To: <7168ce3a0708231733g2c7597ceg16d0f021f710eaab@mail.gmail.com> References: <0026329b57de3d96c3c1535e391243ff@ruby-forum.com> <7168ce3a0708231733g2c7597ceg16d0f021f710eaab@mail.gmail.com> Message-ID: <615f5726f95639f36227c7170e87d974@ruby-forum.com> John Aughey wrote: > In QueryParser you can specify a :fields => > [allyourfieldsyouwanttoquery] > option to choose which fields you want to be searchable. I believe this > is > what you want to do. > > John All fields should be searchable. But i like to decide in which fields to search at a give point in time. -- Posted via http://www.ruby-forum.com/. From otmar.tschendel at philips.com Fri Aug 24 05:41:43 2007 From: otmar.tschendel at philips.com (Otmar Tschendel) Date: Fri, 24 Aug 2007 11:41:43 +0200 Subject: [Ferret-talk] index all but search in some fields In-Reply-To: <20070824065213.GC16504@thunder.jkraemer.net> References: <0026329b57de3d96c3c1535e391243ff@ruby-forum.com> <7168ce3a0708231733g2c7597ceg16d0f021f710eaab@mail.gmail.com> <20070824065213.GC16504@thunder.jkraemer.net> Message-ID: Jens Kraemer wrote: > On Thu, Aug 23, 2007 at 07:33:28PM -0500, John Aughey wrote: >> > i like to index most of my model fields, but limit the search only to a >> > (changing) subset of this fields. 
> > you can also specify a list of fields in your queries: > > field1|field2|anotherfield:query_string > > Jens > > -- > Jens Kr?mer > http://www.jkraemer.net/ - Blog > http://www.omdb.org/ - The new free film database Thanks Jens, that exactly was my problem. It's working now. -- Posted via http://www.ruby-forum.com/. From syrius.ml at no-log.org Fri Aug 24 04:51:12 2007 From: syrius.ml at no-log.org (syrius.ml at no-log.org) Date: Fri, 24 Aug 2007 10:51:12 +0200 Subject: [Ferret-talk] can't create new ticket Message-ID: <87veb5wccc.87tzqpwccc@87sl69wccc.message.id> Hi there, I can't seem to find a way to create a new ticket. I'd like to create tickets for those two reproduced bugs: http://rubyforge.org/pipermail/ferret-talk/2007-June/003588.html http://rubyforge.org/pipermail/ferret-talk/2007-June/003600.html Can anybody please do it for me or point me in the right direction ? TIA -- From kraemer at webit.de Fri Aug 24 09:28:29 2007 From: kraemer at webit.de (Jens Kraemer) Date: Fri, 24 Aug 2007 15:28:29 +0200 Subject: [Ferret-talk] can't create new ticket In-Reply-To: <87veb5wccc.87tzqpwccc@87sl69wccc.message.id> References: <87veb5wccc.87tzqpwccc@87sl69wccc.message.id> Message-ID: <20070824132829.GE16680@cordoba.webit.de> On Fri, Aug 24, 2007 at 10:51:12AM +0200, syrius.ml at no-log.org wrote: > > Hi there, > > I can't seem to find a way to create a new ticket. does http://ferret.davebalmain.com/trac/newticket not work for you? Jens > I'd like to create tickets for those two reproduced bugs: > http://rubyforge.org/pipermail/ferret-talk/2007-June/003588.html > http://rubyforge.org/pipermail/ferret-talk/2007-June/003600.html > > Can anybody please do it for me or point me in the right direction ? > > TIA > > > -- > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > -- Jens Kr?mer webit! 
Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From syrius.ml at no-log.org Sat Aug 25 05:37:19 2007 From: syrius.ml at no-log.org (syrius.ml at no-log.org) Date: Sat, 25 Aug 2007 11:37:19 +0200 Subject: [Ferret-talk] can't create new ticket In-Reply-To: <20070824132829.GE16680@cordoba.webit.de> (Jens Kraemer's message of "Fri, 24 Aug 2007 15:28:29 +0200") References: <87veb5wccc.87tzqpwccc@87sl69wccc.message.id> <20070824132829.GE16680@cordoba.webit.de> Message-ID: <876434j6x3.874pioj6x3@873ay8j6x3.message.id> Jens Kraemer writes: >> I can't seem to find a way to create a new ticket. > > does http://ferret.davebalmain.com/trac/newticket not work for you? Hm, I should haven't written that I was getting "500 Internal Server Error (Submission rejected as potential spam)" messages. Could you try to submit them please ? >> I'd like to create tickets for those two reproduced bugs: >> http://rubyforge.org/pipermail/ferret-talk/2007-June/003588.html >> http://rubyforge.org/pipermail/ferret-talk/2007-June/003600.html >> >> Can anybody please do it for me or point me in the right direction ? -- From jk at jkraemer.net Sat Aug 25 06:05:30 2007 From: jk at jkraemer.net (Jens Kraemer) Date: Sat, 25 Aug 2007 12:05:30 +0200 Subject: [Ferret-talk] can't create new ticket In-Reply-To: <876434j6x3.874pioj6x3@873ay8j6x3.message.id> References: <87veb5wccc.87tzqpwccc@87sl69wccc.message.id> <20070824132829.GE16680@cordoba.webit.de> <876434j6x3.874pioj6x3@873ay8j6x3.message.id> Message-ID: <20070825100530.GB23831@thunder.jkraemer.net> On Sat, Aug 25, 2007 at 11:37:19AM +0200, syrius.ml at no-log.org wrote: > Jens Kraemer writes: > > >> I can't seem to find a way to create a new ticket. > > > > does http://ferret.davebalmain.com/trac/newticket not work for you? 
> > Hm, I should haven't written that I was getting "500 Internal Server > Error (Submission rejected as potential spam)" messages. hm, most of the time adding more text and less links to the problem description helps to get around this. Jens -- Jens Kr?mer http://www.jkraemer.net/ - Blog http://www.omdb.org/ - The new free film database
From syrius.ml at no-log.org Sun Aug 26 06:25:29 2007 From: syrius.ml at no-log.org (syrius.ml at no-log.org) Date: Sun, 26 Aug 2007 12:25:29 +0200 Subject: [Ferret-talk] can't create new ticket In-Reply-To: <20070825100530.GB23831@thunder.jkraemer.net> (Jens Kraemer's message of "Sat, 25 Aug 2007 12:05:30 +0200") References: <87veb5wccc.87tzqpwccc@87sl69wccc.message.id> <20070824132829.GE16680@cordoba.webit.de> <876434j6x3.874pioj6x3@873ay8j6x3.message.id> <20070825100530.GB23831@thunder.jkraemer.net> Message-ID: <87sl66mwej.87r6lqmwej@87ps1amwej.message.id> Jens Kraemer writes: > On Sat, Aug 25, 2007 at 11:37:19AM +0200, syrius.ml at no-log.org wrote: >> Jens Kraemer writes: >> >> >> I can't seem to find a way to create a new ticket. >> > >> > does http://ferret.davebalmain.com/trac/newticket not work for you? >> >> Hm, I should haven't written that I was getting "500 Internal Server >> Error (Submission rejected as potential spam)" messages. > > hm, most of the time adding more text and less links to the problem > description helps to get around this. hm, it seems I'm wasting my time here. I'm saying I can't create new tickets, it's obvious I've tried several time before asking ! Could you PLEASE try to create tickets for the 2 bugs I've reported ? 
http://rubyforge.org/pipermail/ferret-talk/2007-June/003588.html http://rubyforge.org/pipermail/ferret-talk/2007-June/003600.html -- From jk at jkraemer.net Sun Aug 26 08:43:19 2007 From: jk at jkraemer.net (Jens Kraemer) Date: Sun, 26 Aug 2007 14:43:19 +0200 Subject: [Ferret-talk] can't create new ticket In-Reply-To: <87sl66mwej.87r6lqmwej@87ps1amwej.message.id> References: <87veb5wccc.87tzqpwccc@87sl69wccc.message.id> <20070824132829.GE16680@cordoba.webit.de> <876434j6x3.874pioj6x3@873ay8j6x3.message.id> <20070825100530.GB23831@thunder.jkraemer.net> <87sl66mwej.87r6lqmwej@87ps1amwej.message.id> Message-ID: <20070826124319.GG23831@thunder.jkraemer.net> On Sun, Aug 26, 2007 at 12:25:29PM +0200, syrius.ml at no-log.org wrote: > Jens Kraemer writes: > > > On Sat, Aug 25, 2007 at 11:37:19AM +0200, syrius.ml at no-log.org wrote: > >> Jens Kraemer writes: > >> > >> >> I can't seem to find a way to create a new ticket. > >> > > >> > does http://ferret.davebalmain.com/trac/newticket not work for you? > >> > >> Hm, I should haven't written that I was getting "500 Internal Server > >> Error (Submission rejected as potential spam)" messages. > > > > hm, most of the time adding more text and less links to the problem > > description helps to get around this. > > hm, it seems I'm wasting my time here. > > I'm saying I can't create new tickets, it's obvious I've tried several > time before asking ! Sorry, I didn't intend to offend you. It 's just my experience with aaf's trac, that the akismet spam filter sometimes rejceted very short ticket reports as spam. As it seems, Ferret's Trac doesn't have this problem, but rejects all tickets as spam. As I don't operate this system, I'm afraid I can't do anything but contact Dave about this issue, which I've done now. 
Jens -- Jens Kr?mer http://www.jkraemer.net/ - Blog http://www.omdb.org/ - The new free film database From weibel at gmail.com Sun Aug 26 17:48:16 2007 From: weibel at gmail.com (Kasper Weibel) Date: Sun, 26 Aug 2007 23:48:16 +0200 Subject: [Ferret-talk] custom sort routine In-Reply-To: <20070824064807.GB16504@thunder.jkraemer.net> References: <784142de39d96d1f4e47d5fb0699991d@ruby-forum.com> <9a3e4677712b25830c4d621973f02449@ruby-forum.com> <20070824064807.GB16504@thunder.jkraemer.net> Message-ID: <117ec91fe3bdf7d1369ff313f703354f@ruby-forum.com> Jens Kraemer wrote: > On Thu, Aug 23, 2007 at 11:51:42PM +0200, Sam Giffney wrote: >> Raymond O'Connor wrote: >> > Right now, I'm just sorting by the popularity column in my search >> > results, although this doesn't always provide "good" results, neither >> > does just sorting by document relevance. I'd like some combination of >> > the two to sort by. Is it possible to do this efficiently with ferret? >> >> Another option, although requiring a bit more work for the index, would >> be to boost each product by a dynamic value (appropriate to a normalised >> popularity perhaps) at index build time. Then you could just search and >> popularity would automatically be utilised. > > cool, didn't think of this - sounds better to me than constructing the > complex queries I suggested :-) Cool. 
Just today I was looking for a solution like this :-)

When implementing I stumbled upon a problem indexing with
script/runner Mymodel.rebuild_index

class Mymodel < ActiveRecord::Base
  acts_as_ferret :fields => {:name => {:boost => :rating}}

  # function for determining boost value
  def rating
    return instance_rating
  end
end

This exits with

./script/../config/../vendor/rails/railties/lib/commands/runner.rb:47:
./script/../config/../vendor/plugins/acts_as_ferret/lib/acts_as_ferret.rb:136:in `add_field': can't convert Symbol into Float (TypeError)

Using the function name instead of the :symbol doesn't work either

./script/../config/../vendor/rails/railties/lib/commands/runner.rb:47:
undefined method `rating' for Place:Class (NoMethodError)
--
Posted via http://www.ruby-forum.com/.

From jk at jkraemer.net Mon Aug 27 01:49:05 2007
From: jk at jkraemer.net (Jens Kraemer)
Date: Mon, 27 Aug 2007 07:49:05 +0200
Subject: [Ferret-talk] custom sort routine
In-Reply-To: <117ec91fe3bdf7d1369ff313f703354f@ruby-forum.com>
References: <784142de39d96d1f4e47d5fb0699991d@ruby-forum.com> <9a3e4677712b25830c4d621973f02449@ruby-forum.com> <20070824064807.GB16504@thunder.jkraemer.net> <117ec91fe3bdf7d1369ff313f703354f@ruby-forum.com>
Message-ID: <20070827054905.GD14812@thunder.jkraemer.net>

Hi!

please see comments below.

On Sun, Aug 26, 2007 at 11:48:16PM +0200, Kasper Weibel wrote:
> Jens Kraemer wrote:
> > On Thu, Aug 23, 2007 at 11:51:42PM +0200, Sam Giffney wrote:
> >> Raymond O'Connor wrote:
[..]
> >> Another option, although requiring a bit more work for the index, would
> >> be to boost each product by a dynamic value (appropriate to a normalised
> >> popularity perhaps) at index build time. Then you could just search and
> >> popularity would automatically be utilised.
> >
> > cool, didn't think of this - sounds better to me than constructing the
> > complex queries I suggested :-)
>
> Cool.
Just today I was looking for a solution like this :-) > > When implementing I stumbled upon a problem indexing with > script/runner Mymodel.rebuild_index > > class Mymodel < ActiveRecord::Base > acts_as_ferret :fields => {:name => {:boost => :rating}} > > # function for determining boost value > def rating > return instance_rating > end > end > > This exits with > > ./script/../config/../vendor/rails/railties/lib/commands/runner.rb:47: > ./script/ > ../config/../vendor/plugins/acts_as_ferret/lib/acts_as_ferret.rb:136:in > `add_fie > ld': can't convert Symbol into Float (TypeError) D'uh ;-) Aaf doesn't support dynamic per-document or per-field boosts yet, at least not in the declarative way outlined above. For now, you'll have to override the to_doc instance method so you can manually apply the boost. I'll add that to aaf soon, just created a ticket: http://projects.jkraemer.net/acts_as_ferret/ticket/166 cheers, Jens -- Jens Kr?mer http://www.jkraemer.net/ - Blog http://www.omdb.org/ - The new free film database From lmlynek at gmail.com Mon Aug 27 03:34:12 2007 From: lmlynek at gmail.com (Mlynco Mlynco) Date: Mon, 27 Aug 2007 09:34:12 +0200 Subject: [Ferret-talk] scoring problem in acts_as_ferret In-Reply-To: <20070823180246.GB16215@thunder.jkraemer.net> References: <4AE78143-101B-4710-BEA9-C8D773C5051F@gmx.net> <4be62cae86336e2c0a3f23246dbc0a94@ruby-forum.com> <20070823180246.GB16215@thunder.jkraemer.net> Message-ID: <4418b734443b7e52f5a31c552681de2d@ruby-forum.com> Jens Kraemer wrote: > have a look at ConstantScoreQuery. It allows to turn a Filter into a > query where all results score equally. > > To get this Filter, you could use the QueryFilter class, which allows > you to turn your original query into a Filter. > > query = > ConstantScoreQuery.new(QueryFilter.new(TermQuery.new(:ingredients, > 'beef')) > > looks like there should be an easier way to achieve this, though ;-) > > Jens Jens, Thanks a lot for help. 
The ConstantScoreQuery with QueryFilter and TermQuery/MultiTermQuery works fine. I am able to get a constant score for all of the ingredients in the query. The problem appears when I want more precise results with multiple ingredients.

So, coming back to my example:

recipe1: "beef"
recipe2: "onion beef chicken"
recipe3: "onion beef chicken tomato"
recipe4: "onion chicken"

All results from this query score equally, and that's great:

query = ConstantScoreQuery.new(QueryFilter.new(TermQuery.new(:ingredients, 'beef')))

When I take more than one ingredient into account, I would like different scores for recipes that include three of them, two of them, or only one of them.

So I would like to create a query which returns 100% hits for recipe2 and recipe3, appropriately less for recipe4 and less again for recipe1.

I created a multi-term query like this:

multi_term_query = MultiTermQuery.new(:ingredients)
multi_term_query << "onion" << "beef" << "chicken"
query = ConstantScoreQuery.new(QueryFilter.new(multi_term_query))

but it doesn't work at all. Any ideas?

Thanks in advance,
mlynco
--
Posted via http://www.ruby-forum.com/.

From neongrau at gmail.com Mon Aug 27 09:59:47 2007
From: neongrau at gmail.com (neongrau __)
Date: Mon, 27 Aug 2007 15:59:47 +0200
Subject: [Ferret-talk] Ferret DRb on windows?
Message-ID:

i'm running a bunch of proxybalanced mongrels on a windows server.
and since the memory consumption of all those mongrels is getting too
high i wanted to set up the DRb'ed ferret server.

but script/ferret_start doesn't work and seems to be written for linux
(unix) only.

is there a way to run it on windows?
--
Posted via http://www.ruby-forum.com/.

From jk at jkraemer.net Mon Aug 27 10:22:54 2007
From: jk at jkraemer.net (Jens Kraemer)
Date: Mon, 27 Aug 2007 16:22:54 +0200
Subject: [Ferret-talk] Ferret DRb on windows?
In-Reply-To: References: Message-ID: <20070827142254.GA26287@thunder.jkraemer.net> On Mon, Aug 27, 2007 at 03:59:47PM +0200, neongrau __ wrote: > i'm running a bunch of proxybalanced mongrels on a windows server. > and since the memory consumption of all those mongrels is getting too > high i wanted to set up the DRb'ed ferret server. > > but script/ferret_start doesn't work and seems to be written for linux > (unix) only. > > is there a way to run it on windows? yes, see there: http://www.pluitsolutions.com/2007/07/30/acts-as-ferret-drbserver-win32-service/ cheers, Jens -- Jens Kr?mer http://www.jkraemer.net/ - Blog http://www.omdb.org/ - The new free film database From neongrau at gmail.com Mon Aug 27 10:29:33 2007 From: neongrau at gmail.com (neongrau __) Date: Mon, 27 Aug 2007 16:29:33 +0200 Subject: [Ferret-talk] Ferret DRb on windows? In-Reply-To: <20070827142254.GA26287@thunder.jkraemer.net> References: <20070827142254.GA26287@thunder.jkraemer.net> Message-ID: <875bba81c815b1a1df21e8293ef88cf3@ruby-forum.com> awesome! thanks alot! i was searching google like crazy and couldn't find anything myself. -- Posted via http://www.ruby-forum.com/. 
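
Going back to mlynco's scoring question in the "scoring problem in acts_as_ferret" thread above: the ranking being asked for amounts to scoring each recipe by the fraction of the queried ingredients it contains. The sketch below shows that arithmetic in plain Ruby only. It makes no Ferret calls, and the names RECIPES and ingredient_scores are made up for illustration. How to get the same ordering out of Ferret itself is not settled in the thread; combining one constant-score clause per ingredient as optional clauses of a BooleanQuery might be one direction, but that is an untested assumption.

```ruby
# Plain-Ruby illustration (no Ferret involved): score each recipe by the
# fraction of the wanted ingredients it contains.
# RECIPES and ingredient_scores are hypothetical names for this sketch.
RECIPES = {
  "recipe1" => "beef",
  "recipe2" => "onion beef chicken",
  "recipe3" => "onion beef chicken tomato",
  "recipe4" => "onion chicken"
}

def ingredient_scores(recipes, wanted)
  scores = {}
  recipes.each do |name, ingredients|
    present = ingredients.split
    # count how many of the wanted terms appear in this recipe
    matched = wanted.select { |term| present.include?(term) }
    scores[name] = matched.size.to_f / wanted.size
  end
  scores
end

scores = ingredient_scores(RECIPES, %w[onion beef chicken])

# highest score first, ties broken by name
ranked = scores.sort_by { |name, score| [-score, name] }
ranked.each { |name, score| puts format("%s %.2f", name, score) }
```

With the four recipes from the thread this ranks recipe2 and recipe3 first (all three ingredients), then recipe4 (two), then recipe1 (one).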
From jk at jkraemer.net Mon Aug 27 14:31:26 2007 From: jk at jkraemer.net (Jens Kraemer) Date: Mon, 27 Aug 2007 20:31:26 +0200 Subject: [Ferret-talk] can't create new ticket In-Reply-To: <20070826124319.GG23831@thunder.jkraemer.net> References: <87veb5wccc.87tzqpwccc@87sl69wccc.message.id> <20070824132829.GE16680@cordoba.webit.de> <876434j6x3.874pioj6x3@873ay8j6x3.message.id> <20070825100530.GB23831@thunder.jkraemer.net> <87sl66mwej.87r6lqmwej@87ps1amwej.message.id> <20070826124319.GG23831@thunder.jkraemer.net> Message-ID: <20070827183125.GD26287@thunder.jkraemer.net> Ok, Dave taught the over zealous spam filter some behaviour so I could file those tickets :-) cheers, Jens On Sun, Aug 26, 2007 at 02:43:19PM +0200, Jens Kraemer wrote: > On Sun, Aug 26, 2007 at 12:25:29PM +0200, syrius.ml at no-log.org wrote: > > Jens Kraemer writes: > > > > > On Sat, Aug 25, 2007 at 11:37:19AM +0200, syrius.ml at no-log.org wrote: > > >> Jens Kraemer writes: > > >> > > >> >> I can't seem to find a way to create a new ticket. > > >> > > > >> > does http://ferret.davebalmain.com/trac/newticket not work for you? > > >> > > >> Hm, I should haven't written that I was getting "500 Internal Server > > >> Error (Submission rejected as potential spam)" messages. > > > > > > hm, most of the time adding more text and less links to the problem > > > description helps to get around this. > > > > hm, it seems I'm wasting my time here. > > > > I'm saying I can't create new tickets, it's obvious I've tried several > > time before asking ! > > Sorry, I didn't intend to offend you. > > It 's just my experience with aaf's trac, that the akismet spam filter > sometimes rejceted very short ticket reports as spam. As it seems, > Ferret's Trac doesn't have this problem, but rejects all tickets as > spam. > > As I don't operate this system, I'm afraid I can't do anything but > contact Dave about this issue, which I've done now. 
> > > -- > Jens Kr?mer > http://www.jkraemer.net/ - Blog > http://www.omdb.org/ - The new free film database > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > -- Jens Kr?mer http://www.jkraemer.net/ - Blog http://www.omdb.org/ - The new free film database From mitchmonmouth at gmail.com Mon Aug 27 15:22:06 2007 From: mitchmonmouth at gmail.com (Mitch Monmouth) Date: Mon, 27 Aug 2007 12:22:06 -0700 Subject: [Ferret-talk] Negative numbers in range searches Message-ID: <189d64e90708271222j51efbbe5i5b0115f671f40415@mail.gmail.com> I have been trying to get negative numbers to work in range searches. This is for lat/lon in a geographic search. Obviously I could find workarounds, but it seems like this should work. If I use a straight negative number, like this [-200 -100] it will come up empty. If I escape the - with a backslash, it works sometimes, but not always, but only if I REVERSE the numbers, e.g. [\-100 \-200]. It seems to not work when the number of digits changes as here, but it works if both numbers have the same number of digits, e.g. [\-200 \-10] and [\-10 \-200] will come up empty. Overall it's pretty strange and unpredictable. I tried to file a bug but the Trac system rejected it as spam.... Does anyone know how to get this working or can anyone confirm issues with this? Thanks. -MM From gethryn at ghavalas.com Tue Aug 28 02:52:41 2007 From: gethryn at ghavalas.com (Gethryn Ghavalas) Date: Tue, 28 Aug 2007 08:52:41 +0200 Subject: [Ferret-talk] ERROR: While executing gem ...
(Gem::Installer::ExtensionBu Message-ID: Hi all, Sorry if this is answered somewhere -- I am new to ruby and to linux, and can't figure it out: When I try to install ferret (see below), I get ERROR: While executing gem ... (Gem::Installer::ExtensionBuildError). Same thing happens for any version I pick from th list. I am using: gcc (GCC) 4.1.2 (Ubuntu 4.1.2-0ubuntu4) ruby 1.8.5 (2006-08-25) [i486-linux] gem 0.9.4 No idea how to fix this -- please help! Thanks, Geth ======================================================== gethryn at gethryn-desktop:~/rails_space$ sudo gem install ferret Password: Select which gem to install for your platform (i486-linux) 1. ferret 0.11.4 (ruby) 2. ferret 0.11.4 (mswin32) 3. ferret 0.11.3 (ruby) 4. ferret 0.11.2 (ruby) 5. Skip this gem 6. Cancel installation > 1 Building native extensions. This could take a while... ERROR: While executing gem ... (Gem::Installer::ExtensionBuildError) ERROR: Failed to build gem native extension. ruby extconf.rb install ferret extconf.rb:11:in `require': no such file to load -- mkmf (LoadError) from extconf.rb:11 Gem files will remain installed in /usr/lib/ruby/gems/1.8/gems/ferret-0.11.4 for inspection. Results logged to /usr/lib/ruby/gems/1.8/gems/ferret-0.11.4/ext/gem_make.out gethryn at gethryn-desktop:~/rails_space$ -- Posted via http://www.ruby-forum.com/. From bk at benjaminkrause.com Tue Aug 28 03:45:28 2007 From: bk at benjaminkrause.com (Benjamin Krause) Date: Tue, 28 Aug 2007 09:45:28 +0200 Subject: [Ferret-talk] ERROR: While executing gem ... (Gem::Installer::ExtensionBu In-Reply-To: References: Message-ID: > ruby extconf.rb install ferret > extconf.rb:11:in `require': no such file to load -- mkmf (LoadError) > from extconf.rb:11 Geth, i would assume, your ruby setup is broken. at least for root. are you able to compile other extension or install other gems? 
mkmf is a basic library of ruby, i've got it automatically installed by ruby at /usr/lib/ruby/1.8/mkmf.rb are you able to open a irb session and require mkmf manually? did you check, if that file exists in your ruby 1.8 folder? maybe you should reinstall ruby. and did you check the log file? /usr/lib/ruby/gems/1.8/gems/ferret-0.11.4/ext/gem_make.out Ben From kraemer at webit.de Tue Aug 28 03:58:13 2007 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 28 Aug 2007 09:58:13 +0200 Subject: [Ferret-talk] ERROR: While executing gem ... (Gem::Installer::ExtensionBu In-Reply-To: References: Message-ID: <20070828075813.GA7512@cordoba.webit.de> On Tue, Aug 28, 2007 at 09:45:28AM +0200, Benjamin Krause wrote: > > ruby extconf.rb install ferret > > extconf.rb:11:in `require': no such file to load -- mkmf (LoadError) > > from extconf.rb:11 > > Geth, > > i would assume, your ruby setup is broken. at least for root. > are you able to compile other extension or install other > gems? mkmf is a basic library of ruby, i've got it automatically > installed by ruby at > > /usr/lib/ruby/1.8/mkmf.rb he's on Ubuntu, which inherited the nice feature of having Ruby split up in several packages from Debian :( mkmf is part of ruby1.8-dev. Jens -- Jens Kr?mer webit! 
Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From syrius.ml at no-log.org Tue Aug 28 04:12:53 2007 From: syrius.ml at no-log.org (syrius.ml at no-log.org) Date: Tue, 28 Aug 2007 10:12:53 +0200 Subject: [Ferret-talk] can't create new ticket In-Reply-To: <20070827183125.GD26287@thunder.jkraemer.net> (Jens Kraemer's message of "Mon, 27 Aug 2007 20:31:26 +0200") References: <87veb5wccc.87tzqpwccc@87sl69wccc.message.id> <20070824132829.GE16680@cordoba.webit.de> <876434j6x3.874pioj6x3@873ay8j6x3.message.id> <20070825100530.GB23831@thunder.jkraemer.net> <87sl66mwej.87r6lqmwej@87ps1amwej.message.id> <20070826124319.GG23831@thunder.jkraemer.net> <20070827183125.GD26287@thunder.jkraemer.net> Message-ID: <873ay484e2.871wdo84e2@87zm0c6ptm.message.id> Jens Kraemer writes: > Dave taught the over zealous spam filter some behaviour so I could file > those tickets :-) Thanks a lot Jens. -- From matthew.langham at indiginox.com Tue Aug 28 04:38:21 2007 From: matthew.langham at indiginox.com (Matthew Langham) Date: Tue, 28 Aug 2007 10:38:21 +0200 Subject: [Ferret-talk] Still getting "too many open files" Message-ID: We have still having problems with Ferret dying on us regularly with the error message: >> ferret server error IO Error occured at :93 in xraiseError occured in fs_store.c:127 - fs_each doing 'each' in /var/www/web1/oms/current/script/../config/../index/production/band/20070805130005: << We are running Ferret as a DrbServer and using ferret 0.11.3. I've read about using ulimit to up the file number - but doesn't that just affect the current shell - i.e. where do I need to put this when running my mongrels etc? Thanks for any other tips - this is currently a bit frustrating. Matthew -- Posted via http://www.ruby-forum.com/. 
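
On the ulimit question just above: a limit set with ulimit -n applies to the shell that runs it and to the processes started from that shell, so it has to be in effect in whatever launches the DRb server and the mongrels (an init script, for example). As a rough sketch, and assuming a Unix-like system, a Ruby process can also inspect its own open-file limit and try to raise the soft limit up to the hard limit at startup. The target of 4096 is an arbitrary value chosen for illustration, not a recommendation from the thread, and whether a higher limit cures the underlying error is not established here.

```ruby
# Sketch: inspect and, if permitted, raise this process's open-file limit.
# Assumes a Unix-like system where Process::RLIMIT_NOFILE is defined.
soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)
puts "open files: soft=#{soft} hard=#{hard}"

wanted = 4096                  # hypothetical target, pick what fits your setup
target = [wanted, hard].min    # the soft limit may not exceed the hard limit

if soft < target
  begin
    Process.setrlimit(Process::RLIMIT_NOFILE, target, hard)
  rescue Errno::EPERM, Errno::EINVAL => e
    warn "could not raise open-file limit: #{e.message}"
  end
end

new_soft, _new_hard = Process.getrlimit(Process::RLIMIT_NOFILE)
puts "open files now: soft=#{new_soft}"
```

Raising the hard limit itself normally needs root or a system-wide setting, which is outside what this sketch does.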
From gethryn at ghavalas.com Tue Aug 28 04:47:27 2007 From: gethryn at ghavalas.com (Gethryn Ghavalas) Date: Tue, 28 Aug 2007 10:47:27 +0200 Subject: [Ferret-talk] ERROR: While executing gem ... (Gem::Installer::Extensio In-Reply-To: <20070828075813.GA7512@cordoba.webit.de> References: <20070828075813.GA7512@cordoba.webit.de> Message-ID: Jens Kraemer wrote: > he's on Ubuntu, which inherited the nice feature of having Ruby split up > in several packages from Debian :( > > mkmf is part of ruby1.8-dev. > A THOUSAND THANK YOUS! I can finish the book I am learning from now! "Successfully installed ferret-0.11.4" worked perfectly after I ran: $ sudo apt-get install ruby1.8-dev Thanks again, Geth -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Tue Aug 28 05:09:14 2007 From: kraemer at webit.de (Jens Kraemer) Date: Tue, 28 Aug 2007 11:09:14 +0200 Subject: [Ferret-talk] Still getting "too many open files" In-Reply-To: References: Message-ID: <20070828090914.GB7512@cordoba.webit.de> On Tue, Aug 28, 2007 at 10:38:21AM +0200, Matthew Langham wrote: > We have still having problems with Ferret dying on us regularly with the > error message: > > >> > ferret server error IO Error occured at :93 in xraiseError > occured in fs_store.c:127 - fs_each > doing 'each' in > /var/www/web1/oms/current/script/../config/../index/production/band/20070805130005: > > << > > We are running Ferret as a DrbServer and using ferret 0.11.3. > > I've read about using ulimit to up the file number - but doesn't that > just affect the current shell - i.e. where do I need to put this when > running my mongrels etc? maybe this does help: http://forums.suselinuxsupport.de/index.php?showtopic=34112&st=0&p=153853&#entry153853 (afair you were on Suse?) cheers, Jens -- Jens Kr?mer webit! 
Gesellschaft f?r neue Medien mbH Schnorrstra?e 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From steve.tooke at gmail.com Tue Aug 28 05:55:21 2007 From: steve.tooke at gmail.com (Steve Tooke) Date: Tue, 28 Aug 2007 11:55:21 +0200 Subject: [Ferret-talk] AAF: find_by_contents on AR Association Total Hits In-Reply-To: <171d91a5cf5d3fd06b8f65fcb7d2eb78@ruby-forum.com> References: <171d91a5cf5d3fd06b8f65fcb7d2eb78@ruby-forum.com> Message-ID: <380180d8c7335fe69588172e1dda5346@ruby-forum.com> Is nobody else coming across this problem? Steve Tooke wrote: > I seem to be getting some behaviour thats unexpected (for me anyway) > when using find_by_contents on an ActiveRecord has_many association. The > results that are returned are only the records that belong to the model > returned, but the total_hits that are being returned appear to be for > the whole table. -- Posted via http://www.ruby-forum.com/. From jk at jkraemer.net Tue Aug 28 06:19:51 2007 From: jk at jkraemer.net (Jens Kraemer) Date: Tue, 28 Aug 2007 12:19:51 +0200 Subject: [Ferret-talk] AAF: find_by_contents on AR Association Total Hits In-Reply-To: <380180d8c7335fe69588172e1dda5346@ruby-forum.com> References: <171d91a5cf5d3fd06b8f65fcb7d2eb78@ruby-forum.com> <380180d8c7335fe69588172e1dda5346@ruby-forum.com> Message-ID: <20070828101951.GK26287@thunder.jkraemer.net> On Tue, Aug 28, 2007 at 11:55:21AM +0200, Steve Tooke wrote: > Is nobody else coming across this problem? sorry, seems your mail slipped through somehow. In fact what you are doing is interesting - I never thought of AAF being used on a has_many collection yet. Looks like in one place the scoping is working correctly, but when counting the results it doesn't. As a work around, use Page.find_by_contents(query, {}, { :conditions => ["book_id=?", book.id ] }) to restrict aaf on a special book. 
Could you please file a ticket for the enhancement to support calling aaf directly on collections at http://projects.jkraemer.net/acts_as_ferret/ ? cheers, Jens -- Jens Krämer http://www.jkraemer.net/ - Blog http://www.omdb.org/ - The new free film database From steve.tooke at gmail.com Tue Aug 28 06:36:22 2007 From: steve.tooke at gmail.com (Steve Tooke) Date: Tue, 28 Aug 2007 12:36:22 +0200 Subject: [Ferret-talk] AAF: find_by_contents on AR Association Total Hits In-Reply-To: <20070828101951.GK26287@thunder.jkraemer.net> References: <171d91a5cf5d3fd06b8f65fcb7d2eb78@ruby-forum.com> <380180d8c7335fe69588172e1dda5346@ruby-forum.com> <20070828101951.GK26287@thunder.jkraemer.net> Message-ID: Jens Kraemer wrote: > sorry, seems your mail slipped through somehow. Thanks for the quick reply, in hindsight it could probably have a better subject! > Page.find_by_contents(query, {}, { :conditions => ["book_id=?", book.id ] }) Thanks that works great! > Could you please file a ticket for the enhancement to support calling > aaf directly on collections at > http://projects.jkraemer.net/acts_as_ferret/ ? http://projects.jkraemer.net/acts_as_ferret/ticket/167 Great plugin, many thanks Steve -- Posted via http://www.ruby-forum.com/. From jeff.green at jgp.co.uk Tue Aug 28 06:38:57 2007 From: jeff.green at jgp.co.uk (Jeff Green) Date: Tue, 28 Aug 2007 11:38:57 +0100 Subject: [Ferret-talk] ERROR: While executing gem...(Gem::Installer::ExtensionBu In-Reply-To: References: Message-ID: <340d5e2c424a21c0ccba41cde90cc86346d3fc30@jobsgopublic.com> Debian makes the very civilised assumption that not everyone who runs an application needs the ability to compile the application. Now if we can just get a nice integration between apt and gem we would have a much more usable and secure system.
On Tue, 2007-08-28 at 08:58 +0100, Jens Kraemer wrote: > On Tue, Aug 28, 2007 at 09:45:28AM +0200, Benjamin Krause wrote: > > > ruby extconf.rb install ferret > > > extconf.rb:11:in `require': no such file to load -- mkmf > (LoadError) > > > from extconf.rb:11 > > > > Geth, > > > > i would assume, your ruby setup is broken. at least for root. > > are you able to compile other extensions or install other > > gems? mkmf is a basic library of ruby, i've got it automatically > > installed by ruby at > > > > /usr/lib/ruby/1.8/mkmf.rb > > he's on Ubuntu, which inherited the nice feature of having Ruby split > up > in several packages from Debian :( > > mkmf is part of ruby1.8-dev. > > Jens > > -- > Jens Krämer > webit! Gesellschaft für neue Medien mbH > Schnorrstraße 76 | 01069 Dresden > Telefon +49 351 46766-0 | Telefax +49 351 46766-66 > kraemer at webit.de | www.webit.de > > Amtsgericht Dresden | HRB 15422 > GF Sven Haubold, Hagen Malessa > _______________________________________________ > Ferret-talk mailing list > Ferret-talk at rubyforge.org > http://rubyforge.org/mailman/listinfo/ferret-talk > > Jobs Go Public is a limited company registered in England and Wales. Registration Number 3716200. V.A.T Number 777 9458 52. Registered office; 12-16 Laystall Street, London, EC1R 4PF Consider the environment; please don't print this email unless you really need to. From andreas.korth at gmail.com Tue Aug 28 07:47:56 2007 From: andreas.korth at gmail.com (Andreas Korth) Date: Tue, 28 Aug 2007 13:47:56 +0200 Subject: [Ferret-talk] ERROR: While executing gem...(Gem::Installer::ExtensionBu In-Reply-To: <340d5e2c424a21c0ccba41cde90cc86346d3fc30@jobsgopublic.com> References: <340d5e2c424a21c0ccba41cde90cc86346d3fc30@jobsgopublic.com> Message-ID: <86FF2384-2DEC-484C-A114-546B8AC6AA73@gmx.net> On 28.08.2007, at 12:38, Jeff Green wrote: > Debian makes the very civilised assumption that not everyone who runs > an application needs the ability to compile the application.
Now if we > can just get a nice integration between apt and gem we would have a > much > more usable and secure system. RubyGems and APT collide here. Debian's policy is to favor APT over RubyGems, because they don't like the idea of having two package management systems. Fair enough. If you go the Debian way, you would install any gems via APT. Since not every gem (including Ferret) exists as a Debian package, your possibilities are quite limited with this approach. The alternative is to install only ruby, ruby-dev and rubygems via APT and then install gems via RubyGems. However, for gems with native extensions, you don't have the luxury of dependency management. For example: if you install the RMagick gem, you need to install the ImageMagick (dev) packages via APT first. Had you installed the RMagick gem via APT, this dependency would have been installed automatically. On Debian you always need to install the development files for a package in order to compile anything against it. Development packages begin with "lib" and end with "-dev". So for MySQL and ImageMagick these are "libmysqlclient15-dev" and "libmagick9-dev", respectively. It's important to note that, in order to compile _anything_ on Debian, the gcc, make, autoconf and build-essential packages must be installed first. Cheers, Andy From matthew.langham at indiginox.com Tue Aug 28 13:20:40 2007 From: matthew.langham at indiginox.com (Matthew Langham) Date: Tue, 28 Aug 2007 19:20:40 +0200 Subject: [Ferret-talk] Still getting "too many open files" In-Reply-To: <20070828090914.GB7512@cordoba.webit.de> References: <20070828090914.GB7512@cordoba.webit.de> Message-ID: <74694836f028684434305c8341bfb5ed@ruby-forum.com> Jens Kraemer wrote: > On Tue, Aug 28, 2007 at 10:38:21AM +0200, Matthew Langham wrote: >> >> We are running Ferret as a DrbServer and using ferret 0.11.3. >> >> I've read about using ulimit to up the file number - but doesn't that >> just affect the current shell - i.e.
where do I need to put this when >> running my mongrels etc? > > maybe this does help: > http://forums.suselinuxsupport.de/index.php?showtopic=34112&st=0&p=153853&#entry153853 > > (afair you were on Suse?) > No, Debian Etch. Any differences? Thanks Matthew -- Posted via http://www.ruby-forum.com/. From kraemer at webit.de Wed Aug 29 04:05:29 2007 From: kraemer at webit.de (Jens Kraemer) Date: Wed, 29 Aug 2007 10:05:29 +0200 Subject: [Ferret-talk] Still getting "too many open files" In-Reply-To: <74694836f028684434305c8341bfb5ed@ruby-forum.com> References: <20070828090914.GB7512@cordoba.webit.de> <74694836f028684434305c8341bfb5ed@ruby-forum.com> Message-ID: <20070829080529.GF7512@cordoba.webit.de> On Tue, Aug 28, 2007 at 07:20:40PM +0200, Matthew Langham wrote: > Jens Kraemer wrote: > > On Tue, Aug 28, 2007 at 10:38:21AM +0200, Matthew Langham wrote: > >> > >> We are running Ferret as a DrbServer and using ferret 0.11.3. > >> > >> I've read about using ulimit to up the file number - but doesn't that > >> just affect the current shell - i.e. where do I need to put this when > >> running my mongrels etc? > > > > maybe this does help: > > http://forums.suselinuxsupport.de/index.php?showtopic=34112&st=0&p=153853&#entry153853 > > > > (afair you were on Suse?) > > > > No, Debian Etch. Any differences? no, looks like this should work on debian in the same way. Jens -- Jens Krämer webit!
Gesellschaft für neue Medien mbH Schnorrstraße 76 | 01069 Dresden Telefon +49 351 46766-0 | Telefax +49 351 46766-66 kraemer at webit.de | www.webit.de Amtsgericht Dresden | HRB 15422 GF Sven Haubold, Hagen Malessa From mitchmonmouth at gmail.com Thu Aug 30 17:36:40 2007 From: mitchmonmouth at gmail.com (Mitch Monmouth) Date: Thu, 30 Aug 2007 14:36:40 -0700 Subject: [Ferret-talk] Method missing error after switching to DRB Message-ID: <189d64e90708301436h4ddd6da3kd001ec8b6035bcc6@mail.gmail.com> I am getting these errors after switching to DRb: It is trying to call 'add' on MY SourceListing class, not extended with the ferret indexing methods. Any ideas on where to fix this? I'm combing through the code now. no luck, trying to call class method instead ferret server error undefined method `add' for SourceListing:Class /var/lib/gems/1.8/gems/activerecord-1.15.3/lib/active_record/base.rb:1235:in `method_missing' /var/lib/gems/1.8/gems/acts_as_ferret-0.4.1/lib/ferret_server.rb:67:in `send' /var/lib/gems/1.8/gems/acts_as_ferret-0.4.1/lib/ferret_server.rb:67:in `method_missing' /var/lib/gems/1.8/gems/acts_as_ferret-0.4.1/lib/ferret_server.rb:113:in `with_class' /var/lib/gems/1.8/gems/acts_as_ferret-0.4.1/lib/ferret_server.rb:62:in `method_missing' /usr/lib/ruby/1.8/drb/drb.rb:1555:in `__send__' /usr/lib/ruby/1.8/drb/drb.rb:1555:in `perform_without_block' /usr/lib/ruby/1.8/drb/drb.rb:1515:in `perform' /usr/lib/ruby/1.8/drb/drb.rb:1589:in `main_loop' /usr/lib/ruby/1.8/drb/drb.rb:1585:in `loop' /usr/lib/ruby/1.8/drb/drb.rb:1585:in `main_loop' /usr/lib/ruby/1.8/drb/drb.rb:1581:in `start' /usr/lib/ruby/1.8/drb/drb.rb:1581:in `main_loop' /usr/lib/ruby/1.8/drb/drb.rb:1430:in `run' /usr/lib/ruby/1.8/drb/drb.rb:1427:in `start' /usr/lib/ruby/1.8/drb/drb.rb:1427:in `run' /usr/lib/ruby/1.8/drb/drb.rb:1347:in `initialize' /usr/lib/ruby/1.8/drb/drb.rb:1627:in `new' /usr/lib/ruby/1.8/drb/drb.rb:1627:in 
`start_service'\n/var/lib/gems/1.8/gems/acts_as_ferret-0.4.1/lib/ferret_server.rb:47:in `start' (eval):55 /usr/lib/ruby/1.8/rubygems/custom_require.rb:27:in `eval' /var/lib/gems/1.8/gems/rails-1.2.3/lib/commands/runner.rb:45 /usr/lib/ruby/1.8/rubygems/custom_require.rb:27:in `gem_original_require' /usr/lib/ruby/1.8/rubygems/custom_require.rb:27:in `require' script/runner:3 and on the other side: Exception: NoMethodError: undefined method `add' for SourceListing:Class (druby://localhost:9010) /var/lib/gems/1.8/gems/activerecord-1.15.3/lib/active_record/base.rb:1235:in `method_missing' (druby://localhost:9010) /var/lib/gems/1.8/gems/acts_as_ferret-0.4.1/lib/ferret_server.rb:67:in `send' (druby://localhost:9010) /var/lib/gems/1.8/gems/acts_as_ferret-0.4.1/lib/ferret_server.rb:67:in `method_missing' (druby://localhost:9010) /var/lib/gems/1.8/gems/acts_as_ferret-0.4.1/lib/ferret_server.rb:113:in `with_class' (druby://localhost:9010) /var/lib/gems/1.8/gems/acts_as_ferret-0.4.1/lib/ferret_server.rb:62:in `method_missing' /var/lib/gems/1.8/gems/acts_as_ferret-0.4.1/lib/remote_index.rb:31:in `<<' /var/lib/gems/1.8/gems/acts_as_ferret-0.4.1/lib/instance_methods.rb:73:in `ferret_create' /var/lib/gems/1.8/gems/activerecord-1.15.3/lib/active_record/callbacks.rb:333:in `send' /var/lib/gems/1.8/gems/activerecord-1.15.3/lib/active_record/callbacks.rb:333:in `callback' /var/lib/gems/1.8/gems/activerecord-1.15.3/lib/active_record/callbacks.rb:330:in `each' /var/lib/gems/1.8/gems/activerecord-1.15.3/lib/active_record/callbacks.rb:330:in `callback' /var/lib/gems/1.8/gems/activerecord-1.15.3/lib/active_record/callbacks.rb:255:in `create_without_timestamps' /var/lib/gems/1.8/gems/activerecord-1.15.3/lib/active_record/timestamp.rb:39:in `create' /var/lib/gems/1.8/gems/activerecord-1.15.3/lib/active_record/base.rb:1789:in `create_or_update_without_callbacks' /var/lib/gems/1.8/gems/activerecord-1.15.3/lib/active_record/callbacks.rb:242:in `create_or_update' 
/var/lib/gems/1.8/gems/activerecord-1.15.3/lib/active_record/base.rb:1545:in `save_without_validation' /var/lib/gems/1.8/gems/activerecord-1.15.3/lib/active_record/validations.rb:752:in `save_without_transactions' /var/lib/gems/1.8/gems/activerecord-1.15.3/lib/active_record/transactions.rb:129:in `save' /var/lib/gems/1.8/gems/activerecord-1.15.3/lib/active_record/connection_adapters/abstract/database_statements.rb:59:in `transaction' /var/lib/gems/1.8/gems/activerecord-1.15.3/lib/active_record/transactions.rb:95:in `transaction' /var/lib/gems/1.8/gems/activerecord-1.15.3/lib/active_record/transactions.rb:121:in `transaction' /var/lib/gems/1.8/gems/activerecord-1.15.3/lib/active_record/transactions.rb:129:in `save' (eval):134:in `process_listing' (eval):50:in `proc_uris' (eval):34:in `each' -------------- next part -------------- An HTML attachment was scrubbed... URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20070830/fbe39974/attachment.html From mitchmonmouth at gmail.com Thu Aug 30 19:12:13 2007 From: mitchmonmouth at gmail.com (Mitch Monmouth) Date: Thu, 30 Aug 2007 16:12:13 -0700 Subject: [Ferret-talk] No method exception - figured it out Message-ID: <189d64e90708301612g342d2b75yaba5dc986435e31e@mail.gmail.com> I had to dig through the source and do a lot of debug printing, but I figured out that because I was using NavigableString objects generated from RubyfulSoup parsing lib, instead of String, the drb server was not properly unmarshalling the object. I fixed it by calling to_s on all my strings. ActiveRecord and the local ferret server handled this fine, but the remote one chokes. The error was masked twice by the exception trapping code in acts_as_ferret. How about using respond_to? instead of catching NoMethodError so that underlying errors will be exposed. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://rubyforge.org/pipermail/ferret-talk/attachments/20070830/0dc08cbf/attachment-0001.html From hakita at gmail.com Fri Aug 31 12:24:35 2007 From: hakita at gmail.com (Hakita Hakita) Date: Fri, 31 Aug 2007 18:24:35 +0200 Subject: [Ferret-talk] ferret acts_as_ferret and performance Message-ID: hello, I am currently indexing thousands of 1 KB text documents using ferret and acts_as_ferret, and I face performance problems. It takes me hours to index 20 000 1 KB text documents. Methodology used : I create an object, fill it with the text, and save it. So it is automatically indexed. Is there a way to make it faster ? ( remove the auto optimize option somewhere ?) Thank you if you have any ideas... Regards -- Posted via http://www.ruby-forum.com/. From jk at jkraemer.net Fri Aug 31 18:06:35 2007 From: jk at jkraemer.net (Jens Kraemer) Date: Sat, 1 Sep 2007 00:06:35 +0200 Subject: [Ferret-talk] ferret acts_as_ferret and performance In-Reply-To: References: Message-ID: <20070831220635.GE11981@thunder.jkraemer.net> On Fri, Aug 31, 2007 at 06:24:35PM +0200, Hakita Hakita wrote: > hello, > > I am currently indexing thousands of 1 KB text documents using ferret and > acts_as_ferret, and I face performance problems. > It takes me hours to index 20 000 1 KB text documents. > > Methodology used : > > I create an object, fill it with the text, and save it. So it is > automatically indexed. > > Is there a way to make it faster ? ( remove the auto optimize option > somewhere ?) You should disable Ferret indexing before you start creating your records, then create them, enable Ferret again and index them as a whole:
  Model.disable_ferret
  # create records here, collect ids in id_array
  Model.enable_ferret
  Model.bulk_index(id_array)
bulk_index temporarily turns off auto_flush, and optimizes the index after finishing.
I just committed these functions to trunk, so let us know how it works ;-) Jens -- Jens Krämer http://www.jkraemer.net/ - Blog http://www.omdb.org/ - The new free film database
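For anyone reading this in the archive, the disable / create / enable / bulk_index sequence described above might be wrapped up roughly like this. It is only a sketch and not acts_as_ferret code: FakeModel and import_documents are hypothetical stand-ins invented for illustration, and only the three method names (disable_ferret, enable_ferret, bulk_index) come from the message. The stub just counts index writes, to show why batching should help; how the real plugin behaves internally is an assumption here.

```ruby
# Hypothetical stand-in for an acts_as_ferret model. NOT the plugin's code;
# only the names disable_ferret, enable_ferret and bulk_index are taken
# from the message above.
class FakeModel
  @ferret_enabled = true
  @records = {}
  @index_writes = 0     # how many times the simulated index was touched
  @indexed_ids = []

  class << self
    attr_reader :index_writes, :indexed_ids

    def disable_ferret
      @ferret_enabled = false
    end

    def enable_ferret
      @ferret_enabled = true
    end

    # Simulates saving a record: one index write per record,
    # but only while indexing is enabled.
    def create(text)
      id = @records.size + 1
      @records[id] = text
      if @ferret_enabled
        @index_writes += 1
        @indexed_ids << id
      end
      id
    end

    # Simulates indexing many records as one batch (a single index write).
    def bulk_index(ids)
      @index_writes += 1
      @indexed_ids.concat(ids)
    end
  end
end

# Hypothetical helper: create all records with indexing switched off,
# then index them in one go.
def import_documents(model, texts)
  model.disable_ferret
  ids = []
  begin
    texts.each { |text| ids << model.create(text) }
  ensure
    # Switch indexing back on even if a create raised.
    model.enable_ferret
  end
  model.bulk_index(ids)
  ids
end

ids = import_documents(FakeModel, ["first doc", "second doc", "third doc"])
```

The ensure block is a design choice of this sketch, so that a failed import does not leave indexing switched off for the rest of the process.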