From u.alberton at gmail.com Wed Jun 4 11:20:48 2008
From: u.alberton at gmail.com (Bira)
Date: Wed, 4 Jun 2008 10:20:48 -0500
Subject: [Ferret-talk] Is Ferret's SVN repository down?
Message-ID:

Is Ferret's SVN repository down? I'm trying to get the source from
svn://davebalmain.com/ferret, but I keep getting "Connection Refused"
errors. Did the URL change, or is Ferret's repository really down?

--
Bira
http://compexplicita.wordpress.com
http://compexplicita.tumblr.com

From efren088 at gmail.com Wed Jun 4 15:58:20 2008
From: efren088 at gmail.com (Efrén Díaz)
Date: Wed, 4 Jun 2008 15:58:20 -0400
Subject: [Ferret-talk] Ferret's Memory usage when searching.
Message-ID: <57ba33070806041258x31455603g3a7957ccb3dd99cb@mail.gmail.com>

Hello,

We are experiencing some performance problems when using Ferret and we
are trying to isolate the problem.

We have about 80 GB of indexes for one of our clients, and when a
search is performed on those indexes the application gets really slow
and eventually stops responding. We've been monitoring the memory
usage, and it rises very rapidly as the indexes are being loaded.

Ferret's documentation says the index reader is automatically closed
during garbage collection, but either this doesn't work, or it takes
much longer to happen than would be ideal for us.

So we are running out of memory and the mongrel instances become
unresponsive, to the point that not even monit can restart them; we
have to kill the instances manually.

Does anyone know how Ferret manages its memory usage? Does it try to
load all the indexes needed for a search into RAM all at once?

If that's the case, what happens when the index size exceeds the
available RAM?

Has anyone had this problem before?

Any help anyone can provide will be greatly appreciated.
--
Efrén Díaz

From kraemer at webit.de Thu Jun 5 06:33:42 2008
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 5 Jun 2008 12:33:42 +0200
Subject: [Ferret-talk] Ferret's Memory usage when searching.
In-Reply-To: <57ba33070806041258x31455603g3a7957ccb3dd99cb@mail.gmail.com>
References: <57ba33070806041258x31455603g3a7957ccb3dd99cb@mail.gmail.com>
Message-ID: <20080605103342.GC8312@cordoba.webit.de>

Hi!

First of all, among how many indexes are those 80 GB distributed, and
what size is the largest index? Anyway, this sounds like a really huge
setup :-)

On Wed, Jun 04, 2008 at 03:58:20PM -0400, Efrén Díaz wrote:
> Hello,
>
> We are experiencing some performance problems when using Ferret and we
> are trying to isolate the problem.
>
> We have about 80 GB of indexes for one of our clients, and when a
> search is performed on those indexes the application gets really slow
> and eventually stops responding. We've been monitoring the memory
> usage, and it rises very rapidly as the indexes are being loaded.
>
> Ferret's documentation says the index reader is automatically closed
> during garbage collection, but either this doesn't work, or it takes
> much longer to happen than would be ideal for us.

Did you try to manually close the readers instead of waiting for the GC
to do it?

> So we are running out of memory and the mongrel instances become
> unresponsive, to the point that not even monit can restart them; we
> have to kill the instances manually.

Probably not a good idea to open the readers directly inside the
mongrels, since this of course will multiply the maximum memory needed
by the number of mongrel instances running.

> Does anyone know how Ferret manages its memory usage? Does it try to
> load all the indexes needed for a search into RAM all at once?

It doesn't try to load the whole index into RAM, but for sure a reader
keeps some data structures in memory to speed up searching.
That would be another benefit of having a separate (multithreaded)
server handling the search: readers are opened once on startup and kept
open all the time, or at least until you re-open them to reflect any
index changes.

> If that's the case, what happens when the index size exceeds the
> available RAM?

It's definitely possible to search an index larger than the available
RAM. However, I don't know a way to estimate the amount of RAM needed
for searching an index of a given size. I'd say this also depends on
your usage pattern and index contents (i.e., number of terms and
documents).

> Has anyone had this problem before?

No, but I don't have indexes that large, either.

Cheers,
Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold

From jk at jkraemer.net Thu Jun 5 06:40:05 2008
From: jk at jkraemer.net (Jens Kraemer)
Date: Thu, 5 Jun 2008 12:40:05 +0200
Subject: [Ferret-talk] Is Ferret's SVN repository down?
In-Reply-To:
References:
Message-ID: <20080605104005.GF11766@thunder.jkraemer.net>

Hi!

Same here, I mailed Dave about it.

Cheers,
Jens

On Wed, Jun 04, 2008 at 10:20:48AM -0500, Bira wrote:
> Is Ferret's SVN repository down? I'm trying to get the source from
> svn://davebalmain.com/ferret, but I keep getting "Connection Refused"
> errors. Did the URL change, or is Ferret's repository really down?
>
> --
> Bira
> http://compexplicita.wordpress.com
> http://compexplicita.tumblr.com
> _______________________________________________
> Ferret-talk mailing list
> Ferret-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/ferret-talk
>

--
Jens Krämer
Finkenlust 14, 06449 Aschersleben, Germany
VAT Id DE251962952
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database

From kraemer at webit.de Thu Jun 5 07:37:27 2008
From: kraemer at webit.de (Jens Kraemer)
Date: Thu, 5 Jun 2008 13:37:27 +0200
Subject: [Ferret-talk] Is Ferret's SVN repository down?
In-Reply-To: <20080605104005.GF11766@thunder.jkraemer.net>
References: <20080605104005.GF11766@thunder.jkraemer.net>
Message-ID: <20080605113727.GE8312@cordoba.webit.de>

SVN is back. Git repository coming soon ;-)

cheers,
Jens

On Thu, Jun 05, 2008 at 12:40:05PM +0200, Jens Kraemer wrote:
> Hi!
>
> same here, I mailed Dave about it.
>
> Cheers,
> Jens
>
> On Wed, Jun 04, 2008 at 10:20:48AM -0500, Bira wrote:
> > Is Ferret's SVN repository down? I'm trying to get the source from
> > svn://davebalmain.com/ferret, but I keep getting "Connection Refused"
> > errors. Did the URL change, or is Ferret's repository really down?

--
Jens Krämer
webit!
Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold

From efren088 at gmail.com Fri Jun 6 00:21:58 2008
From: efren088 at gmail.com (Efrén Díaz)
Date: Fri, 6 Jun 2008 00:21:58 -0400
Subject: [Ferret-talk] Making sorting case insensitive
Message-ID: <57ba33070806052121u7ba3938eh6d656157771da6be@mail.gmail.com>

Hello,

I've been trying to make sorting in Ferret case insensitive without
any luck. I've searched the mailing list and the docs and found nothing.

Apparently setting the type option for the sort to string should do
it, but it doesn't.

Does anyone know how to achieve this?

Thanks.

--
Efrén Díaz

http://www.efrendiaz.com

From toastkid.williams at gmail.com Fri Jun 6 05:22:19 2008
From: toastkid.williams at gmail.com (Max Williams)
Date: Fri, 6 Jun 2008 10:22:19 +0100
Subject: [Ferret-talk] Making sorting case insensitive
In-Reply-To: <57ba33070806052121u7ba3938eh6d656157771da6be@mail.gmail.com>
References: <57ba33070806052121u7ba3938eh6d656157771da6be@mail.gmail.com>
Message-ID:

You could always downcase the field you want to sort by before saving it
to the index - is that an option?

2008/6/6 Efrén Díaz:
> Hello,
>
> I've been trying to make sorting in Ferret case insensitive without
> any luck. I've searched the mailing list and the docs and found nothing.
>
> Apparently setting the type option for the sort to string should do
> it, but it doesn't.
>
> Does anyone know how to achieve this?
>
> Thanks.
>
> --
> Efrén Díaz
>
> http://www.efrendiaz.com
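A minimal sketch of the downcasing suggestion above, in plain Ruby and
without touching Ferret itself. The helper name sort_key and the idea of
storing its result in a separate sort-only field are assumptions made for
illustration; the thread does not say how the field would be declared.

```ruby
# Hypothetical helper: the value one might store in a dedicated,
# sort-only field instead of the original mixed-case text.
def sort_key(value)
  value.to_s.strip.downcase
end

titles = ["banana", "Cherry", "apple"]

# Plain string ordering compares bytes, so capitals sort first.
bytewise = titles.sort

# Ordering on the downcased key ignores case.
case_insensitive = titles.sort_by { |title| sort_key(title) }

p bytewise
p case_insensitive
```

The original value can still be stored and displayed as-is; only the key
used for ordering is normalized.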
From kraemer at webit.de Fri Jun 6 05:36:15 2008
From: kraemer at webit.de (Jens Kraemer)
Date: Fri, 6 Jun 2008 11:36:15 +0200
Subject: [Ferret-talk] project directions
In-Reply-To:
References:
Message-ID: <20080606093615.GF8312@cordoba.webit.de>

Hi!

Just had a small email conversation with Dave Balmain and he mentioned
another developer who will be working on Ferret in the future along with
him. So I think yes, development will go on :-)

Cheers,
Jens

On Mon, May 26, 2008 at 01:34:17PM +1000, Julio Cesar Ody wrote:
> Hey all,
>
> just recently I stumbled upon this
>
> http://ferret.davebalmain.com/trac/timeline
>
> which seemed like good news. I thought Ferret was put on hold or
> perhaps dying, and having participated recently in a few discussions
> with people who also thought that was the case, I didn't have a good
> answer for it.
>
> So, is anyone informed if there will be some development going on
> Ferret, as in consistently?
>
> Thanks.
>
> ps: I'm not bitching.

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold

From hulme at ebi.ac.uk Wed Jun 11 05:06:15 2008
From: hulme at ebi.ac.uk (Robert Hulme)
Date: Wed, 11 Jun 2008 10:06:15 +0100
Subject: [Ferret-talk] Similar words
Message-ID:

Is there a way to get a list of similar words to the ones a user has
searched for?

So if they search for (in my case) transferaze, which has no matches, I
can get back an array like this: ['transferase']?

I know I can just add ~ on the end to make it fuzzy, but what I'd like
is to be able to say "Sorry, no matches for 'transferaze'. Did you mean
'transferase' (310 matches)?"
Ideally I'd like to get the number of matches for those similar words,
but I know I could just do a search for each of those to get it.

Is this possible?

-Rob

From julioody at gmail.com Wed Jun 11 07:12:43 2008
From: julioody at gmail.com (Julio Cesar Ody)
Date: Wed, 11 Jun 2008 21:12:43 +1000
Subject: [Ferret-talk] Similar words
In-Reply-To:
References:
Message-ID:

> Ideally I'd like to get the number of matches for those similar words,
> but I know I could just do a search for each of those to get it.

And I think it might have to boil down to that really. If you want to
get the *number of results* for 'transferase' when someone searches
for 'transferaze', then it means you'll need to hit the index once
more with 'transferase' separately.

From toastkid.williams at gmail.com Wed Jun 11 07:40:32 2008
From: toastkid.williams at gmail.com (Max Williams)
Date: Wed, 11 Jun 2008 12:40:32 +0100
Subject: [Ferret-talk] Similar words
In-Reply-To:
References:
Message-ID:

Is there a way to get a list of results so that you also get, for each
result, the phrase that was successfully matched?
If so, you could shove all the results into a hash, with the keys being
the matched phrase (and the value for each being an array of results),
and return the asked-for key's array along with the key and value size
for the largest other array.

Just thinking aloud... I'd like to be able to do this as well; I don't
know if you can get back that info, though.

2008/6/11 Julio Cesar Ody:
> > Ideally I'd like to get the number of matches for those similar words,
> > but I know I could just do a search for each of those to get it.
>
> And I think it might have to boil down to that really. If you want to
> get the *number of results* for 'transferase' when someone searches
> for 'transferaze', then it means you'll need to hit the index once
> more with 'transferase' separately.
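The grouping idea in the message above can be sketched in plain Ruby. This
assumes each hit already carries the term it matched, which, as the message
itself notes, may not be something Ferret reports; the hit structure, the
:matched key and the summarize helper are all hypothetical.

```ruby
# Hypothetical hits: each one records which indexed term it matched.
hits = [
  { id: 1, matched: "transferase" },
  { id: 2, matched: "transferase" },
  { id: 3, matched: "transferaze" },
  { id: 4, matched: "transposase" }
]

# Groups hits by matched term, returns the hits for the exact query plus
# the most frequent other term and how many hits it has.
def summarize(hits, query)
  by_term = hits.group_by { |hit| hit[:matched] }
  exact = by_term.fetch(query, [])
  others = by_term.reject { |term, _| term == query }
  best_term, best_hits = others.max_by { |_, list| list.size }
  { exact: exact,
    suggestion: best_term,
    suggestion_count: best_hits ? best_hits.size : 0 }
end

summary = summarize(hits, "transferaze")
p summary
```

When no other term matched, suggestion is nil and suggestion_count is 0, so
the caller can skip the "did you mean" line.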
From hulme at ebi.ac.uk Wed Jun 11 09:04:43 2008
From: hulme at ebi.ac.uk (Robert Hulme)
Date: Wed, 11 Jun 2008 14:04:43 +0100
Subject: [Ferret-talk] Similar words
In-Reply-To:
References:
Message-ID:

> And I think it might have to boil down to that really. If you want to
> get the *number of results* for 'transferase' when someone searches
> for 'transferaze', then it means you'll need to hit the index once
> more with 'transferase' separately.

I'm happy to do that if that's the only way, but that's really a
secondary issue.

Imagine I have in my index the following terms:
abcd
abce
abcf

and I search for abca

I'd get 0 matches.

What I'd like is to be able to present to the user:

No matches found for 'abca'. Did you mean 'abcd', 'abce', or 'abcf'?

So I need Ferret to have a method call that would return ['abcd', 'abce',
'abcf'].

Is this possible?

-Rob

From julioody at gmail.com Wed Jun 11 19:26:28 2008
From: julioody at gmail.com (Julio Cesar Ody)
Date: Thu, 12 Jun 2008 09:26:28 +1000
Subject: [Ferret-talk] Similar words
In-Reply-To:
References:
Message-ID:

So the problem you have is where to pull recommendations from. For my
own needs, I use a spell checker to do the "did you mean", which means
my data source is external, and thus I never hit the index twice.

As you seem to want to correlate the user's input with *existing*
entries in your index, I still think you'll need to hit the index
twice: once using the analyzers you'd normally use, and once with a
fuzzy query.

To help scale things, you could have 2 indexes. But that's another
story.

On Wed, Jun 11, 2008 at 11:04 PM, Robert Hulme wrote:
>> And I think it might have to boil down to that really. If you want to
>> get the *number of results* for 'transferase' when someone searches
>> for 'transferaze', then it means you'll need to hit the index once
>> more with 'transferase' separately.
>
> I'm happy to do that if that's the only way, but that's really a
> secondary issue.
>
> Imagine I have in my index the following terms:
> abcd
> abce
> abcf
>
> and I search for abca
>
> I'd get 0 matches.
>
> What I'd like is to be able to present to the user:
>
> No matches found for 'abca'. Did you mean 'abcd', 'abce', or 'abcf'?
>
> So I need Ferret to have a method call that would return ['abcd', 'abce',
> 'abcf'].
>
> Is this possible?
>
> -Rob

From hulme at ebi.ac.uk Thu Jun 12 05:13:22 2008
From: hulme at ebi.ac.uk (Robert Hulme)
Date: Thu, 12 Jun 2008 10:13:22 +0100
Subject: [Ferret-talk] Similar words
In-Reply-To:
References:
Message-ID:

> So the problem you have is where to pull recommendations from. For my
> own needs, I use a spell checker to do the "did you mean", which means
> my data source is external, and thus I never hit the index twice.

No :-) I definitely need to pull the recommendations from the Ferret
index (or reimplement this bit of Ferret in Ruby).

I can't use a spell checker with an ordinary dictionary because the
terms that are stored in my index (which is an index of Protein Databank
File headers, among other things) are often not ordinary words.

I *could* build my own dictionary of all the words that are indexed,
then loop through those and compute the Levenshtein distance for each -
but that's obviously what query~ does (it must query a Ferret dictionary
to find the matches with a Levenshtein distance less than foo, then
create a query that does word1 or word2 or word3...), so it seems
extraordinarily silly (not to mention slow) to reimplement (in Ruby)
something that is already in Ferret.

My question really is whether access to this information is exposed
through the Ferret API. I think a Ferret developer is needed to answer
this question.
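For reference, the brute-force fallback described just above (keep a list
of every indexed word and compute the Levenshtein distance to each) can be
sketched in plain Ruby. This is only an illustration of that fallback, not
how Ferret's fuzzy query works internally; the helper names and the
max_distance default are assumptions.

```ruby
# Classic dynamic-programming Levenshtein distance, keeping one row at a time.
def levenshtein(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    curr = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      curr << [prev[j] + 1,          # deletion
               curr[j - 1] + 1,      # insertion
               prev[j - 1] + cost].min # substitution or match
    end
    prev = curr
  end
  prev.last
end

# terms: every distinct indexed word, however it was collected (hypothetical).
def suggestions(query, terms, max_distance = 1)
  scored = terms.map { |term| [levenshtein(query, term), term] }
  scored.select { |distance, _| distance <= max_distance }
        .sort
        .map { |_, term| term }
end

terms = %w[abcd abce abcf transferase kinase]
p suggestions("abca", terms)
p suggestions("transferaze", terms)
```

Each lookup costs one distance computation per indexed word, which is the
slowness the message complains about.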
I'm very surprised that I'm the first person (AFAICT from searching the
mailing list archive) to ask this question.

-Rob

From toastkid.williams at gmail.com Fri Jun 13 05:31:04 2008
From: toastkid.williams at gmail.com (Max Williams)
Date: Fri, 13 Jun 2008 11:31:04 +0200
Subject: [Ferret-talk] Multi select with conditions
In-Reply-To: <399086b973c597455685279b814eab6c@ruby-forum.com>
References: <399086b973c597455685279b814eab6c@ruby-forum.com>
Message-ID:

Thanks all

On the way to solving this I found a bug in acts_as_ferret: it seems
that if I try to sort, paginate and use conditions at the same time,
then the sorting breaks down. Instead of being sorted and then
paginated, the results are paginated (ordered simply by id) and then
sorted within each page. I let Jens Kraemer (and the acts_as_ferret
mailing list) know about it, but as far as I know it's not been fixed.

I ended up doing a Ferret search to get the ids of the results (with
disallowed records filtered out), and then doing an AR find to get those
results, sort and paginate them. So it's a little inelegant, but it
works, at least.

thanks
max
--
Posted via http://www.ruby-forum.com/.

From toastkid.williams at gmail.com Fri Jun 13 08:43:47 2008
From: toastkid.williams at gmail.com (Max Williams)
Date: Fri, 13 Jun 2008 14:43:47 +0200
Subject: [Ferret-talk] strip out non-alphanumeric characters before saving to index
Message-ID: <639518d04b5a80911206f436f5b8ff2a@ruby-forum.com>

Does anyone know a simple way, with Ferret or a_a_f, to strip out
everything that's not a letter, number or space before saving to the
index? I know that I could do a custom method for every indexed field
that regexes them out, but I thought that there might be a universal
option for it...

thanks
max
--
Posted via http://www.ruby-forum.com/.
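The per-field fallback mentioned in the question above (a custom method
that strips the unwanted characters with a regexp) might look roughly like
the following. The helper name strip_for_index is made up for this sketch,
and the character class is ASCII-only, so accented letters would be
dropped as well.

```ruby
# Hypothetical pre-indexing helper: keeps ASCII letters, digits and
# spaces, then collapses the runs of spaces left behind.
def strip_for_index(text)
  text.to_s.gsub(/[^A-Za-z0-9 ]/, "").squeeze(" ").strip
end

p strip_for_index("Hello, world! (v2.0)")
p strip_for_index("a - b")
```

Calling one helper from every indexed field keeps the rule in a single
place, which is as close to a universal option as this approach gets.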
From toastkid.williams at gmail.com Mon Jun 16 07:08:00 2008
From: toastkid.williams at gmail.com (Max Williams)
Date: Mon, 16 Jun 2008 13:08:00 +0200
Subject: [Ferret-talk] Search Ferret Index for Use With Autocomplete / Options
In-Reply-To: <70e9300c5b11dd22c4af74b8ae7df53b@ruby-forum.com>
References: <70e9300c5b11dd22c4af74b8ae7df53b@ruby-forum.com>
Message-ID:

Gaudi Mi wrote:
> I've got Ferret up and running for a Rails application and I'd like to
> be able to use autocomplete in my text search field so that as the user
> is typing each character of the search term, the index is queried for
> matching terms starting with those characters, and they are displayed in
> a list under the search box, like Google Suggest.
>
> I have searched the Ferret API but I can't find a way to do for example:
>
> Show all words in index that start with 's', then 'sp', then 'spa', etc.
>
> Thanks for any assistance.

Is that what Google does, though? I thought the autocomplete was being
filled with previous matching searches. If that is the case then you
might have to save all search queries to a db.

Have you seen the jQuery autocomplete plugin? It's pretty good:
http://www.pengoworks.com/workshop/jquery/autocomplete.htm
--
Posted via http://www.ruby-forum.com/.

From hulme at ebi.ac.uk Mon Jun 16 07:12:37 2008
From: hulme at ebi.ac.uk (Robert Hulme)
Date: Mon, 16 Jun 2008 12:12:37 +0100
Subject: [Ferret-talk] Search Ferret Index for Use With Autocomplete / Options
In-Reply-To:
References: <70e9300c5b11dd22c4af74b8ae7df53b@ruby-forum.com>
Message-ID: <3936C452-A2FE-400B-AA92-A77EADD9FF15@ebi.ac.uk>

> Is that what Google does, though?

Yes it is.

This problem is related to the problem I asked about the other day about
Levenshtein distance. Is this stuff exposed in Ferret?
-Rob

From kraemer at webit.de Mon Jun 16 07:50:37 2008
From: kraemer at webit.de (Jens Kraemer)
Date: Mon, 16 Jun 2008 13:50:37 +0200
Subject: [Ferret-talk] strip out non-alphanumeric characters before saving to index
In-Reply-To: <639518d04b5a80911206f436f5b8ff2a@ruby-forum.com>
References: <639518d04b5a80911206f436f5b8ff2a@ruby-forum.com>
Message-ID: <20080616115037.GD8395@cordoba.webit.de>

Hi!

That's a typical job for an analyzer. I think Ferret's StandardAnalyzer,
which is used by default, does exactly that. If not, try RegexpAnalyzer.

Cheers,
Jens

On Fri, Jun 13, 2008 at 02:43:47PM +0200, Max Williams wrote:
> Does anyone know a simple way, with ferret or a_a_f, to strip out
> everything that's not a letter, number or space before saving to the
> index? I know that i could do a custom method for every indexed field
> that regexes them out but i thought that there might be a universal
> option for it...
>
> thanks
> max

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold

From toastkid.williams at gmail.com Mon Jun 16 07:58:12 2008
From: toastkid.williams at gmail.com (Max Williams)
Date: Mon, 16 Jun 2008 12:58:12 +0100
Subject: [Ferret-talk] strip out non-alphanumeric characters before saving to index
In-Reply-To: <20080616115037.GD8395@cordoba.webit.de>
References: <639518d04b5a80911206f436f5b8ff2a@ruby-forum.com> <20080616115037.GD8395@cordoba.webit.de>
Message-ID:

great, i'll check those out. thanks!
max

2008/6/16 Jens Kraemer:
> Hi!
>
> That's a typical job for an analyzer. I think Ferret's StandardAnalyzer,
> which is used by default, does exactly that. If not, try RegexpAnalyzer.
>
> Cheers,
> Jens
>
> On Fri, Jun 13, 2008 at 02:43:47PM +0200, Max Williams wrote:
> > Does anyone know a simple way, with ferret or a_a_f, to strip out
> > everything that's not a letter, number or space before saving to the
> > index? I know that i could do a custom method for every indexed field
> > that regexes them out but i thought that there might be a universal
> > option for it...
> >
> > thanks
> > max

From jk at jkraemer.net Mon Jun 16 08:12:28 2008
From: jk at jkraemer.net (Jens Kraemer)
Date: Mon, 16 Jun 2008 14:12:28 +0200
Subject: [Ferret-talk] Search Ferret Index for Use With Autocomplete / Options
In-Reply-To:
References: <70e9300c5b11dd22c4af74b8ae7df53b@ruby-forum.com>
Message-ID:

On 16.06.2008, at 13:08, Max Williams wrote:
> Gaudi Mi wrote:
>> I've got Ferret up and running for a Rails application and I'd like to
>> be able to use autocomplete in my text search field so that as the user
>> is typing each character of the search term, the index is queried for
>> matching terms starting with those characters, and they are displayed in
>> a list under the search box, like Google Suggest.
>>
>> I have searched the Ferret API but I can't find a way to do for example:
>>
>> Show all words in index that start with 's', then 'sp', then 'spa', etc.

This might be accomplished by using a TermEnum
(http://ferret.davebalmain.com/api/classes/Ferret/Index/TermEnum.html),
which basically is a list of all terms present in the index in a given
field. Using term_enum.skip_to('s') should bring back the first term
starting with the letter 's'; then get all other terms starting with
's' by calling term_enum.next as often as necessary. Never tried this,
but it should work.

However, the 'common' way for autocomplete is indeed to base the
completion on past searches, i.e. index users' successful queries and
suggest matching past queries while the user is typing.

If that's really not what you want, you could also build up a second
index containing all the terms that occur in your data, each as a
document on its own (like your own dictionary), and get suggestions
from there, with fuzzy queries if you like. This can also be used for
'did you mean' stuff in case the user has a typo in his query and got
no results because of that.

Cheers,
Jens

--
Jens Krämer
Finkenlust 14, 06449 Aschersleben, Germany
VAT Id DE251962952
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database

From marvin at rectangular.com Wed Jun 18 00:15:34 2008
From: marvin at rectangular.com (Marvin Humphrey)
Date: Tue, 17 Jun 2008 21:15:34 -0700
Subject: [Ferret-talk] Search Ferret Index for Use With Autocomplete / Options
In-Reply-To:
References: <70e9300c5b11dd22c4af74b8ae7df53b@ruby-forum.com>
Message-ID: <83343F0A-21BB-4C65-B8F1-8B8343F831D4@rectangular.com>

On Jun 16, 2008, at 5:12 AM, Jens Kraemer wrote:

> This might be accomplished by using a TermEnum

That's how I would do it. However, you have to be careful with
Analyzers: if the text is stemmed, the suggestions will be stemmed. The
solution would be to have an unstemmed field dedicated to this purpose.
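The skip-then-iterate idea discussed above can be illustrated without
Ferret. In this sketch a sorted Ruby Array stands in for the term
enumeration of one (unstemmed) field, and a binary search plays the role
of term_enum.skip_to; the helper name terms_with_prefix and the limit
parameter are assumptions, not part of any Ferret API.

```ruby
# sorted_terms: all terms of one field in sorted order, which is the
# order a term enumeration would walk them in.
def terms_with_prefix(sorted_terms, prefix, limit = 10)
  # Jump to the first term that is >= prefix (the "skip_to" step).
  start = sorted_terms.bsearch_index { |term| term >= prefix }
  return [] if start.nil?

  # Walk forward (the "next" step) until a term no longer has the prefix.
  sorted_terms[start..-1]
    .take_while { |term| term.start_with?(prefix) }
    .first(limit)
end

terms = %w[moon mooning space spam spare spark sparrow star].sort

p terms_with_prefix(terms, "spa")
p terms_with_prefix(terms, "spar")
p terms_with_prefix(terms, "x")
```

Because the terms are sorted, every match sits in one contiguous run, so
the walk can stop at the first term that lacks the prefix.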
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/

From mattias at oncotype.dk Tue Jun 24 05:33:59 2008
From: mattias at oncotype.dk (Mattias Bodlund)
Date: Tue, 24 Jun 2008 11:33:59 +0200
Subject: [Ferret-talk] Find fields beginning with?
In-Reply-To:
References: <70e9300c5b11dd22c4af74b8ae7df53b@ruby-forum.com>
Message-ID:

Is it possible to search for fields that start with a word or phrase?

Let's say I have the following fields:

1 mooning
2 moon landing
3 landing on the moon

Then the result I would like to get is:

1 and 2 if I search for moon
only 2 if I search for moon landing or moon land
only 3 if I search for landing

Is it possible with Ferret, or would a simple SQL query do this better?

Cheers
Mattias

From ian.connor at gmail.com Fri Jun 27 05:59:45 2008
From: ian.connor at gmail.com (Ian Connor)
Date: Fri, 27 Jun 2008 05:59:45 -0400
Subject: [Ferret-talk] Scores in ferret
Message-ID:

Hi,

Are scores an absolute calculation or relative to what is in a given
index? I ask because I wanted to look into distributing my index over
a few servers. The idea being that I could get 10 results from each of
a couple of servers, do an in-memory merge and return the results
faster than would be possible with just the one index server.

Would this work? Has anyone tried this type of ghetto map-reduce-like
deployment with Ferret?

--
Regards,

Ian Connor

From kraemer at webit.de Fri Jun 27 08:16:26 2008
From: kraemer at webit.de (Jens Kraemer)
Date: Fri, 27 Jun 2008 14:16:26 +0200
Subject: [Ferret-talk] Scores in ferret
In-Reply-To:
References:
Message-ID: <20080627121626.GD6614@cordoba.webit.de>

Hi,

the scores are relative to the contents of the index, so this won't be
*that* easy.

However it is possible to have a distributed index in terms of multiple
physical indexes on the same machine (this is done by having one
IndexReader instance using several underlying IndexReader instances),
with consistent scores.
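A small arithmetic sketch of why scores computed by separate indexes are
not directly comparable. It assumes a Lucene-style inverse document
frequency, idf = 1 + ln(num_docs / (doc_freq + 1)); whether Ferret uses
exactly this formula is an assumption here, and the shard numbers are
invented for illustration.

```ruby
# Lucene-style idf (assumed, see above): rarer terms get a larger weight.
def idf(doc_freq, num_docs)
  Math.log(num_docs.to_f / (doc_freq + 1)) + 1.0
end

# The same query term on two hypothetical shards of equal size.
shard_a = { num_docs: 1_000, doc_freq: 10 }   # term is rare here
shard_b = { num_docs: 1_000, doc_freq: 500 }  # term is common here

idf_a = idf(shard_a[:doc_freq], shard_a[:num_docs])
idf_b = idf(shard_b[:doc_freq], shard_b[:num_docs])

# A single reader over both shards would see the combined statistics.
idf_combined = idf(shard_a[:doc_freq] + shard_b[:doc_freq],
                   shard_a[:num_docs] + shard_b[:num_docs])

puts format("shard A: %.2f  shard B: %.2f  combined: %.2f",
            idf_a, idf_b, idf_combined)
```

An otherwise identical document would be weighted more heavily on shard A
than on shard B, so merging each server's top 10 by raw score can misorder
results unless the term statistics are shared.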
What's missing is the possibility to access remote indexes this way
(Lucene has this feature, AFAIR).

Cheers,
Jens

On Fri, Jun 27, 2008 at 05:59:45AM -0400, Ian Connor wrote:
> Hi,
>
> Are scores an absolute calculation or relative to what is in a given
> index? I ask because I wanted to look into distributing my index over
> a few servers. The idea being that I could get 10 results for a couple
> of servers, do an in memory merge and return the results faster than
> it would be possible with just the one index server.
>
> Would this work? Has anyone tried this type of ghetto map-reduce like
> deployment with ferret?
>
> --
> Regards,
>
> Ian Connor

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold

From mattias at oncotype.dk Fri Jun 27 08:23:39 2008
From: mattias at oncotype.dk (Mattias Bodlund)
Date: Fri, 27 Jun 2008 14:23:39 +0200
Subject: [Ferret-talk] Find fields beginning with?
In-Reply-To:
References: <70e9300c5b11dd22c4af74b8ae7df53b@ruby-forum.com>
Message-ID: <9DC46304-D34A-4CC7-AE41-F4A1C8FDF484@oncotype.dk>

Doesn't anyone have some thoughts on this?

On 24/06/2008, at 11.33, Mattias Bodlund wrote:

> Is it possible to search for fields that start with a word or phrase?
>
> Lets say I have the following fields:
>
> 1 mooning
> 2 moon landing
> 3 landing on the moon
>
> Then I would like to be able to only get
>
> The result I would like to get is:
>
> 1 and 2 if I search for moon
> only 2 if I search for moon landing or moon land
> only 3 if I search for landing
>
> Is it possible with ferret or would a simple SQL query do this better?
>
> Cheers
> Mattias

From kraemer at webit.de Fri Jun 27 09:47:23 2008
From: kraemer at webit.de (Jens Kraemer)
Date: Fri, 27 Jun 2008 15:47:23 +0200
Subject: [Ferret-talk] Find fields beginning with?
In-Reply-To: <9DC46304-D34A-4CC7-AE41-F4A1C8FDF484@oncotype.dk>
References: <70e9300c5b11dd22c4af74b8ae7df53b@ruby-forum.com> <9DC46304-D34A-4CC7-AE41-F4A1C8FDF484@oncotype.dk>
Message-ID: <20080627134723.GF6614@cordoba.webit.de>

On Fri, Jun 27, 2008 at 02:23:39PM +0200, Mattias Bodlund wrote:
> Doesn't anyone have some thoughts on this?

did you have a look at
http://ferret.davebalmain.com/api/classes/Ferret/Search/Spans/SpanNearQuery.html ?
Not sure but it might solve some if not all of your issues.

cheers,
Jens

> On 24/06/2008, at 11.33, Mattias Bodlund wrote:
>
>> Is it possible to search for fields that start with a word or phrase?
>>
>> Lets say I have the following fields:
>>
>> 1 mooning
>> 2 moon landing
>> 3 landing on the moon
>>
>> Then I would like to be able to only get
>>
>> The result I would like to get is:
>>
>> 1 and 2 if I search for moon
>> only 2 if I search for moon landing or moon land
>> only 3 if I search for landing
>>

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold

From mattias at oncotype.dk Fri Jun 27 09:52:01 2008
From: mattias at oncotype.dk (Mattias Bodlund)
Date: Fri, 27 Jun 2008 15:52:01 +0200
Subject: [Ferret-talk] Find fields beginning with?
In-Reply-To: <20080627134723.GF6614@cordoba.webit.de>
References: <70e9300c5b11dd22c4af74b8ae7df53b@ruby-forum.com>
  <9DC46304-D34A-4CC7-AE41-F4A1C8FDF484@oncotype.dk>
  <20080627134723.GF6614@cordoba.webit.de>
Message-ID: <0BFA7B0E-63B8-4EA9-905F-B0798BDA6768@oncotype.dk>

Yes, I looked at that, but the fields I have are often short. The
constraint I'm looking for is that the field has to start with the query.

Like SELECT * FROM table WHERE title LIKE "term%"

mattias

On 27/06/2008, at 15.47, Jens Kraemer wrote:

> Did you have a look at
> http://ferret.davebalmain.com/api/classes/Ferret/Search/Spans/SpanNearQuery.html
> ? Not sure, but it might solve some if not all of your issues.
>
> Cheers,
> Jens

From ian.connor at gmail.com Fri Jun 27 10:09:42 2008
From: ian.connor at gmail.com (Ian Connor)
Date: Fri, 27 Jun 2008 10:09:42 -0400
Subject: [Ferret-talk] Scores in ferret
In-Reply-To: <20080627121626.GD6614@cordoba.webit.de>
References: <20080627121626.GD6614@cordoba.webit.de>
Message-ID:

I am hoping to index many GB of data. It is cheaper hardware-wise to
have a few machines with 8GB of RAM instead of one large machine.

Has anyone had success with large data sets? In my case, it is the full
MEDLINE data (pubmed.gov).

My initial performance test is to index 100k articles, and it seems 10x
faster when RAM is used compared with disks. I am still trying to figure
out the bottlenecks in terms of CPU/IO/etc. Once the index is built, I
am impressed with the read speeds.

On Fri, Jun 27, 2008 at 8:16 AM, Jens Kraemer wrote:
> Hi,
>
> the scores are relative to the contents of the index, so this won't be
> *that* easy.
>
> However it is possible to have a distributed index in terms of multiple
> physical indexes on the same machine (this is done by having one IndexReader
> instance using several underlying IndexReader instances), with
> consistent scores.
>
> What's missing is the possibility to access remote indexes this way
> (Lucene has this feature, AFAIR).
>
> Cheers,
> Jens

--
Regards,

Ian Connor
82 Fellsway W #2
Somerville, MA 02145
Direct Line: +1 (978) 6333372
Call Center Phone: +1 (714) 239 3875 (24 hrs)
Mobile Phone: +1 (312) 218 3209
Fax: +1(770) 818 5697
Suisse Phone: +41 (0) 22 548 1664
Skype: ian.connor

From henke at mac.se Fri Jun 27 17:48:43 2008
From: henke at mac.se (Henrik)
Date: Fri, 27 Jun 2008 23:48:43 +0200
Subject: [Ferret-talk] Find fields beginning with?
In-Reply-To: <0BFA7B0E-63B8-4EA9-905F-B0798BDA6768@oncotype.dk>
References: <70e9300c5b11dd22c4af74b8ae7df53b@ruby-forum.com>
  <9DC46304-D34A-4CC7-AE41-F4A1C8FDF484@oncotype.dk>
  <20080627134723.GF6614@cordoba.webit.de>
  <0BFA7B0E-63B8-4EA9-905F-B0798BDA6768@oncotype.dk>
Message-ID: <9C154B9C-C4BE-4E86-A9E1-788BCF713142@mac.se>

On 27 Jun 2008, at 15.52, Mattias Bodlund wrote:

> Yes, I looked at that, but the fields I have are often short. The
> constraint I'm looking for is that the field has to start with the query.
>
> Like SELECT * FROM table WHERE title LIKE "term%"

You want to use the WildcardQuery alternative. That way you can use term*.

Cheers,
Henke

From mattias at oncotype.dk Sat Jun 28 04:20:46 2008
From: mattias at oncotype.dk (Mattias Bodlund)
Date: Sat, 28 Jun 2008 10:20:46 +0200
Subject: [Ferret-talk] Find fields beginning with?
In-Reply-To: <9C154B9C-C4BE-4E86-A9E1-788BCF713142@mac.se>
References: <70e9300c5b11dd22c4af74b8ae7df53b@ruby-forum.com>
  <9DC46304-D34A-4CC7-AE41-F4A1C8FDF484@oncotype.dk>
  <20080627134723.GF6614@cordoba.webit.de>
  <0BFA7B0E-63B8-4EA9-905F-B0798BDA6768@oncotype.dk>
  <9C154B9C-C4BE-4E86-A9E1-788BCF713142@mac.se>
Message-ID:

But that doesn't restrict the search to the start of the field.
dog* will match both "Wild dog" and "Dog bone".

mattias

On 27/06/2008, at 23.48, Henrik wrote:

> You want to use the WildcardQuery alternative. That way you can use term*.
>
> Cheers,
> Henke

From henke at mac.se Sun Jun 29 08:44:40 2008
From: henke at mac.se (Henrik)
Date: Sun, 29 Jun 2008 14:44:40 +0200
Subject: [Ferret-talk] Find fields beginning with?
In-Reply-To:
References: <70e9300c5b11dd22c4af74b8ae7df53b@ruby-forum.com>
  <9DC46304-D34A-4CC7-AE41-F4A1C8FDF484@oncotype.dk>
  <20080627134723.GF6614@cordoba.webit.de>
  <0BFA7B0E-63B8-4EA9-905F-B0798BDA6768@oncotype.dk>
  <9C154B9C-C4BE-4E86-A9E1-788BCF713142@mac.se>
Message-ID: <0B69F0EE-9DA2-4DE2-930F-D41C5C63BA4F@mac.se>

Ahh, true. Interesting situation. I need to research that a bit :)

//Henke

On 28 Jun 2008, at 10.20, Mattias Bodlund wrote:

> But that doesn't restrict the search to the start of the field.
> dog* will match both "Wild dog" and "Dog bone".
>
> mattias

From henke at mac.se Sun Jun 29 08:52:48 2008
From: henke at mac.se (Henrik)
Date: Sun, 29 Jun 2008 14:52:48 +0200
Subject: [Ferret-talk] Find fields beginning with?
In-Reply-To: <0B69F0EE-9DA2-4DE2-930F-D41C5C63BA4F@mac.se>
References: <70e9300c5b11dd22c4af74b8ae7df53b@ruby-forum.com>
  <9DC46304-D34A-4CC7-AE41-F4A1C8FDF484@oncotype.dk>
  <20080627134723.GF6614@cordoba.webit.de>
  <0BFA7B0E-63B8-4EA9-905F-B0798BDA6768@oncotype.dk>
  <9C154B9C-C4BE-4E86-A9E1-788BCF713142@mac.se>
  <0B69F0EE-9DA2-4DE2-930F-D41C5C63BA4F@mac.se>
Message-ID: <30DC61DC-03AB-46E3-AA46-DE343A6E36BF@mac.se>

Found it!

Ferret::Search::PrefixQuery

On 29 Jun 2008, at 14.44, Henrik wrote:

> Ahh, true. Interesting situation. I need to research that a bit :)
>
> //Henke

From mattias at oncotype.dk Mon Jun 30 04:52:15 2008
From: mattias at oncotype.dk (Mattias Bodlund)
Date: Mon, 30 Jun 2008 10:52:15 +0200
Subject: [Ferret-talk] Find fields beginning with?
In-Reply-To: <30DC61DC-03AB-46E3-AA46-DE343A6E36BF@mac.se>
References: <70e9300c5b11dd22c4af74b8ae7df53b@ruby-forum.com>
  <9DC46304-D34A-4CC7-AE41-F4A1C8FDF484@oncotype.dk>
  <20080627134723.GF6614@cordoba.webit.de>
  <0BFA7B0E-63B8-4EA9-905F-B0798BDA6768@oncotype.dk>
  <9C154B9C-C4BE-4E86-A9E1-788BCF713142@mac.se>
  <0B69F0EE-9DA2-4DE2-930F-D41C5C63BA4F@mac.se>
  <30DC61DC-03AB-46E3-AA46-DE343A6E36BF@mac.se>
Message-ID:

It's the same. It will match any word in the field that starts with the
query, the same as putting a * after the query.
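
A small pure-Ruby sketch of why this happens. It does not use Ferret; the
helper names tokenize and any_term_prefix_match? are made up for
illustration, and it assumes the field is split on whitespace and
lowercased at index time. Under that assumption a prefix or wildcard test
runs against each indexed term, not against the field as a whole:

```ruby
# Pure-Ruby illustration, not Ferret code: a tokenized field is stored
# as separate terms, so a prefix test is applied to each term in turn.
def tokenize(text)
  text.downcase.split
end

# True if ANY term of the field starts with the prefix (term-level match).
def any_term_prefix_match?(field_value, prefix)
  tokenize(field_value).any? { |term| term.start_with?(prefix.downcase) }
end

titles = ["Wild dog", "Dog bone"]
hits = titles.select { |title| any_term_prefix_match?(title, "dog") }
puts hits.inspect
```

Both titles are selected, because "dog" is a term in each of them
regardless of its position in the field.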

I was looking at a solution where I have a special index that only
contains the first word of the original field and then do the query like

firstword:dog* and theholefield:dog*

This should only match "dog bone" and not "wild dog".

It just feels a bit strange to have two fields here.

mattias

On 29/06/2008, at 14.52, Henrik wrote:

> Found it!
>
> Ferret::Search::PrefixQuery

From henke at mac.se Mon Jun 30 08:25:29 2008
From: henke at mac.se (Henrik)
Date: Mon, 30 Jun 2008 14:25:29 +0200
Subject: [Ferret-talk] Find fields beginning with?
In-Reply-To:
References: <70e9300c5b11dd22c4af74b8ae7df53b@ruby-forum.com>
  <9DC46304-D34A-4CC7-AE41-F4A1C8FDF484@oncotype.dk>
  <20080627134723.GF6614@cordoba.webit.de>
  <0BFA7B0E-63B8-4EA9-905F-B0798BDA6768@oncotype.dk>
  <9C154B9C-C4BE-4E86-A9E1-788BCF713142@mac.se>
  <0B69F0EE-9DA2-4DE2-930F-D41C5C63BA4F@mac.se>
  <30DC61DC-03AB-46E3-AA46-DE343A6E36BF@mac.se>
Message-ID: <833D4BC2-EABB-4D09-B8B1-019AB4C698D4@mac.se>

Ahh, so what you need is a whitespace stemmer?
Or am I still missing something? :)

//Henke

On 30 Jun 2008, at 10.52, Mattias Bodlund wrote:

> It's the same. It will match any word in the field that starts with the
> query, the same as putting a * after the query.
>
> I was looking at a solution where I have a special index that only
> contains the first word of the original field and then do the query
> like
>
> firstword:dog* and theholefield:dog*
>
> This should only match "dog bone" and not "wild dog".
>
> It just feels a bit strange to have two fields here.
>
> mattias

From mattias at oncotype.dk Mon Jun 30 08:32:27 2008
From: mattias at oncotype.dk (Mattias Bodlund)
Date: Mon, 30 Jun 2008 14:32:27 +0200
Subject: [Ferret-talk] Find fields beginning with?
In-Reply-To: <833D4BC2-EABB-4D09-B8B1-019AB4C698D4@mac.se>
References: <70e9300c5b11dd22c4af74b8ae7df53b@ruby-forum.com>
  <9DC46304-D34A-4CC7-AE41-F4A1C8FDF484@oncotype.dk>
  <20080627134723.GF6614@cordoba.webit.de>
  <0BFA7B0E-63B8-4EA9-905F-B0798BDA6768@oncotype.dk>
  <9C154B9C-C4BE-4E86-A9E1-788BCF713142@mac.se>
  <0B69F0EE-9DA2-4DE2-930F-D41C5C63BA4F@mac.se>
  <30DC61DC-03AB-46E3-AA46-DE343A6E36BF@mac.se>
  <833D4BC2-EABB-4D09-B8B1-019AB4C698D4@mac.se>
Message-ID: <42EAD03C-2B3D-4F9B-B36A-1E5D0C13BEA6@oncotype.dk>

I think so. I have tried almost everything, and the common misbehavior I
get is that I keep getting hits where the query isn't at the start of the
field but somewhere in the middle or at the end.

mattias

On 30/06/2008, at 14.25, Henrik wrote:

> Ahh, so what you need is a whitespace stemmer?
> Or am I still missing something? :)
>
> //Henke
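
One possible way to get the LIKE "term%" behaviour asked for in this
thread, sketched in plain Ruby rather than Ferret: keep the whole field
value as a single normalized term and test the prefix against that one
term. This is a different technique from the two-field idea described
earlier in the thread, and the names normalize, field_starts_with?,
matching_ids and FIELDS are hypothetical. Whether and how Ferret can index
a field as one untokenized term is an assumption to check against its
documentation.

```ruby
# Pure-Ruby illustration, not Ferret code. The whole field value is kept
# as ONE normalized term, so a prefix test is anchored at the field start.
def normalize(text)
  text.downcase.strip.gsub(/\s+/, " ")
end

# True only if the field as a whole begins with the query.
def field_starts_with?(field_value, query)
  normalize(field_value).start_with?(normalize(query))
end

# The three example fields from the original question.
FIELDS = { 1 => "mooning", 2 => "moon landing", 3 => "landing on the moon" }

# Ids of all fields whose value starts with the query.
def matching_ids(query)
  FIELDS.select { |_id, value| field_starts_with?(value, query) }.keys
end

puts matching_ids("moon").inspect
puts matching_ids("moon land").inspect
puts matching_ids("landing").inspect
```

With the three example fields this selects 1 and 2 for "moon", only 2 for
"moon land" or "moon landing", and only 3 for "landing", which is the
result set described in the original question.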