On Wed, Apr 18, 2012 at 1:15 PM, Danny Yoo wrote:
>>
>> I think the subfield you're looking for is called "information retrieval",
>> and there are textbooks on it.
>
> Managing Gigabytes, for example:
>
> http://ww2.cs.mu.oz.au/mg/
Another book that just came out that looks good is: Introduc
On Mon, Apr 23, 2012 at 10:18 AM, Pedro wrote:
> Ok, thank you all for the input, however I'm still missing an important
> detail.
> So I build a suffix tree, but how exactly do I refer to the target documents?
> Should I tie a reference to each document in which the string occurs
> to each node
Ok, thank you all for the input, however I'm still missing an important detail.
So I build a suffix tree, but how exactly do I refer to the target documents?
Should I tie a reference to each document in which the string occurs
to each node? I can't think of another way to do it.
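
One way to picture the "reference on each node" idea is the rough sketch below. It is not taken from any reply in this thread, and it uses a naive, uncompressed suffix trie rather than a real suffix tree, so it takes quadratic space per document and is only meant to show where the document references could live. The names Node, SuffixTrie, add_document and search are made up for illustration.

```python
class Node:
    """One trie node: child edges plus the ids of documents passing through it."""
    __slots__ = ("children", "doc_ids")

    def __init__(self):
        self.children = {}    # character -> Node
        self.doc_ids = set()  # documents containing the substring spelled by this path


class SuffixTrie:
    """Naive generalized suffix trie (uncompressed), for illustration only."""

    def __init__(self):
        self.root = Node()

    def add_document(self, doc_id, text):
        # Insert every suffix of the text; tag each node on the way with doc_id.
        for start in range(len(text)):
            node = self.root
            for ch in text[start:]:
                node = node.children.setdefault(ch, Node())
                node.doc_ids.add(doc_id)

    def search(self, query):
        # Walk the query from the root; the node reached holds the matching doc ids.
        # An empty query returns an empty set, since the root is never tagged.
        node = self.root
        for ch in query:
            node = node.children.get(ch)
            if node is None:
                return set()
        return set(node.doc_ids)
```

With this layout, a substring lookup is a single walk from the root, and the set stored on the node reached is the answer. A compressed suffix tree would hang the same kind of set (or a list of leaf positions) on its nodes, at much lower memory cost.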
On Fri, Apr 20, 2
On Wed, Apr 18, 2012 at 08:54:54PM +0200, Pedro wrote:
> So to put it in a simple way, I need to tokenize all my data and
> create an index which I load into memory...?
> Is this how it is usually done? For example, does my browser (firefox)
> keep an index of all the words present in urls and page
So to put it in a simple way, I need to tokenize all my data and
create an index which I load into memory...?
Is this how it is usually done? For example, does my browser (firefox)
keep an index of all the words present in urls and page titles in
memory at any given time?
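
For what it's worth, the "tokenize everything and keep an index in memory" approach can be sketched as a small inverted index, as below. This is only an assumption about what such an index might look like, not a description of how Firefox or any other program does it. The names tokenize, build_index and search, and the choice to split on runs of ASCII letters and digits, are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    # Lowercase, then split into runs of ASCII letters and digits (a simplifying assumption).
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(notes):
    # notes: dict mapping a note id to its text.
    # Returns a mapping from each word to the set of note ids containing it.
    index = defaultdict(set)
    for note_id, text in notes.items():
        for word in tokenize(text):
            index[word].add(note_id)
    return index


def search(index, query):
    # Return the ids of notes that contain every word of the query.
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

For example, with notes = {1: "Buy milk and eggs", 2: "Milk the cow"}, search(build_index(notes), "milk eggs") narrows the result down to note 1. This only matches whole words; matching prefixes or arbitrary substrings needs a different structure.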
On Wed, Apr 18, 2012 at 7
Pedro wrote at 04/18/2012 02:54 PM:
> So to put it in a simple way, I need to tokenize all my data and
> create an index which I load into memory...?
That's a simple way that might do everything you want.
If you do this, and then find you want it to work better, then I suggest
hitting an IR t
On Wed, Apr 18, 2012 at 2:54 PM, Pedro wrote:
> So to put it in a simple way, I need to tokenize all my data and
> create an index which I load into memory...?
> Is this how it is usually done? For example, does my browser (firefox)
> keep an index of all the words present in urls and page titles
>
> I think the subfield you're looking for is called "information retrieval",
> and there are textbooks on it.
Managing Gigabytes, for example:
http://ww2.cs.mu.oz.au/mg/
Racket Users list:
http://lists.racket-lang.org/users
On Wed, Apr 18, 2012 at 10:21 AM, Neil Van Dyke wrote:
> Pedro wrote at 04/17/2012 04:21 PM:
>
>> My first question is: which kind of data structure should I
>> use in order to perform such a quick search? I'm guessing I should
>> split my notes' data into words and store each single word
Pedro wrote at 04/17/2012 04:21 PM:
> My first question is: which kind of data structure should I
> use in order to perform such a quick search? I'm guessing I should
> split my notes' data into words and store each single word in some
> kind of tree. But should I just jam every single word in th