Re: [racket] Need some help for my first real experiment with scheme

2012-04-24 Thread Danny Yoo
On Wed, Apr 18, 2012 at 1:15 PM, Danny Yoo wrote:
>>
>> I think the subfield you're looking for is called "information retrieval",
>> and there are textbooks on it.
>
> Managing Gigabytes, for example:
>
>    http://ww2.cs.mu.oz.au/mg/

Another book that just came out that looks good is: Introduc

Re: [racket] Need some help for my first real experiment with scheme

2012-04-24 Thread Danny Yoo
On Mon, Apr 23, 2012 at 10:18 AM, Pedro wrote:
> Ok, thank you all for the input, however I'm still missing an important
> detail.
> So I build a suffix tree, but how exactly do I refer to the target documents?
> Should I tie a reference to each document in which the string occurs
> to each node

Re: [racket] Need some help for my first real experiment with scheme

2012-04-24 Thread Pedro
Ok, thank you all for the input, however I'm still missing an important detail. So I build a suffix tree, but how exactly do I refer to the target documents? Should I tie a reference to each document in which the string occurs to each node? I can't think of another way to do it.

On Fri, Apr 20, 2
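
One way to picture the idea raised in this message (attaching document references to each node) is the sketch below. It is only an illustration under stated assumptions: it uses a naive, uncompressed suffix trie rather than a true suffix tree, it is written in Python rather than Racket, and the names `Node`, `add_document`, and `find` as well as the sample documents are hypothetical, not taken from the thread.

```python
class Node:
    """One trie node. `doc_ids` holds every document in which the
    substring spelled by the path from the root to this node occurs."""

    def __init__(self):
        self.children = {}    # character -> child Node
        self.doc_ids = set()  # ids of documents containing this substring


def add_document(root, doc_id, text):
    """Insert every suffix of `text`, tagging each visited node with `doc_id`.

    This is the naive construction: quadratic in the length of `text`.
    """
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.children.setdefault(ch, Node())
            node.doc_ids.add(doc_id)


def find(root, pattern):
    """Return the ids of documents that contain `pattern` as a substring."""
    node = root
    for ch in pattern:
        node = node.children.get(ch)
        if node is None:
            return set()
    return set(node.doc_ids)


# Hypothetical sample documents, keyed by made-up ids.
root = Node()
add_document(root, "a", "banana")
add_document(root, "b", "bandana")
```

With these two sample documents, `find(root, "ana")` yields both ids, while `find(root, "dan")` yields only `"b"`. Storing a set on every node is simple but memory-hungry; a real suffix tree would compress the paths and could keep the references only where they are needed.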

Re: [racket] Need some help for my first real experiment with scheme

2012-04-20 Thread Hendrik Boom
On Wed, Apr 18, 2012 at 08:54:54PM +0200, Pedro wrote:
> So to put it in a simple way, I need to tokenize all my data and
> create an index which I load into memory...?
> Is this how it is usually done? For example, does my browser (firefox)
> keep an index of all the words present in urls and page

Re: [racket] Need some help for my first real experiment with scheme

2012-04-19 Thread Pedro
So to put it in a simple way, I need to tokenize all my data and create an index which I load into memory...? Is this how it is usually done? For example, does my browser (firefox) keep an index of all the words present in urls and page titles in memory at any given time?

On Wed, Apr 18, 2012 at 7

Re: [racket] Need some help for my first real experiment with scheme

2012-04-18 Thread Neil Van Dyke
Pedro wrote at 04/18/2012 02:54 PM:
> So to put it in a simple way, I need to tokenize all my data and create an index which I load into memory...?

That's a simple way that might do everything you want. If you do this, and then find you want it to work better, then I suggest hitting an IR t
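
The "tokenize everything and keep an index in memory" approach discussed here can be sketched roughly as follows. This is a minimal sketch of an inverted index, assuming notes are short strings keyed by an id; it is in Python rather than Racket, and `tokenize`, `build_index`, `search`, and the sample notes are hypothetical names and data, not anything from the thread.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(notes):
    """Map each word to the set of note ids whose text contains it.

    `notes` is assumed to be a dict of note id -> note text.
    """
    index = defaultdict(set)
    for note_id, text in notes.items():
        for word in tokenize(text):
            index[word].add(note_id)
    return index


def search(index, query):
    """Return the ids of notes that contain every word of `query`."""
    words = tokenize(query)
    if not words:
        return set()
    # Start from the first word's notes, then intersect with the rest.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data.
notes = {
    1: "Buy milk and eggs",
    2: "Read the Racket guide",
    3: "Racket experiment: index my notes",
}
index = build_index(notes)
```

Here `search(index, "racket")` returns notes 2 and 3, and `search(index, "racket notes")` narrows that to note 3. The sketch matches whole words only; prefix or substring matching would need a different structure, such as the tries and suffix trees mentioned elsewhere in the thread.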

Re: [racket] Need some help for my first real experiment with scheme

2012-04-18 Thread Danny Yoo
On Wed, Apr 18, 2012 at 2:54 PM, Pedro wrote:
> So to put it in a simple way, I need to tokenize all my data and
> create an index which I load into memory...?
> Is this how it is usually done? For example, does my browser (firefox)
> keep an index of all the words present in urls and page titles

Re: [racket] Need some help for my first real experiment with scheme

2012-04-18 Thread Danny Yoo
>
> I think the subfield you're looking for is called "information retrieval",
> and there are textbooks on it.

Managing Gigabytes, for example:

   http://ww2.cs.mu.oz.au/mg/

Racket Users list: http://lists.racket-lang.org/users

Re: [racket] Need some help for my first real experiment with scheme

2012-04-18 Thread Sam Tobin-Hochstadt
On Wed, Apr 18, 2012 at 10:21 AM, Neil Van Dyke wrote:
> Pedro wrote at 04/17/2012 04:21 PM:
>
>> My first question is: which kind of data structure should I
>> use in order to perform such a quick search? I'm guessing I should
>> split my notes' data into words and store each single word

Re: [racket] Need some help for my first real experiment with scheme

2012-04-18 Thread Neil Van Dyke
Pedro wrote at 04/17/2012 04:21 PM:
> My first question is: which kind of data structure should I use in order to perform such a quick search? I'm guessing I should split my notes' data into words and store each single word in some kind of tree. But should I just jam every single word in th