for their
application.
_gk
- Original Message -
From: "Babu, KameshNarayana (GE, Research, consultant)"
<[EMAIL PROTECTED]>
To:
Sent: Wednesday, March 29, 2006 11:14 AM
Subject: RE: Hi Experts
Thanks Aditya,
Lucene is used only to search on the local machine, right?
Well, you'll have to index the internet.
Once you've done that, you can try going up against Google.
Oh, and you'll have to update that index every now and then to keep your
index of the internet current.
Good luck.
PROTECTED]
Sent: Wednesday, March 29, 2006 11:34 AM
To: java-user@lucene.apache.org
Subject: RE: Hi Experts
The way Lucene works is that you need to have the index first.
Only then can you search it.
So if you want to search within a given URL, you need to somehow create
the index of all the webpages within
"
-Original Message-
From: Ranjan K. Baisak [mailto:[EMAIL PROTECTED]
Sent: Wednesday, March 29, 2006 12:32 PM
To: java-user@lucene.apache.org
Subject: RE: Hi Experts
you wrote
"
I am using HTMLparser to parse all the HTML pages and to
get the required information out of them.
Let me t
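To illustrate the parsing step mentioned above, here is a minimal sketch of pulling plain text out of an HTML page before it is indexed. It deliberately does not use HTMLparser; it swaps in plain java.util.regex as a crude stand-in, and the class name HtmlText and method extract are hypothetical names chosen for this example only. A real parser handles malformed markup and entities far better.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of HTMLparser or Lucene.
// Assumption: crude regex stripping is good enough for a demonstration.
final class HtmlText {
    // Whole <script>...</script> and <style>...</style> blocks, any case.
    private static final Pattern SCRIPT_OR_STYLE =
        Pattern.compile("(?is)<(script|style)\\b.*?</\\1\\s*>");
    // Any remaining tag.
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]*>");

    // Returns the visible text of the page with whitespace collapsed.
    static String extract(String html) {
        String noScripts = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        String noTags = TAG.matcher(noScripts).replaceAll(" ");
        return noTags
            .replace("&nbsp;", " ")   // only two common entities are handled
            .replace("&amp;", "&")
            .replaceAll("\\s+", " ")
            .trim();
    }

    public static void main(String[] args) {
        System.out.println(extract("<p>Hello <b>Lucene</b></p>"));
    }
}
```

The extracted text is what would then be handed to the indexer, so that markup and scripts do not end up as searchable terms.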
> "
> -Original Message-
> From: Ranjan K. Baisak
> [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, March 29, 2006 12:06 PM
> To: java-user@lucene.apache.org
> Subject: Re: Hi Experts
>
>
> For internet searching Nutch is the best tool. But
> however as
"
-Original Message-
From: Ranjan K. Baisak [mailto:[EMAIL PROTECTED]
Sent: Wednesday, March 29, 2006 12:06 PM
To: java-user@lucene.apache.org
Subject: Re: Hi Experts
For internet searching, Nutch is the best tool. However,
since you don't want to use Cygwin, you need
to use Lucene in the following way.
You need to download each whole page and create an index
out of that page. Then use Lucene to search the offline
content rather than the online pages.
I have used Lucene in this way and I h
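The "index first, then search offline" idea can be sketched with a toy inverted index. This is not Lucene's API and makes no claim about how Lucene is implemented; the class TinyIndex and its methods add and search are hypothetical names for illustration, and the page text is assumed to have been downloaded already.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index (hypothetical, NOT Lucene): shows why the index
// must exist before any search can run.
final class TinyIndex {
    // term -> ids of the documents that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final List<String> names = new ArrayList<>();

    // Index step: split already-downloaded text into lowercase terms.
    void add(String name, String text) {
        int id = names.size();
        names.add(name);
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
            }
        }
    }

    // Search step: consults only the index, never the original pages.
    List<String> search(String term) {
        Set<Integer> ids = postings.getOrDefault(
            term.toLowerCase(Locale.ROOT), Collections.<Integer>emptySet());
        List<String> hits = new ArrayList<>();
        for (int id : ids) {
            hits.add(names.get(id));
        }
        return hits;
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add("page1.html", "Lucene searches an index");
        index.add("page2.html", "Nutch crawls the web");
        System.out.println(index.search("lucene"));
    }
}
```

A page that was never added cannot be found, which is the point made in this thread: the pages behind a URL have to be fetched and indexed before a search over them can return anything.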
The way Lucene works is that you need to have the index first.
Only then can you search it.
So if you want to search within a given URL, you need to somehow create
the index of all the webpages within that URL. If the web server behind
that URL is also yours, then that would not be a big deal.
But i