Heikki,

Thank you very much. I tried it out and the initial results look good.

However, I get "java.lang.OutOfMemoryError: Java heap space" when I search 
a single TextField across 70 million records. My code probably needs tuning.

I'll research more to figure it out. But this is a great start, thanks to 
everyone who provided suggestions.

Regards,
Raghu


-----Original Message-----
From: heikki [mailto:tropic...@gmail.com] 
Sent: Monday, June 17, 2013 5:35 PM
To: java-user@lucene.apache.org
Subject: Re: New Lucene User

hi,

I think Lucene is an excellent option for you.

You don't need to export the data to a flat file first. You can just access 
your database (in whatever way you normally like, e.g. using JDBC or 
Hibernate). You can do this for example once a day, retrieving only modified 
records. For each record you retrieve, you create a so-called Lucene Document. 
You add fields to these documents as you see fit -- for example, you want to 
search in 20 of your 30 columns, so you could add fields containing the values 
from those 20 columns to the Lucene Document.
You give each Document to an IndexWriter, which adds it to the Lucene 
index. When you search, you retrieve these documents, which you can then use 
to build the UI display of the search results.
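
The steps above could be sketched roughly as follows. This is only an 
illustration under several assumptions, not a definitive implementation: it 
targets the Lucene 5 or later API (older releases use slightly different 
constructors), and the class name RecordIndexer, the table name "records", 
the columns "id" and "last_modified", and the entries in SEARCH_COLUMNS are 
all hypothetical placeholders for your own schema.

```java
import java.io.IOException;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class RecordIndexer {

    // Hypothetical column names: replace with the 20 columns you want to search.
    static final String[] SEARCH_COLUMNS = {"col_a", "col_b", "col_c"};

    /** Builds one Lucene Document from one database row. */
    static Document toDocument(String id, Map<String, String> columns) {
        Document doc = new Document();
        // Untokenized key, stored so that a search hit can be mapped back to its row.
        doc.add(new StringField("id", id, Field.Store.YES));
        for (Map.Entry<String, String> e : columns.entrySet()) {
            if (e.getValue() != null) {
                // Tokenized for full-text search; not stored, to keep the index smaller.
                doc.add(new TextField(e.getKey(), e.getValue(), Field.Store.NO));
            }
        }
        return doc;
    }

    /** Reads rows modified after 'since' over JDBC and adds them to the index. */
    static void indexModifiedRows(String jdbcUrl, String indexPath, Timestamp since)
            throws SQLException, IOException {
        // Assumed schema: table "records" with "id" and "last_modified" columns.
        String sql = "SELECT id, " + String.join(", ", SEARCH_COLUMNS)
                + " FROM records WHERE last_modified > ?";
        try (Connection con = DriverManager.getConnection(jdbcUrl);
             PreparedStatement ps = con.prepareStatement(sql);
             Directory dir = FSDirectory.open(Paths.get(indexPath));
             IndexWriter writer =
                     new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
            ps.setTimestamp(1, since);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    String id = rs.getString("id");
                    Map<String, String> columns = new LinkedHashMap<>();
                    for (String c : SEARCH_COLUMNS) {
                        columns.put(c, rs.getString(c));
                    }
                    // Replaces any earlier Document with the same id, so re-runs are safe.
                    writer.updateDocument(new Term("id", id), toDocument(id, columns));
                }
            }
            writer.commit();
        }
    }
}
```

A daily job would call indexModifiedRows with your JDBC URL, the index 
directory, and the timestamp of the previous run. Only the id is stored in 
the index here, on the assumption that the UI fetches the full row from the 
database once a search returns the matching ids.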

Of course there's a lot more to say about this and I'd recommend you check 
online tutorials or one of the Lucene books like *Lucene In Action* to learn 
more about how to use Lucene in detail.

Kind regards
Heikki Doeleman


On Mon, Jun 17, 2013 at 11:03 PM, <raghavendra.k....@barclays.com> wrote:

> Hi,
>
> I have a requirement to perform full-text search in a new application. I 
> came across Lucene and want to check whether it suits our needs.
>
> Requirement:
>
> I have a SQL Server database table with around 70 million records in it.
> It is not a live table and the data gets appended to it on a daily basis.
>
> The table has about 30 columns. The user will provide one string, and 
> this value has to be searched against 20 columns for each record. All 
> matching records need to be displayed in the UI.
>
> My Analysis
>
> Based on what I have read until now about Lucene, I believe I need to 
> convert my database table data into a flat file, generate indexes and 
> then perform the search.
>
> Questions
>
>
> - To begin with, is Lucene a good option for this kind of requirement?
>   Note: Let us ignore daily index generation and UI display for this
>   discussion.
>
> - Should the entire data of 70 million records exist in one flat file?
>
> - How do I define what fields (20 columns) should be searched among the
>   complete list (30 columns)?
>
> As I am just starting off, I may not even know about other 
> dependencies. I kindly request you to provide clarifications / 
> reference to an example that would suit my case.
>
> Please let me know if you have any questions.
>
> Thanks,
> Raghu
>
>
> _______________________________________________
>
> This message is for information purposes only, it is not a 
> recommendation, advice, offer or solicitation to buy or sell a product 
> or service nor an official confirmation of any transaction. It is 
> directed at persons who are professionals and is not intended for 
> retail customer use. Intended for recipient only. This message is subject to 
> the terms at:
> www.barclays.com/emaildisclaimer.
>
> For important disclosures, please see:
> www.barclays.com/salesandtradingdisclaimer regarding market commentary 
> from Barclays Sales and/or Trading, who are active market 
> participants; and in respect of Barclays Research, including 
> disclosures relating to specific issuers, please see 
> http://publicresearch.barclays.com.
>
> _______________________________________________
>

