Design Question related to Lucene

2010-05-19 Thread ilkay polat
Hello,

I have a design question about my project. If you have time, please read
the problem below and let me know if you have a solution.

Project: Our system produces one text file every hour (e.g. at 13:00,
14:00). These files contain network logs (e.g. TCP logs). I use FreeBSD and
cron. Five minutes after each hour (e.g. at 13:05, 14:05), a process indexes
the new text file with Lucene. A web application then runs text searches
against the resulting Lucene index.

Problem: Our customers want to know which client uses the internet the most,
which site is visited the most, and similar statistics, which in SQL would
be done with a query like this:

  select site, count(site) from log_table
  group by site

My solution is: a second process inserts the logs into a temporary table,
then runs queries on this temporary table and writes the results to the main
statistics tables (for example, a most-visited-sites table). In other words,
the temporary table is used to update the related statistics tables.

I need a recommendation for this problem. Is there a solution in Lucene
(does some kind of "most frequent" query exist and, if so, does it perform
well)? If there is no solution in Lucene, what would you use in this
situation?

Thanks for your help.


ilkay POLAT
Research & Development Software Engineer, TURKEY
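
The "group by site" statistic in the question above can also be computed in one pass over an hourly log file, before or alongside indexing. The following is a minimal sketch, not something proposed in the thread: the class name SiteCounter, its methods, and the "<client> <site>" line format are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: counts visits per site in one pass.
class SiteCounter {

    // Assumed line format: "<client> <site>", separated by whitespace.
    static Map<String, Integer> countBySite(List<String> logLines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : logLines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length < 2) {
                continue; // skip malformed or empty lines
            }
            counts.merge(fields[1], 1, Integer::sum);
        }
        return counts;
    }

    // Returns the k most visited sites, most visited first; ties are broken by site name.
    static List<Map.Entry<String, Integer>> topSites(Map<String, Integer> counts, int k) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries.subList(0, Math.min(k, entries.size()));
    }
}
```

The per-hour counts could then be added to a running statistics table, which is close to the temporary-table idea described in the message, only without the intermediate SQL step.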


Removing old data from an index file

2010-05-20 Thread ilkay polat
Hello,

I would like to know whether there is a way to remove some records from an
index, and whether such removal is fast (for example, cleaning out old
records whose creation date is earlier than a given day).
Thanks
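
Within a single index, Lucene's IndexWriter provides deleteDocuments(Query), which can remove everything matching a date range. A different technique, shown here only as a sketch and not taken from the thread, is to keep one index directory per day and drop whole directories once they leave the retention window. The class name IndexRetention, the ISO-date directory names, and the retention length are assumptions; searching across several directories (for example with a MultiReader) is left out.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: picks per-day index directories that fell out of the retention window.
class IndexRetention {

    // Directory names are assumed to be ISO dates such as "2010-05-19".
    static List<String> expired(List<String> dayDirs, LocalDate today, int keepDays) {
        LocalDate cutoff = today.minusDays(keepDays);
        List<String> result = new ArrayList<>();
        for (String name : dayDirs) {
            if (LocalDate.parse(name).isBefore(cutoff)) {
                result.add(name); // strictly older than the cutoff day
            }
        }
        return result;
    }
}
```

The returned names only identify candidates; actually deleting the directories, and making sure no searcher still has them open, is up to the caller.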


Re: Design Question related to Lucene

2010-05-20 Thread ilkay polat
Is it better to analyze the logs with Lucene, or would other solutions
perform better?



-- 
ilkay POLAT
Research & Development Software Engineer, TURKEY


Out of memory problem in search

2010-07-14 Thread ilkay polat
Hello friends,

Recently I have run into a memory problem with Lucene search because the
index is very large. (I have indexed several kinds of information, and the
index is now roughly 40 gigabytes.)

I search the Lucene index with
org.apache.lucene.search.Searcher.search(query, null, offset + limit, new 
Sort(new SortField("time", SortField.LONG, true)));
(This returns the top (offset + limit) records.)

I search by range: on the web page, the first page shows the records in the
range [0, 100], the second page [100, 200], and so on. I have nearly 200,000
records in total. When I go to the last page, that is, the records between
200,000 - 100 and 200,000, the JVM throws an out-of-memory error (the
machine has 4 GB of RAM).

Is there a way to overcome this memory problem?

Thanks

--
ilkay POLAT   Software Engineer
TURKEY
 
  Gsm : (+90) 532 542 36 71
  E-mail : ilkay_po...@yahoo.com


  

RE: Out of memory problem in search

2010-07-14 Thread ilkay polat
Indeed, this is a good solution to that kind of problem. But the same problem
can occur in the future as more logs are added to the index.
For example, the problem here appears at 200,000 records (these logs were
collected over 13 days).
With the reverse approach, the maximum search range becomes 100,000.
But with 400,000 records the same problem will occur again (the maximum
search range is 200,000 again).
Is there another way that does not consume so much memory, or that uses a
bounded amount of memory and spends time instead? This restriction comes from
our project's hardware limits (at most 8 GB of memory).
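
The reverse-sort suggestion and its worst case described above can be sketched as a small planning step that runs before the search. This is only an illustration under assumptions: PagePlan and its fields are hypothetical names, and the total number of hits is assumed to be known already (for instance from the totalHits of an earlier search).

```java
// Hypothetical helper: decides from which end of the sorted result a page should be fetched.
class PagePlan {
    final boolean reverseSort; // true: run the query with the opposite sort order
    final int topN;            // number of hits to request from the search
    final int skip;            // leading hits to discard before the page starts
    final int pageSize;        // hits that actually belong to the page

    PagePlan(boolean reverseSort, int topN, int skip, int pageSize) {
        this.reverseSort = reverseSort;
        this.topN = topN;
        this.skip = skip;
        this.pageSize = pageSize;
    }

    static PagePlan plan(int totalHits, int offset, int limit) {
        int pageSize = Math.max(0, Math.min(limit, totalHits - offset));
        int fromEnd = totalHits - offset - pageSize; // hits that come after the page
        if (fromEnd < offset) {
            // The page is closer to the end: reverse the sort and skip fromEnd hits.
            // The page then arrives in reverse order and must be flipped for display.
            return new PagePlan(true, fromEnd + pageSize, fromEnd, pageSize);
        }
        return new PagePlan(false, offset + pageSize, offset, pageSize);
    }
}
```

With 200,000 hits the last page needs only topN = 100 in reverse order, but a page in the middle still needs about half of all hits, which is the growth with index size that the message points out.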

--- On Wed, 7/14/10, Uwe Schindler  wrote:

From: Uwe Schindler 
Subject: RE: Out of memory problem in search
To: java-user@lucene.apache.org
Date: Wednesday, July 14, 2010, 3:25 PM

Reverse the query sorting to display the last page.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de



Re: Out of memory problem in search

2010-07-14 Thread ilkay polat
Hi,
We have hardware restrictions (the maximum RAM is 8 GB), so unfortunately
increasing memory is not an option for us at present.

Yes, as you said, the problem appears when going to the last pages of the
search screen, because the search method finds the top n records. In other
words, for the last page this means "search everything and return
everything".

I am now looking into whether Lucene's search mechanism offers a way to
spend time instead of memory. Any other ideas?

Thanks
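
One general way to trade time for memory in paging, offered here as a sketch and not as something Lucene does or the thread proposes, is to remember the last record of the previous page and collect only the next `limit` records after it, using a heap that never grows beyond the page size. LogRecord and KeysetPager are hypothetical in-memory stand-ins; with a real index the "after the last record" condition would have to be expressed as a restriction on the "time" field.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical in-memory stand-in for index hits; not a Lucene class.
class LogRecord {
    final long time;
    final int id; // tie-breaker for records with equal time
    LogRecord(long time, int id) { this.time = time; this.id = id; }
}

class KeysetPager {
    // Display order: newest first, ties broken by ascending id.
    static final Comparator<LogRecord> NEWEST_FIRST = (a, b) -> {
        int byTime = Long.compare(b.time, a.time);
        return byTime != 0 ? byTime : Integer.compare(a.id, b.id);
    };

    // Returns the page that follows 'last' (pass null for the first page).
    // Scans every record but keeps at most limit + 1 of them at any moment.
    static List<LogRecord> nextPage(Iterable<LogRecord> records, LogRecord last, int limit) {
        // The heap head is the kept record that comes latest in display order.
        PriorityQueue<LogRecord> heap = new PriorityQueue<>(NEWEST_FIRST.reversed());
        for (LogRecord r : records) {
            if (last != null && NEWEST_FIRST.compare(r, last) <= 0) {
                continue; // already shown on this or an earlier page
            }
            heap.add(r);
            if (heap.size() > limit) {
                heap.poll(); // drop the record furthest from the page start
            }
        }
        List<LogRecord> page = new ArrayList<>(heap);
        page.sort(NEWEST_FIRST);
        return page;
    }
}
```

Each page costs a full pass over the candidates, so deep pages are no cheaper in time, but memory stays proportional to the page size instead of to offset + limit. The price is that pages can only be walked in sequence, not jumped to by number.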

--- On Wed, 7/14/10, findbestopensource  wrote:

From: findbestopensource 
Subject: Re: Out of memory problem in search
To: java-user@lucene.apache.org
Date: Wednesday, July 14, 2010, 2:59 PM

Certainly it will. Either you need to increase your memory or refine your
query. Even though you display paginated results, the first couple of pages
will display fine, while going towards the last pages may cause problems.
This is because 200,000 objects are created and iterated, 199,900 objects
are skipped, and the last 100 objects are returned. The memory is consumed
in creating these objects.

Regards
Aditya
www.findbestopensource.com




Re: Out of memory problem in search

2010-07-14 Thread ilkay polat
I am also confused about Lucene's memory management.

Does this out-of-memory problem mainly arise from Reason-1 or Reason-2?

Reason-1: The problem comes from searching a big index (nearly 40 GB). If
so, the problem would arise again even if only 100 records (a small number)
were returned from a search in a 60 GB index.
OR
Reason-2: The problem comes from finding so many records (nearly 200,000),
which means 200,000 Java objects on the heap. If so, the problem would arise
again even if the index were only 10 GB (a small size) but many records were
returned.

Is there any document that describes the general memory management issues
of searching in Lucene?

Thanks

 
ilkay POLAT     Software Engineer   Gsm : (+90) 532 542 36 71
  E-mail : ilkay_po...@yahoo.com
