Re: [GENERAL] multi terabyte fulltext searching

2007-03-22 Thread Arturo Perez
On Wed, 21 Mar 2007 08:57:39 -0700, Benjamin Arai wrote: > Hi Oleg, I am currently using GIST indexes because I receive about 10GB of new data a week (then again I am not deleting any information). I do not expect to be able to stop receiving text for about 5 years, so the data is not

Re: [GENERAL] multi terabyte fulltext searching

2007-03-21 Thread Benjamin Arai
Are there any examples of dblink being used in commercial environments? I am curious to understand how it deals with node failures, etc. Benjamin On Mar 21, 2007, at 9:35 AM, Oleg Bartunov wrote: On Wed, 21 Mar 2007, Benjamin Arai wrote: Can't you implement something similar to google

Re: [GENERAL] multi terabyte fulltext searching

2007-03-21 Thread Joshua D. Drake
Benjamin Arai wrote: > 24. I can think of a couple of things. 1. Increase your spindle count. 2. Push your gist indexes off to another array entirely (with separate controllers) 3. Split your actual tables between other arrays Or... buy a SAN (but then again, I just replaced a million dollar SAN

Re: [GENERAL] multi terabyte fulltext searching

2007-03-21 Thread Oleg Bartunov
On Wed, 21 Mar 2007, Benjamin Arai wrote: Can't you implement something similar to google by aggregating results for TSearch2 over many machines? tsearch2 doesn't use any global statistics, so, in principle, you should be able to run fts on several machines and combine them using dblink (contr
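The combining step Oleg describes can be sketched roughly as follows. This is only an illustration, not something posted in the thread: it assumes each machine returns its own hits already sorted by rank, and that ranks are comparable across machines because tsearch2 uses no global statistics. The function name merge_shard_hits and the sample data are hypothetical, and the dblink queries that would produce the per-machine lists are left out.

```python
import heapq
import itertools
from typing import Iterable, List, Tuple

Hit = Tuple[float, str]  # (rank, document id); both fields are illustrative


def merge_shard_hits(shard_hits: Iterable[List[Hit]], limit: int) -> List[Hit]:
    """Combine per-machine result lists into one global top-`limit` list.

    Assumes every input list is already sorted by rank, highest first.
    """
    # heapq.merge lazily interleaves the sorted inputs; reverse=True makes it
    # yield the largest rank first, matching the order of the inputs.
    merged = heapq.merge(*shard_hits, key=lambda hit: hit[0], reverse=True)
    return list(itertools.islice(merged, limit))


# Hypothetical results from two machines, each sorted by rank descending.
shard_a = [(0.9, "a1"), (0.4, "a2")]
shard_b = [(0.7, "b1"), (0.5, "b2")]
top3 = merge_shard_hits([shard_a, shard_b], 3)
print(top3)
```

Since only the first `limit` hits are consumed, each machine needs to return at most `limit` rows for the merged result to be correct.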

[GENERAL] multi terabyte fulltext searching

2007-03-21 Thread Benjamin Arai
Hi, I have been struggling with getting fulltext searching for very large databases. I can fulltext index 10s of gigs without any problem but when I start getting to hundreds of gigs it becomes slow. My current system is a quad core with 8GB of memory. I have the resource to throw more

Re: [GENERAL] multi terabyte fulltext searching

2007-03-21 Thread Oleg Bartunov
On Wed, 21 Mar 2007, Benjamin Arai wrote: What is inheritance+CE? Hmm, http://www.postgresql.org/docs/8.2/static/ddl-inherit.html http://www.postgresql.org/docs/8.2/static/ddl-partitioning.html Benjamin On Mar 21, 2007, at 9:10 AM, Oleg Bartunov wrote: inheritance+CE Regard

Re: [GENERAL] multi terabyte fulltext searching

2007-03-21 Thread Teodor Sigaev
I am currently using GIST indexes because I receive about 10GB of new data a week (then again I am not deleting any information). I do not expect to be able to stop receiving text for about 5 years, so the data is not going to become static any time soon. The reason I am concerned with perf

Re: [GENERAL] multi terabyte fulltext searching

2007-03-21 Thread Benjamin Arai
What is inheritance+CE? Benjamin On Mar 21, 2007, at 9:10 AM, Oleg Bartunov wrote: inheritance+CE

Re: [GENERAL] multi terabyte fulltext searching

2007-03-21 Thread Benjamin Arai
Can't you implement something similar to google by aggregating results for TSearch2 over many machines? Benjamin On Mar 21, 2007, at 8:59 AM, Teodor Sigaev wrote: I'm afraid that fulltext search on a multi-terabyte set of documents cannot be implemented on any RDBMS, at least on a single box.

Re: [GENERAL] multi terabyte fulltext searching

2007-03-21 Thread Benjamin Arai
24. Benjamin On Mar 21, 2007, at 9:09 AM, Joshua D. Drake wrote: Benjamin Arai wrote: True, but what happens when my database reaches 100 terabytes? Is 5 seconds ok? How about 10? My problem is that I do not believe the performance loss I am experiencing as the data becomes large is (log t

Re: [GENERAL] multi terabyte fulltext searching

2007-03-21 Thread Oleg Bartunov
On Wed, 21 Mar 2007, Benjamin Arai wrote: Hi Oleg, I am currently using GIST indexes because I receive about 10GB of new data a week (then again I am not deleting any information). I do not expect to be able to stop receiving text for about 5 years, so the data is not going to become stat

Re: [GENERAL] multi terabyte fulltext searching

2007-03-21 Thread Joshua D. Drake
Benjamin Arai wrote: > True, but what happens when my database reaches 100 terabytes? Is 5 seconds ok? How about 10? My problem is that I do not believe the performance loss I am experiencing as the data becomes large is (log the # of records). This worries me because I could be doing somet

Re: [GENERAL] multi terabyte fulltext searching

2007-03-21 Thread Benjamin Arai
By the way, what is the largest TSearch2 database that you know of and how fast does it return results? Maybe my expectations are unrealistic. Benjamin On Mar 21, 2007, at 8:42 AM, Oleg Bartunov wrote: Benjamin, as one of the authors of tsearch2 I'd like to know more about your setup. t

Re: [GENERAL] multi terabyte fulltext searching

2007-03-21 Thread Teodor Sigaev
I'm afraid that fulltext search on a multi-terabyte set of documents cannot be implemented on any RDBMS, at least on a single box. Specialized fulltext search engines (with exact matching and a search time of about one second) have a practical limit near 20 million docs, and a cluster near 100 million.

Re: [GENERAL] multi terabyte fulltext searching

2007-03-21 Thread Benjamin Arai
True, but what happens when my database reaches 100 terabytes? Is 5 seconds ok? How about 10? My problem is that I do not believe the performance loss I am experiencing as the data becomes large is (log the # of records). This worries me because I could be doing something wrong. Or I mig

Re: [GENERAL] multi terabyte fulltext searching

2007-03-21 Thread Benjamin Arai
Hi Oleg, I am currently using GIST indexes because I receive about 10GB of new data a week (then again I am not deleting any information). I do not expect to be able to stop receiving text for about 5 years, so the data is not going to become static any time soon. The reason I am conc
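For a sense of scale, the figures quoted above (about 10GB a week for about 5 years, nothing deleted) can be turned into a rough total. This is back-of-the-envelope arithmetic only; it assumes a constant ingest rate and 52 weeks per year, and it does not count the data already stored.

```python
# Figures taken from the message above; the constant-rate assumption is ours.
GB_PER_WEEK = 10
WEEKS_PER_YEAR = 52
YEARS = 5

# Total new text expected over the whole period, in gigabytes.
total_gb = GB_PER_WEEK * WEEKS_PER_YEAR * YEARS
print(f"{total_gb} GB of new text, about {total_gb / 1000:.1f} TB")
```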

Re: [GENERAL] multi terabyte fulltext searching

2007-03-21 Thread Joshua D. Drake
Benjamin Arai wrote: > Hi, I have been struggling with getting fulltext searching for very large databases. I can fulltext index 10s of gigs without any problem but when I start getting to hundreds of gigs it becomes slow. My current system is a quad core with 8GB of memory. I have the

Re: [GENERAL] multi terabyte fulltext searching

2007-03-21 Thread Oleg Bartunov
Benjamin, as one of the authors of tsearch2 I'd like to know more about your setup. tsearch2 in 8.2 has GIN index support, which scales much better than the old GiST index. Oleg On Wed, 21 Mar 2007, Benjamin Arai wrote: Hi, I have been struggling with getting fulltext searching for very large da

[GENERAL] multi terabyte fulltext searching

2007-03-21 Thread Benjamin Arai
Hi, I have been struggling with getting fulltext searching for very large databases. I can fulltext index 10s of gigs without any problem but when I start getting to hundreds of gigs it becomes slow. My current system is a quad core with 8GB of memory. I have the resource to throw more