Fixed, thank you. The changes are committed to CVS; please try it (I think the index
is corrupted, so you need to recreate it).
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru
ect -
headline function is slow enough.
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
---(end of broadcast)---
TIP 6: explain analyze is your friend
I'm using 8.1.4 at the moment but I guess I need to update. The 8.2 is
looking really promising. So with 8.2 I don't need the subselect?
IMHO, you don't need it.
headline function is slow enough.
You think?! ;)
I know :) - computing a headline is a hard task
-
performance
for queries like
select * from a,b where a.f = b.f or ( a.f is null and b.f is null)
NULL support is fast in MS SQL because MS SQL doesn't follow the SQL standard: an
index in MS SQL treats (NULL = NULL) as true.
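To illustrate the standard behaviour that the query above works around, here is a minimal sketch. It uses SQLite through Python's sqlite3 module as a stand-in (an assumption: it is neither PostgreSQL nor MS SQL), and the table contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (f INTEGER);
    CREATE TABLE b (f INTEGER);
    INSERT INTO a VALUES (1), (NULL);
    INSERT INTO b VALUES (1), (NULL);
""")

# Standard semantics: NULL = NULL is not true, so the NULL rows do not join.
plain = conn.execute(
    "SELECT count(*) FROM a, b WHERE a.f = b.f"
).fetchone()[0]

# The query form from the thread: the extra branch pairs the NULL rows explicitly.
null_safe = conn.execute(
    "SELECT count(*) FROM a, b "
    "WHERE a.f = b.f OR (a.f IS NULL AND b.f IS NULL)"
).fetchone()[0]

print(plain, null_safe)
```

The OR branch is what makes such a join harder to serve from an ordinary index lookup on f alone.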
--
Teodor Sigaev E-mail: [EMAIL PROT
he files seem to be ok and are UTF-8 encoded.
Best regards
Manuel
--
Teodor Sigaev
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
oking for.
IIRC I had TSearch2 with my `oldFormat' files working on an older
8.2-dev-snapshot.
Thanks for any hint.
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
Hmm, 2.0.1. But what's the difference? I don't follow changes in OpenOffice
closely.
Hannes Dorbath wrote:
What version of OpenOffice MySpell dictionaries is supposed to work with
TSearch in 8.2?
The format used till OpenOffice 2.0.1 or the format starting from 2.0.2?
--
Teo
Oh, I see. So, only 2.0.1, and I can't change that for the 8.2 branch. :(
Hannes Dorbath wrote:
On 21.12.2006 18:32, Teodor Sigaev wrote:
Are you trying to convert openoffice (myspell) format to ispell with
help of my2ispell?
Yes:
http://groups.google.com/group/pgsql.general/browse_thread/t
e configuration is not saved correctly?
Best regards
Manuel ...
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
-
1
(1 row)
contrib_regression=# select numnode( plainto_tsquery('long table') );
numnode
-----
3
(1 row)
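One way to read the result above: the query 'long' & 'table' has two lexeme nodes and one operator node. The sketch below is a toy model of that count; the tuple representation is an assumption made for illustration and is not tsearch2's internal format.

```python
def numnode(query):
    """Count operand and operator nodes in a tiny query tree.

    A query is either a lexeme (str) or a tuple (operator, left, right).
    """
    if isinstance(query, str):
        return 1  # a single lexeme is one node
    _operator, left, right = query
    return 1 + numnode(left) + numnode(right)


# 'long' & 'table' as a toy tree: one operator plus two lexemes.
long_table = ("&", "long", "table")
print(numnode(long_table))
```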
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
-
called on null input;
# select isvoid( plainto_tsquery('the & any') );
NOTICE: query contains only stopword(s) or doesn't contain lexeme(s), ignored
isvoid
t
(1 row)
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
waiting socket to write,
so maybe there is a symmetrical problem with read? Or is pgwin32_select() used
for waiting on writes too?
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru
intarray. My question is whether I still should use intarray for
indexing (if yes then either I should use GIST or GIN) or maybe GIN
index is faster than GIST+intarray / GIN+intarray.
Yes, with intarray you can use either GiST or GIN indexes, whichever you wish.
--
Teodor Sigaev
Use a GIN index instead of GiST
I have a table of books, with 120 records. I have created a GiST
index over the title and subtitle,
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru
tsquery. A short description of
hlparsetext can be found at
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/HOWTO-parser-tsearch2.html
near the end. The description of the HLWORD struct is somewhat out of date, sorry.
--
Teodor Sigaev E-mail: [
to_tsvector() could as well return the character number or a byte
pointer, I could see advantages for both. But the word number makes
little sense to me.
Word numbers are used only in ranking functions. If you don't need ranking, then
you can safely strip the positional information.
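As a rough illustration of why stripping is safe for plain matching, here is a small sketch. The dictionary layout (lexeme mapped to a list of word positions) is a simplified stand-in for a tsvector, and both function names are hypothetical.

```python
def strip_positions(vector):
    """Return only the lexemes, discarding their word positions."""
    return frozenset(vector)


def matches_all(lexemes, query_words):
    """AND-style match: every query word must be present."""
    return all(word in lexemes for word in query_words)


vector = {"test": [1], "text": [2]}
stripped = strip_positions(vector)

# Matching gives the same answer with or without positions.
print(matches_all(vector, ["test", "text"]))
print(matches_all(stripped, ["test", "text"]))
```

Only ranking needs the positions, so a stripped vector trades ranking quality for a smaller stored value.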
--
T
ranking purpose
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
No, the first X aren't more important, but being able to determine
word proximity is very important for partial phrase matching and
ranking. The closer the words, the "better" the match, all else being
equal.
exactly
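The proximity idea quoted above can be sketched as the smallest distance between the positions of two words. This is only an illustration under the assumption that positions are word numbers; it is not the formula used by the tsearch2 ranking functions.

```python
from itertools import product


def min_distance(positions_a, positions_b):
    """Smallest gap, in words, between any occurrence of a and any of b."""
    return min(abs(a - b) for a, b in product(positions_a, positions_b))


# Hypothetical documents: word positions for the lexemes 'full' and 'text'.
doc_close = {"full": [1], "text": [2]}
doc_far = {"full": [1, 20], "text": [9]}

print(min_distance(doc_close["full"], doc_close["text"]))
print(min_distance(doc_far["full"], doc_far["text"]))
```

All else being equal, a ranking function would prefer the document with the smaller distance.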
;test text');
to_tsvector
---
'test':1 'text':2
(1 row)
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
8.2 has a fully rewritten text parser based on the POSIX is* functions.
Thomas Pundt wrote:
On Wednesday 21 March 2007 14:25, Teodor Sigaev wrote:
| I can't reproduce your problem, but I don't have a Windows box; can anybody
| reproduce that?
just a guess in the wild; I once had a similar phenome
lions. Bigger collections
require engines like Google.
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
: any data older than one month (which doesn't
change) with a GIN index and new data with GiST, and once per month move data
from GiST to GIN.
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW:
non-breaking space (0xa0), and that commit
assumes that any character > 0x7f is alpha when the C locale is used with a
multibyte encoding. To check the theory, please apply the attached patch.
If so, I'm confused: we cannot assume that 0xa0 is a space symbol in any
multibyte encoding, even
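A quick way to see why the byte 0xa0 alone says little in a multibyte encoding is to compare how a few characters are encoded. The sketch below uses Latin-1 and UTF-8 as examples; the choice of encodings and characters is an assumption for illustration and does not reproduce the parser code under discussion.

```python
# Non-breaking space (U+00A0) in a single-byte and a multibyte encoding.
nbsp_latin1 = "\u00a0".encode("latin-1")
nbsp_utf8 = "\u00a0".encode("utf-8")

# The letter 'a with grave' (U+00E0) in UTF-8 also contains the byte 0xa0.
a_grave_utf8 = "\u00e0".encode("utf-8")


def has_byte_a0(data):
    """True if the raw byte 0xa0 occurs anywhere in the byte string."""
    return 0xA0 in data


print(nbsp_latin1, nbsp_utf8, a_grave_utf8)
print(has_byte_a0(nbsp_utf8), has_byte_a0(a_grave_utf8))
```

In UTF-8 the byte 0xa0 shows up as a trailing byte of both a space-like character and a letter, so classifying it byte by byte cannot be right.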
Solved, see the attached patch. I found an old Celeron-300 box and installed Windows
on it, and it was very slow :)
Nope, same result with this patch.
Thank you.
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW
nowball is out). It should return 'voyage' (=travel) instead of
'voyageuse' (=female traveler)
That's not what I want; I want to use Snowball to stem French words.
I'm going to make a debug build and try to debug it, but if anyone
can help
e and
Snowball doesn't use a version mark or anything similar. So a Snowball core and
stemmers downloaded at different times may be incompatible :(.
Our tsearch_core patch (moving tsearch into the core of pgsql) solves that problem -
it contains all possible Snowball stemmers.
--
Teo
Fixed. Thanks for the report.
Anyway, just to signal that tsearch2 crashes if SELECT is not
granted to pg_ts_dict (other tables give a proper error message when
not GRANTed).On
--
Teodor Sigaev E-mail: [EMAIL PROTECTED
Sorry, no - I tested on CVS HEAD, so dll isn't compatible :(
Wait a bit for 8.2.4
richardcraig wrote:
Teodor
As a non-C windows user (yes - throw stones at me :) ) Do you have a fixed
dll for this patch that I can try?
Thanks
Richard
Teodor Sigaev-2 wrote:
Solved, see attached pat
you should be able to index the way you want. In contrib there is a module
"cube" which does something similar to what you want in 3D; extending it to 12D
shouldn't be too hard...
The contrib/cube module implements an N-dimensional cube representation
--
Teodor Sigaev
pgsql encodings.
by the caller?
Yes, of course.
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
Please check your steps or tell me where I'm wrong :)
If you still have problems, I can solve them if I get access to your
development server...
% cd PGSQL_SRC
% zcat ~/tmp/tsearch_snowball_82-20070504.gz| patch -p0
% cd contrib/tsearch2
% gmake && su -c 'gmake install' && gmake installcheck
% c
ist? would openfts help (im guessing not)?
Failing that does anybody have experience of combining another text
indexing package with postgresql?
Dave
--
Teo
;[dist<=1]' meaning "next
token follows with a max distance of 1". I imagine that it would
only be useful on unstripped tsvectors, but if the lexem position is
already stored ...
--
Mike Rylander
[EMAIL PROTECTED]
GPLS -- PINES Development
Database Developer
http://open-
hould
work reasonably with such corner cases.
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
r', bad: foo & bar).
But casting 'asasas'::tsvector and dump/reload will not work correctly.
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
-
Please note that the dict, aff, and stopword files should be in the server encoding.
Snowball sources for German (and other languages) in UTF8 can be found at
http://snowball.tartarus.org/dist/libstemmer_c.tgz
To all: maybe we should put all of Snowball's stemmers (for all available
languages and encodings) into tsearch2
it (or mail it directly to
me)?
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
one.
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
Hmm, I have found a small bug:
When there is a compound affix with a zero-length search pattern (which
should not happen!), the ispell dictionary ignores all other compound affixes.
The original affix file contains
flag ~\`:
E > -E,NINGS#~ avskrive > avskrivnings-
Z Y Z Y
i would
be more than happy to get directions - if not i'll do it the "hard
way" and post my results once i succeed;-)
best regards,
thies
but without index support, and you would need to write your own
operator/function.
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
en those to rows? Any? Even
if it's a bad hack? I really need it :/
Thanks in advance
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
$ ./config.sh
config.sh : bad sustitution
Simple workaround: take dictionary generated on Gentoo.
I'll look at the problem, but I suspect the reason is a difference between the Sun
and GNU environments (echo, sed and so on).
--
Teodor Sigaev E-mail: [
arch2 improvements in 8.0.
Tsearch 2 is now working on solaris
Good
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
est regards,
Nikolay
--
Teodor Sigaev
in tsearch?
Yudie
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
don't know how to give this password
programmatically.
So I'm back at the drawing board. How can I make fast bulk inserts into
a PostgreSQL database from within a Perl script?
Thanks!
kj
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
r.
It may be a problem with UTF-8: only the CVS HEAD tsearch2 supports UTF-8. But you
can find a patch for 8.1 at http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
not a tsearch?
Index Cond: (search_vector ~ '*.6.6.9.3.4.4.*'::lquery)
That's the problem. Queries that begin with '*' will be rather slow...
Try reducing SIGLENINT in tsearch2/gistidx.h to 8 (do not forget to reindex!)
and try again
--
Teodor Sigaev
ient index structure to support the queries you want... Maybe
Oleg knows...
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
~ 'a.b.*{1}';
?column?
--
t
(1 row)
I can determine things like this with a few experiments, but I want to
know "the right way" to work with ltrees and referential integrity. How
do people use this?
That's the right way.
--
Teodor Sigaev
int
alter table foo add foreign key subpath( path, 0, -1) references foo( path )
deferrable initially deferred,;
But it's impossible...
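Since a foreign key on an expression is not available, the same rule (every path's parent must also exist) could be checked elsewhere, for example in application code. The sketch below assumes that subpath(path, 0, -1) means "the path without its last label"; the function names and sample paths are hypothetical.

```python
def parent(path):
    """Drop the last label of a dotted path; a top-level path has no parent."""
    labels = path.split(".")
    return ".".join(labels[:-1])


def orphans(paths):
    """Return paths whose parent path is missing from the set."""
    known = set(paths)
    return sorted(
        p for p in known
        if parent(p) and parent(p) not in known
    )


print(orphans(["a", "a.b", "a.b.c"]))   # every parent is present
print(orphans(["a", "a.b", "a.c.d"]))   # 'a.c' is missing
```

In the database itself the equivalent check would have to live in a trigger rather than in a foreign key constraint.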
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
e obviously put a lot of thought in
ltree. Maybe it'll be possible with a future version of PostgreSQL :)
Make a patch to allow functions in FKs :)
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://
mes that you don't need sophisticated
exact phrase matching.
OpenFTS can run on a different box than pgsql, and OpenFTS can index files directly
from the file system.
5. Are there any scripts, tools, add-ons, etc. that you can recommend?
We can tweak OpenFTS/tsearch
1. If I am correct about this then what is the point of using the ISpell
dictionary in the first place?
Yes. The main goal of any dictionary is to 'normalize' a lexeme, i.e. to
get its base form (such as the infinitive). It's very important for languages
with variable word forms such as French, Russian, Norwegian etc. So
' & 'bane') | ('fot' & 'ball' &
'bane') will match.
So, all variants to split compound words are joined with OR, words in one
variant are joined with AND.
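The OR/AND rule described above can be sketched as a small string builder. The function name and the example variants are made up for illustration, and this is not how the ispell dictionary produces its output internally.

```python
def compound_query(variants):
    """Join the words of each split variant with AND, and the variants with OR."""
    groups = [
        "(" + " & ".join(f"'{word}'" for word in words) + ")"
        for words in variants
    ]
    return " | ".join(groups)


# Hypothetical example: two ways to split one compound word.
variants = [["book", "case"], ["bookcase"]]
print(compound_query(variants))
```

A document matches if it contains every word of at least one variant.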
If that isn't desirable, you can forbid word splitting for ispell (just comment z
fl
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
all - it will be
very, very slow, because there are a lot of pointers in the GIN index to each table row.
It seems to me that the message is confusing about the reason for the error...
Interestingly this works:
explain analyze
select *
from test.features
where NULL @@ features.vect
- so, error message should mention something like this:
GIN index doesn't support search with void argument.
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
search v2 has stat function:
select * from stat('select TSVECTOR from TABLE') order by ndoc desc, nentry
desc, word;
Here TSVECTOR is the name of a column of tsvector type and TABLE is the table with
that column.
Warning: it is very slow.
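To make the ordering in that query concrete, here is a rough model of such statistics. It assumes ndoc is the number of documents containing a word and nentry is the total number of occurrences; the vector layout and the function are illustrative and are not the tsearch2 implementation.

```python
from collections import Counter


def stat(vectors):
    """Return (word, ndoc, nentry) rows ordered by ndoc desc, nentry desc, word."""
    ndoc = Counter()
    nentry = Counter()
    for vector in vectors:
        for word, positions in vector.items():
            ndoc[word] += 1                 # one more document has this word
            nentry[word] += len(positions)  # count every occurrence
    rows = [(word, ndoc[word], nentry[word]) for word in ndoc]
    rows.sort(key=lambda row: (-row[1], -row[2], row[0]))
    return rows


# Hypothetical documents: lexeme -> list of word positions.
vectors = [{"test": [1, 3], "text": [2]}, {"test": [1]}]
for row in stat(vectors):
    print(row)
```

Such a pass has to visit every stored vector, which fits the warning about speed.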
--
Teodor Sigaev
third, fourth, fifth, etc.) occurrence of
any given word when its presence in the document is being scored, yet kept in
the equation for modifications to the score when proximity is being considered?
I don't see a way other than modifying the strip or rank functions...
--
Teo
to {one,hundred}, not {"one hundred"} as is currently happening.
How do I specify the output of the lexize function so that this will
happen?
--
Teodor Siga
tial index
create index fti on qrydocumentos using gist (conteudo_stem_ix) where
codgrupousuario = 1;
One more thing: you could use an ispell dictionary (I suppose, for the Portuguese language,
http://fmg-www.cs.ucla.edu/geoff/ispell-dictionaries.html#Portuguese-dicts )
--
Teodor Sigaev
'a.u.m'
'logist' |
| A.U.M. LOGISTICS (INDIA) PVT. LTD. | 'ltd' 'pvt' 'a.u.m' 'india'
'logist' |
++----------+
(3 rows)
Time: 425.697 ms
tradein_clients=
advance
Marc
Look, to_tsquery(' overbuljongterningpakkmesterassistent') returns the query
"over & buljong & terning & pakk & mester & assistent". Are you sure that
the text contains all of those words?
--
Teodor Sigaev
index requires an immutable function.
Jochem
--
Teodor Sigaev E-ma
Modifiers
+-+---
id | integer | not null
value | text|
cval | text|
Indexes:
"child_pkey" PRIMARY KEY btree (id)
\d doesn't show any info about inheritance
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
--
earch/V2/,
especially http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_in_Brief
(but it is in bad English :( )
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
install?
what files do i need to move into $libdir directory to get tsearch2 up
and running?
Thanks!
--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
tp://www.com/
http://aew.werc.ewr/?ad=qwe&d
w 1aew.werc.ewr/?ad=qwe&dw 2aew.werc.ewr http://3aew.werc.ewr/?ad=qwe&dw http://
4aew.werc.ewr http://5aew.werc.ewr:8100/? ad=qwe&dw 6aew.w"..., limit=564)
at parser.l:303
It seems to me to be a bug in flex...
--
Teodor Siga
only thing using it. More importantly
I don't seem to be able to find the mailing list thread that covered pretty
much this exact unexpected exit fault. So, can anyone help with a fix,
explanation or link to the relevant thread please?
Do you have a core file? If yes, then send the gdb ou
1') AND
content_ix @@ to_tsquery('word2') OR content_ix @@ to_tsquery('word3');
I'm having to do this on some complex queries to put LIKEs between some
ts_queries.
Does anyone have such experience?
Thanks in advance,
--
Teodor Sigaev E-mail: [EM
--
Teodor Sigaev