On Wed, Jul 15, 2015 at 12:31 AM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> On Tue, Jul 7, 2015 at 6:33 AM, Alexander Korotkov < > a.korot...@postgrespro.ru> wrote: > >> >> >> See Tom Lane's comment about downgrade scripts. I think just remove it is >> a right solution. >> > > The new patch removes the downgrade path and the ability to install the > old version. > > (If anyone wants an easy downgrade path for testing, they can keep using > the prior patch--no functional changes) > > It also added a comment before the trigramsMatchGraph call. > > I retained the palloc and the loop to promote the ternary array to a > binary array. While I also think it is tempting to get rid of that by > abusing the type system and would do it that way in my own standalone code, > it seems contrary to way the project usually does things. And I couldn't > measure a difference in performance. > Ok! > .... > > > >> Let's consider '^(?!.*def).*abc' regular expression as an example. It >> matches strings which contains 'abc' and don't contains 'def'. >> >> # SELECT 'abc' ~ '^(?!.*def).*abc'; >> ?column? >> ---------- >> t >> (1 row) >> >> # SELECT 'def abc' ~ '^(?!.*def).*abc'; >> ?column? >> ---------- >> f >> (1 row) >> >> # SELECT 'abc def' ~ '^(?!.*def).*abc'; >> ?column? >> ---------- >> f >> (1 row) >> >> Theoretically, our trigram regex processing could support negative >> matching of 'def' trigram, i.e. trigramsMatchGraph(abc = true, def = false) >> = true but trigramsMatchGraph(abc = true, def = true) = false. Actually, it >> doesn't because trigramsMatchGraph() implements a monotonic function. I >> just think it should be stated explicitly. >> > > Do you think it is likely to change to stop being monotonic and so support > the (def=GIN_TRUE) => false case? > > ^(?!.*def) seems like a profoundly niche situation. (Although one that > I might actually start using myself now that I know it isn't just a > Perl-ism). > > It doesn't make any difference to this patch, other than perhaps how to > word the comments. > Yes, it was just about the comments. I also run few tests on real-life dataset: set of dblp paper titles. As it was expected, there is huge speedup when pattern is long enough. # SELECT count(*) FROM dblp_titles WHERE s ILIKE '%Multidimensional Data%'; count ------- 318 (1 row) Time: 29,114 ms # SELECT count(*) FROM dblp_titles WHERE s ~* '.*Multidimensional Data.*'; count ------- 318 (1 row) Time: 26,273 ms # SELECT count(*) FROM dblp_titles WHERE s ILIKE '%Multidimensional Data%'; count ------- 318 (1 row) Time: 249,725 ms # SELECT count(*) FROM dblp_titles WHERE s ~* '.*Multidimensional Data.*'; count ------- 318 (1 row) Time: 218,627 ms For me, patch is in the pretty good shape. I'm going to mark it "Ready for committer". ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company