New Feature Request

2020-05-25 Thread Bert Scalzo
I am reposting this from a few months back (see below). I am not trying to
be a pest, just very motivated. I really think this feature has merit, and
if not generally worthwhile, I'd be willing to pay someone to code it for
me as I don't have strong enough C skills to modify the PostgreSQL code
myself. So anyone who might have such skills that would be interested,
please contact me: bertscal...@gmail.com.

MySQL has a really useful feature they call the query rewrite cache. The
optimizer checks incoming queries to see if a known better rewrite has been
placed within the query rewrite cache table. If one is found, the rewrite
replaces the incoming query before sending it to the execution engine. This
capability allows for one to fix poorly performing queries in 3rd party
application code that cannot be modified. For example, suppose a 3rd party
application contains the following inefficient query: SELECT COUNT(*) FROM
table WHERE SUBSTRING(column,1,3) = 'ABC'. One can place the following
rewrite in the query rewrite cache: SELECT COUNT(*) FROM table WHERE column
LIKE 'ABC%'. The original query cannot use an index while the rewrite can.
Since it's a 3rd party application there is really no other way to make
such an improvement. The existing rewrite rules in PostgreSQL are too
narrowly defined to permit such a substitution as the incoming query could
involve many tables, so what's needed is a general "if input SQL string
matches X then replace it with Y". This check could be placed at the
beginning of the parser.c code. Suggest that the matching code should first
check the string lengths and hash values before checking entire string
match for efficiency.


Re: New Feature Request

2020-05-26 Thread Bert Scalzo
I greatly appreciate all the replies. Thanks. I also fully understand and
appreciate all the points made - especially that this idea may not have
general value or acceptance as worthwhile. No argument from me. Let me
explain why I am looking to do this to see if that changes any opinions. I
have written a product called QIKR for MySQL that leverages the MySQL query
rewrite feature and places a knowledge expert of SQL rewrite rules as a
preprocessor to the MySQL optimizer. I have defined an extensive set of
rules based on my 30 years of doing code reviews for app developers who
write terrible SQL. Right now QIKR does 100% syntactic analysis (hoping to
do semantic analysis in a later version). For MySQL (which has a less
mature and less robust optimizer) the performance gains are huge - in
excess of 10X. So far QIKR shows about a 2.5X improvement over the
PostgreSQL optimizer when fed bad SQL. I am not saying the
PotsgrSQL optimizer does a poor job, but rather that QIKR was designed for
"garbage in, not garbage out" - so QIKR fixes all the stupid mistakes that
people make which can confuse or even cripple an optimizer. Hence why I am
looking for this hook - and have come to the experts for help. I have two
very large PostgreSQL partner organizations who have asked me to make
QIKR work for PostgreSQL as it does for MySQL. Again, I am willing to pay
for this hook since it's a special request for a special purpose and not
generally worthwhile in many people's opinions - which I cannot argue with.

On Tue, May 26, 2020 at 2:17 AM Konstantin Knizhnik <
k.knizh...@postgrespro.ru> wrote:

>
>
> On 26.05.2020 04:47, Tomas Vondra wrote:
> > On Mon, May 25, 2020 at 09:21:26PM -0400, Bruce Momjian wrote:
> >> On Mon, May 25, 2020 at 07:53:40PM -0500, Bert Scalzo wrote:
> >>> I am reposting this from a few months back (see below). I am not
> >>> trying to be a
> >>> pest, just very motivated. I really think this feature has merit,
> >>> and if not
> >>> generally worthwhile, I'd be willing to pay someone to code it for
> >>> me as I
> >>> don't have strong enough C skills to modify the PostgreSQL code
> >>> myself. So
> >>> anyone who might have such skills that would be interested, please
> >>> contact me:
> >>> bertscal...@gmail.com.
> >>
> >> I think your best bet is to try getting someone to write a hook
> >> that will do the replacement so that you don't need to modify too much
> >> of the Postgres core code.  You will need to have the hook updated for
> >> new versions of Postgres, which adds to the complexity.
> >>
> >
> > I don't think we have a hook to tweak the incoming SQL, though. We only
> > have post_parse_analyze_hook, i.e. post-parse, at which point we can't
> > just rewrite the SQL directly. So I guess we'd need new hook.
>
> VOPS extension performs query substitution (replace query to the
> original table with query to projection) using post_parse_analysis_hook
> and SPI. So I do not understand why  some extra hook is needed.
>
> >
> > I do however wonder if an earlier hook is a good idea at all - matching
> > the SQL directly seems like a rather naive approach that'll break easily
> > due to formatting, upper/lower-case, subqueries, and many other things.
> > From this standpoint it seems actually better to inspect and tweak the
> > parse-analyze result. Not sure how to define the rules easily, though.
> >
>
> In some cases we need to know exact parameter value (as in case
> SUBSTRING(column,1,3) = 'ABC').
> Sometime concrete value of parameter is not important...
> Also it is not clear where such pattern-matching transformation should
> be used only for the whole query or for any its subtrees?
>
> > As for the complexity, I think hooks are fairly low-maintenance in
> > practice, we tend not to modify them very often, and when we do it's
> > usually just adding an argument etc.
>
> I am not sure if the proposed approach can really be useful in many cases.
> Bad queries are used to be generated by various ORM tools.
> But them rarely generate exactly the same query. So defining matching
> rules for the whole query tree will rarely work.
>
> It seems to be more useful to have extensible SQL optimizer, which
> allows to add user defined rules (may as transformation patterns).
> This is how it is done in GCC code optimizer.
> Definitely writing such rules is very non-trivial task.
> Very few developers will be able to add their own meaningful rules.
> But in any case it significantly simplify improvement of optimizer,
> although most of problems with choosing optimal plan are
> caused by wrong statistic and rue-based optimization can not help here.
>
>
>
>


New feature request: Query Rewrite Cache

2020-03-09 Thread Bert Scalzo
MySQL has a really useful feature they call the query rewrite cache. The
optimizer checks incoming queries to see if a known better rewrite has been
placed within the query rewrite cache table. If one is found, the rewrite
replaces the incoming query before sending it to the execution engine. This
capability allows for one to fix poorly performing queries in 3rd party
application code that cannot be modified. For example, suppose a 3rd party
application contains the following inefficient query: SELECT COUNT(*) FROM
table WHERE SUBSTRING(column,1,3) = 'ABC'. One can place the following
rewrite in the query rewrite cache: SELECT COUNT(*) FROM table WHERE column
LIKE 'ABC%'. The original query cannot use an index while the rewrite can.
Since it's a 3rd party application there is really no other way to make
such an improvement. The existing rewrite rules in PostgreSQL are too
narrowly defined to permit such a substitution as the incoming query could
involve many tables, so what's needed is a general "if input SQL string
matches X then replace it with Y". This check could be placed at the
beginning of the parser.c code. Suggest that the matching code should first
check the string lengths and  hash values before checking entire string
match for efficiency.