Re: [HACKERS] improving speed of make check-world

2014-08-31 Thread Fabien COELHO



Hello Peter,

Here is a review:

The version 2 of the patch applies cleanly on current head.

The ability to generate and reuse a temporary installation for different 
tests looks quite useful, so moving the install step out of pg_regress and 
into make seems reasonable.


However, I'm wondering whether there could be some use case of pg_regress 
where doing the install itself would still be useful, say for doing things 
manually on the command line; in that case keeping the feature, even 
if not used by default, could be nice?


About the changes: I think more comments would be welcome, e.g. for 
with_temp_install.


There is not a single documentation change. Should there be some? Hmmm... 
I have not found much documentation about "pg_regress"...


I have tested make check, check-world, as well as make check in contrib 
sub-directories. It seems to work fine in sequential mode.


Running "make -j2 check-world" does not work because "initdb" is not found 
by "pg_regress". but "make -j1 check-world" does work fine. It seems that 
some dependencies might be missing and there is a race condition between 
temporary install and running some checks?? Maybe it is not expected to 
work anyway? See below suggestions to make it work.


However, in this case the error message printed by pg_regress itself 
contains an error:


 pg_regress: initdb failed
 Examine /home/fabien/DEV/GIT/postgresql/contrib/btree_gin/log/initdb.log for 
the reason.
 Command was: "initdb" -D "/home/fabien/DEV/GIT/postgresql/contrib/btree_gin/./tmp_check/data" 
--noclean --nosync > "/home/fabien/DEV/GIT/postgresql/contrib/btree_gin/./tmp_check/log/initdb.log" 
2>&1
 make[2]: *** [check] Error 2

The error message points to a non-existent log file (./tmp_check is 
missing); the temp_instance variable should be used in the error message 
as well as in the commands. Other error messages may have the same issue.


I do not much like the MAKELEVEL hack in a phony target. When running in 
parallel, several makes may have the same level anyway; I am not sure what 
would happen... Maybe it is the origin of the race condition that makes 
initdb not found above. I would suggest relying on dependencies instead: 
maybe something like the following could work to ensure that an 
installation is done once and only once per make invocation, whatever the 
underlying parallelism and levels, while also offering the possibility of 
reusing a previous installation.


  # must be defined somewhere common to all makefiles
  ifndef MAKE_NONCE
  # define a nanosecond timestamp
  export MAKE_NONCE := $(shell date +%s.%N)
  endif

  # actual new tmp installation
  .tmp_install:
  	$(RM) ./.tmp_install.*
  	$(RM) -r ./tmp_install
  	# create tmp installation...
  	touch $@

  # tmp installation for the nonce
  .tmp_install.$(MAKE_NONCE): .tmp_install
  	touch $@

  # generate a new tmp installation by default
  # "make USE_TMP_INSTALL=1 ..." reuses a previous installation if available
  ifdef USE_TMP_INSTALL
  TMP_INSTALL = .tmp_install
  else
  TMP_INSTALL = .tmp_install.$(MAKE_NONCE)
  endif # USE_TMP_INSTALL

  .PHONY: main-temp-install
  main-temp-install: $(TMP_INSTALL)

  .PHONY: extra-temp-install
  extra-temp-install: main-temp-install
  	# do EXTRA_INSTALL


> MSVC build is not yet adjusted for this.  Looking at vcregress.pl, this
> should be fairly straightforward.

No idea about that point.

--
Fabien.




Re: [HACKERS] pg_filedump for 9.4?

2014-08-31 Thread Christoph Berg
Re: Fabrízio de Royes Mello 2014-06-25 

> On Wed, Jun 25, 2014 at 3:52 PM, Tom Lane  wrote:
> > Would like that, but I'm not sure what pgindent will do with the //
> > comments.  It's been on my to-do list to switch all the comments to C89
> > style and then pgindent it, but I don't see myself getting to that in this
> > decade :-(
> >
> 
> I changed all // comments to /* */ and ran pgindent.

I've pushed these patches to the git repository, thanks.

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/




Re: [HACKERS] Tips/advice for implementing integrated RESTful HTTP API

2014-08-31 Thread Peter Eisentraut
On 8/31/14 12:40 AM, Dobes Vandermeer wrote:
> The background workers can apparently only connect to a single database
> at a time, but I want to expose all the databases via the API. 

I think the term "background worker" should be taken as a hint that
"foreground protocol endpoint" was not one of the envisioned use cases.
 I think this sort of thing should be done as a separate process in
front of the postmaster.





Re: [HACKERS] COPY and heap_sync

2014-08-31 Thread Peter Eisentraut
On 8/30/14 2:26 AM, Jeff Janes wrote:
> But there are cases where people use COPY in a loop with a small amount
> of data in each statement.

What would be the reason for doing that?





Re: [HACKERS] pg_receivexlog and replication slots

2014-08-31 Thread Magnus Hagander
On Tue, Aug 26, 2014 at 9:43 AM, Michael Paquier
 wrote:
> On Tue, Aug 19, 2014 at 2:49 PM, Michael Paquier
>  wrote:
>> On Mon, Aug 18, 2014 at 4:01 PM, Michael Paquier
>>  wrote:
>>> On Mon, Aug 18, 2014 at 3:48 PM, Fujii Masao  wrote:
 On Mon, Aug 18, 2014 at 2:38 PM, Michael Paquier
  wrote:
>>> And now looking at your patch there is additional refactoring possible
>>> with IDENTIFY_SYSTEM and pg_basebackup as well...
>> And attached is a rebased patch doing so.
> I reworked this patch a bit to take into account the fact that the
> number of columns to check after running IDENTIFY_SYSTEM is not always
> 4 as pointed out by Andres:
> - pg_basebackup => 3
> - pg_receivexlog => 3
> - pg_recvlogical => 4

As this is a number of patches rolled into one - do you happen to keep
them separate in your local repo? If so, can you send them as separate
patches (refactoring identify_system should be quite unrelated to
supporting replication slots, right?) for easier review? (If not, I'll
just split them apart mentally, but it's easier to review separately.)

On the identify_system part - my understanding of the code is that
what you pass in as num_cols is the number of columns required for it
to work, right? We probably need to adjust the error message as well
in that case, because it's no longer what's "expected", it's what's
"required"? And we might want to include a hint about the reason
(wrong version)?

There's also a note "get LSN start position if necessary", but it
tries to do it unconditionally. What is the "if necessary" supposed to
refer to?

Actually - why do we even care about the 3 vs 4 in RunIdentifySystem,
as it never actually looks at the 4th column anyway? If we do
specifically want it to fail in the case of pg_recvlogical, we really
need to think up a better error message for it, and perhaps a
different way of specifying it?

Do we really want those Asserts? There is not a single Assert in
bin/pg_basebackup today - as is the case for most things in bin/. We
typically use regular if statements for things that "can happen", and
just ignore the others I think - since the callers are fairly simple
to trace.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [HACKERS] pg_filedump for 9.4?

2014-08-31 Thread Fabrízio de Royes Mello
On Sunday, August 31, 2014, Christoph Berg  wrote:

> Re: Fabrízio de Royes Mello 2014-06-25
>  >
> > On Wed, Jun 25, 2014 at 3:52 PM, Tom Lane  wrote:
> > > Would like that, but I'm not sure what pgindent will do with the //
> > > comments.  It's been on my to-do list to switch all the comments to C89
> > > style and then pgindent it, but I don't see myself getting to that in
> this
> > > decade :-(
> > >
> >
> > I changed all // comments to /* */ and ran pgindent.
>
> I've pushed these patches to the git repository, thanks.
>
>
Thanks!

Fabrízio de Royes Mello


-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


Re: [HACKERS] Tips/advice for implementing integrated RESTful HTTP API

2014-08-31 Thread Fabrízio de Royes Mello
On Sunday, August 31, 2014, Peter Eisentraut  wrote:

> On 8/31/14 12:40 AM, Dobes Vandermeer wrote:
> > The background workers can apparently only connect to a single database
> > at a time, but I want to expose all the databases via the API.
>
> I think the term "background worker" should be taken as a hint that
> "foreground protocol endpoint" was not one of the envisioned use cases.
>  I think this sort of thing should be done as a separate process in
> front of the postmaster.
>
>
+1

Do you know the PGRest project?

Look at http://pgre.st

Regards,

Fabrízio


-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347

2014-08-31 Thread Erik Rijkers
On Tue, August 26, 2014 14:24, Andrew Gierth wrote:
>> "Erik" == Erik Rijkers  writes:
>
>  >> They apply cleanly for me at 2bde297 whether with git apply or
>  >> patch, except for the contrib one (which you don't need unless you
>  >> want to run the contrib regression tests without applying the
>  >> gsp-u patch).
>
>  Erik> Ah, I had not realised that.  Excluding that contrib-patch and
>  Erik> only applying these three:
>
>  Erik> gsp1.patch
>  Erik> gsp2.patch
>  Erik> gsp-doc.patch
>
>  Erik> does indeed work (applies, compiles).
>
> I put up a rebased contrib patch anyway (linked off the CF).
>
> Did the "unrecognized node type" error go away, or do we still need to
> look into that?
>

I have found that the "unrecognized node type" error is caused by:

shared_preload_libraries = pg_stat_statements

in postgresql.conf (as my default compile script was doing).

If I disable that line the error goes away.

I don't know exactly what that means for the grouping sets patches, but I 
thought I'd mention it here.

Otherwise I've not run into any problems with GROUPING SETS.


Erik Rijkers






Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347

2014-08-31 Thread Atri Sharma
On Sun, Aug 31, 2014 at 9:07 PM, Erik Rijkers  wrote:

> On Tue, August 26, 2014 14:24, Andrew Gierth wrote:
> >> "Erik" == Erik Rijkers  writes:
> >
> >  >> They apply cleanly for me at 2bde297 whether with git apply or
> >  >> patch, except for the contrib one (which you don't need unless you
> >  >> want to run the contrib regression tests without applying the
> >  >> gsp-u patch).
> >
> >  Erik> Ah, I had not realised that.  Excluding that contrib-patch and
> >  Erik> only applying these three:
> >
> >  Erik> gsp1.patch
> >  Erik> gsp2.patch
> >  Erik> gsp-doc.patch
> >
> >  Erik> does indeed work (applies, compiles).
> >
> > I put up a rebased contrib patch anyway (linked off the CF).
> >
> > Did the "unrecognized node type" error go away, or do we still need to
> > look into that?
> >
>
> I have found that the "unrecognized node type" error is caused by:
>
> shared_preload_libraries = pg_stat_statements
>
> in postgresql.conf (as my default compile script was doing).
>
> If I disable that line the error goes away.
>
>
I think that's more of a library linking problem than a problem with the
patch. I couldn't reproduce it, though.

Regards,

Atri

-- 
Regards,

Atri
*l'apprenant*


Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347

2014-08-31 Thread Andres Freund
On 2014-08-31 21:09:59 +0530, Atri Sharma wrote:
> On Sun, Aug 31, 2014 at 9:07 PM, Erik Rijkers  wrote:
> > I have found that the "unrecognized node type" error is caused by:

It's a warning, not an error, right?

> > shared_preload_libraries = pg_stat_statements
> >
> > in postgresql.conf (as my default compile script was doing).
> >
> > If I disable that line the error goes away.
> >
> >
> I think that's more of a library linking problem than a problem with
> the patch. I couldn't reproduce it, though.

I think it's vastly more likely that the patch simply didn't add the new
expression types to pg_stat_statements.c:JumbleExpr().
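
For context, a minimal sketch (my abbreviation, not the actual contrib
code) of the shape of JumbleExpr() in
contrib/pg_stat_statements/pg_stat_statements.c; any expression node the
switch does not recognize falls through to the default case, which is
where the message Erik sees comes from:

static void
JumbleExpr(pgssJumbleState *jstate, Node *node)
{
	if (node == NULL)
		return;

	switch (nodeTag(node))
	{
		/* ... one case per recognized expression node type; a patch
		 * introducing new expression nodes must add cases here ... */

		default:
			/* a WARNING, not an ERROR, so the query still runs */
			elog(WARNING, "unrecognized node type: %d",
				 (int) nodeTag(node));
			break;
	}
}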

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] improving speed of make check-world

2014-08-31 Thread Fabien COELHO



 # actual new tmp installation
 .tmp_install:
 	$(RM) ./.tmp_install.*
 	$(RM) -r ./tmp_install
 	# create tmp installation...
 	touch $@

 # tmp installation for the nonce
 .tmp_install.$(MAKE_NONCE): .tmp_install
 	touch $@


Oops, I got it wrong: the install would not be re-executed the second time.

Maybe something more like:

  ifdef USE_ONE_INSTALL
  TMP_INSTALL = .tmp_install.once
  else
  TMP_INSTALL = .tmp_install.$(MAKE_NONCE)
  endif

  $(TMP_INSTALL):
  	$(RM) -r ./tmp_install .tmp_install.*
  	# do installation...
  	touch $@

So that the file target is different each time it is run. Hopefully.

--
Fabien.




Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347

2014-08-31 Thread Atri Sharma
On Sunday, August 31, 2014, Andres Freund  wrote:

> On 2014-08-31 21:09:59 +0530, Atri Sharma wrote:
> > On Sun, Aug 31, 2014 at 9:07 PM, Erik Rijkers  wrote:
> > > I have found that the "unrecognized node type" error is caused by:
>
> It's a warning, not an error, right?
>
> > > shared_preload_libraries = pg_stat_statements
> > >
> > > in postgresql.conf (as my default compile script was doing).
> > >
> > > If I disable that line the error goes away.
> > >
> > >
> > I think that's more of a library linking problem than a problem with
> > the patch. I couldn't reproduce it, though.
>
> I think it's vastly more likely that the patch simply didn't add the new
> expression types to pg_stat_statements.c:JumbleExpr().
>
>
>
I must have run the above diagnosis in the wrong manner then; I will
check. Thanks for the heads up!

Regards,

Atri


-- 
Regards,

Atri
*l'apprenant*


Re: [HACKERS] Re: [BUGS] Re: BUG #9555: pg_dump for tables with inheritance recreates the table with the wrong order of columns

2014-08-31 Thread Tom Lane
Bruce Momjian  writes:
> I have developed the attached patch to warn about column reordering in
> this odd case.  The patch mentions the reordering of c:

>   NOTICE:  merging column "a" with inherited definition
>   NOTICE:  merging column "c" with inherited definition;  column moved 
> earlier to match inherited column location

This does not comport with our error message style guidelines.
You could put the additional information as an errdetail sentence,
perhaps.

regards, tom lane




Re: [HACKERS] Built-in binning functions

2014-08-31 Thread Simon Riggs
On 30 August 2014 18:24, Tom Lane  wrote:

>> 3. I am thinking about name - I don't think so varwidth_bucket is correct
>> -- in relation to name "width_bucket" ... what about "range_bucket" or
>> "scope_bucket" ?
>
> Neither of those seem like improvements from here.  I agree with the
> objections to bin() as well.  bucket() might be all right but it still
> seems a bit too generic.  width_bucket(), which at least shows there's
> a relationship to the standard functions, still seems like the best
> of the suggestions so far.

I'd been mulling that over and agree with the objections to bin().

Suggest discretize() as a much more informative name. The other names
will be overlooked by anybody that doesn't already know what to look
for.

Thanks for the polymorphic patch, that sounds good. Sorry, not
reviewed more deeply (still travelling).




[HACKERS] Re: [BUGS] Re: BUG #9555: pg_dump for tables with inheritance recreates the table with the wrong order of columns

2014-08-31 Thread David G Johnston
Tom Lane-2 wrote
> Bruce Momjian  writes:
>> I have developed the attached patch to warn about column reordering in
>> this odd case.  The patch mentions the reordering of c:
> 
>>  NOTICE:  merging column "a" with inherited definition
>>  NOTICE:  merging column "c" with inherited definition;  column moved
>> earlier to match inherited column location
> 
> This does not comport with our error message style guidelines.
> You could put the additional information as an errdetail sentence,
> perhaps.

Would it be proper to issue an additional top-level warning with the column
moved notification?  Thus there would be NOTICE, NOTICE, WARNING in the
above example?  Or, more generically, "columns reordered to match inherited
column order" to avoid multiple warnings if more than one column is moved.

David J.







Re: [HACKERS] Selectivity estimation for inet operators

2014-08-31 Thread Emre Hasegeli
> * Isn't "X >> Y" equivalent to "network_scan_first(X) < Y AND 
> network_scan_last(X) > Y"? Or at least close enough for selectivity 
> estimation purposes? Pardon my ignorance - I'm not too familiar with the 
> inet datatype - but how about just calling scalarineqsel for both bounds?

Actually, "X >> Y" is equivalent to

network_scan_first(X) <= network_host(Y) AND
network_scan_last(X) >= network_host(Y) AND
network_masklen(X) < network_masklen(X)

but we do not have statistics for neither network_scan_last(X)
nor network_masklen(X).  I tried to find a solution based on
the implementation of the operators.

> * inet_mcv_join_selec() is O(n^2) where n is the number of entries in the 
> MCV lists. With the max statistics target of 10000, a worst case query on 
> my laptop took about 15 seconds to plan. Maybe that's acceptable, but you 
> went through some trouble to make planning of MCV vs histogram faster, by 
> the log2 method to compare only some values, so I wonder why you didn't do 
> the same for the MCV vs MCV case?

It was like that in previous versions.  I was reducing both sides of
the lists, but it caused worse estimation.  It works slightly better
when only the left-hand side of the list is reduced.  The attached
version works like that.

> * A few typos: lenght -> length.

Fixed.

Thank you for looking at it.
diff --git a/src/backend/utils/adt/network_selfuncs.c 
b/src/backend/utils/adt/network_selfuncs.c
index d0d806f..a00706c 100644
--- a/src/backend/utils/adt/network_selfuncs.c
+++ b/src/backend/utils/adt/network_selfuncs.c
@@ -1,32 +1,671 @@
 /*-
  *
  * network_selfuncs.c
  *   Functions for selectivity estimation of inet/cidr operators
  *
- * Currently these are just stubs, but we hope to do better soon.
+ * Estimates are based on null fraction, distinct value count, most common
+ * values, and histogram of inet/cidr datatypes.
  *
  * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  *
  * IDENTIFICATION
  *   src/backend/utils/adt/network_selfuncs.c
  *
  *-
  */
 #include "postgres.h"
 
+#include 
+
+#include "access/htup_details.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_operator.h"
+#include "catalog/pg_statistic.h"
+#include "utils/lsyscache.h"
 #include "utils/inet.h"
+#include "utils/selfuncs.h"
 
 
+/* Default selectivity constant for the inet overlap operator */
+#define DEFAULT_OVERLAP_SEL 0.01
+
+/* Default selectivity constant for the other operators */
+#define DEFAULT_INCLUSION_SEL 0.005
+
+/* Default selectivity for given operator */
+#define DEFAULT_SEL(operator) \
+   ((operator) == OID_INET_OVERLAP_OP ? \
+   DEFAULT_OVERLAP_SEL : DEFAULT_INCLUSION_SEL)
+
+static Selectivity networkjoinsel_inner(Oid operator,
+VariableStatData *vardata1, 
VariableStatData *vardata2);
+extern double eqjoinsel_semi(Oid operator, VariableStatData *vardata1,
+  VariableStatData *vardata2, RelOptInfo *inner_rel);
+extern RelOptInfo *find_join_input_rel(PlannerInfo *root, Relids relids);
+static short int inet_opr_order(Oid operator);
+static Selectivity inet_his_inclusion_selec(Datum *values, int nvalues,
+Datum *constvalue, short int 
opr_order);
+static Selectivity inet_mcv_join_selec(Datum *values1, float4 *numbers1,
+   int nvalues1, Datum *values2, float4 
*numbers2,
+   int nvalues2, int red_nvalues, Oid 
operator);
+static Selectivity inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers,
+  int mcv_nvalues, Datum *his_values, int 
his_nvalues,
+  int red_nvalues, short int opr_order,
+  Selectivity *max_selec_pointer);
+static Selectivity inet_his_inclusion_join_selec(Datum *his1_values,
+ int his1_nvalues, Datum *his2_values, 
int his2_nvalues,
+ int red_nvalues, 
short int opr_order);
+static short int inet_inclusion_cmp(inet *left, inet *right,
+  short int opr_order);
+static short int inet_masklen_inclusion_cmp(inet *left, inet *right,
+  short int opr_order);
+static short int inet_his_match_divider(inet *boundary, inet *query,
+  short int opr_order);
+
+/*
+ * Selectivity estimation for the subnet inclusion operators
+ */
 Datum
 networksel(PG_FUNCTION_ARGS)
 {
-   PG_RETURN_FLOAT8(0.001);
+   PlannerInfo *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+

Re: [HACKERS] Selectivity estimation for inet operators

2014-08-31 Thread Emre Hasegeli
> Heikki Linnakangas  writes:
> > * inet_mcv_join_selec() is O(n^2) where n is the number of entries in 
> > the MCV lists. With the max statistics target of 10000, a worst case 
> > query on my laptop took about 15 seconds to plan. Maybe that's 
> > acceptable, but you went through some trouble to make planning of MCV vs 
> > histogram faster, by the log2 method to compare only some values, so I 
> > wonder why you didn't do the same for the MCV vs MCV case?
> 
> Actually, what I think needs to be asked is the opposite question: why is
> the other code ignoring some of the statistical data?  If the user asked
> us to collect a lot of stats detail it seems reasonable that he's
> expecting us to use it to get more accurate estimates.  It's for sure
> not obvious why these estimators should take shortcuts that are not being
> taken in the much-longer-established code for scalar comparison estimates.

It will still use more statistical data when statistics_target is
higher.  I was not sure that the user would want to spend O(n^2)
planning time based on statistics_target.  The attached version is
without this optimization.  Estimates are better without it, but
planning takes more time.

> I'm not exactly convinced that the math adds up in this logic, either.
> The way in which it combines results from looking at the MCV lists and
> at the histograms seems pretty arbitrary.

I thought the product of the join would be

(left_mcv + left_histogram) * (right_mcv + right_histogram) * 
selectivity

and tried to calculate it as in the following:

(left_mcv * right_mcv * selectivity) +
(right_mcv * left_histogram * selectivity) +
(left_mcv * right_histogram * selectivity) +
(left_histogram * right_histogram * selectivity)

where left_histogram is

1.0 - left_nullfrac - left_mcv

I fixed the calculation for the MCV vs histogram part.  The estimates of
inner join are very close to the actual rows with statistics_target = 1000.
I think the calculation should be right.
diff --git a/src/backend/utils/adt/network_selfuncs.c 
b/src/backend/utils/adt/network_selfuncs.c
index d0d806f..4367f0e 100644
--- a/src/backend/utils/adt/network_selfuncs.c
+++ b/src/backend/utils/adt/network_selfuncs.c
@@ -1,32 +1,655 @@
 /*-
  *
  * network_selfuncs.c
  *   Functions for selectivity estimation of inet/cidr operators
  *
- * Currently these are just stubs, but we hope to do better soon.
+ * Estimates are based on null fraction, distinct value count, most common
+ * values, and histogram of inet/cidr datatypes.
  *
  * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  *
  * IDENTIFICATION
  *   src/backend/utils/adt/network_selfuncs.c
  *
  *-
  */
 #include "postgres.h"
 
+#include 
+
+#include "access/htup_details.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_operator.h"
+#include "catalog/pg_statistic.h"
+#include "utils/lsyscache.h"
 #include "utils/inet.h"
+#include "utils/selfuncs.h"
 
 
+/* Default selectivity constant for the inet overlap operator */
+#define DEFAULT_OVERLAP_SEL 0.01
+
+/* Default selectivity constant for the other operators */
+#define DEFAULT_INCLUSION_SEL 0.005
+
+/* Default selectivity for given operator */
+#define DEFAULT_SEL(operator) \
+   ((operator) == OID_INET_OVERLAP_OP ? \
+   DEFAULT_OVERLAP_SEL : DEFAULT_INCLUSION_SEL)
+
+static Selectivity networkjoinsel_inner(Oid operator,
+VariableStatData *vardata1, 
VariableStatData *vardata2);
+extern double eqjoinsel_semi(Oid operator, VariableStatData *vardata1,
+  VariableStatData *vardata2, RelOptInfo *inner_rel);
+extern RelOptInfo *find_join_input_rel(PlannerInfo *root, Relids relids);
+static short int inet_opr_order(Oid operator);
+static Selectivity inet_his_inclusion_selec(Datum *values, int nvalues,
+Datum *constvalue, short int 
opr_order);
+static Selectivity inet_mcv_join_selec(Datum *values1, float4 *numbers1,
+   int nvalues1, Datum *values2, float4 
*numbers2,
+   int nvalues2, Oid operator, Selectivity 
*max_selec_p);
+static Selectivity inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers,
+  int mcv_nvalues, Datum *his_values, int 
his_nvalues,
+  short int opr_order, Selectivity 
*max_selec_p);
+static Selectivity inet_his_inclusion_join_selec(Datum *his1_values,
+ int his1_nvalues, Datum *his2_values, 
int his2_nvalues,
+ short int opr_order);
+static short int inet_inclusion_cmp(inet *left, in

Re: [HACKERS] Built-in binning functions

2014-08-31 Thread Tom Lane
Simon Riggs  writes:
> Suggest discretize() as a much more informative name. The other names
> will be overlooked by anybody that doesn't already know what to look
> for.

I did not like that idea to begin with, but it's growing more attractive.
In particular, I think it would be unique enough that we could safely
mark the polymorphic function as variadic (a move that would cause
ambiguity issues for pretty nearly any user-defined function of the
same name).

OTOH, I'm not as convinced as you that this name would mean anything
to "anybody that doesn't already know what to look for".  It still
seems like rather arcane terminology.

regards, tom lane




Re: [HACKERS] Selectivity estimation for inet operators

2014-08-31 Thread Emre Hasegeli
> What you did there is utterly unacceptable from a modularity standpoint;
> and considering that the values will be nowhere near right, the argument
> that "it's better than returning a constant" seems pretty weak.  I think
> you should just take that out again.

I will try to come up with a better, data type specific implementation
in a week.




Re: [HACKERS] Re: [BUGS] Re: BUG #9555: pg_dump for tables with inheritance recreates the table with the wrong order of columns

2014-08-31 Thread Tom Lane
David G Johnston  writes:
> Would it be proper to issue an additional top-level warning with the column
> moved notification?  Thus there would be NOTICE, NOTICE, WARNING in the
> above example?  Or, more generically, "columns reordered to match inherited
> column order" to avoid multiple warnings if more than one column is moved.

That's a good point: if this message fires at all, it will probably fire
more than once; do we want that?  If we do it as you suggest here, we'll
lose the information as to exactly which columns got relocated, which
perhaps is bad, or maybe it doesn't matter.  Also, I don't remember the
exact code structure in that area, but it might be a bit painful to
arrange that we get only one such warning even when inheriting from
multiple parents.

If we do want the specific moved columns to be identified, I'd still go
with errdetail on the NOTICE rather than two separate messages.  I think
calling it a WARNING is a bit extreme anyway.

regards, tom lane




Re: [HACKERS] pg_filedump for 9.4?

2014-08-31 Thread Stepan Rutz
Hi community,
I am currently investigating why a certain table with highly redundant 
and utterly verbose XML gets worse storage-wise when the XML is made more 
compact. Since I am quite new to this, I believe it is the LZ compression 
of text values. But that is irrelevant now; I am just mentioning it because 
tools like pg_filedump allow people like me to help themselves and gain a 
basic understanding of things.

While checking, I noticed that pg_filedump (current from git.postgresql.org, 
including the below-mentioned commit) does not compile on Mac OS X. AFAIK it 
will not compile as soon as port.h comes into play and USE_REPL_SNPRINTF is 
defined: printf and sprintf in the source of pg_filedump then become 
pg_printf and so on. These replacements are part of postgres and can't be 
linked into the standalone pg_filedump. At least that is certainly not the 
intention.

Putting

#undef sprintf
#undef printf

after the includes in pg_filedump fixes the Mac compile and, IMHO, all 
builds where USE_REPL_SNPRINTF gets defined as a side effect of including 
postgres.h, effectively taking printf away.
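
The remapping described above lives in src/include/port.h; an abbreviated
sketch (conditions simplified, from memory):

#ifdef USE_REPL_SNPRINTF
#define snprintf	pg_snprintf
#define sprintf		pg_sprintf
#define fprintf		pg_fprintf
#define printf		pg_printf
#endif

So a standalone program that includes postgres.h has to either link with
libpgport or #undef the macros, as above.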

Not sure how to deal with this issue correctly, so this is just for your 
consideration, since the issue is a bit broader IMHO.

Regards,
Stepan

  
On 31.08.2014, at 17:25, Fabrízio de Royes Mello wrote:

> 
> 
> On Sunday, August 31, 2014, Christoph Berg  wrote:
> Re: Fabrízio de Royes Mello 2014-06-25 
> 
> > On Wed, Jun 25, 2014 at 3:52 PM, Tom Lane  wrote:
> > > Would like that, but I'm not sure what pgindent will do with the //
> > > comments.  It's been on my to-do list to switch all the comments to C89
> > > style and then pgindent it, but I don't see myself getting to that in this
> > > decade :-(
> > >
> >
> > I changed all // comments to /* */ and ran pgindent.
> 
> I've pushed these patches to the git repository, thanks.
> 
> 
> Thanks!
> 
> Fabrízio de Royes Mello
> 
> 
> -- 
> Fabrízio de Royes Mello
> Consultoria/Coaching PostgreSQL
> >> Timbira: http://www.timbira.com.br
> >> Blog: http://fabriziomello.github.io
> >> Linkedin: http://br.linkedin.com/in/fabriziomello
> >> Twitter: http://twitter.com/fabriziomello
> >> Github: http://github.com/fabriziomello
> 





Re: [HACKERS] Built-in binning functions

2014-08-31 Thread Gavin Flower

On 01/09/14 06:00, Tom Lane wrote:

Simon Riggs  writes:

Suggest discretize() as a much more informative name. The other names
will be overlooked by anybody that doesn't already know what to look
for.

I did not like that idea to begin with, but it's growing more attractive.
In particular, I think it would be unique enough that we could safely
mark the polymorphic function as variadic (a move that would cause
ambiguity issues for pretty nearly any user-defined function of the
same name).

OTOH, I'm not as convinced as you that this name would mean anything
to "anybody that doesn't already know what to look for".  It still
seems like rather arcane terminology.

regards, tom lane


If I was new to PostgreSQL, I'd probably read this as disc*(), a 
function to do something with discs.


I have done finite maths and statistics at university, plus I have been 
vaguely following this thread - but, the meaning of discretize() is not 
immediately apparent to me (if I reread a previous email or 2 in this 
thread, then I'm sure it would then be obvious!).  I might guess that it 
is to make something discrete, but why would I want to do that?  So if I 
came across this function in a year or 2, I'd probably go right past it.


I have been programming for over 40 years, and I still find choosing 
appropriate names sometimes very difficult, so I know it is not easy to 
come up with meaningful names - especially ones that mean something 
relevant to other people!



Cheers,
Gavin




Re: [HACKERS] Built-in binning functions

2014-08-31 Thread Petr Jelinek

On 30/08/14 19:24, Tom Lane wrote:

Pavel Stehule  writes:

1. I am thinking that the reduction to only numeric types is not necessary -
although we can live without it - but there are a lot of non-numeric
categories: chars, dates, ...


I wasn't terribly happy about that either.  I still think we should
reduce this to a single polymorphic function, as in the attached.



I did try to write a generic function very similar to what you wrote, but 
discarded it for performance reasons. If we really want to support any data 
type I am all for having a generic function, but I would still like to have 
one optimized for float8, because that seems to be the most common use case 
(at least it is the reason why I even wrote this).


Here are some numbers:
First float8:
CREATE TABLE wbtest AS SELECT (random() * 100)::float8 a FROM 
generate_series(1,100) ORDER BY 1;


SELECT
width_bucket(a, ARRAY[10, 20, 30, 40, 50, 60, 70, 80, 90]::float8[]),
width_bucket(a, ARRAY[10, 20, 30, 40, 50, 60, 70, 80, 91]::float8[]),
width_bucket(a, ARRAY[10, 20, 30, 40, 50, 60, 70, 80, 92]::float8[])
FROM wbtest;

Optimized float8: ~250ms
Tom's generic: ~400ms


Numeric:
CREATE TABLE wbtestn AS SELECT (random() * 100)::numeric a FROM 
generate_series(1,100) ORDER BY 1;


SELECT
width_bucket(a, ARRAY[10, 20, 30, 40, 50, 60, 70, 80, 90]::numeric[]),
width_bucket(a, ARRAY[10, 20, 30, 40, 50, 60, 70, 80, 91]::numeric[]),
width_bucket(a, ARRAY[10, 20, 30, 40, 50, 60, 70, 80, 92]::numeric[])
FROM wbtestn;

Optimized numeric: ~600ms
My generic: ~780ms
Tom's generic: ~900ms

The difference between my generic and Tom's generic is because Tom's is 
slowed down by the deconstruct_array.




2. Still, I am strongly afraid about the search method used - it is not
possible to check the validity of the input. Did you check how much slower
linear searching is - you said you have real data and a real use case?


Well, if we check the input then we'll be doing O(N) comparisons instead
of O(log N).  That could be a serious cost penalty for large arrays and
expensive comparison functions (such as strcoll()).  I think it's probably
sufficient to have a clear warning in the docs.



With a re-sort, the performance is worse than a bunch of CASE WHENs :(



Also, a documentation quibble: in Petr's patch the documentation and
comments refer to the thresholds as "right bounds", which I didn't care
for and replaced with "upper bounds".  However, looking at it again,
I wonder if we would not be better off describing them as "lower bounds".
They are exclusive bounds if seen as upper bounds, and inclusive if seen
as lower bounds, and on the whole I think the latter viewpoint might be
less confusing.  Thoughts?


"Upper bounds" is probably a better name than "right bounds", I agree with 
that. In any case it's the upper bound for a bucket (that's why there is one 
more bucket, into which everything bigger than the max threshold goes).



--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] On partitioning

2014-08-31 Thread Tom Lane
Another thought about this general topic:

Alvaro Herrera  writes:
> ...
> Allowed actions on a RELKIND_PARTITION:
> * CREATE INDEX .. ON PARTITION  ON TABLE 
> ...
> Still To Be Designed
> 
> * Are indexes/constraints inherited from the parent rel?

I think one of the key design decisions we have to make is whether
partitions are all constrained to have exactly the same set of indexes.
If we don't insist on that it will greatly complicate planning compared
to what we'll get if we do insist on it, because then the planner will
need to generate a separate customized plan subtree for each partition.
Aside from costing planning time, most likely that would forever prevent
us from pushing some types of intelligence about partitioning into the
executor.

Now, in the current model, it's up to the user what indexes to create
on each partition, and sometimes one might feel that maintaining a
particular index is unnecessary in some partitions.  But the flip side
of that is it's awfully easy to screw yourself by forgetting to add
some index when you add a new partition.  So I'm not real sure which
approach is superior from a purely user-oriented perspective.

I'm not trying to push one or the other answer right now, just noting
that this is a critical decision.

regards, tom lane




Re: [HACKERS] Built-in binning functions

2014-08-31 Thread Tom Lane
Petr Jelinek  writes:
> On 30/08/14 19:24, Tom Lane wrote:
>> I wasn't terribly happy about that either.  I still think we should
>> reduce this to a single polymorphic function, as in the attached.

> I did try to write generic function very similar to what you wrote but 
> discarded it because of the performance reasons. If we really want to 
> support any data type I am all for having generic function but I still 
> would like to have one optimized for float8 because that seems to be the 
> most used use-case (at least that one is the reason why I even wrote 
> this) for performance reasons.

Well, part of the reason why your v3 float8 function looks so fast is that
it's cheating: it will not produce the right answers for comparisons
involving NaNs.  I'm not sure how good it would look once you'd added
some isnan() tests to make the behavior actually equivalent to
float8_cmp_internal().

However, assuming it still seems worthwhile once that's accounted for,
I don't have a strong objection to having an additional code path for
float8.  There are two ways we could do that:

1. Add a runtime test in the polymorphic function, eg

if (element_type == FLOAT8OID)
result = width_bucket_float8(...);
else if (typentry->typlen > 0)
result = width_bucket_fixed(...);
else
result = width_bucket_variable(...);

2. Define a SQL function width_bucket(float8, float8[]) alongside
the polymorphic one.

While #2 might be marginally faster at runtime, the main difference
between these is what the parser would choose to do with ambiguous cases.

I experimented a bit and it seemed that the parser would prefer the
float8 function in any situation where the inputs were of numeric types,
probably because float8 is the preferred type in the numeric category.
I'm not real sure that would be a good thing: in particular it would
cast integer and numeric inputs to float8, which risks precision loss,
and there doesn't seem to be any way to get the parser to make the
other choice if the inputs are of numeric-category types.

As against that objection, it would make cross-type numeric cases easier
to use, for example "width_bucket(1, array[2.4])" would be accepted.
If we just offer the anyelement/anyarray version then the parser would
make you cast "1" to numeric before it would consider the function to
match.

On balance the runtime-test approach looks like a better idea, in that
it doesn't risk any unexpected semantic behaviors.

> The difference between my generic and Tom's generic is because Tom's is 
> slowed down by the deconstruct_array.

Meh.  It looked to me like your version would have O(N^2) performance
problems from computing array offsets repeatedly, depending on exactly
which array element it ended up on.  It would avoid repeat calculations
only if it always moved right.

regards, tom lane




Re: [HACKERS] On partitioning

2014-08-31 Thread Hannu Krosing
On 08/31/2014 10:03 PM, Tom Lane wrote:
> Another thought about this general topic:
>
> Alvaro Herrera  writes:
>> ...
>> Allowed actions on a RELKIND_PARTITION:
>> * CREATE INDEX .. ON PARTITION  ON TABLE 
>> ...
>> Still To Be Designed
>> 
>> * Are indexes/constraints inherited from the parent rel?
> I think one of the key design decisions we have to make is whether
> partitions are all constrained to have exactly the same set of indexes.
> If we don't insist on that it will greatly complicate planning compared
> to what we'll get if we do insist on it, because then the planner will
> need to generate a separate customized plan subtree for each partition.
> Aside from costing planning time, most likely that would forever prevent
> us from pushing some types of intelligence about partitioning into the
> executor.
>
> Now, in the current model, it's up to the user what indexes to create
> on each partition, and sometimes one might feel that maintaining a
> particular index is unnecessary in some partitions.  But the flip side
> of that is it's awfully easy to screw yourself by forgetting to add
> some index when you add a new partition.  
The "forgetting" part is easy to solve by inheriting all indexes from
parent (or template) partition unless explicitly told not to.

One other thing that has been bothering me about this proposal
is the ability to take partitions offline for maintenance, or to load
them offline and then switch them in.

In current scheme we do this using ALTER TABLE ... [NO] INHERIT ...

If we also want to have this with the not-directly-accessible partitions,
then perhaps it could be done by having the possibility of moving
a partition between two tables with exactly the same structure?

> So I'm not real sure which
> approach is superior from a purely user-oriented perspective.
What we currently have is a very flexible scheme which has a few
drawbacks

1) unnecessarily complex for simple case
2) easy to shoot yourself in the foot by forgetting something
3) can be hard on planner, especially with huge number of partitions

An alternative way of solving these problems is adding some
(meta-)constraints to the current way of doing things, plus some more
automation:

CREATE TABLE FOR PARTITIONMASTER
WITH (ALL_INDEXES_SAME=ON,
  SAME_STRUCTURE_ALWAYS=ON,
  SINGLE_INHERITANCE_ONLY=ON,
  NESTED_INHERITS=OFF,
  PARTITION_FUNCTION=default_range_partitioning(int)
);

and then enforce these when adding inherited tables (in this case
partition tables), whether via CREATE TABLE or ALTER TABLE.

Best Regards

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





Re: [HACKERS] On partitioning

2014-08-31 Thread Martijn van Oosterhout
On Fri, Aug 29, 2014 at 12:35:50PM -0400, Tom Lane wrote:
> > Each partition is assigned an Expression that receives a tuple and
> > returns boolean.  This expression returns true if a given tuple belongs
> > into it, false otherwise.
> 
> -1, in fact minus a lot.  One of the core problems of the current approach
> is that the system, particularly the planner, hasn't got a lot of insight
> into exactly what the partitioning scheme is in a partitioned table built
> on inheritance.  If you allow the partitioning rule to be a black box then
> that doesn't get any better.  I want to see a design wherein the system
> understands *exactly* what the partitioning behavior is.  I'd start with
> supporting range-based partitioning explicitly, and maybe we could add
> other behaviors such as hashing later.
> 
> In particular, there should never be any question at all that there is
> exactly one partition that a given row belongs to, not more, not less.
> You can't achieve that with a set of independent filter expressions;
> a meta-rule that says "exactly one of them should return true" is an
> untrustworthy band-aid.
> 
> (This does not preclude us from mapping the tuple through the partitioning
> rule and finding that the corresponding partition doesn't currently exist.
> I think we could view the partitioning rule as a function from tuples to
> partition numbers, and then we look in pg_class to see if such a partition
> exists.)

There is one situation where you need to be more flexible, and that is
if you ever want to support online repartitioning. To do that you have
to distinguish between "I want to insert tuple X, which partition
should it go into" and "I want to know which partitions I need to look
for partition_key=Y".

For the latter you really need an expression per partition, or
something equivalent.  If performance is an issue I suppose you could
live with having an "old" and a "new" partition scheme, so you
couldn't have two "live repartitionings" happening simultaneously.

Now, if you want to close the door on online repartitioning forever
then that's fine. But being in the position of having to say "yes our
partitioning scheme sucks, but we would have to take the database down
for a week to fix it" is no fun.

Unless logical replication provides a way out maybe??

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




Re: [HACKERS] Built-in binning functions

2014-08-31 Thread Simon Riggs
On 31 August 2014 20:44, Gavin Flower  wrote:
> On 01/09/14 06:00, Tom Lane wrote:
>>
>> Simon Riggs  writes:
>>>
>>> Suggest discretize() as a much more informative name. The other names
>>> will be overlooked by anybody that doesn't already know what to look
>>> for.
>>
>> I did not like that idea to begin with, but it's growing more attractive.
>> In particular, I think it would be unique enough that we could safely
>> mark the polymorphic function as variadic (a move that would cause
>> ambiguity issues for pretty nearly any user-defined function of the
>> same name).
>>
>> OTOH, I'm not as convinced as you that this name would mean anything
>> to "anybody that doesn't already know what to look for".  It still
>> seems like rather arcane terminology.
>>
>>
> If I was new to PostgreSQL, I'd probably read this as disc*(), a function to
> do something with discs.
>
> I have done finite maths and statistics at university, plus I have been
> vaguely following this thread - but, the meaning of discretize() is not
> immediately apparent to me (if I reread a previous email or 2 in this
> thread, then I'm sure it would then be obvious!).  I might guess that it is
> to make something discrete, but why would I want to do that?  So if I came
> across this function in a year or 2, I'd probably go right past it.
>
> I have been programming for over 40 years, and I still find choosing
> appropriate names sometimes very difficult, so I know it is not easy to come
> up with meaningful names - especially ones that mean something relevant to
> other people!

It's important that we *don't* come up with a new name.

This function does something useful and the words people already use
to describe that are binning and discretization. By this I mean names
already used by very widely used software. Search them and see, or
suggest more widely used alternatives, please.




Re: [HACKERS] Built-in binning functions

2014-08-31 Thread Petr Jelinek

On 31/08/14 22:33, Tom Lane wrote:

Petr Jelinek  writes:

On 30/08/14 19:24, Tom Lane wrote:

I wasn't terribly happy about that either.  I still think we should
reduce this to a single polymorphic function, as in the attached.



I did try to write generic function very similar to what you wrote but
discarded it because of the performance reasons. If we really want to
support any data type I am all for having generic function but I still
would like to have one optimized for float8 because that seems to be the
most used use-case (at least that one is the reason why I even wrote
this) for performance reasons.


Well, part of the reason why your v3 float8 function looks so fast is that
it's cheating: it will not produce the right answers for comparisons
involving NaNs.  I'm not sure how good it would look once you'd added
some isnan() tests to make the behavior actually equivalent to
float8_cmp_internal().



True, it increases the time of the test to around 285ms; still almost a 
30% speed difference.



However, assuming it still seems worthwhile once that's accounted for,
I don't have a strong objection to having an additional code path for
float8.  There are two ways we could do that:

1. Add a runtime test in the polymorphic function, eg

if (element_type == FLOAT8OID)
result = width_bucket_float8(...);
else if (typentry->typlen > 0)
result = width_bucket_fixed(...);
else
result = width_bucket_variable(...);



Yeah, I think I prefer this one too; I will see how much performance it eats.




The difference between my generic and Tom's generic is because Tom's is
slowed down by the deconstruct_array.


Meh.  It looked to me like your version would have O(N^2) performance
problems from computing array offsets repeatedly, depending on exactly
which array element it ended up on.  It would avoid repeat calculations
only if it always moved right.



I actually think that the worst case (when you always go left) for my 
version is O(N), since you only need to seek over half of the previous 
interval (it's doing binary search after all), while deconstruct_array is 
O(N) by itself. It would be very different if we could cache the 
deconstructed array somehow (i.e. if this were an aggregate); then it would 
obviously make a lot of sense to use deconstruct_array, and it would 
probably even make sense to re-sort, but sadly we can't cache, AFAIK.


Also, I made more tests with various array sizes (3-1) and 
distributions, and my version was never slower.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Built-in binning functions

2014-08-31 Thread David G Johnston
Simon Riggs wrote
> width_bucket() seems to refer to an equal-width binning process. The
> function being discussed here is a generic mechanism, the boundaries
> of which could have been decided using equal-frequency or other
> mechanisms. Using the word "width" in those contexts could be
> confusing.
> 
> Given width_bucket() is already in use for SQL Standard function, I
> agree it would be confusing to have same name for different parameter
> profiles.

literal_bucket(float8, float8[])

Since "bucket" is the 'verb' here (in this specific case meaning "lookup the
supplied value in the supplied bucket definition") and "width" is a modifier
(the bucket specification describes an equal-width structure) I suggest
"literal_bucket(val, array[])" such that the bucket is still the verb but
now the modifier describes a structure that is literally provided.

David J.









Re: [HACKERS] Built-in binning functions

2014-08-31 Thread Tom Lane
Petr Jelinek  writes:
> On 31/08/14 22:33, Tom Lane wrote:
>> Petr Jelinek  writes:
>>> The difference between my generic and Tom's generic is because Tom's is
>>> slowed down by the deconstruct_array.

>> Meh.  It looked to me like your version would have O(N^2) performance
>> problems from computing array offsets repeatedly, depending on exactly
>> which array element it ended up on.  It would avoid repeat calculations
>> only if it always moved right.

> I actually think that worst case (when you go always left) for my 
> version is O(N) since you only need to seek for the half of previous 
> interval (it's doing binary search after all) and you do O(N) in the 
> deconstruct_array.

[ thinks about that for awhile... ]  Hm, I think you are right.

If there are N elements in the array then we will scan over N/2 of them
to locate the midpoint element for the first comparison.  After that,
we will go either left or right, and in either case we will need to scan
over N/4 elements to find the second-level midpoint.  The third-level
midpoint will be found after scanning N/8 elements, and so on.  So the
number of elements scanned over to compute their lengths is N/2 + N/4 +
N/8 ..., or asymptotically N, the same as for the deconstruct_array
approach.  deconstruct_array might have some small advantage from tighter
inner loops, but this is probably more than blown by the need for a
palloc and pfree.

So I was wrong and your way is better for navigating the array in the
variable-width case, assuming that we're not concerned about spending
some more code space to make this function a shade faster.

BTW, was there a reason for not noticing the case of exact match in
the search loop, and falling out early?  As it stands the code will
reliably choose the leftmost match if there are multiple equal items
in the search array, but do we care about such cases?
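
To make the trade-off concrete, here is a toy version of the search loop
in plain C over doubles (my illustration, not the patch's code); the
early-exit branch saves comparisons, but with duplicate thresholds it no
longer necessarily reports the leftmost match:

/* returns the bucket number of value, given n ascending thresholds */
static int
bucket_of(double value, const double *thresholds, int n)
{
	int		lo = 0;
	int		hi = n;

	while (lo < hi)
	{
		int		mid = lo + (hi - lo) / 2;

		if (thresholds[mid] == value)
			return mid + 1;		/* early exit; not necessarily leftmost */
		else if (thresholds[mid] < value)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;					/* number of thresholds <= value */
}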

regards, tom lane




Re: [HACKERS] Built-in binning functions

2014-08-31 Thread Tom Lane
David G Johnston  writes:
> Since "bucket" is the 'verb' here (in this specific case meaning "lookup the
> supplied value in the supplied bucket definition") and "width" is a modifier
> (the bucket specification describes an equal-width structure) I suggest
> "literal_bucket(val, array[])" such that the bucket is still the verb but
> now the modifier describes a structure that is literally provided.

It's a very considerable stretch to see "bucket" as a verb here :-).
Maybe that's why the SQL committee's choice of function name seems
so unnatural (to me anyway).

I was wondering about bucket_index(), ie "get the index of the bucket
this value falls into".  Or get_bucket(), or get_bucket_index() if you
like verbosity.

regards, tom lane




Re: [HACKERS] Built-in binning functions

2014-08-31 Thread David Johnston
On Sun, Aug 31, 2014 at 7:48 PM, Tom Lane  wrote:

> David G Johnston  writes:
> > Since "bucket" is the 'verb' here (in this specific case meaning "look up
> the
> > supplied value in the supplied bucket definition") and "width" is a
> modifier
> > (the bucket specification describes an equal-width structure) I suggest
> > "literal_bucket(val, array[])" such that the bucket is still the verb but
> > now the modifier describes a structure that is literally provided.
>
> It's a very considerable stretch to see "bucket" as a verb here :-).
> Maybe that's why the SQL committee's choice of function name seems
> so unnatural (to me anyway).
>
> I was wondering about bucket_index(), ie "get the index of the bucket
> this value falls into".  Or get_bucket(), or get_bucket_index() if you
> like verbosity.
>
> regards, tom lane
>

I got stuck on the thought that a function name should ideally be/include
a verb...

Even if you read it as a noun (and thus the verb is an implied "get"), the
naming logic still holds.

I pondered a "get_" version, though the argument for avoiding conflicts
with user code decreases its appeal.

The good part about SQL standard naming is that the typical programmer
isn't likely to pick a conflicting name.

"bucket_index" is appealing by itself though user-code probable...as bad as
I think "width_bucket" is for a name the fact is that it currently exists
and, even forced, consistency has merit.

David J.


Re: [HACKERS] Tips/advice for implementing integrated RESTful HTTP API

2014-08-31 Thread Craig Ringer
On 08/31/2014 12:40 PM, Dobes Vandermeer wrote:
> 1. Connecting to multiple databases
> 
> The background workers can apparently only connect to a single database
> at a time, but I want to expose all the databases via the API. 

bgworkers are assigned a database at launch time (if SPI is enabled),
and this database may not change during the worker's lifetime, same as a
normal backend.

Sometimes frustrating, but that's how it is.

> I think I could use libpq to connect to PostgreSQL on localhost but this
> might have weird side-effects in terms of authentication, pid use, stuff
> like that.

If you're going to do that, why use a bgworker at all?

In general, what do you gain from trying to do this within the database
server itself, rather than as an app in front of the DB?

> I could probably manage a pool of dynamic workers (as of 9.4), one per
> user/database combination or something along those lines.  Even one per
> request?  Is there some kind of IPC system in place to help shuttle the
> requests and responses between dynamic workers?  Or do I need to come up
> with my own?

The dynamic shmem code apparently has some queuing functionality. I
haven't used it yet.
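
For reference, a rough sketch of the send side, based on my reading of
9.4's storage/shm_mq.h -- the dsm_segment/queue setup is elided, and the
exact call sequence is an assumption to check against the headers:

  #include "postgres.h"
  #include "storage/dsm.h"
  #include "storage/proc.h"
  #include "storage/shm_mq.h"

  /*
   * Sketch only: "mq" is assumed to point at a queue already created in
   * the dsm_segment "seg" with shm_mq_create(), with the receiver set
   * on the other side.
   */
  static void
  send_request(shm_mq *mq, dsm_segment *seg, const char *msg)
  {
      shm_mq_handle *mqh;
      shm_mq_result res;

      shm_mq_set_sender(mq, MyProc);
      mqh = shm_mq_attach(mq, seg, NULL);

      /* blocking send; pass nowait = true for a non-blocking attempt */
      res = shm_mq_send(mqh, strlen(msg) + 1, msg, false);
      if (res != SHM_MQ_SUCCESS)
          elog(ERROR, "could not send request to worker");

      shm_mq_detach(mq);      /* the 9.4 signature takes the shm_mq */
  }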

> It seems like PostgreSQL itself has a way to shuttle requests out to
> workers; is it possible to tap into that system instead?  Basically some
> way to send the requests to a PostgreSQL backend from the background worker?

It does?

It's not the SPI, that executes work directly within the bgworker,
making it behave like a normal backend for the purpose of query execution.

> Or perhaps I shouldn't do this as a worker but rather modify PostgreSQL
> itself and do it in a more integrated/destructive manner?

Or just write a front-end.

The problem you'd have attempting to modify PostgreSQL itself for this
is that connection dispatch occurs via the postmaster, which is a
single-threaded process that already needs to do a bit of work to keep
an eye on how things are running. You don't want it constantly busy
processing and dispatching millions of tiny HTTP requests. It can't just
hand a connection off to a back-end immediately after accepting it,
either; it'd have to read the HTTP headers to determine what database to
connect to. Then launch a new backend for the connection, which is
horribly inefficient when doing tiny short-lived connections. The
postmaster has no concept of a pool of backends (unfortunately, IMO) to
re-use.

I imagine (it's not something I've investigated, really) that you'd want
a connection acceptor process that watched the listening HTTP request
socket. It'd hand connections off to dispatcher processes that read the
message content to get the target DB and dispatch the request to a
worker backend for the appropriate user/db combo, then collect the
results and return them on the connection. Hopefully at this point
you're thinking "that sounds a lot like a connection pool"... because it
is. An awfully complicated one, probably, as you'd have to manage
everything using shared memory segments and latches.

In my view it's unwise to try to do this in the DB with PostgreSQL's
architecture. Hack PgBouncer or PgPool to do what you want. Or write a
server with Tomcat/Jetty using JAX-RS and PgJDBC and the built-in
connection pool facilities - you won't *believe* how easy it is.

> 3. Parallelism
> 
> The regular PostgreSQL server can run many queries in parallel

Well, one PostgreSQL instance (postmaster) may have many backends, each
of which may run queries in series but not in parallel. Any given
process may only run one query at a time.

> but it
> seems like if I am using SPI I could only run one query at a time - it's
> not an asynchronous API.

Correct.

> Any help, sage advice, tips, and suggestions how to move forward in
> these areas would be muchly appreciated!

Don't do it with bgworkers.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] COPY and heap_sync

2014-08-31 Thread Fabrízio de Royes Mello
On Sun, Aug 31, 2014 at 10:10 AM, Peter Eisentraut  wrote:
>
> On 8/30/14 2:26 AM, Jeff Janes wrote:
> > But there are cases where people use COPY in a loop with a small amount of
> > data in each statement.
>
> What would be the reason for doing that?
>

I have done the same thing many times.  At a company where I was employed,
we developed scripts to migrate data from one database to another.

The first version used INSERT statements and was very, very slow.  Then we
wrote a second version, replacing the INSERTs with COPY statements.  The
performance was much better, but we believed it could be better still, so in
the third version we created a kind of "cache" (using arrays) to
accumulate the records in memory; after N rows we built the COPY
statement from the cache contents and ran it.  This was a really good
performance improvement.
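
As a minimal libpq sketch of that batching pattern (the table t(a, b)
and the helper are hypothetical, and error handling is abbreviated):

  #include <stdio.h>
  #include <string.h>
  #include <libpq-fe.h>

  /*
   * Flush a cache of pre-formatted, tab-separated text rows as one
   * COPY statement.  The caller accumulates rows into "cache" and
   * calls this every N rows.
   */
  static int
  flush_batch(PGconn *conn, char **cache, int nrows)
  {
      PGresult   *res = PQexec(conn, "COPY t (a, b) FROM STDIN");
      int         i;

      if (PQresultStatus(res) != PGRES_COPY_IN)
      {
          fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
          PQclear(res);
          return -1;
      }
      PQclear(res);

      for (i = 0; i < nrows; i++)
          PQputCopyData(conn, cache[i], (int) strlen(cache[i]));

      if (PQputCopyEnd(conn, NULL) != 1)
          return -1;

      res = PQgetResult(conn);
      if (PQresultStatus(res) != PGRES_COMMAND_OK)
      {
          fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
          PQclear(res);
          return -1;
      }
      PQclear(res);
      return 0;
  }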

This is my use case for a feature that postpones the heap_sync in COPY
statements.  I don't know whether it's a feature that a lot of people want,
but IMHO it would be a nice improvement for bulk-load operations.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


Re: [HACKERS] Why data of timestamptz does not store value of timezone passed to it?

2014-08-31 Thread rohtodeveloper
> On 08/29/2014 04:59 AM, Kevin Grittner wrote:
>> I just took a quick look at the spec to refresh my memory, and it
>> seems to require that the WITH TIME ZONE types store UTC (I suppose
>> for fast comparisons); it requires the time zone, in the form of an
>> hour:minute offset, to be stored with it, so you can determine the
>> local time from which it was derived. I concede that this is not
>> usually useful, and am glad we have a type that behaves as
>> timestamptz does; but occasionally a type that behaves in
>> conformance with the spec would be useful, and it would certainly
>> be less confusing for people who are used to the standard behavior.
> 
> FWIW, MS SQL's DateTimeOffset data type:
> 
> http://msdn.microsoft.com/en-AU/library/bb630289.aspx
> 
> is much more like what I, when I was getting started, expected TIMESTAMP
> WITH TIME ZONE to be. We don't really have anything equivalent in
> PostgreSQL.
> 

That's also what I expect: a timestamptz = timestamp + offset, just like the
current implementation of TIME WITH TIME ZONE.

typedef struct
{
    TimeADT     time;       /* all time units other than months and years */
    int32       zone;       /* numeric time zone, in seconds */
} TimeTzADT;
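
By analogy, the expectation would be something like the following --
a purely hypothetical struct, not an existing PostgreSQL type:

  typedef struct
  {
      Timestamp   time;       /* the timestamp value, without time zone */
      int32       zone;       /* numeric time zone offset, in seconds */
  } TimestampTzADT;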

Also, it's inconvenient for clients (JDBC, Npgsql, ...) to understand a
full 'timezone' name (such as 'America/New_York') that comes from PostgreSQL
and transform it into their own data type (such as DateTimeOffset in .NET).
An *offset*, by contrast, is easy to parse and process.


Best Regards
rohto 


Re: [HACKERS] RLS Design

2014-08-31 Thread Stephen Frost
Adam, all,

* Brightwell, Adam (adam.brightw...@crunchydatasolutions.com) wrote:
> Attached is a patch for RLS that was created against master at
> 01363beae52700c7425cb2d2452177133dad3e93 and is ready for review.

Many thanks for posting this.  As others may realize, I've already
reviewed and modified this patch, working with Adam to get it ready.
I'm continuing to review and test it, but in general I'm quite happy
with how it's shaping up - additional review, testing, and comments
are always appreciated, though.

> Alter an RLS policy named  on .  Specifying a command is
> optional; if provided, the policy's command is changed, otherwise it is
> left as-is.  Specifying a role is optional; if provided, the policy's
> role is changed, otherwise it is left as-is.  The  must always be
> provided and is therefore always replaced.

I'm pretty sure the  is also optional in this patch (that was
a late change that I made), but the documentation needs to be updated.

> * Plancache Invalidation:  If a relation has a row-security policy and
> row-security is enabled then the invalidation will occur when either the
> row_security GUC is changed OR when the current user changes.  This
> invalidation ONLY takes place for cached plans where the target relation
> has a row security policy.

I know there was a lot of discussion about this previously, but I'm fine
with the initial version simply invalidating plans which involve
RLS-enabled relations and role changes.  This patch doesn't create any
regressions for individuals who are not using RLS.  We can certainly
look into improving this in the future to have per-role plan caches but
it's a fair bit of additional non-trivial code that can be added
independently.

> * Security Qual Expression:  All row-security policies are OR'ed together.

This was also a point of much discussion, but I continue to feel this is
the right approach for the initial version.  We can add flexibility here
later, if necessary, but OR'ing these together is in line with how role
membership works today (you have the rights of all roles you are a member
of, pursuant to inherit/noinherit status, of course).

> * row_security GUC - enable/disable row level security.

Note that, as discussed, pg_dump will set row_security off, unless
specifically asked to enable it.  row_security will also be set to off
when the user logging in is a superuser or does a 'set role' to a
superuser.  Currently, if a user logging in is *not* a superuser, or a
'set role' is done to a non-superuser, row_security gets re-set to
enabled.  This is one aspect of the patch that I think we should change
(which is a matter of removing just a few lines of code and then
updating the regression tests to do 'set row_security = on;' before
running), because if you log in as a superuser and then 'set role' to a
non-superuser, it strikes me now (it didn't really when I wrote this
originally) as a bit surprising that row_security gets set to 'on' when
doing a 'set role'.

One thing that I really like about this approach is that a superuser can
explicitly set 'row_security' to on and be able to see what happens.
Clearly, in an environment of untrusted users, that could be dangerous,
but it can also be an extremely useful way of testing things,
particularly in development environments where everyone is a superuser. 

This deserves a bit more documentation also.

> * BYPASSRLS and NOBYPASSRLS role attribute - allows user to bypass RLS if
> row_security GUC is set to OFF.  If a user sets row_security to OFF and
> does not have this attribute, then an error is raised when attempting to
> query a relation with a RLS policy.

(note that the superuser is always considered to have the bypassrls
attribute)

> * psql \d  support: psql describe support for listing policy
> information per table.

This works pretty well for me, but we may want to add some indication
that RLS is on a table in the \dp listing.

Thanks!

Stephen




Re: [HACKERS] Immediate standby promotion

2014-08-31 Thread Amit Kapila
On Thu, Aug 14, 2014 at 1:49 PM, Fujii Masao  wrote:
>
> Hi,
>
> I'd like to propose to add new option "--immediate" to pg_ctl promote.
> When this option is set, recovery ignores any WAL data which have not
> been replayed yet and exits immediately. Patch attached.
>
> This promotion is faster than normal one but can cause data loss.

I think there is one downside to this proposal as well: apart from
data loss, it can leave uncommitted data occupying space in the
database, which later needs to be cleaned up by vacuum.
This can happen with a non-immediate promote as well, but the
chances are higher with immediate.  So the time gained by doing
an immediate promotion can lead to a slowdown of operations in some
cases.  It might be useful to mention this in the docs.

A few comments about the patch:

1.
On the standby we will see the following message:

LOG:  received promote request

The user will always see the above message irrespective of whether it
is an immediate promote or any other mode of promotion.  I think it
would be better to distinguish between the different modes and display
an appropriate message.

2.
StartupXLOG()
{
..
+ if (immediate_promote)
+ break;
..
}

Why are you doing this check after the pause handling
(recoveryApplyDelay/recoveryPausesHere) in recovery?

Why can't we do it after ReadRecord()?


3.
!  * of "promote" and "immediate_promote"
Shouldn't 'or' be more appropriate than 'and' in the above sentence?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com