Re: [HACKERS] Declarative partitioning - another take

Venkata B Nagothi Thu, 08 Dec 2016 22:17:15 -0800

Hi,

I am testing the partitioning feature from the latest master and got the
following error while loading the data -


db01=# create table orders_y1993 PARTITION OF orders FOR VALUES FROM
('1993-01-01') TO ('1993-12-31');
CREATE TABLE

db01=# copy orders from '/data/orders-1993.csv' delimiter '|';
*ERROR:  could not read block 6060 in file "base/16384/16412": read only 0
of 8192 bytes*
*CONTEXT:  COPY orders, line 376589:
"9876391|374509|O|54847|1997-07-16|3-MEDIUM       |Clerk#000001993|0|ithely
regular pack"*

Am i doing something wrong ?

Regards,

Venkata B N
Database Consultant


On Fri, Dec 9, 2016 at 3:58 PM, Amit Langote <[email protected]>
wrote:

> On 2016/12/09 0:25, Robert Haas wrote:
> > On Wed, Dec 7, 2016 at 11:42 PM, Michael Paquier
> > <[email protected]> wrote:
> >>> Congrats to everyone working on this! This is a large step forward.
> >>
> >> Congratulations to all! It was a long way to this result.
> >
> > Yes.  The last effort in this area which I can remember was by Itagaki
> > Takahiro in 2010, so we've been waiting for this for more than 6
> > years.  It's really good that Amit was able to put in the effort to
> > produce a committable patch, and I think he deserves all of our thanks
> > for getting that done - and NTT deserves our thanks for paying him to
> > do it.
> >
> > Even though I know he put in a lot more work than I did, let me just
> > say: phew, even reviewing that was a ton of work.
>
> Absolutely!  Your review comments and design suggestions have been
> instrumental in improving (and cutting down on the size of) the patches.
>
> > Of course, this is the beginning, not the end.
>
> +1000!
>
> > I've been thinking
> > about next steps -- here's an expanded list:
> >
> > - more efficient plan-time partition pruning (constraint exclusion is
> too slow)
> > - run-time partition pruning
> > - partition-wise join (Ashutosh Bapat is already working on this)
> > - try to reduce lock levels
> > - hash partitioning
> > - the ability to create an index on the parent and have all of the
> > children inherit it; this should work something like constraint
> > inheritance.  you could argue that this doesn't add any real new
> > capability but it's a huge usability feature.
> > - teaching autovacuum enough about inheritance hierarchies for it to
> > update the parent statistics when they get stale despite the lack of
> > any actual inserts/updates/deletes to the parent.  this has been
> > pending for a long time, but it's only going to get more important
> > - row movement (aka avoiding the need for an ON UPDATE trigger on each
> > partition)
> > - insert (and eventually update) tuple routing for foreign partitions
> > - not scanning the parent
> > - fixing the insert routing so that we can skip tuple conversion where
> possible
> > - fleshing out the documentation
>
> I would definitely want to contribute to some of these items.  It's great
> that many others plan to contribute toward this as well.
>
> > One thing I'm wondering is whether we can optimize away some of the
> > heavyweight locks.  For example, if somebody does SELECT * FROM ptab
> > WHERE id = 1, they really shouldn't need to lock the entire
> > partitioning hierarchy, but right now they do.  If the root knows
> > based on its own partitioning key that only one child is relevant, it
> > would be good to lock *only that child*.  For this feature to be
> > competitive, it needs to scale to at least a few thousand partitions,
> > and locking thousands of objects instead of one or two is bound to be
> > slow.  Similarly, you can imagine teaching COPY to lock partitions
> > only on demand; if no tuples are routed to a particular partition, we
> > don't need to lock it.  There's a manageability component here, too:
> > not locking partitions unnecessarily makes ti easier to get DDL on
> > other partitions through.  Alternatively, maybe we could rewrite the
> > lock manager to be hierarchical, so that you can take a single lock
> > that represents an AccessShareLock on all partitions and only need to
> > make one entry in the lock table to do it.  That means that attempts
> > to lock individual partitions need to check not only for a lock on
> > that partition but also on anything further up in the hierarchy, but
> > that might be a good trade if it gives us O(1) locking on the parent.
> > And maybe we could also have a level of the hierarchy that represents
> > every-table-in-the-database, for the benefit of pg_dump.  Of course,
> > rewriting the lock manager is a big project not for the faint of
> > heart, but I think if we don't it's going to be a scaling bottleneck.
>
> Hierarchical lock manager stuff is interesting.  Are you perhaps alluding
> to a new *intention* lock mode as described in the literature on multiple
> granularity locking [1]?
>
> > We also need to consider other parts of the system that may not scale,
> > like pg_dump.  For a long time, we've been sorta-kinda willing to fix
> > the worst of the scalability problems with pg_dump, but that's really
> > no longer an adequate response.  People want 1000 partitions.  Heck,
> > people want 1,000,000 partitions, but getting to where 1000 partitions
> > works well would help PostgreSQL a lot.  Our oft-repeated line that
> > inheritance isn't designed for large numbers of inheritance children
> > is basically just telling people who have the use case where they need
> > that to go use some other product.  Partitioning, like replication, is
> > not an optional feature for a world-class database.  And, from a
> > technical point of view, I think we've now got an infrastructure that
> > really should be able to be scaled up considerably higher than what
> > we've been able to do in the past.  When we were stuck with
> > inheritance + constraint exclusion, we could say "well, there's not
> > really any point because you'll hit these other limits anyway".  But I
> > think now that's not really true.  This patch eliminates one of the
> > core scalability problems in this area, and provides infrastructure
> > for attacking some of the others.  I hope that people will step up and
> > do that.  There's a huge opportunity here for PostgreSQL to become
> > relevant in use cases where it currently falters badly, and we should
> > try to take advantage of it.  This patch is a big step by itself, but
> > if we ignore the potential to do more with this as the base we will be
> > leaving a lot of "win" on the table.
>
> Agreed on all counts.
>
> Thanks,
> Amit
>
> [1] https://en.wikipedia.org/wiki/Multiple_granularity_locking
>
>
>
>
> --
> Sent via pgsql-hackers mailing list ([email protected])
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

Re: [HACKERS] Declarative partitioning - another take

Reply via email to