Hi, I am testing the partitioning feature from the latest master and got the following error while loading the data -
db01=# create table orders_y1993 PARTITION OF orders FOR VALUES FROM ('1993-01-01') TO ('1993-12-31'); CREATE TABLE db01=# copy orders from '/data/orders-1993.csv' delimiter '|'; *ERROR: could not read block 6060 in file "base/16384/16412": read only 0 of 8192 bytes* *CONTEXT: COPY orders, line 376589: "9876391|374509|O|54847|1997-07-16|3-MEDIUM |Clerk#000001993|0|ithely regular pack"* Am i doing something wrong ? Regards, Venkata B N Database Consultant On Fri, Dec 9, 2016 at 3:58 PM, Amit Langote <langote_amit...@lab.ntt.co.jp> wrote: > On 2016/12/09 0:25, Robert Haas wrote: > > On Wed, Dec 7, 2016 at 11:42 PM, Michael Paquier > > <michael.paqu...@gmail.com> wrote: > >>> Congrats to everyone working on this! This is a large step forward. > >> > >> Congratulations to all! It was a long way to this result. > > > > Yes. The last effort in this area which I can remember was by Itagaki > > Takahiro in 2010, so we've been waiting for this for more than 6 > > years. It's really good that Amit was able to put in the effort to > > produce a committable patch, and I think he deserves all of our thanks > > for getting that done - and NTT deserves our thanks for paying him to > > do it. > > > > Even though I know he put in a lot more work than I did, let me just > > say: phew, even reviewing that was a ton of work. > > Absolutely! Your review comments and design suggestions have been > instrumental in improving (and cutting down on the size of) the patches. > > > Of course, this is the beginning, not the end. > > +1000! > > > I've been thinking > > about next steps -- here's an expanded list: > > > > - more efficient plan-time partition pruning (constraint exclusion is > too slow) > > - run-time partition pruning > > - partition-wise join (Ashutosh Bapat is already working on this) > > - try to reduce lock levels > > - hash partitioning > > - the ability to create an index on the parent and have all of the > > children inherit it; this should work something like constraint > > inheritance. you could argue that this doesn't add any real new > > capability but it's a huge usability feature. > > - teaching autovacuum enough about inheritance hierarchies for it to > > update the parent statistics when they get stale despite the lack of > > any actual inserts/updates/deletes to the parent. this has been > > pending for a long time, but it's only going to get more important > > - row movement (aka avoiding the need for an ON UPDATE trigger on each > > partition) > > - insert (and eventually update) tuple routing for foreign partitions > > - not scanning the parent > > - fixing the insert routing so that we can skip tuple conversion where > possible > > - fleshing out the documentation > > I would definitely want to contribute to some of these items. It's great > that many others plan to contribute toward this as well. > > > One thing I'm wondering is whether we can optimize away some of the > > heavyweight locks. For example, if somebody does SELECT * FROM ptab > > WHERE id = 1, they really shouldn't need to lock the entire > > partitioning hierarchy, but right now they do. If the root knows > > based on its own partitioning key that only one child is relevant, it > > would be good to lock *only that child*. For this feature to be > > competitive, it needs to scale to at least a few thousand partitions, > > and locking thousands of objects instead of one or two is bound to be > > slow. Similarly, you can imagine teaching COPY to lock partitions > > only on demand; if no tuples are routed to a particular partition, we > > don't need to lock it. There's a manageability component here, too: > > not locking partitions unnecessarily makes ti easier to get DDL on > > other partitions through. Alternatively, maybe we could rewrite the > > lock manager to be hierarchical, so that you can take a single lock > > that represents an AccessShareLock on all partitions and only need to > > make one entry in the lock table to do it. That means that attempts > > to lock individual partitions need to check not only for a lock on > > that partition but also on anything further up in the hierarchy, but > > that might be a good trade if it gives us O(1) locking on the parent. > > And maybe we could also have a level of the hierarchy that represents > > every-table-in-the-database, for the benefit of pg_dump. Of course, > > rewriting the lock manager is a big project not for the faint of > > heart, but I think if we don't it's going to be a scaling bottleneck. > > Hierarchical lock manager stuff is interesting. Are you perhaps alluding > to a new *intention* lock mode as described in the literature on multiple > granularity locking [1]? > > > We also need to consider other parts of the system that may not scale, > > like pg_dump. For a long time, we've been sorta-kinda willing to fix > > the worst of the scalability problems with pg_dump, but that's really > > no longer an adequate response. People want 1000 partitions. Heck, > > people want 1,000,000 partitions, but getting to where 1000 partitions > > works well would help PostgreSQL a lot. Our oft-repeated line that > > inheritance isn't designed for large numbers of inheritance children > > is basically just telling people who have the use case where they need > > that to go use some other product. Partitioning, like replication, is > > not an optional feature for a world-class database. And, from a > > technical point of view, I think we've now got an infrastructure that > > really should be able to be scaled up considerably higher than what > > we've been able to do in the past. When we were stuck with > > inheritance + constraint exclusion, we could say "well, there's not > > really any point because you'll hit these other limits anyway". But I > > think now that's not really true. This patch eliminates one of the > > core scalability problems in this area, and provides infrastructure > > for attacking some of the others. I hope that people will step up and > > do that. There's a huge opportunity here for PostgreSQL to become > > relevant in use cases where it currently falters badly, and we should > > try to take advantage of it. This patch is a big step by itself, but > > if we ignore the potential to do more with this as the base we will be > > leaving a lot of "win" on the table. > > Agreed on all counts. > > Thanks, > Amit > > [1] https://en.wikipedia.org/wiki/Multiple_granularity_locking > > > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers >