Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Peter Geoghegan
On Sat, Mar 31, 2018 at 8:32 PM, Andres Freund wrote: > LGTM, pushed. Closing CF entry. Yay! Only 110 to go. Thanks Andres! -- Peter Geoghegan

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Andres Freund
On 2018-03-31 20:25:24 -0700, Peter Geoghegan wrote: > On Sat, Mar 31, 2018 at 8:09 PM, Peter Geoghegan wrote: > > I was thinking of using rint(), which is what you get if you call > > round(float8) from SQL. > > Attached patch does it that way. Note that there are float/int cast > regression tes

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Peter Geoghegan
On Sat, Mar 31, 2018 at 8:09 PM, Peter Geoghegan wrote: > I was thinking of using rint(), which is what you get if you call > round(float8) from SQL. Attached patch does it that way. Note that there are float/int cast regression tests that ensure that rint() behaves consistently on supported plat

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Peter Geoghegan
On Sat, Mar 31, 2018 at 8:08 PM, Andres Freund wrote: >> round() is from C99, apparently. I'll investigate a fix. > > Just replacing it with a floor(val + 0.5) ought to do the trick, right? I was thinking of using rint(), which is what you get if you call round(float8) from SQL. -- Peter Geoghe

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Andres Freund
On 2018-03-31 19:43:45 -0700, Peter Geoghegan wrote: > On Sat, Mar 31, 2018 at 3:15 PM, Peter Geoghegan wrote: > > On Sat, Mar 31, 2018 at 2:59 PM, Peter Geoghegan wrote: > >> WFM. I have all the information I need to produce the next revision now. > > > > I might as well post this one first. I'l

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Peter Geoghegan
On Sat, Mar 31, 2018 at 3:15 PM, Peter Geoghegan wrote: > On Sat, Mar 31, 2018 at 2:59 PM, Peter Geoghegan wrote: >> WFM. I have all the information I need to produce the next revision now. > > I might as well post this one first. I'll have 0002 for you in a short while. Looks like thrips doesn'

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Peter Geoghegan
On Sat, Mar 31, 2018 at 3:15 PM, Peter Geoghegan wrote: >> WFM. I have all the information I need to produce the next revision now. > > I might as well post this one first. I'll have 0002 for you in a short while. Attached is 0002 -- the amcheck enhancement itself. As requested by Andres, this ad

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Peter Geoghegan
On Sat, Mar 31, 2018 at 2:59 PM, Peter Geoghegan wrote: > WFM. I have all the information I need to produce the next revision now. I might as well post this one first. I'll have 0002 for you in a short while. -- Peter Geoghegan v8-0001-Add-Bloom-filter-data-structure-implementation.patch Desc

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Peter Geoghegan
On Sat, Mar 31, 2018 at 2:56 PM, Andres Freund wrote: >> So you're asking for something like bt_index_check_heap() + >> bt_index_parent_check_heap()? Or, are you talking about function >> overloading? > > The latter. That addresses my concerns about dropping the function and > causing issues due t

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Andres Freund
On 2018-03-31 11:27:14 -0700, Peter Geoghegan wrote: > On Fri, Mar 30, 2018 at 7:04 PM, Andres Freund wrote: > > I'm just saying that there should be two functions here, rather than > > dropping the old definition, and creating s new one with a default argument. > > So you're asking for somethin

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Peter Geoghegan
On Fri, Mar 30, 2018 at 7:04 PM, Andres Freund wrote: > I'm just saying that there should be two functions here, rather than dropping > the old definition, and creating s new one with a default argument. So you're asking for something like bt_index_check_heap() + bt_index_parent_check_heap()? Or

Re: [HACKERS] A design for amcheck heapam verification

2018-03-30 Thread Peter Geoghegan
On Fri, Mar 30, 2018 at 6:55 PM, Peter Geoghegan wrote: > On Fri, Mar 30, 2018 at 6:20 PM, Andres Freund wrote: >>> What would the upcasting you have in mind look like? >> >> Just use UINT64CONST()? Let's try not to introduce further code that'll >> need to get painfully fixed. > > What I have r

Re: [HACKERS] A design for amcheck heapam verification

2018-03-30 Thread Andres Freund
On March 30, 2018 6:55:50 PM PDT, Peter Geoghegan wrote: >On Fri, Mar 30, 2018 at 6:20 PM, Andres Freund >wrote: >>> What would the upcasting you have in mind look like? >> >> Just use UINT64CONST()? Let's try not to introduce further code >that'll >> need to get painfully fixed. > >What I hav

Re: [HACKERS] A design for amcheck heapam verification

2018-03-30 Thread Peter Geoghegan
On Fri, Mar 30, 2018 at 6:20 PM, Andres Freund wrote: >> What would the upcasting you have in mind look like? > > Just use UINT64CONST()? Let's try not to introduce further code that'll > need to get painfully fixed. What I have right now is based on imitating the style that Tom uses. I'm pretty

Re: [HACKERS] A design for amcheck heapam verification

2018-03-30 Thread Andres Freund
On 2018-03-29 19:42:42 -0700, Peter Geoghegan wrote: > >> + /* > >> + * Aim for two bytes per element; this is sufficient to get a false > >> + * positive rate below 1%, independent of the size of the bitset or > >> total > >> + * number of elements. Also, if rounding down the

Re: [HACKERS] A design for amcheck heapam verification

2018-03-30 Thread Peter Geoghegan
On Thu, Mar 29, 2018 at 7:42 PM, Peter Geoghegan wrote: >> We should be able to get this into v11... > > That's a relief. Thanks. I have a new revision lined up. I won't send anything to the list until you clear up what you meant in those few cases where it seemed unclear. I also acted on some o

Re: [HACKERS] A design for amcheck heapam verification

2018-03-29 Thread Peter Geoghegan
On Thu, Mar 29, 2018 at 6:18 PM, Andres Freund wrote: >> This commit adds a test harness extension module, test_bloomfilter. It >> can be used to get a sense of how the Bloom filter implementation >> performs under varying conditions. > > Maybe add a short paragraph explaining what this'll be use

Re: [HACKERS] A design for amcheck heapam verification

2018-03-29 Thread Andres Freund
Hi, On 2018-03-26 20:20:57 -0700, Peter Geoghegan wrote: > From ede1ba731dc818172a94adbb6331323c1f2b1170 Mon Sep 17 00:00:00 2001 > From: Peter Geoghegan > Date: Thu, 24 Aug 2017 20:58:21 -0700 > Subject: [PATCH v7 1/2] Add Bloom filter data structure implementation. > > A Bloom filter is a spac

Re: [HACKERS] A design for amcheck heapam verification

2018-03-29 Thread Peter Geoghegan
On Thu, Mar 29, 2018 at 2:28 AM, Simon Riggs wrote: > I understand we are adding a check to verify heap against index and > this will take longer than before. When it runs does it report > progress of the run via pg_stat_activity, so we can monitor how long > it will take? My point to Pavan was t

Re: [HACKERS] A design for amcheck heapam verification

2018-03-29 Thread Simon Riggs
On 29 March 2018 at 01:49, Peter Geoghegan wrote: >>> IndexBuildHeapRangeScan() doesn't mention anything about CIC's heap >>> ShareUpdateExclusiveLock (it just says SharedLock), because that lock >>> strength doesn't have anything to do with IndexBuildHeapRangeScan() >>> when it operates with an

Re: [HACKERS] A design for amcheck heapam verification

2018-03-28 Thread Peter Geoghegan
On Wed, Mar 28, 2018 at 5:47 AM, Pavan Deolasee wrote: > Mostly a nitpick, but I guess we should leave a comment after > IndexBuildHeapScan() saying heap_endscan() is not necessary since > IndexBuildHeapScan() does that internally. I stumbled upon that while > looking for any potential leaks. I kn

Re: [HACKERS] A design for amcheck heapam verification

2018-03-28 Thread Peter Geoghegan
On Wed, Mar 28, 2018 at 5:04 AM, Pavan Deolasee wrote: > Hmm Ok. It still sounds backwards to me, but then English is not my first or > even second language. "A heap scan later verifies the presence in the heap > of all index tuples fingerprinted" sounds as if we expect to find all > fingerprinted

Re: [HACKERS] A design for amcheck heapam verification

2018-03-28 Thread Pavan Deolasee
On Wed, Mar 28, 2018 at 2:48 AM, Peter Geoghegan wrote: > > I don't think so. The transaction involved is still an ordinary user > transaction. > > Mostly a nitpick, but I guess we should leave a comment after IndexBuildHeapScan() saying heap_endscan() is not necessary since IndexBuildHeapScan()

Re: [HACKERS] A design for amcheck heapam verification

2018-03-28 Thread Pavan Deolasee
On Wed, Mar 28, 2018 at 2:48 AM, Peter Geoghegan wrote: > On Tue, Mar 27, 2018 at 6:48 AM, Pavan Deolasee > wrote: > > + * When index-to-heap verification is requested, a Bloom filter is used > to > > + * fingerprint all tuples in the target index, as the index is > traversed to > > + * verify i

Re: [HACKERS] A design for amcheck heapam verification

2018-03-27 Thread Peter Geoghegan
On Tue, Mar 27, 2018 at 6:48 AM, Pavan Deolasee wrote: > + * When index-to-heap verification is requested, a Bloom filter is used to > + * fingerprint all tuples in the target index, as the index is traversed to > + * verify its structure. A heap scan later verifies the presence in the > heap > +

Re: [HACKERS] A design for amcheck heapam verification

2018-03-27 Thread Pavan Deolasee
On Tue, Mar 27, 2018 at 8:50 AM, Peter Geoghegan wrote: > On Fri, Mar 23, 2018 at 7:13 AM, Andrey Borodin > wrote: > > I've just flipped patch to WoA. But if above issues will be fixed I > think that patch is ready for committer. > > Attached is v7, which has the small tweaks that you suggested.

Re: [HACKERS] A design for amcheck heapam verification

2018-03-26 Thread Peter Geoghegan
On Fri, Mar 23, 2018 at 7:13 AM, Andrey Borodin wrote: > I've just flipped patch to WoA. But if above issues will be fixed I think > that patch is ready for committer. Attached is v7, which has the small tweaks that you suggested. Thank you for the review. I hope that this can be committed shor

Re: [HACKERS] A design for amcheck heapam verification

2018-03-23 Thread Andrey Borodin
Hi! > 8 февр. 2018 г., в 22:45, Peter Geoghegan написал(а): > > On Thu, Feb 8, 2018 at 6:05 AM, Andrey Borodin wrote: >> I do not see a reason behind hashing the seed. > > It made some sense when I was XOR'ing it to mix. A uniform > distribution of bits seemed desirable then, since random() wo

Re: [HACKERS] A design for amcheck heapam verification

2018-02-08 Thread Peter Geoghegan
On Thu, Feb 8, 2018 at 6:05 AM, Andrey Borodin wrote: > I do not see a reason behind hashing the seed. It made some sense when I was XOR'ing it to mix. A uniform distribution of bits seemed desirable then, since random() won't use the most significant bit -- it generates random numbers in the ran

Re: [HACKERS] A design for amcheck heapam verification

2018-02-08 Thread Andrey Borodin
Hi, Peter! > 8 февр. 2018 г., в 4:56, Peter Geoghegan написал(а): > > * Faster modulo operations. > > * Removed sdbmhash(). Thanks! I definitely like how Bloom filter is implemented now. I could not understand meaning of this, but apparently this will not harm + /* +* Caller

Re: [HACKERS] A design for amcheck heapam verification

2018-02-07 Thread Peter Geoghegan
On Mon, Feb 5, 2018 at 12:55 PM, Peter Geoghegan wrote: > Anyway, parallel CREATE INDEX added a new "scan" argument to > IndexBuildHeapScan(), which caused this patch to bitrot. At a minimum, > an additional NULL argument should be passed by amcheck. However, I > have a better idea. > > ISTM that

Re: [HACKERS] A design for amcheck heapam verification

2018-02-05 Thread Peter Geoghegan
On Mon, Jan 22, 2018 at 6:07 PM, Michael Paquier wrote: > Yep. I have provided the feedback I wanted for 0001 (no API change in > the bloom facility by the way :( ), but I still wanted to look at 0002 > in depths. I don't see a point in adding complexity that no caller will actually use. It *migh

Re: [HACKERS] A design for amcheck heapam verification

2018-01-22 Thread Michael Paquier
On Mon, Jan 22, 2018 at 03:22:05PM -0800, Peter Geoghegan wrote: > Michael said he'd do more review. I generally feel this is close, though. Yep. I have provided the feedback I wanted for 0001 (no API change in the bloom facility by the way :( ), but I still wanted to look at 0002 in depths. -- Mi

Re: [HACKERS] A design for amcheck heapam verification

2018-01-22 Thread Peter Geoghegan
Hi, On Fri, Jan 12, 2018 at 1:41 AM, Andrey Borodin wrote: > I've looked into the code and here's my review. > > The new functionality works and is useful right now. I believe it should be > shipped in the Postgres box (currently, I install it with packet managers). > The code is nice and well d

Re: [HACKERS] A design for amcheck heapam verification

2018-01-22 Thread Peter Geoghegan
On Thu, Jan 11, 2018 at 2:14 AM, Andrey Borodin wrote: > I like heapam verification functionality and use it right now. So, I'm > planning to provide review for this patch, probably, this week. Great! > Seems like new check is working 4 orders of magnitudes faster then > bt_index_parent_check(

Re: [HACKERS] A design for amcheck heapam verification

2018-01-12 Thread Andrey Borodin
Hi, Peter! > 11 янв. 2018 г., в 15:14, Andrey Borodin написал(а): > I like heapam verification functionality and use it right now. So, I'm > planning to provide review for this patch, probably, this week. I've looked into the code and here's my review. The new functionality works and is useful

Re: [HACKERS] A design for amcheck heapam verification

2018-01-11 Thread Andrey Borodin
Hello! I like heapam verification functionality and use it right now. So, I'm planning to provide review for this patch, probably, this week. From my current use I have some thoughts on interface. Here's what I get. # select bt_index_check('messagefiltervalue_group_id_59490523e6ee451f',true);

Re: [HACKERS] A design for amcheck heapam verification

2017-12-13 Thread Peter Geoghegan
On Mon, Dec 11, 2017 at 9:35 PM, Michael Paquier wrote: > + /* > +* Generate random seed, or use caller's. Seed should always be a > positive > +* value less than or equal to PG_INT32_MAX, to ensure that any random > seed > +* can be recreated through callerseed if the need arises

Re: [HACKERS] A design for amcheck heapam verification

2017-12-11 Thread Michael Paquier
On Fri, Dec 8, 2017 at 4:37 AM, Peter Geoghegan wrote: > Here, we repeat the same test 3 times, varying only the seed value > used for each run. Thanks for the updated version! Looking at 0001, I have run a coverage test and can see all code paths covered, which is nice. It is also easier to chec

Re: [HACKERS] A design for amcheck heapam verification

2017-12-07 Thread Peter Geoghegan
On Tue, Nov 28, 2017 at 9:54 PM, Peter Geoghegan wrote: > On Tue, Nov 28, 2017 at 9:50 PM, Michael Paquier > wrote: >>> Would that address your concern? There would be an SQL interface, but >>> it would be trivial. >> >> That's exactly what I think you should do, and mentioned so upthread. >> A S

Re: [HACKERS] A design for amcheck heapam verification

2017-11-28 Thread Michael Paquier
On Wed, Nov 29, 2017 at 2:54 PM, Peter Geoghegan wrote: > My understanding of your earlier remarks, rightly or wrongly, was that > you wanted me to adopt the Bloom filter to actually be usable from SQL > in some kind of general way. As opposed to what I just said -- adding > a stub SQL interface t

Re: [HACKERS] A design for amcheck heapam verification

2017-11-28 Thread Peter Geoghegan
On Tue, Nov 28, 2017 at 9:50 PM, Michael Paquier wrote: >> Would that address your concern? There would be an SQL interface, but >> it would be trivial. > > That's exactly what I think you should do, and mentioned so upthread. > A SQL interface can also show a good example of how developers can us

Re: [HACKERS] A design for amcheck heapam verification

2017-11-28 Thread Michael Paquier
On Wed, Nov 29, 2017 at 2:48 PM, Peter Geoghegan wrote: > I still don't think that regression tests as such make sense. However, > it seems like it might be a good idea to add a test harness for the > Bloom filter code. I actually wrote code like this for myself during > development, that could be

Re: [HACKERS] A design for amcheck heapam verification

2017-11-28 Thread Peter Geoghegan
On Tue, Nov 28, 2017 at 9:38 PM, Michael Paquier wrote: > My apologies for slacking here. I would still welcome some regression > tests to stress the bloom API you are proposing! For now I am moving > this patch to next CF. I still don't think that regression tests as such make sense. However, it

Re: [HACKERS] A design for amcheck heapam verification

2017-11-28 Thread Michael Paquier
On Sat, Oct 21, 2017 at 9:34 AM, Peter Geoghegan wrote: > I should point out that I shipped virtually the same code yesterday, > as v1.1 of the Github version of amcheck (also known as amcheck_next). > Early adopters will be able to use this new "heapallindexed" > functionality in the next few day