Re: [PERFORM] [SQL] querying with index on jsonb slower than standard column. Why?
I wrote:
> Adrian Klaver writes:
>> Seems work_mem is the key:

> Fascinating.  So there's some bad behavior in the lossy-bitmap stuff
> that's exposed by one case but not the other.

Meh.  I was overthinking it.  A bit of investigation with oprofile exposed
the true cause of the problem: whenever the bitmap goes lossy, we have to
execute the "recheck" condition for each tuple in the page(s) that the
bitmap has a lossy reference to.  So in the fast case we are talking about

    Recheck Cond: ((assay1_ic50 > 90::double precision) AND (assay2_ic50 < 10::double precision))

which involves little except pulling the float8 values out of the tuple
and executing float8gt and float8lt.  In the slow case we have got

    Recheck Cond: ((((data ->> 'assay1_ic50'::text))::double precision > 90::double precision) AND (((data ->> 'assay2_ic50'::text))::double precision < 10::double precision))

which means we have to pull the JSONB value out of the tuple, search
it to find the 'assay1_ic50' key, convert the associated value to text
(which is not exactly cheap, because *the value is stored as a numeric*),
then reparse that text string into a float8, after which we can use
float8gt.  And then probably do an equivalent amount of work on the way
to making the other comparison.

So this says nothing much about the lossy-bitmap code, and a lot about
how the JSONB code isn't very well optimized yet.  In particular, the
decision not to provide an operator that could extract a numeric field
without conversion to text is looking pretty bad here.

For reference, the oprofile results down to the 1% level for the jsonb
query:

    samples  %        symbol name
    7646      8.1187  get_str_from_var
    7055      7.4911  AllocSetAlloc
    4447      4.7219  AllocSetCheck
    4000      4.2473  BitmapHeapNext
    3945      4.1889  lengthCompareJsonbStringValue
    3713      3.9425  findJsonbValueFromContainer
    3637      3.8618  ExecMakeFunctionResultNoSets
    3624      3.8480  hash_search_with_hash_value
    3452      3.6654  cstring_to_text
    2993      3.1780  slot_deform_tuple
    2566      2.7246  jsonb_object_field_text
    2225      2.3625  palloc
    2176      2.3105  heap_tuple_untoast_attr
    1993      2.1162  AllocSetReset
    1926      2.0451  findJsonbValueFromContainerLen
    1846      1.9601  GetPrivateRefCountEntry
    1563      1.6596  float8gt
    1486      1.5779  float8in
    1477      1.5683  InputFunctionCall
    1365      1.4494  getJsonbOffset
    1137      1.2073  slot_getattr
    1083      1.1500  init_var_from_num
    1058      1.1234  ExecEvalConst
    1056      1.1213  float8_cmp_internal
    1053      1.1181  cstring_to_text_with_len
    1032      1.0958  text_to_cstring
    988       1.0491  ExecClearTuple
    969       1.0289  ResourceOwnerForgetBuffer

and for the other:

    samples  %        symbol name
    14010    12.1898  BitmapHeapNext
    13479    11.7278  hash_search_with_hash_value
    8201      7.1355  GetPrivateRefCountEntry
    7524      6.5465  slot_deform_tuple
    6091      5.2997  ExecMakeFunctionResultNoSets
    4459      3.8797  ExecClearTuple
    4456      3.8771  slot_getattr
    3876      3.3724  ExecStoreTuple
    3112      2.7077  ReleaseBuffer
    3086      2.6851  float8_cmp_internal
    2890      2.5145  ExecQual
    2794      2.4310  HeapTupleSatisfiesMVCC
    2737      2.3814  float8gt
    2130      1.8533  ExecEvalScalarVarFast
    2102      1.8289  IncrBufferRefCount
    2100      1.8272  ResourceOwnerForgetBuffer
    1896      1.6497  hash_any
    1752      1.5244  ResourceOwnerRememberBuffer
    1567      1.3634  DatumGetFloat8
    1543      1.3425  ExecEvalConst
    1486      1.2929  LWLockAcquire
    1454      1.2651  _bt_checkkeys
    1424      1.2390  check_stack_depth
    1374      1.1955  ResourceOwnerEnlargeBuffers
    1354      1.1781  pgstat_end_function_usage
    1164      1.0128  tbm_iterate
    1158      1.0076  CheckForSerializableConflictOut

Just to add insult to injury, this is only counting cycles in postgres
proper; it appears that in the jsonb case 30% of the overall runtime is
spent in strtod() :-(

			regards, tom lane
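For reference, a minimal sketch of the two query shapes being compared above; the table name "assays" and the EXPLAIN options are assumptions for illustration, not commands taken from the thread:

    -- Fast case: native float8 columns; the recheck only calls float8gt/float8lt.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*) FROM assays
    WHERE assay1_ic50 > 90 AND assay2_ic50 < 10;

    -- Slow case: each rechecked row must detoast the jsonb, look up the key,
    -- print the stored numeric as text, then reparse that text via float8in/strtod.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*) FROM assays
    WHERE (data ->> 'assay1_ic50')::float8 > 90
      AND (data ->> 'assay2_ic50')::float8 < 10;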
[PERFORM] Re: [SQL] querying with index on jsonb slower than standard column. Why?
On 12/08/2014 12:53 PM, Tom Lane wrote:
> I wrote:
>> Adrian Klaver writes:
>>> Seems work_mem is the key:
>
>> Fascinating.  So there's some bad behavior in the lossy-bitmap stuff
>> that's exposed by one case but not the other.
>
> Meh.  I was overthinking it.  A bit of investigation with oprofile exposed
> the true cause of the problem: whenever the bitmap goes lossy, we have to
> execute the "recheck" condition for each tuple in the page(s) that the
> bitmap has a lossy reference to.  So in the fast case we are talking about
>
>     Recheck Cond: ((assay1_ic50 > 90::double precision) AND (assay2_ic50 < 10::double precision))
>
> which involves little except pulling the float8 values out of the tuple
> and executing float8gt and float8lt.  In the slow case we have got
>
>     Recheck Cond: ((((data ->> 'assay1_ic50'::text))::double precision > 90::double precision) AND (((data ->> 'assay2_ic50'::text))::double precision < 10::double precision))
>
> which means we have to pull the JSONB value out of the tuple, search
> it to find the 'assay1_ic50' key, convert the associated value to text
> (which is not exactly cheap, because *the value is stored as a numeric*),
> then reparse that text string into a float8, after which we can use
> float8gt.  And then probably do an equivalent amount of work on the way
> to making the other comparison.
>
> So this says nothing much about the lossy-bitmap code, and a lot about
> how the JSONB code isn't very well optimized yet.  In particular, the
> decision not to provide an operator that could extract a numeric field
> without conversion to text is looking pretty bad here.

I think I understand the above. I redid the test on my 32-bit machine,
setting work_mem = 16MB, and I got results comparable to what I saw on the
64-bit machine. So what I am still puzzled by is why work_mem seems to make
the two paths equivalent in time:

    Fast case, assay1_ic50 > 90 and assay2_ic50 < 10:                                   1183.997 ms
    Slow case, (data->>'assay1_ic50')::float > 90 and (data->>'assay2_ic50')::float < 10: 1190.187 ms

> 			regards, tom lane

-- 
Adrian Klaver
adrian.kla...@aklaver.com
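A sketch of the per-session setup behind a re-test like the one described above; the 16MB figure comes from the message, everything else is assumed rather than Adrian's exact commands:

    SHOW work_mem;          -- stock default is 4MB
    SET work_mem = '16MB';  -- large enough here that the tid bitmap stays exact
    \timing on
    -- then run the float8-column query and the jsonb ->> query back to back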
Re: [PERFORM] [SQL] querying with index on jsonb slower than standard column. Why?
Adrian Klaver writes:
> I redid the test on my 32-bit machine, setting work_mem = 16MB, and I got
> results comparable to what I saw on the 64-bit machine. So what I am still
> puzzled by is why work_mem seems to make the two paths equivalent in time:

If work_mem is large enough that we never have to go through
tbm_lossify(), then the recheck condition will never be executed, so its
speed doesn't matter.

(So the near-term workaround for Tim is to raise work_mem when working
with tables of this size.)

			regards, tom lane
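One way to see which regime a given run is in (a sketch: 9.4's EXPLAIN ANALYZE reports exact vs. lossy heap-block counts for bitmap heap scans; the table name and query are assumptions):

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*) FROM assays
    WHERE (data ->> 'assay1_ic50')::float8 > 90
      AND (data ->> 'assay2_ic50')::float8 < 10;
    -- In the Bitmap Heap Scan node, look for a line like
    --   Heap Blocks: exact=12345 lossy=0
    -- A nonzero "lossy" count means tbm_lossify() fired and the recheck
    -- condition was evaluated for every tuple on those pages.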
[PERFORM] Re: [SQL] querying with index on jsonb slower than standard column. Why?
On 12/08/2014 01:22 PM, Tom Lane wrote:
> Adrian Klaver writes:
>> I redid the test on my 32-bit machine, setting work_mem = 16MB, and I got
>> results comparable to what I saw on the 64-bit machine. So what I am still
>> puzzled by is why work_mem seems to make the two paths equivalent in time:
>
> If work_mem is large enough that we never have to go through
> tbm_lossify(), then the recheck condition will never be executed, so its
> speed doesn't matter.

Aah, peeking into tidbitmap.c is enlightening. Thanks.

> (So the near-term workaround for Tim is to raise work_mem when working
> with tables of this size.)
>
> 			regards, tom lane

-- 
Adrian Klaver
adrian.kla...@aklaver.com
[PERFORM] Re: [SQL] querying with index on jsonb slower than standard column. Why?
On 08/12/2014 18:14, Adrian Klaver wrote:
>     Recheck Cond: ((((data ->> 'assay1_ic50'::text))::double precision > 90::double precision) AND (((data ->> 'assay2_ic50'::text))::double precision < 10::double precision))
>
> which means we have to pull the JSONB value out of the tuple, search
> it to find the 'assay1_ic50' key, convert the associated value to text
> (which is not exactly cheap, because *the value is stored as a numeric*),
> then reparse that text string into a float8, after which we can use
> float8gt.  And then probably do an equivalent amount of work on the way
> to making the other comparison.
>
> So this says nothing much about the lossy-bitmap code, and a lot about
> how the JSONB code isn't very well optimized yet.  In particular, the
> decision not to provide an operator that could extract a numeric field
> without conversion to text is looking pretty bad here.

Yes, that bit seemed strange to me. As I understand it, the value is stored
internally as numeric, but the only way to access it is as text, which then
has to be cast back to a number. I *think* this is the only way to do it at
present?

Tim
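For illustration, a sketch of the extraction paths available in the release under discussion; the table and key names are assumptions carried over from the thread's example:

    -- data -> 'assay1_ic50' yields a jsonb value, not a SQL number, so getting
    -- something comparable means going through the text output of ->> :
    SELECT (data ->> 'assay1_ic50')::float8  FROM assays LIMIT 1;  -- numeric -> text -> float8
    SELECT (data ->> 'assay1_ic50')::numeric FROM assays LIMIT 1;  -- numeric -> text -> numeric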
Re: [PERFORM] intel s3500 -- hot stuff
On Sat, Dec 6, 2014 at 7:08 AM, Bruce Momjian wrote:
> On Wed, Nov 5, 2014 at 12:09:16PM -0600, Merlin Moncure wrote:
>> effective_io_concurrency 1:   46.3 sec, ~ 170 mb/sec peak via iostat
>> effective_io_concurrency 2:   49.3 sec, ~ 158 mb/sec peak via iostat
>> effective_io_concurrency 4:   29.1 sec, ~ 291 mb/sec peak via iostat
>> effective_io_concurrency 8:   23.2 sec, ~ 385 mb/sec peak via iostat
>> effective_io_concurrency 16:  22.1 sec, ~ 409 mb/sec peak via iostat
>> effective_io_concurrency 32:  20.7 sec, ~ 447 mb/sec peak via iostat
>> effective_io_concurrency 64:  20.0 sec, ~ 468 mb/sec peak via iostat
>> effective_io_concurrency 128: 19.3 sec, ~ 488 mb/sec peak via iostat
>> effective_io_concurrency 256: 19.2 sec, ~ 494 mb/sec peak via iostat
>>
>> Did not see consistent measurable gains > 256 effective_io_concurrency.
>> Interesting that a setting of '2' (the lowest possible setting with the
>> feature actually working) is pessimal.
>
> Very interesting.  When we added a per-tablespace random_page_cost,
> there was a suggestion that we might want to add per-tablespace
> effective_io_concurrency someday:

What I'd really like to see is to have effective_io_concurrency work on
other types of scans. It's clearly a barn burner on fast storage, and
perhaps the default should be something other than '1'. Spinning storage
is clearly dead, and SSDs seem to really benefit from the posix readahead
API.

merlin
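A sketch of how a sweep like the one quoted above can be reproduced per session; the table, query, and values shown are assumptions rather than Merlin's actual test, and the setting only influences bitmap heap scans at this point:

    SET effective_io_concurrency = 32;   -- prefetch depth hint for bitmap heap scans
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*) FROM big_table
    WHERE some_col BETWEEN 1000 AND 50000;   -- planned as a bitmap index + heap scan
    -- repeat with 1, 2, 4, ... 256 and compare runtimes against iostat throughput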