On Wed, Dec 04, 2024 at 08:29:57PM +0100, Peter Eisentraut wrote:
> On 29.11.24 21:39, Noah Misch wrote:
> > > I remembered where that's documented:
> > >
> > > https://wiki.postgresql.org/wiki/Committing_checklist#Maintaining_ABI_compatibility_while_backpatching
> > I'd say the source tree's ,
>
Peter Eisentraut writes:
> On 29.11.24 21:39, Noah Misch wrote:
>> I'd say the source tree's ,
>> which
>> David Wheeler mentioned, is authoritative. It currently matches
>> postgr.es/c/e54a42a. Let's update both or change that wiki heading to point
>> to the authoritative doc section.
> I don
On 29.11.24 21:39, Noah Misch wrote:
I remembered where that's documented:
https://wiki.postgresql.org/wiki/
Committing_checklist#Maintaining_ABI_compatibility_while_backpatching
I'd say the source tree's , which
David Wheeler mentioned, is authoritative. It currently matches
postgr.es/c/e54a
On Nov 29, 2024, at 15:39, Noah Misch wrote:
> Then packagers who are taking the risk of not rebuilding every time will have
> 3 months to prepare, not the 3 days we're currently giving. The point about
> "well-known extensions" is based on my practice of grepping PGXN. That would
> not have fo
On Mon, Nov 25, 2024 at 08:56:50PM -0500, Tom Lane wrote:
> [ getting back to the document-ABI-breakage-rules-better topic ... ]
>
> I wrote:
> > That text says exactly nothing about what specific code changes to
> > make or not make. I'm not sure offhand where (or if) we have this
> > documented
Matthias van de Meent writes:
> I would also add something like the following, to protect against
> corruption and deserialization errors of the catalog pg_node_tree
> fields, because right now the generated readfuncs.c code ignores field
> names, assumes ABI field order, and is generally quite se
Bertrand Drouvot writes:
> On Mon, Nov 25, 2024 at 08:56:50PM -0500, Tom Lane wrote:
>> ... (But consider both 32-bit and 64-bit cases when
>> deciding what is "padding".)
> What about providing a decision table to help considering for 32-bit,
> something
> like (proposed in [1])?
> 64-bit hol
On Nov 25, 2024, at 20:56, Tom Lane wrote:
> None of this is a substitute for installing some kind of ABI-checking
> infrastructure; but that project doesn't seem to be moving fast.
These sound great, very useful. I wonder if the guideline docs Peter added
should link to these items (and back).
On Tue, 26 Nov 2024 at 02:57, Tom Lane wrote:
>
> [ getting back to the document-ABI-breakage-rules-better topic ... ]
>
> I wrote:
> > That text says exactly nothing about what specific code changes to
> > make or not make. I'm not sure offhand where (or if) we have this
> > documented, but ther
Hi,
On Mon, Nov 25, 2024 at 08:56:50PM -0500, Tom Lane wrote:
> [ getting back to the document-ABI-breakage-rules-better topic ... ]
>
> I wrote:
> > That text says exactly nothing about what specific code changes to
> > make or not make. I'm not sure offhand where (or if) we have this
> > docum
[ getting back to the document-ABI-breakage-rules-better topic ... ]
I wrote:
> That text says exactly nothing about what specific code changes to
> make or not make. I'm not sure offhand where (or if) we have this
> documented, but there's an idea that adding fields at the end of
> a struct is s
Christoph Berg writes:
> Re: Noah Misch
>> I'm no longer arguing against the undo, but I'm trying to explain the
>> consequences rigorously.
> I'll just have to explain what happened here to the Debian security
> team who just released the 16.5 update. We can do 16.6 on top of that.
Fixes are pu
On Nov 15, 2024, at 19:30, Tom Lane wrote:
> That text says exactly nothing about what specific code changes to
> make or not make. I'm not sure offhand where (or if) we have this
> documented, but there's an idea that adding fields at the end of
> a struct is safer ABI-wise than putting them in
Re: Noah Misch
> I'm no longer arguing against the undo, but I'm trying to explain the
> consequences rigorously.
I'll just have to explain what happened here to the Debian security
team who just released the 16.5 update. We can do 16.6 on top of that.
Christoph
Noah Misch writes:
> Currently, we have Christoph Berg writing "I'd say the ship has sailed, a new
> release would now break things the other way round." and you writing in favor
> of undoing. It think it boils down to whether you want N people to recompile
> twice or M>N people to recompile once
On Nov 15, 2024, at 16:13, Tom Lane wrote:
> In other words, our current guidelines
> for preserving ABI compatibility actually *created* this disaster,
> because the HEAD change was fine from an ABI standpoint but what
> was done in back branches was not. So we do need to rethink how
> that's w
"David E. Wheeler" writes:
> On Nov 15, 2024, at 16:13, Tom Lane wrote:
>> In other words, our current guidelines
>> for preserving ABI compatibility actually *created* this disaster,
>> because the HEAD change was fine from an ABI standpoint but what
>> was done in back branches was not. So we
On Fri, Nov 15, 2024 at 05:37:01PM -0500, Tom Lane wrote:
> Noah Misch writes:
> > Currently, we have Christoph Berg writing "I'd say the ship has sailed, a
> > new
> > release would now break things the other way round." and you writing in
> > favor
> > of undoing. It think it boils down to wh
On Fri, Nov 15, 2024 at 03:29:21PM -0500, Tom Lane wrote:
> Noah Misch writes:
> > On Fri, Nov 15, 2024 at 10:09:54AM -0500, Tom Lane wrote:
> >> I'm starting to lean to the opinion that we need a re-wrap.
>
> > Perhaps. Even if we do rewrap for some reason, it's not a given that
> > restoring t
After some quick checking with "git diff", I can confirm that
there is no ABI break for ResultRelInfo in v12 or v13. To fix
it in v14-v17, we could do this:
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 418c81f4be..17b0ec5138 100644
--- a/src/include/nodes/execn
Noah Misch writes:
> On Fri, Nov 15, 2024 at 10:09:54AM -0500, Tom Lane wrote:
>> I'm starting to lean to the opinion that we need a re-wrap.
> Perhaps. Even if we do rewrap for some reason, it's not a given that
> restoring the old struct size is net beneficial. If we restore the old struct
>
On Sat, Nov 16, 2024 at 12:10:06AM +0530, Pavan Deolasee wrote:
> On Fri, Nov 15, 2024 at 11:39 PM Pavan Deolasee
> wrote:
> > Looking more carefully at the usage of `ResultRelInfo` in the PGD code, I
> > think we might also be impacted by it. At one place, we loop through the
> > `es_result_rela
On Fri, Nov 15, 2024 at 11:39 PM Pavan Deolasee
wrote:
>
> Looking more carefully at the usage of `ResultRelInfo` in the PGD code, I
> think we might also be impacted by it. At one place, we loop through the
> `es_result_relations` array and a size mismatch there will cause problems.
> Interestin
On Fri, Nov 15, 2024 at 11:39:24PM +0530, Pavan Deolasee wrote:
> On Fri, Nov 15, 2024 at 11:22 PM Noah Misch wrote:
> > > I'm starting to lean to the opinion that we need a re-wrap.
> >
> > Perhaps. Even if we do rewrap for some reason, it's not a given that
> > restoring the old struct size is
On Nov 15, 2024, at 12:52, Noah Misch wrote:
> Either way, users of timescaledb should rebuild timescaledb for every future
> PostgreSQL minor release.
Ugh, was really hoping to get to a place where we could avoid requiring
rebuilds of any extension for every minor release. :-( We even added AB
On Fri, Nov 15, 2024 at 11:22 PM Noah Misch wrote:
>
>
> > I'm starting to lean to the opinion that we need a re-wrap.
>
> Perhaps. Even if we do rewrap for some reason, it's not a given that
> restoring the old struct size is net beneficial. If we restore the old
> struct
> size in v16.6, thos
On Fri, Nov 15, 2024 at 10:09:54AM -0500, Tom Lane wrote:
> Aleksander Alekseev writes:
> > Hi Macro,
> >> The problem here is that because TimescaleDB compiled against 17.0
> >> assumes a struct size of 376 (on my laptop) while PostgreSQL allocated
> >> the array with a struct size of 384, so the
Aleksander Alekseev writes:
> Hi Macro,
>> The problem here is that because TimescaleDB compiled against 17.0
>> assumes a struct size of 376 (on my laptop) while PostgreSQL allocated
>> the array with a struct size of 384, so the pointer math no longer
>> holds and the whichrel value becomes nons
On Fri, Nov 15, 2024 at 02:45:42PM +0100, Christoph Berg wrote:
> Re: Noah Misch
> > I'm hearing the only confirmed impact on non-assert builds is the need to
> > recompile timescaledb. (It's unknown whether recompiling will suffice for
> > timescaledb. For assert builds, six PGXN extensions need
Re: Noah Misch
> I'm hearing the only confirmed impact on non-assert builds is the need to
> recompile timescaledb. (It's unknown whether recompiling will suffice for
> timescaledb. For assert builds, six PGXN extensions need recompilation.)
Which 6 extensions are these?
I re-ran the tests on a
On Fri, Nov 15, 2024 at 12:51 PM Marco Slot wrote:
> On Fri, Nov 15, 2024 at 9:57 AM Aleksander Alekseev
> wrote:
> > I'm working with the TimescaleDB dev team to fix these issues on the
> > TimescaleDB side.
>
> I looked a bit at this out of interest. I see an assert failure in the
> lines belo
On Thu, Nov 14, 2024 at 11:41 PM Noah Misch wrote:
> On Thu, Nov 14, 2024 at 09:33:32PM +0100, Alvaro Herrera wrote:
> > On 2024-Nov-14, Tom Lane wrote:
> > > Alvaro Herrera writes:
> > > > But timescale did crash:
> > >
> > > So what is timescale doing differently?
>
> > src/ts_catalog/catalog.
On Thu, Nov 14, 2024 at 9:33 PM Alvaro Herrera
wrote:
> On 2024-Nov-14, Tom Lane wrote:
>
> > Alvaro Herrera writes:
> > > On 2024-Nov-14, Christoph Berg wrote:
> > >> It didn't actually crash, true.
> >
> > > But timescale did crash:
> >
> > So what is timescale doing differently?
>
> No idea.
On Thu, Nov 14, 2024 at 5:13 PM Peter Eisentraut
wrote:
> On 14.11.24 15:35, Noah Misch wrote:
> > The postgr.es/c/e54a42a standard would have us stop here. But I'm open
> to
> > treating the standard as mistaken and changing things.
>
> That text explicitly calls out that adding struct members
Hi Macro,
> I looked a bit at this out of interest. I see an assert failure in the
> lines below when running tests with TimescaleDB compiled against 17.0
> with 17.1 installed. Without the assertion it would anyway segfault
> below.
>
> /*
> * Usually, mt_lastResultIndex matches the targ
On Fri, Nov 15, 2024 at 9:57 AM Aleksander Alekseev
wrote:
> I'm working with the TimescaleDB dev team to fix these issues on the
> TimescaleDB side.
I looked a bit at this out of interest. I see an assert failure in the
lines below when running tests with TimescaleDB compiled against 17.0
with 1
On 2024-Nov-14, Noah Misch wrote:
> I'm hearing the only confirmed impact on non-assert builds is the need to
> recompile timescaledb. (It's unknown whether recompiling will suffice for
> timescaledb. For assert builds, six PGXN extensions need recompilation.) I
> don't see us issuing another s
Hi,
On Thu, Nov 14, 2024 at 11:35:25AM -0500, Peter Geoghegan wrote:
> On Thu, Nov 14, 2024 at 11:29 AM Alvaro Herrera
> wrote:
> > ISTM that we have spare bytes where we could place that boolean without
> > breaking ABI. In x86_64 there's a couple of 4-byte holes, but those
> > won't be there
Hi,
> > src/ts_catalog/catalog.c-extern TSDLLEXPORT ResultRelInfo *
> > src/ts_catalog/catalog.c-ts_catalog_open_indexes(Relation heapRel)
> > src/ts_catalog/catalog.c-{
> > src/ts_catalog/catalog.c- ResultRelInfo *resultRelInfo;
> > src/ts_catalog/catalog.c-
> > src/ts_catalog/catalog.c: resu
On Fri, Nov 15, 2024 at 1:00 AM Tom Lane wrote:
> Noah Misch writes:
> > It's not immediately to clear to me why this would crash in a non-asserts
> > build. palloc issues a 512-byte chunk for sizeof(ResultRelInfo)==368 on
> v16,
> > so I expect no actual writing past the end of the chunk.
>
>
On Thu, Nov 14, 2024 at 5:41 PM Noah Misch wrote:
> I'm hearing the only confirmed impact on non-assert builds is the need to
> recompile timescaledb. (It's unknown whether recompiling will suffice for
> timescaledb. For assert builds, six PGXN extensions need recompilation.)
>
That matches wh
Hi,
> The allocation should be big enough. The other
> hazard would be failing to initialize the field, but if the extension
> uses InitResultRelInfo then that's taken care of.
> So what is timescale doing differently?
I see 3 usages of makeNode(ResultRelInfo) in Timescale:
- src/nodes/chunk_d
On Thu, Nov 14, 2024 at 09:33:32PM +0100, Alvaro Herrera wrote:
> On 2024-Nov-14, Tom Lane wrote:
> > Alvaro Herrera writes:
> > > But timescale did crash:
> >
> > So what is timescale doing differently?
> src/ts_catalog/catalog.c-extern TSDLLEXPORT ResultRelInfo *
> src/ts_catalog/catalog.c-ts_
Aleksander Alekseev writes:
> The third case is the following:
> ```
> extern TSDLLEXPORT ResultRelInfo *
> ts_catalog_open_indexes(Relation heapRel)
> {
> ResultRelInfo *resultRelInfo;
> resultRelInfo = makeNode(ResultRelInfo);
> resultRelInfo->ri_RangeTableIndex = 0; /* dummy */
>
Hi Alvaro,
> There are three makeNode(ResultRelInfo), but they don't look anything
> out of the ordinary:
> [...]
> src/ts_catalog/catalog.c-extern TSDLLEXPORT ResultRelInfo *
> src/ts_catalog/catalog.c-ts_catalog_open_indexes(Relation heapRel)
> src/ts_catalog/catalog.c-{
> src/ts_catalog/catalog
On 2024-Nov-14, Tom Lane wrote:
> Alvaro Herrera writes:
> > On 2024-Nov-14, Christoph Berg wrote:
> >> It didn't actually crash, true.
>
> > But timescale did crash:
>
> So what is timescale doing differently?
No idea. Maybe a stacktrace from a core would tell us more; but the
jenkins task d
Alvaro Herrera writes:
> On 2024-Nov-14, Christoph Berg wrote:
>> It didn't actually crash, true.
> But timescale did crash:
So what is timescale doing differently?
regards, tom lane
On 2024-Nov-14, Christoph Berg wrote:
> It didn't actually crash, true.
But timescale did crash:
https://pgdgbuild.dus.dg-i.net/job/timescaledb-autopkgtest/9/architecture=amd64,distribution=sid/console
11:59:51 @@ -34,6 +40,7 @@
11:59:51 CREATE INDEX ON _hyper_1_3_chunk(temp2);
11:59:51 -- IN
Christoph Berg writes:
> Re: Noah Misch
>> The non-asserts build passed. The asserts build failed with "+WARNING:
>> problem in alloc set ExecutorState: detected write past chunk" throughout the
>> diffs, but there were no crashes. (Note that AGE "make installcheck" creates
>> a temporary instal
Re: Noah Misch
> The non-asserts build passed. The asserts build failed with "+WARNING:
> problem in alloc set ExecutorState: detected write past chunk" throughout the
> diffs, but there were no crashes. (Note that AGE "make installcheck" creates
> a temporary installation, unlike PostgreSQL "mak
Noah Misch writes:
> It's not immediately to clear to me why this would crash in a non-asserts
> build. palloc issues a 512-byte chunk for sizeof(ResultRelInfo)==368 on v16,
> so I expect no actual writing past the end of the chunk.
I'm confused too. The allocation should be big enough. The ot
On Thu, Nov 14, 2024 at 10:26:58AM -0800, Noah Misch wrote:
> On Thu, Nov 14, 2024 at 05:29:18PM +0100, Christoph Berg wrote:
> > Re: Noah Misch
> > > Based on a grep of PGXN code, here are some or all of the modules that
> > > react
> > > to sizeof(ResultRelInfo):
> > >
> > > $ grepx -r 'lloc.*R
On Thu, Nov 14, 2024 at 05:29:18PM +0100, Christoph Berg wrote:
> Re: Noah Misch
> > Based on a grep of PGXN code, here are some or all of the modules that react
> > to sizeof(ResultRelInfo):
> >
> > $ grepx -r 'lloc.*ResultRelInfo' | tee /tmp/1 | sed 's/-[^:]*/:/'|sort -u
> > apacheage::resul
On Thu, Nov 14, 2024 at 9:43 PM Peter Eisentraut
wrote:
> On 14.11.24 15:35, Noah Misch wrote:
> > The postgr.es/c/e54a42a standard would have us stop here. But I'm open
> to
> > treating the standard as mistaken and changing things.
>
> That text explicitly calls out that adding struct members
On Thu, Nov 14, 2024 at 11:29 AM Alvaro Herrera wrote:
> ISTM that we have spare bytes where we could place that boolean without
> breaking ABI. In x86_64 there's a couple of 4-byte holes, but those
> won't be there on x86, so not great candidates. Fortunately there are
> 3-byte and 7-byte holes
Hi,
> Right. makeNode(), palloc(sizeof), and stack allocation have that problem.
> Allocation wrappers like CreateExecutorState() avoid the problem. More
> generally, structs allocated in non-extension code are fine.
Perhaps we should consider adding an API like makeResultRelInfo() and
others,
On Thu, Nov 14, 2024 at 05:13:35PM +0100, Peter Eisentraut wrote:
> On 14.11.24 15:35, Noah Misch wrote:
> > The postgr.es/c/e54a42a standard would have us stop here. But I'm open to
> > treating the standard as mistaken and changing things.
>
> That text explicitly calls out that adding struct m
Timescale at least has many users, I don't know about the others. But
they won't be happy if we let things be. IMO we should rewrap ...
especially 12, since there won't be a next one.
ISTM that we have spare bytes where we could place that boolean without
breaking ABI. In x86_64 there's a coupl
Re: Noah Misch
> Based on a grep of PGXN code, here are some or all of the modules that react
> to sizeof(ResultRelInfo):
>
> $ grepx -r 'lloc.*ResultRelInfo' | tee /tmp/1 | sed 's/-[^:]*/:/'|sort -u
> apacheage::resultRelInfo = palloc(sizeof(ResultRelInfo));
Confirmed, crashing: AGE for 14..
On 14.11.24 15:35, Noah Misch wrote:
The postgr.es/c/e54a42a standard would have us stop here. But I'm open to
treating the standard as mistaken and changing things.
That text explicitly calls out that adding struct members at the end of
a struct is considered okay. But thinking about it now
Hi,
> To add to this list, Christoph Berg confirmed that timescaledb test suite
> crashes. [1]
Yes changing ResultRelInfo most definetely breaks TimescaleDB. The
extension uses makeNode(ResultRelInfo) and this can't end-up well:
```
static inline Node *
newNode(size_t size, NodeTag tag)
{
N
On Thu, 14 Nov 2024 at 16:35, Noah Misch wrote:
> Based on a grep of PGXN code, here are some or all of the modules that
> react
> to sizeof(ResultRelInfo):
>
To add to this list, Christoph Berg confirmed that timescaledb test suite
crashes. [1]
Regards,
Ants Aasma
[1]
https://pgdgbuild.dus.dg
On Thu, Nov 14, 2024 at 04:18:02PM +0530, Pavan Deolasee wrote:
> Commit 51ff46de29f67d73549b2858f57e77ada8513369 (backported all the way
> back to v12) added a new member to `ResultRelInfo` struct. This can
> potentially cause ABI breakage for the extensions that allocate the struct
> and pass it
63 matches
Mail list logo