Hi, I just checked dmesg.
The segfault I wrote about is the only one I see, dated Nov 24 last year.
Since then no other segfaults have happened, although dsa_allocate failures
happen daily.
I'll report if anything occurs.
I have the core dumping setup in place.
--
regards,
pozdrawiam,
Jakub Glapa
Hi,
On Mon, Nov 26, 2018 at 09:52:07AM -0600, Justin Pryzby wrote:
> Hi, thanks for following through.
>
> On Mon, Nov 26, 2018 at 04:38:35PM +0100, Jakub Glapa wrote:
> > I had a look at dmesg and indeed I see something like:
> >
> > postgres[30667]: segfault at 0 ip 557834264b16 sp 7ff
On Mon, Feb 04, 2019 at 08:31:47PM +, Arne Roland wrote:
> I could take a backup and restore the relevant tables on a throwaway system.
> You are just suggesting to replace line 728
> elog(FATAL,
> "dsa_allocate could not find %zu free
> pages", npages);
> by
It's definitely a relatively complex pattern. The query I sent you last
time was minimal with respect to predicates (so removing any single one of the
predicates converted that one into a working query).
> Huh. Ok well that's a lot more frequent than I thought. Is it always the
> same q
On Mon, Feb 4, 2019 at 6:52 PM Jakub Glapa wrote:
> I see the error showing up every night on 2 different servers. But it's a bit
> of a heisenbug because if I go there now it won't be reproducible.
Huh. Ok well that's a lot more frequent than I thought. Is it always
the same query? Any chanc
Hi Thomas,
I was one of the reporters in early Dec last year.
I somehow dropped the ball and forgot about the issue.
Anyhow I upgraded the clusters to pg11.1 and nothing changed. I also have a
rule to coredump but a segfault does not happen while this is occurring.
I see the error showing up eve
On Thu, Jan 31, 2019 at 06:19:54PM +, Arne Roland wrote:
> this is reproducible, while it's highly sensitive to the change of plans
> (i.e. the precise queries that do break change with every new analyze).
> Disabling parallel query seems to solve the problem (as expected).
> At some point eve
Hi Thomas,
It is a production system and we don't have permanent access to it.
Also, having auto_explain always on is not an option in production.
I will ask the customer to notify us as soon as the error presents itself so
we can connect immediately and try to get a query plan.
Regards
Fabio
On Tue, Jan 29, 2019 at 10:32 PM Fabio Isabettini
wrote:
> we are facing a similar issue on a production system running PostgreSQL 10.6:
>
> org.postgresql.util.PSQLException: ERROR: EXCEPTION on getstatistics ; ID:
> EXCEPTION on getstatistics_media ; ID: uidatareader.
> run_query_media(2): [a1
Hello,
we are facing a similar issue on a production system running PostgreSQL 10.6:
org.postgresql.util.PSQLException: ERROR: EXCEPTION on getstatistics ; ID:
EXCEPTION on getstatistics_media ; ID: uidatareader.
run_query_media(2): [a1] REMOTE FATAL: dsa_allocate could not find 7 free pages
T
On Tue, Jan 29, 2019 at 2:50 AM Arne Roland wrote:
> does anybody have any idea what goes wrong here? Is there some additional
> information that could be helpful?
Hi Arne,
This seems to be a bug; that error should not be reached. I wonder if
it is a different manifestation of the bug reported
Hello,
does anybody have any idea what goes wrong here? Is there some additional
information that could be helpful?
All the best
Arne Roland
On 2018-Nov-26, Jakub Glapa wrote:
> Justin thanks for the information!
> I'm running Ubuntu 16.04.
> I'll try to prepare for the next crash.
> Couldn't find anything this time.
As I recall, the apport stuff in Ubuntu is terrible ... I've seen it
take 40 minutes to write the crash dump to disk,
Justin thanks for the information!
I'm running Ubuntu 16.04.
I'll try to prepare for the next crash.
Couldn't find anything this time.
--
regards,
Jakub Glapa
On Mon, Nov 26, 2018 at 4:52 PM Justin Pryzby wrote:
> Hi, thanks for following through.
>
> On Mon, Nov 26, 2018 at 04:38:35PM +0100,
Hi, thanks for following through.
On Mon, Nov 26, 2018 at 04:38:35PM +0100, Jakub Glapa wrote:
> I had a look at dmesg and indeed I see something like:
>
> postgres[30667]: segfault at 0 ip 557834264b16 sp 7ffc2ce1e030
> error 4 in postgres[557833db7000+6d5000]
That's useful, I think "at
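The dmesg line quoted above can be decoded mechanically. The following is a minimal sketch, not something posted in the thread: it assumes the x86 Linux kernel log format shown in the quote, reads the "error" field as the x86 page-fault error code (bit 0 = protection fault, bit 1 = write, bit 2 = user mode), and assumes the bracketed address is the base of the mapping that contains the instruction pointer. The function name decode_segfault and the dictionary keys are made up for illustration.

```python
import re

# Pattern for the x86 Linux "segfault at ..." kernel log line quoted in the thread.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]:\s+segfault at (?P<addr>[0-9a-f]+)\s+"
    r"ip (?P<ip>[0-9a-f]+)\s+sp (?P<sp>[0-9a-f]+)\s+"
    r"error (?P<err>[0-9a-f]+)\s+in (?P<obj>[^\[\s]+)"
    r"\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return the decoded fields of a segfault log line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "process": m["comm"],
        "pid": int(m["pid"]),
        "fault_address": int(m["addr"], 16),   # 0 means a NULL pointer access
        "offset_in_object": ip - base,         # ip relative to the mapping base
        "protection_fault": bool(err & 1),     # False: the page was not present
        "write_access": bool(err & 2),         # False: the access was a read
        "user_mode": bool(err & 4),            # True: fault happened in user space
    }


# The line from the thread, joined onto a single line.
line = ("postgres[30667]: segfault at 0 ip 557834264b16 sp 7ffc2ce1e030 "
        "error 4 in postgres[557833db7000+6d5000]")
info = decode_segfault(line)
print(hex(info["offset_in_object"]))  # 0x4adb16
```

With the exact binary and matching debug symbols, the resulting offset could be given to a tool such as addr2line to look up the faulting function. Whether the offset maps directly onto the on-disk binary depends on how the kernel reports the mapping, so treat it as a starting point rather than a definitive address.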
Sorry, the message was sent out too early.
So, the issue occurs only on the production db and right now I cannot reproduce
it.
I had a look at dmesg and indeed I see something like:
postgres[30667]: segfault at 0 ip 557834264b16 sp 7ffc2ce1e030
error 4 in postgres[557833db7000+6d5000]
and AFAI
So, the issue occurs only on the production db and right now I cannot reproduce
it.
I had a look at dmesg and indeed I see something like:
--
regards,
pozdrawiam,
Jakub Glapa
On Fri, Nov 23, 2018 at 5:10 PM Justin Pryzby wrote:
> On Fri, Nov 23, 2018 at 03:31:41PM +0100, Jakub Glapa wrote:
> > Hi
On Fri, Nov 23, 2018 at 03:31:41PM +0100, Jakub Glapa wrote:
> Hi Justin, I've upgraded to 10.6 but the error still shows up:
>
> If I set it to max_parallel_workers=0 I also get and my connection is being
> closed (but the server is alive):
>
> psql db@host as user => set max_parallel_workers=0;
Hi Justin, I've upgraded to 10.6 but the error still shows up:
psql db@host as user => select version();
version
──
On Wed, Nov 21, 2018 at 03:26:42PM +0100, Jakub Glapa wrote:
> Looks like my email didn't match the right thread:
> https://www.postgresql.org/message-id/flat/CAMAYy4%2Bw3NTBM5JLWFi8twhWK4%3Dk_5L4nV5%2BbYDSPu8r4b97Zg%40mail.gmail.com
> Any chance to get some feedback on this?
In the related thread
Looks like my email didn't match the right thread:
https://www.postgresql.org/message-id/flat/CAMAYy4%2Bw3NTBM5JLWFi8twhWK4%3Dk_5L4nV5%2BbYDSPu8r4b97Zg%40mail.gmail.com
Any chance to get some feedback on this?
--
regards,
Jakub Glapa
On Tue, Nov 13, 2018 at 2:08 PM Jakub Glapa wrote:
> Hi, I'm
Hi, I'm also experiencing the problem: dsa_allocate could not find 7 free
pages CONTEXT: parallel worker
I'm running: PostgreSQL 10.5 (Ubuntu 10.5-1.pgdg16.04+1) on
x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0
20160609, 64-bit
query plan: (select statement over pare
On Wed, Aug 29, 2018 at 5:48 PM Sand Stone wrote:
> I attached a query (and its query plan) that caused the crash: "dsa_allocate
> could not find 13 free pages" on one of the worker nodes. I anonymised the
> query text a bit. Interestingly, this time only one (same one) of the nodes
> is crash
I attached a query (and its query plan) that caused the crash:
"dsa_allocate could not find 13 free pages" on one of the worker nodes. I
anonymised the query text a bit. Interestingly, this time only one (same
one) of the nodes is crashing. Since this is a production environment, I
cannot get the
>Can you still see the problem with Citus 7.4?
Hi, Thomas. I actually went back to the cluster with Citus 7.4 and
PG10.4. And modified the parallel param. So far, I haven't seen any
server crash.
The main difference between crashes observed and no crash is the set
of Linux TCP time out parameters
On Thu, Aug 16, 2018 at 8:32 AM, Sand Stone wrote:
> Just as a follow up. I tried the parallel execution again (in a stress
> test environment). Now the crash seems gone. I will keep an eye on
> this for the next few weeks.
Thanks for the report. That's great news, but it'd be good to
understand
Just as a follow up. I tried the parallel execution again (in a stress
test environment). Now the crash seems gone. I will keep an eye on
this for the next few weeks.
My theory is that the Citus cluster created and shut down a lot of TCP
connections between coordinator and workers. If running on u
>> At which commit ID?
83fcc615020647268bb129cbf86f7661feee6412 (5/6)
>>do you mean that these were separate PostgreSQL clusters, and they were all
>>running the same query and they all crashed like this?
A few worker nodes, a table is hash partitioned by "aTable.did" by
Citus, and further partit
On Wed, May 23, 2018 at 4:10 PM, Sand Stone wrote:
>>>dsa_allocate could not find 7 free pages
> I just got this error message again on all of my worker nodes (I am using
> Citus 7.4 rel). The PG core is my own build of release_10_stable
> (10.4) out of GitHub on Ubuntu.
At which commit ID?
All of y
>>dsa_allocate could not find 7 free pages
I just got this error message again on all of my worker nodes (I am using
Citus 7.4 rel). The PG core is my own build of release_10_stable
(10.4) out of GitHub on Ubuntu.
What's the best way to debug this? I am running pre-production tests
for the next few da
If I do a "set max_parallel_workers_per_gather=0;" before I run the query
in that session, it runs just fine.
If I set it to 2, the query dies with the dsa_allocate error.
I'll use that as a workaround until 10.2 comes out. Thanks! I have
something that will help.
On Mon, Jan 29, 2018 at 3:52
On Tue, Jan 30, 2018 at 5:37 AM, Tom Lane wrote:
> Rick Otten writes:
>> I'm wondering if there is anything I can tune in my PG 10.1 database to
>> avoid these errors:
>
>> $ psql -f failing_query.sql
>> psql:failing_query.sql:46: ERROR: dsa_allocate could not find 7 free pages
>> CONTEXT: par
Rick Otten writes:
> I'm wondering if there is anything I can tune in my PG 10.1 database to
> avoid these errors:
> $ psql -f failing_query.sql
> psql:failing_query.sql:46: ERROR: dsa_allocate could not find 7 free pages
> CONTEXT: parallel worker
Hmm. There's only one place in the source c