From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov]
Sent: Monday, November 14, 2011 02:27 p.m.
To: 'Richard Huxton'; Anibal David Acosta; 'Sergey Konoplev'
CC: pgsql-performance@postgresql.org; 'Stephen Frost'
Subject: Re: [PERFORM] unlogged tables
"Anibal D
Hello, I have a postgres 9.0.2 installation.
Everything works fine, but at certain hours of the day I get several timeouts in my
application (my application waits X seconds before throwing a timeout).
Those are normally not hours of intensive use, so I think that autovacuum
could be the problem.
Is there any l
I have a couple of tables with about 400 million records, increasing by about
5 million per day.
I think that disabling autovacuum for those tables, and running a daily manual
vacuum (at some idle hour), would be better.
Am I right?
Is it possible to exclude some tables from autovacuum?
Thanks
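A per-table storage parameter exists for exactly this; a minimal sketch, using a
hypothetical table name:

-- disable autovacuum for this table only; note that anti-wraparound
-- vacuums will still run regardless of this setting
ALTER TABLE big_history SET (autovacuum_enabled = false);

-- then run the manual pass from a scheduled job during an idle hour:
VACUUM ANALYZE big_history;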
...@ringerc.id.au]
Sent: Monday, December 12, 2011 11:45 a.m.
To: Anibal David Acosta
CC: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] autovacuum, exclude table
Top-posting because this is context free:
You need to provide more info for anybody to help you. Are the tables
Hi,
yesterday I deleted about 200 million rows from a table (about 150 GB of data);
after the delete completed, the autovacuum process started.
Autovacuum has been running for about 11 hours but no space has been released.
Autovacuum parameters are at their default values in postgresql.conf.
The postgres version is
In my postgres log I see a lot of warnings like this:
WARNING: pgstat wait timeout
Approximately every 10 seconds since yesterday, after one year of working without
any warning.
I have postgres 9.0.3 on Windows Server 2008 R2.
I have only one big table with approx. 1,300,000,000 (yes, 1,300
More information.
After many "WARNING: pgstat wait timeout" in the log also appear "ERROR:
canceling autovacuum task "
From: Anibal David Acosta [mailto:a...@devshock.com]
Sent: Friday, July 27, 2012 06:04 p.m.
To: pgsql-performance@postgresq
Hi,
if I have a table from which about 8 million rows are deleted nightly (the table
holds maybe 9 million), is it recommended to run a vacuum analyze after the
delete completes, or can I leave this job to autovacuum?
This table is very active during the day but much less active at night.
I think that
on for the server
condition
Thanks!
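For a bulk delete of that scale, an explicit pass right after the batch job is a
common approach; a minimal sketch, table name hypothetical:

-- run immediately after the nightly bulk delete finishes:
VACUUM ANALYZE mytable;  -- marks dead-row space reusable and refreshes planner statistics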
-----Original Message-----
From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov]
Sent: Thursday, August 16, 2012 04:52 p.m.
To: Anibal David Acosta; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] best practice to avoid table bloat?
"Anib
Using explain analyze I saw that many of my queries run really fast, in less
than 1 millisecond; for example, the analyze output of a simple query over a
table with 5 million records returns "Total runtime: 0.078 ms".
But the real time is a lot more, about 15 ms; in fact pgAdmin shows this v
Hi,
I have a table with about 10 million records; this table is updated and
inserted into very often during the day (approx. 200 times per second). At
night the activity is much lower, so in the first seconds of the day
(00:00:01) a batch process updates some columns (used as counters) of this table.
-----Original Message-----
From: Claudio Freire [mailto:klaussfre...@gmail.com]
Sent: Friday, October 05, 2012 10:27 a.m.
To: Jeff Janes
CC: Anibal David Acosta; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] how to avoid deadlock on masive update with multiples
delete
On Thu, Oct 4, 2012 at 1
I have a table with a column of type timestamp with time zone; this column
has an index.
If I do a select like this:
select * from mytable where cast(my_date as timestamp without time zone) >
'2012-10-12 20:00:00'
will this query use the index on the my_date column?
Thanks
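A sketch of the usual workarounds, assuming the goal is to keep the index on
my_date usable (the cast's result depends on the session time zone, so a plain
index on my_date cannot match it):

-- comparing the indexed timestamptz column directly is sargable:
select * from mytable
where my_date > timestamp with time zone '2012-10-12 20:00:00';

-- or build an expression index with an explicit zone, which a
-- timezone-qualified predicate can then use:
create index on mytable ((my_date at time zone 'UTC'));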
.pgh.pa.us]
Sent: Friday, October 12, 2012 05:39 p.m.
To: Anibal David Acosta
CC: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Do cast affects index usage?
"Anibal David Acosta" writes:
> I have a table with a column of type timestamp with time zone, this
> c
On 12/05/2012 10:34 AM, Andrea Suisani wrote:
> [sorry for resuming an old thread]
>
> [cut]
>
>>>> Question is... will that remove the performance penalty of
>>>> HyperThreading?
>>>
>>> So I've added to my todo list to perform a test to verify this claim :)
>>
>> done.
>
> on this box:
>
>> in a
On 03/03/2013 03:16 PM, Josh Berkus wrote:
> Steven,
>
>> We saw the same performance problems when this new hardware was running
>> cent 6.3 with a 2.6.32-279.19.1.el6.x86_64 kernel and when it was matched
>> to the OS/kernel of the old hardware which was cent 5.8 with
>> a 2.6.18-308.11.1.el5 ke
but whether or not that is
significant enough to discard the advantages of triggers is something only
you can decide - ideally after testing.
David J.
How does something like:

WITH unreads AS (
    SELECT messageid FROM message
    EXCEPT
    SELECT messageid FROM message_property WHERE personid = 1 AND has_read
)
SELECT ...
FROM unreads
JOIN message USING (messageid);
perform?
David J.
= ANY(reader_ids)) ; UPDATE message SET reader_ids = reader_ids || 1 WHERE
messageid = ..." I'm not that familiar with how well indexes over arrays
work or which kind is needed (i.e. gin/gist).
HTH
David J.
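For the array side of that, a hedged sketch using the quoted message/reader_ids
names (note GIN's built-in array opclass supports containment (@>), not the
"= ANY(...)" spelling):

CREATE INDEX ON message USING gin (reader_ids);

-- a query the index can then serve:
SELECT * FROM message WHERE reader_ids @> ARRAY[1];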
when
"is_read" is true? Since you already allow for the possibility of a
missing record (giving it the meaning of "not read") these other
properties cannot currently exist in that situation.
David J.
multiple-column index
time should probably be the first listed field.
The planner figures being more selective and filtering is going to be faster
than scanning the much larger section of index covered by the stock_id(s)
and then going and fetching those pages and then checking them for
visibility.
effect given that the
symptoms are sporadic and we are only talking about a select statement that
returns a single row; and an update that does not hit any indexed column and
therefore benefits from "HOT" optimization.
HTH
David J.
atly if
you can afford it in your production environment - it would make looking for
internal concurrency much easier.
David J.
常超 wrote
> Hi, all
> I have a table to save received measure data.
>
>
> CREATE TABLE measure_data
> (
>     id serial NOT NULL,
>     telegram_id integer NOT NULL,
>     measure_time timestamp without time zone NOT NULL,
>     item_id integer NOT NULL,
>     val double precision,
>     CONSTRAINT measure_dat
ineffective as the write workload
> increased, because of internal lock contention.
Though based upon your question regarding parallel replication I am thinking
that maybe your concept of "group commit" and the one that was implemented
are quite different...
David J.
then discard
the prepared statement.
I do not know enough about the underlying data to draw a conclusion but
typically the higher the bind/prepare ratio the more efficient your use of
database resources. Same goes for the prepare ratio. The clients you use
and the general usage of the database he
unction.html
Note the "ROWS" property.
Functions are black-boxes to the planner so it has no means of estimating a
row count. So a set returning function uses 1,000 and all others use 1.
Determining "COST" is similarly problematic.
David J.
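A sketch of supplying those estimates yourself (function name and body
hypothetical):

CREATE FUNCTION recent_ids(p_limit int)
RETURNS SETOF integer
LANGUAGE sql
STABLE
ROWS 50    -- planner row estimate, replacing the default of 1000
COST 10    -- per-call cost estimate, replacing the default of 100
AS $$
    SELECT generate_series(1, p_limit);
$$;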
At a basic level it is unable to push down LIMIT into a WHERE clause
and it cannot add additional sub-queries that do not exist in the original
plan - which includes adding a UNION node.
David J.
johno wrote
> Thanks for the quick reply David!
>
> However I am still unsure how these two queries are not relationally
> equivalent. I am struggling to find a counterexample where the first and
> third query (in email, not in gist) would yield different results. Any
> ideas?
> BTW this is to my understanding a very similar scenario to how partitioned
> tables work and push down limit and where conditions. Why is this not
> possible in this case?
>
> Jano
>
>
> On Mon, Jul 21, 2014 at 11:54 PM, David G Johnston <
> david.g.johnston@
>
While both are quite small (0.002/0.004), the
300,000+ loops do add up. The same likely applies to the other planning
nodes but I didn't dig that deep.
David J.
n by the planner - and can provide data that the
developers can use to replicate the experiment - then improvements can be
made. At worst you will come to understand why the planner is right and can
then explore alternative models.
David J.
only available in supported versions
that is not an option for you. Still, it is the most likely explanation for
what you are seeing.
There is time involved to process the partition constraint exclusion but I'm
doubting it accounts for a full 3 seconds...
David J.
Better advice depends on context and
hardware.
You should also consider upgrading to a newer, supported, version of
PostgreSQL.
David J.
, and another import were to be attempted,
ideally the allocated space could be reused.
I'm not sure what a reasonable formula would be, especially at the TB
scale, but roughly 2x the size of the imported (uncompressed) file would be
a good starting point (table + WAL). You likely would want m
each twice - and note that (each(...).*) does not work to avoid the
double-call - you have to use a subquery / a CTE one to ensure that it is
not collapsed (offset 0 should work too but I find the CTE one a little
cleaner personally).
David J.
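A sketch of that CTE form, with hypothetical table/column names (items.attrs
assumed to be an hstore):

WITH kv AS (
    SELECT each(attrs) AS pair FROM items   -- each() is called once per row here
)
SELECT (pair).key, (pair).value FROM kv;    -- expanding the composite does not re-call it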
Huang, Suya wrote
> Can someone figure out why the first query runs so slow comparing to the
> second one? They generate the same result...
Try: EXPLAIN (ANALYZE, BUFFERS)
I believe you are only seeing caching effects.
David J.
will perform better since fewer rows
must be evaluated in the less efficient count(DISTINCT) expression - the
time saved there more than offsets the cost of effectively passing over
that subset of the data a second time.
HashAggregate(1M rows) + Aggregate(200k rows) < Aggregate(1M rows)
> GroupAggregate (cost=0.42..228012.62 rows=208120 width=15) (actual
> time=0.042..2089.367 rows=20 loops=1)
>Buffers: shared hit=53146
>-> Index Scan using t1_name on t1 (cos
o) will generate sub-optimal plans that can be
rewritten using relational algebra and better optimized for having done so.
But such work takes resources that would be expended for every single query
while manually rewriting the sub-optimal query solves the problem
once-and-for-all.
David J.
David G Johnston wrote
>
> Laurent Martelli wrote
>> On 20/10/2014 15:58, Tom Lane wrote:
>>> Laurent Martelli <
>> laurent.martelli@
>> > writes:
>>>> Do we agree that both queries are identical ?
>>> No, they *aren't* id
s query as well as pagination-oriented queries are two that
come to mind. I think the material would fit well in the tutorial section
but having some kind of quick synopsis and cross reference in the
performance chapter would aid someone who's looking to solve a problem and
not in general edu
personally choose only between having different databases for each
client or using a "client_id" column in conjunction with a multi-tenant
database. Those are the two logical models; everything else (e.g.
partitioning) is a physical implementation detail.
David J.
RAL. More detailed reports may at least bring
exposure to what is being used in the wild and garner interest from other
parties in improving things. Unfortunately this report is too limited to
really make a dent; lacking even the name of the ORM that is being used and
the entire queries that are bein
condition probably using only 1 or 2 columns
instead of all five.
I'm not familiar with the caching constraint or the data so it's hard to
make more specific suggestions.
David J.
Instead, not knowing whether there were changes since
the last checkpoint, the system truncated the relation.
What use case is there for a behavior where the data as of the last checkpoint
is left in the relation upon restarting - without knowing whether other data
could have been written subsequently?
David J.
On Mon, Apr 13, 2015 at 4:49 PM, Jeff Janes wrote:
> On Mon, Apr 13, 2015 at 1:49 PM, David G. Johnston <
> david.g.johns...@gmail.com> wrote:
>
>> On Monday, April 13, 2015, Matheus de Oliveira
>> wrote:
>>
>>>
>>> On Mon, Apr 13, 2015 at 4:31
On Mon, Apr 13, 2015 at 7:45 PM, Jim Nasby wrote:
> On 4/13/15 7:32 PM, David G. Johnston wrote:
>
> That particular use-case would probably best be served with a separate
>> replication channel which pushes data files from the primary to the
>> slaves and allows for t
not affected by month boundaries but do start on Monday.
David J.
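A sketch of computing such boundaries (date_trunc's 'week' starts on Monday,
the ISO convention):

-- Monday 00:00 of the current week:
SELECT date_trunc('week', now());
-- exclusive upper bound of the same week:
SELECT date_trunc('week', now()) + interval '7 days';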
On Thursday, May 21, 2015, Bosco Rama wrote:
> On 05/20/15 20:22, David G. Johnston wrote:
> > On Monday, May 18, 2015, er.tejaspate...@gmail.com <
> > er.tejaspate...@gmail.com > wrote:
> >
> >> If I have to find upcoming birthdays in current week a
you insert 273 rows at once, you are doing it
as 273 transactions instead of one?
--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ PGP-Key:166D840A 0C610C8B Registered Machine 1935521.
/( )\ Shrewsbury, New Jersey  http://linuxcounter.net
^^-^^ 09:00:01 up 3 days, 9:57
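A sketch of the difference, with a hypothetical table (outside an explicit
transaction each statement commits on its own):

-- one transaction for the whole batch instead of one per row:
BEGIN;
INSERT INTO mytable (a, b) VALUES (1, 'x');
INSERT INTO mytable (a, b) VALUES (2, 'y');
-- ... remaining rows ...
COMMIT;

-- or a single multi-row statement:
INSERT INTO mytable (a, b) VALUES (1, 'x'), (2, 'y');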
You should repost this directly and not through Nabble. It has wrapped
your code in raw tags which the PostgreSQL mailing list software strips.
On Wednesday, June 3, 2015, ben.play wrote:
> Hi all,
>
> We have a big database (more than 300 Gb) and we run a lot of queries each
> minute.
>
> Howe
changes are worse would take
effect after a reboot - though most are used on the very next query that
runs.
The vacuum would indeed likely account for the gains - there being
significantly fewer dead/invisible rows to have to scan over and discard
while retrieving the live rows that fulfill your query.
David J.
On Wed, Jul 15, 2015 at 12:16 PM, Robert DiFalco
wrote:
> First off I apologize if this is question has been beaten to death. I've
> looked around for a simple answer and could not find one.
>
> Given a database that will not have it's PKEY or indices modified, is it
> generally faster to INSERT
On Wednesday, July 15, 2015, Robert DiFalco
wrote:
> First off I apologize if this is question has been beaten to death. I've
> looked around for a simple answer and could not find one.
>
> Given a database that will not have it's PKEY or indices modified, is it
> generally faster to INSERT or UP
On Wed, Jul 15, 2015 at 1:56 PM, Robert DiFalco
wrote:
>
>
> On Wed, Jul 15, 2015 at 10:33 AM, David G. Johnston <
> david.g.johns...@gmail.com> wrote:
>
>> On Wednesday, July 15, 2015, Robert DiFalco
>> wrote:
>>
>>> First off I apologize if
Thus the update would likely end up writing an
entirely new record upon each event category recording.
David J.
On Wed, Jul 15, 2015 at 4:53 PM, Michael Nolan wrote:
> On Wed, Jul 15, 2015 at 3:16 PM, Robert DiFalco
> wrote:
>
>>
>> Thanks David, my example was a big simplification, but I appreciate your
>> guidance. The different event types have differing amounts of related d
the "UNION ALL" query you proposed.
David J.
On Fri, Aug 21, 2015 at 8:07 AM, Stephane Bailliez
wrote:
>
> On Thu, Aug 20, 2015 at 8:19 PM, David G. Johnston <
> david.g.johns...@gmail.com> wrote:
>
>>
>> SELECT [...]
>> FROM (SELECT reference_id, [...] FROM table_where_referenced_id_is_a_pk
Not sure if the planner could be smarter because you are asking a
question it is not particularly suited to estimating - namely cross-table
correlations. Rethinking the model is likely to give you a better outcome
long-term though it does seem like there should be room for improvement
within the stated query and model.
As Tomas said you likely will benefit from increased working memory in
order to make materializing and hashing/bitmapping favorable compared to a
nested loop.
David J.
ified. I'm not sure why the nested loop executor is
not intelligent enough to do this...
The important number in these plans is "loops", not "rows"
David J.
gt; slow to use? Is it because it involves the *date()* function call that it
> makes it difficult for the planner to guess the data distribution in the
> DOCUMENT table?
>
What happens if you pre-compute the date condition and hard code it?
David J.
Partitioning data has to be injected into the
query explicitly so that it is already in place before the planner receives
the query. Anything within the query requiring "execution" is handled by
the executor and at that point the chance to exclude partitions has come
and gone.
David J.
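A sketch of the distinction under constraint exclusion (hypothetical
date-partitioned table):

-- a literal is visible at plan time, so non-matching partitions are excluded:
SELECT * FROM measurements WHERE logdate >= DATE '2016-01-01';

-- an expression evaluated at execution time is not, so every partition is scanned:
SELECT * FROM measurements WHERE logdate >= now() - interval '7 days';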
The system is designed to return data from the heap, not an index.
While it possibly can in some instances if you need to return data you
should store it directly in the table.
David J.
25..0.26 rows=1
> width=116) (actual time=0.401..0.402 rows=1 loops=1)"
> "Planning time: 0.058 ms"
> "Execution time: 0.423 ms"
>
>
I'm doubting the query inside of the function is the problem here...it is
the function usage itself. Calling a function has overhead in that the
body of the function needs to be processed. This only has to happen once per
session. The first call of the function incurs this overhead while
subsequent calls do not.
Pending others correcting me...I'm fairly certain of my conclusions,
though somewhat inexperienced in doing this kind of diagnostics.
David J.
query the planner thinks it needs 1.5 million of the rows and
will have to check each of them for visibility. It decided that scanning
the entire table was more efficient.
The LIMIT 1 in both queries should not be necessary. The planner is smart
enough to stop once it finds what it is looking for. In fact the LIMIT's
presence may be a contributing factor...but I cannot say for sure.
A better query seems like it would be:
WITH active_sites AS (
    SELECT DISTINCT site_id FROM datavalues
)
SELECT *
FROM sites
JOIN active_sites USING (site_id);
David J.
633764 11994442 1849232 2014935 4563638 132955919 7
>
>
Ok...again it's beyond my present experience, but it's what the planner
thinks about the distribution, and not what is actually present, that
matters.
David J.
index even if it compiles (not tested):
CREATE FUNCTION similarity_80(col text, val text)  -- parameter types assumed text
RETURNS boolean
SET pg_trgm.similarity_threshold = 0.80  -- the GUC is schema-qualified as pg_trgm.*
LANGUAGE sql
AS $$
    SELECT col % val;
$$;
David J.
> SQL ERROR[54000]
> ERROR: array size exceeds the maximum allowed (1073741823)
>
> https://www.postgresql.org/about/
Maximum Field Size: 1 GB
It doesn't matter that the data never actually is placed into a physical
table.
David J.
On Tue, Jun 7, 2016 at 8:36 AM, Nicolas Paris wrote:
> 2016-06-07 14:31 GMT+02:00 David G. Johnston :
>
>> On Tue, Jun 7, 2016 at 7:44 AM, Nicolas Paris
>> wrote:
>>
>>> Hello,
>>>
>>> I run a query transforming huge tables to a json document
On Tue, Jun 7, 2016 at 8:42 AM, Nicolas Paris wrote:
>
>
> 2016-06-07 14:39 GMT+02:00 David G. Johnston :
>
>> On Tue, Jun 7, 2016 at 8:36 AM, Nicolas Paris
>> wrote:
>>
>>> 2016-06-07 14:31 GMT+02:00 David G. Johnston >> >:
>>>
where col like '' and col not like ''
Or a CTE (with)
With likeqry as ( select where like )
Select from likeqry where not like
(sorry for brevity but not at a pc)
David J.
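A fleshed-out version of that shorthand (table, column, and patterns all
hypothetical):

WITH likeqry AS (
    SELECT * FROM items WHERE col LIKE 'foo%'
)
SELECT *
FROM likeqry
WHERE col NOT LIKE 'foobar%';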
ding to the docs)
You need to get to a point where you are seeing feedback from the
pg_restore process. Once you get it telling you what it is doing (or
trying to do) then diagnosing can begin.
David J.
an additional index on "guid::text".
>
Or, better, persuade the app to label the value "public.push_guid" since that
is the column's type...a type you haven't defined for us.
If you get to add explicit casts this should be easy...but I'm not familiar
with the framework you are using.
David J.
seconds, my natural gas fueled backup
generator picks up the load very quickly.
Am I overlooking something?
--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ PGP-Key:166D840A 0C610C8B Registered Machine 1935521.
/( )\ Shrewsbury, New Jersey  http://linuxcounter.net
On 07/08/2016 07:44 AM, vincent wrote:
>
>
> Op 7/8/2016 om 12:23 PM schreef Jean-David Beyer:
>> Why all this concern about how long a disk (or SSD) drive can stay up
>> after a power failure?
>>
>> It seems to me that anyone interested in maintaining an
On Thu, Jul 21, 2016 at 2:24 PM, Claudio Freire
wrote:
> That cross join doesn't look right. It has no join condition.
That is the definition of a "CROSS JOIN"...
David J.
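For reference, a minimal illustration with hypothetical tables:

-- CROSS JOIN is condition-less by definition: every row of a pairs with every row of b
SELECT * FROM a CROSS JOIN b;
-- equivalent spelling:
SELECT * FROM a, b;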
temp table in function
>
>
I have no difficulty using arrays in functions.
As for "other methods" - you can use CTE (WITH) to create a truly local
table - updating the catalogs by using a temp table is indeed quite
expensive.
WITH vals AS ( VALUES (1, 'lw'), (2, 'lw2') )
SELECT * FROM vals;
David J.
Materialized view data gets saved to the physical table, thus making
the table clustered on whatever ORDER BY is specified.
David J.
when obtaining rows from the master table. If this
is the case then you've gotten away from the expected usage of partitions
and so need to do things that aren't in the manual to make them work.
David J.
On Tue, Dec 27, 2016 at 10:38 AM, Valerii Valeev
wrote:
> Thank you David,
>
> I used same rationale to convince my colleague — it didn’t work :)
> Sort of “pragmatic” person who does what seems working no matter what
> happens tomorrow.
> So I’m seeking for better under
r -
and probably examples of both as well since it's not clear when it can occur.
Some TLC to the docs here would be welcomed.
David J.
On Wed, Jan 18, 2017 at 4:23 PM, Tom Lane wrote:
> "David G. Johnston" writes:
> > I'm feeling a bit dense here but even after having read a number of
> these
> > kinds of interchanges I still can't get it to stick. I think part of the
> > probl
IIRC the only reason the first query cares to use the index is because it
can perform an Index Only Scan and thus avoid touching the heap at all. If
it cannot avoid touching the heap the planner is going to just use a
sequential scan to retrieve the records directly from the heap and save the
index lookup step.
David J.
on the visibility map or the heap).
https://www.postgresql.org/docs/9.6/static/indexes-index-only-scans.html
David J.
On Wed, Mar 1, 2017 at 3:00 PM, Stefan Andreatta
wrote:
> explain analyze
> select tmp_san_1.id
> from tmp_san_1
>left join tmp_san_2 on tmp_san_1.text = tmp_san_2.text
> where tmp_san_2.id is null;
>
> Does it help if you check for "tmp_san_2.text is null"?
David J.
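An equivalent anti-join spelling against the quoted tables, which sidesteps
testing a possibly-NULL non-join column (a sketch):

SELECT t1.id
FROM tmp_san_1 t1
WHERE NOT EXISTS (
    SELECT 1 FROM tmp_san_2 t2 WHERE t2.text = t1.text
);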
On Wed, Mar 1, 2017 at 5:24 PM, Jeff Janes wrote:
> On Wed, Mar 1, 2017 at 2:12 PM, David G. Johnston <
> david.g.johns...@gmail.com> wrote:
>
>> On Wed, Mar 1, 2017 at 3:00 PM, Stefan Andreatta > > wrote:
>>
>>> explain analyze
>>> select t
.
>
> xxx 172.23.110.175
> yyy 172.23.110.178
> zzz 172.23.110.177
> aaa 172.23.110.176
> bbb 172.23.111.180
> ccc 172.23.115.26
>
SELECT ... WHERE substring(ip_addr::text, 1, 10) = '172.23.110'
David J.
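If ip_addr is actually an inet column (an assumption), the network containment
operator avoids the text manipulation entirely:

SELECT ... WHERE ip_addr << '172.23.110.0/24'::inet;  -- "is contained within"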
www.postgresql.org/docs/9.6/static/sql-grant.html
David J.
possibly partial)
plans and picking the best one - flagging those plan steps that can
leverage parallelism for possible execution.
David J.
the somewhat
specialized nature of the problem, a response should be forthcoming even
though it's taking a bit longer than usual.
David J.
that
particular unlogged table since the data files are known to be accurate.
David J.
us into a single column. "CASE ... WHEN
... THEN ... ELSE ... END" is quite helpful for doing stuff like that. For
now I'll just leave them as two columns.
SELECT status, payment_status, count(*)
FROM ud_document
WHERE uniqueid <> '201708141701018'
GROUP BY 1, 2;
David J.