On Wed, Jul 15, 2020 at 02:49:16PM -0400, Henrique Montenegro wrote:
> Any idea why the actual time is in the 3ms range? If I query that partition
> directly, like this:
>
> As you can see, the `actual_time` when querying the partition table
> directly goes down to 0.002 ms, which is almost 2000x faster.
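For comparison, a minimal sketch of how the two cases above could be timed side
by side; the partition name (users_no_dups_p0) and the filter are placeholders,
not taken from the thread:

```
-- Going through the parent table: planning has to consider every partition.
explain (analyze, buffers)
select * from users_no_dups where user_id = 1;

-- Hitting one partition directly, as described above.
explain (analyze, buffers)
select * from users_no_dups_p0 where user_id = 1;
```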
On Wed, Jul 15, 2020 at 8:24 AM Henrique Montenegro wrote:
>
>
> On Wed, Jul 15, 2020 at 4:03 AM Sebastian Dressler wrote:
>
>> Hi Henrique,
>>
>> On 15. Jul 2020, at 03:13, Henrique Montenegro wrote:
>> [...]
>>
>> ```
>> ssl = off
>> shared_buffers = 160GB # min 128kB
>> work_mem = 96GB # min 64kB
On Wed, Jul 15, 2020 at 4:03 AM Sebastian Dressler wrote:
> Hi Henrique,
>
> On 15. Jul 2020, at 03:13, Henrique Montenegro wrote:
> [...]
>
> ```
> ssl = off
> shared_buffers = 160GB # min 128kB
> work_mem = 96GB # min 64kB
> maintenance_work_mem = 12GB
Hi Henrique,
On 15. Jul 2020, at 03:13, Henrique Montenegro <typ...@gmail.com> wrote:
[...]
```
ssl = off
shared_buffers = 160GB # min 128kB
work_mem = 96GB # min 64kB
maintenance_work_mem = 12GB # min 1MB
max_stack_depth = 4MB
```
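As a side note, a quick sketch for double-checking which of these values the
running server actually picked up (shared_buffers only changes after a restart),
using the standard pg_settings view:

```
select name, setting, unit
from pg_settings
where name in ('shared_buffers', 'work_mem',
               'maintenance_work_mem', 'max_stack_depth');
```

Keep in mind that work_mem applies per sort/hash operation, so a 96GB setting
can be used several times over within a single query.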
Alright, so it seems like partitioning and changing the shared_buffers as
well as adding the order by helped to a certain extent, but the writes are
still slow. Inserting a 1-million-record file is taking almost 3 minutes
(way better than the 20+ minutes, but still pretty slow compared to the 20
seconds).
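For context, a hypothetical sketch of the kind of range-partitioned layout
being described; the partition key, column types, and boundaries are guesses,
since the actual DDL is not shown here:

```
create table users_no_dups (
    created_ts timestamptz not null,
    user_id    bigint      not null,
    name       text,
    url        text,
    primary key (user_id, created_ts)
) partition by range (created_ts);

create table users_no_dups_2020_07 partition of users_no_dups
    for values from ('2020-07-01') to ('2020-08-01');
```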
On Mon, Jul 13, 2020 at 8:05 PM Jeff Janes wrote:
> On Mon, Jul 13, 2020 at 10:23 AM Henrique Montenegro wrote:
>
> insert into users_no_dups (
>> created_ts,
>> user_id,
>> name,
>> url
>> ) (
>> select
>> created_ts,
>> user_id,
>> name,
>> url
>> from
>> users
>> ) on conflict do nothing
On Mon, Jul 13, 2020 at 10:23 AM Henrique Montenegro wrote:
insert into users_no_dups (
> created_ts,
> user_id,
> name,
> url
> ) (
> select
> created_ts,
> user_id,
> name,
> url
> from
> users
> ) on conflict do nothing
>
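Worth noting: `on conflict do nothing` without an explicit conflict target only
skips rows that collide with some unique index or constraint on users_no_dups,
so the de-duplication relies on one existing. A minimal sketch for the original,
non-partitioned table, assuming user_id is what identifies a duplicate (the
thread does not show the actual constraint):

```
create unique index if not exists users_no_dups_user_id_key
    on users_no_dups (user_id);
```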
Once th
On Mon, Jul 13, 2020 at 12:50 PM Sebastian Dressler wrote:
> Hi Henrique,
>
> On 13. Jul 2020, at 18:42, Henrique Montenegro wrote:
>
> On Mon, Jul 13, 2020 at 11:20 AM Sebastian Dressler wrote:
>
>
>> Running the above loop worked fine for about 12 hours. Each file was taking
>> about 30 seconds to be processed.
Is this an insert-only table and perhaps not being picked up by autovacuum?
If so, try a manual "vacuum analyze" before/after each batch run perhaps.
You don't mention updates, but you have also been adjusting fillfactor, so I
am not sure.
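A minimal sketch of the suggested manual maintenance, run between batch loads
(table name taken from the thread; the exact before/after placement is up to
you):

```
-- Refresh statistics and visibility information on the target table.
vacuum analyze users_no_dups;
```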
On Mon, Jul 13, 2020 at 11:20 AM Sebastian Dressler wrote:
> Hi Henrique,
>
> On 13. Jul 2020, at 16:23, Henrique Montenegro wrote:
>
> [...]
>
> * Insert the data from the `users` table into the `users_no_dups` table
>
> ```
> insert into users_no_dups (
> created_ts,
> user_id,
> name,
Hi Henrique,
On 13. Jul 2020, at 18:42, Henrique Montenegro <typ...@gmail.com> wrote:
On Mon, Jul 13, 2020 at 11:20 AM Sebastian Dressler <sebast...@swarm64.com> wrote:
Running the above loop worked fine for about 12 hours. Each file was taking
about 30 seconds to be processed.
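The loop itself is not reproduced here; a sketch of what one iteration of it
presumably looks like, where the CSV path, the column list, and the truncate of
the staging table are assumptions:

```
-- One iteration of the assumed per-file cycle: stage the CSV, merge new rows,
-- then clear the staging table for the next file.
begin;

copy users (created_ts, user_id, name, url)
from '/path/to/file.csv' with (format csv);

insert into users_no_dups (created_ts, user_id, name, url)
select created_ts, user_id, name, url
from users
on conflict do nothing;

truncate users;

commit;
```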
On Mon, Jul 13, 2020 at 12:28 PM Michael Lewis wrote:
> Is this an insert-only table and perhaps not being picked up by
> autovacuum? If so, try a manual "vacuum analyze" before/after each batch
> run perhaps. You don't mention updates, but you have also been adjusting
> fillfactor, so I am not sure.
Hi Henrique,
On 13. Jul 2020, at 16:23, Henrique Montenegro <typ...@gmail.com> wrote:
[...]
* Insert the data from the `users` table into the `users_no_dups` table
```
insert into users_no_dups (
created_ts,
user_id,
name,
url
) (
select
created_ts,
user_id,
name,
url
from
users
) on conflict do nothing
```
Hello list,
I am having issues with performance inserting data in Postgres and would like
to ask for help figuring out the problem as I ran out of ideas.
I have a process that generates a CSV file with 1 million records in it every
5 minutes and each file is about 240MB. I need this data to be in