Have you tried to set the instance running on GCP to have similar
shared_buffers as the AWS database ?

What you described has a much lower cache hit rate on GCS and 2X the
shared buffers on AWS which could well explain much of the difference
in execution times.

DETAILS:
Query explain for Postgres on GCP VM:
    Buffers: shared hit=423 read=4821

Query explain for Postgres on AWS RDS:
    Buffers: shared hit=3290 read=1948

and the configuration :

Instance on VM on GCP (2 vCPUs, 2 GB memory, 800 GB disk):
• shared_buffers: 510920kB (close to 499MB)

Instance managed by RDS (2 vCPUs, 2 GiB RAM, 250GB disk, 750 de IOPS):
• shared_buffers: 943896kB (close to 922MB)


Cheers
Hannu

On Fri, Feb 26, 2021 at 9:16 AM Justin Pitts <justinpi...@gmail.com> wrote:
>
> Since this is a comparison to RDS, and the goal presumably is to make the 
> test as even as possible, you will want to pay attention to the network IO 
> capacity for the client and the server in both tests.
>
> For RDS, you will be unable to run the client software locally on the server 
> hardware, so you should plan to do the same for the GCP comparison.
>
> What is the machine size you are using for your RDS instance? Each machine 
> size will specify CPU and RAM along with disk and network IO capacity.
>
> Is your GCP VM where you are running PG ( a GCP VM is the equivalent of an 
> EC2 instance, by the way ) roughly equivalent to that RDS instance?
>
> Finally, is the network topology roughly equivalent? Are you performing these 
> tests with the same region and/or availability zone?
>
>
>
> On Thu, Feb 25, 2021 at 3:32 PM Philip Semanchuk 
> <phi...@americanefficient.com> wrote:
>>
>>
>>
>> > On Feb 25, 2021, at 4:04 PM, Igor Gois <i...@bixtecnologia.com.br> wrote:
>> >
>> > Philip,
>> >
>> > The results in first email in this thread were using explain analyze.
>> >
>> > I thought that you asked to run using only 'explain'. My bad.
>> >
>> > The point is, the execution time with explain analyze is less the 1 
>> > second. But the actual execution time (calculated from the python client) 
>> > is 24 seconds (aws) and 300+ seconds in gcp
>>
>> Oh OK, sorry, I wasn’t following. Yes, network speed sounds like the source 
>> of the problem.
>>
>> Under AWS sometimes we log into an EC2 instance if we have to run a query 
>> that generates a lot of data so that both server and client are inside AWS. 
>> If GCP has something similar to EC2, it might be an interesting experiment 
>> to run your query from there and see how much, if any, that changes the time 
>> it takes to get results.
>>
>> Hope this helps
>> Philip
>>
>>
>>
>> >
>> > Em qui., 25 de fev. de 2021 às 17:53, Philip Semanchuk 
>> > <phi...@americanefficient.com> escreveu:
>> >
>> >
>> > > On Feb 25, 2021, at 3:46 PM, Igor Gois <i...@bixtecnologia.com.br> wrote:
>> > >
>> > > Hi, Philip
>> > >
>> > > We ran: EXPLAIN (FORMAT JSON) SELECT "Id", "DateTime", 
>> > > "SignalRegisterId", "Raw" FROM "SignalRecordsBlobs" WHERE 
>> > > "SignalSettingId" = 103 AND "DateTime" BETWEEN 
>> > > '2019-11-28T14:00:12.540200000' AND '2020-07-23T21:12:32.249000000';
>> > >
>> > > but it was really fast. I think the results were discarded.
>> >
>> > EXPLAIN and EXPLAIN ANALYZE are different in an important way. EXPLAIN 
>> > merely plans the query, EXPLAIN ANALYZE plans *and executes* the query. 
>> > From the doc —
>> >
>> > "The ANALYZE option causes the statement to be actually executed, not only 
>> > planned....Keep in mind that the statement is actually executed when the 
>> > ANALYZE option is used. Although EXPLAIN will discard any output that a 
>> > SELECT would return, other side effects of the statement will happen as 
>> > usual. “
>> >
>> > https://www.postgresql.org/docs/12/sql-explain.html
>> >
>> >
>> > >
>> > > AWS Execution time select without explain: 24.96505s (calculated in 
>> > > python client)
>> > > AWS Execution time select with explain but without analyze: 0.03876s 
>> > > (calculated in python client)
>> > >
>> > > https://explain.depesz.com/s/5HRO
>> > >
>> > > Thanks in advance
>> > >
>> > >
>> > > Em qui., 25 de fev. de 2021 às 15:13, Philip Semanchuk 
>> > > <phi...@americanefficient.com> escreveu:
>> > >
>> > >
>> > > > On Feb 24, 2021, at 10:11 AM, Igor Gois <i...@bixtecnologia.com.br> 
>> > > > wrote:
>> > > >
>> > > > Hi, Julien
>> > > >
>> > > > Your hypothesis about network transfer makes sense. The query returns 
>> > > > a big size byte array blobs.
>> > > >
>> > > > Is there a way to test the network speed against the instances? I have 
>> > > > access to the network speed in gcp (5 Mb/s), but don't have access in 
>> > > > aws rds.
>> > >
>> > > Perhaps what you should run is EXPLAIN ANALYZE SELECT...? My 
>> > > understanding is that EXPLAIN ANALYZE executes the query but discards 
>> > > the results. That doesn’t tell you the network speed of  your AWS 
>> > > instance, but it does isolate the query execution speed (which is what I 
>> > > think you’re trying to measure) from the network speed.
>> > >
>> > > Hope this is useful.
>> > >
>> > > Cheers
>> > > Philip
>> > >
>> > > >
>> > > >
>> > > > Em qua., 24 de fev. de 2021 às 10:35, Julien Rouhaud 
>> > > > <rjuju...@gmail.com> escreveu:
>> > > > Hi,
>> > > >
>> > > > On Wed, Feb 24, 2021 at 6:14 AM Maurici Meneghetti
>> > > > <maurici.meneghe...@bixtecnologia.com.br> wrote:
>> > > > >
>> > > > > I have 2 postgres instances created from the same dump (backup), one 
>> > > > > on a GCP VM and the other on AWS RDS. The first instance takes 18 
>> > > > > minutes and the second one takes less than 20s to run this simples 
>> > > > > query:
>> > > > > SELECT "Id", "DateTime", "SignalRegisterId", "Raw" FROM 
>> > > > > "SignalRecordsBlobs" WHERE "SignalSettingId" = 103 AND "DateTime" 
>> > > > > BETWEEN '2019-11-28T14:00:12.540200000' AND 
>> > > > > '2020-07-23T21:12:32.249000000';
>> > > > > I’ve run this query a few times to make sure both should be reading 
>> > > > > data from cache.
>> > > > > I expect my postgres on GPC to be at least similar to the one 
>> > > > > managed by AWS RDS so that I can work on improvements parallelly and 
>> > > > > compare.
>> > > > >
>> > > > > DETAILS:
>> > > > > [...]
>> > > > > Planning time: 456.315 ms
>> > > > > Execution time: 776.976 ms
>> > > > >
>> > > > > Query explain for Postgres on AWS RDS:
>> > > > > [...]
>> > > > > Planning time: 0.407 ms
>> > > > > Execution time: 14.87 ms
>> > > >
>> > > > Those queries were executed in respectively ~1s and ~15ms (one thing
>> > > > to note is that the slower one had less data in cache, which may or
>> > > > may note account for the difference).  Does those plans reflect the
>> > > > reality of your slow executions?  If yes it's likely due to quite slow
>> > > > network transfer.  Otherwise we would need an explain plan from the
>> > > > slow execution, for which auto_explain can help you.  See
>> > > > https://www.postgresql.org/docs/11/auto-explain.html for more details.
>> > > >
>> > > >
>> > > > --
>> > > > Att,
>> > > >
>> > > > Igor Gois | Sócio Consultor
>> > > > (48) 99169-9889 | Skype: igor_msg
>> > > > Site | Blog | LinkedIn | Facebook | Instagram
>> > > >
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Att,
>> > >
>> > > Igor Gois | Sócio Consultor
>> > > (48) 99169-9889 | Skype: igor_msg
>> > > Site | Blog | LinkedIn | Facebook | Instagram
>> > >
>> > >
>> >
>> >
>> >
>> > --
>> > Att,
>> >
>> > Igor Gois | Sócio Consultor
>> > (48) 99169-9889 | Skype: igor_msg
>> > Site | Blog | LinkedIn | Facebook | Instagram
>> >
>> >
>>
>>
>>


Reply via email to