...Final note: performance when executing queries "limit A, B" and "limit
C, D" in sequence may be completely different from when executing them in
parallel. In particular, if they are run in parallel, most likely much
less caching will happen. Make sure your benchmarks account for this too.
Most likely the identical performance you observed for the "limit" clause is
because you are not sorting the rows. Without sorting, a "limit" query is
meaningless: the database is technically allowed to return exactly the same
result for "limit 0, 10" and "limit 10, 20", because both of these queries
are
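To make this concrete, here is a minimal, self-contained Java sketch (the "table" is just an in-memory list standing in for a real result set; all names are illustrative, not any database API) showing why LIMIT paging is only deterministic when combined with an ORDER BY:

```java
import java.util.*;
import java.util.stream.*;

public class LimitDemo {
    // Simulated table: rows have no inherent order, like an unsorted SQL result set.
    static final List<Integer> rows = Arrays.asList(7, 3, 9, 1, 5, 8, 2, 6, 4, 0);

    // "LIMIT offset, count" over whatever order the engine happens to scan in.
    static List<Integer> limitNoOrder(List<Integer> scanOrder, int offset, int count) {
        return scanOrder.stream().skip(offset).limit(count).collect(Collectors.toList());
    }

    // Deterministic paging: ORDER BY first, then LIMIT.
    static List<Integer> limitWithOrder(int offset, int count) {
        return rows.stream().sorted().skip(offset).limit(count).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Two executions may scan in different orders (different index, parallel plan, ...).
        List<Integer> scan1 = new ArrayList<>(rows);
        List<Integer> scan2 = new ArrayList<>(rows);
        Collections.shuffle(scan2, new Random(42));

        // Without ORDER BY, "LIMIT 0, 5" and "LIMIT 5, 5" from two runs can overlap or miss rows.
        System.out.println(limitNoOrder(scan1, 0, 5));
        System.out.println(limitNoOrder(scan2, 5, 5));

        // With ORDER BY, the pages are disjoint and together cover the table.
        System.out.println(limitWithOrder(0, 5)); // [0, 1, 2, 3, 4]
        System.out.println(limitWithOrder(5, 5)); // [5, 6, 7, 8, 9]
    }
}
```

The two unordered pages come from different scan orders, so nothing guarantees they partition the table; the ordered pages do.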
Thanks Madhusudan. Please note that in your case the time was likely
dominated by shipping the rows over the network, rather than by executing
the query. Please make sure to include benchmarks where the query itself is
expensive to evaluate (e.g. "select count(*) from query" takes time
comparable to
Hi,
Appreciate your questions.
One thing I believe: AWS Aurora, even though it is based on MySQL, is not
MySQL. The reason is that AWS developed this database service (RDS) from the
ground up and has improved or completely changed its implementation. That
being said, some of the things that one may have experi
+1 for S3 being more of a FS
@Madhusudan can you point to some documentation on how to do row-range
queries in Aurora? From a quick scan it follows the MySQL 5.6 syntax, so
you will still need an ORDER BY for the IO to do exactly-once reads. So I
wanted to learn more about how the questions raised
Hi,
I think it's a mix of filesystem and IO. For S3, I see it more as a Beam
filesystem than a pure IO.
WDYT ?
Regards
JB
On 06/13/2017 02:43 AM, tarush grover wrote:
Hi All,
I think this can be added under java --> io --> aws-cloud-platform with
more io connectors can be added into it eg. S3 al
Hi All,
I think this can be added under java --> io --> aws-cloud-platform, with
more IO connectors to be added into it, e.g. S3 also.
Regards,
Tarush
On Mon, Jun 12, 2017 at 4:03 AM, Madhusudan Borkar
wrote:
> Yes, I believe so. Thanks for the Jira.
>
> Madhu Borkar
>
> On Sat, Jun 10, 2017 at
Yes, I believe so. Thanks for the Jira.
Madhu Borkar
On Sat, Jun 10, 2017 at 10:36 PM, Jean-Baptiste Onofré
wrote:
> Hi,
>
> I created a Jira to add custom splitting to JdbcIO (but it's not so
> trivial, depending on the backend).
>
> Regarding your proposal it sounds interesting, but do you thi
To elaborate a bit on what JB said:
Suppose the table has 1,000,000 rows, and suppose you split it into 1000
bundles, 1000 rows per bundle.
Does Aurora provide an API that allows you to efficiently read the bundle
containing rows 999,000-1,000,000, one that does not involve reading and
throwing away the
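This is the usual OFFSET-vs-range trade-off. As a hedged sketch, assuming a table with an indexed, monotonically increasing numeric column (here hypothetically named `id`; this is generic SQL string-building, not an Aurora API), the two ways of asking for the last bundle look like this:

```java
public class SplitQueries {
    // OFFSET-based page: the engine must scan and discard `offset` rows first,
    // so the last bundle is the most expensive one to read.
    static String offsetQuery(String table, long offset, long count) {
        return String.format("SELECT * FROM %s ORDER BY id LIMIT %d, %d", table, offset, count);
    }

    // Range (keyset) predicate: with an index on `id`, each bundle is an index
    // seek plus a short scan, regardless of where it sits in the table.
    static String rangeQuery(String table, long lo, long hi) {
        return String.format("SELECT * FROM %s WHERE id >= %d AND id < %d", table, lo, hi);
    }

    public static void main(String[] args) {
        long rows = 1_000_000, bundles = 1_000, perBundle = rows / bundles;
        // The last bundle, expressed both ways:
        System.out.println(offsetQuery("t", rows - perBundle, perBundle));
        System.out.println(rangeQuery("t", rows - perBundle, rows));
    }
}
```

The range form only works when such a key exists and is reasonably dense, which is exactly the kind of backend-dependent assumption that makes generic JdbcIO splitting non-trivial.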
Hi,
I created a Jira to add custom splitting to JdbcIO (but it's not so trivial,
depending on the backend).
Regarding your proposal it sounds interesting, but do you think we will have
really "parallel" reads of the splits? I think splitting makes sense only if we
can do parallel reads: if we split
Hi,
We are proposing to develop a connector for AWS Aurora. Aurora, being a
cluster for a relational database (MySQL), has no Java API for
reading/writing other than the JDBC client. Although there is a JdbcIO
available, it looks like it doesn't work in parallel. The proposal is to
provide split functionality
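The split step of such a proposal could be sketched as follows. This is a minimal illustration, not the JdbcIO or Beam source API: it assumes a numeric key column whose min/max are known, and simply carves the key space into disjoint half-open ranges that independent workers could read in parallel, each with its own WHERE predicate:

```java
import java.util.*;

public class KeyRangeSplitter {
    // A half-open key range [lo, hi), read by one worker via a query like
    // "SELECT ... WHERE key >= lo AND key < hi" (key column is hypothetical).
    static final class Range {
        final long lo, hi;
        Range(long lo, long hi) { this.lo = lo; this.hi = hi; }
    }

    // Split the key space [min, max] into n roughly equal, disjoint ranges
    // so that n readers can scan the table in parallel.
    static List<Range> split(long min, long max, int n) {
        long span = max - min + 1;
        List<Range> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            // Integer arithmetic distributes any remainder across the ranges.
            out.add(new Range(min + span * i / n, min + span * (i + 1) / n));
        }
        return out;
    }
}
```

In a real connector the ranges would be produced at pipeline-construction or split time and each range turned into its own bounded read; whether the backend actually serves those reads in parallel is exactly JB's open question.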