This is a general question about optimal design.

Many cloud data warehouses, such as Google BigQuery (BQ)
<https://cloud.google.com/bigquery-ml/docs/introduction> and Oracle
Autonomous Data Warehouse (ADW)
<https://www.oracle.com/autonomous-database/autonomous-data-warehouse/>,
now offer ML capabilities, with models built inside the warehouse itself.
This is great, as it allows people with SQL knowledge who are not
necessarily data scientists to build and run models.

However, I see some limitations if the data warehouse itself is used for
both storage and model building. The fundamental issue arises when you want
to scale this up: bringing in multiple sources (on-prem or already in a data
warehouse), supporting concurrent users, enriching data, and choosing what
to store in the data warehouse itself (data or model results).

This is where Spark comes into play. It can connect to multiple sources over
JDBC, combine data from those sources within Spark itself, and provide
in-memory enrichment and computation at the compute layer. Additionally, and
perhaps more importantly, you can scale compute layers up and down (some of
them dedicated) to suit your needs without adversely impacting the storage
and model-building layer.
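
As a rough sketch of the pattern I have in mind (PySpark, not a definitive
implementation): read two sources over JDBC, join them inside Spark, and
write only the result back. The helper names, the table names and the
customer_id join key are all made up for illustration, and `spark` is
assumed to be an existing SparkSession with the relevant JDBC drivers on its
classpath.

```python
# Sketch only: URLs, table names, credentials and the join key are
# placeholders for illustration, not real systems.

def jdbc_options(url, table, user, password, fetchsize=10000):
    """Build the option map for Spark's built-in "jdbc" data source."""
    return {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": password,
        "fetchsize": str(fetchsize),  # rows fetched per round trip
    }


def read_source(spark, opts):
    """Load one JDBC source as a DataFrame; `spark` is an existing SparkSession."""
    return spark.read.format("jdbc").options(**opts).load()


def enrich(spark, orders_opts, customers_opts, key="customer_id"):
    """Join two sources inside Spark, so neither database does the combining."""
    orders = read_source(spark, orders_opts)
    customers = read_source(spark, customers_opts)
    return orders.join(customers, on=key, how="left")


def write_result(df, target_opts):
    """Append only the enriched result to the target warehouse table."""
    df.write.format("jdbc").options(**target_opts).mode("append").save()
```

The point of the split is that the join and enrichment run on Spark
executors, which can be sized independently of the warehouse, while the
warehouse only receives what you choose to store.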

In summary, I cannot see how one can rely on the storage layer alone to:

   1. read data from multiple sources
   2. combine storage and compute at scale
   3. avoid concurrency bottlenecks in a meaningful way

I would be interested to hear other views on this.

Thanks

Mich


LinkedIn:
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.
