Hi Stephane,

If you are currently running on-premises, then you should also consider
Google Cloud Platform (GCP). As a practitioner I see a number of customers
migrating from other platforms to GCP.


Databricks on GCP will be available (if I am correct) in April this year. GCP
already offers Google Compute Engine as IaaS, which supports Spark on
YARN. In addition, there are cost-saving 'preemptible instances' that
can run Spark on affordable tin boxes, so to speak. GCP also offers BigQuery
as a Data Warehouse (DW) with ML models built in. So there is a fair bit of
'either/or' choice here. There is also the question of the migration path
from GCP artifacts to Databricks. Will Databricks provide all of these as a
service? For example, BigQuery is a fully managed serverless warehouse.
Will Lakehouse provide the same on GCP? Besides ML, BigQuery provides
functions and procedures in the style of Oracle's PL/SQL, so some are
migrating from classic on-premises Oracle to BigQuery.


However, neither BigQuery nor Compute Engine is cheap. Personally, I
believe the cloud landscape is getting congested, and unless there is a
clear motivation to move from one provider to another, many will choose to
stay where they are. If you are already using Spark on a private cloud, then
the journey to GCP should be pretty smooth. As ever, your mileage will vary.
You may also decide to go for a multi-cloud mixture with the best of breed.


HTH,


Mich

LinkedIn:
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw





*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Wed, 24 Feb 2021 at 15:25, Stephane Verlet <for...@verlet.name> wrote:

> Hello,
>
> We have been using Spark on an on-premises cluster for several years and
> are looking at moving to a cloud deployment.
>
> I was wondering what your current favorite cloud setup is. Just plain
> AWS / Azure, or something on top like Databricks?
>
> This would support an on-demand report application, so usage would be
> sporadic with spikes during the day. The current deployment is Spark with
> Hive data.
>
> Thanks for sharing
>
> Stephane
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>
