Re: 4th generation Data Warehousing and Spark

Mich Talebzadeh Tue, 20 Apr 2021 10:24:40 -0700

Having done some digging on this 4th generation Data Warehouse topic, I
believe it is just a misnomer of some sort. However, I came across this
note on LinkedIn which, although it had some element of truth about market
differentiation etc., showed plain ignorance of why the on-premises
Hadoop data lake and its newer cloud Data Lakehouse were created in the
first place. The author takes a simplistic and rejectionist view of
these concepts. It pointedly states, and I quote:

"... Like as if data warehousing can’t hack data related to brands? And
you’d need Spark, data streaming from social media and the Hadoop ecosphere
to make that magic sauce work..."

And I would say yes, we do need it.

Anyway, read it for yourself, as I found it interesting in some aspects. It is
light reading, I guess; reader's discretion is needed, and it is admittedly
not to everyone's taste.

Bullshit at the Data Lakehouse | GOOD STRATEGY
<https://goodstrat.com/2020/04/15/bullshit-at-the-data-lakehouse/>

HTH

view my LinkedIn profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.

On Sun, 18 Apr 2021 at 21:17, Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

>
> You must forgive me for this seemingly pseudo-technical question. Last
> week I came across a client manager who mentioned developing 4th generation
> data warehousing with Spark. And I was wondering whether the individual
> was pointedly referring to the new data lakehouse concept and how it
> differs from the current concepts of real-time data pipeline, batch data
> pipeline, Lambda Architecture or just plain data enrichment. Spark can be
> used for all these. Can anyone throw some light on the notion of 4th
> Generation Data Warehousing with Spark? The D in Spark RDD, for Dataset, can
> handle structured, semi-structured and equally unstructured data, so what is
> new?
>
> Thanks
>
> Mich
