Data Lakes using Spark

Boris Litvak Tue, 06 Apr 2021 23:33:16 -0700

Hi Friends,

I’d like to publish an article on Medium about building data lakes with Spark.
Its later sections cover material that is not widely known unless you have 
hands-on experience with data lakes.

I hope it’s OK if I ask you to review the draft:
https://github.com/borislitvak/datalake-article/blob/initial_comments/Building%20a%20Real%20Life%20Data%20Lake%20in%C2%A0AWS.md

You can respond here or contact me directly.
If there are topics I should add (such as the effect of compaction on downstream 
reads with Structured Streaming), or if you find errors, please point them out 
before it is published.
Also, if any points are unclear or misleading, please say so.

Thanks,

Boris Litvak
