Hi,

Can you explain your particular stack?

For example, what is the source of the streaming data, and what role does
Spark play?

Are you dealing with both real-time and batch processing? And why Parquet
rather than something like HBase for ingesting data in real time?

HTH



Dr Mich Talebzadeh



LinkedIn:
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 28 August 2016 at 15:43, Kevin Tran <kevin...@gmail.com> wrote:

> Hi,
> Does anyone know the best practices for storing data in Parquet files?
> Do Parquet files have a size limit (1 TB)?
> Should we use SaveMode.APPEND for a long-running streaming app?
> How should we store the data in HDFS (directory structure, ...)?
>
> Thanks,
> Kevin.
>
