Hi,
I already store them in MongoDB in parallel for operational access, and I don't want to add another database to the loop. Is that the only solution?
Thanks,
Nicolas

----- Original Message -----
From: "Ted Yu" <yuzhih...@gmail.com>
To: nib...@free.fr
Cc: "user" <user@spark.apache.org>
Sent: Wednesday, 2 September 2015 18:34:17
Subject: Re: Small File to HDFS

Instead of storing those messages in HDFS, have you considered storing them in a key-value store (e.g. HBase)?

Cheers

On Wed, Sep 2, 2015 at 9:07 AM, <nib...@free.fr> wrote:

Hello,
I am currently using Spark Streaming to collect small messages (events). Each message is under 50 KB, and the volume is high (several million per day). I have to store those messages in HDFS. I understand that storing small files can be problematic in HDFS. How can I manage it?

Thanks,
Nicolas

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org