Maybe you can tell us more about your use case; I somehow have the feeling
that we are missing something here.

On Thu, Sep 3, 2015 at 15:54, Jörn Franke wrote:
>
> Store them as hadoop archive (har)
>
> On Wed, Sep 2, 2015 at 18:07, <nib...@free.fr> wrote:
>
>> Hello,
>> I'm currently using Spark Streaming to collect small messages (events),
>> size being <50 KB; volume is high (several million per day) and I have
>> to store those messages in HDFS.

Basically the name of my small files will be the keys of my records, and
sometimes I will need to replace the content of a file with new content
(remove/replace).

Tks a lot
Nicolas

----- Original Message -----
From: "Jörn Franke"
To: nib...@free.fr
Cc: user@spark.apache.org
Sent: Thursday, September 3, 2015 19:29:42
Subject: Re: Small File to HDFS

Har is transparent and has hardly any performance overhead. You may decide
not to compress, or to use a fast compression algorithm such as snappy.
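Jörn's point about picking a fast codec can be illustrated with the stdlib's zlib, using its fastest and strongest levels as a rough stand-in for the snappy-vs-heavier-codec tradeoff (the sample data is made up for the sketch):

```python
import zlib

# A repetitive, compressible stand-in for a batch of small event messages.
data = b"small event payload, " * 1000

fast = zlib.compress(data, level=1)  # fastest level: cheap CPU, larger output
best = zlib.compress(data, level=9)  # strongest level: more CPU, smaller output

# Both round-trip to the original data; only speed and size differ.
assert zlib.decompress(fast) == zlib.decompress(best) == data
print(len(data), len(fast), len(best))
```

The choice only matters at this volume because compression runs on every message; a fast codec keeps ingest CPU-bound work low while still shrinking storage.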

My main question in case of HAR usage is: is it possible to use Pig on it,
and what about performance?

----- Original Message -----
From: "Jörn Franke"
To: nib...@free.fr, user@spark.apache.org
Sent: Thursday, September 3, 2015 15:54:42
Subject: Re: Small File to HDFS

Store them as hadoop archive (har)

In the case of a big zip file, is it possible to easily process Pig on it
directly?

Tks
Nicolas

----- Original Message -----
From: "Tao Lu"
To: nib...@free.fr
Cc: "Ted Yu", "user"
Sent: Wednesday, September 2, 2015 19:09:23
Subject: Re: Small File to HDFS

You may consider storing it in one big HDFS file, and keep appending new
messages to it.

For instance: one message -> zip it -> append it to the HDFS file as one
line.
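The suggestion above can be sketched with a local file standing in for the single big HDFS file; a minimal illustration, assuming zlib for the "zip" step and base64 so each compressed message stays on one line (the helper names are hypothetical, and a real pipeline would append through the HDFS client rather than `open()`):

```python
import base64
import os
import tempfile
import zlib

def append_message(path, message):
    """Zip one message and append it to the log as a single base64 line."""
    line = base64.b64encode(zlib.compress(message)).decode("ascii")
    with open(path, "a") as f:
        f.write(line + "\n")

def read_messages(path):
    """Decode every line of the log back into the original messages."""
    with open(path) as f:
        return [zlib.decompress(base64.b64decode(line)) for line in f]

log = os.path.join(tempfile.mkdtemp(), "events.log")
append_message(log, b"event-1: user signed in")
append_message(log, b"event-2: user signed out")
print(read_messages(log))  # -> [b'event-1: user signed in', b'event-2: user signed out']
```

Appending keeps the HDFS block count low, at the cost of rewriting the whole file whenever a single message must be replaced, which is why the remove/replace requirement raised elsewhere in the thread is awkward with this layout.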

Hi,
I already store them in MongoDB in parallel for operational access and
don't want to add another database in the loop.
Is it the only solution?

Tks
Nicolas

----- Original Message -----
From: "Ted Yu"
To: nib...@free.fr
Cc: "user"
Sent: Wednesday, September 2, 2015 18:34:17
Subject: Re: Small File to HDFS

Instead of storing those messages in HDFS, have you considered storing
them in a key-value store (e.g. HBase)?

Cheers

On Wed, Sep 2, 2015 at 9:07 AM, <nib...@free.fr> wrote:

> Hello,
> I'm currently using Spark Streaming to collect small messages (events),
> size being <50 KB; volume is high (several million per day) and I have
> to store those messages in HDFS.
> I understood that storing small files can be problematic in HDFS; how
> can I manage it?
>
> Tks
> Nicolas
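The key-value suggestion above fits the remove/replace-by-key requirement raised elsewhere in the thread, since a put overwrites in place and a delete is a single operation. A rough local sketch using Python's stdlib `dbm` as a stand-in for HBase (all names are illustrative only):

```python
import dbm
import os
import tempfile

# Local dbm database standing in for an HBase table keyed by file name.
path = os.path.join(tempfile.mkdtemp(), "events.db")

with dbm.open(path, "c") as store:
    store[b"file-42"] = b"original content"     # initial write
    store[b"file-42"] = b"replacement content"  # same key: overwrite in place
    replaced = store[b"file-42"]
    del store[b"file-42"]                       # removal is a single operation
    store[b"file-7"] = b"kept"
    print(replaced, b"file-42" in store)  # -> b'replacement content' False
```

With plain HDFS files (or one big appended file), the same replace would mean rewriting data, which is the tension driving this thread.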