Re: Reading and processing binary format files using spark

2014-05-03 Thread Mayur Rustagi
A Hadoop InputFormat and OutputFormat would be the best way.

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi

On Sat, May 3, 2014 at 10:12 AM, Chengi Liu wrote:
> Hi,
> Let's say I have millions of binary format files... Let's say

Reading and processing binary format files using spark

2014-05-02 Thread Chengi Liu
Hi,

Let's say I have millions of binary format files, and a Java (or Python) library that reads and parses them. Say:

    import foo
    f = foo.open(filename)
    header = f.get_header()

and some other methods. What I was thinking was to write a Hadoop input format
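
For readers who want something concrete to start from, here is a rough sketch of one way to run a per-file parser over many binary files from PySpark. It is not the custom Hadoop InputFormat approach discussed in this thread; it swaps in `SparkContext.binaryFiles`, which is available in later Spark releases and hands each whole file to Python as a (path, bytes) pair. The `foo` library from the question is not used. The header layout (4-byte magic, version, record count), the `Header` type, and the `parse_header` and `run` names are all assumptions made up for illustration, so replace them with calls into your own parser.

```python
import struct
from typing import NamedTuple, Tuple

# Hypothetical header layout, invented for illustration only:
# 4-byte magic, uint32 version, uint32 record count (little-endian).
HEADER = struct.Struct("<4sII")


class Header(NamedTuple):
    magic: bytes
    version: int
    n_records: int


def parse_header(path_and_bytes: Tuple[str, bytes]) -> Tuple[str, Header]:
    """Parse the (hypothetical) header from one (path, file contents) pair."""
    path, data = path_and_bytes
    if len(data) < HEADER.size:
        raise ValueError(f"{path}: file is shorter than the header")
    return path, Header(*HEADER.unpack_from(data, 0))


def run(input_glob: str):
    """Collect the parsed headers of all files matching input_glob."""
    # Imported here so parse_header stays usable without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext(appName="binary-headers")
    try:
        # binaryFiles yields one (path, whole-file bytes) pair per file.
        return sc.binaryFiles(input_glob).map(parse_header).collect()
    finally:
        sc.stop()
```

Because each file is read into memory in one piece, this probably suits many small-to-medium files better than a few very large ones; for large splittable formats a custom InputFormat, as suggested in the reply, is likely still the better fit.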