Looks like what I was suggesting doesn't work. :/

On Wed, Nov 11, 2015 at 4:49 PM, Jeff Zhang <zjf...@gmail.com> wrote:

> Yes, that's what I suggest. TextInputFormat supports multiple inputs, so on
> the Spark side we just need to provide an API for that.
>
> On Thu, Nov 12, 2015 at 8:45 AM, Pradeep Gollakota <pradeep...@gmail.com>
> wrote:
>
>> IIRC, TextInputFormat supports an input path that is a comma-separated
>> list. I haven't tried this, but I think you should just be able to do
>> sc.textFile("file1,file2,...")
>>
>> On Wed, Nov 11, 2015 at 4:30 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>>
>>> I know these workarounds, but wouldn't it be more convenient and
>>> straightforward to have SparkContext#textFiles?
>>>
>>> On Thu, Nov 12, 2015 at 2:27 AM, Mark Hamstra <m...@clearstorydata.com>
>>> wrote:
>>>
>>>> For more than a small number of files, you'd be better off using
>>>> SparkContext#union instead of RDD#union.  That will avoid building up a
>>>> lengthy lineage.
>>>>
>>>> On Wed, Nov 11, 2015 at 10:21 AM, Jakob Odersky <joder...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hey Jeff,
>>>>> Do you mean reading from multiple text files? In that case, as a
>>>>> workaround, you can use the RDD#union() (or ++) method to concatenate
>>>>> multiple RDDs. For example:
>>>>>
>>>>> val lines1 = sc.textFile("file1")
>>>>> val lines2 = sc.textFile("file2")
>>>>>
>>>>> val rdd = lines1 union lines2
>>>>>
>>>>> regards,
>>>>> --Jakob
>>>>>
>>>>> On 11 November 2015 at 01:20, Jeff Zhang <zjf...@gmail.com> wrote:
>>>>>
>>>>>> Users can use the HDFS glob syntax to specify multiple inputs, but
>>>>>> sometimes that is not convenient. I'm not sure why there's no
>>>>>> SparkContext#textFiles API. It should be easy to implement. I'd love to
>>>>>> create a ticket and contribute it if there are no other considerations
>>>>>> that I'm unaware of.
>>>>>>
>>>>>> --
>>>>>> Best Regards
>>>>>>
>>>>>> Jeff Zhang
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang
>>>
>>
>>
>
>
> --
> Best Regards
>
> Jeff Zhang
>
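
For reference, here is a minimal sketch of how the SparkContext#union
suggestion quoted above could back a textFiles-style helper. The helper name
textFiles and the object TextFilesSketch are hypothetical and not part of
Spark's API; "file1" and "file2" are the placeholder paths from Jakob's
example, and a local master is assumed only so the sketch can run standalone.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object TextFilesSketch {
  // Hypothetical helper (not a Spark API): read each path with sc.textFile,
  // then combine all RDDs with one SparkContext#union call instead of
  // chaining RDD#union, which keeps the lineage shallow.
  def textFiles(sc: SparkContext, paths: Seq[String]): RDD[String] =
    sc.union(paths.map(p => sc.textFile(p)))

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, for illustration only.
    val conf = new SparkConf().setAppName("textFiles-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Placeholder paths taken from the example earlier in the thread.
      val lines = textFiles(sc, Seq("file1", "file2"))
      println(lines.count())
    } finally {
      sc.stop()
    }
  }
}
```

Compared with folding RDD#union over the list, this builds a single union
over all inputs, which is the point Mark made about avoiding a lengthy lineage.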
