def readGenericRecords(sc: SparkContext, inputDir: String, startDate: Date, endDate: Date) = {

    val path = getInputPaths(inputDir, startDate, endDate)

    sc.newAPIHadoopFile[AvroKey[GenericRecord], NullWritable,
      AvroKeyInputFormat[GenericRecord]]("/A/B/C/D/D/2015/05/22/out-r-*.avro")

  }


This is my method. Can you show me where I should modify it to use
FileInputFormat? And if the path is added there, what should be passed
when invoking newAPIHadoopFile?
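
A sketch of what that might look like, assuming the `paths` come from
getInputPaths; the Job-based configuration handoff and the switch to
newAPIHadoopRDD are my assumptions, not confirmed in this thread. Once the
input paths are registered via FileInputFormat.addInputPath, no path string
is passed at call time, so newAPIHadoopRDD (which reads paths from the
configuration) replaces newAPIHadoopFile:

```scala
import org.apache.avro.generic.GenericRecord
import org.apache.avro.mapred.AvroKey
import org.apache.avro.mapreduce.AvroKeyInputFormat
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapreduce.Job
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.spark.SparkContext

def readGenericRecords(sc: SparkContext, paths: Seq[String]) = {
  val job = Job.getInstance(sc.hadoopConfiguration)
  // Register every input directory/glob on the job's configuration.
  paths.foreach(p => FileInputFormat.addInputPath(job, new Path(p)))
  // The configuration now carries the input paths, so no path
  // argument is given here.
  sc.newAPIHadoopRDD(
    job.getConfiguration,
    classOf[AvroKeyInputFormat[GenericRecord]],
    classOf[AvroKey[GenericRecord]],
    classOf[NullWritable])
}
```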

On Wed, May 27, 2015 at 2:20 PM, Eugen Cepoi <cepoi.eu...@gmail.com> wrote:

> You can do that using FileInputFormat.addInputPath
>
> 2015-05-27 10:41 GMT+02:00 ayan guha <guha.a...@gmail.com>:
>
>> What about /blah/*/blah/out*.avro?
>> On 27 May 2015 18:08, "ÐΞ€ρ@Ҝ (๏̯͡๏)" <deepuj...@gmail.com> wrote:
>>
>>> I am doing that now.
>>> Is there no other way ?
>>>
>>> On Wed, May 27, 2015 at 12:40 PM, Akhil Das <ak...@sigmoidanalytics.com>
>>> wrote:
>>>
>>>> How about creating two and union [ sc.union(first, second) ] them?
>>>>
>>>> Thanks
>>>> Best Regards
>>>>
>>>> On Wed, May 27, 2015 at 11:51 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com>
>>>> wrote:
>>>>
>>>>> I have this piece
>>>>>
>>>>> sc.newAPIHadoopFile[AvroKey[GenericRecord], NullWritable,
>>>>> AvroKeyInputFormat[GenericRecord]](
>>>>> "/a/b/c/d/exptsession/2015/05/22/out-r-*.avro")
>>>>>
>>>>> that takes ("/a/b/c/d/exptsession/2015/05/22/out-r-*.avro") this as
>>>>> input.
>>>>>
>>>>> I want to give a second directory as input, but this is invalid syntax:
>>>>>
>>>>>
>>>>> that takes ("/a/b/c/d/exptsession/2015/05/*22*/out-r-*.avro",
>>>>> "/a/b/c/d/exptsession/2015/05/*21*/out-r-*.avro")
>>>>>
>>>>> OR
>>>>>
>>>>> ("/a/b/c/d/exptsession/2015/05/*22*/out-r-*.avro,
>>>>> /a/b/c/d/exptsession/2015/05/*21*/out-r-*.avro")
>>>>>
>>>>>
>>>>> Please suggest.
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Deepak
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Deepak
>>>
>>>
>
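
For reference, the union approach suggested above [ sc.union(first, second) ]
can be sketched like this; the type parameters mirror the snippet earlier in
the thread:

```scala
// Read each day's directory as its own RDD, then union them into one.
val day21 = sc.newAPIHadoopFile[AvroKey[GenericRecord], NullWritable,
  AvroKeyInputFormat[GenericRecord]]("/a/b/c/d/exptsession/2015/05/21/out-r-*.avro")
val day22 = sc.newAPIHadoopFile[AvroKey[GenericRecord], NullWritable,
  AvroKeyInputFormat[GenericRecord]]("/a/b/c/d/exptsession/2015/05/22/out-r-*.avro")
val both = sc.union(day21, day22)
```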


-- 
Deepak
