def readGenericRecords(sc: SparkContext, inputDir: String, startDate: Date, endDate: Date) = {
  val path = getInputPaths(inputDir, startDate, endDate)
  sc.newAPIHadoopFile[AvroKey[GenericRecord], NullWritable,
    AvroKeyInputFormat[GenericRecord]]("/A/B/C/D/D/2015/05/22/out-r-*.avro")
}

This is my method. Can you show me where I should modify it to use FileInputFormat? If I add the path there, what should I pass when invoking newAPIHadoopFile?

On Wed, May 27, 2015 at 2:20 PM, Eugen Cepoi <cepoi.eu...@gmail.com> wrote:

> You can do that using FileInputFormat.addInputPath
>
> 2015-05-27 10:41 GMT+02:00 ayan guha <guha.a...@gmail.com>:
>
>> What about /blah/*/blah/out*.avro?
>>
>> On 27 May 2015 18:08, "ÐΞ€ρ@Ҝ (๏̯͡๏)" <deepuj...@gmail.com> wrote:
>>
>>> I am doing that now.
>>> Is there no other way?
>>>
>>> On Wed, May 27, 2015 at 12:40 PM, Akhil Das <ak...@sigmoidanalytics.com>
>>> wrote:
>>>
>>>> How about creating two RDDs and unioning [ sc.union(first, second) ] them?
>>>>
>>>> Thanks
>>>> Best Regards
>>>>
>>>> On Wed, May 27, 2015 at 11:51 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com>
>>>> wrote:
>>>>
>>>>> I have this piece:
>>>>>
>>>>>   sc.newAPIHadoopFile[AvroKey[GenericRecord], NullWritable,
>>>>>     AvroKeyInputFormat[GenericRecord]](
>>>>>     "/a/b/c/d/exptsession/2015/05/22/out-r-*.avro")
>>>>>
>>>>> that takes "/a/b/c/d/exptsession/2015/05/22/out-r-*.avro" as input.
>>>>>
>>>>> I want to give a second directory as input, but this is invalid syntax:
>>>>>
>>>>>   ("/a/b/c/d/exptsession/2015/05/22/out-r-*.avro",
>>>>>    "/a/b/c/d/exptsession/2015/05/21/out-r-*.avro")
>>>>>
>>>>> OR
>>>>>
>>>>>   ("/a/b/c/d/exptsession/2015/05/22/out-r-*.avro,
>>>>>    /a/b/c/d/exptsession/2015/05/21/out-r-*.avro")
>>>>>
>>>>> Please suggest.
>>>>>
>>>>> --
>>>>> Deepak
>>>
>>> --
>>> Deepak

--
Deepak
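One way to tie the suggestions in this thread together: Spark's newAPIHadoopFile routes its path argument through Hadoop's FileInputFormat.setInputPaths, which (if I read the Hadoop API right) treats the string as a comma-separated list of paths, globs included. So a helper like getInputPaths can simply return a comma-joined list of per-day globs and pass it as the single path argument. The thread does not show getInputPaths, so the sketch below is a hypothetical implementation of it; the directory layout (inputDir/yyyy/MM/dd/out-r-*.avro) is assumed from the paths quoted above.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

// Hypothetical implementation of the getInputPaths helper referenced in the
// method above: builds one glob per day between startDate and endDate
// (inclusive) and joins them with commas, which is the separator
// FileInputFormat.setInputPaths splits on.
def getInputPaths(inputDir: String, startDate: LocalDate, endDate: LocalDate): String = {
  val fmt = DateTimeFormatter.ofPattern("yyyy/MM/dd")
  Iterator.iterate(startDate)(_.plusDays(1))
    .takeWhile(d => !d.isAfter(endDate))
    .map(d => s"$inputDir/${d.format(fmt)}/out-r-*.avro")
    .mkString(",")
}

// The result would then be passed straight to newAPIHadoopFile (sketch, not
// compiled here, since it needs a live SparkContext):
//
//   sc.newAPIHadoopFile[AvroKey[GenericRecord], NullWritable,
//     AvroKeyInputFormat[GenericRecord]](
//     getInputPaths(inputDir, startDate, endDate))
```

The alternatives suggested downthread also work: build two RDDs and sc.union them, or configure a Hadoop Job yourself and call FileInputFormat.addInputPath once per path before handing the job's configuration to sc.newAPIHadoopRDD. The comma-joined string is just the least code if the per-day layout is regular.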