Glad it worked.

That said, I think adding filters to the FileSystem source is a fairly
common scenario, but the current documentation has no examples of it.
We could add examples to that section and also add methods to the
Builder class that make it easier to pass in the necessary filters.
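
For illustration, here is a minimal sketch of the filtering behavior
discussed in this thread (a regex applied to file names only, never to
the directories being traversed). It uses plain java.nio rather than
Flink, and the class and method names are hypothetical, not part of any
Flink API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not a Flink class: shows a file-name-only regex filter.
final class FileNameFilterSketch {

    /** Builds a predicate that fully matches the regex against the last path segment only. */
    static Predicate<Path> fileNameMatches(String regex) {
        Pattern pattern = Pattern.compile(regex);
        return path -> {
            Path name = path.getFileName();
            return name != null && pattern.matcher(name.toString()).matches();
        };
    }

    /** Recursively lists regular files under root; the filter is never applied to directories. */
    static List<Path> listMatchingFiles(Path root, Predicate<Path> filter) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(filter)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small throwaway directory tree to run the filter against.
        Path root = Files.createTempDirectory("filter-sketch");
        Files.createDirectories(root.resolve("nested"));
        Files.createFile(root.resolve("a.csv"));
        Files.createFile(root.resolve("b.json"));
        Files.createFile(root.resolve("nested").resolve("c.csv"));

        // Only the .csv files are listed; directory names are never tested.
        listMatchingFiles(root, fileNameMatches(".+\\.csv")).forEach(System.out::println);
    }
}
```

The point of the sketch is that a directory such as /tmp/test does not
itself match ".+\\.csv", so a filter that is also applied to directories
can reject the root path before any file is visited.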


Best,
Feng



On Wed, Aug 21, 2024 at 1:36 AM amogh joshi <amoghjo...@gmail.com> wrote:

> Thanks Feng.
>
> The code above worked well and served its purpose.
>
> I just modified it to use *NonSplittingRecursiveAllDirEnumerator* instead
> of *NonSplittingRecursiveEnumerator*, as the regex filter was to be
> applied to the individual filenames, excluding the parent directory
> specified as the path.
>
> FileSource.forRecordStreamFormat(new TextLineInputFormat(), new Path("file:///tmp/test/"))
>     .setFileEnumerator(() -> new NonSplittingRecursiveAllDirEnumerator(".+\\.csv"))
>     .monitorContinuously(Duration.ofSeconds(30))
>     .build();
>
> Regards,
> Amogh.
>
> On Sun, Aug 18, 2024 at 2:18 PM Feng Jin <jinfeng1...@gmail.com> wrote:
>
>> Hi Amogh
>>
>> You can test the code below:
>>
>> ```java
>> FileSource.forRecordStreamFormat(csvFormat, new Path("file:///tmp/test"))
>>     .setFileEnumerator(() -> new NonSplittingRecursiveEnumerator(
>>         path -> path.getName().endsWith(".csv")))
>>     .build();
>> ```
>>
>> Best,
>> Feng
>>
>>
>>
>> On Sat, Aug 17, 2024 at 4:41 AM amogh joshi <amoghjo...@gmail.com> wrote:
>>
>>> Hi Users,
>>>
>>> Any clues on configuring a regex path for the FileSource/FileSystem
>>> connector with the DataStream APIs would be appreciated.
>>>
>>> Regards,
>>> Amogh.
>>>
>>>
>>> On Thu, 15 Aug, 2024, 11:18 amogh joshi, <amoghjo...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am building a pretty straightforward processing pipeline as described
>>>> below, using *DataStream* *APIs* and *FileSystem connector*.
>>>>
>>>> *filesystem-source -> transforms -> database-sink*
>>>>
>>>> Everything worked well as long as the filesystem (source) held just a
>>>> single type (JSON) of file. Recently the filesystem was changed to store
>>>> other types of files too (Parquet, text, etc.).
>>>>
>>>> It appeared to be just a matter of passing a regex to the FileSource or
>>>> SplitEnumerator and creating different sources for different types of
>>>> files.
>>>>
>>>> However, the two public methods *FileSource::forRecordStreamFormat* and
>>>> *FileSource::forBulkFileFormat* do not allow configuring regex-based
>>>> file paths.
>>>>
>>>> What could be the recommended way to configure a custom regex for a
>>>> FileSource?
>>>>
>>>> I am using version 1.19.1.
>>>>
>>>> Regards,
>>>> Amogh.
>>>>
>>>
