Re: [DISCUSS] "latestFirst" option and metadata growing issue in File stream source

2020-07-30 Thread vikram agrawal
If we compare the file-stream source with other streaming sources such as Kafka, the current behavior is indeed incomplete. The ability to start streaming from a custom offset or a particular point in time is missing. Typically, file-stream sources don't auto-delete older data/files.
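
To make the gap concrete, here is a minimal sketch in plain Python (not Spark code) of what "starting from a particular point in time" could mean for a file source: only files modified at or after a chosen start timestamp are considered, in either oldest-first or newest-first order. The function `select_files` and its parameters `start_ts` and `latest_first` are hypothetical names used only for illustration; they are not part of Spark's API and do not describe how the file-stream source is implemented.

```python
from typing import Iterable, List, Tuple


def select_files(
    files: Iterable[Tuple[str, int]],
    start_ts: int,
    latest_first: bool = False,
) -> List[str]:
    """Return paths of files modified at or after start_ts.

    files:        (path, modification_time_ms) pairs; a hypothetical listing result
    start_ts:     custom starting point in epoch milliseconds (hypothetical parameter)
    latest_first: if True, newest files come first, loosely mirroring
                  the idea behind the "latestFirst" option
    """
    # Keep only files at or after the requested starting point.
    eligible = [(path, ts) for path, ts in files if ts >= start_ts]
    # Order by modification time; reverse the order when newest-first is requested.
    eligible.sort(key=lambda item: item[1], reverse=latest_first)
    return [path for path, _ in eligible]


# Illustrative listing with made-up file names and timestamps.
listing = [("a.json", 1000), ("b.json", 2000), ("c.json", 3000)]
print(select_files(listing, start_ts=2000))
print(select_files(listing, start_ts=2000, latest_first=True))
```

A source that supported such a starting point would also not need to track files older than it, which is one way the question of a custom start relates to the metadata growth discussed in this thread.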

Re: My curation of pending structured streaming PRs to review

2019-08-13 Thread vikram agrawal
Thanks, Jungtaek, for curating this list. It covers a lot of important fixes and performance improvements in structured streaming. Hi Devs, what is missing from a process perspective to get these PRs merged? Apart from this list, is there any other forum where we can request attention to such i

Structured Streaming Support for Amazon Kinesis in Spark 2.2.x

2018-03-13 Thread vikram agrawal
Hi All, I have implemented a Kinesis Connector for Structured Streaming. The code is available at https://github.com/qubole/kinesis-sql. The open-source Jira for the same is SPARK-18165. Design details are mentioned here - https://docs.google.com/p