It's a fairly small static file that may update once in a blue moon lol But I'm hopping to use existing functions. Why can't I just use CSV to table source?
Why should I have to now either write my own CSV parser or look for 3rd party, then what put in a Java Map and lookup that map? I'm finding Flink to be a bit of death by 1000 paper cuts lol if i put the CSV in a table I can then use it to join across it with the event no? On Fri, 27 Sep 2019 at 16:25, Sameer W <sam...@axiomine.com> wrote: > Connected Streams is one option. But may be an overkill in your scenario > if your CSV does not refresh. If your CSV is small enough (number of > records wise), you could parse it and load it into an object (serializable) > and pass it to the constructor of the operator where you will be streaming > the data. > > If the CSV can be made available via a shared network folder (or S3 in > case of AWS) you could also read it in the open function (if you use Rich > versions of the operator). > > The real problem I guess is how frequently does the CSV update. If you > want the updates to propagate in near real time (or on schedule) the option > 1 ( parse in driver and send it via constructor does not work). Also in > the second option you need to be responsible for refreshing the file read > from the shared folder. > > In that case use Connected Streams where the stream reading in the file > (the other stream reads the events) periodically re-reads the file and > sends it down the stream. The refresh interval is your tolerance of stale > data in the CSV. > > On Fri, Sep 27, 2019 at 3:49 PM John Smith <java.dev....@gmail.com> wrote: > >> I don't think I need state for this... >> >> I need to load a CSV. I'm guessing as a table and then filter my events >> parse the number, transform the event into geolocation data and sink that >> downstream data source. >> >> So I'm guessing i need a CSV source and my Kafka source and somehow join >> those transform the event... >> >> On Fri, 27 Sep 2019 at 14:43, Oytun Tez <oy...@motaword.com> wrote: >> >>> Hi, >>> >>> You should look broadcast state pattern in Flink docs. >>> >>> --- >>> Oytun Tez >>> >>> *M O T A W O R D* >>> The World's Fastest Human Translation Platform. >>> oy...@motaword.com — www.motaword.com >>> >>> >>> On Fri, Sep 27, 2019 at 2:42 PM John Smith <java.dev....@gmail.com> >>> wrote: >>> >>>> Using 1.8 >>>> >>>> I have a list of phone area codes, cities and their geo location in CSV >>>> file. And my events from Kafka contain phone numbers. >>>> >>>> I want to parse the phone number get it's area code and then associate >>>> the phone number to a city, geo location and as well count how many numbers >>>> are in that city/geo location. >>>> >>>