Hi Andrew,

It would be great if you could come up with something to demonstrate the idea. There are lots of wiki pages on GitHub you can refer to (including ones on the server-side and client-side architecture). The unique feature of OpenRefine is that it lets the user interact with the system and wrangle the data set by pointing and clicking. But I think this is also the hardest part to migrate or refactor onto another system like Flink or Spark. Put another way, OpenRefine deals with finite data sets, while Flink and Spark deal with infinite data sets with high velocity and variation. They do overlap to some extent, of course. If we adopt a new framework like Flink, we may have to consider what to give up and what to keep.
Jacky

On Wed, Jun 14, 2017 at 2:25 PM, Andrew Psaltis <psaltis.and...@gmail.com> wrote:

> Thad,
> Based on your description that OpenRefine uses similar techniques to
> Zeppelin, I *think* the reading and writing will work.
>
> The Undo/Redo I am fuzzy on.
>
> I will try over the next couple of days and see if I can make something
> like this work (at least a trivial use case). Personally I think it would be
> cool to allow business users to wrangle data with OpenRefine with the power
> of Flink behind it.
>
> On Wed, Jun 14, 2017 at 7:53 PM, Thad Guidry <thadgui...@gmail.com> wrote:
>
>> Andrew,
>>
>> So your idea is that Flink could be used as a storage abstraction layer for
>> OpenRefine? Where OpenRefine would use TableSources for reading and
>> TableSinks for writing?
>> And would that still work with our concept of Undo/Redo in OpenRefine,
>> using Flink's Savepoints in concert with TableSources and TableSinks? That
>> last part is where I am reading Flink docs now and still seeing a lot of
>> fuzziness, which worries me.
>>
>> -Thad
>> +ThadGuidry <https://www.google.com/+ThadGuidry>
>
> --
> Thanks,
> Andrew
>
> Subscribe to my book: Streaming Data <http://manning.com/psaltis>
> <https://www.linkedin.com/pub/andrew-psaltis/1/17b/306>
> twitter: @itmdata <http://twitter.com/intent/user?screen_name=itmdata>