Hi Xingcan,
thank you for your input!
On 27 February 2017 at 14:03, Xingcan Cui wrote:
> Hi Vasia and Greg,
>
> thanks for the discussion. I'd like to share my thoughts.
>
> 1) I don't think it's necessary to extend the algorithm list intentionally.
> It's just like a textbook that can not cove
Hi Fabian,
Thanks for your attention to this discussion. Let me share some ideas
about this. :)
1. Yes, the solution I have proposed can indeed be extended to support
multi-watermarks. A single watermark is a special case of multiple
watermarks (n = 1). I agree that for the realization of the si
Hi,
So how can I read the available records of my datasource? I saw in some
examples that the print() method will print the available data of that
datasource (like files).
Thanks,
Pawan
On Wed, Mar 1, 2017 at 11:30 AM, Xingcan Cui wrote:
> Hi Pawan,
>
> in Flink, most of the methods for DataSet
Hi Pawan,
in Flink, most of the methods for DataSet (including print()) will just add
operators to the plan but not really run it. If the DASInputFormat has no
error, you can run the plan by calling environment.execute().
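For example, a minimal sketch of a job that actually reads your records could
look like the following (assuming DASInputFormat has a no-argument constructor
and produces your Record type; the output path is just an example):

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class ReadDASRecords {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // createInput() only adds the source to the plan; nothing is read yet.
        DataSet<Record> records = env.createInput(new DASInputFormat());

        // Adding a sink also only extends the plan.
        records.writeAsText("/tmp/das-records");

        // Only this call executes the plan and invokes the input format.
        env.execute("Read records from DASInputFormat");
    }
}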
Best,
Xingcan
On Wed, Mar 1, 2017 at 12:17 PM, Pawan Manishka Gunarathna <
Hi,
I have implemented Flink's InputFormat interface for my datasource.
It has our own data type, *Record*. So my class looks as follows:
public class DASInputFormat implements InputFormat {
}
So when I executed the print() method, my console showed the Flink execution,
but nothing will
Hi all,
I have a question about designating the `rowtime` attribute. The current
design does this during the DataStream-to-Table conversion. Does this mean
that `rowtime` is only valid for the source streams and cannot be
designated after a subquery? (That's why I considered using an alias to
dynamically
Hi Jincheng Sun,
registering watermark functions for different attributes to allow each of
them to be used in a window is an interesting idea.
However, watermarks only work well if the streaming data is (almost) in
timestamp order. Since it is not possible to sort a stream, all attributes
that wo
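For reference, this is roughly how timestamps and watermarks are bound to a
single attribute today; the Order type, its creationTime field, the 10-second
out-of-orderness bound, and the existing DataStream<Order> `orders` are
made-up examples:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;

// Watermarks are derived from exactly one timestamp attribute, and the
// extractor assumes the stream is almost ordered on that attribute.
DataStream<Order> withTimestamps = orders.assignTimestampsAndWatermarks(
    new BoundedOutOfOrdernessTimestampExtractor<Order>(Time.seconds(10)) {
        @Override
        public long extractTimestamp(Order order) {
            return order.creationTime; // the single attribute used for event time
        }
    });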
Hi Chen and Aljoscha,
thanks for the great proposal and work.
I prefer the WindowedOperator.getLateStream() variant without explicit tags.
I think it is fine to start adding side output to ProcessFunction (keyed
and non-keyed) and window operators and see how it is picked up by users.
Best, Fabi
Hi Philipp,
It's great to hear you are interested in Flink ML!
Based on your description, your prototype seems like an interesting
approach for combining online+offline learning. If you're interested, we
might find a way to integrate your work, or at least your ideas, into
Flink ML if we deci
Hi all,
I have completed a first implementation that works for the SQL query
SELECT a, SUM(b) OVER (PARTITION BY c ORDER BY a RANGE BETWEEN 2 PRECEDING) AS
sumB FROM MyTable
I have SUM, MAX, MIN, AVG, and COUNT implemented, but I could only test them on
simple queries such as the one above. Is there a
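For reference, here is a sketch of how such a query could be issued through the
Table API once this support is in, with the frame end written out explicitly;
the registration, the table/field names, and the `input` DataStream are
assumptions for illustration:

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.java.StreamTableEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
StreamTableEnvironment tEnv = TableEnvironment.getTableEnvironment(env);

// `input` is assumed to be a DataStream with fields a, b, c.
tEnv.registerDataStream("MyTable", input, "a, b, c");

Table sums = tEnv.sql(
    "SELECT a, SUM(b) OVER (" +
    " PARTITION BY c ORDER BY a" +
    " RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) AS sumB " +
    "FROM MyTable");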
Tzu-Li (Gordon) Tai created FLINK-5939:
--
Summary: Wrong version in README.md for several Apache Flink
extensions
Key: FLINK-5939
URL: https://issues.apache.org/jira/browse/FLINK-5939
Project: Fli
Till Rohrmann created FLINK-5938:
Summary: Replace ExecutionContext by Executor in Scheduler
Key: FLINK-5938
URL: https://issues.apache.org/jira/browse/FLINK-5938
Project: Flink
Issue Type: I
Thinking about this a bit more...
I think it may be interesting to enable two modes for event-time
advancement in Flink:
1) The current mode, which I'll call partition-based, pessimistic
event-time advancement
2) Key-based, eager event-time advancement
In this key-based eager mode it's actually
Kostas Kloudas created FLINK-5937:
-
Summary: Add documentation about the task lifecycle.
Key: FLINK-5937
URL: https://issues.apache.org/jira/browse/FLINK-5937
Project: Flink
Issue Type: Bug
Quick update: I created a branch where I make the result type of
WindowedStream operations more specific:
https://github.com/aljoscha/flink/blob/windowed-stream-result-specific/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/WindowedStream.java
We would need this for t
@Tzu-Li Yes, the deluxe stream would not allow another keyBy(). Or we could
allow it but then we would exit the world of the deluxe stream and per-key
watermarks and go back to the realm of normal streams and keyed streams.
On Tue, 28 Feb 2017 at 10:08 Tzu-Li (Gordon) Tai
wrote:
> Throwing in so
By the way, I also don't see the benefit of doing the transition piece by
piece.
On Mon, 27 Feb 2017 at 22:21 Dawid Wysakowicz
wrote:
> I agree with adopting a custom codestyle/checkstyle for Flink, but if I
> understood correctly, most people agree there is no point in providing an
> unenforced
Alex DeCastro created FLINK-5936:
Summary: Can't pass keyed vectors to KNN join algorithm
Key: FLINK-5936
URL: https://issues.apache.org/jira/browse/FLINK-5936
Project: Flink
Issue Type: Im
David Anderson created FLINK-5935:
-
Summary: confusing/misleading error message when failing to
restore savepoint
Key: FLINK-5935
URL: https://issues.apache.org/jira/browse/FLINK-5935
Project: Flink
Till Rohrmann created FLINK-5934:
Summary: Scheduler in ExecutionGraph null if failure happens in
ExecutionGraph.restoreLatestCheckpointedState
Key: FLINK-5934
URL: https://issues.apache.org/jira/browse/FLINK-5934
Hi everyone, thanks for sharing your thoughts. I really like Timo’s
proposal, and I have a few thoughts I want to share.
We want to keep the query the same for batch and streaming. IMO, “process
time” is something special to DataStream, while it is not a well-defined
term for a batch query. So it is kind o
On Tue, Feb 28, 2017 at 11:38 AM, Aljoscha Krettek wrote:
> I see the ProcessFunction as a bit of the generalised future of FlatMap, so
> to me it makes sense to only allow side outputs on the ProcessFunction but
> I'm open for anything. If we decide for this I'm happy with an additional
> method
About 1: We can definitely go with Jamie's proposal for the late data side
output, for me this is just a name and anything that has "late" in it is
perfect!
Regarding 2: I agree, and I thought about implementing split/select on top
of side outputs, and it should be easily doable. I think side output
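Just for illustration, a rough sketch of how split/select could be expressed
with side outputs; the OutputTag and getSideOutput() shapes follow the API
being discussed here, and the even/odd split is made up:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class SplitViaSideOutputs {
    // The tag plays the role of a split/select name.
    private static final OutputTag<Integer> ODDS = new OutputTag<Integer>("odds") {};

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<Integer> input = env.fromElements(1, 2, 3, 4, 5);

        // Main output carries the even values, the side output the odd ones.
        SingleOutputStreamOperator<Integer> evens = input.process(
            new ProcessFunction<Integer, Integer>() {
                @Override
                public void processElement(Integer value, Context ctx, Collector<Integer> out) {
                    if (value % 2 == 0) {
                        out.collect(value);
                    } else {
                        ctx.output(ODDS, value);
                    }
                }
            });

        DataStream<Integer> odds = evens.getSideOutput(ODDS);

        evens.print();
        odds.print();
        env.execute("split/select via side outputs");
    }
}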
Aljoscha Krettek created FLINK-5933:
---
Summary: Allow Evictor for merging windows
Key: FLINK-5933
URL: https://issues.apache.org/jira/browse/FLINK-5933
Project: Flink
Issue Type: Bug
Kostas Kloudas created FLINK-5932:
-
Summary: Order of legacy vs new state initialization in the
AbstractStreamOperator.
Key: FLINK-5932
URL: https://issues.apache.org/jira/browse/FLINK-5932
Project: F
1. I like the variant without the explicit OutputTag for the WindowOperator:
WindowedOperator windowedResult = input
    .keyBy(...)
    .window(...)
    .apply(...);
DataStream lateData = windowedResult.getLateDataSideOutput();
I like Jamie's proposal getLateStream() a little better though. On the
oth
Throwing in some thoughts:
When a source determines that no more data will come for a key (which is
itself a bit of a tricky problem), then it should signal to downstream
operations to take the key out of watermark calculations, that is, so that
we can release some space.
I don’t think this is p