GaoLun created FLINK-3783:
-
Summary: Support weighted random sampling with reservoir
Key: FLINK-3783
URL: https://issues.apache.org/jira/browse/FLINK-3783
Project: Flink
Issue Type: Improvement
Hi Stephan and Ufuk,
Thank you for your reply.
I have assigned uids to the "assignTimestampsAndWatermarks", "addSource",
and "apply" operators. However, I couldn't assign a uid to the time window, so
the time window doesn't hold any state regarding timestamps.
For example, I implemented a cust
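For context, a minimal sketch of the uid pattern discussed here, assuming the
1.0 DataStream API (MySource, MyExtractor, MyKeySelector and MyWindowFunction
are hypothetical placeholders, not names from this thread). The window itself
takes no uid, but the operator returned by apply() does, and that operator is
what holds the window state:

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

env.addSource(new MySource()).uid("source")
   .assignTimestampsAndWatermarks(new MyExtractor()).uid("timestamps")
   .keyBy(new MyKeySelector())
   .timeWindow(Time.minutes(1))
   // uid on the result of apply() names the window operator, and with it
   // the window state, even though the window itself has no uid() method
   .apply(new MyWindowFunction()).uid("window-apply");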
Chenguang He created FLINK-3782:
---
Summary: ByteArrayOutputStream and ObjectOutputStream should close
Key: FLINK-3782
URL: https://issues.apache.org/jira/browse/FLINK-3782
Project: Flink
Issue T
I was trying out the new scala-shell with streaming support...
The following code executes correctly the first time I run it:
val survival = benv.readCsvFile[(String, String, String,
String)]("file:///home/trevor/gits/datasets/haberman/haberman.data")
survival.count()
However, if I call survival
If you work on plain Integer (or other non-POJO types) you need to
provide a KeySelector to make it work.
For your case, something like this:
.keyBy(new KeySelector<Integer, Integer>() {
    @Override
    public Integer getKey(Integer value) throws Exception {
        return value;
    }
})
As S
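A self-contained sketch of how that selector is used, assuming a
DataStream<Integer> (the variable names here are hypothetical, not from the
original thread):

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
DataStream<Integer> integers = env.fromElements(1, 2, 3);

// keyBy with an explicit KeySelector works for atomic types like Integer,
// where keyBy(0) would fail with the InvalidProgramException quoted below
KeyedStream<Integer, Integer> keyed = integers
    .keyBy(new KeySelector<Integer, Integer>() {
        @Override
        public Integer getKey(Integer value) throws Exception {
            return value;   // the value itself serves as the key
        }
    });

// reduce() is now available on the keyed stream
keyed.reduce(new ReduceFunction<Integer>() {
    @Override
    public Integer reduce(Integer a, Integer b) {
        return a + b;
    }
});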
It is thrown by:
if (!type.isTupleType()) {
    throw new InvalidProgramException("Specifying keys via field positions is only valid " +
        "for tuple data types. Type: " + type);
}
So, like Saikat say
Not entirely related, but for the special case of writing a parallelized source
that emits records in event time order, I found the MergeIterator to be most
useful. Here's an
example: https://github.com/nupic-community/flink-htm/blob/eb29f97f08f3482b32228db7284f669aad8dce2e/flink-htm-streaming-s
Unfortunately, making code generation a separate module would introduce a
cyclic dependency.
Code generation requires the TypeInfo, which is available in flink-core, and
flink-core requires
the generated serializers from the code generation module. Do you have a
solution for this?
I think if we can com
Hello Jitendra,
I am new to the Flink community but may have seen this issue earlier. Can you
try to use DataStream<Integer> instead of KeyedStream for integers4?
Regards Saikat
On Mon, Apr 18, 2016 at 7:37 PM, Jitendra Agrawal <
jitendra.agra...@northgateps.com> wrote:
> Hi Team,
>
> Problem Description : Wh
Ted Yu created FLINK-3781:
-
Summary: BlobClient may be left unclosed in
BlobCache#deleteGlobal()
Key: FLINK-3781
URL: https://issues.apache.org/jira/browse/FLINK-3781
Project: Flink
Issue Type: Bug
I think then you have to either reconfigure your cluster environment or
wait until we bump the Akka version to 2.4.x which supports having an
internal and external IP address.
Cheers,
Till
On Fri, Apr 15, 2016 at 6:36 PM, star jlong
wrote:
> Hi Till/Ned,
>
> Sorry, I thought this was my post.
>
Hi Team,
Problem Description: When I call the *reduce()* method on a keyedStream
object, I get an exception:
"*org.apache.flink.api.common.InvalidProgramException: Specifying keys via
field positions is only valid for tuple data types. Type: Integer*".
StreamExecutionEnvironment env
Greg Hogan created FLINK-3780:
-
Summary: Jaccard Similarity
Key: FLINK-3780
URL: https://issues.apache.org/jira/browse/FLINK-3780
Project: Flink
Issue Type: New Feature
Components: Gelly
Ufuk Celebi created FLINK-3779:
--
Summary: Add support for queryable state
Key: FLINK-3779
URL: https://issues.apache.org/jira/browse/FLINK-3779
Project: Flink
Issue Type: Improvement
C
Hi!
Yes, window contents are part of savepoints. If you change the topology, it
is crucial that the old window contents can be matched to the new
operators.
If you change the structure of the program, you probably need to assign
persistent names to the operators. See
https://ci.apache.org
Till Rohrmann created FLINK-3778:
Summary: ScalaShellRemoteStreamEnvironment cannot be forwarded a
user configuration
Key: FLINK-3778
URL: https://issues.apache.org/jira/browse/FLINK-3778
Project: Fli
As Flavio underlined, it is not about selecting a certain number of rows,
but executing queries with a sequence of joins on a very large database. We
played around to find the best throughput. Honestly, I prefer to have many
smaller range queries with more parallel threads than fewer expensive
querie
Can you please share the program before and after the savepoint?
– Ufuk
On Mon, Apr 18, 2016 at 3:11 PM, Ozan DENİZ wrote:
> Hi everyone,
>
> I am trying to implement savepoint mechanism for my Flink project.
>
> Here is the scenario:
>
> I got the snapshot of Flink application by using "flink s
Dear Flink community,
Please vote on releasing the following candidate as Apache Flink version 1.0.2.
The commit to be voted on:
d39af152a166ddafaa2466cdae82695880893f3e
Branch:
release-1.0.2-rc3 (see
https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git;a=shortlog;h=refs/heads/release-1.
Hi everyone,
I am trying to implement savepoint mechanism for my Flink project.
Here is the scenario:
I got the snapshot of the Flink application by using the "flink savepoint <jobID>"
command while the application is running.
After saving the snapshot of the application, I canceled the job from the web UI, then I
cha
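For reference, the full cycle being described, assuming the Flink 1.0
command-line client (the jar name and the placeholders are illustrative):

bin/flink savepoint <jobID>                 # trigger a savepoint; its path is printed
bin/flink cancel <jobID>                    # stop the running job
bin/flink run -s <savepointPath> app.jar    # resume the (modified) job from the savepoint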
+1 for not mixing Java and Scala in flink-core.
Maybe it makes sense to implement the code-generated serializers /
comparators as a separate module which can be plugged in. This could be
pure Scala.
In general, I think it would be good to have some kind of "version
management" for serializers in p
OK, thanks for the responses. I'll go ahead and include it for RC3.
On Mon, Apr 18, 2016 at 12:20 PM, Aljoscha Krettek wrote:
> +1, since others are blocked on this
>
> On Mon, 18 Apr 2016 at 12:10 Ufuk Celebi wrote:
>
>> I am fine with it since it only touches a utility class. So +1 to
>> inclu
+1, since others are blocked on this
On Mon, 18 Apr 2016 at 12:10 Ufuk Celebi wrote:
> I am fine with it since it only touches a utility class. So +1 to
> include it as well.
>
> – Ufuk
>
> On Mon, Apr 18, 2016 at 12:07 PM, Fabian Hueske wrote:
> > There is also FLINK-3657 which was recently me
+1 from my side as well
On Mon, Apr 18, 2016 at 12:10 PM, Ufuk Celebi wrote:
> I am fine with it since it only touches a utility class. So +1 to
> include it as well.
>
> – Ufuk
>
> On Mon, Apr 18, 2016 at 12:07 PM, Fabian Hueske wrote:
> > There is also FLINK-3657 which was recently merged to
I am fine with it since it only touches a utility class. So +1 to
include it as well.
– Ufuk
On Mon, Apr 18, 2016 at 12:07 PM, Fabian Hueske wrote:
> There is also FLINK-3657 which was recently merged to master.
> This feature was requested by the Mahout community and the commit changes
> the vi
Hi Fabian, I've just created a JIRA for that (FLINK-3777).
As you said, input splits should not be too fine-grained, but we have a table
with 11 billion rows that can't be queried with ranges greater than
100K rows because it has a lot of JOINs, and increasing this threshold
implies incredibly
There is also FLINK-3657 which was recently merged to master.
This feature was requested by the Mahout community and the commit changes
the visibility of a method in DataSetUtils.
So strictly speaking, this is not a bug fix but a new feature. On the other
hand, it is a very lightweight change and doe
Flavio Pompermaier created FLINK-3777:
-
Summary: Add open and close methods to manage IF lifecycle
Key: FLINK-3777
URL: https://issues.apache.org/jira/browse/FLINK-3777
Project: Flink
Iss
I agree, a method to close an input format is missing.
InputFormat is an API stable interface, so it is not possible to extend it
(until Flink 2.0). RichInputFormat is API stable as well, but an abstract
class. So it should be possible to add an empty default implementation of a
closeInputFormat()
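A rough, simplified sketch of what that could look like (openInputFormat() is
added for symmetry; both hook names are illustrative, not a committed API, and
imports of java.io.IOException and Flink's InputFormat/InputSplit are elided):

public abstract class RichInputFormat<OT, T extends InputSplit>
        implements InputFormat<OT, T> {

    /** Called once before the first input split is opened. Empty default,
     *  so existing subclasses keep compiling. A JDBC format would open its
     *  shared connection here. */
    public void openInputFormat() throws IOException {
    }

    /** Called once after the last input split is closed. Empty default.
     *  A JDBC format would close its shared connection here. */
    public void closeInputFormat() throws IOException {
    }
}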
This vote is cancelled in favour of RC3, because of the problem raised
by Fabian.
On Fri, Apr 15, 2016 at 1:02 PM, Fabian Hueske wrote:
> There is a request from the Mahout community to include a fix for
> FLINK-3762 in 1.0.2.
> Stephan gave a +1 [1] and I would also like to include it.
>
> I'm r
Till Rohrmann created FLINK-3776:
Summary: Flink Scala shell does not allow to set configuration for
local execution
Key: FLINK-3776
URL: https://issues.apache.org/jira/browse/FLINK-3776
Project: Flin
Till Rohrmann created FLINK-3775:
Summary: Flink Scala shell does not respect Flink configuration
Key: FLINK-3775
URL: https://issues.apache.org/jira/browse/FLINK-3775
Project: Flink
Issue Ty
Till Rohrmann created FLINK-3774:
Summary: Flink configuration is not correctly forwarded to
PlanExecutor in ScalaShellRemoteEnvironment
Key: FLINK-3774
URL: https://issues.apache.org/jira/browse/FLINK-3774
Hi,
yes, I'm afraid you need a custom operator for that. (We are working on
providing built-in support for this, though)
I sketched an Operator that does the sorting and also wrote a quick example
that uses it:
SortedWindowOperator:
https://gist.github.com/aljoscha/6600bc1121b7f8a0f68b89988dd341bd
Yes, I know Janino is a pure Java project. I meant that if we add Scala code
to flink-core, we would have to add a Scala dependency to flink-core, and that
could be confusing.
Regards,
Chiwan Park
> On Apr 18, 2016, at 2:49 PM, Márton Balassi wrote:
>
> Chiwan, just to clarify Janino is a Java project. [1]
>
Yes, I forgot to mention that I could instantiate the connection in
configure(), but then I can't close it (as you confirmed) :(
On Mon, Apr 18, 2016 at 9:46 AM, Aljoscha Krettek
wrote:
> There is also InputFormat.configure() which is called before any split
> processing happens. But I see yo
There is also InputFormat.configure() which is called before any split
processing happens. But I see your point about a missing close() method
that is called after all input splits have been processed.
On Mon, 18 Apr 2016 at 09:44 Stefano Bortoli wrote:
> Of course there is one already. We'll lo
Of course there is one already. We'll look into the runtime context.
Regards,
Stefano
2016-04-18 9:41 GMT+02:00 Stefano Bortoli :
> Being a generic JDBC input format, I would prefer to stay with Row,
> letting the developer manage the cast according to the driver
> functionalities.
>
> As for the
Being a generic JDBC input format, I would prefer to stay with Row, letting
the developer manage the cast according to the driver's capabilities.
As for the open() and close() issue, I agree with Flavio that we'd need
better management of the InputFormat lifecycle. Perhaps a new interface
exten
Talking with Stefano this morning and looking at the DataSourceTask code, we
discovered that the open() and close() methods are both called for every
split and not once per InputFormat instance (maybe open and close should be
renamed openSplit and closeSplit to avoid confusion...).
I think that i
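For reference, a simplified sketch of the call pattern just described (this
is not the actual DataSourceTask source; the serializer and variable names
are placeholders):

format.configure(taskParameters);            // once per task, before any split
for (InputSplit split : assignedSplits) {
    format.open(split);                      // called once per split
    OT record = serializer.createInstance();
    while (!format.reachedEnd()) {
        record = format.nextRecord(record);  // read the records of this split
        // ... forward the record downstream ...
    }
    format.close();                          // called once per split
}
// nothing runs here, after the last split, which is exactly the gap a
// closeInputFormat()-style hook would fill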