Alexander Alexandrov created FLINK-20043:
Summary: Add flink-sql-connector-kinesis package
Key: FLINK-20043
URL: https://issues.apache.org/jira/browse/FLINK-20043
Project: Flink
Alexander Alexandrov created FLINK-20042:
Summary: Add end-to-end tests for Kinesis Table sources and sinks
Key: FLINK-20042
URL: https://issues.apache.org/jira/browse/FLINK-20042
Project
FYI, I recently revisited state-of-the-art CSV parsing libraries for Emma.
I think this blog post might be useful:
https://github.com/uniVocity/csv-parsers-comparison
The uniVocity parsers library seems to be dominating the benchmarks and is
feature complete.
As far as I can tell at the moment u
Just to clarify - by "losing the commit history" you actually mean "losing
the ability to annotate each line in a file with its last commit", right?
Or is there some other sense in which something is lost after applying a
bulk re-format?
Cheers,
A.
On Sat, Feb 25, 2017 at 7:10 AM Henry Saputra
wr
/org/apache/flink/api/common/typeinfo/TypeInfoFactory.java
On Sat, Oct 8, 2016 at 4:00 PM Alexander Alexandrov <
alexander.s.alexand...@gmail.com> wrote:
I wanted to open this directly as a JIRA to follow-up on FLINK-3042,
however my account (aalexandrov) does not seem to have the nec
I wanted to open this directly as a JIRA to follow-up on FLINK-3042,
however my account (aalexandrov) does not seem to have the necessary
privileges, so I will post this to the dev list instead.
The current approach for registration of custom `TypeInformation`
implementations which relies exclusiv
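For readers following along, here is a minimal sketch of the factory-based
registration that FLINK-3042 introduced (all names below are illustrative and
the factory body is stubbed; this is not the proposal itself):

    import java.lang.reflect.Type
    import java.util

    import org.apache.flink.api.common.typeinfo.{TypeInfo, TypeInfoFactory, TypeInformation}

    // Hypothetical user type; the annotation points the type extractor at the factory.
    @TypeInfo(classOf[MyIdTypeInfoFactory])
    class MyId(val value: Long)

    // Called by the extractor whenever MyId appears in a job's type hierarchy.
    class MyIdTypeInfoFactory extends TypeInfoFactory[MyId] {
      override def createTypeInfo(
          t: Type,
          genericParameters: util.Map[String, TypeInformation[_]]): TypeInformation[MyId] = {
        // A real factory would construct a custom TypeInformation here.
        throw new UnsupportedOperationException("stub - supply a concrete TypeInformation")
      }
    }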
> As far as I know, the reason why the broadcast variables are implemented
that way is that the senders would have to know which sub-tasks are
deployed to which TMs.
As the broadcast variables are realized as additionally attached "broadcast
channels", I am assuming that the same behavior will app
graph get re-executed.
>
> (c) You have two operators with the same name that become tasks with the
> same name.
>
> Do any of those explanations make sense in your setting?
>
> Stephan
>
>
> On Tue, May 31, 2016 at 12:48 PM, Alexander Alexandrov <
> alexander.s.a
Hello,
I am analyzing the logs from a Flink batch job and am seeing the following
two lines:
2016-05-30 15:32:31,701 INFO ...- DataSource (at ${path}) (4/4)
(7efe8fcfe9c7c7e6cd4683e1b5c06a3a) switched from SCHEDULED to DEPLOYING
2016-05-30 15:32:31,701 INFO ...- DataSource (at $
Hi Greg,
I just pushed v1.0.0-rc2 for Peel to Sonatype.
As Till said, we are using the framework extensively at the TU for
benchmarking and comparing different systems (mostly Flink and Spark).
We recently used Peel to conduct some experiments for FLINK-2237. If you
want to learn more about the
Is it possible to link to important JIRA-s in the list of new features as
you did in the 0.8.0 release notes?
For example, I was wondering whether I can find more information about the
"Off-heap Managed Memory" model.
Regards,
Alexander
2015-11-14 20:53 GMT+01:00 Ron Crocker :
> Hi Fabian -
>
>
I wouldn't stop with GitHub - the main benefit of spaces is that the code
looks the same in all viewers because it does not depend on a user-specific
parameter (the tab size).
2015-11-09 14:02 GMT+01:00 Ufuk Celebi :
> Minor thing in favour of spaces: Reviewability on GitHub is improved (
My two cents - there are already Maven artifacts deployed for 2.11 in the
SNAPSHOT repository. I think it might be confusing if they suddenly
disappear for the stable release.
2015-10-29 11:58 GMT+01:00 Maximilian Michels :
> Seems like we agree that we need artifacts for different versions of S
just cherry-picked (and if needed amended) by a committer without
too much unnecessary discussion and excluded from the "shepherding process".
2015-10-17 12:32 GMT+02:00 Alexander Alexandrov <
alexander.s.alexand...@gmail.com>:
> One suggestion from me: in GitHub you can ma
One suggestion from me: in GitHub you can make clear who the current
shepherd is through the "Assignee" field in the PR (which can and IMHO
should be different from the user who actually opened the request).
Regards,
A.
2015-10-16 15:58 GMT+02:00 Fabian Hueske :
> Hi folks,
>
> I think we can sp
> >
> > @Chiwan:
> >
> > There are a few mentions of the Scala version in the docs as well. For
> > example in "docs/index.md" and on the website under "downloads".
> >
> > We should make sure we explain on these pages that the
> > Regards,
> > Chiwan Park
> >
> >
> >> On Jul 2, 2015, at 2:57 PM, Alexander Alexandrov <
> alexander.s.alexand...@gmail.com> wrote:
> >>
> >> @Chiwan: let me know if you need hands-on support. I'll be more than
> happy to help (as my do
Alexander Alexandrov created FLINK-2311:
---
Summary: Set flink-* dependencies in flink-contrib as "provided"
Key: FLINK-2311
URL: https://issues.apache.org/jira/browse/FLINK-2311
Proj
What about adding some state to the DataBag internals that tracks the
following conditions:
1. whether the last job execution was triggered by an "enforcer" API method
like print() / collect();
2. whether a DataSource / lazy operator was created after that;
If 1 is true and 2 is false, a WAR
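A minimal sketch of that bookkeeping (all names here are mine, purely
illustrative, and not from the actual DataBag internals):

    // Hedged sketch of the proposed DataBag state; names are illustrative.
    class ExecutionState {
      @volatile private var lastRunByEnforcer = false // (1) last run triggered by print()/collect()
      @volatile private var lazyOpsSinceRun = false   // (2) source/lazy operator created afterwards

      def onEnforcer(): Unit = { lastRunByEnforcer = true; lazyOpsSinceRun = false }
      def onLazyOp(): Unit = { lazyOpsSinceRun = true }

      // Condition under which the re-execution warning would fire.
      def shouldWarn: Boolean = lastRunByEnforcer && !lazyOpsSinceRun
    }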
I added a comment with suggestions on how to proceed in the JIRA issue.
2015-06-17 22:41 GMT+02:00 :
>
> Hello dear Developer,
> Currently aggregation functions are implemented based on sorting. We would
> like to add hash-based aggregation to Flink. We would be thankful if you
> could tell us how t
During an offline chat some time ago, Stephan Ewen mentioned that there is
an ongoing effort towards dynamic memory allocation in some feature branch
lying around. Can you point me to it, as I would like to look at the
code? Thanks.
I've seen some work on adaptive learning rates in the past few days.
Maybe we can think about extending the base algorithm and comparing the use
case setting for the IMPRO-3 project.
@Felix you can discuss this with the others on Wednesday; Manu will also be
there and can give some feedback, I'll try
Hi there,
I was trying to find a way to get the meta-information about the TM executing
an operator via the ExecutionContext object, but without success.
Is this possible at the moment? If not, are there any objections to adding
it? (I would prepare an issue with a patch.)
Regards,
Alex
I think that these two should be renamed to flink-optimizer, no?
./flink-staging/flink-language-binding/flink-python/pom.xml:
flink-compiler
./flink-staging/flink-language-binding/flink-language-binding-generic/pom.xml:
flink-compiler
2015-05-19 21:07 GMT+02:00 Alexander Alexandrov
I had a different issue related to the fact that
flink-language-binding-generic was not able to find (a potentially
outdated) flink-compiler dependency. I had to wipe out the local flink
artifacts from my .m2/repository to make this work.
2015-05-19 18:06 GMT+02:00 Robert Metzger :
> We could act
PS. Is there a particular reason why the APIs are stacked above each other
in the picture (ML on top of Gelly on top of the Table API)? I was actually
picturing the three next to each other...
2015-05-12 12:08 GMT+02:00 Alexander Alexandrov <
alexander.s.alexand...@gmail.com>:
> I s
I suggest changing the layout of the bottom half in the following way
(this will solve the alignment issue):
- a 2-column layout in 1:1 ratio for *Getting Started*: first column with the
text and the download button, second column with the Maven code snippets
- a 2-column layout in 1:1 ratio for the *Recen
" to
> correctly multiply, ideally as a group for streamlined multiplication.
>
> Johannes
>
> -Ursprüngliche Nachricht-
> Von: Alexander Alexandrov [mailto:alexander.s.alexand...@gmail.com]
> Gesendet: Sonntag, 26. April 2015 23:22
> An: dev@flink.apache.org
> B
I thought about your problem over the weekend. Unfortunately the algorithm
that you describe does not fit "regular" equi-join semantics, but I think
it could be "fitted" with a more complex dataflow.
To achieve that, I would partition the (active) domain of the two datasets
into fine-granular interv
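To make the interval idea concrete, here is a hedged sketch (not the actual
dataflow from the thread; all types and values are invented): discretize the
join domain into fixed-width intervals, replicate each range to every interval
it overlaps, equi-join on the interval id, then apply the real predicate as a
filter.

    import org.apache.flink.api.scala._

    object BandJoinSketch extends App {
      case class Span(lo: Int, hi: Int)
      case class Point(x: Int)

      val env = ExecutionEnvironment.getExecutionEnvironment
      val width = 10 // interval granularity; a tuning knob

      val spans  = env.fromElements(Span(3, 27), Span(40, 45))
      val points = env.fromElements(Point(5), Point(26), Point(44))

      // Replicate each span to every interval it overlaps; key points by interval.
      val spansByInterval  = spans.flatMap(s => (s.lo / width to s.hi / width).map(i => (i, s)))
      val pointsByInterval = points.map(p => (p.x / width, p))

      spansByInterval.join(pointsByInterval).where(0).equalTo(0)
        .filter { case ((_, s), (_, p)) => s.lo <= p.x && p.x <= s.hi } // real predicate
        .print()
    }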
Hey there,
Please use the user mailing list for user-related questions (this list is
for Flink internals only).
At the moment outer joins are not directly supported in Flink, but there
are good indications that this will change in the next 4-8 weeks. For the
time being, you can use a CoGroup with
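For reference, a hedged sketch of that CoGroup workaround (illustrative types;
a placeholder value stands in for NULL on the unmatched side):

    import org.apache.flink.api.scala._
    import org.apache.flink.util.Collector

    object LeftOuterJoinSketch extends App {
      val env = ExecutionEnvironment.getExecutionEnvironment

      val left  = env.fromElements((1, "a"), (2, "b"))
      val right = env.fromElements((1, "x"))

      // Emit a placeholder for left elements without a match - the "outer" part.
      val result = left.coGroup(right).where(0).equalTo(0) {
        (ls: Iterator[(Int, String)], rs: Iterator[(Int, String)],
         out: Collector[(Int, String, String)]) =>
          val matches = rs.toSeq // buffer: the right side may be consumed repeatedly
          for ((k, l) <- ls) {
            if (matches.isEmpty) out.collect((k, l, "<none>"))
            else matches.foreach { case (_, r) => out.collect((k, l, r)) }
          }
      }

      result.print()
    }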
Hello,
Can you please re-post this on the user list and make sure you have
formatted the example code?
At the moment it is rather hard to read.
2015-04-09 15:35 GMT+02:00 hager sallah :
> I want write program flink on any databaseuser input filed and type of
> filed and when read database want
Hi Martin,
The answer to your question really depends on the DOP at which you will be
running the job and the expected selectivity (the fraction of lines with
that certain ID) in case this does not depend on "the other side" and can
be pre-filtered prior to broadcasting.
However, since Flink's op
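As a hedged illustration of the pre-filter-then-broadcast route (invented
field names; joinWithTiny hints a broadcast-hash join of the small side):

    import org.apache.flink.api.scala._

    object BroadcastJoinSketch extends App {
      case class Line(id: Int, text: String)
      case class Key(id: Int)

      val env = ExecutionEnvironment.getExecutionEnvironment
      val lines = env.fromElements(Line(1, "a"), Line(2, "b"), Line(1, "c"))
      val keys  = env.fromElements(Key(1), Key(2), Key(3))

      // Pre-filter the selective side first, then broadcast it to the join.
      val wanted = keys.filter(_.id == 1)
      val result = lines.joinWithTiny(wanted).where("id").equalTo("id")

      result.print()
    }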
> Should "print()" be also an "eager" statement?
I would expect this to be the case as I can only imagine an implementation
of print() via collect().
2015-04-06 14:37 GMT+02:00 Stephan Ewen :
> count() and collect() need to immediately trigger an execution, because the
> driver program cannot pr
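One way to see why: if print() is (or behaves as if it were) defined in terms
of collect(), it inherits collect()'s eagerness. A hedged sketch, not Flink's
actual implementation:

    import org.apache.flink.api.scala._

    object EagerPrint {
      // A print() built on collect() must trigger execution, because collect() does.
      def eagerPrint[T](ds: DataSet[T]): Unit =
        ds.collect().foreach(println) // collect() runs the job and ships results back
    }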
Alexander Alexandrov created FLINK-1829:
---
Summary: Conflicting Jackson version in the Flink POMs
Key: FLINK-1829
URL: https://issues.apache.org/jira/browse/FLINK-1829
Project: Flink
I have a similar issue here:
I would like to run a dataflow up to a particular point and materialize (in
memory) the intermediate result. Is this possible at the moment?
Regards,
Alex
2015-04-02 17:33 GMT+02:00 Felix Neutatz :
> Hi,
>
> I have run the following program:
>
> final ExecutionEnvir
+Table
2015-03-26 10:28 GMT+01:00 Robert Metzger :
> +Table
>
>
> On Thu, Mar 26, 2015 at 10:13 AM, Aljoscha Krettek
> wrote:
>
> > Thanks Henry. :D
> >
> > +Relation
> >
> > On Thu, Mar 26, 2015 at 9:36 AM, Till Rohrmann
> > wrote:
> > > +Table
> > >
> > > On Thu, Mar 26, 2015 at 9:32 AM, Márt
Will take a look at that. The easiest solution would be to fix the version
to 2.10 for now. Is this OK?
2015-03-25 1:48 GMT+01:00 Stephan Ewen (JIRA) :
> Stephan Ewen created FLINK-1781:
> ---
>
> Summary: Quickstarts broken due to Scala Version Variab
+1 for DataTable as core abstraction name and "flink-table" or something
similar as the package name.
2015-03-25 11:54 GMT+01:00 Aljoscha Krettek :
> I also prefer Relation. So what should we do? Doesn't really look like
> consensus.
>
> On Sat, Mar 21, 2015 at 6:02 PM, Paris Carbone wrote:
> >
n Gábor <
> reckone...@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > +1 for the stricter Java code styles.
> > >> > >
> > >> > > We should not forget about providing code formatter settings for
> > >
+1 for not limiting the line length.
2015-03-16 14:39 GMT+01:00 Stephan Ewen :
> +1 for not limiting the line length. Everyone should have a good sense to
> break lines. When in exceptional cases people violate this, it is usually
> for a good reason.
>
> On Mon, Mar 16, 2015 at 2:18 PM, Maximili
suffixed packages.
> So far, the demand for Scala 2.11 has been low on the mailing list, and for
> those who want to use it, we have a good way to enable it manually.
>
> On Mon, Mar 16, 2015 at 12:35 PM, Alexander Alexandrov <
> alexander.s.alexand...@gmail.com> wrote:
>
eed to set the properties correctly when building their flink
> projects.
>
>
>
>
> On Wed, Mar 11, 2015 at 12:41 AM, Alexander Alexandrov <
> alexander.s.alexand...@gmail.com> wrote:
>
> > The PR is here: https://github.com/apache/flink/pull/477
> >
>
+1
2015-03-12 9:23 GMT+01:00 Ufuk Celebi :
> +1 I think it's a good idea to remove it and finish the deprecation. ;)
>
> Thanks for looking into it Fabian.
>
> – Ufuk
>
> On 10 Mar 2015, at 20:42, Henry Saputra wrote:
>
> > Thanks guys,
> >
> > I have filed FLINK-1681 [1] to track this issue.
>
+1
2015-03-11 9:41 GMT+01:00 Till Rohrmann :
> If Spargel's functionality is a subset of Gelly, I'm also in favor of a
> deprecation. This will direct new users directly to Gelly and gives old
> ones time to adapt their code.
>
> On Wed, Mar 11, 2015 at 1:56 AM, Henry Saputra
> wrote:
>
> > Than
The PR is here: https://github.com/apache/flink/pull/477
Cheers!
2015-03-10 18:07 GMT+01:00 Alexander Alexandrov <
alexander.s.alexand...@gmail.com>:
> Yes, will do.
>
> 2015-03-10 16:39 GMT+01:00 Robert Metzger :
>
>> Very nice work.
>> The changes are probably
for scala 2.11 ?
>
> On Tue, Mar 10, 2015 at 2:50 PM, Alexander Alexandrov <
> alexander.s.alexand...@gmail.com> wrote:
>
> > We have it almost ready here:
> >
> > https://github.com/stratosphere/flink/commits/scala_2.11_rebased
> >
> > I wanted to op
eal to offer scala_version x hadoop_version
> builds
> > > for newer releases.
> > > You only need to add more builds here:
> > >
> >
> https://github.com/apache/flink/blob/master/tools/create_release_files.sh#L131
> > >
> > >
> > >
>
+1 for Scala
2015-03-09 15:34 GMT+01:00 Márton Balassi :
> Then if no objections in 24 hours I'd open a JIRA issue for this.
>
> On Mon, Mar 9, 2015 at 3:23 PM, Till Rohrmann
> wrote:
>
> > +1 for Scala :-)
> >
> > On Sat, Mar 7, 2015 at 1:56 PM, Márton Balassi >
> > wrote:
> >
> > > I'm strong
or does a maven property work?
> (Profile
> > may be needed for quasiquotes dependency?)
> >
> > On Mon, Mar 2, 2015 at 4:36 PM, Alexander Alexandrov <
> > alexander.s.alexand...@gmail.com> wrote:
> >
> >> Hi there,
> >>
> >> since I'm
> On Mon, Mar 2, 2015 at 4:36 PM, Alexander Alexandrov <
> alexander.s.alexand...@gmail.com> wrote:
>
> > Hi there,
> >
> > since I'm relying on Scala 2.11.4 on a project I've been working on, I
> > created a branch which updates the Scala
Hi there,
since I'm relying on Scala 2.11.4 on a project I've been working on, I
created a branch which updates the Scala version used by Flink from 2.10.4
to 2.11.4:
https://github.com/stratosphere/flink/commits/scala_2.11
Everything seems to work fine and the PR contains minor changes compared
Alexander Alexandrov created FLINK-1613:
---
Summary: Cannot submit to remote ExecutionEnvironment from IDE
Key: FLINK-1613
URL: https://issues.apache.org/jira/browse/FLINK-1613
Project: Flink
Apache's commons-math implementation offers various strategies for handling
this scenario:
http://commons.apache.org/proper/commons-math/jacoco/org.apache.commons.math3.stat.clustering/KMeansPlusPlusClusterer.java.html
(take a look at the EmptyClusterStrategy enum options)
2015-02-24 23:28 GMT+
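For instance, with the newer ml.clustering package (which carries the same
enum; a hedged, self-contained example with made-up data):

    import org.apache.commons.math3.ml.clustering.{DoublePoint, KMeansPlusPlusClusterer}
    import org.apache.commons.math3.ml.clustering.KMeansPlusPlusClusterer.EmptyClusterStrategy
    import org.apache.commons.math3.ml.distance.EuclideanDistance
    import org.apache.commons.math3.random.JDKRandomGenerator
    import scala.collection.JavaConverters._

    object EmptyClusterDemo extends App {
      val points = Seq(Array(1.0, 1.0), Array(1.1, 0.9), Array(9.0, 9.0)).map(new DoublePoint(_))

      // FARTHEST_POINT re-seeds an emptied cluster instead of failing the run.
      val clusterer = new KMeansPlusPlusClusterer[DoublePoint](
        2, 100, new EuclideanDistance, new JDKRandomGenerator,
        EmptyClusterStrategy.FARTHEST_POINT)

      clusterer.cluster(points.asJava).asScala.foreach(c => println(c.getPoints))
    }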
Hi Vasia,
I am trying to look at the problem in more detail. Which version of the MST
are you talking about?
Right now in the Gelly repository I can only find the SSSP example
(parallel Bellman-Ford) from Section 4.2 in [1].
However, it seems that the issues encountered by Andra are related to t
I guess the intended behavior here is to just throw a nicer error, as you
cannot really join two data streams.
2015-02-20 16:41 GMT+01:00 Daniel Bali (JIRA) :
> Daniel Bali created FLINK-1594:
> --
>
> Summary: DataStreams don't support self-join
>
Alexander Alexandrov created FLINK-1464:
---
Summary: Added ResultTypeQueryable interface to
TypeSerializerInputFormat.
Key: FLINK-1464
URL: https://issues.apache.org/jira/browse/FLINK-1464
Forget what I just said, didn't realize that it's Scala :)
2015-01-29 16:24 GMT+01:00 Alexander Alexandrov <
alexander.s.alexand...@gmail.com>:
> Have you tried declaring your UDF classes (e.g. TotalRankDistribution) as
> static?
>
> 2015-01-29 16:14 GMT+01:00 A
Have you tried declaring your UDF classes (e.g. TotalRankDistribution) as
static?
2015-01-29 16:14 GMT+01:00 Arvid Heise :
> Hi Flinker,
>
> I'm currently desperately trying to get a workflow to run remotely on a
> server. The workflow works fine in the local execution environment (both
> with Ex
InputFormat,TypeInformation) instead of env.readFile()
> then you can pass TypeInformation manually without implementing
> ResultTypeQueryable.
>
> Regards,
> Timo
>
>
>
>
> On 29.01.2015 14:54, Alexander Alexandrov wrote:
>
>> The problem seems to b
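A hedged sketch of Timo's suggestion against the 0.8-era Scala API
(constructor and serializer details differ between versions):

    import org.apache.flink.api.scala._
    import org.apache.flink.api.java.io.TypeSerializerInputFormat

    object ExplicitTypeInfoSketch extends App {
      val env = ExecutionEnvironment.getExecutionEnvironment

      // Derive the TypeInformation up front instead of leaving it to runtime reflection.
      val info = createTypeInformation[(Int, String)]
      val format = new TypeSerializerInputFormat[(Int, String)](
        info.createSerializer(env.getConfig)) // some versions take TypeInformation directly

      // The Scala API supplies the TypeInformation implicitly here; the Java API
      // equivalent is env.createInput(format, info).
      val data: DataSet[(Int, String)] = env.createInput(format)
      data.print()
    }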
Alexander Alexandrov <
alexander.s.alexand...@gmail.com>:
> The problem seems to be that the reflection analysis cannot determine the
> type of the TypeSerializerInputFormat.
>
> One possible solution is to add the ResultTypeQueryable interface and
> force clients
inference, but at the
moment I cannot find any other usages of the TypeSerializerInputFormat
except for the unit test.
-- Forwarded message --
From: Alexander Alexandrov
Date: 2015-01-29 12:04 GMT+01:00
Subject: TypeSerializerInputFormat cannot determine its type automatically
To: u
There is already an ongoing discussion and an issue open about that:
http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Gather-a-distributed-dataset-td3216.html
Sadly, I am currently pressed for time with other things, but if nobody else
handles this, I expect to be able to work
I don't get the difference between Private and LimitedPrivate, but
otherwise this seems like quite a nice idea.
It would also be good if we could agree on what these tags actually mean and
add that meaning to the documentation.
2015-01-27 15:46 GMT+01:00 Robert Metzger :
> Hi,
>
> Hadoop has annotatio
d fine and executed the job.
>
> I tracked it down to the following commit using `git bisect`:
>
> {noformat}
> 93eadca782ee8c77f89609f6d924d73021dcdda9 is the first bad commit
> commit 93eadca782ee8c77f89609f6d924d73021dcdda9
> Author: Alexander Alexandrov
> Date: Wed Dec 24 13:49:56 2014 +0200
>
>
Alexander Alexandrov created FLINK-1422:
---
Summary: Missing usage example for "withParameters"
Key: FLINK-1422
URL: https://issues.apache.org/jira/browse/FLINK-1422
Project: Flink
Hi there,
I have to implement some generic fallback strategy on top of a more
abstract DSL in order to keep datasets in a temp space (e.g. Tachyon). My
implementation is based on the 0.8 release. At the moment I am undecided
between three options:
- BinaryInputFormat / BinaryOutputFormat
-
Just to clarify, in order to spare us some time in the discussion: I
*deliberately* want to use the Flink Java API from Scala with Scala core types.
2015-01-20 18:53 GMT+01:00 Alexander Alexandrov <
alexander.s.alexand...@gmail.com>:
> Hi there,
>
> I cannot figure out how the Scala
Hi there,
I cannot figure out how the Scala base types (e.g. scala.Int, scala.Double,
etc.) are mapped to the Flink runtime.
It seems that they are not treated the same as their Java counterparts
(e.g. java.lang.Integer, java.lang.Double). For example, if I write the
following code:
val inputFo
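To illustrate what I mean (a hedged check; the exact TypeInformation classes
you see depend on the Flink version):

    import org.apache.flink.api.scala._

    object BaseTypeCheck extends App {
      // scala.Int is analyzed by the Scala macro, java.lang.Integer by the Java extractor.
      val scalaIntInfo = createTypeInformation[Int]
      val javaIntInfo  = createTypeInformation[java.lang.Integer]

      println(s"scala.Int         -> $scalaIntInfo")
      println(s"java.lang.Integer -> $javaIntInfo")
      println(s"treated the same? -> ${scalaIntInfo == javaIntInfo}")
    }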
I/O type classes are known, so we don't need static code analysis
> for that. For types inside UDFs I can add that requirement to FLINK-1319.
>
>
>
> On 20.01.2015 11:51, Alexander Alexandrov wrote:
>
>> +1 for program analysis from me too...
>>
>> Should
+1 for program analysis from me too...
Should also be doable at a lower level (e.g. analysis of compiled *.class
files) with some off-the-shelf libraries, right?
2015-01-20 11:39 GMT+01:00 Till Rohrmann :
> I like the idea to automatically figure out which types are used by a
> program and to re
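For example, a hedged sketch with ASM (one such off-the-shelf library):
listing each method's descriptor, from which parameter and return types can
be decoded - the raw material for type-usage analysis.

    import org.objectweb.asm.{ClassReader, ClassVisitor, MethodVisitor, Opcodes}

    object ClassScanSketch extends App {
      class MethodPrinter extends ClassVisitor(Opcodes.ASM5) {
        override def visitMethod(access: Int, name: String, desc: String,
                                 signature: String, exceptions: Array[String]): MethodVisitor = {
          println(s"$name $desc") // e.g. "indexOf (Ljava/lang/String;)I"
          super.visitMethod(access, name, desc, signature, exceptions)
        }
      }

      // Reads the class file from the classpath by name; method bodies are skipped.
      new ClassReader("java.lang.String").accept(new MethodPrinter, ClassReader.SKIP_CODE)
    }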
Hi Daniel,
I think at least regarding 3 there is a quick fix in the pom.xml - we need
to exclude the hadoop-* artifacts from the shade plugin. I think Robert
can confirm whether this is the case.
Regards,
Alexander
2015-01-18 18:28 GMT+01:00 Daniel Warneke :
> Hi,
>
> I just pushed my first
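Something along these lines in the shade-plugin configuration (an untested
sketch of the exclusion, not a verified fix):

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <configuration>
        <artifactSet>
          <excludes>
            <!-- keep Hadoop out of the shaded jar -->
            <exclude>org.apache.hadoop:*</exclude>
          </excludes>
        </artifactSet>
      </configuration>
    </plugin>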
Thanks, I will have a look at your comments tomorrow and create a PR which
should supersede 210. BTW, is there already a test case where I can see the
suggested way to do staged execution with the new ExecutionEnvironment
API?
I thought about your second remark as well. The following lines pitc
on about this recently:
>
> http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Changing-Scala-Version-to-2-11-x-td1473.html
>
> On Thu, Jan 15, 2015 at 5:51 PM, Alexander Alexandrov <
> alexander.s.alexand...@gmail.com> wrote:
>
> > Currently, Fli
Currently, Flink uses Scala 2.10.4 and relies on the macro paradise
compiler plugin to get the quasi-quotes functionality.
This makes the code incompatible with third-party add-ons that use macros
written against a newer version of Scala.
Scala 2.11 has been around for almost a year already. It
+1 for "git rebase"
2015-01-15 17:39 GMT+01:00 Aljoscha Krettek :
> No, I always do a manual "git rebase". Makes for a cleaner history. And I
> have more control over how things are merged and squashed.
> On Jan 15, 2015 5:27 PM, "Henry Saputra" wrote:
>
> > Oh, so you guys do not use the tools/
, we need another mechanism than
> the accumulators. Let's create a design doc or thread an get working on
> that. Probably involves adding another set of akka messages from TM -> JM
> -> Client. Or something like an extension to the BLOB manager for streams?
>
> G
15, at 11:42, Alexander Alexandrov <
> alexander.s.alexand...@gmail.com> wrote:
>
> > Hi there,
> >
> > I wished for intermediate datasets, and Santa Ufuk made my wishes come
> true
> > (thank you, Santa)!
> >
> > Now that FLINK-986 is in the mainlin
Hi there,
I wished for intermediate datasets, and Santa Ufuk made my wishes come true
(thank you, Santa)!
Now that FLINK-986 is in the mainline, I want to ask some practical
questions.
In Spark, there is a way to put a value from the local driver to the
distributed runtime via
val x = env.paral
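For comparison, the Flink counterpart of that pattern (a hedged sketch):

    import org.apache.flink.api.scala._

    object LocalValueSketch extends App {
      val env = ExecutionEnvironment.getExecutionEnvironment

      // Lift a local, driver-side value into the distributed runtime,
      // analogous to Spark's sc.parallelize(...).
      val x: DataSet[Int] = env.fromCollection(Seq(1, 2, 3))
      x.print()
    }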