Hi Guys,
We are in the process of creating a proof of concept.
I am looking for a sample project - Flink Scala or Java - that can load
data from database to database, or
from CSV to any relational database.
CSV --> S3 --> AWS Redshift
Could someone please advise me on that?
C
Hi Steffen,
Turns out that FLINK-4514 just missed Flink 1.1.2 and wasn’t included in the
release (I’ll update the resolve version in JIRA to 1.1.3, thanks for noticing
this!).
The Flink community is going to release 1.1.3 asap, which will include the fix.
If you don’t want to wait for the releas
Thank you for the response.
Regarding adding to the page, I will check with our PR department.
Regards,
Hironori
On 2016/10/04 at 21:12, Fabian Hueske wrote:
> Thanks Hironori for sharing these excellent news!
> Do you think it would be possible to add your use case to Flink's Powered-By
> wiki page
I think Josh found a "WIP" bug. The code is very much in flux because of
the new feature that allows changing the parallelism with which savepoints
are resumed.
The "user code class loader" is not yet properly used in the operator state
backend when reloading snapshot state. This will be integrat
Thanks for your prompt response Stephan.
I'd wait for Flink 1.1.3 !!!
Best Regards
Varaga
On Tue, Oct 4, 2016 at 5:36 PM, Stephan Ewen wrote:
> The plan to release 1.1.3 is asap ;-)
>
> Waiting for the last backported patches to get in, then release testing and
> release.
>
> If you want to te
It would be great to know if this only occurs in setups where Netty is
involved (more than one TaskManager, and at least one
shuffle/rebalance) or also in one-TaskManager setups (which have local
channels only).
Stephan
On Tue, Oct 4, 2016 at 11:49 AM, Till Rohrmann wrote:
> Hi Tarandeep,
>
Yes Kostas, thank you for the explanation, I will take a look.
Regards,
Vinay Patil
On Tue, Oct 4, 2016 at 11:23 AM, Kostas Kloudas [via Apache Flink User
Mailing List archive.] wrote:
> Hi Vinay,
>
> These methods are useful when using your trigger with SessionWindows. When
> using session win
The plan to release 1.1.3 is asap ;-)
Waiting for the last backported patches to get in, then release testing and
release.
If you want to test it today, you would need to manually build the
release-1.1 branch.
Best,
Stephan
On Tue, Oct 4, 2016 at 5:46 PM, Chakravarthy varaga <
chakravarth...@gmail
How about just overriding the "readLine()" method to call
"super.readLine()" and catching EOF exceptions?
On Tue, Oct 4, 2016 at 5:56 PM, Fabian Hueske wrote:
> Hi Yassine,
>
> AFAIK, there is no built-in way to ignore corrupted compressed files.
> You could try to implement a FileInputFormat th
Hi Vinay,
These methods are useful when using your trigger with SessionWindows. When
using session windows,
the state of a window and that of the corresponding trigger have to be merged
with that of other windows.
These methods do exactly that: the canMerge() says if the trigger can be used
w
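For illustration, here is a rough sketch of a trigger that implements the two
merging hooks Kostas mentions. It is modeled on Flink's built-in
EventTimeTrigger and is not the exact code discussed in this thread; depending
on your exact 1.1.x version you may also need to implement clear().

import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;

public class MergingEventTimeTrigger extends Trigger<Object, TimeWindow> {

    @Override
    public TriggerResult onElement(Object element, long timestamp,
                                   TimeWindow window, TriggerContext ctx) throws Exception {
        ctx.registerEventTimeTimer(window.maxTimestamp());
        return TriggerResult.CONTINUE;
    }

    @Override
    public TriggerResult onEventTime(long time, TimeWindow window, TriggerContext ctx) {
        return time == window.maxTimestamp() ? TriggerResult.FIRE : TriggerResult.CONTINUE;
    }

    @Override
    public TriggerResult onProcessingTime(long time, TimeWindow window, TriggerContext ctx) {
        return TriggerResult.CONTINUE;
    }

    @Override
    public boolean canMerge() {
        // tells Flink that this trigger can be used with merging (session) windows
        return true;
    }

    @Override
    public void onMerge(TimeWindow window, OnMergeContext ctx) {
        // when session windows are merged, re-register the timer for the merged window
        ctx.registerEventTimeTimer(window.maxTimestamp());
    }
}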
Hi Kostas,
Yes, you are right, I am always doing FIRE_AND_PURGE. If we don't do this
and only use FIRE, the window function will get the elements
in an incremental fashion (1, 2, 3, ... and so on).
I had observed this while testing.
Can you please explain the importance of the canMerge and onMerge functions?
Hi Yassine,
AFAIK, there is no built-in way to ignore corrupted compressed files.
You could try to implement a FileInputFormat that wraps the CsvInputFormat
and forwards all calls to the wrapped CsvIF.
The wrapper would also catch and ignore the EOFException.
If you do that, you would not be able
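To make the idea concrete, a minimal sketch of such a wrapper could look like
the following. This is an illustration only: it forwards just the core calls,
whereas a real wrapper would also need to forward configure(), getStatistics(),
createInputSplits(), and so on to the wrapped format.

import java.io.EOFException;
import java.io.IOException;
import org.apache.flink.api.common.io.FileInputFormat;
import org.apache.flink.core.fs.FileInputSplit;

public class SkipCorruptedFilesInputFormat<T> extends FileInputFormat<T> {

    private final FileInputFormat<T> wrapped;
    private boolean corrupted;

    public SkipCorruptedFilesInputFormat(FileInputFormat<T> wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public void open(FileInputSplit split) throws IOException {
        corrupted = false;
        wrapped.open(split);
    }

    @Override
    public boolean reachedEnd() throws IOException {
        return corrupted || wrapped.reachedEnd();
    }

    @Override
    public T nextRecord(T reuse) throws IOException {
        try {
            return wrapped.nextRecord(reuse);
        } catch (EOFException e) {
            // truncated / corrupted compressed file: skip the rest of it
            corrupted = true;
            return null;
        }
    }

    @Override
    public void close() throws IOException {
        wrapped.close();
    }
}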
Hi Vinay,
By setting the allowed_lateness to LongMax you are ok.
Sorry I forgot that this was the default value.
Just a note (although you have it right in your code), in this case
you should always FIRE_AND_PURGE and not just FIRE. Otherwise
your state will keep growing as it is never ga
Hi Gordon,
Do I need to clone and build release-1.1 branch to test this?
I currently use the Flink 1.1.2 runtime. When is the plan to release it
in 1.1.3?
Best Regards
Varaga
On Tue, Oct 4, 2016 at 9:25 AM, Tzu-Li (Gordon) Tai
wrote:
> Hi,
>
> Helping out here: this is the PR for async
Hi all,
I am reading a large number of GZip-compressed CSV files, nested in an HDFS
directory:
Configuration parameters = new Configuration();
parameters.setBoolean("recursive.file.enumeration", true);
DataSet> hist = env.readCsvFile("hdfs:///shared/logs/")
.ignoreFirstLine()
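(The generic type and the rest of the snippet were cut off by the archive. For
reference, a complete version of the same pattern could look roughly like the
following; the Tuple2<String, Long> schema is only an assumption for
illustration.)

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;

public class RecursiveGzipCsvRead {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        Configuration parameters = new Configuration();
        parameters.setBoolean("recursive.file.enumeration", true);

        // .gz files are decompressed transparently based on their extension
        DataSet<Tuple2<String, Long>> hist = env
                .readCsvFile("hdfs:///shared/logs/")
                .ignoreFirstLine()
                .types(String.class, Long.class)
                .withParameters(parameters);

        hist.print();
    }
}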
Hi Kostas,
The late elements are immediately getting triggered with the code I have
sent,
I have tested it with a test case as follows (I am doing the outer-join
operation by doing the union of stream1 and stream2):
1. Push 5 records to Kafka Topic 1 -> sourceStream1
2. Wait for a few minutes -
Hi Vinay,
From what I understand from your code, the only difference between your trigger
and the
one shipped with Flink is that for the late elements, instead of firing and
keeping the element,
you fire and purge, i.e. clean the window state.
This does not solve the problem of dropping t
Hi Ufuk,
any ideas? Any configuration that could be wrong?
Cheers,
Konstantin
On 30.09.2016 13:13, Konstantin Knauf wrote:
> Hi Ufuk,
>
> thanks for your quick answer.
>
> Setup: 2 Servers, each running a JM as well as TM
>
> 1) Removing all existing blobstores locally (/tmp) as well as on H
Hi Kostas,
Thank you for your reply. Yes, that would be good functionality to have,
but for now the custom trigger that stays close to the 1.0.3 behavior works for me.
public TriggerResult onElement(Object element, long timestamp, TimeWindow
window, TriggerContext ctx) throws Exception { if(window.maxTimestamp() <=
ct
Hi Dominik,
To only fire when new elements have arrived, you should modify your
EventTimeTriggerWithEarlyAndLateFiring to detect that
more elements have arrived since the last firing.
To do so, you should add some extra state, e.g. a
ValueStateDescriptor, that you set to true in the onElement(
Hi Dominik,
you could extend the EventTimeTriggerWithEarlyAndLateFiring trigger to
store for each key whether you’ve seen a new element since the last firing
or not. When firing you can set the state back to alreadyFired. For that
you can use the TriggerContext.getPartitionedState.
The community
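A rough sketch of the bookkeeping described in the last two replies. This is
not Dominik's EventTimeTriggerWithEarlyAndLateFiring itself, only an
illustration of tracking a "seen a new element" flag in partitioned trigger
state; depending on your Flink version you may also need to implement clear().

import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;

public class FireOnlyOnNewElementsTrigger extends Trigger<Object, TimeWindow> {

    private final ValueStateDescriptor<Boolean> sawNewElement =
            new ValueStateDescriptor<>("sawNewElement", Boolean.class, false);

    @Override
    public TriggerResult onElement(Object element, long timestamp,
                                   TimeWindow window, TriggerContext ctx) throws Exception {
        // remember that a new element arrived since the last firing
        ctx.getPartitionedState(sawNewElement).update(true);
        ctx.registerEventTimeTimer(window.maxTimestamp());
        return TriggerResult.CONTINUE;
    }

    @Override
    public TriggerResult onEventTime(long time, TimeWindow window, TriggerContext ctx) throws Exception {
        if (Boolean.TRUE.equals(ctx.getPartitionedState(sawNewElement).value())) {
            // reset the flag so the next timer only fires if new data arrived
            ctx.getPartitionedState(sawNewElement).update(false);
            return TriggerResult.FIRE;
        }
        return TriggerResult.CONTINUE;
    }

    @Override
    public TriggerResult onProcessingTime(long time, TimeWindow window, TriggerContext ctx) {
        return TriggerResult.CONTINUE;
    }
}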
Hi,
I'm heavily relying on TimeWindows for my real time processing. Roughly
my job consumes from an AMQP queue, computes some time buckets and saves
the time-buckets to Cassandra.
I found the EventTimeTriggerWithEarlyAndLateFiring [1] class which
already helped me a lot: Even with long time-w
I think you can start from this (using the Flink Table API), I hope it could be
helpful:
PS: maybe someone could write a blog post on how to do this with Scala since
it's a frequent question on the mailing list... :)
public static void main(String[] args) throws Exception {
String path
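(Flavio's Table API snippet is cut off above. As a stand-in, here is a hedged
DataSet-based sketch of a CSV-to-relational-database job; it does not use the
Table API, and the file path, schema, JDBC URL and table name are all made up
for illustration.)

import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

import org.apache.flink.api.common.io.RichOutputFormat;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;

public class CsvToJdbcJob {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple2<Integer, String>> rows = env
                .readCsvFile("file:///tmp/input.csv")            // hypothetical input
                .types(Integer.class, String.class);

        rows.output(new JdbcWriter(
                "jdbc:postgresql://localhost:5432/poc",          // hypothetical target DB
                "INSERT INTO target_table (id, name) VALUES (?, ?)"));

        env.execute("CSV to JDBC PoC");
    }

    // Minimal JDBC writer; a real job would add batching, credentials and retries,
    // or use Flink's flink-jdbc connector instead.
    public static class JdbcWriter extends RichOutputFormat<Tuple2<Integer, String>> {

        private final String url;
        private final String insertQuery;
        private transient Connection connection;
        private transient PreparedStatement statement;

        JdbcWriter(String url, String insertQuery) {
            this.url = url;
            this.insertQuery = insertQuery;
        }

        @Override
        public void configure(Configuration parameters) {}

        @Override
        public void open(int taskNumber, int numTasks) throws IOException {
            try {
                connection = DriverManager.getConnection(url);
                statement = connection.prepareStatement(insertQuery);
            } catch (SQLException e) {
                throw new IOException(e);
            }
        }

        @Override
        public void writeRecord(Tuple2<Integer, String> record) throws IOException {
            try {
                statement.setInt(1, record.f0);
                statement.setString(2, record.f1);
                statement.executeUpdate();
            } catch (SQLException e) {
                throw new IOException(e);
            }
        }

        @Override
        public void close() throws IOException {
            try {
                if (statement != null) { statement.close(); }
                if (connection != null) { connection.close(); }
            } catch (SQLException e) {
                throw new IOException(e);
            }
        }
    }
}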
Really great to hear this!
Cheers,
Gordon
On October 4, 2016 at 8:13:27 PM, Till Rohrmann (trohrm...@apache.org) wrote:
It's always great to hear Flink success stories :-) Thanks for sharing it with
the community.
I hope Flink helps you to solve even more problems. And don't hesitate to reac
Hi Guys,
We are in the process of creating a POC.
I am looking for a sample project - Flink Scala or Java - that can load
data from database to database, or
from CSV to any relational database.
CSV --> SQLSERVER --> AWS Redshift
Could someone please help me with that?
Cheers
Ram
It's always great to hear Flink success stories :-) Thanks for sharing it
with the community.
I hope Flink helps you to solve even more problems. And don't hesitate to
reach out to the community whenever you stumble across some Flink problems.
Cheers,
Till
On Tue, Oct 4, 2016 at 2:04 PM, Hironor
Thanks Hironori for sharing this excellent news!
Do you think it would be possible to add your use case to Flink's
Powered-By wiki page [1] ?
Thanks, Fabian
[1] https://cwiki.apache.org/confluence/display/FLINK/Powered+by+Flink
2016-10-04 14:04 GMT+02:00 Hironori Ogibayashi :
> Hello,
>
> Just
Hello,
Just for your information:
Last week, I presented our Flink use case at my company's conference.
(http://developers.linecorp.com/blog/?p=3992)
Here is the slide.
http://www.slideshare.net/linecorp/b-6-new-stream-processing-platformwith-apache-flink
I think the video with English subtitle
Awesome, thanks Fabian !
I will give this a try.
Fabian Hueske-2 wrote
> Hi Philipp,
>
> If I got your requirements right you would like to:
> 1) load an initial hashmap via JDBC
> 2) update the hashmap from a stream
> 3) use the hashmap to enrich another stream.
>
> You can use a CoFlatMap to
Hi Tarandeep,
it would be great if you could compile a small example data set with which
you're able to reproduce your problem. We could then try to debug it. It
would also be interesting to know whether the fix for Flavio's bug solves your problem
or not.
Cheers,
Till
On Mon, Oct 3, 2016 at 10:26 PM, Flavi
Hi Sameer,
the semantics of side inputs are not fully fleshed out yet, as far as I
know.
The design doc says that one first waits for some data on the side input
before starting processing the main input. Therefore, I would assume that
you have some temporal ordering of the side input and main in
Hi Josh,
the internal state representation of Kafka sources has been changed
recently so that it is now possible to rescale the Kafka sources. That is
the reason why the old savepoint which contains the Kafka state in the old
representation cannot be read by the updated Kafka sources.
The
Hi Ken,
you can let a class implement both the SourceFunction and the SinkFunction.
However, when running a job, the source and the sink will be distinct
instances. Thus, there is no way for them to share instance variables.
What you could do is to write the updated and newly discovered URLs to a
me
Hi Philipp,
If I got your requirements right you would like to:
1) load an initial hashmap via JDBC
2) update the hashmap from a stream
3) use the hashmap to enrich another stream.
You can use a CoFlatMap to do this:
stream1.connect(stream2).flatMap(new YourCoFlatMapFunction).
YourCoFlatMapFunc
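A rough sketch of what such a CoFlatMapFunction could look like. The tuple
schema and the per-instance HashMap are assumptions; in a real job the map
would typically be loaded in open() via JDBC, is kept per parallel instance,
and is not checkpointed in this sketch.

import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction;
import org.apache.flink.util.Collector;

// input 1: updates as (key, value), input 2: events as (key, payload),
// output: (key, payload, enriched value)
public class EnrichmentCoFlatMap extends
        RichCoFlatMapFunction<Tuple2<String, String>, Tuple2<String, String>,
                Tuple3<String, String, String>> {

    private transient Map<String, String> lookup;

    @Override
    public void open(Configuration parameters) throws Exception {
        // 1) load the initial map here, e.g. via plain JDBC (omitted in this sketch)
        lookup = new HashMap<>();
    }

    @Override
    public void flatMap1(Tuple2<String, String> update, Collector<Tuple3<String, String, String>> out) {
        // 2) keep the map up to date from the update stream
        lookup.put(update.f0, update.f1);
    }

    @Override
    public void flatMap2(Tuple2<String, String> event, Collector<Tuple3<String, String, String>> out) {
        // 3) enrich the main stream with the current map contents
        out.collect(Tuple3.of(event.f0, event.f1, lookup.get(event.f0)));
    }
}

// wiring, as described above:
// updates.connect(events).flatMap(new EnrichmentCoFlatMap());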
Hello LF and Vinay,
With the introduction of “allowed lateness”, elements and windows are kept
around until the watermark
passes window.maxTimestamp + allowed_lateness, and then they are cleaned up
(garbage collected).
Every element that comes in and belongs to a window that is garbage collec
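For reference, this is roughly where allowed lateness is configured on a
windowed stream. This is a minimal, self-contained sketch; the schema, window
size, lateness value and timestamp assignment are arbitrary examples.

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
import org.apache.flink.streaming.api.watermark.Watermark;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class AllowedLatenessSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        DataStream<Tuple2<String, Long>> events = env
                .fromElements(Tuple2.of("a", 1000L), Tuple2.of("a", 2000L))
                .assignTimestampsAndWatermarks(new AssignerWithPeriodicWatermarks<Tuple2<String, Long>>() {
                    private long maxTs = Long.MIN_VALUE;

                    @Override
                    public long extractTimestamp(Tuple2<String, Long> element, long previousTimestamp) {
                        maxTs = Math.max(maxTs, element.f1);
                        return element.f1;
                    }

                    @Override
                    public Watermark getCurrentWatermark() {
                        return new Watermark(maxTs - 1);
                    }
                });

        events.keyBy(0)
              .window(TumblingEventTimeWindows.of(Time.minutes(10)))
              // window and trigger state is kept until the watermark passes
              // window.maxTimestamp() + 5 minutes, then it is garbage collected
              .allowedLateness(Time.minutes(5))
              .sum(1)
              .print();

        env.execute("allowed lateness sketch");
    }
}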
Hi,
Helping out here: this is the PR for async Kafka offset committing -
https://github.com/apache/flink/pull/2574.
It has already been merged into the master and release-1.1 branches, so you can
try out the changes now if you’d like.
The change should also be included in the 1.1.3 release, whic
Hi Govindarajan,
Regarding the stagnant Kafka offsets, it’ll be helpful if you can supply more
information for the following to help us identify the cause:
1. What is your checkpointing interval set to?
2. Did you happen to have set the “max.partition.fetch.bytes” property in the
properties give
Hi Govindarajan,
you can broadcast the stream with debug logger information by calling
`stream.broadcast`. Then every stream record should be sent to all
sub-tasks of the downstream operator.
Cheers,
Till
On Mon, Oct 3, 2016 at 5:13 PM, Govindarajan Srinivasaraghavan <
govindragh...@gmail.com> w
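A small sketch of the broadcast pattern Till describes; the streams, the
log-level semantics, and the CoMapFunction are made up for illustration.

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.CoMapFunction;

public class BroadcastDebugSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<String> mainStream = env.fromElements("a", "b", "c");
        DataStream<String> debugLevels = env.fromElements("DEBUG");

        mainStream
                // every record of debugLevels reaches all parallel subtasks of the
                // co-operator, while mainStream keeps its normal partitioning
                .connect(debugLevels.broadcast())
                .map(new CoMapFunction<String, String, String>() {
                    private String level = "INFO";

                    @Override
                    public String map1(String value) {
                        return level + ": " + value;
                    }

                    @Override
                    public String map2(String newLevel) {
                        level = newLevel;
                        return "log level set to " + newLevel;
                    }
                })
                .print();

        env.execute("broadcast sketch");
    }
}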
FYI: FLINK-4497 [1] requests Scala tuple and case class support for the
Cassandra sink and was opened about a month ago.
[1] https://issues.apache.org/jira/browse/FLINK-4497
2016-09-30 23:14 GMT+02:00 Stephan Ewen :
> How hard would it be to add case class support?
>
> Internally, tuples and cas
Hi
I am a Spark developer with good exposure to Spark Streaming/Core/SQL.
I have just started on Flink and am keen to take any part-time role as a
developer, in case anyone needs it.
Please do write back if anyone has such an opportunity.
--
Thanks
Deepak
On Fri, Sep 30, 2016 at 4:19 PM, Astrac wrote:
> What I mean by "manually reading the savepoint" is that rather than
> providing the savepoint path via the "--fromSavepoint
> hdfs://some-path/to/savepoint" option I'd like to provide it in the code
> that initialises the StreamExecutionEnvironment.