I have a case where my Flink job needs to consume multiple sources. I have a
topic in Kafka where the order of consuming is important. Because the cost of
S3 is much less than storage on Kafka, we have a job that sinks to S3. The
topic in Kafka can then retain just 3 days' worth of data. My
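A rough, untested sketch of one way such a job can read the S3 history first and
then switch over to the Kafka topic is the HybridSource available in newer Flink
releases (roughly 1.15+ here); the bucket, topic, bootstrap servers and the switch
timestamp below are made-up assumptions:

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.source.hybrid.HybridSource;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.connector.file.src.reader.TextLineInputFormat;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class S3ThenKafkaJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Bounded part: the history that the other job has already sunk to S3.
        FileSource<String> history = FileSource
                .forRecordStreamFormat(new TextLineInputFormat(), new Path("s3://my-bucket/archive/"))
                .build();

        // Point at which the S3 archive ends and Kafka should take over
        // (assumed to be known, e.g. tracked by the archiving job).
        long switchTimestampMillis = 1_700_000_000_000L;

        // Unbounded part: the Kafka topic that retains only the last 3 days.
        KafkaSource<String> live = KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")
                .setTopics("my-topic")
                .setStartingOffsets(OffsetsInitializer.timestamp(switchTimestampMillis))
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        // HybridSource reads the archive to its end first and only then starts
        // the Kafka part, which keeps the overall reading order.
        HybridSource<String> source = HybridSource.builder(history)
                .addSource(live)
                .build();

        DataStream<String> stream = env.fromSource(
                source, WatermarkStrategy.noWatermarks(), "s3-then-kafka");
        stream.print();

        env.execute("s3-then-kafka");
    }
}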
As identified with the community, it's a bug; more information is in issue
https://issues.apache.org/jira/browse/FLINK-22113
On Sat, Apr 3, 2021 at 8:43 PM Kai Fu wrote:
Hi team,
We have a use case where we join multiple data sources to generate a
continuously updated view. We defined primary key constraints on all the input
sources, and all the keys are subsets of the join condition. All joins
are left joins.
In our case, the first two inputs can produce *JoinKeyContai
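For reference, a rough sketch of the shape such a pipeline can take in the Java
Table API; the table names, columns and connector options are invented, only two
of the inputs are shown, and further sources would chain on with more LEFT JOINs:

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

public class JoinedViewJob {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Every input declares a primary key; here both are upsert-kafka tables.
        tEnv.executeSql(
                "CREATE TABLE orders ("
                        + "  order_id STRING,"
                        + "  customer_id STRING,"
                        + "  amount DECIMAL(10, 2),"
                        + "  PRIMARY KEY (order_id) NOT ENFORCED"
                        + ") WITH ("
                        + "  'connector' = 'upsert-kafka',"
                        + "  'topic' = 'orders',"
                        + "  'properties.bootstrap.servers' = 'localhost:9092',"
                        + "  'key.format' = 'json',"
                        + "  'value.format' = 'json')");

        tEnv.executeSql(
                "CREATE TABLE shipments ("
                        + "  order_id STRING,"
                        + "  status STRING,"
                        + "  PRIMARY KEY (order_id) NOT ENFORCED"
                        + ") WITH ("
                        + "  'connector' = 'upsert-kafka',"
                        + "  'topic' = 'shipments',"
                        + "  'properties.bootstrap.servers' = 'localhost:9092',"
                        + "  'key.format' = 'json',"
                        + "  'value.format' = 'json')");

        // The join keys are drawn from the declared primary keys; all joins are
        // left joins, so the result behaves as a continuously updated view.
        Table view = tEnv.sqlQuery(
                "SELECT o.order_id, o.amount, s.status "
                        + "FROM orders o "
                        + "LEFT JOIN shipments s ON o.order_id = s.order_id");

        view.execute().print();
    }
}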
Hi Aissa,
Flink supports reading from multiple sources in one job. You have to call
`StreamExecutionEnvironment.addSource()` multiple times with the respective
`SourceFunction`. Flink does not come with a ready-made MongoDB connector.
However, there is a project which tried to implement a MongoDB
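Whatever is used on the MongoDB side, the multi-source wiring itself could look
roughly like this sketch; MongoSourceFunction, EnrichmentFunction and
extractDeviceId are hypothetical placeholders, not existing Flink classes:

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class SensorEnrichmentJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties kafkaProps = new Properties();
        kafkaProps.setProperty("bootstrap.servers", "localhost:9092");
        kafkaProps.setProperty("group.id", "sensor-enrichment");

        // Source 1: the real-time sensor messages from the Kafka topic.
        DataStream<String> readings = env.addSource(
                new FlinkKafkaConsumer<>("sensor-topic", new SimpleStringSchema(), kafkaProps));

        // Source 2: the reference data. MongoSourceFunction stands in for a
        // custom SourceFunction (or a third-party connector) reading MongoDB.
        DataStream<String> referenceData = env.addSource(new MongoSourceFunction());

        // Key both streams by the same identifier and enrich the readings with
        // a co-process function that keeps the reference data in keyed state.
        readings.connect(referenceData)
                .keyBy(reading -> extractDeviceId(reading), doc -> extractDeviceId(doc))
                .process(new EnrichmentFunction())
                .print();

        env.execute("kafka-plus-mongodb-enrichment");
    }
}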
Hello everyone,
I hope you are all doing well. I am reading from a Kafka topic some real-time
messages produced by sensors, and in order to do some aggregations, I
need to enrich the stream with other data that is stored in MongoDB.
So, I want to know if it is possible to work with two sources
Hi,
If you are expressing a job that contains three source->sink pairs that are
isolated from each other, then Flink supports this form of job.
It is not much different from a single source->sink; it just changes from one
DataStream to three DataStreams.
For example,
DataStream ds1 = xxx
ds1.a
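A slightly fuller sketch of that pattern, assuming the FlinkKafkaConsumer and
the StreamingFileSink; the topic names, bucket, properties and parallelism are
made up for illustration:

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class TopicsToS3Job {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(10);
        env.enableCheckpointing(60_000); // StreamingFileSink finalizes files on checkpoints

        Properties kafkaProps = new Properties();
        kafkaProps.setProperty("bootstrap.servers", "localhost:9092");
        kafkaProps.setProperty("group.id", "topics-to-s3");

        // Three isolated pipelines in one job: each topic gets its own consumer
        // and its own S3 path, independent of the other two.
        for (String topic : new String[] {"topic-a", "topic-b", "topic-c"}) {
            DataStream<String> stream = env.addSource(
                    new FlinkKafkaConsumer<>(topic, new SimpleStringSchema(), kafkaProps));

            StreamingFileSink<String> sink = StreamingFileSink
                    .forRowFormat(new Path("s3://my-bucket/" + topic),
                            new SimpleStringEncoder<String>("UTF-8"))
                    .build();

            stream.addSink(sink);
        }

        env.execute("three-kafka-topics-to-s3");
    }
}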
How can I configure 1 Flink job (stream execution environment, parallelism set
to 10) to have multiple Kafka sources where each has its own sink to S3.
For example, let's say the sources are:
- Kafka Topic A - Consumer (10 partitions)
- Kafka Topic B - Consumer (10 partitions)
- Kafka Topic C -
> Please advise if I am missing something here.
Quoted from:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Multiple-Sources-and-Isolated-Analytics-tp11059.html
For example, one SSSP can be run per source:
for (Node node : nodes) {
    DataSet<Vertex<Long, Double>> singleSourceShortestPaths =
        graph.run(new SingleSourceShortestPaths(node, maxIterations)).getVertices();
}
However, if you have a large amount of source nodes, executing one SSSP for
each of them is probably not the most efficient way to go.
Instead, you could maybe write a custom multiple shortest paths program,
where each node calculates distances for multiple sources in each
iteration. In this case, the vertex value could be a vector of size
equal to the number of input sources.
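A rough, untested sketch of that vector-of-distances idea, written against the
scatter-gather iteration API of later Gelly versions (the iteration API has
changed across Flink releases); sourceIds, the Long vertex keys and Double edge
weights are assumptions for illustration:

import java.util.Arrays;
import java.util.List;

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.graph.Edge;
import org.apache.flink.graph.Graph;
import org.apache.flink.graph.Vertex;
import org.apache.flink.graph.spargel.GatherFunction;
import org.apache.flink.graph.spargel.MessageIterator;
import org.apache.flink.graph.spargel.ScatterFunction;

public class MultiSourceShortestPaths {

    // Scatter: offer each neighbour this vertex's current vector plus the edge weight.
    public static final class MinVectorMessenger
            extends ScatterFunction<Long, double[], double[], Double> {
        @Override
        public void sendMessages(Vertex<Long, double[]> vertex) {
            for (Edge<Long, Double> edge : getEdges()) {
                double[] candidate = new double[vertex.getValue().length];
                for (int i = 0; i < candidate.length; i++) {
                    candidate[i] = vertex.getValue()[i] + edge.getValue();
                }
                sendMessageTo(edge.getTarget(), candidate);
            }
        }
    }

    // Gather: keep the component-wise minimum; only update when something improved,
    // so vertices whose vector is already stable become inactive.
    public static final class MinVectorUpdater
            extends GatherFunction<Long, double[], double[]> {
        @Override
        public void updateVertex(Vertex<Long, double[]> vertex,
                                 MessageIterator<double[]> messages) {
            double[] min = vertex.getValue().clone();
            boolean improved = false;
            for (double[] candidate : messages) {
                for (int i = 0; i < min.length; i++) {
                    if (candidate[i] < min[i]) {
                        min[i] = candidate[i];
                        improved = true;
                    }
                }
            }
            if (improved) {
                setNewVertexValue(min);
            }
        }
    }

    public static DataSet<Vertex<Long, double[]>> run(
            Graph<Long, Double, Double> graph, List<Long> sourceIds, int maxIterations) {

        // Slot i of each vertex value holds the current distance from source i:
        // 0.0 for the source itself, +infinity for everyone else initially.
        // (sourceIds must be serializable, e.g. an ArrayList.)
        Graph<Long, double[], Double> initialized = graph.mapVertices(
                new MapFunction<Vertex<Long, Double>, double[]>() {
                    @Override
                    public double[] map(Vertex<Long, Double> vertex) {
                        double[] distances = new double[sourceIds.size()];
                        Arrays.fill(distances, Double.POSITIVE_INFINITY);
                        int slot = sourceIds.indexOf(vertex.getId());
                        if (slot >= 0) {
                            distances[slot] = 0.0;
                        }
                        return distances;
                    }
                });

        // One scatter-gather iteration now advances the distances from all
        // sources at the same time instead of running one SSSP per source.
        return initialized
                .runScatterGatherIteration(
                        new MinVectorMessenger(), new MinVectorUpdater(), maxIterations)
                .getVertices();
    }
}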