Hi, regarding your first question. I think it is in general not safe to call ctx.collect() from Threads other than the Thread that is invoking the run() method of your SourceFunction. What I would suggest is to have a queue that your reader threads put data into and then read from that queue in the main thread to call ctx.collect().
For your second question: it depends on the Source. Sources that implement the SourceFunction interface are not parallelized. One instance of the source will send data (round-robin) to downstream parallel mappers (or other operations). If you implement the ParallelSourceFunction or RichParallelSourceFunction then your source will be parallelized. Cheers, Aljoscha On Tue, 24 May 2016 at 22:15 omaralvarez <xxosurfer...@gmail.com> wrote: > I have been trying to send data from several processes to my Flink > application. I want to use a single port that will receive data from > multiple clients. I have implemented my own SourceFunction, but I have two > doubts. > > My TCP server has multiple threads receiving data from multiple clients, is > calling ctx.collect() without synchronization safe? > > Second, when I have a source function, how is the data processing of the > arrival messages handled? Is a source function parallelized? For instance > with the simple SocketTextStream, are messages received in parallel or > received sequentially and then processed in parallel by mappers and so on? > > Sorry if my questions seem too obvious, but I'm kind of new to the > Streaming > API, and thank you in advance! > > > > -- > View this message in context: > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Data-ingestion-using-a-Flink-TCP-Server-tp7134.html > Sent from the Apache Flink User Mailing List archive. mailing list archive > at Nabble.com. >