Hi, I am trying to integrate the Drools rules API with Spark so that the solution can handle a few CEP-centric use cases.
When I read the data from a local file (a simple FileReader -> readLine()), all my rules fire reliably and I get the expected results every time. I have tested with file sizes from 5K to 5M records; the results are correct every time.

However, when I send the same data over a stream (I created a simple ServerSocket, and I read the file and write it to the socket line by line) and consume it with a custom socket receiver, the data does arrive at the receiver, but my rules do not fire reliably and I get a different result on every run. Perhaps the store() API has an issue and the data is not reliably persisted (I am using StorageLevel.MEMORY_AND_DISK_SER_2() as recommended), or there is internal data loss within the Spark engine.

I am running on a single Windows-based server with the latest Spark 1.4.0. I have attached the code for my custom receiver and for the socket server that streams the file data as text.

Can someone please shed more light on this issue? I have read in the documentation that a reliable receiver needs to implement *store(multi-records)*, but I couldn't find any example.

Many thanks in advance for any inputs or suggestions.

Regards,
Ajit
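For context, the Spark Streaming programming guide describes the reliable pattern as: buffer incoming records, hand each buffer to a *blocking* multi-record store variant such as store(java.util.Iterator<T>), and acknowledge the source only after store() returns (the single-record store(T) is unreliable because it buffers internally). A minimal sketch of that receive loop follows; the names (ReliableReadLoop, pump, storeBlock) are my own, and the Consumer callback stands in for Receiver.store(buffer.iterator()) so the logic can be shown without a Spark dependency:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch of a reliable receiver's receive loop. In a real receiver you would
// extend org.apache.spark.streaming.receiver.Receiver<String> and replace
// storeBlock.accept(buffer) with store(buffer.iterator()), which blocks until
// the block is persisted/replicated; only then would you ACK the source.
public class ReliableReadLoop {

    // Reads lines from the socket's reader and stores them in fixed-size blocks.
    public static void pump(BufferedReader in, int blockSize,
                            Consumer<List<String>> storeBlock) throws IOException {
        List<String> buffer = new ArrayList<>(blockSize);
        String line;
        while ((line = in.readLine()) != null) {
            buffer.add(line);
            if (buffer.size() == blockSize) {
                storeBlock.accept(buffer);            // blocking multi-record store
                buffer = new ArrayList<>(blockSize);  // fresh buffer for the next block
            }
        }
        if (!buffer.isEmpty()) {
            storeBlock.accept(buffer);                // flush the final partial block
        }
    }
}
```

With store(multi-records), nothing is lost if the receiver dies mid-block, because the source is only acknowledged after the block is safely stored; the single-record path gives no such guarantee.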
StreamingDataGeneratorFromFile.java
Description: Binary data
SparkCustomSocketReceiver.java
Description: Binary data
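Since the attachments arrive as binary, here is a hedged sketch of what a line-oriented socket data generator like the one described above typically looks like (the class and method names are illustrative, not the author's actual code):

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Illustrative stand-in for a file-to-socket generator: accepts one client and
// streams the given lines to it, one record per line, flushing after each write.
public class LineStreamServer {

    public static void streamLines(ServerSocket server, List<String> lines) throws IOException {
        try (Socket client = server.accept();
             BufferedWriter out = new BufferedWriter(
                     new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8))) {
            for (String line : lines) {
                out.write(line);
                out.newLine();
                out.flush(); // flush per record so the receiver is not starved by buffering
            }
        } // closing the client socket marks end-of-stream for the receiver
    }
}
```

One thing worth checking in the real generator: each record must end with a newline and be flushed, otherwise the receiver's readLine() can block or see partial/merged records, which would also produce run-to-run differences.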
--------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
