Hi
I would like to know more about Spark CEP (Complex Event Processing). Do there
exist some simple (but also more complex) examples with input data (log files?)?
Is Spark CEP based on Siddhi? If so, would it be better to use Siddhi
directly?
I know CEP engines are intended for streaming data, b
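As far as I know, Apache Spark itself does not ship a dedicated CEP library (Siddhi is a separate engine from WSO2), so simple patterns are often expressed directly on DataFrames. Below is a minimal, hypothetical sketch of an "A followed by B within 60 seconds" pattern as a time-constrained self-join; the source/event/ts columns are assumptions, not from this thread.

# Hypothetical sketch: "A followed by B within 60 s" as a self-join.
# Column names (source, event, ts) are assumed for illustration.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("cep-sketch").getOrCreate()

events = spark.createDataFrame(
    [("s1", "A", "2018-05-18 10:00:00"),
     ("s1", "B", "2018-05-18 10:00:30")],
    ["source", "event", "ts"],
).withColumn("ts", F.to_timestamp("ts"))

a = events.filter(F.col("event") == "A").alias("a")
b = events.filter(F.col("event") == "B").alias("b")

# B must come from the same source, after A, and within 60 seconds of it
matches = a.join(
    b,
    (F.col("a.source") == F.col("b.source"))
    & (F.col("b.ts") > F.col("a.ts"))
    & (F.col("b.ts") <= F.col("a.ts") + F.expr("INTERVAL 60 SECONDS")))
matches.show()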
Hi
I don't know whether this question is suitable for this forum, but I will take
the risk and ask :)
In my understanding, the execution model in Spark is very data-flow (stream)
oriented and specific. Is it difficult to build control-flow logic (like a
state machine) outside of the stream-specific
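For what it is worth, one way to keep state-machine logic alive across Spark's micro-batches is the stateful updateStateByKey operation on DStreams. A minimal sketch, assuming a made-up transition table and "key,event" lines arriving on a socket:

# Hypothetical sketch: a per-key finite state machine carried across
# micro-batches with updateStateByKey. The transition table and the
# "key,event" input format are assumptions for illustration.
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext(appName="fsm-sketch")
ssc = StreamingContext(sc, batchDuration=5)
ssc.checkpoint("checkpoint")  # required for stateful operations

TRANSITIONS = {("idle", "start"): "running",
               ("running", "stop"): "idle"}

def advance(new_events, state):
    # new_events: this key's events in the current batch; state: last state
    state = state or "idle"
    for ev in new_events:
        state = TRANSITIONS.get((state, ev), state)
    return state

lines = ssc.socketTextStream("localhost", 9999)       # "key,event" lines
events = lines.map(lambda l: tuple(l.split(",", 1)))
states = events.updateStateByKey(advance)
states.pprint()

ssc.start()
ssc.awaitTermination()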
Hello
That is good to hear, but do there exist some good practical (Python or Scala)
examples? That would help a lot.
I tried to do this with Apache Flink (and its CEP), and it was not such a piece of cake.
Best, Esa
From: Matteo Cossu
Sent: Friday, May 18, 2018 10:51 AM
To: Esa Heikkinen
Cc: user
Hello
I am trying to use Spark's CEP for log files (as a batch job), not for
streams (in real time).
Is that possible? If so, do you know of example Scala code for that?
Or should I convert the log files (with timestamps) into streams?
And how should timestamps be handled in Spark?
If I can n
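A hedged note on the timestamp question: in a batch job no stream conversion is needed, because a parsed timestamp is just a TimestampType column. A minimal PySpark sketch (the Scala API is analogous); the log layout, file path, and format string are assumptions:

# Hypothetical sketch: parse timestamped log lines in a batch job.
# Assumes lines like "2018-05-18 10:51:00 LOGIN user=esa".
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("batch-logs").getOrCreate()

pattern = r"^(\S+ \S+) (\S+) (.*)$"
logs = (spark.read.text("logs/app.log")
        .select(F.regexp_extract("value", pattern, 1).alias("raw_ts"),
                F.regexp_extract("value", pattern, 2).alias("event"),
                F.regexp_extract("value", pattern, 3).alias("message"))
        .withColumn("ts", F.to_timestamp("raw_ts", "yyyy-MM-dd HH:mm:ss")))

# ordering, filtering and time arithmetic then work as on any other column
logs.orderBy("ts").filter(F.col("event") == "LOGIN").show()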
Hi
It is very often difficult to get answers to questions about Spark in many
forums. Maybe they are inactive, or my questions are too bad. I don't know, but
does anyone know of good, active groups, forums, or contacts other than this one?
Esa Heikkinen
). But could this
be impossible with Spark?
Esa Heikkinen
Hi
I would like to build a pyspark application that searches for sequential items
or events of a time series in CSV files.
What are the best data structures for this purpose? A pyspark or pandas
DataFrame, an RDD, SQL, or something else?
---
Esa
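On the data-structure question, a hedged suggestion: a pyspark DataFrame with window functions keeps the search distributed (pandas would pull everything onto one machine) and makes "event B directly after event A" a short query. A sketch assuming hypothetical source/event/ts columns in the CSV files:

# Hypothetical sketch: consecutive-event search with a window function.
# The CSV layout (source, event, ts columns) is an assumption.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("seq-search").getOrCreate()

df = (spark.read.option("header", True).csv("events/*.csv")
      .withColumn("ts", F.to_timestamp("ts")))

w = Window.partitionBy("source").orderBy("ts")
paired = df.withColumn("prev_event", F.lag("event").over(w))

# e.g. every "error" that directly follows a "login" from the same source
hits = paired.filter((F.col("prev_event") == "login") & (F.col("event") == "error"))
hits.show()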
Hi
What would be the best way to use Spark with neural networks (especially RNN
LSTM)?
I think it would be possible with the "tool" combination:
PySpark + Anaconda + pandas + NumPy + Keras + TensorFlow + scikit-learn
But what about scalability and usability with Spark (PySpark)?
How compatible are da
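A common, though not the only, division of labour is to let Spark do the distributed preprocessing and hand a collected (or sampled) result to Keras, since plain Keras/TensorFlow does not train on Spark executors out of the box. A sketch, with the ts/value columns and the window length of 10 as assumptions:

# Hypothetical sketch: Spark does the heavy ETL; Keras trains the LSTM on
# the collected result. Column names and window length are assumptions.
import numpy as np
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("lstm-prep").getOrCreate()
pdf = (spark.read.option("header", True).csv("series/*.csv")
       .withColumn("ts", F.to_timestamp("ts"))
       .withColumn("value", F.col("value").cast("double"))
       .orderBy("ts")
       .select("value")
       .toPandas())                      # data leaves the cluster here

window = 10
values = pdf["value"].to_numpy()
X = np.stack([values[i:i + window] for i in range(len(values) - window)])[..., None]
y = values[window:]

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential([LSTM(32, input_shape=(window, 1)), Dense(1)])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=32)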
Hi
Does anyone have any hints or example code for getting the combination
Windows 10 + PySpark + IPython notebook + CSV file loading with
timestamps (time-series data) into a DataFrame or RDD to work?
I have already installed Windows 10 + PySpark + IPython notebook and they
seem to work, but my pyth
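A minimal sketch of the loading step that usually causes trouble on this setup; the file path, column name, and timestamp format are assumptions, and explicit to_timestamp parsing tends to be more predictable than schema inference:

# Hypothetical sketch for the notebook: load a CSV and parse its timestamps.
# File path, column names and timestamp format are assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("csv-ts").getOrCreate()

df = (spark.read
      .option("header", True)
      .option("timestampFormat", "yyyy-MM-dd HH:mm:ss")
      .csv(r"C:\data\series.csv"))

df = df.withColumn("ts", F.to_timestamp("ts", "yyyy-MM-dd HH:mm:ss"))
df.printSchema()
df.orderBy("ts").show(5)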
To: Esa Heikkinen
Cc: Mahesh Sawaiker; user@spark.apache.org
Subject: Re: VS: Using Spark as a simulator
Spark dropped Akka some time ago...
I think the main issue he will face is a library for simulating the state
machines (randomly), and storing a huge number of files (HDFS is probably
Spark was originally built on it (Akka).
Esa
From: Mahesh Sawaiker
Sent: 21 June 2017 14:45
To: Esa Heikkinen; Jörn Franke
Cc: user@spark.apache.org
Subject: RE: Using Spark as a simulator
Spark can help you to create one large file, which can
also be split into smaller ones.
What transformation or action functions can I use in Spark for that purpose?
Or do there exist some code samples (Python or Scala) for that?
Regards
Esa Heikkinen
From: Jörn Franke
Sent: 20 June 2017 17:12
Hi
Spark is a data analyzer, but would it be possible to use Spark as a data
generator or simulator?
My simulation can be very large, and I think a parallelized simulation using
Spark (in the cloud) could work.
Is that a good or bad idea?
Regards
Esa Heikkinen
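A hedged sketch of the generator idea: spark.range creates one row per simulation run, and an ordinary Python function does the (here purely random) state stepping in parallel. The state set, step count, and output path are assumptions:

# Hypothetical sketch: spark.range as a trivially parallel simulation driver.
# The states, step count and output path are assumptions for illustration.
import random
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("simulator").getOrCreate()

def simulate_one(seed):
    rng = random.Random(seed)
    trace = []
    for _ in range(100):                 # 100 random steps per run
        trace.append(rng.choice(["idle", "running", "error"]))
    return (seed, ",".join(trace))

runs = (spark.range(1_000_000)           # one row per simulation run
        .rdd.map(lambda row: simulate_one(row.id))
        .toDF(["run_id", "trace"]))

runs.write.mode("overwrite").csv("hdfs:///tmp/simulation")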
Hi
Does anyone know of good books about event processing in distributed
event systems (like IoT systems)?
I have already read the book "The Power of Events" (Luckham, 2002), but do
there exist newer ones?
Best Regards
Esa
f log
and its data
3) And so on..
Regards
Esa Heikkinen
-Mike
On Apr 29, 2016, at 3:54 AM, Esa Heikkinen <esa.heikki...@student.tut.fi> wrote:
Hi
I will try to explain my case.
The situation in my logs and my solution is not so simple. There are also many
types of logs, and they come from many sources.
They are in CSV format, and the header line inc
g and very big data). This solution is suitable for very
complex (targeted) analysis. It can be too slow and memory-consuming,
but well-done pre-processing of the log data can help a lot.
---
Esa Heikkinen
On 28.4.2016 at 14:44, Michael Segel wrote:
I don’t.
I believe that there have been a coup
Esa Heikkinen
Do you know of any good examples of how to use Spark Streaming for tracking
public transportation systems?
Or an example with Storm or some other tool?
Regards
Esa Heikkinen
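A minimal sketch with the current Structured Streaming API (which postdates this thread), assuming vehicle positions arrive as JSON files with a made-up schema; it counts position reports per vehicle and minute:

# Hypothetical sketch: per-vehicle, per-minute counts over streaming
# position events. Schema and input directory are assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("vehicle-tracking").getOrCreate()

positions = (spark.readStream
             .schema("vehicle_id STRING, lat DOUBLE, lon DOUBLE, ts TIMESTAMP")
             .json("positions/"))

per_minute = (positions
              .withWatermark("ts", "5 minutes")
              .groupBy(F.window("ts", "1 minute"), "vehicle_id")
              .count())

query = (per_minute.writeStream
         .outputMode("append")
         .format("console")
         .start())
query.awaitTermination()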
On 28.4.2016 at 3:16, Michael Segel wrote:
Uhm…
I think you need to clarify a couple of things…
First there is this thing called
time-consuming
analyses? There is no need for real time.
And is it possible to do "CEP with delays" reasonably with some existing
analyzer (for example Spark)?
Regards
PhD student at Tampere University of Technology, Finland, www.tut.fi
Esa Heikkinen
Do you know any good (scientific) papers I should read about CEP?
Regards
PhD student at Tampere University of Technology, Finland, www.tut.fi
Esa Heikkinen
Hi
I am a newbie with Apache Spark, and I would like to know of or find good
example Python code showing how to implement (finite) state-machine
functionality in Spark. I am trying to read many different log files to find
certain events in a specific order. Is this possible, or even impossible?
Or is that only "p
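It is possible in batch mode. One hedged approach: collect each log source's events in time order and run an ordinary Python finite state machine over the sequence in a UDF. A sketch, with the column names and event pattern as assumptions:

# Hypothetical sketch: batch FSM over per-source event sequences.
# Column names (source, event, ts) and the pattern are assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("fsm-batch").getOrCreate()

df = (spark.read.option("header", True).csv("logs/*.csv")
      .withColumn("ts", F.to_timestamp("ts")))

# one time-ordered sequence of (ts, event) structs per source
seqs = df.groupBy("source").agg(
    F.sort_array(F.collect_list(F.struct("ts", "event"))).alias("seq"))

PATTERN = ("start", "error", "stop")     # events that must occur in this order

def accepts(seq):
    i = 0
    for row in seq:
        if row.event == PATTERN[i]:
            i += 1
            if i == len(PATTERN):
                return True
    return False

accepts_udf = F.udf(accepts, "boolean")
seqs.filter(accepts_udf("seq")).select("source").show()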