Hi Harut,
Jeff's right that Kibana + Elasticsearch can take you quite far out of the
box. Depending on your volume of data, you may only be able to keep recent
data around though.
Another option that is custom-built for handling many dimensions at query
time (not as separate metrics) is Druid (h
014 at 8:42 AM, Roger Hoover
wrote:
> Thanks, Andrew. I'll give it a try.
>
>
> On Mon, May 26, 2014 at 2:22 PM, Andrew Or wrote:
>
>> Hi Roger,
>>
>> This was due to a bug in the Spark shell code, and is fixed in the latest
>> master (and RC11). Her
Hi Aaron,
When you say that sorting is being worked on, can you elaborate a little
more please?
In particular, I want to sort the items within each partition (not
globally) without necessarily bringing them all into memory at once.
Thanks,
Roger
On Sat, May 31, 2014 at 11:10 PM, Aaron Davidson wrote:
I think it would be very handy to be able to specify that you want sorting
during a partitioning stage.
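For reference, a minimal sketch of what that could look like, assuming a Spark version where repartitionAndSortWithinPartitions is available on pair RDDs (it sorts each partition as part of the shuffle rather than collecting a partition into memory afterwards); the data set and key choice are hypothetical:

import org.apache.spark.{SparkConf, SparkContext, HashPartitioner}
import org.apache.spark.SparkContext._

val sc = new SparkContext(new SparkConf().setAppName("sort-within-partitions"))

// Hypothetical pair RDD keyed by the field we want sorted within each partition.
val events = sc.textFile("hdfs:///events.tsv").map { line =>
  val fields = line.split("\t")
  (fields(0), line)
}

// Repartition and sort by key inside each partition during the shuffle itself,
// so no partition has to be pulled entirely into memory just to sort it.
val sorted = events.repartitionAndSortWithinPartitions(new HashPartitioner(8))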
On Thu, Jun 5, 2014 at 4:42 PM, Roger Hoover wrote:
> Hi Aaron,
>
> When you say that sorting is being worked on, can you elaborate a little
> more please?
>
> In particular,
>
> As for the work Aaron mentioned, I think he might be
> referring to the discussion and code surrounding
> https://issues.apache.org/jira/browse/SPARK-983
>
> Cheers!
> Andrew
>
>
> On Thu, Jun 5, 2014 at 5:16 PM, Roger Hoover
> wrote:
>
I have this same question. Isn't there somewhere that the Kafka range
metadata can be saved? From my naive perspective, it seems like it should
be very similar to HDFS lineage. The original HDFS blocks are kept
somewhere (in the driver?) so that if an RDD partition is lost, it can be
recomputed.
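For context, here is a minimal sketch of the receiver-based Kafka input with checkpointing enabled (the ZooKeeper address, consumer group, topic, and checkpoint path are all hypothetical). The checkpoint directory is where Spark Streaming persists DStream metadata for recovery, which is the closest built-in analogue to the block lineage described above; with this receiver-based API the offsets themselves are tracked in ZooKeeper by the consumer group rather than carried in the RDD lineage:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val conf = new SparkConf().setAppName("kafka-ingest")
val ssc = new StreamingContext(conf, Seconds(2))

// DStream metadata is periodically saved here so a restarted driver can recover.
ssc.checkpoint("hdfs:///checkpoints/kafka-ingest")

// Receiver-based Kafka stream: (key, message) pairs from the "events" topic.
val messages = KafkaUtils.createStream(ssc, "zk1:2181", "ingest-group", Map("events" -> 1))

messages.map(_._2).count().print()

ssc.start()
ssc.awaitTermination()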
"I’ve also considered to use Kafka to message between Web UI and the pipes,
I think it will fit. Chaining the pipes together as a workflow and
implementing, managing and monitoring these long running user tasks with
locality as I need them is still causing me headache."
You can look at Apache Samza.
Can anyone comment on their experience running Spark Streaming in
production?
On Thu, Apr 10, 2014 at 10:33 AM, Dmitriy Lyubimov wrote:
>
>
>
> On Thu, Apr 10, 2014 at 9:24 AM, Andrew Ash wrote:
>
>> The biggest issue I've come across is that the cluster is somewhat
>> unstable when under memory pressure.
Hi,
I'm trying to figure out how to join two RDDs with different key types and
would appreciate any suggestions.
Say I have two RDDS:
ipToUrl of type (IP, String)
ipRangeToZip of type (IPRange, String)
How can I join/cogroup these two RDDs together to produce a new RDD of type
(IP, (String, String))?
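To make the shape of the problem concrete, here is a sketch with placeholder IP and IPRange types (hypothetical; not from the original post). A plain join is impossible because the key types differ, and the naive fallback is a cartesian product plus a containment filter, which is exactly the scaling problem discussed in the replies below:

import org.apache.spark.rdd.RDD

// Placeholder key types, for illustration only.
case class IP(address: Long)
case class IPRange(start: Long, end: Long) {
  def contains(ip: IP): Boolean = ip.address >= start && ip.address <= end
}

// The two inputs from the question.
val ipToUrl: RDD[(IP, String)] = ???
val ipRangeToZip: RDD[(IPRange, String)] = ???

// Naive approach: pair every IP with every range, then filter by containment.
val joined: RDD[(IP, (String, String))] =
  ipToUrl.cartesian(ipRangeToZip)
    .filter { case ((ip, _), (range, _)) => range.contains(ip) }
    .map { case ((ip, url), (_, zip)) => (ip, (url, zip)) }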
> of CIDR notations and do the join then, but you're starting to have the
> cartesian product work against you at scale at that point.
>
> Andrew
>
>
> On Tue, Apr 15, 2014 at 1:07 AM, Roger Hoover wrote:
>
>> Hi,
>>
>> I'm trying to figure out how to
I'm thinking of creating a union type for the key so that IPRange and IP
types can be joined.
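A minimal sketch of what such a union key could look like, reusing the hypothetical IP and IPRange placeholders from the earlier sketch; both RDDs can then be re-keyed to JoinKey and unioned or cogrouped:

// One key type that both IP-keyed and range-keyed records can share.
sealed trait JoinKey
case class IpKey(ip: IP) extends JoinKey
case class RangeKey(range: IPRange) extends JoinKey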
On Tue, Apr 15, 2014 at 10:44 AM, Roger Hoover wrote:
> Andrew,
>
> Thank you very much for your feedback. Unfortunately, the ranges are not
> of predictable size but you gave me
Ah, in case this helps others, looks like RDD.zipPartitions will accomplish
step 4.
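For anyone finding this thread later, a minimal sketch of RDD.zipPartitions; the two inputs are assumed to already have the same number of partitions, which the method requires:

import org.apache.spark.rdd.RDD

// Walk two co-partitioned RDDs in lockstep, combining the iterators of
// corresponding partitions without shuffling either side.
def zipSideBySide(left: RDD[String], right: RDD[Int]): RDD[(String, Int)] =
  left.zipPartitions(right) { (leftIter, rightIter) =>
    leftIter.zip(rightIter)
  }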
On Tue, Apr 15, 2014 at 10:44 AM, Roger Hoover wrote:
> Andrew,
>
> Thank you very much for your feedback. Unfortunately, the ranges are not
> of predictable size but you gave me an idea of how
d help with?
>
>
> On Wed, Apr 16, 2014 at 7:11 PM, Roger Hoover wrote:
>
>> Ah, in case this helps others, looks like RDD.zipPartitions will
>> accomplish step 4.
>>
>>
>> On Tue, Apr 15, 2014 at 10:44 AM, Roger Hoover wrote:
>>
>>> Andrew,
>&
Hi,
From the meetup talk about the 1.0 release, I saw that spark-submit will be
the preferred way to launch apps going forward.
How do you recommend launching such jobs in a development cycle? For
example, how can I load an app that's expecting to be given to spark-submit
into spark-shell?
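One pattern that keeps both paths open (a sketch only, with hypothetical names, not an official recommendation): keep the job logic in a method that takes a SparkContext, so the same class can be driven by spark-submit through main or called directly from spark-shell or the SBT console, as Matei suggests below.

import org.apache.spark.{SparkConf, SparkContext}

object MyApp {
  // Callable from an interactive shell that already has a SparkContext.
  def run(sc: SparkContext, inputPath: String): Long =
    sc.textFile(inputPath).count()

  // Entry point used by spark-submit.
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("MyApp"))
    try println(run(sc, args(0))) finally sc.stop()
  }
}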
Also
method and just
> call that method from the SBT shell, that should work.
>
> Matei
>
> On Apr 27, 2014, at 3:14 PM, Roger Hoover wrote:
>
> > Hi,
> >
> > From the meetup talk about the 1.0 release, I saw that spark-submit will
> be the preferred way
ll fails. When I do that in the scala repl, it works.
BTW, I'm using the latest code from the master branch
(8421034e793c0960373a0a1d694ce334ad36e747)
On Mon, Apr 28, 2014 at 3:40 PM, Roger Hoover wrote:
> Matei, thank you. That seemed to work but I'm not able to import a class
>
> I think either this or the --jars flag should work, but it's possible
> there is a bug with the --jars flag when calling the Repl.
>
>
> On Mon, Apr 28, 2014 at 4:30 PM, Roger Hoover wrote:
>
>> A couple of issues:
>> 1) the jar doesn't show up on the classpath
The return type should be RDD[(Int, Int, Int)] because sc.textFile()
returns an RDD. Try adding an import for the RDD type to get rid of the
compile error.
import org.apache.spark.rdd.RDD
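A minimal sketch of the fix (the method body is a hypothetical reconstruction from the description): with the import in place, the declared return type resolves and compiles.

import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Without the RDD import above, the return type annotation below fails to
// resolve; sc.textFile gives an RDD[String], and map keeps the result an RDD.
def loadTriples(sc: SparkContext, path: String): RDD[(Int, Int, Int)] =
  sc.textFile(path).map { line =>
    val Array(a, b, c) = line.split(",")
    (a.toInt, b.toInt, c.toInt)
  }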
On Mon, Apr 28, 2014 at 6:22 PM, SK wrote:
> Hi,
>
> I am a new user of Spark. I have a class that define
/apache/spark/commit/8edbee7d1b4afc192d97ba192a5526affc464205.
> Try it now and it should work. :)
>
> Andrew
>
>
> 2014-05-26 10:35 GMT+02:00 Perttu Ranta-aho :
>
> Hi Roger,
>>
>> Were you able to solve this?
>>
>> -Perttu
>>
>>
>> On Tue, Apr 29, 2014