yes, that's exactly what I was looking for, thanks for the pointer ;-)
On Thu, Jul 28, 2016 at 1:07 AM, Takeshi Yamamuro wrote:
> Hi,
>
> Have you seen this ticket?
> https://issues.apache.org/jira/browse/SPARK-12449
>
> // maropu
>
> On Thu, Jul 28, 2016 at 2:1
> cache it, if multiple queries on the
> same inner queries are requested.
>
>
> On Wednesday, July 27, 2016, Timothy Potter
> wrote:
>>
>> Take this simple join:
>>
>> SELECT m.title as title, solr.aggCount as aggCount FROM movies m INNER
>> JOI
Take this simple join:
SELECT m.title AS title, solr.aggCount AS aggCount
FROM movies m
INNER JOIN (
  SELECT movie_id, COUNT(*) AS aggCount
  FROM ratings
  WHERE rating >= 4
  GROUP BY movie_id
  ORDER BY aggCount DESC
  LIMIT 10
) AS solr ON solr.movie_id = m.movie_id
ORDER BY aggCount DESC
I would like the
I'm using the Spark Thrift server to execute SQL queries over JDBC.
I'm wondering if it's possible to plug in a class to do some
pre-processing on the SQL statement before it gets passed to the
SQLContext for actual execution? I scanned over the code and it
doesn't look like this is supported, but I
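For what it's worth, when the query is submitted through the SQLContext directly (rather than arriving over JDBC via the Thrift server), pre-processing can be approximated by rewriting the statement before execution. This is only a sketch; rewriteSql and runPreprocessed are made-up names, not a Spark API:

import org.apache.spark.sql.{DataFrame, SQLContext}

// Placeholder rewrite logic; a real pre-processor would parse/transform the SQL here.
def rewriteSql(sql: String): String =
  sql.trim.stripSuffix(";")

// Run the rewritten statement against the SQLContext.
def runPreprocessed(sqlContext: SQLContext, sql: String): DataFrame =
  sqlContext.sql(rewriteSql(sql))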
FWIW - I synchronized access to the transformer and the problem went
away, so this looks like some type of concurrent-access issue when
dealing with UDFs.
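As a rough illustration of that workaround (not the actual code from this thread), wrapping the shared transformer so only one thread calls transform() at a time might look like this; PipelineModel is just a stand-in for whatever transformer is being shared:

import org.apache.spark.ml.PipelineModel
import org.apache.spark.sql.DataFrame

class SynchronizedScorer(model: PipelineModel) {
  private val lock = new Object

  // Serialize access so concurrent callers never hit the transformer (and its UDFs) at once.
  def score(df: DataFrame): DataFrame = lock.synchronized {
    model.transform(df)
  }
}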
On Tue, Mar 29, 2016 at 9:19 AM, Timothy Potter wrote:
> It's a local Spark master, no cluster. I'm not sure what you mean
> a
>
>
> On Mon, Mar 28, 2016 at 7:11 PM, Timothy Potter wrote:
>> I'm seeing the following error when trying to generate a prediction
>> from a very simple ML pipeline based model. I've verified that the raw
>> data sent to the token
I'm seeing the following error when trying to generate a prediction
from a very simple ML pipeline-based model. I've verified that the raw
data sent to the tokenizer is valid (not null). It seems like this is
some sort of weird classpath or class-loading issue. Any help you
can provide in tryi
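For context, the kind of pipeline being described is roughly the following (a minimal sketch for a Spark 1.6-era spark-shell where sqlContext is predefined; the data and column names are made up):

import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.{HashingTF, Tokenizer}

val training = sqlContext.createDataFrame(Seq(
  (0L, "spark is great", 1.0),
  (1L, "hadoop map reduce", 0.0)
)).toDF("id", "text", "label")

val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
val hashingTF = new HashingTF().setInputCol("words").setOutputCol("features")
val lr = new LogisticRegression().setMaxIter(10)
val model = new Pipeline().setStages(Array(tokenizer, hashingTF, lr)).fit(training)

// The failure described above shows up at prediction time, i.e. on transform():
val test = sqlContext.createDataFrame(Seq((2L, "simple text"))).toDF("id", "text")
model.transform(test).show()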
I'm using Spark 1.4.1 and am doing the following with spark-shell:
val solr = sqlContext.read.format("solr")
  .option("zkhost", "localhost:2181")
  .option("collection", "spark")
  .load()

solr.select("id").count()
The Solr DataSource implements PrunedFilteredScan so I expected the
buildScan method to get ca
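For reference, the shape of a PrunedFilteredScan relation looks roughly like this (a schematic example, not the actual Solr DataSource code); solr.select("id").count() should end up calling buildScan with requiredColumns = Array("id"):

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.sources.{BaseRelation, Filter, PrunedFilteredScan}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

class ExampleRelation(override val sqlContext: SQLContext)
    extends BaseRelation with PrunedFilteredScan {

  override def schema: StructType = StructType(Seq(StructField("id", StringType)))

  // A real implementation would translate requiredColumns and filters into a Solr query.
  override def buildScan(requiredColumns: Array[String],
                         filters: Array[Filter]): RDD[Row] =
    sqlContext.sparkContext.emptyRDD[Row]
}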
Forgot to mention that I've tested that SerIntWritable and
PipelineDocumentWritable are serializable by serializing /
deserializing to/from a byte array in memory.
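A sketch of that kind of in-memory round-trip check (generic here, since SerIntWritable and PipelineDocumentWritable themselves aren't shown in this thread):

import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Write the object out with Java serialization and read it back from the same bytes.
def roundTrip[T <: Serializable](obj: T): T = {
  val bytes = new ByteArrayOutputStream()
  val out = new ObjectOutputStream(bytes)
  out.writeObject(obj)
  out.close()
  val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
  try in.readObject().asInstanceOf[T] finally in.close()
}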
On Wed, Oct 1, 2014 at 1:43 PM, Timothy Potter wrote:
> I'm running into the following deserialization issue when tryi
I'm running into the following deserialization issue when trying to
run a very simple Java-based application using a local Master (see
stack trace below).
My code basically queries Solr using a custom Hadoop InputFormat. I've
hacked my code to make sure the objects involved
(PipelineDocumentWritab
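For context, wiring a Hadoop InputFormat into Spark generally looks like the following (shown with the stock TextInputFormat since the custom Solr InputFormat and Writable classes from this thread aren't available; the input path is a placeholder, and sc is the usual SparkContext):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

val conf = new Configuration()
conf.set("mapreduce.input.fileinputformat.inputdir", "/tmp/input")  // placeholder path

// newAPIHadoopRDD returns an RDD of (key, value) Writable pairs produced by the InputFormat;
// a custom Solr version would emit (SerIntWritable, PipelineDocumentWritable) pairs instead.
val docs = sc.newAPIHadoopRDD(conf,
  classOf[TextInputFormat], classOf[LongWritable], classOf[Text])
docs.map(_._2.toString).take(5)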