its
> distributed execution environment. SQL-only analysts would struggle to be
> effective with SQL-only access to Spark.
>
> On Fri, Aug 31, 2018 at 5:05 AM Hemant Bhanawat
> wrote:
>
>> We allow our users to interact with the Spark cluster using SQL queries only.
>> Tha
BTW, I can contribute if there is already an effort going on somewhere.
On Fri, Aug 31, 2018 at 3:35 PM Hemant Bhanawat
wrote:
> We allow our users to interact with the Spark cluster using SQL queries only.
> That's easy for them. MLlib does not have SQL extensions and we cannot
> ex
ng, this is certainly the best place to start.
>
> See here: https://spark.apache.org/docs/latest/ml-guide.html
>
>
> best,
> wb
>
>
>
> On Thu, Aug 30, 2018 at 1:45 AM Hemant Bhanawat
> wrote:
>
>> Is there a plan to support SQL extensions for MLlib?
Is there a plan to support SQL extensions for MLlib? Or is there an effort
already underway?
Any information is appreciated.
Thanks in advance.
Hemant
!
>
> On Fri, Apr 27, 2018 at 3:59 AM Hemant Bhanawat
> wrote:
>
>> I see.
>>
>> monotonically_increasing_id on streaming DataFrames will be really
>> helpful to me and I believe to many more users. Adding this functionality
>> in Spark would b
>> spec. However, from my experience with Spark, there are many good reasons
>> why this requirement is not supported ;)
>>
>> Best,
>>
>> Chayapan (A)
>>
>>
>> On Apr 24, 2018, at 2:18 PM, Hemant Bhanawat
>> wrote:
>>
>> Thanks Chris. The
> all. For example, if you are using Kafka, a proper partitioning scheme and
> message offsets may be “good enough”.
> ------
> *From:* Hemant Bhanawat
> *Sent:* Thursday, April 12, 2018 11:42:59 PM
> *To:* Reynold Xin
> *Cc:* dev
> *Subject:* Re: Sorti
the
dataframe so that the records always get the same snapshot id.
On Fri, Apr 13, 2018 at 11:43 AM, Reynold Xin wrote:
> Can you describe your use case more?
>
> On Thu, Apr 12, 2018 at 11:12 PM Hemant Bhanawat
> wrote:
>
>> Hi Guys,
>>
>> Why is sorting on s
Hi Guys,
Why is sorting on streaming DataFrames not supported (unless it is in complete
mode)? My downstream needs me to sort the streaming DataFrame.
Hemant
BTW, aggregate push-down support is desirable and should be considered as
an enhancement going forward.
Hemant Bhanawat <https://www.linkedin.com/in/hemant-bhanawat-92a3811>
www.snappydata.io
On Sun, Sep 10, 2017 at 8:45 PM, vaquar khan wrote:
> +1
>
> Regards,
> Vaquar khan
>
API
documentation.
https://spark.apache.org/docs/1.6.1/api/scala/index.html#org.apache.spark.rdd.RDD
Hemant Bhanawat <https://www.linkedin.com/in/hemant-bhanawat-92a3811>
www.snappydata.io
On Thu, Sep 15, 2016 at 4:28 AM, Akshay Sachdeva
wrote:
> Environment:
> Apache Spark 1.6.2
>
processing a specific data size of, let's say, Parquet data? Also, has
someone investigated memory usage for the individual SQL operators like
Filter, group by, order by, Exchange etc.?
Hemant Bhanawat <https://www.linkedin.com/in/hemant-bhanawat-92a3811>
www.snappydata.io
exit thread will wait for a certain period of time
before the executor JVM exits to allow proper cleanup of the tasks.
Hemant Bhanawat <https://www.linkedin.com/in/hemant-bhanawat-92a3811>
www.snappydata.io
On Thu, Apr 7, 2016 at 6:08 AM, Reynold Xin wrote:
>
> On Wed, Apr 6, 201
Correcting email ID for Nezih.
Hemant Bhanawat <https://www.linkedin.com/in/hemant-bhanawat-92a3811>
www.snappydata.io
On Sun, Apr 3, 2016 at 11:09 AM, Hemant Bhanawat
wrote:
> Hi Nezih,
>
> Can you share JIRA and PR numbers?
>
> This partial de-coupling of data partitioni
Hi Nezih,
Can you share JIRA and PR numbers?
This partial decoupling of the data partitioning strategy from Spark
parallelism would be a useful feature for any data store.
Hemant
Hemant Bhanawat <https://www.linkedin.com/in/hemant-bhanawat-92a3811>
www.snappydata.io
On Fri, Apr 1, 2016 at
I think rdd.toLocalIterator is what you want. Note that it will keep one
partition's data in memory at a time.
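
To illustrate the partition-at-a-time behaviour, here is a minimal plain-Python sketch. It does not call Spark; `local_iterator` and `load_partition` are hypothetical names used only to model how an iterator of this kind might fetch one partition, hand out its records, and only then fetch the next.

```python
from itertools import islice
from typing import Callable, Iterator, List


def local_iterator(load_partition: Callable[[int], List[int]],
                   num_partitions: int) -> Iterator[int]:
    """Yield records one partition at a time (a model, not the Spark API)."""
    for index in range(num_partitions):
        rows = load_partition(index)  # only this partition is held in memory
        yield from rows


fetched: List[int] = []  # records which partitions were actually loaded


def load_partition(index: int) -> List[int]:
    # Hypothetical loader: each partition holds three consecutive integers.
    fetched.append(index)
    return list(range(index * 3, index * 3 + 3))


# Take the first 4 records out of 4 partitions (12 records in total).
first_four = list(islice(local_iterator(load_partition, 4), 4))
```

Because the generator is lazy, taking 4 records touches only the first two partitions in this model; the remaining ones are never loaded.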
On Wed, Sep 2, 2015 at 10:05 AM, Niranda Perera
wrote:
> Hi all,
>
> I have a large set of data which would not fit into the memory. So, I want
> to take n number of data from the RDD given a particular
o remember about Spark Streaming.
>
>
> On Wed, May 20, 2015 at 3:40 AM, Hemant Bhanawat
> wrote:
>
>> Hi,
>>
>> I have compiled a list (from online sources) of knobs/design
>> considerations that need to be taken care of by applications running on
>> Spark
Hi,
I have compiled a list (from online sources) of knobs/design considerations
that need to be taken care of by applications running on Spark Streaming.
Is my understanding correct? Any other important design consideration that
I should take care of?
- A DStream is associated with a single