You can also look at Informatica Data Quality, which runs on Spark. It isn't free, of course, but you can sign up for a 30-day free trial. They have both profiling and prebuilt data quality rules and accelerators.
What is the expected ballpark release date of Spark 3.2?
Thanks and Regards,
Ajay.
Great job, everyone!! Do we have any tentative GA dates yet?
Thanks and Regards,
Ajay.
On Tue, Dec 24, 2019 at 5:11 PM Star wrote:
> Awesome work. Thanks and happy holidays~!
>
>
> On 2019-12-25 04:52, Yuming Wang wrote:
> > Hi all,
> >
> > To enable wide-scale community testing of the upcoming
> data store without reading it in first.
>
> Jerry
>
> On Thu, Jul 11, 2019 at 1:27 PM infa elance wrote:
>
>> Sorry, I guess I hit the send button too soon.
>>
>> This question is regarding a Spark stand-alone cluster. My understanding
>> is Spark is an execution engine and not a storage layer.
How can I save a dataframe (df/rdd) as a Hive or Delta Lake table?
Spark version with Hadoop: spark-2.0.2-bin-hadoop2.7
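For concreteness, something like this is what I'm after (a minimal sketch; the table name and path are made up, and note Delta Lake needs Spark 2.4.2+ with the delta-core package, i.e. newer than spark-2.0.2-bin-hadoop2.7):

from pyspark.sql import SparkSession

# Hive support is needed so that saveAsTable registers the table in the
# Hive metastore and writes its files under the warehouse directory.
spark = (SparkSession.builder
         .appName("save-df-as-table")
         .enableHiveSupport()
         .getOrCreate())

df = spark.range(10).withColumnRenamed("id", "column1")

# Persist the dataframe as a Hive-managed table.
df.write.mode("overwrite").saveAsTable("my_table")

# Persist it as a Delta Lake table instead (requires the delta-core
# package on the classpath and a newer Spark than 2.0.2).
df.write.format("delta").mode("overwrite").save("/tmp/delta/my_table")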
Thanks and appreciate your help!!
Ajay.
On Thu, Jul 11, 2019 at 12:19 PM infa elance wrote:
This is a stand-alone Spark cluster. My understanding is Spark is an
execution engine and not a storage layer.
Spark processes data in memory, but when someone refers to a Spark table
created through Spark SQL (df/rdd), what exactly are they referring to?
Could it be a Hive table? If yes, is it the same as a table created directly in Hive?
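One way to see for yourself what such a table really is (a sketch; the table name is made up, DESCRIBE EXTENDED is standard Spark SQL):

from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

# Register a table from a dataframe, then ask Spark what it actually created.
spark.range(5).write.mode("overwrite").saveAsTable("demo_table")

# Shows the table's provider and location, and that its metadata lives in
# the Hive metastore, i.e. whether it is effectively a Hive table.
spark.sql("DESCRIBE EXTENDED demo_table").show(truncate=False)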
Hi All,
I'm trying to understand how row_number is applied. In the code below, does
Spark store the data in a dataframe and then apply the row_number function,
or does it apply it while reading from Hive?
from pyspark.sql import HiveContext
hiveContext = HiveContext(sc)
hiveContext.sql("
( SELECT column1 ,col
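Since the snippet above got cut off, here is a runnable version of the kind of query I mean (a sketch; the table and column names are made up). Calling explain() on the result shows the physical plan, where the Hive table scan and the Window operator appear as separate steps:

from pyspark.sql import HiveContext

hiveContext = HiveContext(sc)  # sc is the existing SparkContext

# Hypothetical table and columns; the original snippet was truncated.
df = hiveContext.sql("""
    SELECT column1, column2,
           row_number() OVER (PARTITION BY column1 ORDER BY column2) AS rn
    FROM my_table
""")

# The plan shows the scan of the Hive table feeding a separate Window step:
# rows are read first, then row_number is applied after a shuffle/sort on
# the PARTITION BY key.
df.explain(True)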
Hi all,
When using spark-shell, my understanding is Spark connects to Hive through
the metastore.
The question I have is: how does Spark connect to the metastore? Is it JDBC?
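To make the question concrete (a sketch; hive.metastore.uris is the standard Hive setting, the host name is made up):

from pyspark.sql import SparkSession

# Remote metastore: Spark speaks the Thrift protocol to the metastore
# service; it is that service, not Spark, that uses JDBC to reach the
# backing database (MySQL, Postgres, etc.).
spark = (SparkSession.builder
         .config("hive.metastore.uris", "thrift://metastore-host:9083")
         .enableHiveSupport()
         .getOrCreate())

# Embedded/local metastore: if hive.metastore.uris is not set, Spark itself
# opens a JDBC connection to the metastore database, configured via
# javax.jdo.option.ConnectionURL (an embedded Derby database by default).
spark.sql("SHOW DATABASES").show()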
Thanks and Regards,
Ajay.