Hi Joseph,
Thank you for the link! Two follow-up questions:
1) Suppose I have the original DataFrame in Tungsten, i.e. with Catalyst types and
cached in the off-heap store. It might be quite useful for iterative workloads due
to lower GC overhead. Then I convert it to an RDD and then back to a DF. Will the
resul
+1 to the general structure of Reynold's proposal. I've found what we do
currently a little confusing. In particular, it doesn't make much sense
that @DeveloperApi things are always labeled as possibly changing. For
example the Data Source API should arguably be one of the most stable
interfaces
Here's a JIRA for it: https://issues.apache.org/jira/browse/SPARK-13346
I don't have a great method currently, but hacks can get around it: convert
the DataFrame to an RDD and back to truncate the query plan lineage.
Joseph
On Wed, May 11, 2016 at 12:46 PM, Ulanov, Alexander <
alexander.ula...@h
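
The workaround Joseph describes above (DataFrame to RDD and back) can be sketched roughly as follows. This is only an illustration, not code from the thread or from SPARK-13346: the object and method names (`LineageTruncation`, `truncateLineage`) are hypothetical, and it assumes a Spark version in which `DataFrame.sqlContext`, `DataFrame.rdd`, and `SQLContext.createDataFrame(RDD[Row], StructType)` are available.

```scala
import org.apache.spark.sql.DataFrame

object LineageTruncation {
  // Hypothetical helper; the name is illustrative and not part of Spark.
  // Rebuilding the DataFrame from its RDD[Row] and schema gives the new
  // DataFrame a fresh logical plan that starts at the RDD, so the
  // accumulated query plan is no longer carried along by later operations.
  def truncateLineage(df: DataFrame): DataFrame =
    df.sqlContext.createDataFrame(df.rdd, df.schema)
}
```

In an iterative job one might call `LineageTruncation.truncateLineage(df)` every few iterations. Note that `df.rdd` goes through `Row` objects, so the round trip is not free.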
On Fri, May 13, 2016 at 10:18 AM, Sean Busbey wrote:
> I think LimitedPrivate gets a bad rap due to the way it is misused in
> Hadoop. The use case here -- "we offer this to developers of
> intermediate layers; those willing to update their software as we
> update ours"
I think "LimitedPrivate" i
On Fri, May 13, 2016 at 6:37 AM, Tom Graves
wrote:
> So we definitely need to be careful here. I know you didn't mention it, but
> it was mentioned by others, so I would not recommend using LimitedPrivate. I had
> started a discussion on Hadoop about some of this due to the way Spark
> needed to use s
So we definitely need to be careful here. I know you didn't mention it, but it
was mentioned by others, so I would not recommend using LimitedPrivate. I had
started a discussion on Hadoop about some of this due to the way Spark needed
to use some of the APIs. https://issues.apache.org/jira/browse/HA
> On 12 May 2016, at 22:29, Reynold Xin wrote:
>
> We currently have three levels of interface annotation:
>
> - unannotated: stable public API
> - DeveloperApi: A lower-level, unstable API intended for developers.
> - Experimental: An experimental user-facing API.
>
>
> After using this anno
Hi all,
I notice that HiveContext used to have a refreshTable() method, but it no longer
does in branch-2.0.
Did we drop that intentionally? If so, how do we achieve similar functionality?
Thanks.
Yang