Hi,

@Li, same as Jieun, I'd like to start with a single machine, but I can
imagine that there are use cases for a distributed approach.
@Wes, thanks, I'll look into it.
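
For the single-machine case, here is a rough sketch of what I plan to
try, assuming Spark 2.3+ where toPandas() can use Arrow for the
transfer; the spark.sql.execution.arrow.enabled key and exact behavior
may vary by Spark version, so treat this as untested:

import pyarrow as pa
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("arrow-collect").getOrCreate()

# Let toPandas() use Arrow for the JVM -> Python transfer (Spark 2.3+).
spark.conf.set("spark.sql.execution.arrow.enabled", "true")

df = spark.range(1000)

# Collect to the driver as pandas, then wrap it as an Arrow Table.
pdf = df.toPandas()
table = pa.Table.from_pandas(pdf)

# Individual record batches, if needed.
batches = table.to_batches()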

Richard

On Wed, 25 Jul 2018 at 03:59, Wes McKinney <wesmck...@gmail.com> wrote:

> hi Richard,
>
> I might start here in the Spark codebase to see how Spark SQL tables
> are converted to Arrow record batches:
>
>
> https://github.com/apache/spark/blob/d8aaa771e249b3f54b57ce24763e53fd65a0dbf7/sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala
>
> The code has been developed to send payloads over a socket to PySpark,
> but it could perhaps be adapted to your needs without too much
> effort. Li, Bryan, and others have worked on this, so they should be
> able to answer your questions about it.
>
> - Wes
>
> On Tue, Jul 24, 2018 at 8:21 AM, Li Jin <ice.xell...@gmail.com> wrote:
> > Hi,
> >
> > Do you want to collect a Spark DataFrame into Arrow format on a single
> > machine or do you still want to keep the data distributed?
>

Reply via email to