Yep, exactly what I suggested below.

In terms of format, Feather (suggested by Robin below) should be favoured
over .csv, since it persists the schema (column dtypes) as well as the data.


XD

On Tue, Dec 24, 2019 at 17:44 Tomasz Urbaszek <tomasz.urbas...@polidea.com>
wrote:

> Personally I would use a .csv format and store the file in an S3/GCS bucket.
> XCom is meant to store small amounts of data.
>
> T.
>
> On Tue, Dec 24, 2019 at 10:33 AM Robin Edwards <r...@bidnamic.com> wrote:
>
> > Feather is probably a good option for data frames:
> >
> > https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_feather.html
> >
> > R
> >
> > On Tue, 24 Dec 2019 at 07:52, Deng Xiaodong <xd.den...@gmail.com> wrote:
> > >
> > > Hi David.
> > >
> > > The only “out of the box” way to share data/information between tasks is
> > > XCom
> > > (https://airflow.apache.org/docs/stable/concepts.html?highlight=xcom#xcoms).
> > >
> > > For your case, the quick suggestion I can share is
> > >
> > > - either merging your tasks
> > > - or persisting your Pandas DataFrames somewhere and then loading them in
> > >   your 2nd task (e.g. using pickle)
> > >
> > >
> > > XD
> > >
> > > On Tue, Dec 24, 2019 at 15:00 David Muñoz <david.munoz4...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Excuse me, I am new to this and maybe this topic has already been
> > > > discussed.
> > > >
> > > > I would like to know if there is a way to "share/pass" pandas
> > > > dataframes between tasks in Airflow.
> > > >
> > > > Any help would be appreciated.
> > > >
> > > > Thank you!!!
> > > >
> > > > David.
> > > >
> >
>
>
