Hi Mike,
1. I think yes, though we'd need to turn off the automatic LRU eviction
that happens when the store fills up.
3. I think there are some edge cases and it depends what is in your
DataFrame, but at least if it consists of numerical data then the two
representations should use the same unde
I am interested in implementing an Arrow-based persistent cache store and I
have a few related questions:
1. Is it possible just to use Plasma for this goal?
(My understanding is that it is not persistable)
If not, what is the recommended way to do so?
2. Is Feather the better file f
Robert Nishihara created ARROW-2011:
---
Summary: Allow setting the pickler to use in pyarrow serialization.
Key: ARROW-2011
URL: https://issues.apache.org/jira/browse/ARROW-2011
Project: Apache Arrow
Wes McKinney created ARROW-2010:
---
Summary: [C++] Compiler warnings with CHECKIN warning level in ORC
adapter
Key: ARROW-2010
URL: https://issues.apache.org/jira/browse/ARROW-2010
Project: Apache Arrow
Great, thank you for the explanation - it makes so much sense. I have a use
case where once I've converted an Arrow table back to pandas I then convert
it into a dictionary (with to_dict()). This dictionary then gets JSON
serialised and sent over the wire for display on the client side. I
encounter
Upon converting to Arrow, the information about whether the original
input was a list or ndarray was lost. So any kind of sequence ends up
as an Arrow List type.
When converting back to pandas, we could return either a list or an
ndarray. Returning ndarray is faster and much more memory efficient;
Hi Wes,
Great! Thanks for the pointer. From what I gather this is a fundamental and
deliberate design decision. Would I be correct in saying that the memory
footprint and access speed of a NumPy array compared to those of a Python
list are the reason why the conversion is done?
Kind Regards
Simba
Hi Simba,
Yes -- Arrow list types are converted to NumPy arrays when converting
back to pandas with to_pandas(...). This conversion happens in C++ code in
https://github.com/apache/arrow/blob/master/cpp/src/arrow/python/arrow_to_pandas.cc#L541
- Wes
On Thu, Jan 18, 2018 at 1:26 PM, simba nyatsan
Good day everyone,
I noticed what looks like type inference happening after persisting a
pandas DataFrame where one of the column values is a list. When I load up
the DataFrame again and do df.to_dict(), the value is no longer a list but
a numpy array. I dug through functions in the pandas_compat.
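
For the to_dict() followed by JSON serialisation case, one possible
workaround is a fallback handler passed to json.dumps. This is only a
sketch: the helper name ndarray_to_list is hypothetical, and the record
below is a hand-built stand-in for what df.to_dict() might return after a
round trip through Arrow, not output from a real DataFrame.

```python
import json

import numpy as np


def ndarray_to_list(obj):
    """Fallback for json.dumps: convert NumPy arrays and scalars to plain Python values."""
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    if isinstance(obj, np.generic):
        return obj.item()
    # Anything else is left to raise, as json.dumps would by default.
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Stand-in for df.to_dict(): column name -> {row index -> cell value}.
record = {"tags": {0: np.array([1, 2]), 1: np.array([3, 4, 5])}}

# json.dumps calls the fallback only for objects it cannot encode itself.
payload = json.dumps(record, default=ndarray_to_list)
print(payload)
```

Note that json.dumps turns the integer row keys into strings, so the
client side sees "0" and "1" rather than 0 and 1.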