Joris Van den Bossche created ARROW-9017:
--------------------------------------------

             Summary: [Python] Refactor the Scalar classes
                 Key: ARROW-9017
                 URL: https://issues.apache.org/jira/browse/ARROW-9017
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Python
            Reporter: Joris Van den Bossche


The situation regarding scalars in Python is currently not optimal.

We have two different "types" of scalars:

- {{ArrayValue(Scalar)}} (and subclasses of that for all types): this is used 
when you access a single element of an array (e.g. {{arr[0]}})
- {{ScalarValue(Scalar)}} (and subclasses of that for _some_ types): this is 
used when wrapping a C++ scalar into a Python scalar, e.g. when you get back a 
scalar from a reduction like {{arr.sum()}}.

And while we have two versions of scalars, neither of them can easily be used 
as a scalar, because neither can be constructed from a Python scalar (there 
is no {{scalar(1)}} function to use when calling a kernel, for example).

I think we should try to unify those scalar classes (which probably means 
getting rid of the {{ArrayValue}} scalar).
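
To make the goal concrete, here is a minimal pure-Python sketch of what a unified design could look like. It does not use pyarrow and is not the actual or planned implementation: the {{Scalar}} class, the {{scalar()}} helper, the toy {{Array}} class and the type names are all hypothetical, chosen only to show one scalar class serving element access, reductions, and construction from a Python value.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass(frozen=True)
class Scalar:
    """Hypothetical unified scalar: a type name plus a value (None means null)."""
    type_name: str
    value: Optional[Any]

    @property
    def is_valid(self) -> bool:
        # A scalar is valid when it is not null.
        return self.value is not None

    def as_py(self) -> Any:
        # Convert back to a plain Python object.
        return self.value


def scalar(value: Any) -> Scalar:
    """Hypothetical constructor: build a Scalar from a Python value."""
    if value is None:
        return Scalar("null", None)
    # bool is checked before int because bool is a subclass of int.
    for py_type, name in ((bool, "bool"), (int, "int64"),
                          (float, "double"), (str, "string")):
        if isinstance(value, py_type):
            return Scalar(name, value)
    raise TypeError(f"cannot infer a scalar type for {type(value).__name__}")


class Array:
    """Toy int64 array: indexing and reductions return the same Scalar class."""

    def __init__(self, values: List[Optional[int]]):
        self._values = list(values)

    def __getitem__(self, i: int) -> Scalar:
        # Element access (the role ArrayValue plays today).
        return Scalar("int64", self._values[i])

    def sum(self) -> Scalar:
        # Reduction result (the role ScalarValue plays today); nulls are skipped.
        valid = [v for v in self._values if v is not None]
        return Scalar("int64", sum(valid) if valid else None)


arr = Array([1, 2, None])
# All three code paths produce the same class, so results are interchangeable.
assert type(arr[0]) is type(arr.sum()) is type(scalar(1))
assert arr[0] == scalar(1)
```

With a single class, a value obtained from {{arr[0]}} or {{arr.sum()}} could be passed to a kernel in the same way as one built with {{scalar(1)}}.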

In addition, there is the question of re-using the Python scalar <-> Arrow 
conversion code, as there is also logic for this in the {{python_to_arrow.cc}} 
code. But this is probably a bigger change. cc [~kszucs] 





--
This message was sent by Atlassian Jira
(v8.3.4#803005)
