> On Dec 14, 2018, at 3:22 AM, Suvayu Ali <fatkasuvayu+li...@gmail.com> wrote:
>
> Hi everyone,
>
> Maybe I'm missing something obvious, but for the life of me, I can't
> figure out how I can access the elements of an array after a Gandiva
> filter operation.
>
> I have linked a minimal example at the end which I compile like this:
>
>   $ /usr/lib64/ccache/g++ -g -Wall -m64 -std=c++17 -pthread -fPIC \
>       -I/opt/data-an/include mwe.cc -o mwe \
>       -L/opt/data-an/lib64 -lgandiva -larrow
>
> and I then run the binary like this:
>
>   $ LD_LIBRARY_PATH=/opt/data-an/lib64 ./mwe
>
> Broadly this is what I was attempting:
>
> 1. create a 5-element vector: 1, 3, 2, 4, 5
>
>   int num_records = 5;
>   arrow::Int64Builder i64builder;
>   ArrayPtr array0;
>
>   EXPECT_OK(i64builder.AppendValues({1, 3, 2, 4, 5}));
>   EXPECT_OK(i64builder.Finish(&array0));
>
> 2. use Gandiva to get even elements; here, indices: 2, 3
>
>   // schema for input fields
>   auto field0 = field("f0", arrow::int64());
>   auto schema = arrow::schema({field0});
>
>   // even: f0 % 2 == 0
>   auto field0_node = TreeExprBuilder::MakeField(field0);
>   auto lit_2 = TreeExprBuilder::MakeLiteral(int64_t(2));
>   auto remainder = TreeExprBuilder::MakeFunction("mod", {field0_node, lit_2},
>                                                  int64());
>   auto lit_0 = TreeExprBuilder::MakeLiteral(int64_t(0));
>   auto even = TreeExprBuilder::MakeFunction("equal", {remainder, lit_0},
>                                             boolean());
>   auto condition = TreeExprBuilder::MakeCondition(even);
>
>   // input record batch
>   auto in_batch = arrow::RecordBatch::Make(schema, num_records, {array0});
>
>   // filter
>   std::shared_ptr<Filter> filter;
>   EXPECT_OK(Filter::Make(schema, condition, &filter));
>
>   std::shared_ptr<SelectionVector> selected;
>   EXPECT_OK(SelectionVector::MakeInt16(num_records, pool_, &selected));
>   EXPECT_OK(filter->Evaluate(*in_batch, selected));
>
> 3. try accessing elements from the original array by index, which works
>    after downcasting.
>
>   // std::cout << "array0[0]: " << array0->Value(0); // doesn't compile
>   // error: ‘using element_type = class arrow::Array’ {aka ‘class
>   // arrow::Array’} has no member named ‘Value’
>
>   // downcast it to the correct derived class, this works
>   auto array0_cast =
>       std::dynamic_pointer_cast<NumericArray<Int64Type>>(array0);
>   std::cout << "array0[0]: " << array0_cast->Value(0) << std::endl;
>
> 4. Then try to access the "selected" elements (even elements) in the original
>    array by using the selection vector from the Gandiva filter as an index
>    array
>
>   auto idx_arr_cast =
>       std::dynamic_pointer_cast<NumericArray<Int16Type>>(idx_arr);
>   if (idx_arr_cast) {
>     std::cout << "idx_arr[0]: " << idx_arr_cast->Value(0) << std::endl;
>   } else {
>     std::cerr << "idx_arr_cast is a nullptr!" << std::endl;
>   }
>
> But I can't access the elements of the selection vector! Since it is declared
> as std::shared_ptr<arrow::Array>, the Value(..) method isn't found. I had
> filled it with SelectionVector::MakeInt16(..), so I tried downcasting to
> arrow::NumericArray<Int16Type>, but that fails!

The selection vector holds unsigned 16-bit indices, so the cast target needs
to be UInt16Type rather than Int16Type. This should work:

  auto array =
      std::dynamic_pointer_cast<arrow::NumericArray<arrow::UInt16Type>>(
          selected->ToArray());
  printf("%d %d\n", array->Value(0), array->Value(1));

>
> I'm not sure where I'm going wrong.
>
> I also have a related, but more general question. Given an array, I can't find
> a way to access the elements (or iterate over them) if I don't know the exact
> type. If I know the type, I can downcast, and use the likes of Value(..),
> GetValue(..), GetString(..), etc. Is that right? Or am I missing something?
>
> I looked at the pretty printer implementation, if I understood it correctly,
> it specializes the WriteDataValue(..) method for every kind of array. Do I need
> something similar for generalised index access?
>
> Thanks for any help.
>
> Cheers,
>
> PS: The complete MWE, along with a Makefile, can be cloned from this gist:
> https://gist.github.com/suvayu/aa2d38cee82b97be76186ec00073fe10
>
> --
> Suvayu
>
> Open source is the future. It sets us free.
>
> * Footnotes
>
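
On the downcast failing and on the more general question of index access
without knowing the exact type, here is a rough, self-contained sketch. It
deliberately does not use Arrow: `Array`, `NumericArray<T>` and `GetAsInt64`
are hypothetical stand-ins that only mimic the shape of the class hierarchy.
It shows that a dynamic_pointer_cast to the signed instantiation yields a null
pointer when the object is really the unsigned one, and that a helper which
tries each candidate type in turn is one possible way to get type-agnostic
access.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins, NOT the Arrow classes: a polymorphic base and a
// typed array template derived from it.
struct Array {
  virtual ~Array() = default;
};

template <typename T>
struct NumericArray : Array {
  explicit NumericArray(std::vector<T> v) : values(std::move(v)) {}

  // Typed element access, only reachable after a successful downcast.
  T Value(std::size_t i) const {
    assert(i < values.size());
    return values[i];
  }

  std::vector<T> values;
};

// Type-agnostic access: try each supported element type in turn and widen
// the result to int64_t. Returns false if none of the casts succeeds.
bool GetAsInt64(const std::shared_ptr<Array>& arr, std::size_t i,
                std::int64_t* out) {
  if (auto a = std::dynamic_pointer_cast<NumericArray<std::int16_t>>(arr)) {
    *out = a->Value(i);
    return true;
  }
  if (auto a = std::dynamic_pointer_cast<NumericArray<std::uint16_t>>(arr)) {
    *out = a->Value(i);
    return true;
  }
  if (auto a = std::dynamic_pointer_cast<NumericArray<std::int64_t>>(arr)) {
    *out = a->Value(i);
    return true;
  }
  return false;  // element type not handled by this sketch
}
```

With a `std::shared_ptr<Array>` that actually points at a
`NumericArray<std::uint16_t>`, the cast to `NumericArray<std::int16_t>` gives
a null pointer, while `GetAsInt64` still returns the element. In real Arrow
code the same idea would more likely be written as a switch on the array's
type id than as a chain of casts.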