Thanks a lot. I see that there's a PR that's been opened to resolve the encoding issue - https://github.com/apache/arrow/pull/1476
Do you think this PR (if merged) will also roll out as part of version 0.9.0, or will I be able to pip install from the merge commit as soon as it's merged?

Kind Regards

On Sun, 14 Jan 2018 at 15:50 Uwe L. Korn <uw...@xhochy.com> wrote:

> Nice to hear that it worked.
>
> Updating the docs should not be necessary, we should rather see that we
> soon get a 0.9.0 release out (but that will also take some more weeks)
>
> Uwe
>
> On Sun, Jan 14, 2018, at 2:42 PM, simba nyatsanga wrote:
> > Amazing, thanks Uwe!
> >
> > I was able to build pyarrow successfully for python 2.7 using your
> > workaround. I appreciate that you've got a possible solution for the
> > issue too.
> >
> > Besides the PR getting reviewed by more experienced maintainers, I'm
> > thinking to pull your branch and try the building process from scratch.
> > Otherwise I was wondering if it's valuable, in the meantime, to update
> > the docs with your workaround?
> >
> > Kind Regards
> > Simba
> >
> > On Sun, 14 Jan 2018 at 15:17 Uwe L. Korn <uw...@xhochy.com> wrote:
> >
> > > Hello Simba,
> > >
> > > it looks like you are running into
> > > https://issues.apache.org/jira/browse/ARROW-1856.
> > >
> > > To work around this issue, please "unset PARQUET_HOME" before you call
> > > the setup.py. Also set PKG_CONFIG_PATH, in your case this should be
> > > "export PKG_CONFIG_PATH=/Users/simba/anaconda/envs/pyarrow-dev/lib/pkgconfig".
> > > By doing this, you do the package discovery using pkg-config instead of
> > > the *_HOME variables. Currently this is the only path on which we can
> > > auto-detect the extension of the parquet shared library.
> > >
> > > Nevertheless, I will take a shot at fixing the issues as it seems that
> > > multiple users run into it.
> > >
> > > Uwe
> > >
> > > On Thu, Jan 11, 2018, at 11:42 PM, simba nyatsanga wrote:
> > > > Hi Wes,
> > > >
> > > > Apologies for the ambiguity there. To clarify, I used the conda
> > > > instructions only to create a conda environment.
> > > > So I did this
> > > >
> > > > conda create -y -q -n pyarrow-dev \
> > > >       python=2.7 numpy six setuptools cython pandas pytest \
> > > >       cmake flatbuffers rapidjson boost-cpp thrift-cpp snappy zlib \
> > > >       gflags brotli jemalloc lz4-c zstd -c conda-forge
> > > >
> > > > I followed the instructions closely and I've stumbled upon a different
> > > > error from the one I initially had encountered. Now the issue seems to
> > > > arise when I'm building Parquet C++, i.e. running the following steps:
> > > >
> > > > mkdir parquet-cpp/build
> > > > pushd parquet-cpp/build
> > > >
> > > > cmake -DCMAKE_BUILD_TYPE=$ARROW_BUILD_TYPE \
> > > >       -DCMAKE_INSTALL_PREFIX=$PARQUET_HOME \
> > > >       -DPARQUET_BUILD_BENCHMARKS=off \
> > > >       -DPARQUET_BUILD_EXECUTABLES=off \
> > > >       -DPARQUET_BUILD_TESTS=off \
> > > >       ..
> > > >
> > > > make -j4
> > > > make install
> > > > popd
> > > >
> > > > The make install step generates *libparquet.1.3.2.dylib* as one of the
> > > > artefacts, as illustrated below:
> > > >
> > > > -- Install configuration: "RELEASE"
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/share/parquet-cpp/cmake/parquet-cppConfig.cmake
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/share/parquet-cpp/cmake/parquet-cppConfigVersion.cmake
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/lib/libparquet.1.3.2.dylib
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/lib/libparquet.1.dylib
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/lib/libparquet.dylib
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/lib/libparquet.a
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/column_reader.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/column_page.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/column_scanner.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/column_writer.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/encoding.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/exception.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/file_reader.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/file_writer.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/metadata.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/printer.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/properties.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/schema.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/statistics.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/types.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/parquet_version.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/lib/pkgconfig/parquet.pc
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/api/io.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/api/reader.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/api/writer.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/api/schema.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/arrow/reader.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/arrow/schema.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/arrow/writer.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/util/buffer-builder.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/util/comparison.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/util/logging.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/util/macros.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/util/memory.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/util/stopwatch.h
> > > > -- Installing: /Users/simba/anaconda/envs/pyarrow-dev/include/parquet/util/visibility.h
> > > >
> > > > Subsequently when I want to build a standalone pyarrow wheel by running
> > > > this step:
> > > >
> > > > python setup.py build_ext --build-type=$ARROW_BUILD_TYPE \
> > > >        --with-parquet --with-plasma --bundle-arrow-cpp bdist_wheel
> > > >
> > > > Then I get an error where one of the build steps in the *CMakeLists.txt*
> > > > expects to find *libparquet.1.0.0.dylib*.
> > > > The error is illustrated below:
> > > >
> > > > running build_ext
> > > > -- Runnning cmake for pyarrow
> > > > cmake -DPYTHON_EXECUTABLE=/Users/simba/anaconda/envs/pyarrow-dev/bin/python
> > > >  -DPYARROW_BUILD_PARQUET=on -DPYARROW_BUILD_PLASMA=on
> > > >  -DPYARROW_BUNDLE_ARROW_CPP=ON -DCMAKE_BUILD_TYPE=release
> > > >  /Users/simbarashenyatsanga/Projects/personal/oss/arrow/python
> > > > INFOCompiler command: /Library/Developer/CommandLineTools/usr/bin/c++
> > > > INFOCompiler version: Apple LLVM version 8.0.0 (clang-800.0.42.1)
> > > > Target: x86_64-apple-darwin15.6.0
> > > > Thread model: posix
> > > > InstalledDir: /Library/Developer/CommandLineTools/usr/bin
> > > >
> > > > INFOCompiler id: Clang
> > > > Selected compiler clang 3.8.0svn
> > > > Configured for RELEASE build (set with cmake -DCMAKE_BUILD_TYPE={release,debug,...})
> > > > -- Build Type: RELEASE
> > > > -- Build output directory: /Users/simba/Projects/personal/oss/arrow/python/build/temp.macosx-10.9-x86_64-2.7/release/
> > > > -- Checking for module 'arrow'
> > > > -- Found arrow, version 0.9.0-SNAPSHOT
> > > > -- Arrow ABI version: 0.0.0
> > > > -- Arrow SO version: 0
> > > > -- Found the Arrow core library: /Users/simba/anaconda/envs/pyarrow-dev/lib/libarrow.dylib
> > > > -- Found the Arrow Python library: /Users/simba/anaconda/envs/pyarrow-dev/lib/libarrow_python.dylib
> > > > Added shared library dependency arrow: /Users/simba/anaconda/envs/pyarrow-dev/lib/libarrow.dylib
> > > > Added shared library dependency arrow_python: /Users/simba/anaconda/envs/pyarrow-dev/lib/libarrow_python.dylib
> > > > -- Found the Parquet library: /Users/simba/anaconda/envs/pyarrow-dev/lib/libparquet.dylib
> > > > CMake Error: File /Users/simba/anaconda/envs/pyarrow-dev/lib/libparquet.1.0.0.dylib does not exist.
> > > > CMake Error at CMakeLists.txt:213 (configure_file):
> > > >   configure_file Problem configuring file
> > > > Call Stack (most recent call first):
> > > >   CMakeLists.txt:296 (bundle_arrow_lib)
> > > >
> > > > Added shared library dependency parquet: /Users/simba/anaconda/envs/pyarrow-dev/lib/libparquet.dylib
> > > > -- Checking for module 'plasma'
> > > > -- Found plasma, version
> > > > -- Plasma ABI version: 0.0.0
> > > > -- Plasma SO version: 0
> > > > -- Found the Plasma core library: /Users/simba/anaconda/envs/pyarrow-dev/lib/libplasma.dylib
> > > > -- Found Plasma executable: /Users/simba/anaconda/envs/pyarrow-dev/bin/plasma_store
> > > > Added shared library dependency libplasma: /Users/simba/anaconda/envs/pyarrow-dev/lib/libplasma.dylib
> > > > -- Configuring incomplete, errors occurred!
> > > > See also "/Users/simba/Projects/personal/oss/arrow/python/build/temp.macosx-10.9-x86_64-2.7/CMakeFiles/CMakeOutput.log".
> > > > See also "/Users/simba/Projects/personal/oss/arrow/python/build/temp.macosx-10.9-x86_64-2.7/CMakeFiles/CMakeError.log".
> > > > error: command 'cmake' failed with exit status 1
> > > >
> > > > Also (might be) worth noting from above is that I'm picking up *arrow
> > > > 0.9.0-SNAPSHOT.*
> > > >
> > > > From what I can see in the */Users/simba/anaconda/envs/pyarrow-dev/lib*
> > > > folder, the symlink is in fact pointing to *libparquet.1.3.2.dylib*
> > > > instead of the expected *libparquet.1.0.0.dylib*:
> > > >
> > > > $ pwd
> > > > /Users/simba/anaconda/envs/pyarrow-dev/lib
> > > > $ ll | grep "libparquet"
> > > > -rwxr-xr-x  1 simba  staff  1.6M Jan 11 18:45 libparquet.1.3.2.dylib
> > > > lrwxr-xr-x  1 simba  staff   22B Jan 11 18:45 libparquet.1.dylib -> libparquet.1.3.2.dylib
> > > > -rw-r--r--  1 simba  staff  3.0M Jan 11 18:45 libparquet.a
> > > > lrwxr-xr-x  1 simba  staff   18B Jan 11 18:45 libparquet.dylib -> libparquet.1.dylib
> > > >
> > > > Just to clarify also, I'm attempting to build the wheel from within the
> > > > *arrow/python* folder where the *setup.py* file is.
> > > >
> > > > Thanks again for the help.
> > > >
> > > > Simba
> > > >
> > > > On Thu, 11 Jan 2018 at 09:09 simba nyatsanga <simnyatsa...@gmail.com> wrote:
> > > >
> > > > > Hi Wes,
> > > > >
> > > > > Thanks for the response. I was following the development instructions
> > > > > on GitHub here:
> > > > > https://github.com/apache/arrow/blob/master/python/doc/source/development.rst
> > > > >
> > > > > I took the macOS option and installed my virtual env via conda. I
> > > > > must've missed an instruction when trying the 2.7 install, because I
> > > > > was able to successfully install for 3.6.
> > > > >
> > > > > Although it looks like the instructions on GitHub are similar to the
> > > > > ones you linked, I will give it another go with the latter.
> > > > >
> > > > > Kind Regards
> > > > > Simba
> > > > >
> > > > > On Thu, 11 Jan 2018 at 00:51 Wes McKinney <wesmck...@gmail.com> wrote:
> > > > >
> > > > >> hi Simba,
> > > > >>
> > > > >> Are you following development instructions in
> > > > >> http://arrow.apache.org/docs/python/development.html#developing-on-linux-and-macos
> > > > >> or something else?
> > > > >>
> > > > >> - Wes
> > > > >>
> > > > >> On Wed, Jan 10, 2018 at 11:20 AM, simba nyatsanga
> > > > >> <simnyatsa...@gmail.com> wrote:
> > > > >> > Hi,
> > > > >> >
> > > > >> > I've created a python 2.7 virtualenv in my attempt to build the
> > > > >> > pyarrow project.
> > > > >> > But I'm having trouble running one of the commands as specified in
> > > > >> > the development docs on GitHub, specifically this command:
> > > > >> >
> > > > >> > cd arrow/python
> > > > >> > python setup.py build_ext --build-type=$ARROW_BUILD_TYPE \
> > > > >> >        --with-parquet --with-plasma --inplace
> > > > >> >
> > > > >> > The error output looks like this:
> > > > >> >
> > > > >> > running build_ext
> > > > >> > -- Runnning cmake for pyarrow
> > > > >> > cmake -DPYTHON_EXECUTABLE=/Users/simba/anaconda/envs/pyarrow-dev-py2.7/bin/python
> > > > >> >  -DPYARROW_BUILD_PARQUET=on -DPYARROW_BUILD_PLASMA=on
> > > > >> >  -DCMAKE_BUILD_TYPE= /Users/simba/Projects/personal/oss/arrow/python
> > > > >> > INFOCompiler command: /Library/Developer/CommandLineTools/usr/bin/c++
> > > > >> > INFOCompiler version: Apple LLVM version 8.0.0 (clang-800.0.42.1)
> > > > >> > Target: x86_64-apple-darwin15.6.0
> > > > >> > Thread model: posix
> > > > >> > InstalledDir: /Library/Developer/CommandLineTools/usr/bin
> > > > >> >
> > > > >> > INFOCompiler id: Clang
> > > > >> > Selected compiler clang 3.8.0svn
> > > > >> > Configured for DEBUG build (set with cmake -DCMAKE_BUILD_TYPE={release,debug,...})
> > > > >> > -- Build Type: DEBUG
> > > > >> > -- Build output directory: /Users/simba/Projects/personal/oss/arrow/python/build/debug/
> > > > >> > -- Checking for module 'arrow'
> > > > >> > -- No package 'arrow' found
> > > > >> > -- Found the Arrow core library: /Users/simba/anaconda/envs/pyarrow-dev-py2.7/lib/libarrow.dylib
> > > > >> > -- Found the Arrow Python library: /Users/simba/anaconda/envs/pyarrow-dev-py2.7/lib/libarrow_python.dylib
> > > > >> > Added shared library dependency arrow: /Users/simba/anaconda/envs/pyarrow-dev-py2.7/lib/libarrow.dylib
> > > > >> > Added shared library dependency arrow_python: /Users/simba/anaconda/envs/pyarrow-dev-py2.7/lib/libarrow_python.dylib
> > > > >> > -- Checking for module 'parquet'
> > > > >> > -- No package 'parquet' found
> > > > >> > -- Found the Parquet library: /Users/simba/anaconda/envs/pyarrow-dev-py2.7/lib/libparquet.dylib
> > > > >> > Added shared library dependency parquet: /Users/simba/anaconda/envs/pyarrow-dev-py2.7/lib/libparquet.dylib
> > > > >> > -- Checking for module 'plasma'
> > > > >> > -- No package 'plasma' found
> > > > >> > -- Found the Plasma core library: /Users/simba/anaconda/envs/pyarrow-dev-py2.7/lib/libplasma.dylib
> > > > >> > -- Found Plasma executable:
> > > > >> > Added shared library dependency libplasma: /Users/simba/anaconda/envs/pyarrow-dev-py2.7/lib/libplasma.dylib
> > > > >> > -- Configuring done
> > > > >> > -- Generating done
> > > > >> > -- Build files have been written to: /Users/simba/Projects/personal/oss/arrow/python
> > > > >> > -- Finished cmake for pyarrow
> > > > >> > -- Running cmake --build for pyarrow
> > > > >> > make
> > > > >> > make: *** No targets specified and no makefile found.  Stop.
> > > > >> > error: command 'make' failed with exit status 2
> > > > >> >
> > > > >> > It looks like there's a change dir happening at this line in the
> > > > >> > setup.py:
> > > > >> > https://github.com/apache/arrow/blob/master/python/setup.py#L136
> > > > >> > Which, in my case, is switching to the temp build directory, which
> > > > >> > doesn't have the required Makefile to run the make command.
> > > > >> >
> > > > >> > I could be missing something because I was able to build the project
> > > > >> > successfully for python3. But I'd like to build it in python2.7 to
> > > > >> > attempt a bug fix for this issue:
> > > > >> > https://issues.apache.org/jira/browse/ARROW-1976
> > > > >> >
> > > > >> > Thanks for the help.
> > > > >> >
> > > > >> > Kind Regards
> > > > >> > Simba
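
For reference, the workaround Uwe describes earlier in the thread (unset PARQUET_HOME, point PKG_CONFIG_PATH at the conda environment's pkgconfig directory, then call setup.py) could be scripted roughly as follows. This is only a sketch: the helper names `pkg_config_env` and `wheel_build_command` are made up for illustration and are not part of pyarrow, the PARQUET_HOME value in the example is an assumed placeholder, and the other paths and flags are the ones quoted in this thread.

```python
def pkg_config_env(base_env, pkgconfig_dir):
    """Return a copy of base_env set up for pkg-config based discovery.

    Hypothetical helper, not part of pyarrow.
    """
    env = dict(base_env)
    # Equivalent of `unset PARQUET_HOME`: drop the variable if it is set.
    env.pop("PARQUET_HOME", None)
    # Equivalent of `export PKG_CONFIG_PATH=<dir>`.
    env["PKG_CONFIG_PATH"] = pkgconfig_dir
    return env


def wheel_build_command(build_type):
    """Return the setup.py invocation from the thread as an argument list.

    Hypothetical helper; the flags are copied from the quoted command.
    """
    return [
        "python", "setup.py", "build_ext",
        "--build-type=" + build_type,
        "--with-parquet", "--with-plasma", "--bundle-arrow-cpp",
        "bdist_wheel",
    ]


# Example with a small stand-in environment (the PARQUET_HOME value is an
# assumption; the pkgconfig path is the one Uwe suggested).
example_env = pkg_config_env(
    {"PARQUET_HOME": "/path/to/parquet/home", "PATH": "/usr/bin"},
    "/Users/simba/anaconda/envs/pyarrow-dev/lib/pkgconfig",
)
print(sorted(example_env))
print(" ".join(wheel_build_command("release")))
```

To actually run the build, passing `pkg_config_env(os.environ, ...)` as the `env` argument of `subprocess.check_call`, with `cwd` set to the `arrow/python` folder, should be enough, assuming the conda environment is already active.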