[ 
https://issues.apache.org/jira/browse/ARROW-4413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17661435#comment-17661435
 ] 

Rok Mihevc commented on ARROW-4413:
-----------------------------------

This issue has been migrated to [issue 
#20975|https://github.com/apache/arrow/issues/20975] on GitHub. Please see the 
[migration documentation|https://github.com/apache/arrow/issues/14542] for 
further details.

> [Python] pyarrow.hdfs.connect() failing
> ---------------------------------------
>
>                 Key: ARROW-4413
>                 URL: https://issues.apache.org/jira/browse/ARROW-4413
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.12.0
>         Environment: Python 2.7
> Hadoop distribution: Amazon 2.7.3
> Hive 2.1.1 
> Spark 2.1.1
> Tez 0.8.4
> Linux 4.4.35-33.55.amzn1.x86_64
>            Reporter: Bradley Grantham
>            Assignee: Antoine Pitrou
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.13.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> I am trying to connect to HDFS with the snippet below, using {{hadoop-libhdfs}}.
> The error appears in {{v0.12.0}} but not in {{v0.11.1}} (both versions were
> tested in the same environment).
>  
> {code:python}
> In [1]: import pyarrow as pa
> In [2]: fs = pa.hdfs.connect()
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
> <ipython-input-2-e0007ad7fa95> in <module>()
> ----> 1 fs = pa.hdfs.connect()
> /usr/local/lib64/python2.7/site-packages/pyarrow/hdfs.pyc in connect(host, port, user, kerb_ticket, driver, extra_conf)
>     205     fs = HadoopFileSystem(host=host, port=port, user=user,
>     206                           kerb_ticket=kerb_ticket, driver=driver,
> --> 207                           extra_conf=extra_conf)
>     208     return fs
> /usr/local/lib64/python2.7/site-packages/pyarrow/hdfs.pyc in __init__(self, host, port, user, kerb_ticket, driver, extra_conf)
>      36             _maybe_set_hadoop_classpath()
>      37 
> ---> 38         self._connect(host, port, user, kerb_ticket, driver, extra_conf)
>      39 
>      40     def __reduce__(self):
> /usr/local/lib64/python2.7/site-packages/pyarrow/io-hdfs.pxi in pyarrow.lib.HadoopFileSystem._connect()
>      72         if host is not None:
>      73             conf.host = tobytes(host)
> ---> 74         self.host = host
>      75 
>      76         conf.port = port
> TypeError: Expected unicode, got str
> {code}
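
The traceback above ends with a text-only attribute rejecting a byte string. The following is a minimal sketch of that kind of mismatch, assuming the failure comes from a Python 2 {{str}} being assigned where {{unicode}} is required. {{StrictHost}} and {{to_text}} are hypothetical names used for illustration only; they are not part of pyarrow, and this is not the fix that was applied in 0.13.0.

```python
# Text type: `unicode` on Python 2, `str` on Python 3.
text_type = type(u"")


def to_text(value, encoding="utf-8"):
    """Return `value` as text, decoding byte strings; pass None through."""
    if value is None or isinstance(value, text_type):
        return value
    if isinstance(value, bytes):
        return value.decode(encoding)
    raise TypeError("expected text or bytes, got %s" % type(value).__name__)


class StrictHost(object):
    """Hypothetical stand-in for an attribute that only accepts text."""

    def __init__(self):
        self._host = None

    @property
    def host(self):
        return self._host

    @host.setter
    def host(self, value):
        # Reject anything that is neither None nor text, similar in spirit
        # to a statically typed attribute.
        if value is not None and not isinstance(value, text_type):
            raise TypeError("Expected %s, got %s"
                            % (text_type.__name__, type(value).__name__))
        self._host = value


conn = StrictHost()
try:
    conn.host = b"default"          # byte string: rejected
except TypeError as exc:
    error_message = str(exc)

conn.host = to_text(b"default")     # decoded first: accepted
```

Normalizing the value to text before the assignment is one way such a mismatch could be avoided; whether this matches the actual change in pyarrow is not stated in this message.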



