Maria Rebelka created AIRFLOW-3419:
--------------------------------------

             Summary: S3_hook.select_key is broken
                 Key: AIRFLOW-3419
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3419
             Project: Apache Airflow
          Issue Type: Bug
          Components: boto3, hooks
    Affects Versions: 1.10.1
            Reporter: Maria Rebelka


Hello,

Using select_key throws an error:

{quote}text = S3Hook('aws_conn').select_key(key='my_key',
                                     bucket_name='my_bucket',
                                     expression='SELECT * FROM S3Object s',
                                     expression_type='SQL',
                                     input_serialization={'JSON': {'Type': 'DOCUMENT'}},
                                     output_serialization={'JSON': {}}){quote}

Traceback (most recent call last):

{quote}   File "db.py", line 31, in <module>
    output_serialization={'JSON': {}})
  File "/usr/local/lib/python3.4/site-packages/airflow/hooks/S3_hook.py", line 262, in select_key
    for event in response['Payload']
TypeError: sequence item 0: expected str instance, bytes found{quote}


It seems the problem is in this line, which passes the bytes payload of each 
event straight to str.join:

S3_hook.py, line 262:  return ''.join(event['Records']['Payload']

It should probably decode the payload first:

return ''.join(event['Records']['Payload'].decode('utf-8')

This follows the example in the Amazon blog:
https://aws.amazon.com/blogs/aws/s3-glacier-select/
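
For illustration, here is a minimal sketch that reproduces the error without 
touching S3. The fake_response dict is a hypothetical, simplified stand-in 
for the event stream returned under response['Payload'], and the two helper 
functions are made-up names, not Airflow code. The trailing "if 'Records' in 
event" filter is an assumption about how the statement continues, added so 
the non-record event is skipped.

```python
# Hypothetical, simplified stand-in for the S3 Select event stream.
# Record payloads arrive as bytes, which is what triggers the TypeError.
fake_response = {
    'Payload': [
        {'Records': {'Payload': b'{"id": 1}\n'}},
        {'Stats': {}},  # non-record event, skipped by the filter below
        {'Records': {'Payload': b'{"id": 2}\n'}},
    ]
}


def join_records_broken(response):
    # Mirrors the reported line: str.join over bytes items.
    return ''.join(event['Records']['Payload']
                   for event in response['Payload']
                   if 'Records' in event)


def join_records_fixed(response):
    # Proposed fix: decode each payload to str before joining.
    return ''.join(event['Records']['Payload'].decode('utf-8')
                   for event in response['Payload']
                   if 'Records' in event)


try:
    join_records_broken(fake_response)
    broken_error = None
except TypeError as exc:
    # On Python 3 this is the "expected str instance, bytes found" error.
    broken_error = exc

text = join_records_fixed(fake_response)
```

One possible caveat: if a multi-byte UTF-8 character were ever split across 
two events, decoding each payload separately could fail. Joining the bytes 
first (b''.join(...)) and decoding once at the end would avoid that.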



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
