Hi, please try querying the table through the Hive metastore (that is quite easy to set up in AWS EMR, as most things are in AWS) rather than querying the S3 location directly.
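For example, something roughly along these lines (just a sketch, assuming the metastore is already configured on the cluster; mydb.mytable is a placeholder for your table):

import org.apache.spark.sql.SparkSession

// Build a session that talks to the Hive metastore configured on the cluster (e.g. EMR).
val spark = SparkSession.builder()
  .appName("query-via-metastore")
  .enableHiveSupport()
  .getOrCreate()

// Query by table name so Spark resolves the schema through the metastore,
// instead of spark.read.parquet("s3://bucket/path/...") on the raw files.
val df = spark.sql("SELECT * FROM mydb.mytable LIMIT 10")
df.show()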
Regards,
Gourav

On Wed, Jul 20, 2022 at 9:51 PM Joris Billen <joris.bil...@bigindustries.be> wrote:
> Hi,
> The below sounds like something that someone will have experienced...
> I have external tables of parquet files with a Hive table defined on top of the data. I don't manage/know the details of how the data lands.
> For some tables there are no issues when querying through Spark.
> But for others there is an issue: it looks like the datatype in Hive is timestamp, but the parquet file contains an integer number (microseconds). If I access the table in Spark SQL I get:
>
> *Unable to create Parquet converter for data type "timestamp" whose Parquet type is optional int64 airac_wef (TIMESTAMP(MICROS,false))*
>
> OR
>
> *Parquet column cannot be converted in file abfs://somelocation/table/partition=484/000000_0. Column: [atimefield], Expected: timestamp, Found: INT64*
>
> Has anyone encountered this? I tried several sorts of CAST but no success yet.
> I see similar problems on forums (like this one: https://stackoverflow.com/questions/59096125/spark-2-4-parquet-column-cannot-be-converted-in-file-column-impressions-exp) but no solution.
>
> Thanks for input!
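Side note on the int64/timestamp mismatch itself: one workaround that is sometimes used is to read the files with an explicit schema that matches the physical Parquet type and do the conversion in Spark. The sketch below assumes the files really store epoch microseconds as int64; the path and the column name atimefield are just placeholders taken from the error message above.

// `spark` is the active SparkSession (as in spark-shell).
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{LongType, StructField, StructType}

// Declare the column with its physical Parquet type (int64) instead of timestamp.
val schema = StructType(Seq(
  StructField("atimefield", LongType, nullable = true)
  // ... add the other columns as they are physically stored
))

val raw = spark.read.schema(schema)
  .parquet("abfs://somelocation/table/partition=484/")

// Epoch microseconds -> seconds -> timestamp.
val fixed = raw.withColumn("atimefield",
  (col("atimefield") / 1000000L).cast("timestamp"))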