Hi Lei, I think the reason is that our `HiveMapredSplitReader` does not support name-mapped reads for the parquet format. Can you create a JIRA to track this?
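As a possible workaround until that is fixed, you could try declaring the Hive columns in the same order as the fields are actually stored in the parquet files, since without name mapping the columns should be matched by position. Below is a rough sketch (not tested against your setup) that prints the field order StreamingFileSink wrote; it assumes the parquet-avro and hadoop-client dependencies are on the classpath, and the part file name is only a placeholder:

import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetReader;
import org.apache.parquet.hadoop.ParquetReader;

public class PrintParquetFieldOrder {
    public static void main(String[] args) throws Exception {
        // Placeholder part file name: point this at any file StreamingFileSink produced.
        Path file = new Path(
            "hdfs://172.19.78.38:8020/user/root/wanglei/robotdata/parquet/<some-part-file>");

        try (ParquetReader<GenericRecord> reader =
                 AvroParquetReader.<GenericRecord>builder(file).build()) {
            GenericRecord first = reader.read();
            // Prints the fields in the order they are stored in the file; until name
            // mapping is supported, the Hive DDL should declare columns in this order.
            System.out.println(first.getSchema().getFields());
        }
    }
}

If the printed order differs from your DDL order, that would explain why the Hive client (which maps by name) returns the right values while the Flink SQL client returns 0.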
Best,
Jingsong Lee

On Fri, Apr 10, 2020 at 9:42 AM wangl...@geekplus.com.cn <wangl...@geekplus.com.cn> wrote:

> I am using Hive 3.1.1.
> The table has many fields, each field corresponding to a field in the
> RobotUploadData0101 class.
>
> CREATE TABLE `robotparquet`(`robotid` int, `framecount` int,
> `robottime` bigint, `robotpathmode` int, `movingmode` int,
> `submovingmode` int, `xlocation` int, `ylocation` int,
> `robotradangle` int, `velocity` int, `acceleration` int,
> `angularvelocity` int, `angularacceleration` int, `literangle` int,
> `shelfangle` int, `onloadshelfid` int, `rcvdinstr` int, `sensordist` int,
> `pathstate` int, `powerpresent` int, `neednewpath` int,
> `pathelenum` int, `taskstate` int, `receivedtaskid` int,
> `receivedcommcount` int, `receiveddispatchinstr` int,
> `receiveddispatchcount` int, `subtaskmode` int, `versiontype` int,
> `version` int, `liftheight` int, `codecheckstatus` int,
> `cameraworkmode` int, `backrimstate` int, `frontrimstate` int,
> `pathselectstate` int, `codemisscount` int, `groundcameraresult` int,
> `shelfcameraresult` int, `softwarerespondframe` int, `paramstate` int,
> `pilotlamp` int, `codecount` int, `dist2waitpoint` int,
> `targetdistance` int, `obstaclecount` int, `obstacleframe` int,
> `cellcodex` int, `cellcodey` int, `cellangle` int, `shelfqrcode` int,
> `shelfqrangle` int, `shelfqrx` int, `shelfqry` int,
> `trackthetaerror` int, `tracksideerror` int, `trackfuseerror` int,
> `lifterangleerror` int, `lifterheighterror` int, `linearcmdspeed` int,
> `angluarcmdspeed` int, `liftercmdspeed` int, `rotatorcmdspeed` int)
> PARTITIONED BY (`hour` string) STORED AS parquet;
>
> Thanks,
> Lei
> ------------------------------
> wangl...@geekplus.com.cn
>
> *From:* Jingsong Li <jingsongl...@gmail.com>
> *Date:* 2020-04-09 21:45
> *To:* wangl...@geekplus.com.cn
> *CC:* Jark Wu <imj...@gmail.com>; lirui <li...@apache.org>; user <user@flink.apache.org>
> *Subject:* Re: Re: flink sql client not able to read parquet format table
> Hi Lei,
>
> Which Hive version did you use?
> Can you share the complete Hive DDL?
>
> Best,
> Jingsong Lee
>
> On Thu, Apr 9, 2020 at 7:15 PM wangl...@geekplus.com.cn <wangl...@geekplus.com.cn> wrote:
>
>> I am using the newest 1.10 blink planner.
>>
>> Perhaps it is because of the method I used to write the parquet file:
>> I receive Kafka messages, transform each message into a Java class
>> object, write the objects to HDFS using StreamingFileSink, and add the
>> HDFS path as a partition of the Hive table.
>>
>> No matter what order the fields are declared in the Hive DDL statement,
>> the Hive client works, as long as the field names are the same as the
>> Java object field names.
>> But the Flink SQL client does not work.
>>
>> DataStream<RobotUploadData0101> sourceRobot = source.map(x -> transform(x));
>> final StreamingFileSink<RobotUploadData0101> sink;
>> sink = StreamingFileSink
>>     .forBulkFormat(new Path("hdfs://172.19.78.38:8020/user/root/wanglei/robotdata/parquet"),
>>         ParquetAvroWriters.forReflectRecord(RobotUploadData0101.class))
>>     .build();
>>
>> For example, RobotUploadData0101 has two fields: robotId int, robotTime long.
>>
>> CREATE TABLE `robotparquet`(`robotid` int, `robottime` bigint) and
>> CREATE TABLE `robotparquet`(`robottime` bigint, `robotid` int)
>> are the same for the Hive client, but different for the Flink SQL client.
>>
>> Is this expected behavior?
>>
>> Thanks,
>> Lei
>> ------------------------------
>> wangl...@geekplus.com.cn
>>
>> *From:* Jark Wu <imj...@gmail.com>
>> *Date:* 2020-04-09 14:48
>> *To:* wangl...@geekplus.com.cn; Jingsong Li <jingsongl...@gmail.com>; lirui <li...@apache.org>
>> *CC:* user <user@flink.apache.org>
>> *Subject:* Re: flink sql client not able to read parquet format table
>> Hi Lei,
>>
>> Are you using the newest 1.10 blink planner?
>>
>> I'm not familiar with Hive and parquet, but I know @Jingsong Li
>> <jingsongl...@gmail.com> and @li...@apache.org <li...@apache.org> are
>> experts on this. Maybe they can help with this question.
>>
>> Best,
>> Jark
>>
>> On Tue, 7 Apr 2020 at 16:17, wangl...@geekplus.com.cn <wangl...@geekplus.com.cn> wrote:
>>
>>> The Hive table is stored as parquet.
>>>
>>> Under the hive client:
>>> hive> select robotid from robotparquet limit 2;
>>> OK
>>> 1291097
>>> 1291044
>>>
>>> But under the flink sql-client the result is 0:
>>> Flink SQL> select robotid from robotparquet limit 2;
>>> robotid
>>> 0
>>> 0
>>>
>>> Any insight on this?
>>>
>>> Thanks,
>>> Lei
>>>
>>> ------------------------------
>>> wangl...@geekplus.com.cn
>>>
>
> --
> Best, Jingsong Lee

--
Best, Jingsong Lee