Yes, I will check. I will update you by EOD today. Thanks for support. Thanks & Regards Joice
On Sun, Oct 8, 2023, 7:48 AM kaka chen <kaka11.c...@gmail.com> wrote: > Hi Joice, > > BTW,could you pls try to query it by Doris after putting the > "_orc_acid_version" > file to the hdfs directory of hive manually? Thanks. > > Best regards, > Kaka > > kaka chen <kaka11.c...@gmail.com> 于2023年10月8日周日 10:10写道: > >> Hi Joice, >> >> Got it, thanks very much. We will setup a cdp env and do some >> compatibility work. >> >> Best regards, >> Kaka >> >> Joice Jacob <joicejacob1...@gmail.com> 于2023年10月7日周六 14:58写道: >> >>> Hi,Dear Doris Community, >>> I wanted to provide an update on my recent experience with integrating >>> Doris with both Apache Hive and CDP Hive, as well as seek insights into a >>> particular observation. >>> >>> In my integration with Apache Hive, I noticed that transaction tables >>> retain the "_orc_acid_version" file, which aligns with the expected >>> behavior. However, when I integrated with CDP Hive, I observed that this >>> "_orc_acid_version" file was conspicuously absent. >>> >>> This discrepancy in behavior between the two environments has left me >>> somewhat perplexed, and I'm keen to understand the underlying reasons >>> behind it. It appears that in CDP Hive, *managed tables are set to be >>> transactional by default*. Interestingly, when we create managed tables >>> with the TBLPROPERTIES("transactional"="false") option, they are seemingly >>> *treated >>> as external tables*. >>> >>> I've attached screenshots to illustrate this scenario for further >>> clarity. >>> CREATE TABLE BABY2( >>> id int, >>> FNAME VARCHAR(50), >>> GENDER VARCHAR(2), >>> TOTCOUNT INT >>> ) >>> PARTITIONED BY(DATA_ID INT) >>> TBLPROPERTIES ( >>> 'transactional'='false', >>> 'orc.compress'='snappy'); >>> >>> if we execute above statement table create like this. >>> [image: image.png] >>> >>> I would greatly appreciate any insights, suggestions, or explanations >>> that the Doris community may have regarding this behavior. Your expertise >>> will be invaluable in helping me navigate this integration and address any >>> associated challenges. >>> >>> Thank you in advance for your time and support. I look forward to >>> hearing from you and to collaborating with the community to better >>> understand and resolve this matter. >>> >>> Best regards, >>> joice >>> >>> On Sat, Oct 7, 2023 at 11:46 AM kaka chen <kaka11.c...@gmail.com> wrote: >>> >>>> Hi Joice: >>>> Thanks your reporting. >>>> >>>> It seems the root cause of this issue is missing "_orc_acid_version" >>>> file. >>>> From Hive version >= 3.0, delta/base files will always have file >>>> '_orc_acid_version' with value >= '2'. >>>> Maybe the hive3 of HDP has similar issue? >>>> https://issues.apache.org/jira/browse/HIVE-16964 >>>> >>>> A workaround is try to create table without transactional props. >>>> TBLPROPERTIES("transactional"="true") >>>> pls try it, thanks. >>>> >>>> Best regards, >>>> Kaka >>>> >>>> >>>> Joice Jacob <joicejacob1...@gmail.com> 于2023年10月6日周五 22:58写道: >>>> >>>> > Hi, >>>> > I've checked the Hive data directory, and I couldn't find the >>>> > "_orc_acid_version" file. I have attached screenshots for your >>>> reference. >>>> > >>>> > Are there any recommended workarounds or alternative approaches that >>>> I can >>>> > consider to resolve this issue? I'm open to exploring different >>>> solutions >>>> > to ensure the successful integration of Doris with Hive in CDP. Any >>>> > guidance or suggestions would be greatly appreciated. >>>> > >>>> > Screenshots attached for your reference. >>>> > >>>> > Best regards, >>>> > Joice >>>> > >>>> > [image: image.png] >>>> > >>>> > On Fri, Oct 6, 2023 at 6:28 PM Mingyu Chen <morning...@163.com> >>>> wrote: >>>> > >>>> >> Could you list your hive data dir, to see what files exist? like: >>>> >> _orc_acid_version >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> -- >>>> >> >>>> >> Best Regards >>>> >> Mingyu Chen >>>> >> >>>> >> Email: >>>> >> morning...@apache.org >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> At 2023-10-05 22:31:58, "Mingyu Chen" <morning...@163.com> wrote: >>>> >> >Oh, I see, let me check it again. >>>> >> > >>>> >> > >>>> >> > >>>> >> > >>>> >> >-- >>>> >> > >>>> >> >Best Regards >>>> >> >Mingyu Chen >>>> >> > >>>> >> >Email: >>>> >> >morning...@apache.org >>>> >> > >>>> >> > >>>> >> > >>>> >> > >>>> >> > >>>> >> >At 2023-10-05 21:49:28, "Joice Jacob" <joicejacob1...@gmail.com> >>>> wrote: >>>> >> >>Hi, >>>> >> >>I wanted to share some important information regarding the CDP Hive >>>> >> version >>>> >> >>that I am currently using, which is Hive 3.1.3000.7.1.7.0-551. >>>> >> >> >>>> >> >>In CDP Hive version 3.x, a significant change has been introduced >>>> >> regarding >>>> >> >>managed tables. By default, managed tables in Hive 3.x are >>>> considered >>>> >> >>transactional. This means that if you create a managed table >>>> explicitly >>>> >> >>specifying it as transactional with a value of "false," it will be >>>> >> treated >>>> >> >>as an external table instead. >>>> >> >>Thank you for your attention to this matter, and I appreciate your >>>> >> >>continued support. >>>> >> >> >>>> >> >>>> https://www.thecodersstop.com/hadoop/apache-hive-3-changes-in-cdp-upgrade-part-1/ >>>> >> >>Best regards, >>>> >> >>Joice >>>> >> >> >>>> >> >>On Thu, Oct 5, 2023 at 7:01 PM Mingyu Chen <morning...@163.com> >>>> wrote: >>>> >> >> >>>> >> >>> I saw that your hive table is with property "transactional" = >>>> "true", >>>> >> >>> And Doris only support ACID table with Hive 3, not support with >>>> Hive >>>> >> 2.x. >>>> >> >>> So you may need to create non-transactional hive table for Doris >>>> to >>>> >> visit. >>>> >> >>> >>>> >> >>> >>>> >> >>> -- >>>> >> >>> Best Regards >>>> >> >>> Mingyu Chen >>>> >> >>> >>>> >> >>> Email: >>>> >> >>> morning...@apache.org >>>> >> >>> >>>> >> >>> >>>> >> >>> 在 2023-10-05 13:30:25,"Joice Jacob" <joicejacob1...@gmail.com> >>>> 写道: >>>> >> >>> >>>> >> >>> Dear Doris Community & Jiafeng.Zhang >>>> >> >>> , >>>> >> >>> Thank you for your prompt response and your willingness to assist >>>> >> with the >>>> >> >>> issue I've been encountering with the integration of Doris and >>>> >> Cloudera's >>>> >> >>> Hive. >>>> >> >>> >>>> >> >>> To provide you with the necessary information, here are the >>>> details >>>> >> you >>>> >> >>> requested: >>>> >> >>> >>>> >> >>> *1. Hive Catalog Creation Statement:* >>>> >> >>> >>>> >> >>> My Hive catalog creation statement is as follows: >>>> >> >>> >>>> >> >>> CREATE CATALOG hive PROPERTIES ( >>>> >> >>> 'type'='hms', >>>> >> >>> 'hive.metastore.uris' = 'thrift://10.128.0.4:9083', >>>> >> >>> 'hive.version' = '3.1.3', >>>> >> >>> 'hive.metastore.sasl.enabled' = 'true', >>>> >> >>> 'hive.metastore.kerberos.principal' = 'hive/ >>>> instanc...@hadoop.com >>>> >> ', >>>> >> >>> 'hadoop.security.authentication' = 'kerberos', >>>> >> >>> 'hadoop.kerberos.keytab' = >>>> '/home/techuser/doris/hive.keytab', >>>> >> >>> 'hadoop.kerberos.principal' = 'hive/instanc...@hadoop.com', >>>> >> >>> 'yarn.resourcemanager.principal' = 'yarn/ >>>> instanc...@hadoop.com' >>>> >> >>> ); >>>> >> >>> >>>> >> >>> *2. CDP Hive Version:* >>>> >> >>> >>>> >> >>> The CDP Hive version I am using is: Hive 3.1.3000.7.1.7.0-551 >>>> >> >>> >>>> >> >>> *3. Configuration Attempt:* >>>> >> >>> >>>> >> >>> I have tried specifying the Hive version in my catalog >>>> configuration >>>> >> with >>>> >> >>> the following statement: >>>> >> >>> "hive.version=2.1.0" >>>> >> >>> Despite attempting to set the Hive version to 2.1.0, I continue >>>> to >>>> >> >>> experience the same issue, which is detailed in my previous >>>> emails. >>>> >> >>> >>>> >> >>> *4. Logs: * >>>> >> >>> I will attach both the fe.log and fe.warn logs to this email for >>>> your >>>> >> >>> reference. These logs should provide additional context >>>> regarding the >>>> >> issue >>>> >> >>> I'm facing. >>>> >> >>> >>>> >> >>> Once again, I want to express my gratitude for your assistance >>>> and >>>> >> support >>>> >> >>> in resolving this matter. I look forward to your insights and >>>> >> >>> recommendations based on the provided logs and catalog >>>> configuration. >>>> >> >>> >>>> >> >>> Please feel free to let me know if you require any further >>>> >> information or >>>> >> >>> if there are additional steps I should take to assist in >>>> diagnosing >>>> >> and >>>> >> >>> resolving the issue. >>>> >> >>> >>>> >> >>> [image: doris_error.JPG] >>>> >> >>> fe.log log at hive catlog creation time >>>> >> >>> [image: fe_log_at_hivecatalog.JPG] >>>> >> >>> Hive managed table script >>>> >> >>> [image: image.png] >>>> >> >>> >>>> >> >>> >>>> >> >>> Best regards, >>>> >> >>> Joice >>>> >> >>> >>>> >> >>> On Thu, Oct 5, 2023 at 8:10 AM Jiafeng.Zhang <zhang...@gmail.com >>>> > >>>> >> wrote: >>>> >> >>> >>>> >> >>>> Can you provide your hive catalog creation statement, your >>>> fe.log >>>> >> log at >>>> >> >>>> that time, and your cdp hive version? This will help us locate >>>> the >>>> >> >>>> problem, >>>> >> >>>> thank you. >>>> >> >>>> You can also try specifying your hive version in your catalog >>>> >> statement: >>>> >> >>>> "hive.version=2.1.0" >>>> >> >>>> >>>> >> >>>> Joice Jacob <joicejacob1...@gmail.com> 于2023年10月5日周四 00:45写道: >>>> >> >>>> >>>> >> >>>> > I am reaching out once again to seek assistance and share a >>>> >> specific >>>> >> >>>> issue >>>> >> >>>> > I've encountered while integrating Doris with >>>> Cloudera-flavored >>>> >> Hive. >>>> >> >>>> The >>>> >> >>>> > error message I'm facing is as follows: >>>> >> >>>> > detailMessage = get file split failed for table: baby1, err: >>>> >> >>>> > java.lang.Exception: Hive 2.x versioned full-acid tables need >>>> to >>>> >> run >>>> >> >>>> major >>>> >> >>>> > compaction. >>>> >> >>>> > >>>> >> >>>> > This error message appears when I attempt to query a Hive >>>> table >>>> >> from >>>> >> >>>> > Doris, and it seems to be related to Hive 2.x versioned >>>> full-ACID >>>> >> tables >>>> >> >>>> > requiring a major compaction. >>>> >> >>>> > I would like to ask the Doris community for guidance on how to >>>> >> handle >>>> >> >>>> this >>>> >> >>>> > issue effectively. Specifically, I am interested in >>>> understanding >>>> >> the >>>> >> >>>> best >>>> >> >>>> > practices and steps to follow when dealing with >>>> Cloudera-flavored >>>> >> Hive >>>> >> >>>> > tables that require major compaction for Doris integration. >>>> >> >>>> > >>>> >> >>>> > If anyone in the community has successfully addressed this >>>> issue >>>> >> or can >>>> >> >>>> > provide insights into how to configure and manage >>>> >> Cloudera-flavored Hive >>>> >> >>>> > tables for integration with Doris, your expertise would be >>>> highly >>>> >> >>>> > appreciated. >>>> >> >>>> > >>>> >> >>>> > Thank you for your time and support, and I look forward to >>>> >> receiving >>>> >> >>>> your >>>> >> >>>> > valuable input. >>>> >> >>>> > Joice >>>> >> >>>> > >>>> >> >>>> > On Wed, Oct 4, 2023 at 9:23 PM Joice Jacob < >>>> >> joicejacob1...@gmail.com> >>>> >> >>>> > wrote: >>>> >> >>>> > >>>> >> >>>> >> Dear Doris Community, >>>> >> >>>> >> I have an update on the issue I previously mentioned >>>> regarding the >>>> >> >>>> >> integration of Hive on a CDP distribution with Doris 2.0.1.1. >>>> >> >>>> >> >>>> >> >>>> >> After further investigation, I have identified that the >>>> issue is >>>> >> >>>> related >>>> >> >>>> >> to Hive managed tables being transactional by default. This >>>> >> appears to >>>> >> >>>> be >>>> >> >>>> >> causing the error I encountered earlier. >>>> >> >>>> >> >>>> >> >>>> >> To address this issue, I am seeking guidance from the >>>> community >>>> >> on any >>>> >> >>>> >> specific configurations or settings that need to be adjusted >>>> for >>>> >> Hive >>>> >> >>>> >> transactional tables when using Hive as the catalog in >>>> Doris. Are >>>> >> >>>> there any >>>> >> >>>> >> recommended configurations or best practices that I should >>>> follow >>>> >> to >>>> >> >>>> ensure >>>> >> >>>> >> smooth integration and query execution? >>>> >> >>>> >> >>>> >> >>>> >> Any insights or recommendations from the Doris community >>>> would be >>>> >> >>>> greatly >>>> >> >>>> >> appreciated. Your expertise and guidance will be >>>> instrumental in >>>> >> >>>> helping me >>>> >> >>>> >> resolve this challenge. >>>> >> >>>> >> >>>> >> >>>> >> Thank you for your continued support, and I look forward to >>>> your >>>> >> >>>> valuable >>>> >> >>>> >> input. >>>> >> >>>> >> >>>> >> >>>> >> Best regards, >>>> >> >>>> >> Joice >>>> >> >>>> >> >>>> >> >>>> >> On Wed, Oct 4, 2023 at 6:54 PM Joice Jacob < >>>> >> joicejacob1...@gmail.com> >>>> >> >>>> >> wrote: >>>> >> >>>> >> >>>> >> >>>> >>> Dear Doris Community, >>>> >> >>>> >>> >>>> >> >>>> >>> I am reaching out to the community to seek assistance with >>>> an >>>> >> >>>> >>> integration issue I've encountered while trying to use Hive >>>> on a >>>> >> CDP >>>> >> >>>> >>> distribution with Doris 2.0.1.1. >>>> >> >>>> >>> >>>> >> >>>> >>> Here are the details of my setup: >>>> >> >>>> >>> >>>> >> >>>> >>> Doris Version: 2.0.1.1 >>>> >> >>>> >>> Hive Version: 3.1.3 >>>> >> >>>> >>> Cluster Security: Kerberized >>>> >> >>>> >>> >>>> >> >>>> >>> I have successfully created a Hive catalog in Doris and have >>>> >> been able >>>> >> >>>> >>> to set up the integration between Hive and Doris. However, >>>> when I >>>> >> >>>> attempt >>>> >> >>>> >>> to query a Hive table using Doris, I encounter the following >>>> >> error: >>>> >> >>>> >>> >>>> >> >>>> >>> ERROR 1105 (HY000): errCode = 2, detailMessage = get file >>>> split >>>> >> failed >>>> >> >>>> >>> for table: baby1, err: java.lang.Exception: Hive 2.x >>>> versioned >>>> >> >>>> full-acid >>>> >> >>>> >>> tables need to run major compaction. >>>> >> >>>> >>> >>>> >> >>>> >>> I have already performed a major compaction as recommended, >>>> but >>>> >> I am >>>> >> >>>> >>> still encountering the same error. >>>> >> >>>> >>> >>>> >> >>>> >>> I would greatly appreciate any insights, guidance, or >>>> solutions >>>> >> that >>>> >> >>>> the >>>> >> >>>> >>> Doris community can offer to help me resolve this issue. If >>>> >> anyone has >>>> >> >>>> >>> encountered a similar problem or has expertise in >>>> integrating >>>> >> Hive >>>> >> >>>> with >>>> >> >>>> >>> Doris, your assistance would be invaluable. >>>> >> >>>> >>> >>>> >> >>>> >>> Thank you in advance for your time and support. I look >>>> forward to >>>> >> >>>> >>> hearing from the community and working together to find a >>>> >> solution to >>>> >> >>>> this >>>> >> >>>> >>> challenge. >>>> >> >>>> >>> >>>> >> >>>> >>> [image: baby_table.JPG] >>>> >> >>>> >>> [image: doris_hive_catlog_result.JPG] >>>> >> >>>> >>> >>>> >> >>>> >>> [image: doris_fe_log.JPG] >>>> >> >>>> >>> >>>> >> >>>> >>> [image: hive_metastore_error.JPG] >>>> >> >>>> >>> >>>> >> >>>> >>> Thanks >>>> >> >>>> >>> Joice >>>> >> >>>> >>> >>>> >> >>>> >>> >>>> >> >>>> >>> >>>> >> >>>> >>> >>>> >> >>>> >>> >>>> >> >>>> >>>> >> >>>> -- >>>> >> >>>> 张家峰 >>>> >> >>>> >>>> >> >>> >>>> >> >>>> > >>>> >>>