Hi Joice,

BTW,could you pls try to query it by Doris after putting the
"_orc_acid_version"
file to the hdfs directory of hive manually? Thanks.

Best regards,
Kaka

kaka chen <kaka11.c...@gmail.com> 于2023年10月8日周日 10:10写道:

> Hi Joice,
>
> Got it, thanks very much. We will setup a cdp env and do some
> compatibility work.
>
> Best regards,
> Kaka
>
> Joice Jacob <joicejacob1...@gmail.com> 于2023年10月7日周六 14:58写道:
>
>> Hi,Dear Doris Community,
>> I wanted to provide an update on my recent experience with integrating
>> Doris with both Apache Hive and CDP Hive, as well as seek insights into a
>> particular observation.
>>
>> In my integration with Apache Hive, I noticed that transaction tables
>> retain the "_orc_acid_version" file, which aligns with the expected
>> behavior. However, when I integrated with CDP Hive, I observed that this
>> "_orc_acid_version" file was conspicuously absent.
>>
>> This discrepancy in behavior between the two environments has left me
>> somewhat perplexed, and I'm keen to understand the underlying reasons
>> behind it. It appears that in CDP Hive, *managed tables are set to be
>> transactional by default*. Interestingly, when we create managed tables
>> with the TBLPROPERTIES("transactional"="false") option, they are seemingly 
>> *treated
>> as external tables*.
>>
>> I've attached screenshots to illustrate this scenario for further clarity.
>> CREATE  TABLE  BABY2(
>> id int,
>> FNAME VARCHAR(50),
>> GENDER VARCHAR(2),
>> TOTCOUNT INT
>> )
>> PARTITIONED BY(DATA_ID INT)
>> TBLPROPERTIES (
>> 'transactional'='false',
>> 'orc.compress'='snappy');
>>
>> if we execute above statement table create like this.
>> [image: image.png]
>>
>> I would greatly appreciate any insights, suggestions, or explanations
>> that the Doris community may have regarding this behavior. Your expertise
>> will be invaluable in helping me navigate this integration and address any
>> associated challenges.
>>
>> Thank you in advance for your time and support. I look forward to hearing
>> from you and to collaborating with the community to better understand and
>> resolve this matter.
>>
>> Best regards,
>> joice
>>
>> On Sat, Oct 7, 2023 at 11:46 AM kaka chen <kaka11.c...@gmail.com> wrote:
>>
>>> Hi Joice:
>>> Thanks your reporting.
>>>
>>> It seems the root cause of this issue is missing "_orc_acid_version"
>>> file.
>>> From Hive version >= 3.0, delta/base files will always have file
>>> '_orc_acid_version' with value >= '2'.
>>> Maybe the hive3 of HDP has similar issue?
>>> https://issues.apache.org/jira/browse/HIVE-16964
>>>
>>> A workaround is try to create table without transactional props.
>>> TBLPROPERTIES("transactional"="true")
>>> pls try it, thanks.
>>>
>>> Best regards,
>>> Kaka
>>>
>>>
>>> Joice Jacob <joicejacob1...@gmail.com> 于2023年10月6日周五 22:58写道:
>>>
>>> > Hi,
>>> > I've checked the Hive data directory, and I couldn't find the
>>> > "_orc_acid_version" file. I have attached screenshots for your
>>> reference.
>>> >
>>> > Are there any recommended workarounds or alternative approaches that I
>>> can
>>> > consider to resolve this issue? I'm open to exploring different
>>> solutions
>>> > to ensure the successful integration of Doris with Hive in CDP. Any
>>> > guidance or suggestions would be greatly appreciated.
>>> >
>>> > Screenshots attached for your reference.
>>> >
>>> > Best regards,
>>> > Joice
>>> >
>>> > [image: image.png]
>>> >
>>> > On Fri, Oct 6, 2023 at 6:28 PM Mingyu Chen <morning...@163.com> wrote:
>>> >
>>> >> Could you list your hive data dir, to see what files exist? like:
>>> >> _orc_acid_version
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >>
>>> >> Best Regards
>>> >> Mingyu Chen
>>> >>
>>> >> Email:
>>> >> morning...@apache.org
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> At 2023-10-05 22:31:58, "Mingyu Chen" <morning...@163.com> wrote:
>>> >> >Oh, I see, let me check it again.
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >--
>>> >> >
>>> >> >Best Regards
>>> >> >Mingyu Chen
>>> >> >
>>> >> >Email:
>>> >> >morning...@apache.org
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >At 2023-10-05 21:49:28, "Joice Jacob" <joicejacob1...@gmail.com>
>>> wrote:
>>> >> >>Hi,
>>> >> >>I wanted to share some important information regarding the CDP Hive
>>> >> version
>>> >> >>that I am currently using, which is Hive 3.1.3000.7.1.7.0-551.
>>> >> >>
>>> >> >>In CDP Hive version 3.x, a significant change has been introduced
>>> >> regarding
>>> >> >>managed tables. By default, managed tables in Hive 3.x are
>>> considered
>>> >> >>transactional. This means that if you create a managed table
>>> explicitly
>>> >> >>specifying it as transactional with a value of "false," it will be
>>> >> treated
>>> >> >>as an external table instead.
>>> >> >>Thank you for your attention to this matter, and I appreciate your
>>> >> >>continued support.
>>> >> >>
>>> >>
>>> https://www.thecodersstop.com/hadoop/apache-hive-3-changes-in-cdp-upgrade-part-1/
>>> >> >>Best regards,
>>> >> >>Joice
>>> >> >>
>>> >> >>On Thu, Oct 5, 2023 at 7:01 PM Mingyu Chen <morning...@163.com>
>>> wrote:
>>> >> >>
>>> >> >>> I saw that your hive table is with property "transactional" =
>>> "true",
>>> >> >>> And Doris only support ACID table with Hive 3, not support with
>>> Hive
>>> >> 2.x.
>>> >> >>> So you may need to create non-transactional hive table for Doris
>>> to
>>> >> visit.
>>> >> >>>
>>> >> >>>
>>> >> >>> --
>>> >> >>> Best Regards
>>> >> >>> Mingyu Chen
>>> >> >>>
>>> >> >>> Email:
>>> >> >>> morning...@apache.org
>>> >> >>>
>>> >> >>>
>>> >> >>> 在 2023-10-05 13:30:25,"Joice Jacob" <joicejacob1...@gmail.com>
>>> 写道:
>>> >> >>>
>>> >> >>> Dear Doris Community & Jiafeng.Zhang
>>> >> >>> ,
>>> >> >>> Thank you for your prompt response and your willingness to assist
>>> >> with the
>>> >> >>> issue I've been encountering with the integration of Doris and
>>> >> Cloudera's
>>> >> >>> Hive.
>>> >> >>>
>>> >> >>> To provide you with the necessary information, here are the
>>> details
>>> >> you
>>> >> >>> requested:
>>> >> >>>
>>> >> >>> *1. Hive Catalog Creation Statement:*
>>> >> >>>
>>> >> >>> My Hive catalog creation statement is as follows:
>>> >> >>>
>>> >> >>> CREATE CATALOG hive PROPERTIES (
>>> >> >>>     'type'='hms',
>>> >> >>>     'hive.metastore.uris' = 'thrift://10.128.0.4:9083',
>>> >> >>> 'hive.version' = '3.1.3',
>>> >> >>>     'hive.metastore.sasl.enabled' = 'true',
>>> >> >>>     'hive.metastore.kerberos.principal' = 'hive/
>>> instanc...@hadoop.com
>>> >> ',
>>> >> >>>     'hadoop.security.authentication' = 'kerberos',
>>> >> >>>     'hadoop.kerberos.keytab' = '/home/techuser/doris/hive.keytab',
>>> >> >>>     'hadoop.kerberos.principal' = 'hive/instanc...@hadoop.com',
>>> >> >>>     'yarn.resourcemanager.principal' = 'yarn/
>>> instanc...@hadoop.com'
>>> >> >>> );
>>> >> >>>
>>> >> >>> *2. CDP Hive Version:*
>>> >> >>>
>>> >> >>> The CDP Hive version I am using is: Hive 3.1.3000.7.1.7.0-551
>>> >> >>>
>>> >> >>> *3. Configuration Attempt:*
>>> >> >>>
>>> >> >>> I have tried specifying the Hive version in my catalog
>>> configuration
>>> >> with
>>> >> >>> the following statement:
>>> >> >>> "hive.version=2.1.0"
>>> >> >>> Despite attempting to set the Hive version to 2.1.0, I continue to
>>> >> >>> experience the same issue, which is detailed in my previous
>>> emails.
>>> >> >>>
>>> >> >>> *4. Logs: *
>>> >> >>> I will attach both the fe.log and fe.warn logs to this email for
>>> your
>>> >> >>> reference. These logs should provide additional context regarding
>>> the
>>> >> issue
>>> >> >>> I'm facing.
>>> >> >>>
>>> >> >>> Once again, I want to express my gratitude for your assistance and
>>> >> support
>>> >> >>> in resolving this matter. I look forward to your insights and
>>> >> >>> recommendations based on the provided logs and catalog
>>> configuration.
>>> >> >>>
>>> >> >>> Please feel free to let me know if you require any further
>>> >> information or
>>> >> >>> if there are additional steps I should take to assist in
>>> diagnosing
>>> >> and
>>> >> >>> resolving the issue.
>>> >> >>>
>>> >> >>> [image: doris_error.JPG]
>>> >> >>>  fe.log log at hive catlog creation time
>>> >> >>> [image: fe_log_at_hivecatalog.JPG]
>>> >> >>> Hive managed table script
>>> >> >>> [image: image.png]
>>> >> >>>
>>> >> >>>
>>> >> >>> Best regards,
>>> >> >>> Joice
>>> >> >>>
>>> >> >>> On Thu, Oct 5, 2023 at 8:10 AM Jiafeng.Zhang <zhang...@gmail.com>
>>> >> wrote:
>>> >> >>>
>>> >> >>>> Can you provide your hive catalog creation statement, your fe.log
>>> >> log at
>>> >> >>>> that time, and your cdp hive version? This will help us locate
>>> the
>>> >> >>>> problem,
>>> >> >>>> thank you.
>>> >> >>>> You can also try specifying your hive version in your catalog
>>> >> statement:
>>> >> >>>> "hive.version=2.1.0"
>>> >> >>>>
>>> >> >>>> Joice Jacob <joicejacob1...@gmail.com> 于2023年10月5日周四 00:45写道:
>>> >> >>>>
>>> >> >>>> > I am reaching out once again to seek assistance and share a
>>> >> specific
>>> >> >>>> issue
>>> >> >>>> > I've encountered while integrating Doris with Cloudera-flavored
>>> >> Hive.
>>> >> >>>> The
>>> >> >>>> > error message I'm facing is as follows:
>>> >> >>>> > detailMessage = get file split failed for table: baby1, err:
>>> >> >>>> > java.lang.Exception: Hive 2.x versioned full-acid tables need
>>> to
>>> >> run
>>> >> >>>> major
>>> >> >>>> > compaction.
>>> >> >>>> >
>>> >> >>>> > This error message appears when I attempt to query a Hive table
>>> >> from
>>> >> >>>> > Doris, and it seems to be related to Hive 2.x versioned
>>> full-ACID
>>> >> tables
>>> >> >>>> > requiring a major compaction.
>>> >> >>>> > I would like to ask the Doris community for guidance on how to
>>> >> handle
>>> >> >>>> this
>>> >> >>>> > issue effectively. Specifically, I am interested in
>>> understanding
>>> >> the
>>> >> >>>> best
>>> >> >>>> > practices and steps to follow when dealing with
>>> Cloudera-flavored
>>> >> Hive
>>> >> >>>> > tables that require major compaction for Doris integration.
>>> >> >>>> >
>>> >> >>>> > If anyone in the community has successfully addressed this
>>> issue
>>> >> or can
>>> >> >>>> > provide insights into how to configure and manage
>>> >> Cloudera-flavored Hive
>>> >> >>>> > tables for integration with Doris, your expertise would be
>>> highly
>>> >> >>>> > appreciated.
>>> >> >>>> >
>>> >> >>>> > Thank you for your time and support, and I look forward to
>>> >> receiving
>>> >> >>>> your
>>> >> >>>> > valuable input.
>>> >> >>>> > Joice
>>> >> >>>> >
>>> >> >>>> > On Wed, Oct 4, 2023 at 9:23 PM Joice Jacob <
>>> >> joicejacob1...@gmail.com>
>>> >> >>>> > wrote:
>>> >> >>>> >
>>> >> >>>> >> Dear Doris Community,
>>> >> >>>> >> I have an update on the issue I previously mentioned
>>> regarding the
>>> >> >>>> >> integration of Hive on a CDP distribution with Doris 2.0.1.1.
>>> >> >>>> >>
>>> >> >>>> >> After further investigation, I have identified that the issue
>>> is
>>> >> >>>> related
>>> >> >>>> >> to Hive managed tables being transactional by default. This
>>> >> appears to
>>> >> >>>> be
>>> >> >>>> >> causing the error I encountered earlier.
>>> >> >>>> >>
>>> >> >>>> >> To address this issue, I am seeking guidance from the
>>> community
>>> >> on any
>>> >> >>>> >> specific configurations or settings that need to be adjusted
>>> for
>>> >> Hive
>>> >> >>>> >> transactional tables when using Hive as the catalog in Doris.
>>> Are
>>> >> >>>> there any
>>> >> >>>> >> recommended configurations or best practices that I should
>>> follow
>>> >> to
>>> >> >>>> ensure
>>> >> >>>> >> smooth integration and query execution?
>>> >> >>>> >>
>>> >> >>>> >> Any insights or recommendations from the Doris community
>>> would be
>>> >> >>>> greatly
>>> >> >>>> >> appreciated. Your expertise and guidance will be instrumental
>>> in
>>> >> >>>> helping me
>>> >> >>>> >> resolve this challenge.
>>> >> >>>> >>
>>> >> >>>> >> Thank you for your continued support, and I look forward to
>>> your
>>> >> >>>> valuable
>>> >> >>>> >> input.
>>> >> >>>> >>
>>> >> >>>> >> Best regards,
>>> >> >>>> >> Joice
>>> >> >>>> >>
>>> >> >>>> >> On Wed, Oct 4, 2023 at 6:54 PM Joice Jacob <
>>> >> joicejacob1...@gmail.com>
>>> >> >>>> >> wrote:
>>> >> >>>> >>
>>> >> >>>> >>> Dear Doris Community,
>>> >> >>>> >>>
>>> >> >>>> >>> I am reaching out to the community to seek assistance with an
>>> >> >>>> >>> integration issue I've encountered while trying to use Hive
>>> on a
>>> >> CDP
>>> >> >>>> >>> distribution with Doris 2.0.1.1.
>>> >> >>>> >>>
>>> >> >>>> >>> Here are the details of my setup:
>>> >> >>>> >>>
>>> >> >>>> >>> Doris Version: 2.0.1.1
>>> >> >>>> >>> Hive Version: 3.1.3
>>> >> >>>> >>> Cluster Security: Kerberized
>>> >> >>>> >>>
>>> >> >>>> >>> I have successfully created a Hive catalog in Doris and have
>>> >> been able
>>> >> >>>> >>> to set up the integration between Hive and Doris. However,
>>> when I
>>> >> >>>> attempt
>>> >> >>>> >>> to query a Hive table using Doris, I encounter the following
>>> >> error:
>>> >> >>>> >>>
>>> >> >>>> >>> ERROR 1105 (HY000): errCode = 2, detailMessage = get file
>>> split
>>> >> failed
>>> >> >>>> >>> for table: baby1, err: java.lang.Exception: Hive 2.x
>>> versioned
>>> >> >>>> full-acid
>>> >> >>>> >>> tables need to run major compaction.
>>> >> >>>> >>>
>>> >> >>>> >>> I have already performed a major compaction as recommended,
>>> but
>>> >> I am
>>> >> >>>> >>> still encountering the same error.
>>> >> >>>> >>>
>>> >> >>>> >>> I would greatly appreciate any insights, guidance, or
>>> solutions
>>> >> that
>>> >> >>>> the
>>> >> >>>> >>> Doris community can offer to help me resolve this issue. If
>>> >> anyone has
>>> >> >>>> >>> encountered a similar problem or has expertise in integrating
>>> >> Hive
>>> >> >>>> with
>>> >> >>>> >>> Doris, your assistance would be invaluable.
>>> >> >>>> >>>
>>> >> >>>> >>> Thank you in advance for your time and support. I look
>>> forward to
>>> >> >>>> >>> hearing from the community and working together to find a
>>> >> solution to
>>> >> >>>> this
>>> >> >>>> >>> challenge.
>>> >> >>>> >>>
>>> >> >>>> >>> [image: baby_table.JPG]
>>> >> >>>> >>> [image: doris_hive_catlog_result.JPG]
>>> >> >>>> >>>
>>> >> >>>> >>> [image: doris_fe_log.JPG]
>>> >> >>>> >>>
>>> >> >>>> >>> [image: hive_metastore_error.JPG]
>>> >> >>>> >>>
>>> >> >>>> >>> Thanks
>>> >> >>>> >>> Joice
>>> >> >>>> >>>
>>> >> >>>> >>>
>>> >> >>>> >>>
>>> >> >>>> >>>
>>> >> >>>> >>>
>>> >> >>>>
>>> >> >>>> --
>>> >> >>>> 张家峰
>>> >> >>>>
>>> >> >>>
>>> >>
>>> >
>>>
>>

Reply via email to