Hi,

We encounter this in Hive 0.13.1 when running CREATE TEMPORARY FUNCTION while a SELECT is executing in the same database.
I don't think it is necessary to require an EXCLUSIVE lock for such DDL statements. I found this patch: <https://issues.apache.org/jira/browse/HIVE-6734>. It seems to check the writetype only in DbTxnManager. Maybe it would be a good idea to check the DDL's writetype in DummyTxnManager too.

2014-09-09 22:48 GMT+08:00 Edward Capriolo <edlinuxg...@gmail.com>:

> We use our own library: simple constructions like files in HDFS that work
> like pid/lock files. A file like /flags/tablea/process1 could mean "hey, I'm
> working on table a, leave it alone". It accomplishes the exact same thing with
> less fuss, and it is also much easier for an external process/scheduler/shell
> script to integrate with this system. I doubt many use Hive locking as flow
> control for a scheduling system.
>
> On Tue, Sep 9, 2014 at 3:25 AM, wzc <wzc1...@gmail.com> wrote:
>
>> Hi,
>> We also encounter this in Hive 0.13. We need to enable concurrency in our
>> daily ETL workflows (to keep a sub-ETL from reading its parent ETL's output
>> while the parent is still running).
>> We found that in Hive 0.13, opening the Hive CLI shell sometimes outputs
>> the message "conflicting lock present for default mode EXCLUSIVE" and
>> waits for some locks to be released. We haven't encountered this in Hive 0.11
>> and are still trying to figure it out.
>>
>> 2014-08-25 15:21 GMT+08:00 Sourygna Luangsay <sluang...@pragsis.com>:
>>
>>> Many thanks, Edward, for this complete answer.
>>>
>>> So the main idea is simply to disable concurrency in Hive, if I
>>> understand you correctly.
>>>
>>> My question now is: is this something most Hive users do by default?
>>>
>>> Can somebody else share their own experience?
>>>
>>> Regards,
>>>
>>> *Sourygna Luangsay*
>>>
>>> *From:* Edward Capriolo [mailto:edlinuxg...@gmail.com]
>>> *Sent:* Friday, 22 August 2014 16:07
>>> *To:* user@hive.apache.org
>>> *Subject:* Re: doubt about locking mechanism in Hive
>>>
>>> IMHO locking support should be turned off by default.
>>> I would argue that if you require this feature often, you may be
>>> designing your systems improperly.
>>>
>>> You really should not have that many situations where you need locking
>>> in a write (mostly) once file system. The only time I have ever used it was
>>> when I had a process completely rewriting the contents of a table and I
>>> needed downstream things not to select from this table while it was in an
>>> inconsistent state. Having it on by default is a bad idea. You have pointed
>>> out a case where a simple select query attempts to acquire locks it
>>> does not need. That puts strain on more systems and creates more chances
>>> for issues.
>>>
>>> One of the big design philosophy issues I tend to have with Hive lately
>>> is that we have this pool of users (like myself) who use Hive for its
>>> original purpose: to query write-once text files and create aggregations.
>>>
>>> Then there are other groups attempting to implement very complicated
>>> semantics around streaming, transactions, locking, whatever. Then you have
>>> tools like Cloudera Manager giving configuration warnings such as:
>>>
>>> "Hive: Hive is not configured with ZooKeeper Service. As a result,
>>> hive-site will not contain hive.zookeeper.quorum, which can lead to
>>> corruption in concurrency scenarios."
>>>
>>> I think this statement is incorrect AND is BAD advice. Then users such
>>> as yourself reach a conclusion like "I should turn on locking", because no
>>> one would ever assume that ....
>>>
>>> !!!SELECTING 1 ROW FROM A TABLE WOULD CAUSE 1100 LOCKS TO BE
>>> ACQUIRED!!!!
>>>
>>> ::rant over:: I am not saying that Hive locking is bad, but I am saying
>>> I leave it off and turn it on when I need it, on a per-query basis.
>>>
>>> On Fri, Aug 22, 2014 at 8:48 AM, Sourygna Luangsay <sluang...@pragsis.com> wrote:
>>>
>>> Hi,
>>>
>>> I have some trouble with the locking/concurrency mechanism of Hive when
>>> doing a large select and trying to create a table at the same time.
>>>
>>> My version of Hive is 0.13.
>>>
>>> What I am trying to do is the following:
>>>
>>> 1) In a hive shell:
>>> use mydatabase;
>>> select * from competence limit 1; # this table has 1100 partitions.
>>> So with hive.support.concurrency=true, it needs at least 90s to execute (I
>>> know, this is a silly query: I should rather do a select * where “a
>>> partition”… The purpose of this query is to reproduce the problem easily by
>>> having a query that needs a lot of time to execute)
>>>
>>> 2) In another hive shell, while the 1st query is executing:
>>> use mydatabase;
>>> create table probsourygna (foo string) ROW FORMAT DELIMITED FIELDS
>>> TERMINATED BY '\t' STORED AS TEXTFILE ;
>>>
>>> The problem is that the “create table” does not execute until the first
>>> query (select) has finished.
>>>
>>> And we can see messages of the following type:
>>>
>>> conflicting lock present for mydatabase mode EXCLUSIVE
>>>
>>> conflicting lock present for mydatabase mode EXCLUSIVE
>>>
>>> …
>>>
>>> (1 line every 60 s)
>>>
>>> It seems to me that the first query puts a shared lock at the database
>>> (mydatabase) level.
>>>
>>> Then, the second query tries to acquire an exclusive lock at the
>>> database level (fails and retries every 60s).
>>>
>>> Am I right? (when I look at the documentation
>>> https://cwiki.apache.org/confluence/display/Hive/Locking , it says
>>> nothing about locks at a database level)
>>>
>>> Is there any solution to my problem?
>>> (that is, keeping a long “select” from blocking a “create” query,
>>> without removing the concurrency of Hive)
>>>
>>> Regards,
>>>
>>> *Sourygna Luangsay*
>>>
>>> CONFIDENTIALITY WARNING.
>>> This message and the information contained in or attached to it are
>>> private and confidential and intended exclusively for the addressee.
>>> Pragsis informs to whom it may receive it in error that it contains
>>> privileged information and its use, copy, reproduction or distribution is
>>> prohibited. If you are not an intended recipient of this E-mail, please
>>> notify the sender, delete it and do not read, act upon, print, disclose,
>>> copy, retain or redistribute any portion of this E-mail.
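
For anyone who wants to experiment with the flag-file idea Edward describes earlier in this thread (a file like /flags/tablea/process1 meaning "I'm working on table a"), here is a minimal sketch in Python. It is not Edward's library, which the thread does not show. It assumes a local directory as a stand-in for HDFS, and the names `try_acquire`, `release` and `holders` are hypothetical. It relies on `os.mkdir` failing when the directory already exists, which makes the claim atomic on a local filesystem. An HDFS version would need an equivalent atomic create through an HDFS client, which is not shown here.

```python
import os
import shutil
import tempfile


def try_acquire(flag_root, table, owner):
    """Claim `table` by creating <flag_root>/<table>/<owner>.

    Returns True if the claim succeeded, False if the table is already
    claimed. Assumes `flag_root` exists (hypothetical layout, local disk only).
    """
    table_dir = os.path.join(flag_root, table)
    try:
        # mkdir fails if the directory exists, so only one caller can win.
        os.mkdir(table_dir)
    except FileExistsError:
        return False
    # Record who holds the flag, mirroring /flags/tablea/process1.
    with open(os.path.join(table_dir, owner), "w") as f:
        f.write(str(os.getpid()))
    return True


def release(flag_root, table, owner):
    """Drop the flag. Raises FileNotFoundError if `owner` does not hold it."""
    table_dir = os.path.join(flag_root, table)
    os.remove(os.path.join(table_dir, owner))
    os.rmdir(table_dir)


def holders(flag_root, table):
    """List the owners currently flagged on `table` (empty list if none)."""
    try:
        return sorted(os.listdir(os.path.join(flag_root, table)))
    except FileNotFoundError:
        return []


if __name__ == "__main__":
    root = tempfile.mkdtemp()  # stand-in for an HDFS /flags directory
    try:
        print(try_acquire(root, "tablea", "process1"))
        print(try_acquire(root, "tablea", "process2"))
        release(root, "tablea", "process1")
        print(try_acquire(root, "tablea", "process2"))
    finally:
        shutil.rmtree(root)
```

A downstream job or shell script would call `holders` (or simply list the directory) before reading the table, and skip or wait if it is non-empty. Unlike Hive's lock manager, nothing here cleans up after a crashed process, so a real setup would also need some policy for removing stale flags.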