Hi,

I updated the JIRA. Kindly give your suggestions so that I can go ahead and complete the task.

Thanks

On Tue, Feb 1, 2011 at 12:25 PM, bharath vissapragada <bharathvissapragada1...@gmail.com> wrote:
> Thanks for replying, Namit.
>
> It is motivating to receive a mail from the authors of Hive :).
>
> I filed the jira based on the discussion:
> https://issues.apache.org/jira/browse/HIVE-1938
>
> I will try to update my idea asap.
>
> Thanks
> Bharath,V
> 4th year Undergrad, IIIT Hyderabad.
> w: http://research.iiit.ac.in/~bharath.v
>
> On Tue, Feb 1, 2011 at 11:46 AM, Namit Jain <nj...@fb.com> wrote:
>> Bharath,
>>
>> This would be great.
>>
>> Why don't you write up something about how you are planning to proceed?
>> File a new jira and load some design notes/spec there.
>> We can definitely sync up from there.
>>
>> This feature would be very useful to the community - we, at Facebook,
>> would definitely like to use it.
>>
>> Thanks,
>> -namit
>>
>> On 1/31/11 9:50 PM, "bharath vissapragada"
>> <bharathvissapragada1...@gmail.com> wrote:
>>
>>> Hi Ning, Anja,
>>>
>>> I am doing my Masters thesis on this topic. I have implemented all
>>> SQL features like joins, selects, etc. on top of Hadoop (before knowing
>>> about Hive) and we have derived some basic cost models for join
>>> re-ordering which seem to be working fine on some basic scales of TPCH
>>> datasets. Later I came to know about Hive and I am trying to
>>> implement the same in Hive.
>>>
>>> Right now I am in the process of understanding Hive's source and I am
>>> almost done with the "ql" package. I think it would be great if you guys
>>> can help us in this regard. I am a bit confused about the
>>> implementation of joins, and once I'm done with that, I can modify the
>>> "joinReorder" of the Optimizer package by using the cost formulae and
>>> metadata. It would be a great opportunity to work with you guys at FB
>>> and contribute to Hive.
>>>
>>> Thanks
>>> Bharath,V
>>> 4th year Undergrad, IIIT Hyderabad.
>>> w: http://research.iiit.ac.in/~bharath.v
>>>
>>> On Tue, Feb 1, 2011 at 9:22 AM, Ning Zhang <nzh...@fb.com> wrote:
>>>> Hi Anja,
>>>>
>>>> As you noticed, Hive has only limited support for cost-based
>>>> optimization. One of the reasons is that Hive used to have a very small
>>>> number of optional execution plans to choose from. One exception is
>>>> mapjoin vs. common join. Liying Tang did some work during his last
>>>> internship to convert common joins to mapjoins in a rule-based fashion.
>>>> One of his future work items is to automatically convert common joins
>>>> to mapjoins based on stats. There is also ongoing work on indexes in
>>>> Hive. With the support of indexes, CBO will be much needed.
>>>>
>>>> In order for a decent CBO to work, we need stats and cost models. There
>>>> is some work on stats. Table/partition-level stats are already
>>>> supported. There is a JIRA open for column-level stats (HIVE-1362). The
>>>> cost model is much more complex in a Hadoop environment and closely
>>>> dependent on the mapjoin/index implementations. Given all these in
>>>> place, we can then talk about plan enumeration etc.
>>>>
>>>> So yes, we are interested in CBO, but it is a large area and many
>>>> missing pieces in Hive need to be filled in. If you have a particular
>>>> interest in some area, you can propose your ideas on the
>>>> hive-...@hive.apache.org mailing list or even apply for an internship
>>>> at FB if you would like to work closely with us.
>>>>
>>>> Thanks,
>>>> Ning
>>>>
>>>> On Jan 31, 2011, at 2:04 PM, Anja Gruenheid wrote:
>>>>
>>>>> Hi!
>>>>>
>>>>> I'm a graduate student from Georgia Tech and I'm working with Hive for
>>>>> a research project. I am interested in query optimization and the Hive
>>>>> MetaStore in that context. Working through the documentation and code,
>>>>> I noticed that the implementation right now is using a rule-based
>>>>> optimization system.
>>>>> Therefore, I was wondering whether cost-based query optimization will
>>>>> be a future task in the development of Hive and if it would be
>>>>> possible for me to cooperate with the developers of Hive to advance
>>>>> the project in general.
>>>>>
>>>>> Best regards,
>>>>> Anja Gruenheid