I'm building up a set of classes (ObjectInspectors and SerDes) to allow Hive queries over some data files I have. While I've got it working, I don't fully grok all the concepts involved. Right now I've got two questions.
I'm able to make queries like this (this is the first syntax I tried for querying into what I know are lists of objects; is it the best way?):

    select messageId, lastmodifiedDate, contexts[1].conceptId from MessageData LIMIT 5;

and get the conceptId of the first context element in the first 5 rows (my MessageData contexts field is a list of context objects), or like this:

    select messageId, lastmodifiedDate, contexts.conceptId from MessageData LIMIT 5;

and get the conceptId of all the context elements in the first 5 rows. But I can't make a query like this:

    select messageId, lastmodifiedDate, count(contexts) LIMIT 5;

Is there a different syntax to query the length of that list of objects?

Also, currently when you query

    select messageId, lastmodifiedDate, contexts LIMIT 1;

you get back a fully expanded representation of all of the contexts for 1 row. What I'd really like is for that query to just return the list of contextIds (as if the query had been contexts.contextId), while still being able to query down into the contexts as above. Is there some way my ObjectInspector could respond to

    select messageId, lastmodifiedDate, contexts;

as if it were

    select messageId, lastmodifiedDate, contexts.contextId

but also still respond correctly to

    select messageId, lastmodifiedDate, contexts.conceptId

?

Thanks for the help,
Lauren