Yes, values is the outermost array. Probably array<struct<struct>> is the
maximum level of nesting possible. Any number of structs can be nested, but
internal arrays seem to be an issue. The ones that failed had
array<struct<struct<array<struct>>>>. This broke the SerDe.
Regarding pretty printing
Hooray! Over one hurdle and onto the next one. So something about that
one nested array caused the problem. Very strange. I wonder if there is a
smaller test case to look at, since it seems not all arrays break it; I
see one for the attribute "values".
As to the formatting issue, I don't believe
Finally I could get it to work. The issue resolves once I remove the arrays
within the position structure, so that is the limitation of the SerDe. I
changed 'industries' to string and 'jobfunctions' to Map, and I
can query the table just fine now. Here is the complete DDL for reference:
create external table
Thanks Stephen,
Let me explore options. I will let you all know once I am successful.
regards
Sunita
On Wed, Jun 19, 2013 at 3:08 PM, Stephen Sprague wrote:
> try_parsed_json is not trivial imho :)
>
> start with the very, very basic, for example, { "jobs" : "foo" }. Get
> that to work first
try_parsed_json is not trivial imho :)
Start with the very, very basic, for example, { "jobs" : "foo" }. Get
that to work first. :) When that works, add a level of nesting and see
what happens. Keep building on it until you either break it (and then you
know the last thing you added broke it)
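The build-up Stephen describes can be sketched as a small script, assuming the records are fed to Hive one JSON object per line (file names and fields are illustrative):

```python
import json

# Test records of increasing nesting depth, one file per step, so the
# first depth that returns NULL in Hive pinpoints where things break.
cases = [
    {"jobs": "foo"},                                       # flat string
    {"jobs": {"id": 1}},                                   # struct
    {"jobs": {"values": [{"id": 1}]}},                     # array of structs
    {"jobs": {"values": [{"position": {"title": "x"}}]}},  # one level deeper
]

for depth, case in enumerate(cases, start=1):
    # One JSON object per line, as the SerDe expects.
    with open(f"case_{depth}.json", "w") as f:
        f.write(json.dumps(case) + "\n")
```

Load each file into its own small test table; the first depth whose query comes back NULL marks the level that the SerDe cannot handle.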
Thanks for looking into it Ramki.
Yes I had tried these options. Here is what I get (renamed the table to
have a meaningful name):
hive> select jobs.values[1].id from linkedinjobsearch;
..mapreduce task details
OK
NULL
Time taken: 9.586 seconds
hive> select jobs.values[0].position.title
Can you run some other queries on the jobs1 table and see if any query returns
some data?
I am guessing your query "select jobs.values.position.title from jobs1;"
may have an issue. Maybe it should be:
select jobs.values[0].position.title from jobs1;
Regards,
Ramki.
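The distinction above can be sketched in plain Python on made-up data, just to show why the index is needed: values is an array, so an element must be selected before descending into the struct:

```python
# Hypothetical record shaped like the jobs1 table.
jobs = {"values": [{"id": 7, "position": {"title": "Engineer"}}]}

# jobs.values.position.title    -> invalid: 'values' is a list
# jobs.values[0].position.title -> valid: index first, then descend
title = jobs["values"][0]["position"]["title"]
print(title)  # Engineer
```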
On Wed, Jun 19, 2013 at
Thanks Stephen,
That's just what I tried with the try_parsed table. It is exactly the same
data with less nesting in the structure and fewer entries.
Do you mean to say that highly nested JSONs can lead to issues? What are
typical solutions to such issues? Write UDFs in Hive or parse the JSON
I think you might have to start small here instead of going for the home
run on the first swing. When all else fails, start with a trivial JSON
object and then build up from there and see what additional step breaks
it. That way you know, if the trivial example fails, it's something
fundamental and n
Thanks for sharing your experience Richa.
I do have timestamps but in the format of year : INT, day : INT, month :
INT.
As per your suggestion, I changed them all to string, but still get null as
the output.
regards
Sunita
On Wed, Jun 19, 2013 at 2:17 AM, Richa Sharma wrote:
> Do you have any t
Do you have any timestamp fields in the table that might contain null
values?
I faced a similar situation some time back; changing the data type to
string made it work.
But I was working on delimited text files.
Not sure if it applies to JSON, but it's still worth giving a try!
Richa
On We
Having a column name the same as the table name is a problem, due to which
I was not able to reference jobs.values.id from jobs. Changing the table
name to jobs1 resolved the semantic error.
However, the query still returns null:
hive> select jobs.values.position.title from jobs1;
Total MapReduce j
Ok.
The data files are quite small. Around 35 KB and 1 KB each.
[sunita@node01 tables]$ hadoop fs -ls /user/sunita/tables/jobs
Found 1 items
-rw-r--r-- 3 sunita hdfs 35172 2013-06-18 18:31
/user/sunita/tables/jobs/jobs_noSite_parsed.json
[sunita@node01 tables]$ hadoop fs -text
/user/suni
As Nitin alluded to, it's best to confirm the data is definitely in HDFS
using hdfs semantics rather than Hive as the first step.
1. how big is it? hadoop fs -ls
2. cat a bit of it and see if anything is there. hadoop fs -text / | head -10
do you see any data from step #2?
On Tue, Jun 18,
Can you run a little more complex query?
Select uniq across columns or do some maths, so we know when it fires up a
mapreduce.
On Wed, Jun 19, 2013 at 1:59 AM, Sunita Arvind wrote:
> Thanks for responding Nitin. Yes I am sure that serde is working fine and
> json file is being picked based on all
Thanks for responding Nitin. Yes, I am sure that the serde is working fine
and the json file is being picked up, based on all the errors that showed up
till this stage. What sort of error are you suspecting: file not present, or
serde not parsing it?
On Tuesday, June 18, 2013, Nitin Pawar wrote:
> select * fr
select * from table is as good as hdfs -cat.
Are you sure there is any data in the table?
On Tue, Jun 18, 2013 at 11:54 PM, Sunita Arvind wrote:
> Hi,
>
> I am able to parse the input JSON file and load it into hive. I do not see
> any errors with create table, so I am assuming that. But when I
Hi,
I am able to parse the input JSON file and load it into hive. I do not see
any errors with create table, so I am assuming that. But when I try to read
the data, I get null
hive> select * from jobs;
OK
null
I have validated the JSON with JSONLint and the Notepad++ JSON plugin, and
it is valid JSON
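One more check worth trying, since JSONLint validates the file as a whole: many Hive JSON SerDes expect one complete JSON object per line, so a pretty-printed file can pass JSONLint and still come back as NULL in Hive. A quick per-line sketch (path is illustrative):

```python
import json

def check_one_record_per_line(path):
    """Return line numbers that are not standalone JSON objects,
    the one-record-per-line layout most Hive JSON SerDes require."""
    bad = []
    with open(path) as f:
        for lineno, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # blank lines are ignored
            try:
                json.loads(line)
            except ValueError:
                bad.append(lineno)
    return bad
```

An empty result means every record sits on its own line; any reported line numbers point at records split across lines by pretty printing.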