Hi Josh,

At this stage I don't know whether there's anything wrong with Hive or whether it's just user error. Perhaps if I go through what I have done you can see where the error lies. Unfortunately this is going to be wordy; apologies in advance for the long email.
So I created a "normal" table in HDFS with a variety of column types like this:

CREATE TABLE employees4 (
  rowid STRING,
  flag BOOLEAN,
  number INT,
  bignum BIGINT,
  name STRING,
  salary FLOAT,
  bigsalary DOUBLE,
  numbers ARRAY<INT>,
  floats ARRAY<DOUBLE>,
  subordinates ARRAY<STRING>,
  deductions MAP<STRING, FLOAT>,
  namedNumbers MAP<STRING, INT>,
  address STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>);

I put some data into it and I can see the data:

hive> SELECT * FROM employees4;
OK
row1 true 100 7 John Doe 100000.0 100000.0 [13,23,-1,1001] [3.14159,2.71828,-1.1,1001.0] ["Mary Smith","Todd Jones"] {"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1} {"nameOne":123,"Name Two":49,"The Third Man":-1} {"street":"1 Michigan Ave.","city":"Chicago","state":"IL","zip":60600}
row2 false 7 100 Mary Smith 100000.0 80000.0 [13,23,-1,1001] [3.14159,2.71828,-1.1,1001.0,1001.0] ["Bill King"] {"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1} {"nameOne":123,"Name Two":49,"The Third Man":-1} {"street":"100 Ontario St.","city":"Chicago","state":"IL","zip":60601}
row3 false 3245 877878 Todd Jones 100000.0 70000.0 [13,23,-1,1001] [3.14159,2.71828,-1.1,1001.0,2.0] [] {"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1} {"nameOne":123,"Name Two":49,"The Third Man":-1} {"street":"200 Chicago Ave.","city":"Oak Park","state":"IL","zip":60700}
row4 true 877878 3245 Bill King 100000.0 60000.0 [13,23,-1,1001] [3.14159,2.71828,-1.1,1001.0,1001.0,1001.0,1001.0] [] {"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1} {"nameOne":123,"Name Two":49,"The Third Man":-1} {"street":"300 Obscure Dr.","city":"Obscuria","state":"IL","zip":60100}
Time taken: 0.535 seconds, Fetched: 4 row(s)

Everything looks fine.
Now I create a Hive table stored in Accumulo:

DROP TABLE IF EXISTS accumulo_table4;
CREATE TABLE accumulo_table4 (
  rowid STRING,
  flag BOOLEAN,
  number INT,
  bignum BIGINT,
  name STRING,
  salary FLOAT,
  bigsalary DOUBLE,
  numbers ARRAY<INT>,
  floats ARRAY<DOUBLE>,
  subordinates ARRAY<STRING>,
  deductions MAP<STRING, FLOAT>,
  namednumbers MAP<STRING, INT>,
  address STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>)
STORED BY 'org.apache.hadoop.hive.accumulo.AccumuloStorageHandler'
WITH SERDEPROPERTIES('accumulo.columns.mapping' =
  ':rowid,person:flag#binary,person:number#binary,person:bignum#binary,person:name,person:salary#binary,person:bigsalary#binary,person:numbers#binary,person:floats,person:subordinates,deductions:*,namednumbers:*,person:address');

(Note that I am only really interested in storing the values in binary.)

Now I can load the Accumulo table from the normal table:

INSERT OVERWRITE TABLE accumulo_table4 SELECT * FROM employees4;

And I can query the data from the Accumulo table.
hive> SELECT * FROM accumulo_table4;
OK
row1 true 100 7 John Doe 100000.0 100000.0 [null] [null] ["Mary Smith\u0003Todd Jones"] {"Federal Taxes":0.2,"Insurance":0.1,"State Taxes":0.05} {"Name Two":49,"The Third Man":-1,"nameOne":123} {"street":"1 Michigan Ave.\u0003Chicago\u0003IL\u000360600","city":null,"state":null,"zip":null}
row2 false 7 100 Mary Smith 100000.0 80000.0 [null] [null] ["Bill King"] {"Federal Taxes":0.2,"Insurance":0.1,"State Taxes":0.05} {"Name Two":49,"The Third Man":-1,"nameOne":123} {"street":"100 Ontario St.\u0003Chicago\u0003IL\u000360601","city":null,"state":null,"zip":null}
row3 false 3245 877878 Todd Jones 100000.0 70000.0 [null] [null] [] {"Federal Taxes":0.15,"Insurance":0.1,"State Taxes":0.03} {"Name Two":49,"The Third Man":-1,"nameOne":123} {"street":"200 Chicago Ave.\u0003Oak Park\u0003IL\u000360700","city":null,"state":null,"zip":null}
row4 true 877878 3245 Bill King 100000.0 60000.0 [null] [null] [] {"Federal Taxes":0.15,"Insurance":0.1,"State Taxes":0.03} {"Name Two":49,"The Third Man":-1,"nameOne":123} {"street":"300 Obscure Dr.\u0003Obscuria\u0003IL\u000360100","city":null,"state":null,"zip":null}
Time taken: 0.109 seconds, Fetched: 4 row(s)

Notice that the columns with type ARRAY<INT> and ARRAY<DOUBLE> are empty. I assume this means that something is wrong and the Hive storage handler is returning a null?
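Incidentally, the STRUCT output looks to me like the deserializer simply isn't splitting the stored value on the 0x03 separator: the whole string lands in the street field and the remaining fields come back null. A quick sketch in plain Python, just to illustrate the split I would have expected (not the handler's actual code):

```python
# Raw value stored in Accumulo for person:address (from the accumulo shell scan)
raw = "1 Michigan Ave.\x03Chicago\x03IL\x0360600"

# If the value were split on the 0x03 separator, the four struct fields
# (street, city, state, zip) would each come back cleanly:
parts = raw.split("\x03")
print(parts)  # ['1 Michigan Ave.', 'Chicago', 'IL', '60600']
```

Instead, what comes back is the unsplit string as "street" plus three nulls.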
When I use the accumulo shell to look at the data stored in Accumulo:

root@accumulo> scan -t accumulo_table4
row1 deductions:Federal Taxes [] 0.2
row1 deductions:Insurance [] 0.1
row1 deductions:State Taxes [] 0.05
row1 namednumbers:Name Two [] 49
row1 namednumbers:The Third Man [] -1
row1 namednumbers:nameOne [] 123
row1 person:address [] 1 Michigan Ave.\x03Chicago\x03IL\x0360600
row1 person:bignum [] \x00\x00\x00\x00\x00\x00\x00\x07
row1 person:bigsalary [] @\xF8j\x00\x00\x00\x00\x00
row1 person:flag [] \x01
row1 person:floats [] 3.14159\x032.71828\x03-1.1\x031001.0
row1 person:name [] John Doe
row1 person:number [] \x00\x00\x00d
row1 person:numbers [] \x00\x00\x00\x0D\x03\x00\x00\x00\x17\x03\xFF\xFF\xFF\xFF\x03\x00\x00\x03\xE9
row1 person:salary [] G\xC3P\x00
row1 person:subordinates [] Mary Smith\x03Todd Jones

This shows that the columns of type INT and FLOAT have been converted to binary, which is great. However, the column with type ARRAY<INT> has had the individual values converted but still has the field separator (0x03) present. I thought that this might just be a conversion problem, so I hacked the Accumulo table to have the "correct" value:

row1 person:numbers [] \x00\x00\x00\x0D\x00\x00\x00\x17\xFF\xFF\xFF\xFF\x00\x00\x03\xE9

However, when I run the query the numbers field is still "[null]". I'm happy to arrange to store whatever is needed in Accumulo to make it work; I just need to know what that is.

The second issue concerns the MAP<STRING,INT> column, in this case called namednumbers. As you can see, so far it works fine and I am very happy :) However, as I stated before, I really want everything stored in binary.
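To double-check my reading of the stored bytes, I worked through the encodings in plain Python (struct with big-endian formats; the byte strings are the ones from the scan above):

```python
import struct

# Scalar columns match plain big-endian encodings exactly:
assert struct.pack(">i", 100) == b"\x00\x00\x00d"                     # person:number
assert struct.pack(">q", 7) == b"\x00\x00\x00\x00\x00\x00\x00\x07"    # person:bignum
assert struct.pack(">f", 100000.0) == b"G\xc3P\x00"                   # person:salary
assert struct.pack(">d", 100000.0) == b"@\xf8j\x00\x00\x00\x00\x00"   # person:bigsalary

# The stored ARRAY<INT> value is binary-encoded elements with the 0x03
# separator still between them:
stored = b"\x03".join(struct.pack(">i", n) for n in [13, 23, -1, 1001])
assert stored == b"\x00\x00\x00\x0d\x03\x00\x00\x00\x17\x03\xff\xff\xff\xff\x03\x00\x00\x03\xe9"

# My hand-"corrected" value is just the four ints concatenated:
packed = struct.pack(">4i", 13, 23, -1, 1001)
assert packed == b"\x00\x00\x00\x0d\x00\x00\x00\x17\xff\xff\xff\xff\x00\x00\x03\xe9"

# Note that a separator-delimited layout is ambiguous for binary ints:
# 1001 encodes to bytes that themselves contain 0x03.
assert b"\x03" in struct.pack(">i", 1001)
```

That last point is what worries me: if the intended format really is separator-delimited binary, an element like 1001 (whose encoding contains a 0x03 byte) would collide with the separator, which may or may not be related to the "[null]" results.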
However, when I change the table definition to have a #binary on the map column I get an error:

hive> CREATE TABLE accumulo_table4 (
    > rowid STRING,
    > flag BOOLEAN,
    > number INT,
    > bignum BIGINT,
    > name STRING,
    > salary FLOAT,
    > bigsalary DOUBLE,
    > numbers ARRAY<INT>,
    > floats ARRAY<DOUBLE>,
    > subordinates ARRAY<STRING>,
    > deductions MAP<STRING, FLOAT>,
    > namednumbers MAP<STRING, INT>,
    > address STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>)
    > STORED BY 'org.apache.hadoop.hive.accumulo.AccumuloStorageHandler'
    > WITH SERDEPROPERTIES('accumulo.columns.mapping' = ':rowid,person:flag#binary,person:number#binary,person:bignum#binary,person:name,person:salary#binary,person:bigsalary#binary,person:numbers#binary,person:floats,person:subordinates,deductions:*,namednumbers:*#binary,person:address');
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.IllegalArgumentException: Expected map encoding for a map specification, namednumbers:* with encoding binary

I thought that maybe this was because the syntax "column_family:*#binary" is too much, so I tried using a default:

DROP TABLE IF EXISTS accumulo_table4;
CREATE TABLE accumulo_table4 (
  rowid STRING,
  flag BOOLEAN,
  number INT,
  bignum BIGINT,
  name STRING,
  salary FLOAT,
  bigsalary DOUBLE,
  numbers ARRAY<INT>,
  floats ARRAY<DOUBLE>,
  subordinates ARRAY<STRING>,
  deductions MAP<STRING, FLOAT>,
  namednumbers MAP<STRING, INT>,
  address STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>)
STORED BY 'org.apache.hadoop.hive.accumulo.AccumuloStorageHandler'
WITH SERDEPROPERTIES('accumulo.columns.mapping' =
  ':rowid,person:flag,person:number,person:bignum,person:name,person:salary,person:bigsalary,person:numbers,person:floats,person:subordinates,deductions:*,namednumbers:*,person:address',
  "accumulo.default.storage" = "binary");

This table creation works; however, when I try to insert the data I get a long error message, which follows.
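(As an aside on the map encoding: here is what I would expect a binary-encoded MAP<STRING,INT> to look like in Accumulo, i.e. map keys as column qualifiers and values as 4-byte big-endian ints. This is my assumption about the intended layout, not something I've confirmed in the handler; the sketch is plain Python.)

```python
import struct

namednumbers = {"nameOne": 123, "Name Two": 49, "The Third Man": -1}

# One Accumulo entry per map key: qualifier = key, value = big-endian int
# (assumed layout -- the same entries the string version produces under
# namednumbers:*, just with packed values)
cells = {k: struct.pack(">i", v) for k, v in namednumbers.items()}

assert cells["nameOne"] == b"\x00\x00\x00{"       # 123 == 0x7B
assert cells["Name Two"] == b"\x00\x00\x001"      # 49 == 0x31
assert cells["The Third Man"] == b"\xff\xff\xff\xff"
```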
However, before that I just want to say that I'm happy to look at the source if I have to; I would appreciate a pointer to the file name/path for the Hive storage handler code. Many thanks in advance for any help.

Z

PS. I never thought of using a column family with sequence numbers in the qualifiers for an array. I will try that and get back to you.

Here's the conversion error:

hive> INSERT OVERWRITE TABLE accumulo_table4 SELECT * FROM employees4;
Query ID = hive_20150910125252_f6fb143e-13df-4e81-98d0-fe8391025dc7
Total jobs = 1
Launching Job 1 out of 1
Tez session was closed. Reopening...
Session re-established.
Status: Running (Executing on YARN cluster with App id application_1441875240043_0005)
--------------------------------------------------------------------------------
VERTICES  STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1     FAILED      1          0        0        1       4       0
--------------------------------------------------------------------------------
VERTICES: 00/01 [>>--------------------------] 0% ELAPSED TIME: 24.54 s
--------------------------------------------------------------------------------
Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1441875240043_0005_1_00, diagnostics=[Task failed, taskId=task_1441875240043_0005_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"rowid":"row1","flag":true,"number":100,"bignum":7,"name":"John Doe","salary":100000.0,"bigsalary":100000.0,"numbers":[13,23,-1,1001],"floats":[3.14159,2.71828,-1.1,1001.0],"subordinates":["Mary Smith","Todd Jones"],"deductions":{"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1},"namednumbers":{"nameOne":123,"Name Two":49,"The Third Man":-1},"address":{"street":"1 Michigan Ave.","city":"Chicago","state":"IL","zip":60600}}
 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
 at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
 at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
 at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
 at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
 at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"rowid":"row1","flag":true,"number":100,"bignum":7,"name":"John Doe","salary":100000.0,"bigsalary":100000.0,"numbers":[13,23,-1,1001],"floats":[3.14159,2.71828,-1.1,1001.0],"subordinates":["Mary Smith","Todd Jones"],"deductions":{"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1},"namednumbers":{"nameOne":123,"Name Two":49,"The Third Man":-1},"address":{"street":"1 Michigan Ave.","city":"Chicago","state":"IL","zip":60600}}
 at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
 at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
 at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:294)
 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163)
 ... 13 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"rowid":"row1","flag":true,"number":100,"bignum":7,"name":"John Doe","salary":100000.0,"bigsalary":100000.0,"numbers":[13,23,-1,1001],"floats":[3.14159,2.71828,-1.1,1001.0],"subordinates":["Mary Smith","Todd Jones"],"deductions":{"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1},"namednumbers":{"nameOne":123,"Name Two":49,"The Third Man":-1},"address":{"street":"1 Michigan Ave.","city":"Chicago","state":"IL","zip":60600}}
 at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:503)
 at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
 ... 16 more
Caused by: java.lang.RuntimeException: Hive internal error.
 at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitive(LazyUtils.java:327)
 at org.apache.hadoop.hive.accumulo.serde.AccumuloRowSerializer.writeBinary(AccumuloRowSerializer.java:368)
 at org.apache.hadoop.hive.accumulo.serde.AccumuloRowSerializer.writeWithLevel(AccumuloRowSerializer.java:270)
 at org.apache.hadoop.hive.accumulo.serde.AccumuloRowSerializer.writeWithLevel(AccumuloRowSerializer.java:288)
 at org.apache.hadoop.hive.accumulo.serde.AccumuloRowSerializer.getSerializedValue(AccumuloRowSerializer.java:249)
 at org.apache.hadoop.hive.accumulo.serde.AccumuloRowSerializer.serializeColumnMapping(AccumuloRowSerializer.java:148)
 at org.apache.hadoop.hive.accumulo.serde.AccumuloRowSerializer.serialize(AccumuloRowSerializer.java:130)
 at org.apache.hadoop.hive.accumulo.serde.AccumuloSerDe.serialize(AccumuloSerDe.java:119)
 at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:660)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
 at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
 at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
 at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
 at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:493)
 ... 17 more
], TaskAttempts 1-3 failed with identical stack traces [elided for brevity]
]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1441875240043_0005_1_00 [Map 1] killed/failed due to:null]
DAG failed due to vertex failure.
failedVertices:1 killedVertices:0
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask

hive> DROP TABLE IF EXISTS accumulo_table4;
OK
Time taken: 1.101 seconds

-----Original Message-----
From: Josh Elser [mailto:josh.el...@gmail.com]
Sent: 08 September 2015 22:15
To: user@hive.apache.org
Subject: Re: Accumulo Storage Manager

For the Array support: it might have just been a missed test case and is just a bug. I don't recall off the top of my head how Arrays are intended to be serialized (whether it's some numeric counter in the Accumulo CQ or just serializing all the elements in the array into the Accumulo Value). If it isn't working for you, feel free to open up a JIRA issue with the details and mention me so I notice it :). I can try to help figure out what's busted and, if necessary, a fix.

For the Map support, what are you trying to do differently? Going from memory, I believe the support is for a fixed column family and an optional column qualifier prefix. This limits the entries in a Map to that column family, and allows you to place multiple maps into a given family for locality purposes (identifying the maps by qualifier-prefix, and getting Key uniqueness from the qualifier-suffix). There isn't much flexibility in this regard for alternate serialization approaches -- the considerations at the time were for a general-purpose schema that you don't really have to think about (you just think SQL).

- Josh

Please consider the environment before printing this email. This message should be regarded as confidential. If you have received this email in error please notify the sender and destroy it immediately. Statements of intent shall only become binding when confirmed in hard copy by an authorised signatory. The contents of this email may relate to dealings with other companies under the control of BAE Systems Applied Intelligence Limited, details of which can be found at http://www.baesystems.com/Businesses/index.htm.