I am trying to import a 3 GB JSON file that was exported with sstable2json. I let the import run for over an hour and saw zero I/O activity. The last thing it logs is the following:
DEBUG 23:19:32,638 collecting 0 of 2147483647: Avro/Schema:false:2042@1298067089267
DEBUG 23:19:32,638 collecting 1 of 2147483647: reddit:false:2502@1298067089267

Considering I saw zero reads on my disk while it ran, I don't think it is even reading the JSON file. I shrank the file down to a handful of keys, and it worked fine.

Is there an issue with json2sstable loading large JSON files? Does it try to read the whole file into memory?

Also, as a note, this data is unsorted. I did generate it via sstable2json, but my sstables were broken and contained unsorted data, which is the whole reason I am doing this.

Thanks!
Jason