Yep, going to smaller files (and possibly feeding them through multiple clients) is the way to go here.
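
For illustration only (this is not from the thread): a minimal Python sketch of the splitting step. It assumes the big file holds one JSON document per line, which may not match how your file was built, and the names in_batches, read_docs, split_file and the chunk-NNNNNN.json file names are all made up for the example. Each chunk is written as a JSON array of documents, and the default batch size of 1,000 is only a starting point.

```python
import itertools
import json
from pathlib import Path


def in_batches(docs, batch_size=1000):
    """Yield lists of at most batch_size items from any iterable."""
    it = iter(docs)
    while True:
        batch = list(itertools.islice(it, batch_size))
        if not batch:
            return
        yield batch


def read_docs(path):
    """Stream documents from disk without loading the whole file.

    Assumption: one JSON object per line; blank lines are skipped.
    """
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def split_file(src, out_dir, batch_size=1000):
    """Write each batch to its own JSON array file; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for i, batch in enumerate(in_batches(read_docs(src), batch_size)):
        chunk = out / f"chunk-{i:06d}.json"  # hypothetical naming scheme
        chunk.write_text(json.dumps(batch), encoding="utf-8")
        written.append(chunk)
    return written
```

The resulting chunk files could then be posted one at a time, or divided among several clients.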
You could also use a SolrJ client, which would be more efficient. Here's a
place to start. Admittedly it doesn't parse JSON, but it should give you an
idea of how you could go about it if you wanted:
https://lucidworks.com/blog/indexing-with-solrj/

Best,
Erick

On Wed, Sep 23, 2015 at 6:25 AM, Siddhartha Singh Sandhu <[email protected]> wrote:
> Hi Erick,
>
> Thank you for your reply. I wrote into a file. All my love to cat * > file.
>
> I structured my JSON in a format that I would like to upload into Solr,
> defined a schema.xml and solrconfig.xml, and took it from there. Initially I
> uploaded a 1G file with post; then I got a bit too overzealous, I guess.
>
> Regards,
> Sid.
>
> On Tue, Sep 22, 2015 at 5:45 PM, Erick Erickson <[email protected]> wrote:
>
> > Let's back up quite a ways here. Where did the 20G file come from?
> > Indexing files in JSON requires that they follow a very specific format;
> > Solr doesn't index arbitrary JSON files.
> >
> > With that out of the way, yes, 20G is unlikely to work without tweaking
> > some parameters in solrconfig.xml (there's an upload file limit there),
> > and your communications layer is unlikely to accept uploads of that
> > size. I'd really recommend breaking it up into smaller chunks and/or
> > using SolrJ to parse the file with a SAX-like parser and index docs
> > in smaller chunks (I often use 1,000 docs at a time, but it depends on
> > how big they are).
> >
> > Best,
> > Erick
> >
> > On Tue, Sep 22, 2015 at 12:43 PM, Siddhartha Singh Sandhu
> > <[email protected]> wrote:
> > > Hi,
> > >
> > > I am relatively new to Solr and have a usage query. I have a 20 GB JSON
> > > file which I want to upload into my Solr. Do I have to form smaller
> > > chunks, or is there a way to upload the whole thing in one go?
> > >
> > > I am getting the following error with bin/post:
> > >
> > > Entering auto mode. File endings considered are
> > > xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
> > > POSTing file 20150401.json (application/json) to [base]
> > >
> > > stderr: Exception in thread "main" java.lang.IllegalArgumentException:
> > > invalid content length
> > >     at java.net.HttpURLConnection.setFixedLengthStreamingMode(HttpURLConnection.java:149)
> > >     at org.apache.solr.util.SimplePostTool.postData(SimplePostTool.java:887)
> > >     at org.apache.solr.util.SimplePostTool.postFile(SimplePostTool.java:794)
> > >     at org.apache.solr.util.SimplePostTool.postFiles(SimplePostTool.java:515)
> > >     at org.apache.solr.util.SimplePostTool.postFiles(SimplePostTool.java:435)
> > >     at org.apache.solr.util.SimplePostTool.doFilesMode(SimplePostTool.java:311)
> > >     at org.apache.solr.util.SimplePostTool.execute(SimplePostTool.java:177)
> > >     at org.apache.solr.util.SimplePostTool.main(SimplePostTool.java:166)
> > >
> > > Thank you for your help!
> > >
> > > Regards,
> > >
> > > Sid.
