Yep, going to smaller files (and possibly feeding
them through multiple clients) is the way to go
here.

You could also use a SolrJ client, which would be
more efficient. Here's a place to start. Admittedly
it doesn't parse JSON, but it should give you an idea
of how you could go about it if you wanted.

https://lucidworks.com/blog/indexing-with-solrj/
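
For a rough idea of the chunking approach without SolrJ, here is a minimal
Python sketch (not SolrJ, and not taken from the blog post above). It assumes
the big file is a single top-level JSON array whose elements are JSON objects
(one per Solr document), reads it incrementally, and groups the docs into
batches of 1,000. The names iter_json_array, batched and post_batch are made
up for illustration, and the update URL in the trailing comment is a
placeholder you would replace with your own core's address.

```python
import json
import urllib.request
from itertools import islice


def iter_json_array(stream, read_size=1 << 20):
    """Yield elements of one top-level JSON array without loading it whole.

    Assumption: every element is a JSON object (a Solr document).
    """
    decoder = json.JSONDecoder()
    buf = stream.read(read_size)
    pos = buf.find("[")
    if pos < 0 or buf[:pos].strip():
        raise ValueError("expected a top-level JSON array")
    pos += 1
    while True:
        # Skip whitespace and the commas between elements.
        while pos < len(buf) and buf[pos] in " \t\r\n,":
            pos += 1
        if pos < len(buf) and buf[pos] == "]":
            return
        try:
            doc, end = decoder.raw_decode(buf, pos)
        except json.JSONDecodeError:
            # Element is incomplete in the buffer: read more and retry.
            more = stream.read(read_size)
            if not more:
                raise ValueError("truncated or malformed JSON array")
            buf = buf[pos:] + more
            pos = 0
            continue
        yield doc
        pos = end


def batched(docs, size=1000):
    """Group an iterable of docs into lists of at most `size` items."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def post_batch(batch, update_url):
    """POST one batch as a JSON array of docs to a Solr update handler."""
    body = json.dumps(batch).encode("utf-8")
    req = urllib.request.Request(
        update_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


# Hypothetical usage (file name from the thread, URL is a placeholder):
#
#   with open("20150401.json", encoding="utf-8") as f:
#       for batch in batched(iter_json_array(f)):
#           post_batch(batch, "http://localhost:8983/solr/mycore/update")
```

Each batch could just as well be written out to its own smaller file and
fed to bin/post. Commits are left out here on purpose; how often to commit
depends on your setup.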

Best,
Erick

On Wed, Sep 23, 2015 at 6:25 AM, Siddhartha Singh Sandhu <
[email protected]> wrote:

> Hi Erick,
>
> Thank you for your reply. I wrote into a file. All my love to cat * > file.
>
> I structured my JSON in a format that I would like to upload into Solr,
> defined a schema.xml and solrconfig.xml, and took it from there. Initially I
> uploaded a 1G file with post, then I got a bit overzealous, I guess.
>
> Regards,
> Sid.
>
> On Tue, Sep 22, 2015 at 5:45 PM, Erick Erickson <[email protected]>
> wrote:
>
> > Let's back up quite a ways here. Where did the 20G file come from?
> > Indexing files in JSON requires that they follow a very specific format;
> > Solr doesn't index arbitrary JSON files.
> >
> > With that out of the way, yes, 20G is unlikely to work without tweaking
> > some parameters in solrconfig.xml (there's an upload file limit
> > there), and your communications layer is unlikely to accept uploads of
> > that size either. I'd really recommend breaking it up into smaller chunks
> > and/or using SolrJ to parse the file with a SAX-like parser and index docs
> > in smaller chunks (I often use 1,000 docs at a time, but it depends on
> > how big they are).
> >
> > Best,
> > Erick
> >
> > On Tue, Sep 22, 2015 at 12:43 PM, Siddhartha Singh Sandhu
> > <[email protected]> wrote:
> > > Hi,
> > >
> > > I am relatively new to Solr and have a usage query. I have a 20 GB JSON
> > > file which I want to upload into my Solr. Do I have to form smaller
> > > chunks, or is there a way to upload the whole thing in one go?
> > >
> > > I am getting the following error with bin/post:
> > >
> > > Entering auto mode. File endings considered are
> > > xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
> > > POSTing file 20150401.json (application/json) to [base]
> > >
> > > stderr: Exception in thread "main" java.lang.IllegalArgumentException: invalid content length
> > >         at java.net.HttpURLConnection.setFixedLengthStreamingMode(HttpURLConnection.java:149)
> > >         at org.apache.solr.util.SimplePostTool.postData(SimplePostTool.java:887)
> > >         at org.apache.solr.util.SimplePostTool.postFile(SimplePostTool.java:794)
> > >         at org.apache.solr.util.SimplePostTool.postFiles(SimplePostTool.java:515)
> > >         at org.apache.solr.util.SimplePostTool.postFiles(SimplePostTool.java:435)
> > >         at org.apache.solr.util.SimplePostTool.doFilesMode(SimplePostTool.java:311)
> > >         at org.apache.solr.util.SimplePostTool.execute(SimplePostTool.java:177)
> > >         at org.apache.solr.util.SimplePostTool.main(SimplePostTool.java:166)
> > >
> > > Thank you for your help!
> > >
> > > Regards,
> > >
> > > Sid.
> >
>
