Re: How to do fastest loading and indexing

Vince McMahon Sat, 11 Nov 2023 16:36:29 -0800

Benedict,

Thanks for your replies.


I am trying to load the data into a Solr core and index it there, not into
Postgres.

Would you happen to know the fastest way to load and index a Solr core?

Thanks.


On Sat, Nov 11, 2023, 4:41 PM Benedict Holland <benedict.m.holl...@gmail.com>
wrote:

> Oh, also, this matters because COPY FROM is a batch job that streams
> your data into the table. It's extremely fast. Your indexes are not the
> problem. They are extremely efficient. The problem is likely how you are
> loading the data.
>
> So actually, how are you loading the data?
>
> Thanks,
> Ben
>
> On Sat, Nov 11, 2023, 3:55 PM Vince McMahon
> <sippingonesandze...@gmail.com> wrote:
>
> > I'm not querying with catch_all at the moment, but other developers may.
> >
> > I am new to this. Would you mind sharing why it matters, especially how
> > it makes loading and indexing fast?
> >
> >
> >
> > On Sat, Nov 11, 2023, 3:05 PM Benedict Holland
> > <benedict.m.holl...@gmail.com> wrote:
> >
> > > Are you using COPY FROM?
> > >
> > > On Sat, Nov 11, 2023, 2:33 PM Vince McMahon
> > > <sippingonesandze...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > I have a CSV file with 200 fields and 100 million rows of historical
> > > > and latest data.
> > > >
> > > > The current processing is taking 20+ hours.
> > > >
> > > > The schema looks like this:
> > > > <field name="column1" type="string" indexed="true" stored="true"/>
> > > > ...
> > > > <field name="column200" type="string" indexed="true" stored="true"/>
> > > > <copyField source="column1" dest="_text_"/>
> > > > <copyField source="column1" dest="_fuzzy_"/>
> > > > ...
> > > > <copyField source="column50" dest="_text_"/>
> > > > <copyField source="column50" dest="_fuzzy_"/>
> > > >
> > > > In terms of hardware, I have 3 identical servers. One of them is
> > > > used to load this CSV to create a core.
> > > >
> > > > What is the fastest way to load and index this large and wide CSV
> > > > file? It is currently taking too long, 20+ hours.
> > > >
> > >
> >
>
