Oh, also, this matters because COPY FROM is a batch operation that streams your data into the table. It's extremely fast. Your indexes are not the problem; they are extremely efficient. The problem is likely how you are loading the data.
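For what it's worth, here is a minimal sketch of the kind of batch load I mean, assuming Postgres with the psycopg2 driver; the connection string, table name, and file path are placeholders:

import psycopg2  # assumes the psycopg2 driver is installed

# Placeholder connection settings, table name, and file path.
conn = psycopg2.connect("dbname=mydb user=me")
with conn, conn.cursor() as cur, open("data.csv", "r") as f:
    # COPY streams the whole file in a single batch, which is why it is
    # so much faster than issuing one INSERT per row.
    cur.copy_expert(
        "COPY my_table FROM STDIN WITH (FORMAT csv, HEADER true)", f
    )

The same thing works interactively from psql with \copy if you'd rather not script it.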
So actually, how are you loading the data?

Thanks,
Ben

On Sat, Nov 11, 2023, 3:55 PM Vince McMahon <sippingonesandze...@gmail.com>
wrote:

> I'm not querying with catch_all at the moment, but other developers may.
>
> I am new. Mind sharing how it matters, especially how it makes loading and
> indexing fast?
>
> On Sat, Nov 11, 2023, 3:05 PM Benedict Holland <benedict.m.holl...@gmail.com>
> wrote:
>
> > Are you using COPY FROM?
> >
> > On Sat, Nov 11, 2023, 2:33 PM Vince McMahon <sippingonesandze...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > I have a CSV file with 200 fields and 100 million rows of historical
> > > and latest data.
> > >
> > > The current processing is taking 20+ hours.
> > >
> > > The schema looks like:
> > >
> > > <field name="column1" type="string" indexed="true" stored="true"/>
> > > ...
> > > <field name="column200" type="string" indexed="true" stored="true"/>
> > > <copyField source="column1" dest="_text_"/>
> > > <copyField source="column1" dest="_fuzzy_"/>
> > > ...
> > > <copyField source="column50" dest="_text_"/>
> > > <copyField source="column50" dest="_fuzzy_"/>
> > >
> > > In terms of hardware, I have 3 identical servers. One of them is used
> > > to load this CSV to create a core.
> > >
> > > What is the fastest way to load and index this large and wide CSV
> > > file? It is taking too long, 20+ hours, now.
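For reference, Solr can also ingest a CSV file directly through its update handler in a single streaming request, rather than posting documents one at a time. A minimal sketch using the Python requests library; the Solr URL, core name, and file path are placeholders:

import requests  # assumes the requests library is installed

# Placeholder Solr URL, core name, and file path.
SOLR_URL = "http://localhost:8983/solr/mycore/update"

with open("data.csv", "rb") as f:
    # Stream the file in one request; Solr's CSV loader parses it
    # server-side instead of doing one HTTP round trip per document.
    resp = requests.post(
        SOLR_URL,
        params={"commit": "true"},
        data=f,
        headers={"Content-Type": "application/csv"},
    )
resp.raise_for_status()

Note that with 200 indexed fields plus the copyField targets, per-document analysis work is substantial regardless of how the file is transported, so transport alone may not close the whole 20-hour gap.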