Hi,

I have a CSV file with 200 fields and 100 million rows of historical and
latest data.

The current processing is taking 20+ hours.

The schema looks like:
<field name="column1" type="string" indexed="true" stored="true"/>
...
<field name="column200" type="string" indexed="true" stored="true"/>
<copyField source="column1" dest="_text_"/>
<copyField source="column1" dest="_fuzzy_"/>
...
<copyField source="column50" dest="_text_"/>
<copyField source="column50" dest="_fuzzy_"/>

In terms of hardware, I have three identical servers. One of them is used
to load this CSV and build the core.
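
For reference, the load is essentially a single POST to Solr's stock CSV
update handler, something like this (the host, core name, and file path
below are placeholders):

# send the whole CSV to the core's update handler and commit at the end
curl 'http://localhost:8983/solr/mycore/update?commit=true' \
     -H 'Content-type: application/csv' \
     --data-binary @history.csv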

What is the fastest way to load and index a CSV file this large and wide?
