I'd like to create a class implementing a classical transaction.
Looking over the Lucene API, I can see commit/rollback/prepareCommit apply
only to the entire index, not to partial modifications.
So I thought I could use the writer.addIndexes API as support:
when I open a transaction I could create a temporary index, add the
transaction's documents there, and on commit merge it into the main index
with addIndexes (discarding the temporary index on rollback).
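A minimal sketch of that idea, only as a thought experiment (the class and
method names are mine, not an existing Lucene API): buffer the transaction's
documents in a temporary index and merge them into the main index on commit.

    import java.io.IOException;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.RAMDirectory;

    class SimpleTransaction {
        private final IndexWriter mainWriter;                   // writer on the main index
        private final Directory tempDir = new RAMDirectory();   // per-transaction scratch index
        private final IndexWriter tempWriter;

        SimpleTransaction(IndexWriter mainWriter) throws IOException {
            this.mainWriter = mainWriter;
            this.tempWriter = new IndexWriter(tempDir,
                    new IndexWriterConfig(new StandardAnalyzer()));
        }

        void add(Document doc) throws IOException {
            tempWriter.addDocument(doc);                         // buffered in the scratch index
        }

        void commit() throws IOException {
            tempWriter.close();                                  // finalize the temporary segments
            mainWriter.addIndexes(tempDir);                      // merge them into the main index
            mainWriter.commit();                                 // make the whole batch visible
        }

        void rollback() throws IOException {
            tempWriter.rollback();                               // discard the scratch index entirely
        }
    }

Note this only covers additions; deletes and updates against the main index
would still not be isolated, which is the limitation discussed later in the
thread.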
In version 6.1.0 BigIntegerPoint seems to have been moved into the main
module (no longer in the sandbox).
However:
1) BigIntegerPoint seems to be a class for searching a 128-bit integer, not
for sorting. NumericDocValuesField supports long, not BigInteger, so for
sorting I used SortedDocValuesField.
2) BigIntegerPoint name
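For example, something along these lines should cover both needs, assuming
the sandbox class's fixed-width encoding helpers (BigIntegerPoint.BYTES and
BigIntegerPoint.encodeDimension) are available as in the 6.x sources; the
field names are just illustrative:

    import java.math.BigInteger;
    import org.apache.lucene.document.BigIntegerPoint;       // 128-bit point field
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.SortedDocValuesField;
    import org.apache.lucene.util.BytesRef;

    Document doc = new Document();
    BigInteger value = new BigInteger("123456789012345678901234567890");

    // Point field: supports exact and range queries on the 128-bit value.
    doc.add(new BigIntegerPoint("id", value));

    // Doc values field: supports sorting. The value is encoded into fixed-width,
    // sign-corrected bytes so the unsigned byte order used by SORTED doc values
    // matches numeric order.
    byte[] encoded = new byte[BigIntegerPoint.BYTES];
    BigIntegerPoint.encodeDimension(value, encoded, 0);
    doc.add(new SortedDocValuesField("id_sort", new BytesRef(encoded)));

Sorting can then use new SortField("id_sort", SortField.Type.STRING), which
reads the SortedDocValues ordinals.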
Hypothesis:
I want to split the universe set to be indexed into different
subtopics/entities/subsets.
For simplicity, consider that entity X is indexed in the Index_X folder.
Now consider a query that searches X1, X2, X3;
for simplicity, N is the number of sub-readers.
I could use the MultiReader Lucene API to search across only the selected
sub-indexes.
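Something like the following, assuming one sub-index folder per entity (the
folder names are just placeholders):

    import java.nio.file.Paths;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.MultiReader;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.store.FSDirectory;

    // Open only the sub-indexes that the query actually targets (X1, X2, X3).
    DirectoryReader x1 = DirectoryReader.open(FSDirectory.open(Paths.get("Index_X1")));
    DirectoryReader x2 = DirectoryReader.open(FSDirectory.open(Paths.get("Index_X2")));
    DirectoryReader x3 = DirectoryReader.open(FSDirectory.open(Paths.get("Index_X3")));

    // MultiReader exposes the N selected sub-readers as a single logical index.
    MultiReader multi = new MultiReader(x1, x2, x3);
    IndexSearcher searcher = new IndexSearcher(multi);
    // ... run the query against `searcher`; closing `multi` also closes the
    // sub-readers with this constructor.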
I don't think there would be significant disadvantages to using
MultiReader, but depending on what the data looks like, there might not be
benefits either. If data is homogeneous per entity but not across entities,
then the MultiReader approach might have potential for making things more
efficient, but
What you are suggesting sounds like something that can already be done with
IndexWriter.addDocuments? (note the final "s") This API ensures that all the
provided documents will become visible at the same time (and with adjacent
doc ids, moreover).
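For reference, a small sketch of that call; the field names and the
parent/child split are purely illustrative, and `writer` is assumed to be an
already-open IndexWriter:

    import java.util.Arrays;
    import java.util.List;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field.Store;
    import org.apache.lucene.document.StringField;

    Document parent = new Document();
    parent.add(new StringField("type", "order", Store.YES));
    Document child = new Document();
    child.add(new StringField("type", "order_line", Store.YES));

    // The whole block is indexed atomically: all documents get adjacent doc ids
    // and become visible together at the next commit/refresh, or not at all.
    List<Document> block = Arrays.asList(parent, child);
    writer.addDocuments(block);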
On Thu, Aug 18, 2016 at 10:52, Cristian Lorenzetto <
cri
docid is a signed int32, so it is not so big, but really a docid seems not to
be an unmodifiable primary key, rather a temporary id for the view related to
a specific search.
So a repository could contain more than 2^31 documents.
Is my deduction correct? Is there a maximum size for a Lucene index?
No, IndexWriter enforces that the number of documents cannot go over
IndexWriter.MAX_DOCS (which is a bit less than 2^31) and
BaseCompositeReader computes the number of documents in a long variable and
ensures it is less than 2^31, so you cannot have indexes that contain more
than 2^31 documents.
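Illustratively (the constant is public, and `reader` stands for any open
IndexReader):

    import org.apache.lucene.index.IndexWriter;

    // IndexWriter refuses to add a document once the doc count would exceed
    // this constant, which is slightly below Integer.MAX_VALUE.
    System.out.println("hard per-index limit: " + IndexWriter.MAX_DOCS);

    // Composite readers (MultiReader, DirectoryReader over many segments) sum
    // the sub-reader maxDocs in a long and reject totals of 2^31 or more.
    System.out.println("current doc count: " + reader.maxDoc());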
Or maybe it is time Lucene re-examined this limit.
There are use cases out there where >2^31 does make sense in a single index
(huge number of tiny docs).
Also, I think the underlying hardware and the JDK have advanced enough to
make this more defensible.
Constructively,
Glen
On Thu, Aug 18, 2016 at
The problem seems more complex; I'm trying to reason aloud :)
If I create a transaction using addDocuments... I suppose all the docs to
persist inside the transaction must already be in memory. That is not always
the case.
In addition, for complete isolation inside a transaction, there is an aspect
that does not work.
Maybe Lucene has a maximum size of 2^31 because result sets are Java arrays,
whose length is an int.
A suggestion for a possible future change is to return an Iterator instead
of a Java array. An Iterator is a more scalable ADT and does not consume
memory just to return documents.
2016-08-18 16:03 GMT+02:00 Glen Newton :
>
In my old code I created:

public class BinDocValuesField extends Field {

    /** Type for numeric DocValues. */
    public static final FieldType TYPE = new FieldType();

    static {
        TYPE.setTokenized(false);
        TYPE.setOmitNorms(true);
        TYPE.setIndexOptions(IndexOptions.DOCS);
        TYPE.setStored(true);
        TYPE.set
What are you trying to index that has more than 3 billion documents per
shard / index and can not be split as Adrien suggests?
On Thu, Aug 18, 2016, at 07:35 AM, Cristian Lorenzetto wrote:
> Maybe Lucene has a maximum size of 2^31 because result sets are Java
> arrays, whose length is an int.
> A suggest
Normally databases support at least a long primary key.
Ask the Twitter application, for example, which grows by more than 4
petabytes every year :) Maybe they use storage devices bigger than a PC's
storage :)
However, if you offer the possibility to use shards... that is a possibility
anyway :)
For
Using TYPE.setDocValuesType(DocValuesType.SORTED); it works.
I didn't understand the reason. Maybe it is necessary for fast grouping, or
maybe for sorting, so the algorithm can find distinct groups.
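That would match how the search side reads the field: sorting (and grouping)
on a byte[]-encoded field goes through SortedDocValues, so the FieldType has
to declare DocValuesType.SORTED. A small illustrative snippet, assuming
`searcher` is an open IndexSearcher and "id_sort" is a field indexed that way:

    import org.apache.lucene.search.MatchAllDocsQuery;
    import org.apache.lucene.search.Sort;
    import org.apache.lucene.search.SortField;
    import org.apache.lucene.search.TopDocs;

    // Type.STRING sorts by SortedDocValues ordinals; if the field was indexed
    // without SORTED doc values, this fails at search time instead of sorting.
    Sort sort = new Sort(new SortField("id_sort", SortField.Type.STRING));
    TopDocs top = searcher.search(new MatchAllDocsQuery(), 10, sort);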
2016-08-18 17:40 GMT+02:00 Cristian Lorenzetto <
cristian.lorenze...@gmail.com>:
> in my old code
>
> i create
On Thu, Aug 18, 2016 at 11:55 PM, Adrien Grand wrote:
> No, IndexWriter enforces that the number of documents cannot go over
> IndexWriter.MAX_DOCS (which is a bit less than 2^31) and
> BaseCompositeReader computes the number of documents in a long variable and
> ensures it is less than 2^31, so y
Dear all,
We suffered a power loss and found that the segments files are all 0 bytes
now; worse, we have no way to fix the index with the CheckIndex utility.
Does anyone know a way to fix or re-create the segments files?
By the way, I did a comparison with other index p
OK, I'm a little out of my league here, but I'll plow on anyway.
bq: There are use cases out there where >2^31 does make sense in a single index
OK, let's put some definition to this and define the use case specifically
rather than be vague. I've just run an experiment, for instance, where I had