[
https://issues.apache.org/jira/browse/LUCENE-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16870041#comment-16870041
]
Tomoko Uchida edited comment on LUCENE-8873 at 6/22/19 4:17 AM:
----------------------------------------------------------------
I'm looking for a handy way to properly manage and document the properties
(for both developers and users).
For example, would something like the following pseudo code look good? (It
would not work as is, but I'd like to give a blueprint.)
{code:java}
/**
 * Factory for {@link NGramTokenizer}.
 *
 * @since 3.1
 * @lucene.spi {@value #NAME}
 */
public class NGramTokenizerFactory extends TokenizerFactory {

  /** SPI name */
  public static final String NAME = "nGram";

  /** Property {@value #PROP_MAX_GRAM_SIZE} - Maximum gram size */
  public static final String PROP_MAX_GRAM_SIZE = "maxGramSize";

  /** Property {@value #PROP_MIN_GRAM_SIZE} - Minimum gram size */
  public static final String PROP_MIN_GRAM_SIZE = "minGramSize";

  @lucene.analysis.property(name="maxGramSize", required=false,
      default=NGramTokenizer.DEFAULT_MAX_NGRAM_SIZE)
  private final int maxGramSize;

  @lucene.analysis.property(name="minGramSize", required=false,
      default=NGramTokenizer.DEFAULT_MIN_NGRAM_SIZE)
  private final int minGramSize;

  /** Creates a new NGramTokenizerFactory */
  public NGramTokenizerFactory(Map<String, String> args) {
    super(args);
    /* All properties are derived from annotations (in the superclass's
       constructor), so we don't have to set those manually */
    // minGramSize = getInt(args, "minGramSize", NGramTokenizer.DEFAULT_MIN_NGRAM_SIZE);
    // maxGramSize = getInt(args, "maxGramSize", NGramTokenizer.DEFAULT_MAX_NGRAM_SIZE);
    if (!args.isEmpty()) {
      throw new IllegalArgumentException("Unknown parameters: " + args);
    }
  }
}
{code}
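To make the "derived from annotations in the superclass's constructor" part a bit more concrete, here is a minimal, self-contained sketch of how it might work with plain reflection. Everything in it is an assumption for illustration only: the {{Property}} annotation, the {{AnnotatedFactory}} base class and {{NGramFactorySketch}} are hypothetical names, not existing Lucene APIs. The element is called {{defaultValue}} because {{default}} is a reserved word in Java, and the fields cannot be {{final}} if the superclass is to assign them reflectively. The defaults 1 and 2 are stand-ins for the {{NGramTokenizer}} constants so that the sketch compiles without Lucene.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

public class PropertySketch {

  /** Hypothetical property annotation; not part of Lucene. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.FIELD)
  public @interface Property {
    String name();
    boolean required() default false;
    String defaultValue() default "";
  }

  /** Hypothetical stand-in for TokenizerFactory. */
  public abstract static class AnnotatedFactory {
    protected AnnotatedFactory(Map<String, String> args) {
      // Only fields declared on the concrete class are inspected in this sketch.
      for (Field field : getClass().getDeclaredFields()) {
        Property prop = field.getAnnotation(Property.class);
        if (prop == null) {
          continue;
        }
        // Consume the argument so that leftovers can be reported as unknown.
        String raw = args.remove(prop.name());
        if (raw == null) {
          if (prop.required()) {
            throw new IllegalArgumentException("Missing required parameter: " + prop.name());
          }
          raw = prop.defaultValue();
        }
        try {
          field.setAccessible(true);
          field.set(this, convert(field.getType(), raw));
        } catch (IllegalAccessException e) {
          throw new IllegalStateException(e);
        }
      }
    }

    private static Object convert(Class<?> type, String raw) {
      if (type == int.class) return Integer.valueOf(raw);
      if (type == boolean.class) return Boolean.valueOf(raw);
      if (type == String.class) return raw;
      throw new IllegalArgumentException("Unsupported property type: " + type);
    }
  }

  /** Hypothetical factory; the fields are non-final and have no initializers. */
  public static class NGramFactorySketch extends AnnotatedFactory {
    @Property(name = "minGramSize", defaultValue = "1")
    private int minGramSize;

    @Property(name = "maxGramSize", defaultValue = "2")
    private int maxGramSize;

    public NGramFactorySketch(Map<String, String> args) {
      super(args);
      if (!args.isEmpty()) {
        throw new IllegalArgumentException("Unknown parameters: " + args);
      }
    }

    public int getMinGramSize() { return minGramSize; }
    public int getMaxGramSize() { return maxGramSize; }
  }

  public static void main(String[] unused) {
    Map<String, String> args = new HashMap<>();
    args.put("maxGramSize", "3");
    NGramFactorySketch factory = new NGramFactorySketch(args);
    // minGramSize falls back to its default; prints "1..3"
    System.out.println(factory.getMinGramSize() + ".." + factory.getMaxGramSize());
  }
}
```

One advantage of this shape is that the same annotations could also feed a documentation or validation step, since the property names, defaults and required flags are available at runtime. Whether reflection in the constructor is acceptable for the real factories is an open question.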
[~thetaphi]: if you have anything in your mind (about the interface design),
can you please share your thoughts?
> Improve analyzer factories' Javadoc.
> ------------------------------------
>
> Key: LUCENE-8873
> URL: https://issues.apache.org/jira/browse/LUCENE-8873
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/analysis
> Reporter: Tomoko Uchida
> Priority: Minor
>
> Currently, the documentation for analyzer factories (subclasses of
> {{TokenizerFactory}}, {{CharFilterFactory}}, {{TokenFilterFactory}}) still
> includes lots of Solr schema.xml examples, and not all properties are
> documented. From my perspective, the latter is more problematic because
> users who want to use the factories have to read the source code to find out
> which properties are defined.
> To improve the documentation, the XML examples should be removed as cleanup,
> and instead, *all properties which can be passed to factory constructors
> should be properly documented*.
> Documentation is often overlooked, so some validation rules and a
> standardization effort would be desirable (e.g. marking properties with
> annotations).
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)