There was a long thread about enums, initiated by Xiangrui several months
back, in which the final consensus was to use Java enums.  Is that
discussion (and decision) applicable here?
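
For reference, here is a minimal sketch of what the Java enum route offers,
assuming a hypothetical SolverType option set (the enum, its constants and
the EnumParamSketch helper are made up for illustration and are not part of
Spark). It shows the properties raised in this thread: the values can be
enumerated, parsed from a String, and serialized without extra work.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

/** Hypothetical helper, not a Spark class. */
final class EnumParamSketch {

    /** Serializes and deserializes a value to show that enums survive Java serialization. */
    static SolverType roundTrip(SolverType value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SolverType) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // values() lists every legal option, which answers "what can I pass here?"
        for (SolverType t : SolverType.values()) {
            System.out.println(t);
        }
        // valueOf() parses a String and throws IllegalArgumentException on unknown names.
        SolverType parsed = SolverType.valueOf("SGD");
        // Enum constants deserialize to the same singleton instance.
        System.out.println(roundTrip(parsed) == parsed);
    }
}

/** Hypothetical mutually exclusive options; the names are illustrative only. */
enum SolverType { AUTO, LBFGS, SGD }
```

Scala callers can use such an enum directly (SolverType.SGD), so the
Java-side syntax concern raised for the sealed trait approach would not
apply here.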

2015-09-16 17:43 GMT-07:00 Ulanov, Alexander <alexander.ula...@hpe.com>:

> Hi Joseph,
>
> Strings sound reasonable. However, there is no StringParam (only
> StringArrayParam). Should I create a new param type? Also, how can the user
> get all possible values of a String parameter?
>
> Best regards, Alexander
>
> *From:* Joseph Bradley [mailto:jos...@databricks.com]
> *Sent:* Wednesday, September 16, 2015 5:35 PM
> *To:* Feynman Liang
> *Cc:* Ulanov, Alexander; dev@spark.apache.org
> *Subject:* Re: Enum parameter in ML
>
> I've tended to use Strings.  Params can be created with a validator
> (isValid) which can ensure users get an immediate error if they try to pass
> an unsupported String.  Not as nice as compile-time errors, but easier on
> the APIs.
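
A rough sketch of the validator idea described above, written without any
Spark dependency. ValidatedStringParam, the "solver" name and its values are
all hypothetical; the set() check only plays the role that isValid plays for
a real Param. It also shows one way to let users discover the accepted
values.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical stand-in for a String-valued param; not a Spark class. */
final class ValidatedStringParam {
    private final String name;
    private final Set<String> supported;
    private String value;

    ValidatedStringParam(String name, String defaultValue, String... supportedValues) {
        this.name = name;
        this.supported = Collections.unmodifiableSet(
            new LinkedHashSet<>(Arrays.asList(supportedValues)));
        set(defaultValue);  // the default goes through the same validation
    }

    /** Plays the role of the isValid check: unsupported values fail immediately. */
    ValidatedStringParam set(String candidate) {
        if (!supported.contains(candidate)) {
            throw new IllegalArgumentException(
                name + " must be one of " + supported + " but got \"" + candidate + "\"");
        }
        this.value = candidate;
        return this;
    }

    String get() {
        return value;
    }

    /** Lets callers discover every accepted value at runtime. */
    Set<String> supportedValues() {
        return supported;
    }

    public static void main(String[] args) {
        ValidatedStringParam solver =
            new ValidatedStringParam("solver", "auto", "auto", "lbfgs", "sgd");
        solver.set("sgd");
        System.out.println(solver.get() + " chosen from " + solver.supportedValues());
        try {
            solver.set("newton");  // unsupported: rejected here, at set time
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In the ML API itself, as far as I know, the generic Param[String] combined
with ParamValidators.inArray covers the same ground, so a dedicated
StringParam class may not be needed; please check against the current
source before relying on that.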
>
> On Mon, Sep 14, 2015 at 6:07 PM, Feynman Liang <fli...@databricks.com>
> wrote:
>
> We usually write a Java test suite that exercises the public API (e.g. DCT
> <https://github.com/apache/spark/blob/master/mllib/src/test/java/org/apache/spark/ml/feature/JavaDCTSuite.java#L71>).
>
> It may be possible to create a sealed trait with singleton concrete
> instances inside of a serializable companion object, then just introduce a
> Param[SealedTrait] to the model (e.g. StreamingDecay PR
> <https://github.com/apache/spark/pull/8022/files#diff-cea0bec4853b1b2748ec006682218894R99>).
> However, this would require Java users to use
> CompanionObject$.ConcreteInstanceName to access enum values, which isn't the
> prettiest syntax.
>
> Another option would be to just use Strings, which, although not type
> safe, does simplify the implementation.
>
> On Mon, Sep 14, 2015 at 5:43 PM, Ulanov, Alexander <
> alexander.ula...@hpe.com> wrote:
>
> Hi Feynman,
>
> Thank you for the suggestion. How can I ensure that there will be no
> problems for Java users? (I only use the Scala API.)
>
> Best regards, Alexander
>
> *From:* Feynman Liang [mailto:fli...@databricks.com]
> *Sent:* Monday, September 14, 2015 5:27 PM
> *To:* Ulanov, Alexander
> *Cc:* dev@spark.apache.org
> *Subject:* Re: Enum parameter in ML
>
> Since PipelineStages are serializable, the params must also be
> serializable. We also have to keep the Java API in mind. Introducing a new
> enum Param type may work, but we will have to ensure that Java users can
> use it without dealing with ClassTags (I believe Scala will create new
> types for each possible value in the Enum) and that it can be serialized.
>
> On Mon, Sep 14, 2015 at 4:31 PM, Ulanov, Alexander <
> alexander.ula...@hpe.com> wrote:
>
> Dear Spark developers,
>
> I am currently implementing an Estimator in ML with a parameter that can
> take one of several mutually exclusive values. The most appropriate type
> seems to be Scala Enum (
> http://www.scala-lang.org/api/current/index.html#scala.Enumeration).
> However, the current ML API has the following parameter types:
>
> BooleanParam, DoubleArrayParam, DoubleParam, FloatParam, IntArrayParam,
> IntParam, LongParam, StringArrayParam
>
> Should I introduce a new parameter type in the ML API that is based on
> Scala Enum?
>
> Best regards, Alexander