Hey Xiangrui, Do you want to write up a straw man proposal based on this line of discussion?
- Patrick On Mon, Mar 16, 2015 at 12:12 PM, Kevin Markey <kevin.mar...@oracle.com> wrote: > In some applications, I have rather heavy use of Java enums which are needed > for related Java APIs that the application uses. And unfortunately, they > are also used as keys. As such, using the native hashcodes makes any > function over keys unstable and unpredictable, so we now use Enum.name() as > the key instead. Oh well. But it works and seems to work well. > > Kevin > > > On 03/05/2015 09:49 PM, Mridul Muralidharan wrote: >> >> I have a strong dislike for java enum's due to the fact that they >> are not stable across JVM's - if it undergoes serde, you end up with >> unpredictable results at times [1]. >> One of the reasons why we prevent enum's from being key : though it is >> highly possible users might depend on it internally and shoot >> themselves in the foot. >> >> Would be better to keep away from them in general and use something more >> stable. >> >> Regards, >> Mridul >> >> [1] Having had to debug this issue for 2 weeks - I really really hate it. >> >> >> On Thu, Mar 5, 2015 at 1:08 PM, Imran Rashid <iras...@cloudera.com> wrote: >>> >>> I have a very strong dislike for #1 (scala enumerations). I'm ok with >>> #4 >>> (with Xiangrui's final suggestion, especially making it sealed & >>> available >>> in Java), but I really think #2, java enums, are the best option. >>> >>> Java enums actually have some very real advantages over the other >>> approaches -- you get values(), valueOf(), EnumSet, and EnumMap. There >>> has >>> been endless debate in the Scala community about the problems with the >>> approaches in Scala. Very smart, level-headed Scala gurus have >>> complained >>> about their short-comings (Rex Kerr's name is coming to mind, though I'm >>> not positive about that); there have been numerous well-thought out >>> proposals to give Scala a better enum. But the powers-that-be in Scala >>> always reject them. IIRC the explanation for rejecting is basically that >>> (a) enums aren't important enough for introducing some new special >>> feature, >>> scala's got bigger things to work on and (b) if you really need a good >>> enum, just use java's enum. >>> >>> I doubt it really matters that much for Spark internals, which is why I >>> think #4 is fine. But I figured I'd give my spiel, because every >>> developer >>> loves language wars :) >>> >>> Imran >>> >>> >>> >>> On Thu, Mar 5, 2015 at 1:35 AM, Xiangrui Meng <men...@gmail.com> wrote: >>> >>>> `case object` inside an `object` doesn't show up in Java. This is the >>>> minimal code I found to make everything show up correctly in both >>>> Scala and Java: >>>> >>>> sealed abstract class StorageLevel // cannot be a trait >>>> >>>> object StorageLevel { >>>> private[this] case object _MemoryOnly extends StorageLevel >>>> final val MemoryOnly: StorageLevel = _MemoryOnly >>>> >>>> private[this] case object _DiskOnly extends StorageLevel >>>> final val DiskOnly: StorageLevel = _DiskOnly >>>> } >>>> >>>> On Wed, Mar 4, 2015 at 8:10 PM, Patrick Wendell <pwend...@gmail.com> >>>> wrote: >>>>> >>>>> I like #4 as well and agree with Aaron's suggestion. >>>>> >>>>> - Patrick >>>>> >>>>> On Wed, Mar 4, 2015 at 6:07 PM, Aaron Davidson <ilike...@gmail.com> >>>> >>>> wrote: >>>>>> >>>>>> I'm cool with #4 as well, but make sure we dictate that the values >>>> >>>> should >>>>>> >>>>>> be defined within an object with the same name as the enumeration >>>>>> (like >>>> >>>> we >>>>>> >>>>>> do for StorageLevel). Otherwise we may pollute a higher namespace. >>>>>> >>>>>> e.g. we SHOULD do: >>>>>> >>>>>> trait StorageLevel >>>>>> object StorageLevel { >>>>>> case object MemoryOnly extends StorageLevel >>>>>> case object DiskOnly extends StorageLevel >>>>>> } >>>>>> >>>>>> On Wed, Mar 4, 2015 at 5:37 PM, Michael Armbrust < >>>> >>>> mich...@databricks.com> >>>>>> >>>>>> wrote: >>>>>> >>>>>>> #4 with a preference for CamelCaseEnums >>>>>>> >>>>>>> On Wed, Mar 4, 2015 at 5:29 PM, Joseph Bradley >>>>>>> <jos...@databricks.com> >>>>>>> wrote: >>>>>>> >>>>>>>> another vote for #4 >>>>>>>> People are already used to adding "()" in Java. >>>>>>>> >>>>>>>> >>>>>>>> On Wed, Mar 4, 2015 at 5:14 PM, Stephen Boesch <java...@gmail.com> >>>>>>> >>>>>>> wrote: >>>>>>>>> >>>>>>>>> #4 but with MemoryOnly (more scala-like) >>>>>>>>> >>>>>>>>> http://docs.scala-lang.org/style/naming-conventions.html >>>>>>>>> >>>>>>>>> Constants, Values, Variable and Methods >>>>>>>>> >>>>>>>>> Constant names should be in upper camel case. That is, if the >>>> >>>> member is >>>>>>>>> >>>>>>>>> final, immutable and it belongs to a package object or an object, >>>> >>>> it >>>>>>> >>>>>>> may >>>>>>>> >>>>>>>> be >>>>>>>>> >>>>>>>>> considered a constant (similar to Java'sstatic final members): >>>>>>>>> >>>>>>>>> >>>>>>>>> 1. object Container { >>>>>>>>> 2. val MyConstant = ... >>>>>>>>> 3. } >>>>>>>>> >>>>>>>>> >>>>>>>>> 2015-03-04 17:11 GMT-08:00 Xiangrui Meng <men...@gmail.com>: >>>>>>>>> >>>>>>>>>> Hi all, >>>>>>>>>> >>>>>>>>>> There are many places where we use enum-like types in Spark, but >>>> >>>> in >>>>>>>>>> >>>>>>>>>> different ways. Every approach has both pros and cons. I wonder >>>>>>>>>> whether there should be an "official" approach for enum-like >>>> >>>> types in >>>>>>>>>> >>>>>>>>>> Spark. >>>>>>>>>> >>>>>>>>>> 1. Scala's Enumeration (e.g., SchedulingMode, WorkerState, etc) >>>>>>>>>> >>>>>>>>>> * All types show up as Enumeration.Value in Java. >>>>>>>>>> >>>>>>>>>> >>>> >>>> http://spark.apache.org/docs/latest/api/java/org/apache/spark/scheduler/SchedulingMode.html >>>>>>>>>> >>>>>>>>>> 2. Java's Enum (e.g., SaveMode, IOMode) >>>>>>>>>> >>>>>>>>>> * Implementation must be in a Java file. >>>>>>>>>> * Values doesn't show up in the ScalaDoc: >>>>>>>>>> >>>>>>>>>> >>>> >>>> http://spark.apache.org/docs/latest/api/scala/#org.apache.spark.network.util.IOMode >>>>>>>>>> >>>>>>>>>> 3. Static fields in Java (e.g., TripletFields) >>>>>>>>>> >>>>>>>>>> * Implementation must be in a Java file. >>>>>>>>>> * Doesn't need "()" in Java code. >>>>>>>>>> * Values don't show up in the ScalaDoc: >>>>>>>>>> >>>>>>>>>> >>>> >>>> http://spark.apache.org/docs/latest/api/scala/#org.apache.spark.graphx.TripletFields >>>>>>>>>> >>>>>>>>>> 4. Objects in Scala. (e.g., StorageLevel) >>>>>>>>>> >>>>>>>>>> * Needs "()" in Java code. >>>>>>>>>> * Values show up in both ScalaDoc and JavaDoc: >>>>>>>>>> >>>>>>>>>> >>>> >>>> http://spark.apache.org/docs/latest/api/scala/#org.apache.spark.storage.StorageLevel$ >>>>>>>>>> >>>>>>>>>> >>>> >>>> http://spark.apache.org/docs/latest/api/java/org/apache/spark/storage/StorageLevel.html >>>>>>>>>> >>>>>>>>>> It would be great if we have an "official" approach for this as >>>> >>>> well >>>>>>>>>> >>>>>>>>>> as the naming convention for enum-like values ("MEMORY_ONLY" or >>>>>>>>>> "MemoryOnly"). Personally, I like 4) with "MEMORY_ONLY". Any >>>>>>> >>>>>>> thoughts? >>>>>>>>>> >>>>>>>>>> Best, >>>>>>>>>> Xiangrui >>>>>>>>>> >>>>>>>>>> >>>> --------------------------------------------------------------------- >>>>>>>>>> >>>>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org >>>>>>>>>> For additional commands, e-mail: dev-h...@spark.apache.org >>>>>>>>>> >>>>>>>>>> >>>> --------------------------------------------------------------------- >>>> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org >>>> For additional commands, e-mail: dev-h...@spark.apache.org >>>> >>>> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org >> For additional commands, e-mail: dev-h...@spark.apache.org >> > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org > For additional commands, e-mail: dev-h...@spark.apache.org > --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org