Will make a fix to the site. Thanks all.
On Fri, Aug 24, 2018 at 9:41 AM Reynold Xin wrote:
> I wrote both the Spark one and later the Databricks one. The latter had a
> lot more work put into it and is consistent with the Spark style. I'd just
> use the second one and link to it, if possible.
Yes, that makes sense, but just to be clear, using the same seed does *not*
imply that the algorithm should produce “equivalent” results by some definition
of equivalent if you change the input data. For example, in SGD, the random
seed might be used to select the next minibatch of examples, but
I wrote both the Spark one and later the Databricks one. The latter had a
lot more work put into it and is consistent with the Spark style. I'd just
use the second one and link to it, if possible.
On Thu, Aug 23, 2018 at 6:38 PM Hyukjin Kwon wrote:
> If you meant "Code Style Guide", many of th
If you meant "Code Style Guide", many of them are missing and it refers to
https://docs.scala-lang.org/style/ not
https://github.com/databricks/scala-style-guide (please correct me if I
misunderstood).
For instance, I recently advised 2 indents for line continuation, but I found
it's actually not in the o
Seems OK to me. The style is pretty standard Scala style anyway. My
guidance is always to follow the code around the code you're changing.
On Thu, Aug 23, 2018 at 8:14 PM Hyukjin Kwon wrote:
> Hi all,
>
> I usually follow https://github.com/databricks/scala-style-guide for
> Apache Spark's style
There’s already a code style guide listed on
http://spark.apache.org/contributing.html. Maybe it’s the same? We should
decide which one we actually want and update this page if it’s wrong.
Matei
> On Aug 23, 2018, at 6:33 PM, Sean Owen wrote:
>
> Seems OK to me. The style is pretty standard S
Hi all,
I usually follow https://github.com/databricks/scala-style-guide for Apache
Spark's style, which is generally the same as Spark's code
base in practice.
Thing is, we don't explicitly mention this within Apache Spark as far as I
can tell.
Can we explicitly mention this or por
Behaviors at this level of detail, across different ML implementations, are
highly unlikely to ever align exactly. Statistically small changes in
logic, such as "<" versus "<=", or differences in random number generators,
etc. (to say nothing of different implementation languages), will accumulate
o
Dear Matei,
Thanks for the feedback!
I used the setSeed option for all randomized classifiers and always used
the same seeds for training with the hope that this deals with the
non-determinism. I did not run any significance tests, because I was
considering this from a functional perspective,