Reminder: this Jenkins migration is happening tomorrow morning (Monday).
On Fri, Oct 10, 2014 at 1:01 PM, shane knapp wrote:
> reminder: this IS happening, first thing monday morning PDT. :)
>
> On Wed, Oct 8, 2014 at 3:01 PM, shane knapp wrote:
>
> > greetings!
> >
> > i've got some updates…
The fixed-length binary type can use fewer bytes than an int64 for small
precisions, though many encodings of int64 can probably do the right thing. We
can look into supporting multiple ways to do this -- the spec does say that
readers should at least be able to read int32s and int64s.
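For what it's worth, here is a minimal sketch of how a writer could pick the
physical type from the precision, following the spec's bounds (the names are
illustrative, not from my branch):

    // Sketch only: precision <= 9 fits an int32, precision <= 18 fits an
    // int64, anything larger goes to a fixed-length byte array.
    sealed trait PhysicalType
    case object ParquetInt32 extends PhysicalType
    case object ParquetInt64 extends PhysicalType
    case class ParquetFixedLenByteArray(numBytes: Int) extends PhysicalType

    def physicalTypeFor(precision: Int): PhysicalType = {
      require(precision >= 1, "precision must be positive")
      if (precision <= 9) ParquetInt32
      else if (precision <= 18) ParquetInt64
      else {
        // Smallest n such that a signed n-byte two's-complement value
        // can hold 10^precision - 1.
        val numBytes =
          math.ceil((precision * math.log(10) / math.log(2) + 1) / 8).toInt
        ParquetFixedLenByteArray(numBytes)
      }
    }

    // e.g. physicalTypeFor(20) == ParquetFixedLenByteArray(9)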
Matei
On Oct 12, 2014, at 8:20…
I'm also against these huge reformattings. They slow down development and
backporting for trivial reasons. Let's not do that at this point; the style of
the current code is quite consistent, and we have plenty of other things to
worry about. Instead, what you can do is fix up the style of a file as you
edit it for other reasons…
Another big problem with these patches is that they make it almost
impossible to backport changes to older branches cleanly (there is
essentially a 100% chance of a merge conflict).
One proposal is to do this:
1. We only consider new style rules at the end of a release cycle,
when there is the smallest…
I actually think we should just bite the bullet and follow through with the
reformatting. Many rules are simply not possible to enforce only on deltas
(e.g. import ordering).
That said, maybe there are better windows to do this, e.g. during the QA
period.
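To illustrate the delta problem, a rough sketch of checking only changed files
(it assumes some scalastyle-like CLI on the PATH; the names are hypothetical,
not our actual tooling). Even restricted to changed files, a whole-file rule
like import ordering still has to read each file in full:

    import scala.sys.process._

    object LintChangedFiles {
      def main(args: Array[String]): Unit = {
        // Files changed relative to master; whole-file rules such as
        // import ordering need each file's full contents regardless.
        val changed = "git diff --name-only master...HEAD".!!
          .split("\n").filter(_.endsWith(".scala")).toSeq
        if (changed.nonEmpty) sys.exit(("scalastyle" +: changed).!)
      }
    }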
On Sun, Oct 12, 2014 at 9:37 PM, Josh Rosen wrote:
There are a number of open pull requests that aim to extend Spark’s automated
style checks (see https://issues.apache.org/jira/browse/SPARK-3849 for an
umbrella JIRA). These fixes are mostly good, but I have some concerns about
merging these patches. Several of these patches make large reformatting
changes…
Hi Matei,
Thanks, I can see you've been hard at work on this! I examined your patch and
do have a question. It appears you're limiting the precision of decimals
written to parquet to those that will fit in a long, yet you're writing the
values as a parquet binary type. Why not write them using the int64 type?…
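For concreteness, the bound involved: Long.MaxValue is 9223372036854775807,
which has 19 digits, so every 18-digit unscaled value fits in a long while
some 19-digit values do not. A quick check (illustrative, not code from the
patch):

    val largest18Digit = BigInt(10).pow(18) - 1
    assert(largest18Digit <= BigInt(Long.MaxValue))          // always fits
    assert(BigInt(10).pow(19) - 1 > BigInt(Long.MaxValue))   // can overflow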
I was under the impression that we were using the usual
sort-by-average-response-value heuristic when storing histogram bins (and
searching for optimal splits) in the tree code.
Maybe Manish or Joseph can clarify?
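For reference, a small sketch of that heuristic (illustrative names, not the
actual MLlib code): order the categories by the mean label of their samples;
for regression and binary classification, the best binary split is then among
the contiguous prefixes of that ordering, so only numCategories - 1 candidate
splits need to be tested:

    // samples: (categoryId, label) pairs for one categorical feature.
    def categoryOrder(samples: Seq[(Int, Double)]): Seq[Int] =
      samples
        .groupBy { case (cat, _) => cat }
        .map { case (cat, rows) => (cat, rows.map(_._2).sum / rows.size) }
        .toSeq
        .sortBy { case (_, meanLabel) => meanLabel }
        .map { case (cat, _) => cat }

    // For ordering Seq(3, 1, 2), the candidate binary splits are
    // {3} vs rest and {3, 1} vs rest.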
> On Oct 12, 2014, at 2:50 PM, Sean Owen wrote:
>
> I'm having trouble getting…
Hi Michael,
I've been working on this in my repo:
https://github.com/mateiz/spark/tree/decimal. I'll make some pull requests with
these features soon, but meanwhile you can try this branch. See
https://github.com/mateiz/spark/compare/decimal for the individual commits that
went into it. It has…
Hello,
I'm interested in reading/writing parquet SchemaRDDs that support the Parquet
Decimal converted type. The first thing I did was update the Spark parquet
dependency to version 1.5.0, as this version introduced support for decimals in
parquet. However, conversion between the catalyst decimal type…
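In case it's useful, a minimal sketch of the conversion involved, assuming
parquet-mr 1.5.0's parquet.io.api.Binary (package names predate the
org.apache.parquet rename; this is not the actual Spark code):

    import java.math.{BigDecimal, BigInteger}
    import parquet.io.api.Binary

    // A DECIMAL is carried as its unscaled two's-complement bytes; the
    // scale and precision live in the schema, not in each value.
    def toBinary(d: BigDecimal): Binary =
      Binary.fromByteArray(d.unscaledValue.toByteArray)

    def fromBinary(b: Binary, scale: Int): BigDecimal =
      new BigDecimal(new BigInteger(b.getBytes), scale)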
I'm having trouble getting decision forests to work with categorical
features. I have a dataset with a categorical feature with 40 values.
It seems to be treated as a continuous/numeric value by the
implementation.
Digging deeper, I see there is some logic in the code that indicates
that categorical…
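For anyone hitting the same thing: the arity has to be declared explicitly via
the categoricalFeaturesInfo map, and maxBins has to cover the largest arity. A
sketch with DecisionTree (the forest trainers take the same map), assuming the
40-value feature is at index 0:

    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.mllib.tree.DecisionTree
    import org.apache.spark.mllib.tree.model.DecisionTreeModel
    import org.apache.spark.rdd.RDD

    // Declare feature 0 as categorical with 40 values (0.0 to 39.0);
    // without this entry the feature is treated as continuous.
    def train(data: RDD[LabeledPoint]): DecisionTreeModel =
      DecisionTree.trainClassifier(
        data,
        2,              // numClasses
        Map(0 -> 40),   // categoricalFeaturesInfo: feature index -> arity
        "gini",         // impurity
        5,              // maxDepth
        40)             // maxBins: must be >= the largest categorical arity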