+1

On Fri, Dec 25, 2015 at 8:31 PM, vaquar khan <vaquar.k...@gmail.com> wrote:
> +1
>
> On 24 Dec 2015 22:01, "Vinay Shukla" <vinayshu...@gmail.com> wrote:
>>
>> +1
>> Tested on HDP 2.3, YARN cluster mode, spark-shell
>>
>> On Wed, Dec 23, 2015 at 6:14 AM, Allen Zhang <allenzhang...@126.com> wrote:
>>>
>>> +1 (non-binding)
>>>
>>> I have just built a new binary tarball and manually tested am.nodelabelexpression and executor.nodelabelexpression; the results are as expected.
>>>
>>> At 2015-12-23 21:44:08, "Iulian Dragoș" <iulian.dra...@typesafe.com> wrote:
>>>
>>> +1 (non-binding)
>>>
>>> Tested Mesos deployments (client and cluster mode, fine-grained and coarse-grained). Things look good <https://ci.typesafe.com/view/Spark/job/mit-docker-test-ref/8/console>.
>>>
>>> iulian
>>>
>>> On Wed, Dec 23, 2015 at 2:35 PM, Sean Owen <so...@cloudera.com> wrote:
>>>>
>>>> Docker integration tests still fail for Mark and me, and should probably be disabled: https://issues.apache.org/jira/browse/SPARK-12426
>>>>
>>>> ... but if anyone else successfully runs these (and I assume Jenkins does), then it's not a blocker.
>>>>
>>>> I'm having intermittent trouble with other tests passing, but nothing unusual.
>>>> Sigs and hashes are OK.
>>>>
>>>> We have 30 issues fixed for 1.6.1. All but those resolved in the last 24 hours or so should be fixed for 1.6.0, right? I can touch that up.
>>>>
>>>> On Tue, Dec 22, 2015 at 8:10 PM, Michael Armbrust <mich...@databricks.com> wrote:
>>>> > Please vote on releasing the following candidate as Apache Spark version 1.6.0!
>>>> >
>>>> > The vote is open until Friday, December 25, 2015 at 18:00 UTC and passes if a majority of at least 3 +1 PMC votes are cast.
>>>> >
>>>> > [ ] +1 Release this package as Apache Spark 1.6.0
>>>> > [ ] -1 Do not release this package because ...
>>>> >
>>>> > To learn more about Apache Spark, please see http://spark.apache.org/
>>>> >
>>>> > The tag to be voted on is v1.6.0-rc4 (4062cda3087ae42c6c3cb24508fc1d3a931accdf)
>>>> >
>>>> > The release files, including signatures, digests, etc. can be found at:
>>>> > http://people.apache.org/~pwendell/spark-releases/spark-1.6.0-rc4-bin/
>>>> >
>>>> > Release artifacts are signed with the following key:
>>>> > https://people.apache.org/keys/committer/pwendell.asc
>>>> >
>>>> > The staging repository for this release can be found at:
>>>> > https://repository.apache.org/content/repositories/orgapachespark-1176/
>>>> >
>>>> > The test repository (versioned as v1.6.0-rc4) for this release can be found at:
>>>> > https://repository.apache.org/content/repositories/orgapachespark-1175/
>>>> >
>>>> > The documentation corresponding to this release can be found at:
>>>> > http://people.apache.org/~pwendell/spark-releases/spark-1.6.0-rc4-docs/
>>>> >
>>>> > =======================================
>>>> > == How can I help test this release? ==
>>>> > =======================================
>>>> > If you are a Spark user, you can help us test this release by taking an existing Spark workload, running it on this release candidate, and reporting any regressions.
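For anyone who wants to compile an existing sbt-based workload against the staged artifacts, here is a minimal build.sbt sketch (not an official recipe). The repository URL is taken from the vote email above; the "1.6.0" version string and the %% Scala-suffix handling are assumptions, so adjust them to match your build and whatever versions the staging repository actually publishes:

  // Minimal sketch: resolve Spark from the RC4 staging repository and build
  // an existing workload against it. Version string "1.6.0" is an assumption.
  resolvers += "spark-1.6.0-rc4-staging" at
    "https://repository.apache.org/content/repositories/orgapachespark-1176/"

  libraryDependencies ++= Seq(
    "org.apache.spark" %% "spark-core" % "1.6.0" % "provided",
    "org.apache.spark" %% "spark-sql"  % "1.6.0" % "provided"
  )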
>>>> >
>>>> > ================================================
>>>> > == What justifies a -1 vote for this release? ==
>>>> > ================================================
>>>> > This vote is happening towards the end of the 1.6 QA period, so -1 votes should only occur for significant regressions from 1.5. Bugs already present in 1.5, minor regressions, or bugs related to new features will not block this release.
>>>> >
>>>> > ===============================================================
>>>> > == What should happen to JIRA tickets still targeting 1.6.0? ==
>>>> > ===============================================================
>>>> > 1. It is OK for documentation patches to target 1.6.0 and still go into branch-1.6, since documentation will be published separately from the release.
>>>> > 2. New features for non-alpha modules should target 1.7+.
>>>> > 3. Non-blocker bug fixes should target 1.6.1 or 1.7.0, or drop the target version.
>>>> >
>>>> > ==================================================
>>>> > == Major changes to help you focus your testing ==
>>>> > ==================================================
>>>> >
>>>> > Notable changes since 1.6 RC3
>>>> >
>>>> > - SPARK-12404 - Fix serialization error for Datasets with Timestamps/Arrays/Decimal
>>>> > - SPARK-12218 - Fix incorrect pushdown of filters to parquet
>>>> > - SPARK-12395 - Fix join columns of outer join for DataFrame using
>>>> > - SPARK-12413 - Fix Mesos HA
>>>> >
>>>> > Notable changes since 1.6 RC2
>>>> >
>>>> > - SPARK_VERSION has been set correctly
>>>> > - SPARK-12199 ML Docs are publishing correctly
>>>> > - SPARK-12345 Mesos cluster mode has been fixed
>>>> >
>>>> > Notable changes since 1.6 RC1
>>>> >
>>>> > Spark Streaming
>>>> >
>>>> > SPARK-2629 trackStateByKey has been renamed to mapWithState
>>>> >
>>>> > Spark SQL
>>>> >
>>>> > SPARK-12165 SPARK-12189 Fix bugs in eviction of storage memory by execution.
>>>> > SPARK-12258 Correct passing null into ScalaUDF
>>>> >
>>>> > Notable Features Since 1.5
>>>> >
>>>> > Spark SQL
>>>> >
>>>> > SPARK-11787 Parquet Performance - Improve Parquet scan performance when using flat schemas.
>>>> > SPARK-10810 Session Management - Isolated default database (i.e. USE mydb) even on shared clusters.
>>>> > SPARK-9999 Dataset API - A type-safe API (similar to RDDs) that performs many operations on serialized binary data and uses code generation (i.e. Project Tungsten); a short sketch follows this list.
>>>> > SPARK-10000 Unified Memory Management - Shared memory for execution and caching instead of exclusive division of the regions.
>>>> > SPARK-11197 SQL Queries on Files - Concise syntax for running SQL queries over files of any supported format without registering a table.
>>>> > SPARK-11745 Reading non-standard JSON files - Added options to read non-standard JSON files (e.g. single quotes, unquoted attributes)
>>>> > SPARK-10412 Per-operator Metrics for SQL Execution - Display statistics on a per-operator basis for memory usage and spilled data size.
>>>> > SPARK-11329 Star (*) expansion for StructTypes - Makes it easier to nest and unnest arbitrary numbers of columns
>>>> > SPARK-10917, SPARK-11149 In-memory Columnar Cache Performance - Significant (up to 14x) speed up when caching data that contains complex types in DataFrames or SQL.
>>>> > SPARK-11111 Fast null-safe joins - Joins using null-safe equality (<=>) will now execute using SortMergeJoin instead of computing a cartesian product.
>>>> > SPARK-11389 SQL Execution Using Off-Heap Memory - Support for configuring query execution to occur using off-heap memory to avoid GC overhead
>>>> > SPARK-10978 Datasource API Avoid Double Filter - When implementing a datasource with filter pushdown, developers can now tell Spark SQL to avoid double-evaluating a pushed-down filter.
>>>> > SPARK-4849 Advanced Layout of Cached Data - Storing partitioning and ordering schemes in in-memory table scan, and adding distributeBy and localSort to the DataFrame API
>>>> > SPARK-9858 Adaptive query execution - Initial support for automatically selecting the number of reducers for joins and aggregations.
>>>> > SPARK-9241 Improved query planner for queries having distinct aggregations - Query plans of distinct aggregations are more robust when distinct columns have high cardinality.
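To make the Dataset API and SQL-on-files items above concrete, here is a rough spark-shell sketch. It assumes the shell's sqlContext and its implicits; the Person case class, the sample values, and the parquet path are made up for illustration:

  import sqlContext.implicits._

  // Typed Dataset API (SPARK-9999): toDS() builds a Dataset[Person] whose
  // operations run over Tungsten's binary encoding but stay type-safe.
  case class Person(name: String, age: Long)
  val people = Seq(Person("Andy", 30), Person("Justin", 19)).toDS()
  people.filter(_.age > 21).map(_.name).show()

  // SQL directly over files without registering a table (SPARK-11197);
  // the path is only a placeholder.
  sqlContext.sql("SELECT * FROM parquet.`/path/to/events.parquet`").show()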
>>>> >
>>>> > Spark Streaming
>>>> >
>>>> > API Updates
>>>> >
>>>> > SPARK-2629 New improved state management - mapWithState - a DStream transformation for stateful stream processing that supersedes updateStateByKey in functionality and performance (a short sketch follows this list).
>>>> > SPARK-11198 Kinesis record deaggregation - Kinesis streams have been upgraded to use KCL 1.4.0 and support transparent deaggregation of KPL-aggregated records.
>>>> > SPARK-10891 Kinesis message handler function - Allows an arbitrary function to be applied to a Kinesis record in the Kinesis receiver to customize what data is stored in memory.
>>>> > SPARK-6328 Python Streaming Listener API - Get streaming statistics (scheduling delays, batch processing times, etc.) in streaming.
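A minimal sketch of mapWithState on a word-count style stream, assuming a DStream[String] named words already exists in a running StreamingContext; the mapping function is only illustrative:

  import org.apache.spark.streaming._

  // Keep a running count per word: the State holds the count seen so far,
  // each batch's new value is folded in, and the updated pair is emitted.
  val mappingFunc = (word: String, one: Option[Int], state: State[Int]) => {
    val sum = one.getOrElse(0) + state.getOption.getOrElse(0)
    state.update(sum)
    (word, sum)
  }

  val counts = words.map(w => (w, 1)).mapWithState(StateSpec.function(mappingFunc))
  counts.print()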
>>>> >
>>>> > UI Improvements
>>>> >
>>>> > Made failures visible in the streaming tab, in the timelines, batch list, and batch details page.
>>>> > Made output operations visible in the streaming tab as progress bars.
>>>> >
>>>> > MLlib
>>>> >
>>>> > New algorithms/models
>>>> >
>>>> > SPARK-8518 Survival analysis - Log-linear model for survival analysis
>>>> > SPARK-9834 Normal equation for least squares - Normal equation solver, providing R-like model summary statistics
>>>> > SPARK-3147 Online hypothesis testing - A/B testing in the Spark Streaming framework
>>>> > SPARK-9930 New feature transformers - ChiSqSelector, QuantileDiscretizer, SQL transformer
>>>> > SPARK-6517 Bisecting K-Means clustering - Fast top-down clustering variant of K-Means
>>>> >
>>>> > API improvements
>>>> >
>>>> > ML Pipelines
>>>> >
>>>> > SPARK-6725 Pipeline persistence - Save/load for ML Pipelines, with partial coverage of spark.ml algorithms
>>>> > SPARK-5565 LDA in ML Pipelines - API for Latent Dirichlet Allocation in ML Pipelines
>>>> >
>>>> > R API
>>>> >
>>>> > SPARK-9836 R-like statistics for GLMs - (Partial) R-like stats for ordinary least squares via summary(model)
>>>> > SPARK-9681 Feature interactions in R formula - Interaction operator ":" in R formula
>>>> >
>>>> > Python API - Many improvements to Python API to approach feature parity
>>>> >
>>>> > Misc improvements
>>>> >
>>>> > SPARK-7685, SPARK-9642 Instance weights for GLMs - Logistic and Linear Regression can take instance weights
>>>> > SPARK-10384, SPARK-10385 Univariate and bivariate statistics in DataFrames - Variance, stddev, correlations, etc.
>>>> > SPARK-10117 LIBSVM data source - LIBSVM as a SQL data source
>>>> >
>>>> > Documentation improvements
>>>> >
>>>> > SPARK-7751 @since versions - Documentation includes the initial version when classes and methods were added
>>>> > SPARK-11337 Testable example code - Automated testing for code in user guide examples
>>>> >
>>>> > Deprecations
>>>> >
>>>> > In spark.mllib.clustering.KMeans, the "runs" parameter has been deprecated.
>>>> > In spark.ml.classification.LogisticRegressionModel and spark.ml.regression.LinearRegressionModel, the "weights" field has been deprecated in favor of the new name "coefficients." This helps disambiguate it from instance (row) weights given to algorithms.
>>>> >
>>>> > Changes of behavior
>>>> >
>>>> > spark.mllib.tree.GradientBoostedTrees validationTol has changed semantics in 1.6. Previously, it was a threshold for absolute change in error. Now, it resembles the behavior of GradientDescent convergenceTol: for large errors, it uses relative error (relative to the previous error); for small errors (< 0.01), it uses absolute error.
>>>> > spark.ml.feature.RegexTokenizer: Previously, it did not convert strings to lowercase before tokenizing. Now, it converts to lowercase by default, with an option not to. This matches the behavior of the simpler Tokenizer transformer.
>>>> > Spark SQL's partition discovery has been changed to only discover partition directories that are children of the given path. (i.e. if path="/my/data/x=1" then x=1 will no longer be considered a partition but only children of x=1.) This behavior can be overridden by manually specifying the basePath that partition discovery should start with (SPARK-11678); see the sketch after this list.
>>>> > When casting a value of an integral type to timestamp (e.g. casting a long value to timestamp), the value is treated as being in seconds instead of milliseconds (SPARK-11724).
>>>> > With the improved query planner for queries having distinct aggregations (SPARK-9241), the plan of a query having a single distinct aggregation has been changed to a more robust version. To switch back to the plan generated by Spark 1.5's planner, please set spark.sql.specializeSingleDistinctAggPlanning to true (SPARK-12077).
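To make the partition-discovery, timestamp-cast, and planner changes above concrete, a small sketch against a SQLContext; all paths and values are placeholders:

  // Partition discovery (SPARK-11678): when reading a single partition
  // directory, set basePath to the table root so x=1 is still treated as a
  // partition column rather than part of the data path.
  val df = sqlContext.read
    .option("basePath", "/my/data")
    .parquet("/my/data/x=1")

  // Integral-to-timestamp casts (SPARK-11724) now interpret the value as
  // seconds since the epoch rather than milliseconds.
  sqlContext.sql("SELECT CAST(1450000000 AS TIMESTAMP)").show()

  // Falling back to the 1.5 planner for single distinct aggregations (SPARK-12077):
  sqlContext.setConf("spark.sql.specializeSingleDistinctAggPlanning", "true")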