Re: [VOTE] Release Apache Spark 1.4.0 (RC3)

Peter Rudenko Mon, 01 Jun 2015 14:19:41 -0700

Thanks Yin, tried on a clean VM - works now. But the tests in my app still fail:

|[info] Cause: javax.jdo.JDOFatalDataStoreException: Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=metastore_db;create=true, username = APP. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
[info] java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$anon$1@380628de, see the next exception for details.
[info]   at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
[info]   at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)
[info]   at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
[info]   at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown Source)
[info]   at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
[info]   at org.apache.derby.impl.jdbc.EmbedConnection40.<init>(Unknown Source)
[info]   at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source)
[info]   at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
[info]   at org.apache.derby.jdbc.Driver20.connect(Unknown Source)
[info]   at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
[info]   at java.sql.DriverManager.getConnection(DriverManager.java:571)
[info]   at java.sql.DriverManager.getConnection(DriverManager.java:187)
[info]   at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:361)
[info]   at com.jolbox.bonecp.BoneCP.<init>(BoneCP.java:416)
[info]   at com.jolbox.bonecp.BoneCPDataSource.getConnection(BoneCPDataSource.java:120)|


I’ve set

parallelExecution in Test := false,

Thanks,
Peter Rudenko

On 2015-06-01 21:10, Yin Huai wrote:

Hi Peter,

Based on your error message, it seems you were not using RC3. For the error thrown at HiveContext's line 206, we changed the message to this one <https://github.com/apache/spark/blob/v1.4.0-rc3/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala#L205-207> just before RC3. Basically, we no longer print out the class loader name. Can you check whether an older version of the 1.4 branch got used? Have you published an RC3 to your local maven repo? Can you clean your local repo cache and try again?


Thanks,

Yin

On Mon, Jun 1, 2015 at 10:45 AM, Peter Rudenko<[email protected] <mailto:[email protected]>> wrote:


    Still have problem using HiveContext from sbt. Here’s an example
    of dependencies:

    |val sparkVersion = "1.4.0-rc3"

    lazy val root = Project(
      id = "spark-hive",
      base = file("."),
      settings = Project.defaultSettings ++ Seq(
        name := "spark-1.4-hive",
        scalaVersion := "2.10.5",
        scalaBinaryVersion := "2.10",
        resolvers += "Spark RC" at "https://repository.apache.org/content/repositories/orgapachespark-1110/",
        libraryDependencies ++= Seq(
          "org.apache.spark" %% "spark-core" % sparkVersion,
          "org.apache.spark" %% "spark-mllib" % sparkVersion,
          "org.apache.spark" %% "spark-hive" % sparkVersion,
          "org.apache.spark" %% "spark-sql" % sparkVersion
        )
      )
    )|

    Launching sbt console with it and running:

    |val conf = new SparkConf().setMaster("local[4]").setAppName("test")
    val sc = new SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
    val data = sc.parallelize(1 to 10000)
    import sqlContext.implicits._

    scala> data.toDF
    java.lang.IllegalArgumentException: Unable to locate hive jars to connect to metastore using classloader scala.tools.nsc.interpreter.IMain$TranslatingClassLoader. Please set spark.sql.hive.metastore.jars
        at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:206)
        at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:175)
        at org.apache.spark.sql.hive.HiveContext$anon$2.<init>(HiveContext.scala:367)
        at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:367)
        at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:366)
        at org.apache.spark.sql.hive.HiveContext$anon$1.<init>(HiveContext.scala:379)
        at org.apache.spark.sql.hive.HiveContext.analyzer$lzycompute(HiveContext.scala:379)
        at org.apache.spark.sql.hive.HiveContext.analyzer(HiveContext.scala:378)
        at org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:901)
        at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:134)
        at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
        at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:474)
        at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:456)
        at org.apache.spark.sql.SQLContext$implicits$.intRddToDataFrameHolder(SQLContext.scala:345)
    |

    Thanks,
    Peter Rudenko

    On 2015-06-01 05:04, Guoqiang Li wrote:

    +1 (non-binding)


    ------------------ Original ------------------
    *From: * "Sandy Ryza";<[email protected]>
    <mailto:[email protected]>;
    *Date: * Mon, Jun 1, 2015 07:34 AM
    *To: * "Krishna Sankar"<[email protected]>
    <mailto:[email protected]>;
    *Cc: * "Patrick Wendell"<[email protected]>
    <mailto:[email protected]>; "[email protected]"
    <mailto:[email protected]><[email protected]>
    <mailto:[email protected]>;
    *Subject: * Re: [VOTE] Release Apache Spark 1.4.0 (RC3)

    +1 (non-binding)

    Launched against a pseudo-distributed YARN cluster running Hadoop
    2.6.0 and ran some jobs.

    -Sandy

    On Sat, May 30, 2015 at 3:44 PM, Krishna Sankar
    <[email protected] <mailto:[email protected]>> wrote:

        +1 (non-binding, of course)

        1. Compiled OSX 10.10 (Yosemite) OK Total time: 17:07 min
             mvn clean package -Pyarn -Dyarn.version=2.6.0
        -Phadoop-2.4 -Dhadoop.version=2.6.0 -DskipTests
        2. Tested pyspark, MLlib - running as well as comparing results
        with 1.3.1
        2.1. statistics (min,max,mean,Pearson,Spearman) OK
        2.2. Linear/Ridge/Lasso Regression OK
        2.3. Decision Tree, Naive Bayes OK
        2.4. KMeans OK
               Center And Scale OK
        2.5. RDD operations OK
              State of the Union Texts - MapReduce, Filter,sortByKey
        (word count)
        2.6. Recommendation (Movielens medium dataset ~1 M ratings) OK
               Model evaluation/optimization (rank, numIter, lambda)
        with itertools OK
        3. Scala - MLlib
        3.1. statistics (min,max,mean,Pearson,Spearman) OK
        3.2. LinearRegressionWithSGD OK
        3.3. Decision Tree OK
        3.4. KMeans OK
        3.5. Recommendation (Movielens medium dataset ~1 M ratings) OK
        3.6. saveAsParquetFile OK
        3.7. Read and verify the 4.3 save(above) -
        sqlContext.parquetFile, registerTempTable, sql OK
        3.8. result = sqlContext.sql("SELECT
        OrderDetails.OrderID,ShipCountry,UnitPrice,Qty,Discount FROM
        Orders INNER JOIN OrderDetails ON Orders.OrderID =
        OrderDetails.OrderID") OK
        4.0. Spark SQL from Python OK
        4.1. result = sqlContext.sql("SELECT * from people WHERE
        State = 'WA'") OK

        Cheers
        <k/>

        On Fri, May 29, 2015 at 4:40 PM, Patrick Wendell
        <[email protected] <mailto:[email protected]>> wrote:

            Please vote on releasing the following candidate as
            Apache Spark version 1.4.0!

            The tag to be voted on is v1.4.0-rc3 (commit dd109a8):
            https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=dd109a8746ec07c7c83995890fc2c0cd7a693730

            The release files, including signatures, digests, etc.
            can be found at:
            http://people.apache.org/~pwendell/spark-releases/spark-1.4.0-rc3-bin/

            Release artifacts are signed with the following key:
            https://people.apache.org/keys/committer/pwendell.asc

            The staging repository for this release can be found at:
            [published as version: 1.4.0]
            https://repository.apache.org/content/repositories/orgapachespark-1109/
            [published as version: 1.4.0-rc3]
            https://repository.apache.org/content/repositories/orgapachespark-1110/

            The documentation corresponding to this release can be
            found at:
            http://people.apache.org/~pwendell/spark-releases/spark-1.4.0-rc3-docs/

            Please vote on releasing this package as Apache Spark 1.4.0!

            The vote is open until Tuesday, June 02, at 00:32 UTC and passes
            if a majority of at least 3 +1 PMC votes are cast.

            [ ] +1 Release this package as Apache Spark 1.4.0
            [ ] -1 Do not release this package because ...

            To learn more about Apache Spark, please see
            http://spark.apache.org/

            == What has changed since RC1 ==
            Below is a list of bug fixes that went into this RC:
            http://s.apache.org/vN

            == How can I help test this release? ==
            If you are a Spark user, you can help us test this release by
            taking a Spark 1.3 workload and running on this release candidate,
            then reporting any regressions.

            == What justifies a -1 vote for this release? ==
            This vote is happening towards the end of the 1.4 QA period,
            so -1 votes should only occur for significant regressions from 1.3.1.
            Bugs already present in 1.3.X, minor regressions, or bugs related
            to new features will not block this release.

            
---------------------------------------------------------------------
            To unsubscribe, e-mail: [email protected]
            <mailto:[email protected]>
            For additional commands, e-mail:
            [email protected] <mailto:[email protected]>
