[ https://issues.apache.org/jira/browse/FLINK-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15533029#comment-15533029 ]
ASF GitHub Bot commented on FLINK-4613:
---------------------------------------

Github user thvasilo commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2542#discussion_r81159482

    --- Diff: flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/recommendation/ImplicitALSTest.scala ---
    @@ -0,0 +1,171 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.ml.recommendation
    +
    +import org.apache.flink.ml.util.FlinkTestBase
    +import org.scalatest._
    +
    +import scala.language.postfixOps
    +import org.apache.flink.api.scala._
    +import org.apache.flink.core.testutils.CommonTestUtils
    +
    +class ImplicitALSTest
    +  extends FlatSpec
    +  with Matchers
    +  with FlinkTestBase {
    +
    +  override val parallelism = 2
    +
    +  behavior of "The modification of the alternating least squares (ALS) implementation" +
    +    "for implicit feedback datasets."
    +
    +  it should "properly compute Y^T * Y, and factorize matrix" in {
    --- End diff --

    AFAIK in the rest of the FlinkML tests we just use `val env = ExecutionEnvironment.getExecutionEnvironment`. I don't know if that policy has now changed, maybe @tillrohrmann can clarify.

    For now I would say to just split the tests.

> Extend ALS to handle implicit feedback datasets
> -----------------------------------------------
>
>                 Key: FLINK-4613
>                 URL: https://issues.apache.org/jira/browse/FLINK-4613
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Gábor Hermann
>            Assignee: Gábor Hermann
>
> The Alternating Least Squares implementation should be extended to handle
> _implicit feedback_ datasets. These datasets do not contain explicit ratings
> by users; rather, they are built by collecting user behavior (e.g. user
> listened to artist X for Y minutes), and they require a slightly different
> optimization objective. See details by [Hu et
> al|http://dx.doi.org/10.1109/ICDM.2008.22].
> We do not need to modify much in the original ALS algorithm. See [Spark ALS
> implementation|https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala],
> which could be a basis for this extension. Only the factor-updating part is
> modified, and most of the changes are in the local parts of the algorithm
> (i.e. UDFs). In fact, the only modification that is not local is
> precomputing a matrix product Y^T * Y and broadcasting it to all the nodes,
> which we can do with broadcast DataSets.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
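
The non-local step named in the issue description, precomputing Y^T * Y, can be illustrated with a minimal sketch. It assumes Y is given as a collection of factor vectors (one row per item), and it uses plain Scala collections in place of Flink DataSets. The object name `GramMatrixSketch` and the helpers `outer`, `add`, and `gram` are hypothetical and do not come from the pull request.

```scala
// Sketch only: plain Scala collections stand in for Flink DataSets.
// All names here are hypothetical and are not taken from the pull request.
object GramMatrixSketch {
  type Matrix = Array[Array[Double]]

  // Outer product v * v^T of a single factor vector.
  def outer(v: Array[Double]): Matrix =
    v.map(a => v.map(b => a * b))

  // Element-wise sum of two equally sized matrices.
  def add(a: Matrix, b: Matrix): Matrix =
    a.zip(b).map { case (ra, rb) => ra.zip(rb).map { case (x, y) => x + y } }

  // Y^T * Y computed as the sum of the outer products of the rows of Y.
  def gram(rows: Seq[Array[Double]], numFactors: Int): Matrix =
    rows.map(outer).foldLeft(Array.fill(numFactors, numFactors)(0.0))(add)
}
```

Because the result is a sum over rows, each partition could compute a partial sum locally and the partial sums could then be reduced into one matrix. The result has only numFactors x numFactors entries regardless of the number of rows, which is what makes broadcasting it to all nodes practical.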