Thank you for the reply.

I proceeded as per the example listed on the Apache Mahout help page at
this link:

https://mahout.apache.org/users/recommender/intro-als-hadoop.html

As per Step 4 on that page, after the sequence file has been created, one
issues the following command:

$ mahout recommendfactorized --input $als_input --userFeatures
$als_output/U/ --itemFeatures $als_output/M/ --numRecommendations 1
--output recommendations --maxRating 1

Now, the folders 'U' and 'M' mentioned in the above command are created
by mahout during the sequence-file creation process, via the following
command:

$ mahout seqdirectory -i /user/ashokharnal/testdata -ow -o
/user/ashokharnal/seqfiles
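
To double-check what this command actually wrote, I put together a small
standalone helper (my own sketch, not part of Mahout) that prints the key
and value classes of the part file it produced; 'hadoop fs -text' on the
same file shows the records themselves:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;

// Hypothetical helper: print the key/value classes of a SequenceFile,
// e.g. /user/ashokharnal/seqfiles/part-m-00000 from the command above.
public class InspectSeqFile {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    SequenceFile.Reader reader =
        new SequenceFile.Reader(fs, new Path(args[0]), conf);
    try {
      System.out.println("key class:   " + reader.getKeyClassName());
      System.out.println("value class: " + reader.getValueClassName());
    } finally {
      reader.close();
    }
  }
}

If the seqdirectory documentation quoted below is right, this should print
org.apache.hadoop.io.Text for both the key and the value.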

Since these very names 'U' and 'M' were used in the example, I thought
nothing more needed to be done when creating the sequence file.
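
In case recommendfactorized in fact needs a SequenceFile keyed by
IntWritable, which is what the ClassCastException in my earlier mail
(quoted below) suggests, here is a rough sketch of how such a file could
be written by hand from a tab-separated ratings file. Only the IntWritable
key class is taken from the stack trace; the VectorWritable value type and
the one-rating-vector-per-user layout are my own assumptions:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.mahout.math.RandomAccessSparseVector;
import org.apache.mahout.math.Vector;
import org.apache.mahout.math.VectorWritable;

// Hypothetical converter, not part of Mahout: turns an HDFS file of
// "userID <TAB> itemID <TAB> rating" lines (sorted by userID) into a
// SequenceFile<IntWritable, VectorWritable> with one vector per user.
public class RatingsToSeqFile {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    BufferedReader in = new BufferedReader(
        new InputStreamReader(fs.open(new Path(args[0]))));
    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, new Path(args[1]), IntWritable.class, VectorWritable.class);
    try {
      int currentUser = -1;
      Vector ratings = null;
      String line;
      while ((line = in.readLine()) != null) {
        String[] fields = line.split("\t");
        int user = Integer.parseInt(fields[0]);
        int item = Integer.parseInt(fields[1]);
        double rating = Double.parseDouble(fields[2]);
        if (user != currentUser) {
          // flush the previous user's vector before starting a new one
          if (ratings != null) {
            writer.append(new IntWritable(currentUser),
                          new VectorWritable(ratings));
          }
          currentUser = user;
          ratings = new RandomAccessSparseVector(Integer.MAX_VALUE);
        }
        ratings.set(item, rating);
      }
      if (ratings != null) {
        writer.append(new IntWritable(currentUser),
                      new VectorWritable(ratings));
      }
    } finally {
      in.close();
      writer.close();
    }
  }
}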

What further steps are needed? Please suggest a simple shell command.

Thanks,

Ashok Kumar Harnal

On 25 November 2014 at 14:52, Gokhan Capan <[email protected]> wrote:

> The problem is that seqdirectory doesn't do what you want. From the
> documentation page:
>
> The output of seqDirectory will be a Sequence file < Text, Text > of
> all documents (/sub-directory-path/documentFileName, documentText).
>
> Please see
> http://mahout.apache.org/users/basics/creating-vectors-from-text.html
> for more details
>
> Sent from my iPhone
>
> > On Nov 25, 2014, at 10:35, Ashok Harnal <[email protected]> wrote:
> >
> > I have now tested on a fresh cluster of Cloudera 5.2. Mahout 0.9 comes
> > installed with it.
> >
> > My input data is just five tab-separated lines. I typed this data
> > myself, so I do not expect anything else in it.
> >
> > 1    100    1
> > 1    200    5
> > 1    400    1
> > 2    200    2
> > 2    300    1
> >
> > I use the following Mahout command for factorization:
> >
> > mahout parallelALS --input /user/ashokharnal/mydata --output
> > /user/ashokharnal/outdata --lambda 0.1 --implicitFeedback true --alpha
> 0.8
> > --numFeatures 2 --numIterations 5  --numThreadsPerSolver 1 --tempDir
> > /tmp/ratings
> >
> > I then create the following two-line, tab-separated test file:
> >
> > 1    100
> > 2    200
> >
> > I typed this out myself, so no text string is expected.
> >
> > This file was then converted to sequence format as follows:
> >
> > mahout seqdirectory -i /user/ashokharnal/testdata -ow -o
> > /user/ashokharnal/seqfiles
> >
> > Finally, I ran the following command to get recommendations:
> >
> > mahout recommendfactorized --input /user/ashokharnal/seqfiles
> > --userFeatures /user/ashokharnal/outdata/U/ --itemFeatures
> > /user/ashokharnal/outdata/M/ --numRecommendations 1 --output
> > recommendations --maxRating 1
> >
> > I get the same error. Full error trace is as below:
> >
> >
> > $ mahout recommendfactorized --input /user/ashokharnal/seqfiles
> > --userFeatures /user/ashokharnal/outdata/U/ --itemFeatures
> > /user/ashokharnal/outdata/M/ --numRecommendations 1 --output
> > recommendations --maxRating 1
> > MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
> > Running on hadoop, using
> > /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/bin/hadoop and
> > HADOOP_CONF_DIR=/etc/hadoop/conf
> > MAHOUT-JOB:
> /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/mahout/mahout-examples-0.9-cdh5.2.0-job.jar
> > 14/11/25 13:48:46 WARN driver.MahoutDriver: No
> > recommendfactorized.props found on classpath, will use command-line
> > arguments only
> > 14/11/25 13:48:46 INFO common.AbstractJob: Command line arguments:
> > {--endPhase=[2147483647], --input=[/user/ashokharnal/seqfiles],
> > --itemFeatures=[/user/ashokharnal/outdata/M/], --maxRating=[1],
> > --numRecommendations=[1], --numThreads=[1],
> > --output=[recommendations], --startPhase=[0], --tempDir=[temp],
> > --userFeatures=[/user/ashokharnal/outdata/U/]}
> > 14/11/25 13:48:47 INFO Configuration.deprecation: session.id is
> > deprecated. Instead, use dfs.metrics.session-id
> > 14/11/25 13:48:47 INFO jvm.JvmMetrics: Initializing JVM Metrics with
> > processName=JobTracker, sessionId=
> > 14/11/25 13:48:47 WARN mapred.JobClient: Use GenericOptionsParser for
> > parsing the arguments. Applications should implement Tool for the
> > same.
> > 14/11/25 13:48:47 INFO input.FileInputFormat: Total input paths to
> process : 1
> > 14/11/25 13:48:48 WARN conf.Configuration:
> >
> file:/tmp/hadoop-bigdata1/mapred/local/localRunner/bigdata1/job_local2071551631_0001/job_local2071551631_0001.xml:an
> > attempt to override final parameter:
> > hadoop.ssl.keystores.factory.class;  Ignoring.
> > 14/11/25 13:48:48 WARN conf.Configuration:
> >
> file:/tmp/hadoop-bigdata1/mapred/local/localRunner/bigdata1/job_local2071551631_0001/job_local2071551631_0001.xml:an
> > attempt to override final parameter: hadoop.ssl.client.conf;
> > Ignoring.
> > 14/11/25 13:48:48 WARN conf.Configuration:
> >
> file:/tmp/hadoop-bigdata1/mapred/local/localRunner/bigdata1/job_local2071551631_0001/job_local2071551631_0001.xml:an
> > attempt to override final parameter: hadoop.ssl.server.conf;
> > Ignoring.
> > 14/11/25 13:48:48 WARN conf.Configuration:
> >
> file:/tmp/hadoop-bigdata1/mapred/local/localRunner/bigdata1/job_local2071551631_0001/job_local2071551631_0001.xml:an
> > attempt to override final parameter: hadoop.ssl.require.client.cert;
> > Ignoring.
> > 14/11/25 13:48:48 INFO mapred.LocalJobRunner: OutputCommitter set in
> config null
> > 14/11/25 13:48:48 INFO mapred.JobClient: Running job:
> job_local2071551631_0001
> > 14/11/25 13:48:48 INFO mapred.LocalJobRunner: OutputCommitter is
> > org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
> > 14/11/25 13:48:48 INFO mapred.LocalJobRunner: Waiting for map tasks
> > 14/11/25 13:48:48 INFO mapred.LocalJobRunner: Starting task:
> > attempt_local2071551631_0001_m_000000_0
> > 14/11/25 13:48:48 WARN mapreduce.Counters: Group
> > org.apache.hadoop.mapred.Task$Counter is deprecated. Use
> > org.apache.hadoop.mapreduce.TaskCounter instead
> > 14/11/25 13:48:48 INFO util.ProcessTree: setsid exited with exit code 0
> > 14/11/25 13:48:48 INFO mapred.Task:  Using ResourceCalculatorPlugin :
> > org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4e7f1fc4
> > 14/11/25 13:48:48 INFO mapred.MapTask: Processing split:
> > hdfs://bigdata1:8020/user/ashokharnal/seqfiles/part-m-00000:0+196
> > 14/11/25 13:48:48 INFO zlib.ZlibFactory: Successfully loaded &
> > initialized native-zlib library
> > 14/11/25 13:48:48 INFO compress.CodecPool: Got brand-new decompressor
> [.deflate]
> > 14/11/25 13:48:48 INFO mapred.LocalJobRunner: Map task executor complete.
> > 14/11/25 13:48:48 WARN mapred.LocalJobRunner: job_local2071551631_0001
> > java.lang.Exception: java.lang.RuntimeException:
> > java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast
> > to org.apache.hadoop.io.IntWritable
> >    at
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:406)
> > Caused by: java.lang.RuntimeException: java.lang.ClassCastException:
> > org.apache.hadoop.io.Text cannot be cast to
> > org.apache.hadoop.io.IntWritable
> >    at
> org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper.run(MultithreadedMapper.java:151)
> >    at
> org.apache.mahout.cf.taste.hadoop.als.MultithreadedSharingMapper.run(MultithreadedSharingMapper.java:60)
> >    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
> >    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
> >    at
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
> >    at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> >    at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >    at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >    at java.lang.Thread.run(Thread.java:745)
> > Caused by: java.lang.ClassCastException: org.apache.hadoop.io.Text
> > cannot be cast to org.apache.hadoop.io.IntWritable
> >    at
> org.apache.mahout.cf.taste.hadoop.als.PredictionMapper.map(PredictionMapper.java:44)
> >    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
> >    at
> org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper$MapRunner.run(MultithreadedMapper.java:268)
> > 14/11/25 13:48:49 INFO mapred.JobClient:  map 0% reduce 0%
> > 14/11/25 13:48:49 INFO mapred.JobClient: Job complete:
> job_local2071551631_0001
> > 14/11/25 13:48:49 INFO mapred.JobClient: Counters: 0
> > 14/11/25 13:48:49 INFO driver.MahoutDriver: Program took 2651 ms
> > (Minutes: 0.04418333333333333)
> > 14/11/25 13:48:49 ERROR hdfs.DFSClient: Failed to close inode 18867
> >
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
> > No lease on
> /user/bigdata1/recommendations/_temporary/_attempt_local2071551631_0001_m_000000_0/part-m-00000
> > (inode 18867): File does not exist. Holder
> > DFSClient_NONMAPREDUCE_-1603552809_1 does not have any open files.
> >    at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3319)
> >    at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:3407)
> >    at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:3377)
> >    at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:673)
> >    at
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.complete(AuthorizationProviderProxyClientProtocol.java:219)
> >    at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:520)
> >    at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> >    at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
> >    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
> >    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
> >    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
> >    at java.security.AccessController.doPrivileged(Native Method)
> >    at javax.security.auth.Subject.doAs(Subject.java:415)
> >    at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> >    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> >
> >    at org.apache.hadoop.ipc.Client.call(Client.java:1411)
> >    at org.apache.hadoop.ipc.Client.call(Client.java:1364)
> >    at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> >    at com.sun.proxy.$Proxy16.complete(Unknown Source)
> >    at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:435)
> >    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >    at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >    at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >    at java.lang.reflect.Method.invoke(Method.java:606)
> >    at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> >    at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> >    at com.sun.proxy.$Proxy17.complete(Unknown Source)
> >    at
> org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:2180)
> >    at
> org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2164)
> >    at
> org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:908)
> >    at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:926)
> >    at
> org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:861)
> >    at
> org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2687)
> >    at
> org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2704)
> >    at
> org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
> >
> > I can now at least rule out an input/output file problem: the same
> > error was observed when I worked with mahout 0.8 installed on Cloudera
> > 5.0.
> >
> > So, if the command arguments I have supplied above are OK, either the
> > mahout build shipped with both Cloudera 5.0 and now 5.2 is at fault,
> > or there is a problem with the command-line version of mahout.
> >
> > Thanks,
> >
> > Ashok Kumar Harnal
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >> On 25 November 2014 at 08:52, Ashok Harnal <[email protected]>
> wrote:
> >>
> >> Thanks for the reply. I will recheck and repeat the experiment using
> >> self-typed input.
> >> I am reinstalling Cloudera 5.2.
> >>
> >> Ashok Kumar Harnal
> >>
> >>> On 24 November 2014 at 21:38, Ted Dunning <[email protected]>
> wrote:
> >>>
> >>> The error message that you got indicated that some input was textual
> and
> >>> needed to be an integer.
> >>>
> >>> Is there a chance that the type of some of your input is incorrect in
> your
> >>> sequence files?
> >>>
> >>>
> >>>
> >>> On Mon, Nov 24, 2014 at 3:47 PM, Ashok Harnal <[email protected]>
> >>> wrote:
> >>>
> >>>> Thanks for the reply. I did not compile mahout; Mahout 0.9 comes
> >>>> along with Cloudera 5.2.
> >>>>
> >>>> Ashok Kumar Harnal
> >>>>
> >>>>> On 24 November 2014 at 18:42, <[email protected]> wrote:
> >>>>>
> >>>>> Looks like maybe a mismatch between the mahout version you compiled
> >>>>> your code against and the mahout version installed in the cluster?
> >>>>>
> >>>>>> On Nov 24, 2014, at 8:08 AM, Ashok Harnal <[email protected]>
> >>>> wrote:
> >>>>>>
> >>>>>> Thanks for the reply. Here are the facts:
> >>>>>>
> >>>>>> 1. I am using the mahout shell command and not a java program, so
> >>>>>> I am not passing any arguments to the map function.
> >>>>>>
> >>>>>> 2. I am using hadoop. The input training file is loaded into
> >>>>>> hadoop. It is the tab-separated 'u1.base' file of the MovieLens
> >>>>>> dataset. It looks like the sample below: all users are present,
> >>>>>> along with whatever ratings they have given.
> >>>>>>
> >>>>>> 1    1    5
> >>>>>> 1    2    3
> >>>>>> 1    3    4
> >>>>>> 1    4    3
> >>>>>> 1    5    3
> >>>>>> :
> >>>>>> :
> >>>>>> 2    1    4
> >>>>>> 2    10    2
> >>>>>> 2    14    4
> >>>>>> :
> >>>>>> :
> >>>>>>
> >>>>>> 3. I use the following mahout command to build the model:
> >>>>>>
> >>>>>>     mahout parallelALS --input /user/ashokharnal/u1.base --output
> >>>>>> /user/ashokharnal/u1.out --lambda 0.1 --implicitFeedback true
> >>> --alpha
> >>>>>> 0.8 --numFeatures 15 --numIterations 10  --numThreadsPerSolver 1
> >>>>>> --tempDir /tmp/ratings
> >>>>>>
> >>>>>> 4. My test file is just a two-line, tab-separated file, as below:
> >>>>>>
> >>>>>>
> >>>>>> 1    1
> >>>>>> 2    1
> >>>>>>
> >>>>>> 5. This file is converted to a sequence file using the following
> >>>>>> mahout command:
> >>>>>>
> >>>>>> mahout seqdirectory -i /user/ashokharnal/ufind2.test -o
> >>>>>> /user/ashokharnal/seqfiles
> >>>>>>
> >>>>>> 6. I then run the following mahout command:
> >>>>>>
> >>>>>> mahout recommendfactorized --input /user/ashokharnal/seqfiles
> >>>>>> --userFeatures  /user/ashokharnal/u1.out/U/ --itemFeatures
> >>>>>> /user/akh/u1.out/M/ --numRecommendations 1 --output
> >>> /tmp/reommendation
> >>>>>> --maxRating 1
> >>>>>>
> >>>>>> 7. I am using CentOS 6.5 with Cloudera 5.2 installed.
> >>>>>>
> >>>>>> The error messages are as below:
> >>>>>>
> >>>>>> 14/11/24 18:06:48 INFO mapred.MapTask: Processing split:
> >>>>>> hdfs://master:8020/user/ashokharnal/seqfiles/part-m-00000:0+195
> >>>>>> 14/11/24 18:06:49 INFO zlib.ZlibFactory: Successfully loaded &
> >>>>>> initialized native-zlib library
> >>>>>> 14/11/24 18:06:49 INFO compress.CodecPool: Got brand-new
> >>> decompressor
> >>>>> [.deflate]
> >>>>>> 14/11/24 18:06:49 INFO mapred.LocalJobRunner: Map task executor
> >>>> complete.
> >>>>>> 14/11/24 18:06:49 WARN mapred.LocalJobRunner:
> >>> job_local1177125820_0001
> >>>>>> java.lang.Exception: java.lang.RuntimeException:
> >>>>>> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be
> >>> cast
> >>>>>> to org.apache.hadoop.io.IntWritable
> >>>>>>   at
> >>>
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:406)
> >>>>>> Caused by: java.lang.RuntimeException: java.lang.ClassCastException:
> >>>>>> org.apache.hadoop.io.Text cannot be cast to
> >>>>>> org.apache.hadoop.io.IntWritable
> >>>>>>   at
> >>>
> org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper.run(MultithreadedMapper.java:151)
> >>>>>>   at
> >>>
> org.apache.mahout.cf.taste.hadoop.als.MultithreadedSharingMapper.run(MultithreadedSharingMapper.java:60)
> >>>>>>   at
> >>> org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
> >>>>>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
> >>>>>>   at
> >>>
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
> >>>>>>   at
> >>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >>>>>>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> >>>>>>   at
> >>>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >>>>>>   at
> >>>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >>>>>>   at java.lang.Thread.run(Thread.java:744)
> >>>>>> Caused by: java.lang.ClassCastException: org.apache.hadoop.io.Text
> >>>>>> cannot be cast to org.apache.hadoop.io.IntWritable
> >>>>>>   at
> >>>
> org.apache.mahout.cf.taste.hadoop.als.PredictionMapper.map(PredictionMapper.java:44)
> >>>>>>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
> >>>>>>   at
> >>>
> org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper$MapRunner.run(MultithreadedMapper.java:268)
> >>>>>> 14/11/24 18:06:49 INFO mapred.JobClient:  map 0% reduce 0%
> >>>>>> 14/11/24 18:06:49 INFO mapred.JobClient: Job complete:
> >>>>> job_local1177125820_0001
> >>>>>> 14/11/24 18:06:49 INFO mapred.JobClient: Counters: 0
> >>>>>> 14/11/24 18:06:49 INFO driver.MahoutDriver: Program took 2529 ms
> >>>>>> (Minutes: 0.04215)
> >>>>>> 14/11/24 18:06:49 ERROR hdfs.DFSClient: Failed to close inode 24733
> >>>
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
> >>>>>> No lease on
> >>>
> /tmp/reommendation/_temporary/_attempt_local1177125820_0001_m_000000_0/part-m-00000
> >>>>>> (inode 24733): File does not exist. Holder
> >>>>>> DFSClient_NONMAPREDUCE_157704469_1 does not have any open files.
> >>>>>>   at
> >>>
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3319)
> >>>>>>   at
> >>>
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:3407)
> >>>>>>   at
> >>>
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:3377)
> >>>>>>   at
> >>>
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:673)
> >>>>>>   at
> >>>
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.complete(AuthorizationProviderProxyClientProtocol.java:219)
> >>>>>>   at
> >>>
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:520)
> >>>>>>   at
> >>>
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> >>>>>>   at
> >>>
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
> >>>>>>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
> >>>>>>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
> >>>>>>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
> >>>>>>   at java.security.AccessController.doPrivileged(Native Method)
> >>>>>>   at javax.security.auth.Subject.doAs(Subject.java:415)
> >>>>>>   at
> >>>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> >>>>>>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> >>>>>>
> >>>>>>   at org.apache.hadoop.ipc.Client.call(Client.java:1411)
> >>>>>>   at org.apache.hadoop.ipc.Client.call(Client.java:1364)
> >>>>>>   at
> >>>
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> >>>>>>   at com.sun.proxy.$Proxy16.complete(Unknown Source)
> >>>>>>   at
> >>>
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:435)
> >>>>>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>>>>   at
> >>>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >>>>>>   at
> >>>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>>>>   at java.lang.reflect.Method.invoke(Method.java:606)
> >>>>>>   at
> >>>
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> >>>>>>   at
> >>>
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> >>>>>>   at com.sun.proxy.$Proxy17.complete(Unknown Source)
> >>>>>>   at
> >>>
> org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:2180)
> >>>>>>   at
> >>> org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2164)
> >>>>>>   at
> >>>
> org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:908)
> >>>>>>   at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:926)
> >>>>>>   at
> >>>
> org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:861)
> >>>>>>   at
> >>>>> org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2687)
> >>>>>>   at
> >>>
> org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2704)
> >>>>>>   at
> >>>
> org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
> >>>>>>
> >>>>>> Sorry for bothering you.
> >>>>>>
> >>>>>> Ashok Kumar Harnal
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 24 November 2014 at 15:50, Divyang Shah
> >>>>> <[email protected]>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hello, the problem is that the argument passed to the map method
> >>>>>>> does not match the one specified in the job configuration, so
> >>>>>>> make the two match.
> >>>>>>>
> >>>>>>>
> >>>>>>>    On Sunday, 23 November 2014 8:31 AM, Ashok Harnal <
> >>>>>>> [email protected]> wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>> I use mahout 0.7 installed in Cloudera. After creating the
> >>>>>>> user-feature and item-feature matrices in hdfs, I run the
> >>>>>>> following command:
> >>>>>>>
> >>>>>>> mahout recommendfactorized --input /user/ashokharnal/seqfiles
> >>>>>>> --userFeatures $res_out_file/U/ --itemFeatures $res_out_file/M/
> >>>>>>> --numRecommendations 1 --output $reommendation --maxRating 1
> >>>>>>>
> >>>>>>> After some time, I get the following error:
> >>>>>>>
> >>>>>>> :
> >>>>>>> :
> >>>>>>> 14/11/23 08:28:20 INFO mapred.LocalJobRunner: Map task executor
> >>>>> complete.
> >>>>>>> 14/11/23 08:28:20 WARN mapred.LocalJobRunner:
> >>> job_local954305987_0001
> >>>>>>> java.lang.Exception: java.lang.RuntimeException:
> >>>>>>> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be
> >>> cast
> >>>>> to
> >>>>>>> org.apache.hadoop.io.IntWritable
> >>>>>>>   at
> >>>>
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:406)
> >>>>>>> Caused by: java.lang.RuntimeException:
> >>> java.lang.ClassCastException:
> >>>>>>> org.apache.hadoop.io.Text cannot be cast to
> >>>>>>> org.apache.hadoop.io.IntWritable
> >>>>>>>   at
> >>>
> org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper.run(MultithreadedMapper.java:151)
> >>>>>>>   at
> >>>
> org.apache.mahout.cf.taste.hadoop.als.MultithreadedSharingMapper.run(MultithreadedSharingMapper.java:60)
> >>>>>>>   at
> >>> org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
> >>>>>>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
> >>>>>>>   at
> >>>
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
> >>>>>>>   at
> >>>>
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >>>>>>>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> >>>>>>>   at
> >>>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >>>>>>>   at
> >>>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >>>>>>>   at java.lang.Thread.run(Thread.java:744)
> >>>>>>> Caused by: java.lang.ClassCastException: org.apache.hadoop.io.Text
> >>>>> cannot
> >>>>>>> be cast to org.apache.hadoop.io.IntWritable
> >>>>>>>   at
> >>>
> org.apache.mahout.cf.taste.hadoop.als.PredictionMapper.map(PredictionMapper.java:44)
> >>>>>>>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
> >>>>>>>   at
> >>>
> org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper$MapRunner.run(MultithreadedMapper.java:268)
> >>>>>>>
> >>>>>>>
> >>>>>>> Not sure what is wrong.
> >>>>>>> Request help.
> >>>>>>>
> >>>>>>> Ashok Kumar Harnal
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Visit my blog at: http://ashokharnal.wordpress.com/
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Visit my blog at: http://ashokharnal.wordpress.com/
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Visit my blog at: http://ashokharnal.wordpress.com/
> >>
> >>
> >>
> >> --
> >> Visit my blog at: http://ashokharnal.wordpress.com/
> >
> >
> >
> > --
> > Visit my blog at: http://ashokharnal.wordpress.com/
>

-- 
Visit my blog at: http://ashokharnal.wordpress.com/