Shouldn't "val counts4 = text3" be "val counts4 = text4"?
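
For what it's worth, here is a rough sketch of the counting step with that name fixed and the Tuple1 unwrapped through an explicit lambda. It uses plain Scala collections as a stand-in for the Flink DataSet so it runs without a cluster; the object name, the helper countValues and the sample rows are my own assumptions, not taken from your attached CSV.

```scala
// Sketch only: a Seq[Tuple1[String]] stands in for the DataSet that
// env.readCsvFile[Tuple1[String]](...) would return.
object CountFirstColumn {
  // Hypothetical helper: counts how often each value occurs.
  def countValues(rows: Seq[Tuple1[String]]): Map[String, Int] =
    rows
      .map(t => (t._1, 1))   // unwrap the Tuple1 explicitly
      .groupBy(_._1)         // group by the column value
      .map { case (k, vs) => (k, vs.map(_._2).sum) } // sum the 1s per key

  def main(args: Array[String]): Unit = {
    val text4 = Seq(Tuple1("a"), Tuple1("aa")) // assumed sample rows
    val counts4 = countValues(text4)           // reads from text4, not text3
    counts4.toSeq.sortBy(_._1).foreach(println)
  }
}
```

On the real DataSet the same shape would be .map { t => (t._1, 1) }.groupBy(0).sum(1). Also, as far as I know, includedFields takes zero-based column indices, so the first column would be Array(0) rather than Array(2); please double-check that against your Flink version.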


2016-10-09 23:14 GMT+02:00 Alberto Ramón <a.ramonporto...@gmail.com>:

> I think the field delimiter is OK
> (I attached the CSV)
>
> val text4 = env.readCsvFile [Tuple1[String]]("file://data.csv"
>   ,fieldDelimiter = ","
>   ,includedFields = Array(2))
> val counts4 = text3
>   .map { (_, 1) }
>   .groupBy(0)
>   .sum(1)
> counts4.print()
>
> The result is:
> [image: Inline image 1]
>
> Can you see any bug in my code for reading only the 1st column?
>
>
> 2016-10-07 21:50 GMT+02:00 Fabian Hueske <fhue...@gmail.com>:
>
>> I would check that the field delimiter is correctly set.
>>
>> With the correct delimiter your code would give
>>
>> ((a),1)
>> ((aa),1)
>>
>> because the single field is wrapped in a Tuple1.
>> You have to unwrap it in the map function: .map { t => (t._1, 1) }
>>
>> 2016-10-07 18:08 GMT+02:00 Alberto Ramón <a.ramonporto...@gmail.com>:
>>
>>> Hmm
>>>
>>> Your solution compiles without errors, but includedFields isn't working:
>>> [image: Inline image 1]
>>>
>>> The output is incorrect:
>>> [image: Inline image 2]
>>>
>>> The correct result should contain only the 1st column:
>>> (a,1)
>>> (aa,1)
>>>
>>> 2016-10-06 21:37 GMT+02:00 Fabian Hueske <fhue...@gmail.com>:
>>>
>>>> Hi Alberto,
>>>>
>>>> if you want to read a single column you have to wrap it in a Tuple1:
>>>>
>>>> val text4 = env.readCsvFile[Tuple1[String]]("file:data.csv" 
>>>> ,includedFields = Array(1))
>>>>
>>>> Best, Fabian
>>>>
>>>> 2016-10-06 20:59 GMT+02:00 Alberto Ramón <a.ramonporto...@gmail.com>:
>>>>
>>>>> I'm learning readCsvFile
>>>>> (I discovered that if the file ends with "\n", it throws a null exception)
>>>>>
>>>>> *If I try to read only 1 column:*
>>>>>
>>>>> val text4 = env.readCsvFile[String]("file:data.csv", includedFields = Array(1))
>>>>>
>>>>> The error is: The type String has to be a tuple or pojo type. [null]
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> *If I request more than 1 column (1st and 2nd in this case):*
>>>>>
>>>>> val text4 = env.readCsvFile [(String,String)]("data.csv"
>>>>>   ,fieldDelimiter = ","
>>>>>   ,includedFields = Array(0,1))
>>>>>
>>>>> it reads all columns from the CSV (3 in my example)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
