Thank you.

I uploaded the file to S3, then used spark-sql to create an external table
over that data. I was then able to access the table for syncing in Kylin.
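
For anyone following along, the external-table step can be sketched
roughly as below. This is an illustration rather than the exact command
from this thread: the table name, column list, and bucket path are all
hypothetical, and it assumes a comma-delimited file with no header row,
stored on its own in an S3 directory. The helper only builds the
Hive-style DDL text, which could then be pasted into spark-sql or hive.

```python
def build_external_table_ddl(table, columns, location):
    """Build a Hive-style CREATE EXTERNAL TABLE statement for CSV data.

    table    -- table name (hypothetical in the example below)
    columns  -- list of (column_name, sql_type) pairs
    location -- directory that holds the CSV file(s), not the file itself
    """
    if not columns:
        raise ValueError("at least one column is required")
    col_defs = ",\n  ".join(f"`{name}` {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        f"  {col_defs}\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )


# Hypothetical names: replace with the real table, columns, and bucket path.
ddl = build_external_table_ddl(
    "sample_csv",
    [("id", "BIGINT"), ("name", "STRING"), ("amount", "DOUBLE")],
    "s3a://my-bucket/kylin/sample_csv/",
)
print(ddl)
```

Because the table is external and only points at the given location,
dropping it later removes the table definition but leaves the CSV in S3.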

Question -- do spark-sql and Hive store the data in the same place? In
other words, is the result the same whether I use spark-sql or Hive to
load the CSV?

Thanks, WILL





On Mon, Oct 10, 2022 at 7:07 PM Mukvin <boyboys...@163.com> wrote:

> Hi WILL,
> I checked the CSV upload feature locally with your sample.csv file and
> got the same error.
> Yes, the best way is to load the data into Hive directly.
>
> Two methods:
> 1. Follow
> https://kylin.apache.org/blog/2022/04/20/kylin4-on-cloud-part1/ and adapt
> the commands shown there to your own tables.
> 2. For your case, I suggest uploading the 3GB CSV file to an AWS bucket
> or to the `Kylin node` EC2 instance, then using a Hive command to define
> a table whose data source maps to that file.
> Examples:
> https://stackoverflow.com/questions/19320611/hadoop-hive-loading-data-from-csv-on-a-local-machine
>
>
>
> --
> Best regards.
> Tengting Xu
>
>
>
> At 2022-10-11 07:05:34, "Will Glass-Husain" <wgl...@forio.com> wrote:
> >Hi,
> >
> >I have a 3GB CSV file with about 90 columns of data that I want to load
> >into Kylin. I have set up Kylin on the cloud by following the tutorial,
> >using the kylin4_on_cloud branch.
> >
> >Are there simple instructions for a new user on the best way to load
> >the CSV file? I tried the online CSV loader with a small 5-line file and
> >it doesn't work (see KYLIN-5276).
> >
> >I assume the best way is to load into hive directly?   Can someone point me
> >to simple instructions?
> >
> >Much appreciated.  Trying to evaluate Kylin to see if it can speed up some
> >online data analysis we are trying to do.
> >
> >Best regards, WILL
> >
> >--
> >William Glass-Husain   /forio  |  +1 (415) 440 7500 x802  |  forio.com
> ><http://www.forio.com/>
>
>

-- 
William Glass-Husain   /forio  |  +1 (415) 440 7500 x802  |  forio.com
<http://www.forio.com/>
