Hi Roshan!

> From: Roshan Naik <ros...@hortonworks.com>
> If one would use the Kite dataset cli to create a dataset as per this
> http://kitesdk.org/docs/current/guide/Using-the-Kite-CLI-to-Create-a-Dataset/
>
>
> It is unclear what the 'kite.repo.uri' should be set to in the kite dataset
> sink.

Sorry for the confusion! The current release of Flume uses the
deprecated method of specifying a repository URI and a dataset name
while the CLI documentation covers the use of the newer dataset URIs.

The short version is that you can convert a dataset URI into a repo
URI/dataset name combination by replacing the scheme dataset: with
repo: and moving the dataset name out of the URI and into
kite.dataset.name.

Here are some examples:

dataset:hdfs:/data/repository/events ->

kite.repo.uri=repo:hdfs:/data/repository
kite.dataset.name=events

dataset:hive?dataset=events ->

kite.repo.uri=repo:hive
kite.dataset.name=events
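
If you want to script the conversion, here is a minimal Python sketch of
the rule above. It is not part of Kite or Flume: the function name
dataset_uri_to_repo is made up for illustration, and it assumes the
dataset name is either the last path segment or a dataset= query
parameter, as in the two examples. URIs with any other layout are not
handled.

```python
from typing import Tuple
from urllib.parse import parse_qsl, urlencode


def dataset_uri_to_repo(dataset_uri: str) -> Tuple[str, str]:
    """Return (kite.repo.uri, kite.dataset.name) for a dataset URI.

    Hypothetical helper: it only applies the textual rule from this
    email and does not call any Kite API.
    """
    prefix = "dataset:"
    if not dataset_uri.startswith(prefix):
        raise ValueError("not a dataset URI: " + dataset_uri)

    # Drop the scheme, then split off an optional query string.
    rest = dataset_uri[len(prefix):]
    location, _, query = rest.partition("?")
    params = parse_qsl(query)

    names = [value for key, value in params if key == "dataset"]
    if names:
        # Query form, e.g. dataset:hive?dataset=events
        name = names[0]
        remaining = [(k, v) for k, v in params if k != "dataset"]
    else:
        # Path form, e.g. dataset:hdfs:/data/repository/events
        location, sep, name = location.rstrip("/").rpartition("/")
        if not sep or not name:
            raise ValueError("no dataset name in: " + dataset_uri)
        remaining = params

    repo_uri = "repo:" + location
    if remaining:
        # Keep any other query parameters on the repo URI.
        repo_uri += "?" + urlencode(remaining)
    return repo_uri, name


if __name__ == "__main__":
    repo, name = dataset_uri_to_repo("dataset:hdfs:/data/repository/events")
    print("kite.repo.uri=" + repo)
    print("kite.dataset.name=" + name)
```

Run as a script, it prints the two sink properties for the first example
above.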

We fixed this in a recent patch[1] so that in the next release you can
specify the configuration with just a dataset URI.

> On a side note... it is also unclear how to create a local file system based
> Kite dataset using the CLI so that flume can be pointed to it.

You can use the dataset:file format for URIs[2]. For example:

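dataset:file:/data/repository/events

With the current Flume release, the same conversion applies, so the
sink settings for that dataset would be:

kite.repo.uri=repo:file:/data/repository
kite.dataset.name=events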

If I can help with any other Kite questions, feel free to ping me or
the Kite mailing list[3]!

-Joey

-- 
Joey Echeverria

[1] https://issues.apache.org/jira/browse/FLUME-2439
[2] http://kitesdk.org/docs/current/guide/URIs/#local
[3] https://groups.google.com/a/cloudera.org/forum/#!forum/cdk-dev
