Hi Roshan!

> From: Roshan Naik <ros...@hortonworks.com>
> If one would use the Kite dataset cli to create a dataset as per this
> http://kitesdk.org/docs/current/guide/Using-the-Kite-CLI-to-Create-a-Dataset/
>
> It is unclear what the 'kite.repo.uri' should be set to in the kite dataset
> sink.

Sorry for the confusion! The current release of Flume uses the deprecated
method of specifying a repository URI and a dataset name, while the CLI
documentation covers the use of the newer dataset URIs. The short version is
that you can convert from a dataset URI to a repo URI/dataset name combination
by replacing the scheme dataset: with repo: and removing the name of that
dataset. Here are some examples:

dataset:hdfs:/data/repository/events ->
    kite.repo.uri=repo:hdfs:/data/repository
    kite.dataset.name=events

dataset:hive?dataset=events ->
    kite.repo.uri=repo:hive
    kite.dataset.name=events

We fixed this in a recent patch[1] so that in the next release you can specify
the configuration with just a dataset URI.

> On a side note... it is also unclear how to create a local file system based
> Kite dataset using the CLI so that flume can be pointed to it.

You can use the dataset:file format for URIs[2]. For example:

dataset:file:/data/repository/events

If I can help with any other Kite questions, feel free to ping me or the Kite
mailing list[3]!

-Joey

--
Joey Echeverria

[1] https://issues.apache.org/jira/browse/FLUME-2439
[2] http://kitesdk.org/docs/current/guide/URIs/#local
[3] https://groups.google.com/a/cloudera.org/forum/#!forum/cdk-dev
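
For anyone who wants to script the dataset URI to repo URI conversion described above, here is a minimal sketch in Python. It is not part of Kite or Flume: the function name dataset_to_repo is made up for illustration, and it only handles the two URI shapes shown in the examples (a trailing path segment, or a ?dataset= query parameter). Applying the same rule to dataset:file: URIs, and keeping any other query parameters on the repo URI, are assumptions on my part.

```python
def dataset_to_repo(dataset_uri):
    """Split a Kite dataset URI into (repo URI, dataset name).

    Hypothetical helper: it mirrors the manual rule from the email
    (swap the dataset: scheme for repo:, then drop the dataset name).
    """
    prefix = "dataset:"
    if not dataset_uri.startswith(prefix):
        raise ValueError("not a dataset URI: " + dataset_uri)
    rest = dataset_uri[len(prefix):]

    if "?" in rest:
        # Query form, e.g. dataset:hive?dataset=events
        base, query = rest.split("?", 1)
        pairs = [p.split("=", 1) for p in query.split("&") if "=" in p]
        names = [value for key, value in pairs if key == "dataset"]
        if not names or not names[0]:
            raise ValueError("no dataset name in: " + dataset_uri)
        # Assumption: parameters other than 'dataset' stay on the repo URI.
        others = "&".join(k + "=" + v for k, v in pairs if k != "dataset")
        repo = "repo:" + base + ("?" + others if others else "")
        return repo, names[0]

    # Path form, e.g. dataset:hdfs:/data/repository/events
    base, _, name = rest.rpartition("/")
    if not base or not name:
        raise ValueError("no dataset name in: " + dataset_uri)
    return "repo:" + base, name


if __name__ == "__main__":
    for uri in ("dataset:hdfs:/data/repository/events",
                "dataset:hive?dataset=events"):
        repo_uri, dataset_name = dataset_to_repo(uri)
        print("kite.repo.uri=" + repo_uri)
        print("kite.dataset.name=" + dataset_name)
```

The two printed property pairs should match the hand-converted examples above; anything outside those two URI shapes raises a ValueError rather than guessing.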