Hi, I work on a project where we build a cube multiple times a day using Kylin. We were using Kylin 1.6 and upgraded this week to Kylin 2.0.
Since the upgrade I have noticed that HDFS usage increases every time we rebuild the cube, and the space is never freed. This happens even though we run both the StorageCleanupJob and the metastore clean command as described here and here.

Looking into HDFS to see where the increase comes from, I see that the data accumulates under /kylin/kylin_metadata/. It looks like every job gets a new folder inside that directory, and each folder is at least as large as the cube. Some of these folders were not cleared even for very old jobs, but since the upgrade to 2.0 none of the job folders have been cleared.

I deleted some of the older folders and it did not affect the cube. I also created a test cube, deleted the folder that was created for it, and could still query the cube.

Is it safe to delete these folders manually? Is it correct to assume that once a job is done, all the data that needs to be kept is in HBase (where I can find the cube and the metadata information)?

Many thanks,
Itay

-----
Itay Shwartz
StructureIt
6th Floor Aldgate Tower
2 Leman Street
London E1 8FA
direct line: +44 (0)20 3286 9902
mobile: +44 (0)74 1123 6614
www.structureit.net
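
For anyone who wants to reproduce the observation, here is a minimal sketch of how the per-job folder sizes under /kylin/kylin_metadata/ could be listed. It assumes the `hdfs` CLI is on the PATH and that each line of `hdfs dfs -du` output starts with the size in bytes and ends with the path. The function names are illustrative only and are not part of Kylin.

```python
import subprocess


def parse_du_output(text):
    """Parse `hdfs dfs -du` output into (size_bytes, path) pairs.

    Assumption: each line starts with the size in bytes and ends with the
    path. Some Hadoop versions print an extra column in between; it is
    ignored here. Paths containing spaces are not handled.
    """
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2 or not parts[0].isdigit():
            continue  # skip blank lines and anything that is not a size row
        entries.append((int(parts[0]), parts[-1]))
    return entries


def largest_folders(entries, top=10):
    """Return the `top` largest entries, biggest first."""
    return sorted(entries, reverse=True)[:top]


def list_job_folders(root="/kylin/kylin_metadata/"):
    """Run `hdfs dfs -du` on the working directory and parse the result."""
    result = subprocess.run(
        ["hdfs", "dfs", "-du", root],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_du_output(result.stdout)


# Example usage on a machine with HDFS access:
#   for size, path in largest_folders(list_job_folders()):
#       print(f"{size / 1024**2:10.1f} MiB  {path}")
```

This only reports sizes; it does not delete anything, so it can be run before and after the cleanup jobs to see which folders they actually remove.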
