[ https://issues.apache.org/jira/browse/HIVE-27100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stamatis Zampetakis reassigned HIVE-27100:
------------------------------------------

> Remove unused data/files from repo
> ----------------------------------
>
>                 Key: HIVE-27100
>                 URL: https://issues.apache.org/jira/browse/HIVE-27100
>             Project: Hive
>          Issue Type: Task
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>
> Some files under [https://github.com/apache/hive/tree/master/data/files] are
> not referenced anywhere else in the repo and can be removed.
> Removing them makes it easier to see what is actually tested. Other minor
> benefits:
> * faster checkout times;
> * smaller source/binary releases.
> The script that was used to find which files are not referenced can be found
> below:
> {code:bash}
> for f in `ls data/files`; do
>   echo -n "$f ";
>   grep -a -R "$f" --exclude-dir=".git" --exclude-dir=target --exclude=\*.q.out --exclude=\*.class --exclude=\*.jar | wc -l | grep " 0$";
> done
> {code}
> +Output+
> {noformat}
> cbo_t4.txt 0
> cbo_t5.txt 0
> cbo_t6.txt 0
> compressed_4line_file1.csv.bz2 0
> empty2.txt 0
> filterCard.txt 0
> fullouter_string_big_1a_old.txt 0
> fullouter_string_small_1a_old.txt 0
> futurama_episodes.avro 0
> in9.txt 0
> map_null_schema.avro 0
> regex-path-2015-12-10_03.txt 0
> regex-path-201512-10_03.txt 0
> regex-path-2015121003.txt 0
> sample.json 0
> sample-queryplan-in-history.txt 0
> sample-queryplan.txt 0
> smbbucket_2.txt 0
> smb_bucket_input.txt 0
> SortDescCol1Col2.txt 0
> SortDescCol2Col1.txt 0
> sortdp.txt 0
> srcsortbucket1outof4.txt 0
> srcsortbucket2outof4.txt 0
> srcsortbucket4outof4.txt 0
> {noformat}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
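
The search described in the issue can also be sketched as a small shell function. This is an illustrative variant, not the reporter's script: it assumes a grep that supports -R, -F, --exclude and --exclude-dir (GNU and BSD grep both do), matches each file name as a fixed string so that the "." in a name is not treated as a regex wildcard, and prints only the names that have no match. The function name list_unreferenced and its two arguments are hypothetical.

```bash
# Sketch: print the files in a data directory whose names appear nowhere
# in the searched tree. list_unreferenced is a hypothetical helper name,
# not something that exists in the Hive repo.
list_unreferenced() {
  data_dir="$1"        # directory holding the candidate files, e.g. data/files
  root="${2:-.}"       # tree to search; defaults to the current directory
  for path in "$data_dir"/*; do
    [ -f "$path" ] || continue
    name=$(basename "$path")
    # -a: treat binary files as text      -R: recurse into directories
    # -q: quiet, stop at the first match  -F: fixed-string match, no regex
    # The exclusions mirror the ones used in the script from the issue.
    if ! grep -a -R -q -F \
         --exclude-dir=.git --exclude-dir=target \
         --exclude='*.q.out' --exclude='*.class' --exclude='*.jar' \
         -- "$name" "$root"; then
      echo "$name"
    fi
  done
}
```

Run from the repository root, it would be called as `list_unreferenced data/files .`. As in the original script, file contents under data/files are searched too, so a data file that mentions another data file's name counts as a reference.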