I need to find all files whose names match a pattern and whose content contains a certain string, and then print some metadata for those files with -printf, a la:

    find "${_SDIR}" -type f -iregex ".*${_X}" -printf '"%TD %TT",%Ts,%s,"%P"\n' > "${_TMPFL}" 2>&1

I am trying to do all steps in one go, which I think should be possible, but it is not working. That line prints both all the pattern-matched files, using the format specified in the -printf parameter, and (repeatedly) those that match the search inside the content, as plain paths (no -printf).

I know I am being silly, since that statement in fact contains two searches, one on the metadata and one through the content of the pattern-matched files, but you can see what I mean. There should be a way to do it in one fell swoop, or perhaps there is another utility for this. The thing is that I work on corpora research, and very often you need to search large numbers of text files in no time.

Also, instead of the lines of the first search (by extension) looking like:

    "12/15/18 12:14:16.0000000000",1544872456,2542,"OK/OK00/OK00Test.java"
    "12/15/18 11:28:49.0000000000",1544869729,85,"OK/OK00/OK00Test_main_cli_UTF-8.properties.txt"
    "12/15/18 11:30:45.0000000000",1544869845,296,"OK/logs/OK00Test_20181215053045.0413_err.properties.txt"
    "12/15/18 11:35:23.0000000000",1544870123,296,"OK/logs/OK00Test_20181215053523.0420_err.properties.txt"

I want the dates formatted a la yyyy-mm-dd hh:mm:ss (or dd.mm.yyyy hh:mm:ss, or whatever other way):

    "2018-12-15 12:14:16",1544872456,2542,"OK/OK00/OK00Test.java"
    "2018-12-15 11:28:49",1544869729,85,"OK/OK00/OK00Test_main_cli_UTF-8.properties.txt"
    "2018-12-15 11:30:45",1544869845,296,"OK/logs/OK00Test_20181215053045.0413_err.properties.txt"
    "2018-12-15 11:35:23",1544870123,296,"OK/logs/OK00Test_20181215053523.0420_err.properties.txt"

The full script I am using so far:

    # extensions (the leading dot is escaped so it only matches a literal ".")
    _X="\.\(java\|txt\)"

    # search directory
    _SDIR="/home/$(whoami)/java"

    # search string
    _W="java.io.UnsupportedEncodingException;"

    # start time
    _TM_START=$(date +%s); _DT=$(date +%Y%m%d%H%M%S)

    # log file
    _LOG_FL="find_${_DT}.log"
    echo "// __ \$_LOG_FL: |${_LOG_FL}|"

    # first temp file, created directly under its final name so no stray file is left behind
    _TMPFL=$(mktemp "${_LOG_FL%.*}_.XXXXXX")
    echo "// __ \$_TMPFL: |${_TMPFL}|"

    # first search: by file name only
    find "${_SDIR}" -type f -iregex ".*${_X}" -printf '"%TD %TT",%Ts,%s,"%P"\n' > "${_TMPFL}" 2>&1
    ls -l "${_TMPFL}"
    wc -l "${_TMPFL}"

    # second temp file, for the lines whose file also contains the search string
    _TMPFL02=$(mktemp "${_LOG_FL%.*}_02.XXXXXX")
    echo "// __ \$_TMPFL02: |${_TMPFL02}|"

    _LNS=$(wc -l "${_TMPFL}" | awk '{print $1}')
    echo "// __ \$_LNS: |${_LNS}|"

    # second search: through the content of each file found above
    _FND_CNT=0
    _IX=0
    while read -r _L; do
      # the relative path is the fourth double-quote-delimited field of the csv line
      _PTH=$(echo "${_L}" | awk -F '"' '{print $4}')
    # echo "// __ [$_IX/$_LNS): |${_L}|${_PTH}|"
      _IFL="${_SDIR}/${_PTH}"
      if [ -s "${_IFL}" ]; then
        _FND_W=$(cat "${_IFL}" | grep "${_W}")
    # echo "// __ [$_IX/$_LNS): |${_L}|${_PTH}|${_FND_W}|"
        # not empty string
        if [[ ! -z ${_FND_W} ]]; then
    # echo "// __ \$_IFL: |${_IFL}|"
          _FND_CNT=$(( _FND_CNT+1 ))
          echo "${_L}" >> "${_TMPFL02}"
        fi
      else
        echo "// __ File not found! \$_IFL: |$_IFL|"
      fi
      _IX=$(( _IX+1 ))
    done < "${_TMPFL}"
    rm -fv "${_TMPFL}"

    #
    _TM_END=$(date +%s); _TM_DIFF=$((_TM_END - _TM_START))
    echo "// __ |${_W}| found in |${_FND_CNT}| of |${_LNS}| files in ${_TM_DIFF} seconds"
    ls -l "${_TMPFL02}"
    wc -l "${_TMPFL02}"

    # fix "sort -k ..." as csv line from last/most recent modified down (reverse) after fixing date format ...
    #cat "${_TMPFL02}" | sort -k 3,3nr > "${_LOG_FL}"
    cat "${_TMPFL02}" | sort -k 1,1nr > "${_LOG_FL}"
    rm -fv "${_TMPFL02}"
    ls -l "${_LOG_FL}"
    wc -l "${_LOG_FL}"
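
A minimal sketch of how the two searches might be combined into a single find pass, assuming GNU find, grep, sed and sort. The function name `find_meta` and its three positional parameters are made up for illustration, and `grep -F` assumes the search string is meant literally rather than as a regular expression. The idea is to use `-exec grep -q` as a test, so that `-printf` only fires for files whose content matches, and to build the date from the individual `%T` directives instead of `%TD`:

```shell
#!/usr/bin/env bash
# Sketch only: find_meta is a hypothetical helper, not part of the script above.
# Usage: find_meta <search dir> <name regex for -iregex> <literal search string>
find_meta() {
    local sdir=$1 regex=$2 needle=$3
    # -exec ... \; is true only when grep exits 0, i.e. the file contains the string,
    # so the following -printf runs for matching files only.
    # %TY-%Tm-%Td %TH:%TM:%TS is yyyy-mm-dd hh:mm:ss, but GNU find's %TS carries a
    # fractional part, so sed strips it from the first (quoted) field.
    # sort orders by the epoch seconds in the second csv field, most recent first.
    find "$sdir" -type f -iregex "$regex" \
        -exec grep -qF -e "$needle" {} \; \
        -printf '"%TY-%Tm-%Td %TH:%TM:%TS",%Ts,%s,"%P"\n' |
        sed -E 's/^("[^"]*:[0-9]{2})\.[0-9]+"/\1"/' |
        sort -t, -k2,2nr
}
```

With the variables from the script it could be called as `find_meta "${_SDIR}" ".*${_X}" "${_W}" > "${_LOG_FL}"`. Note that `-exec ... \;` starts one grep process per candidate file, which may be too slow for very large corpora.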