On Fri, Jul 31, 2015 at 4:31 AM, <sutanu....@gmail.com> wrote:
> #!/bin/bash
>
> _maillist='pa...@email.com'
> _hname=`hostname`
> _logdir=/hadoop/logs
> _dirlog=${_logdir}/directory_check.log
>
> _year=$(date -d "-5 hour" +%Y)
> _month=$(date -d "-5 hour" +%m)
> _day=$(date -d "-5 hour" +%d)
> _hour=$(date -d "-5 hour" +%H)
>
> _hdfsdir=`hdfs dfs -ls -d /hadoop/flume_ingest_*/$_year/$_month | awk '{print $8}'`
>
> echo "Checking for HDFS directories:" > ${_dirlog}
> echo >> ${_dirlog}
>
> for _currdir in $_hdfsdir
> do
>     hdfs dfs -ls -d $_currdir/$_day/$_hour &>> ${_dirlog}
> done
>
> if [[ `grep -i "No such file or directory" ${_dirlog}` ]];
> then
>     echo "Verify Flume is working for all servers" | mailx -s "HDFS Hadoop Failure on Flume: ${_hname}" -a ${_dirlog} ${_maillist}
> fi
> --
> https://mail.python.org/mailman/listinfo/python-list

There are two basic approaches to this kind of job.

1) Go through every line of bash code and translate it into equivalent
Python code. You should then have a Python script which blindly and
naively accomplishes the same goal by the same method.

2) Start by describing what you want to accomplish, and then implement
that in Python, using algorithmic notes from the bash code.

The second option seems like a lot more work, but long-term it often
isn't, because you end up with better code. For example, bash lacks
decent timezone support, so I can well believe random832's guess that
your five-hour offset is a simulation of that; but Python can do much
better work with timezones, so you can get that actually correct.
Also, file handling, searching, text manipulation, and so on can
usually be done more efficiently and readably in Python directly than
by piping things through grep and awk.

ChrisA
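
To make the second approach a bit more concrete, here is a rough sketch
of how the check might look when written from the goal ("which ingest
directories have no folder for the current hour?") rather than line by
line. Several things here are assumptions and not from the original
script: all the function names are made up for illustration, the fixed
UTC-5 offset only mirrors the "-5 hour" guess (a named zone from
zoneinfo could be passed instead if that guess is right), and it checks
every /hadoop/flume_ingest_* directory rather than only those that
already have a year/month folder. It uses "hdfs dfs -test -d" to ask
about each directory directly instead of grepping a log for an error
message. Mailing the result is left out.

```python
import subprocess
from datetime import datetime, timedelta, timezone


def hour_path(base, when):
    """Return base/YYYY/MM/DD/HH for the given datetime."""
    return f"{base}/{when:%Y/%m/%d/%H}"


def parse_ls_paths(ls_output):
    """Pull the path column (field 8, like awk's $8) from `hdfs dfs -ls` output.

    Like the awk version, this does not cope with paths containing spaces.
    """
    paths = []
    for line in ls_output.splitlines():
        fields = line.split()
        if len(fields) >= 8:
            paths.append(fields[7])
    return paths


def hdfs_ingest_dirs(pattern="/hadoop/flume_ingest_*"):
    """List ingest directories; the glob is expanded by hdfs, not by a shell."""
    result = subprocess.run(
        ["hdfs", "dfs", "-ls", "-d", pattern],
        capture_output=True, text=True, check=True,
    )
    return parse_ls_paths(result.stdout)


def hdfs_dir_exists(path):
    """True if `hdfs dfs -test -d` reports that path is a directory."""
    return subprocess.run(["hdfs", "dfs", "-test", "-d", path]).returncode == 0


def missing_hour_dirs(base_dirs, when, exists=hdfs_dir_exists):
    """Return the hourly paths that do not exist under each base directory."""
    candidates = [hour_path(base, when) for base in base_dirs]
    return [path for path in candidates if not exists(path)]


def main():
    # Assumption: a fixed UTC-5 offset stands in for the script's "-5 hour".
    tz = timezone(timedelta(hours=-5))
    for path in missing_hour_dirs(hdfs_ingest_dirs(), datetime.now(tz)):
        print("missing:", path)
```

Calling main() on a machine with the hdfs command available would
print one line per missing directory. Because missing_hour_dirs takes
the existence check as a parameter, the path logic can be tried out
without a Hadoop cluster by passing in a plain function.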