Thanks Cameron Simpson for you suggestion and reply quite helpful :) On Wed, Jul 12, 2017 at 5:06 AM, Cameron Simpson <c...@zip.com.au> wrote:
> On 11Jul2017 22:01, Ganesh Pal <ganesh1...@gmail.com> wrote: > >> I am trying to open a file and check if there is a pattern has changed >> after the task got completed? >> >> file data: >> ........................................................ >> >> #tail -f /file.txt >> .......................................... >> Note: CRC:algo = 2, split_crc = 1, unused = 0, initiator_crc = b6b20a65, >> journal_crc = d2097b00 >> Note: Task completed successfully. >> Note: CRC:algo = 2, split_crc = 1, unused = 0, initiator_crc = d976d35e, >> journal_crc = a176af10 >> >> >> I have the below piece of code but would like to make this better more >> pythonic , I found regex pattern and exception handling poor here , any >> quick suggestion in your spare time is welcome. >> >> >> #open the existing file if the flag is set and check if there is a match >> >> log_file='/file.txt' >> flag_is_on=1 >> > > Use "True" instead of "1". A flag is a Boolean thing, and should use a > Boolean value. This lets you literally speak "true" and 'false" rather than > imoplicitly saying that "0 means false and nonzero means true". > > data = None >> > > There is no need to initialise data here because you immediately overwrite > it below. > > with open(log_file, 'r') as f: >> data = f.readlines() >> >> if flag_is_on: >> > > Oh yes. Just name this variable "flag". "_is_on" is kind of implicit. > > logdata = '\n'.join(data) >> > > Do other parts of your programme deal with the file data as lines? If not, > there is little point to reading the file and breaking it up into lines > above, then joining them together against here. Just go: > > with open(log_file) as f: > log_data = f.read() > > reg = "initiator_crc =(?P<ini_crc>[\s\S]*?), journal_crc" >> > > Normally we write regular expressions as "raw" python strings, thus: > > reg = r'initiator_crc =(?P<ini_crc>[\s\S]*?), journal_crc' > > because backslashes etc are punctuation inside normal strings. Within a > "raw" string started with r' nothing is special until the closing ' > character. This makes writing regular expressions more reliable. > > Also, why the character range "[\s\S]"? That says whitespace or > nonwhitespace i.e. any character. If you want any character, just say ".". > > crc = re.findall(re.compile(reg), logdata) >> > > It is better to compile a regexp just the once, getting a Regexp object, > and then you just use the compiled object. > > if not crc: >> raise Exception("Pattern not found in logfile") >> > > ValueError would be a more appropriate exception here; plain old > "Exception" is pretty vague. > > checksumbefore = crc[0].strip() >> checksumafter = crc[1].strip() >> > > Your regexp cannot start or end with whitespace. Those .strip calls are > not doing anything for you. > > This reads like you expect there to be exactly 2 matches in the file. What > if there are more or fewer? > > logging.info("checksumbefore :%s and checksumafter:%s" >> % (checksumbefore, checksumafter)) >> >> if checksumbefore == checksumafter: >> raise Exception("checksum not macthing") >> > > Don't you mean != here? > > I wouldn't be raising exceptions in this code. Personally I would make > this a function that returns True or False. Exceptions are a poor way of > returning "status" or other values. They're really for "things that should > not have happened", hence their name. > > It looks like you're scanning a log file for multiple lines and wanting to > know if successive ones change. Why not write a function like this > (untested): > > RE_CRC_LINE = re.compile(r'initiator_crc =(?P<ini_crc>[\s\S]*?), > journal_crc') > > def check_for_crc_changes(logfile): > old_crc_text = '' > with open(logfile) as f: > for line in f: > m = RE_CRC_LINE.match(line) > if not m: > # uninteresting line > continue > crc_text = m.group(0) > if crc_text != old_crc_text: > # found a change > return True > if old_crc_text == '': > # if this is really an error, you might raise this exception > # but maybe no such lines is just normal but boring > raise ValueError("no CRC lines seen in logfile %r" % (logfile,)) > # found no changes > return False > > See that there is very little sanity checking. In an exception supporting > language like Python you can often write code as if it will always succeed > by using things which will raise exceptions if things go wrong. Then > _outside_ the function you can catch any exceptions that occur (such as > being unable to open the log file). > > Cheers, > Cameron Simpson <c...@zip.com.au> > -- https://mail.python.org/mailman/listinfo/python-list