"Avi Gross" <avigr...@verizon.net> writes: > Just to be clear, Cameron, I retired very early and thus have had no reason > to use AWK in a work situation and for a while was not using UNIX-based > machines. I have no doubt I would have continued using WK as one part of my > toolkit for years albeit less often as I found other tools better for some > situations, let alone the kind I mentioned earlier that are not text-file > based such as databases. > > It is, as noted, a great tool and if you only had one or a few tools like it > available, it can easily be bent and twisted to do much of what the others > do as it is more programmable than most. But following that line of > reasoning, fairly simple python scripts can be written with python -c "..." > or by pointing to a script > > Anyone have a collection of shell scripts that can be used in pipelines > where each piece is just a call to python to do something simple?
I'm not doing that, but I am trying to replace a longish bash pipeline
with Python code.

Within Emacs, I often use Org mode[1] to generate data via some bash
commands and then visualise the data via Python. Thus, in a single Org
file I run

    /usr/bin/sacct -u $user -o jobid -X -S $start -E $end -s COMPLETED -n | \
    xargs -I {} seff {} | grep 'Efficiency' | sed '$!N;s/\n/ /' | awk '{print $3 " " $9}' | sed 's/%//g'

The raw numbers are formatted by Org into a table

    | cpu_eff | mem_eff |
    |---------+---------|
    |    96.6 |   99.11 |
    |   93.43 |   100.0 |
    |    91.3 |   100.0 |
    |   88.71 |   100.0 |
    |   89.79 |   100.0 |
    |   84.59 |   100.0 |
    |   83.42 |   100.0 |
    |   86.09 |   100.0 |
    |   92.31 |   100.0 |
    |   90.05 |   100.0 |
    |   81.98 |   100.0 |
    |   90.76 |   100.0 |
    |   75.36 |   64.03 |

I then read this into some Python code in the Org file and do something
like

    df = pd.DataFrame(eff_tab[1:], columns=eff_tab[0])
    cpu_data = df.loc[:, "cpu_eff"]
    mem_data = df.loc[:, "mem_eff"]

    ...

    n, bins, patches = axis[0].hist(cpu_data, bins=range(0, 110, 5))
    n, bins, patches = axis[1].hist(mem_data, bins=range(0, 110, 5))

which generates nice histograms.

I decided to rewrite the whole thing as a stand-alone Python program so
that I can run it as a cron job. However, as a novice Python programmer
I am finding translating the bash part slightly clunky. I am in the
middle of doing this and started with the following:

    import subprocess

    # user and period (start, end) are set elsewhere
    sacct = subprocess.Popen(["/usr/bin/sacct",
                              "-u", user,
                              "-S", period[0], "-E", period[1],
                              "-o", "jobid", "-X",
                              "-s", "COMPLETED", "-n"],
                             stdout=subprocess.PIPE,
                             )

    jobids = []
    for line in sacct.stdout:
        jobid = str(line.strip(), 'UTF-8')
        jobids.append(jobid)

    for jobid in jobids:
        # seff takes the job ID as an argument, so it needs no stdin
        seff = subprocess.Popen(["/usr/bin/seff", jobid],
                                stdout=subprocess.PIPE,
                                )
        seff_output = []
        for line in seff.stdout:
            seff_output.append(str(line.strip(), "UTF-8"))

        ...

but compared to the bash pipeline, this all seems a bit laboured.

Does anyone have a better approach?
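
For what it's worth, one direction that might read less laboured is
sketched below. It is only a sketch and has not been run against a real
Slurm installation: it assumes seff prints lines of the form
"CPU Efficiency: 96.60% of ..." and "Memory Efficiency: 99.11% of ..."
(example values), which is what the field positions in the awk command
above suggest, and the helper names parse_efficiency, run_lines and
collect_efficiencies are made up for illustration. subprocess.run()
waits for each command and hands back its output in one go, so the
explicit Popen read loops go away:

```python
import re
import subprocess

# Assumed shape of the two seff lines that the bash pipeline greps for, e.g.
#   CPU Efficiency: 96.60% of 00:10:21 core-walltime
#   Memory Efficiency: 99.11% of 4.00 GB
EFF_RE = re.compile(r"^(CPU|Memory) Efficiency:\s+([\d.]+)%", re.MULTILINE)


def parse_efficiency(seff_text):
    """Return (cpu_eff, mem_eff) as floats, or None if either is missing."""
    found = {kind: float(value) for kind, value in EFF_RE.findall(seff_text)}
    if "CPU" in found and "Memory" in found:
        return found["CPU"], found["Memory"]
    return None


def run_lines(cmd):
    """Run a command to completion and return its stdout as a list of lines."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


def collect_efficiencies(user, start, end):
    """Rough equivalent of the sacct | xargs seff | grep | sed | awk pipeline."""
    jobids = run_lines(["/usr/bin/sacct", "-u", user, "-S", start, "-E", end,
                        "-o", "jobid", "-X", "-s", "COMPLETED", "-n"])
    rows = []
    for jobid in (j.strip() for j in jobids):
        if not jobid:
            continue  # skip blank lines in the sacct output
        seff_text = "\n".join(run_lines(["/usr/bin/seff", jobid]))
        eff = parse_efficiency(seff_text)
        if eff is not None:
            rows.append(eff)
    return rows
```

The list of (cpu_eff, mem_eff) tuples returned by collect_efficiencies
could then presumably be passed to
pd.DataFrame(rows, columns=["cpu_eff", "mem_eff"]) in place of eff_tab.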

Cheers,

Loris

> -----Original Message-----
> From: Cameron Simpson <c...@cskk.id.au>
> Sent: Wednesday, March 24, 2021 6:34 PM
> To: Avi Gross <avigr...@verizon.net>
> Cc: python-list@python.org
> Subject: Re: convert script awk in python
>
> On 24Mar2021 12:00, Avi Gross <avigr...@verizon.net> wrote:
>>But I wonder how much languages like AWK are still used to make new
>>programs as compared to a time they were really useful.
>
> You mentioned in an adjacent post that you've not used AWK since 2000.
> By contrast, I still use it regularly.
>
> It's great for proof of concept at the command line or in small scripts, and
> as the innards of quite useful scripts. I've a trite "colsum" script which
> does nothing but generate and run a little awk programme to sum a column,
> and routinely type "blah .... | colsum 2" or the like to get a tally.
>
> I totally agree that once you're processing a lot of data from places or
> where a shell script is making long pipelines or many command invocations,
> if that's a performance issue it is time to recode.
>
> Cheers,
> Cameron Simpson <c...@cskk.id.au>

Footnotes:
[1]  https://orgmode.org/

-- 
This signature is currently under construction.