On 31/05/24 14:26, HenHanna via Python-list wrote:
> On 5/30/2024 2:18 PM, dn wrote:
>> On 31/05/24 08:03, HenHanna via Python-list wrote:
>>> Given a text file of a novel (JoyceUlysses.txt) ...
>>> could someone give me a pretty fast (and simple) Python program
>>> that'd give me a list of all words occurring exactly once?
>>> -- Also, a list of words occurring once, twice or 3 times
>>> re: hyphenated words (you can treat it anyway you like)
>>> but ideally, i'd treat [editor-in-chief]
>>> [go-ahead] [pen-knife]
>>> [know-how] [far-fetched] ...
>>> as one unit.
>> Split into words - defined as you will.
>> Use Counter.
>> Show some (of your) code and we'll be happy to critique...
> hard to decide what to do with hyphens
> and apostrophes
> (I'd, he's, can't, haven't, A's and B's)
> 2-step-Process
> 1. make a file listing all words (one word per line)
> 2. then, doing the counting. using
> from collections import Counter
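
For reference, the Counter approach mentioned above might be sketched as
follows. This is only one possible reading: it assumes a "word" is a run of
ASCII letters that may contain single internal hyphens or apostrophes (so
editor-in-chief and can't each count as one unit) and that case is ignored.
The regex and the names count_words, words_with_count and count_file are
illustrative, not anything specified in the thread.

```python
import re
from collections import Counter

# Assumed definition of a "word": ASCII letters, optionally joined by
# single internal hyphens or apostrophes (editor-in-chief, can't, A's).
# Typographic apostrophes and accented letters are NOT handled here.
WORD_RE = re.compile(r"[a-z]+(?:[-'][a-z]+)*")


def count_words(text):
    """Return a Counter of the lower-cased words found in text."""
    return Counter(WORD_RE.findall(text.lower()))


def words_with_count(counts, low, high=None):
    """Return the sorted words whose count lies between low and high inclusive."""
    if high is None:
        high = low
    return sorted(w for w, n in counts.items() if low <= n <= high)


def count_file(path, encoding="utf-8"):
    """Count the words in a text file (the encoding is an assumption)."""
    with open(path, encoding=encoding) as f:
        return count_words(f.read())


sample = "The editor-in-chief can't go; the go-ahead came. Go!"
sample_counts = count_words(sample)
print(words_with_count(sample_counts, 1))     # words occurring exactly once
print(words_with_count(sample_counts, 1, 3))  # words occurring 1 to 3 times
```

For the novel itself, something like words_with_count(count_file("JoyceUlysses.txt"), 1)
would give the once-only list. Doing the split and the count in one pass
avoids the intermediate one-word-per-line file, though the two-step version
works equally well.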
Apologies for lateness - only just able to come back to this.
This issue is not a Python problem, and it is not solved by code!
If you/your teacher can't define a "word", the code, any code, will
almost certainly be wrong!
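
To make that concrete, here is a small sketch that runs the same sentence
through two plausible definitions of "word". Both regexes are hypothetical
choices made up for this illustration; neither is "the" right one, which is
exactly the problem.

```python
import re
from collections import Counter

SAMPLE = "He's far-fetched; he is far from fetched."

# Definition A (assumed): hyphens and apostrophes join letters into one word.
JOINED = re.compile(r"[a-z]+(?:[-'][a-z]+)*")
# Definition B (assumed): any run of letters is a word; punctuation splits.
SPLIT = re.compile(r"[a-z]+")


def once_only(pattern, text):
    """Return the sorted words that occur exactly once under the given pattern."""
    counts = Counter(pattern.findall(text.lower()))
    return sorted(w for w, n in counts.items() if n == 1)


once_joined = once_only(JOINED, SAMPLE)
once_split = once_only(SPLIT, SAMPLE)
print(once_joined)
print(once_split)
```

Under definition A all seven tokens (including far-fetched and he's) occur
once; under definition B only from, is and a stray s do. Same text, same
Counter, different "correct" answers, and no test can say which one the
requester meant.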
One of the interesting aspects of our work is that we can write all
manner of tests to try to ensure that the code is correct: unit tests,
integration tests, system tests, acceptance tests, eye-tests, ...
However, there is no such thing as a test (or proof) that a statement of
requirements is complete or correct!
(nor for any of the other earlier stages of the full project life-cycle)
As coders we need to learn to require clear specifications and not
attempt to read-between-the-lines, use our initiative, or otherwise 'not
bother the ...'. When there is ambiguity, we should go back to the
user/client/boss and seek clarification. They are the
domain/subject-matter experts...
I'm reminded of a cartoon, possibly from some IBM source, first seen in
black-and-white but here in living-color:
https://www.monolithic.org/blogs/presidents-sphere/what-the-customer-really-wants
That has been the sad history of programming and development projects,
wherein we are blamed for every shortcoming because no one else
understands the nuances of such projects.
If we don't insist on clarity, are we our own worst enemy?
--
Regards,
=dn