On 5/30/2024 2:18 PM, dn wrote:
> On 31/05/24 08:03, HenHanna via Python-list wrote:
>> Given a text file of a novel (JoyceUlysses.txt) ...
>> could someone give me a pretty fast (and simple) Python program that'd
>> give me a list of all words occurring exactly once?
>> -- Also, a list of words occurring once, twice or 3 times
>> re: hyphenated words (you can treat it anyway you like)
>> but ideally, i'd treat [editor-in-chief]
>> [go-ahead] [pen-knife]
>> [know-how] [far-fetched] ...
>> as one unit.
> Split into words - defined as you will.
> Use Counter.
> Show some (of your) code and we'll be happy to critique...
It's hard to decide what to do with hyphens and apostrophes
(I'd, he's, can't, haven't, A's and B's).
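One possible way to settle the hyphen and apostrophe question is a regular expression that keeps them only when they sit between letters. The sketch below is an assumption about what counts as a "word", not a definitive rule: the names WORD_RE and words are made up for illustration, the pattern handles only ASCII letters and the straight apostrophe ('), and digits are ignored.

```python
import re

# Assumed definition of a "word": runs of letters, optionally joined by
# single internal hyphens or apostrophes (editor-in-chief, can't, A's).
WORD_RE = re.compile(r"[a-z]+(?:['-][a-z]+)*")

def words(text):
    """Return the lowercased words found in text, in order of appearance."""
    return WORD_RE.findall(text.lower())

print(words("The editor-in-chief said he'd go -- can't he? A's and B's."))
```

With this pattern, leading and trailing punctuation falls away on its own, and a bare "--" produces no word at all. If the text file uses typographic apostrophes, the character class would need to include them.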
A 2-step process:
1. Make a file listing all words (one word per line).
2. Then do the counting, using
   from collections import Counter
Related code (for step 1) that I'd used before:
# Write every word of the novel to Out.txt, lowercased, one per line.
with open("JoyceUlysses.txt", 'r') as Rfile, open('Out.txt', 'w') as fo:
    for line in Rfile:
        wLis = line.split()
        for w in wLis:
            # Strip punctuation from both ends of the word.
            w = w.strip(";:,'\"[]()*&^%$#@!,./<>?_-+=")
            # A token made only of punctuation is now empty; skip it.
            if w != "":
                fo.write(w.lower())
                fo.write('\n')
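Step 2, the counting, could then look something like the sketch below. It assumes input in the one-word-per-line format of Out.txt. The function names count_words and words_with_count are hypothetical, and a small in-memory list stands in for the file so the example runs on its own.

```python
from collections import Counter

def count_words(lines):
    """Count the words in an iterable that yields one word per line."""
    stripped = (line.strip() for line in lines)
    return Counter(w for w in stripped if w)  # skip blank lines

def words_with_count(counts, lo, hi):
    """Return, sorted, the words that occur between lo and hi times inclusive."""
    return sorted(w for w, n in counts.items() if lo <= n <= hi)

# Made-up sample standing in for the contents of Out.txt.
sample = ["the", "pen-knife", "the", "know-how", "the", "the", "pen-knife"]
counts = count_words(sample)

print(words_with_count(counts, 1, 1))  # words occurring exactly once
print(words_with_count(counts, 1, 3))  # words occurring once, twice or 3 times
```

For the real data, an open file can be passed in place of the list, since a file object also yields one line at a time: open Out.txt in a with block and call count_words on the file object.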
--
https://mail.python.org/mailman/listinfo/python-list