On 5/30/2024 2:18 PM, dn wrote:
On 31/05/24 08:03, HenHanna via Python-list wrote:

Given a text file of a novel (JoyceUlysses.txt) ...

could someone give me a pretty fast (and simple) Python program that'd give me a list of all words occurring exactly once?

               -- Also, a list of words occurring once, twice or 3 times



re: hyphenated words        (you can treat them any way you like)

        but ideally, i'd treat  [editor-in-chief]
                                [go-ahead]  [pen-knife]
                                [know-how]  [far-fetched] ...
        as one unit.



Split into words - defined as you will.
Use Counter.

Show some (of your) code and we'll be happy to critique...


hard to decide what to do with hyphens
               and apostrophes
             (I'd,  he's,  can't, haven't,  A's  and  B's)
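
One possible way to settle the hyphen and apostrophe question is a regular expression that keeps them only when they sit between letters. This is a minimal sketch, not a definitive tokenizer: the names WORD_RE and tokenize are made up for illustration, and the pattern assumes plain ASCII letters and straight apostrophes, so accented words and curly quotes in the novel would need a wider character class.

```python
import re

# Hypothetical pattern: a run of ASCII letters, optionally followed by
# more runs of letters joined by a single hyphen or apostrophe.
WORD_RE = re.compile(r"[A-Za-z]+(?:[-'][A-Za-z]+)*")

def tokenize(text):
    """Return lowercased words; inner hyphens and apostrophes are kept."""
    return [m.group(0).lower() for m in WORD_RE.finditer(text)]

print(tokenize("I'd say the editor-in-chief can't -- well -- go-ahead."))
# ["i'd", 'say', 'the', 'editor-in-chief', "can't", 'well', 'go-ahead']
```

With this pattern a free-standing dash ("--") yields no token, while [editor-in-chief] and [can't] each stay one unit.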


2-step-Process

          1. make a file listing all words (one word per line)

          2.  then, doing the counting.  using
                              from collections import Counter


Related code  (for 1)  that i'd used before:

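PUNCT = ";:,'\"[]()*&^%$#@!,./<>?_-+="

# Step 1: write one lowercased word per line to Out.txt.
# Both files are opened in one "with" so both get closed.
with open("JoyceUlysses.txt", "r") as rfile, open("Out.txt", "w") as fo:
    for line in rfile:
        for w in line.split():
            # strip punctuation from both ends; inner hyphens and apostrophes stay
            w = w.strip(PUNCT)
            if w:    # skip tokens that were nothing but punctuation, e.g. "--"
                fo.write(w.lower() + "\n")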
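
Step 2 could then look roughly like the following. It is only a sketch under stated assumptions: load_counts and words_in_range are hypothetical helper names, and the file is assumed to hold one word per line as written in step 1. The demo runs on a short in-memory list so it works without the novel.

```python
from collections import Counter

def load_counts(path):
    """Count the words in a one-word-per-line file, skipping blank lines."""
    with open(path) as f:
        return Counter(line.strip() for line in f if line.strip())

def words_in_range(counts, lo, hi):
    """Return the sorted words whose count n satisfies lo <= n <= hi."""
    return sorted(w for w, n in counts.items() if lo <= n <= hi)

# Small in-memory demo; for the real run use: counts = load_counts("Out.txt")
counts = Counter(["yes", "i", "said", "yes", "i", "will", "yes", "yes"])

once = words_in_range(counts, 1, 1)         # words occurring exactly once
up_to_three = words_in_range(counts, 1, 3)  # words occurring 1, 2 or 3 times

print(once)         # ['said', 'will']
print(up_to_three)  # ['i', 'said', 'will']
```

Counter makes a single pass over the words, so this should be fast enough for a text the size of a novel.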

--
https://mail.python.org/mailman/listinfo/python-list
