Re: How can I count word frequency in a web site?

Michiel Overtoom Sun, 29 Nov 2015 23:59:40 -0800

> On 30 Nov 2015, at 03:54, ryguy7272 <ryanshu...@gmail.com> wrote:
> 
> Now, how can I count specific words like 'fraud' and 'lawsuit'?


- convert the page to plain text
- remove any punctuation
- split into words
- see what words occur
- enumerate all the words and increase a counter for each word

Something like this:

s = """Today we're rounding out our planetary tour with ice giants Uranus
and Neptune. Both have small rocky cores, thick mantles of ammonia, water,
and methane, and atmospheres that make them look greenish and blue. Uranus
has a truly weird rotation and relatively dull weather, while Neptune has
clouds and storms whipped by tremendous winds. Both have rings and moons,
with Neptune's Triton probably being a captured iceball that has active
geology."""

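import collections

# Lowercase, turn newlines into spaces, and strip the punctuation used here.
cleaned = (s.lower().replace("\n", " ").replace(".", "")
           .replace(",", "").replace("'", " "))
# split() without an argument also drops empty strings from repeated spaces.
count = collections.Counter(cleaned.split())
for interesting in ("neptune", "and"):
    print("The word '%s' occurs %d times" % (interesting, count[interesting]))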


# Outputs:

The word 'neptune' occurs 3 times
The word 'and' occurs 7 times
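
The snippet above starts from plain text, so the first step (convert the page to plain text) is not shown. Here is a minimal sketch of that step, assuming Python 3 and only the standard library. The TextExtractor class, the count_words helper and the sample page are made up for illustration; fetching the real page (for example with urllib.request.urlopen) is left out. It also swaps the chain of replace() calls for a regular expression that treats every run of letters as a word, which is a different technique from the one above.

```python
import collections
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collect text nodes, skipping <script>/<style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that is not inside a <script> or <style> element.
        if not self._skip_depth:
            self.chunks.append(data)


def count_words(html):
    """Return a Counter of lowercase words found in an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    # Runs of ASCII letters are words; everything else separates them.
    return collections.Counter(re.findall(r"[a-z]+", text))


# Made-up sample page, standing in for the downloaded HTML.
page = """<html><head><style>p {color: red}</style></head>
<body><h1>Fraud trial</h1>
<p>The lawsuit alleges fraud. A second lawsuit, also about fraud, followed.</p>
</body></html>"""

count = count_words(page)
for interesting in ("fraud", "lawsuit"):
    print("The word '%s' occurs %d times" % (interesting, count[interesting]))
```

For this sample it reports 3 occurrences of 'fraud' and 2 of 'lawsuit'; the words inside the style block are not counted. The [a-z]+ pattern ignores digits and accented letters, so it would need adjusting for non-English pages.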




-- 
https://mail.python.org/mailman/listinfo/python-list
