On Thu, 26 Jun 2003 13:25:39 +0200, you wrote:

># i would like a similar function that removes punctuation like "." etc.
># all i want remaining in the array are the separate words, all in lower
>case

How do you define a word? Is a hyphen part of a word? Do you want to strip
out numbers? Make a list of valid characters, and remove everything else.
For example, if my valid characters were the set

        ' abcdef';

I'd do something like this:

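        // Keep only the characters of $s that also appear in $v.
        function strfilter($s, $v) {
                $r = '';
                $len = strlen($s);
                for ($i = 0; $i < $len; $i++) {
                        // Compare against false explicitly: strchr() can
                        // return the string '0', which PHP treats as false.
                        if (strpos($v, $s[$i]) !== false) {
                                $r .= $s[$i];
                        }
                }
                return $r;
        }

        // $string is the text you want to clean; lower-case it first
        echo strfilter(strtolower($string), ' abcdef');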
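
If regular expressions are an option, the same "keep only the valid
characters" idea can be sketched more compactly. This is only a sketch:
text_to_words() is a made-up name, and it assumes a word is a run of the
letters a-z, so digits, hyphens and accented letters are all dropped.

```php
<?php
// Hypothetical helper: lower-case the text, delete every character that
// is not a-z or whitespace, then split on runs of whitespace.
function text_to_words($text) {
    $text = strtolower($text);
    $text = preg_replace('/[^a-z\s]/', '', $text);
    // PREG_SPLIT_NO_EMPTY drops empty strings from leading/trailing spaces
    return preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
}

$words = text_to_words("The computer. The COMPUTER, again!");
print_r($words);
```

Note that with this definition "well-known" comes out as "wellknown"; if
you want hyphens kept, add them to the character class.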

>       sort($array); # sort array elements in alphabetical order
>       $num = count($array); # count the number of elements in the array
>       for ($c=0; $c<$num; $c++) #as long as function has not reached past last
>element in array
>       {
>               $wordInText = $array[$c]; #array element is word in text
>               echo "$wordInText<br>"; #print word in text
>       }
>
># i would like a function that pushes this word into a second array.
># before pushing, it has to check whether or not the same word is already
>in the array. 

Searching the second array for every word gets slow as the text grows. A
keyed lookup is much faster, and PHP arrays are hash tables under the hood,
so you get that for free if you use the word itself as the array key.

># if it is: do not push word into array, but add "1" to the number of occurrences
>of that word
># if it is not: push this new word into array
># all of this has to result into a word - frequency array (content analysis
>of free text)
># question 1: how do i produce such an array?
># question 2: how do i get the two elements (word and number of occurrences)

Use the word as the key and the count as the value:

        $results = array();

        // $array is your list of words from above
        foreach ($array as $word) {
                if (isset($results[$word])) {
                        $results[$word] += 1;
                } else {
                        $results[$word] = 1;
                }
        }
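
If you'd rather not write the loop yourself, PHP's built-in
array_count_values() does the same counting in one call. A minimal
sketch, with a made-up $words array standing in for your real data:

```php
<?php
// Sample input (an assumption for illustration): already lower-cased words.
$words = array('the', 'computer', 'the', 'computer', 'again');

// array_count_values() returns an array of value => number of occurrences.
$results = array_count_values($words);

print_r($results);
```

The result has the same word => count shape as the hand-written loop, so
the rest of your code doesn't need to care which one you used.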

># together out of the array and print them to the screen?
># f.e.: the word "computer" occurred two times in this text.

Loop over the array with foreach:

        foreach ($results as $key=>$value) {
                echo ("the word $key appears $value times in this text<br>");
        }
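
For content analysis you probably want the most frequent words first.
Here's a rough sketch using arsort(), which sorts by value in descending
order and keeps the keys attached. The $results data is made up for the
example.

```php
<?php
// Hypothetical sample data in the word => count shape described above.
$results = array('again' => 1, 'computer' => 2, 'text' => 3);

// Sort by count, highest first, preserving the word keys.
arsort($results);

$lines = array();
foreach ($results as $word => $count) {
    $lines[] = "the word $word appears $count times in this text";
}
echo implode("<br>\n", $lines);
```

Use ksort() instead if you want the words in alphabetical order.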


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
