On Fri, Jan 16, 2015 at 09:54:08PM +0100, Benjamin Oppermann wrote:

Hello,

This is basically a question about cleaning up my data: I made a survey where for some of the questions, the answer could be any number of keywords. The informant could be more or less specific to their liking. This was intended to work like tags (they are comma-separated), but we didn't consider that they would still end up in the same column of the output.

This means, for example the values for "place of origin" could be as precise as "Hannover, Niedersachsen, Germany, Europe" or "Hannover, Germany", or just "Hannover". All in the same variable!

I currently have this variable specified as a string in PSPP, which means each represents a different value for PSPP. I want to achieve an outcome where all these values would read the same, i.e. the region, "Nidersachsen" in this example.

Since my sample includes ~360 cases, I'd like to find a way to do this automatically. I might be able to do it in another program like a spreadsheet application or GoogleRefine/OpenRefine, but maybe do you know a way to recode this variable in PSPP?

Any suggestions?

Regards,
Ben
There are a number of string functions that you could use to parse the string variable into several other variables. See section 7.7.7 of the manual.

But I think I agree with the other people that this is best done manually. Doing it automatically is an exercise in artificial intelligence. One could use a self-learning neural net to deal with typos, e.g. "Niedersachsen" vs. "Nidersachsen". A clever algorithm could also deal with umlauts. But it would be very hard to write an automated program that, without being specifically programmed for it, could know that "Munich" and "Muenchen" are the same place.

J'

-- 
PGP Public key ID: 1024D/2DE827B3
fingerprint = 8797 A26D 0854 2EAB 0285 A290 8A67 719C 2DE8 27B3
See http://sks-keyservers.net or any PGP keyserver for public key.
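For anyone who does the cleanup outside PSPP, here is a rough sketch of the semi-automatic route in Python. It is only an illustration under stated assumptions: the lookup table REGION_OF, the helper names normalize and region_for, and the 0.85 similarity cutoff are all made up for this example, and the table still has to be filled in by hand. That is exactly the "specifically programmed" part: the script can absorb umlauts and small typos, but it only knows that "Munich" and "Muenchen" are the same place because both are listed.

```python
import difflib

# Hypothetical, hand-built lookup table: known place or region name -> region.
# Every entry is an illustrative assumption, not data from the survey.
REGION_OF = {
    "hannover": "Niedersachsen",
    "niedersachsen": "Niedersachsen",
    "munich": "Bayern",
    "muenchen": "Bayern",
    "bayern": "Bayern",
}

# Transliterate umlauts so that "München" and "Muenchen" compare equal.
UMLAUTS = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def normalize(tag):
    """Trim whitespace, lowercase, and transliterate umlauts."""
    return tag.strip().lower().translate(UMLAUTS)


def region_for(answer, cutoff=0.85):
    """Return the region for a comma-separated answer, or None if unknown."""
    for tag in answer.split(","):
        key = normalize(tag)
        if key in REGION_OF:
            return REGION_OF[key]
        # Tolerate small typos such as "Nidersachsen" via fuzzy matching.
        close = difflib.get_close_matches(key, list(REGION_OF), n=1, cutoff=cutoff)
        if close:
            return REGION_OF[close[0]]
    return None


if __name__ == "__main__":
    for answer in ["Hannover, Niedersachsen, Germany, Europe",
                   "Hannover, Germany", "Nidersachsen", "München"]:
        print(repr(answer), "->", region_for(answer))
```

Answers that come back as None are the ones left to recode by hand; with ~360 cases that remainder should be small enough to check manually.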
_______________________________________________ Pspp-users mailing list Pspp-users@gnu.org https://lists.gnu.org/mailman/listinfo/pspp-users