As I see it, you have two choices, neither of which is simple, error-proof, or quick.
First, you could try to search for keywords and create a new variable with a numeric value that you label "Niedersachsen" whenever you detect "Hannover" or "Niedersachsen". I assume this would be error-prone and tedious. Maybe Germans never make typos, but you might be able to outsmart typos with soundex (or a better, ideally germanized, algorithm). See here for PSPP syntax:

https://groups.google.com/forum/#!topic/comp.soft-sys.stat.spss/Ry-bwrV8pFY

Second, you could code the responses by hand, which is what I do when I have free-response data that I want to recode: I run a frequencies on the data, export the unique responses to a spreadsheet, and manually assign codes to them. Then I construct SPSS/PSPP code to recode the data to the desired codes. If the number of unique free responses is a lot less than 360, then you have saved yourself some time.

By "construct SPSS/PSPP code to recode the data to the desired codes" I mean this: let's say that your free responses are in Column A (starting in A2) and you have manually entered the desired recodes in Column B (starting in B2). In SPSS your variable is called V1 and you want to recode it into V1_recoded. Then in cell C2 I insert a formula like:

="if V1 = '"&A2&"' V1_recoded="&B2&"."

So, if A2 held "Hannover, Niedersachsen, Germany, Europe" and B2 held 1, this would create the SPSS/PSPP syntax:

if V1 = 'Hannover, Niedersachsen, Germany, Europe' V1_recoded=1.

Copy C2 down for all rows with data and then paste all the cells of column C into PSPP as a long block of IF statements. Add an EXECUTE at the end and run it.

-Alan

On 1/16/2015 2:54 PM, Benjamin Oppermann wrote:
> Hello,
> This is basically a question about cleaning up my data:
> I made a survey where for some of the questions, the answer could be any
> number of keywords. The informant could be more or less specific to
> their liking. This was intended to work like tags (they are
> comma-separated), but we didn't consider that they would still end up in
> the same column of the output.
> This means, for example the values for "place of origin" could be as
> precise as "Hannover, Niedersachsen, Germany, Europe" or "Hannover,
> Germany", or just "Hannover". All in the same variable!
> I currently have this variable specified as a string in PSPP, which
> means each represents a different value for PSPP.
> I want to achieve an outcome where all these values would read the same,
> i.e. the region, "Niedersachsen" in this example. Since my sample
> includes ~360 cases, I'd like to find a way to do this automatically.
> I might be able to do it in another program like a spreadsheet
> application or GoogleRefine/OpenRefine, but maybe do you know a way to
> recode this variable in PSPP? Any suggestions?
> Regards,
> Ben
>
> _______________________________________________
> Pspp-users mailing list
> Pspp-users@gnu.org
> https://lists.gnu.org/mailman/listinfo/pspp-users

-- 
Alan D. Mead, Ph.D.
President, Talent Algorithms Inc.

science + technology = better workers

+815.588.3846 (Office)
+267.334.4143 (Mobile)

http://www.alanmead.org

Announcing the Journal of Computerized Adaptive Testing (JCAT), a
peer-reviewed electronic journal designed to advance the science and
practice of computerized adaptive testing: http://www.iacat.org/jcat
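
As a rough sketch of the spreadsheet approach described in the reply, the short Python script below builds the same block of IF statements from hand-assigned codes. The function names and the example codes are made up for illustration; V1 and V1_recoded are the variable names used in the reply. It assumes embedded single quotes should be escaped by doubling them, which is the usual SPSS/PSPP string convention and something the spreadsheet formula does not handle.

```python
def spss_quote(text):
    """Wrap text in single quotes, doubling any embedded single quote."""
    return "'" + text.replace("'", "''") + "'"


def build_if_block(codes, source="V1", target="V1_recoded"):
    """Return one IF statement per (response, code) pair, then EXECUTE."""
    lines = [
        f"if {source} = {spss_quote(response)} {target}={code}."
        for response, code in codes.items()
    ]
    lines.append("execute.")
    return "\n".join(lines)


# Hypothetical hand-assigned codes: 1 stands for Niedersachsen.
codes = {
    "Hannover, Niedersachsen, Germany, Europe": 1,
    "Hannover, Germany": 1,
    "Hannover": 1,
}

syntax = build_if_block(codes)
print(syntax)
```

The printed text can be pasted into a PSPP syntax window in place of column C. In practice the codes would be read from the exported spreadsheet rather than typed into the script.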