Regarding the question of whether there is a ‘codebook format’ that PSPP could implement an import function for, the only thing that comes to my mind is DDI-XML. That is the standard that the social science data archives have developed and now implemented for their data, so PSPP would actually have a base of material to import. DDI as such happens to be quite complex, but it would not be necessary to implement its full scope just to get in the attributes that PSPP can digest.
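To illustrate how small such a DDI subset could be, here is a minimal Python sketch that pulls only variable labels, value labels, and missing-value flags out of a DDI Codebook fragment. The sample XML is a hand-written, heavily simplified fragment, not taken from any archive; the element names (`var`, `labl`, `catgry`, `catValu`), the `missing="Y"` attribute, and the `ddi:codebook:2_5` namespace are assumptions based on the DDI Codebook 2.5 schema and should be checked against real files. The function name `read_variables` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of DDI Codebook 2.5; real files may use another version.
NS = {"ddi": "ddi:codebook:2_5"}

# Hand-written, simplified sample (an assumption for illustration only).
SAMPLE = """<codeBook xmlns="ddi:codebook:2_5">
  <dataDscr>
    <var name="V3">
      <labl>Sex of respondent</labl>
      <catgry><catValu>1</catValu><labl>male</labl></catgry>
      <catgry><catValu>2</catValu><labl>female</labl></catgry>
      <catgry missing="Y"><catValu>9</catValu><labl>no answer</labl></catgry>
    </var>
  </dataDscr>
</codeBook>"""


def read_variables(xml_text):
    """Return one dict per <var> with its label and its categories."""
    root = ET.fromstring(xml_text)
    variables = []
    for var in root.findall("ddi:dataDscr/ddi:var", NS):
        categories = []
        for cat in var.findall("ddi:catgry", NS):
            categories.append({
                "value": cat.findtext("ddi:catValu", default="", namespaces=NS).strip(),
                "label": cat.findtext("ddi:labl", default="", namespaces=NS).strip(),
                # Categories flagged missing="Y" would map to user-missing values.
                "missing": cat.get("missing") == "Y",
            })
        variables.append({
            "name": var.get("name"),
            "label": var.findtext("ddi:labl", default="", namespaces=NS).strip(),
            "categories": categories,
        })
    return variables


if __name__ == "__main__":
    for variable in read_variables(SAMPLE):
        print(variable)
```

Everything else in a DDI document (study description, file description, question text, and so on) is simply ignored here, which is the point: an importer could start with just these three attributes.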
But if the idea now is to add a function that transfers dictionary information from one file to another, which is a different thing from codebook import but perhaps just as useful in a range of other situations, I would personally propose going the syntax route. That means: read in the source data file, automatically produce PSPP syntax that codes the labels and missing-value information, and apply that to the target file (or leave it to the user to do that, after any necessary modifications that the algorithm would never be able to predict).

Markus Quandt

From: Pspp-users <pspp-users-bounces+markus.quandt=gesis....@gnu.org> On Behalf Of Elio Spinello
Sent: Friday, January 21, 2022 7:55 PM
To: 'Ben Pfaff' <b...@cs.stanford.edu>; 'Alan Mead' <am...@alanmead.org>
Cc: 'pspp-users' <pspp-users@gnu.org>
Subject: RE: Import Codebook

If memory serves me correctly, there is a Copy Data Properties tool that allows you to select another dataset or unopened SAV file and then copy the data properties from it into the active dataset. Or you can copy and paste portions of the datasheet from one dataset to another.

https://www.youtube.com/watch?v=ZakhRd4aDAQ

I would think that one of those approaches would probably be the easiest to work with for both developers and users.

Elio Spinello

Elio Spinello, EdD
RPM Consulting, LLC
27943 Seco Canyon Rd #320
Santa Clarita, CA 91350-3872
Office: 818-831-7607
Cell: 818-570-3546

From: Pspp-users <pspp-users-bounces+espinello=rpmconsulting....@gnu.org> On Behalf Of Ben Pfaff
Sent: Friday, January 21, 2022 10:29 AM
To: Alan Mead <am...@alanmead.org>
Cc: pspp-users <pspp-users@gnu.org>
Subject: Re: Import Codebook

If PSPP were to add a feature to import a codebook, what format should it be able to import it from?

On Fri, Jan 21, 2022 at 10:20 AM <am...@alanmead.org> wrote:

Yes, but variable labels aren't always that big a deal; value labels can be more critical. You should rename/label, but it's fairly easy to remember that V3 is sex. Good luck, however, remembering what the five responses 1, 2, 3, 4, 5 mean...

Elio ninja'd me last night because I spent a few minutes googling whether there was a way to import a code book. I don't think there is, and that's a shame. Labeling data is so important, and labels are such an improvement in the SAV file format (over, say, SQL or CSV).

I guess the other way to deal with this is to not use codes in the dataset, in favor of response strings. So, the Sex variable might have values: 'male', 'female', 'non-binary', etc. And I guess if you had your labels in a spreadsheet you could probably arrange to use INDEX/MATCH to replace the codes with response strings that would be clear to anyone looking at the data.

Of course, that solves the labeling in a way, but when you import your data into PSPP, you then have to write a bunch of syntax to change those strings (for what are really numeric variables, like Likert responses) into numeric values to be used in analysis. And, I guess, ideally you'd want those numeric variables to have sensible value labels.

-Alan

On 1/21/2022 11:50 AM, jhwh...@techwriteinc.com wrote:

If I understand the issue correctly, variable labels are not being installed when importing some Excel files into PSPP. Is this correct?

Take care,
John

Email: jhwh...@techwriteinc.com

From: Pspp-users <pspp-users-bounces+jhwhite=techwriteinc....@gnu.org> On Behalf Of Alan Mead
Sent: Thursday, January 20, 2022 9:23 PM
To: Marek Ludwig <marek.lud...@fh-potsdam.de>; pspp-users@gnu.org
Cc: Katja Behrndt <katja.behr...@fh-potsdam.de>
Subject: Re: Import Codebook

I find applying labels to be very time-consuming, so maybe that's bad news for you. Maybe someone else will have a great idea.

But to make it as quick as possible, I'd recommend that you generate syntax and execute that syntax. I think that will be MUCH quicker than individually clicking and editing these values using the graphical user interface. A lot of people are scared of syntax, but it's not so hard. An added advantage of doing it this way is that you can easily fix an error by fixing the syntax and re-running it.

Also, if you have the information in a spreadsheet, I would try to generate the syntax using formulas in the spreadsheet. If column A contained the SPSS variable name (maybe "V1") and column B contained the variable label, then into cell C1 I would insert:

    ="variable labels "&A1&" '"&B1&"'."

(Note that there are single quotes, inside the double quotes, around B1 because it's a string.) If A1 = V1 and B1 = Beschriftung then this would generate:

    variable labels V1 'Beschriftung'.

And if you paste that into a syntax window, add the line "Execute." and run it, it would label this variable. You could fill the formula down 200 rows, paste all 200 rows of column C, add "Execute." and create the 200 variable labels very easily.

The value labels could be done similarly but I'd have to see the spreadsheet to devise the correct formula(s)...
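
For anyone who prefers a script to spreadsheet formulas, the same generate-the-syntax idea can be sketched in Python, including the value labels. This is only a sketch under stated assumptions: the codebook layout (columns `variable`, `value`, `label`, with an empty `value` marking the row that carries the variable label) and the names `CODEBOOK_CSV`, `quote`, and `codebook_to_syntax` are made up for illustration, and the values are written unquoted on the assumption that the variables are numeric. Embedded single quotes are doubled, which is how PSPP expects them inside a single-quoted string.

```python
import csv
import io

# Hypothetical codebook layout (an assumption, not from the thread):
# a row with an empty "value" holds the variable label,
# every other row holds one value label for that variable.
CODEBOOK_CSV = """variable,value,label
V1,,Beschriftung
V3,,Sex of respondent
V3,1,male
V3,2,female
"""


def quote(text):
    """Wrap text in single quotes, doubling any embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def codebook_to_syntax(csv_text):
    """Turn codebook rows into VARIABLE LABELS and VALUE LABELS commands."""
    var_labels = {}
    value_labels = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        name, value, label = row["variable"], row["value"], row["label"]
        if value == "":
            var_labels[name] = label
        else:
            value_labels.setdefault(name, []).append((value, label))

    lines = []
    for name, label in var_labels.items():
        lines.append(f"VARIABLE LABELS {name} {quote(label)}.")
    for name, pairs in value_labels.items():
        # Values are emitted unquoted, assuming numeric variables.
        body = " ".join(f"{value} {quote(label)}" for value, label in pairs)
        lines.append(f"VALUE LABELS /{name} {body}.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(codebook_to_syntax(CODEBOOK_CSV))
```

The printed commands can be pasted into a syntax window and run in the same way as the spreadsheet-generated lines; a real codebook would first have to be reshaped into whatever layout the script expects.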

This page describes the syntax:
http://www.statsmakemecry.com/smmctheblog/using-syntax-to-assign-variable-labels-and-value-labels-in-s.html

This includes my solution and suggests an alternative (that may not work with PSPP):
https://www.reddit.com/r/spss/comments/mobw0z/import_excel_file_while_maintaining_variable/

Here are the relevant PSPP manual pages:
https://www.gnu.org/software/pspp/manual/html_node/VALUE-LABELS.html
https://www.gnu.org/software/pspp/manual/html_node/VARIABLE-LABELS.html
https://www.gnu.org/software/pspp/manual/html_node/MISSING-VALUES.html

-Alan

On 1/19/2022 9:01 AM, Marek Ludwig wrote:

Dear All,

we have read in a CSV dataset that we had generated from an Excel file. Unfortunately, the codebook got lost in the process, so that the columns for labels ("Beschriftung"), value labels ("Wertelabels") and missing values ("Fehlende Werte") are empty. Since our dataset has over 200 variables, filling them in manually would be very time-consuming.

Is there an efficient, faster solution to read in the codebook or fill in these columns? I would be very grateful for a hint!

Thanks a lot,
Marek

--
Alan D. Mead, Ph.D.
President, Talent Algorithms Inc.
science + technology = better workers
https://talalg.com

Hofstadter's Law: It always takes longer than you expect, even when you take into account Hofstadter's Law.

--
Alan D. Mead, Ph.D.
President, Talent Algorithms Inc.
science + technology = better workers
https://talalg.com

Going was easy. Keep on going was hard.
-- Ursula K. Le Guin