Reading and Writing of long String Variables from SPSS #119
Comments
Very similar to #118. Reported to ReadStat for them to take a look. |
@ofajardo Is there any news regarding this bug? I just stumbled across this problem again when reading SPSS data with long strings. Some standard code suddenly stopped working, and it took me ages to realise that it was down to this problem again (columns being split without any warning). |
no news, sorry |
the issue can be replicated in pure C: WizardMac/ReadStat#260 |
@ofajardo Since I keep encountering this issue, I spent some time creating data to reproduce it, in case that is of any help in finding the bug (sadly I don't have the skills to actually help solve it). The error shows up in a lot of different ways when the file is opened in SPSS, so I tried to find a few examples.
|
I also tried the other examples and all of them seem good now. Closing this. |
Hi @ofajardo and thanks for testing! I just installed the newest version (1.2.1), but the problems from this issue haven't changed. Did you open the file in SPSS, or how did you check whether it worked? When the same files are read back into Python after being written with pyreadstat, the split columns don't appear. But when opened in SPSS, they are split. When I created the file directly in SPSS and then read it with pyreadstat, the variables were kept the way they should be. |
I see, I was checking by reading them with pyreadstat only. I am re-opening this issue then. Now I realize the issue was always that pyreadstat was reading the file correctly but SPSS was not. |
I'm also experiencing a similar issue:
In the SPSS file that's created the length is set at 255. If I set it as 255 or less it will work, but anything higher than that and it will default to 255. |
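Given the 255 limit reported above, one way to avoid being surprised is to check string widths before writing. The following is a minimal sketch in plain Python, assuming that widths are counted in UTF-8 bytes and that 255 is the relevant threshold (both are assumptions taken from this thread, not from pyreadstat's documentation). The name `find_long_string_columns` and the sample data are hypothetical.

```python
# Threshold taken from the comment above; an assumption, not a pyreadstat constant.
ASSUMED_WIDTH_LIMIT = 255


def find_long_string_columns(columns, limit=ASSUMED_WIDTH_LIMIT, encoding="utf-8"):
    """Return {column name: widest value in bytes} for columns wider than limit.

    `columns` maps a column name to a list of values; non-string values
    (for example None) are ignored.
    """
    too_long = {}
    for name, values in columns.items():
        widths = [len(v.encode(encoding)) for v in values if isinstance(v, str)]
        widest = max(widths, default=0)
        if widest > limit:
            too_long[name] = widest
    return too_long


# Hypothetical sample data: one column at the limit, one above it.
data = {
    "short": ["a" * 255, "b"],
    "long": ["x" * 300, None],
}
print(find_long_string_columns(data))
```

With a pandas DataFrame, something like `{c: df[c].tolist() for c in df.columns}` could feed this function. It only flags the columns that might be affected; it does not work around the splitting itself.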
I am running into the same issue. The names of the variables that are being created seem quite unpredictable, which makes writing a hacky quick fix difficult. Hopefully our friends at ReadStat can look into it! |
When reading and writing spss files with long string variables, the respective variable is being split into several variables.
Reproducing Writing Issue:
When this file is opened in SPSS, instead of 2 variables, it contains 5 ("LongString2" is followed by "V2_A1", "V2_A2", "V2_A3").
When read back into Python with pyreadstat it only shows the 2 created variables.
Strangely, when only "LongString2" is created and written, or when its variable name is shorter ("LongStr"), the splitting does not occur.
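Since the split happens without any warning, a simple name comparison can at least make it visible. This is only a sketch: `find_unexpected_columns` is a hypothetical helper, `"FirstVar"` is a placeholder for the first variable (its real name is not given here), and the list of names seen in SPSS is typed in by hand from the description above rather than read from a file.

```python
def find_unexpected_columns(written, seen):
    """Return the names in `seen` that were never written, in their original order."""
    written_set = set(written)
    return [name for name in seen if name not in written_set]


# Names written from Python ("FirstVar" is a placeholder, see above).
written = ["FirstVar", "LongString2"]

# Names as they appear in SPSS according to the description above.
seen_in_spss = ["FirstVar", "LongString2", "V2_A1", "V2_A2", "V2_A3"]

extra = find_unexpected_columns(written, seen_in_spss)
if extra:
    print("Unexpected variables:", extra)
```

Comparing against the list of names that were written avoids having to guess the naming pattern of the extra variables.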
Reproducing Reading Issue:
Unfortunately, I can't offer a file to reproduce the reading issue. The one that causes the problem for me can't be shared due to data protection,
and I didn't succeed in creating a sample file that produces the same problem.
Setup Information: