By skipping pandas' `read_csv` function, we lose its detection of NaN values, so columns that are numeric end up with dtype object. For example:
    import GEOparse
    geo = GEOparse.get_GEO("GSE112676")
    geo.phenotype_data["characteristics_ch1.3.age_onset"]

gives

    GSM3076582    72.69
    GSM3076584    66.97
    GSM3076586    73.73
    GSM3076588       NA
    GSM3076590       NA
                  ...
    GSM3078502    74.88
    GSM3078503    73.57
    GSM3078505    71.29
    GSM3078507    61.84
    GSM3078510    74.49
    Name: characteristics_ch1.3.age_onset, Length: 741, dtype: object
So the "NA" entries stay as literal strings instead of being treated as missing values, and the whole column is left as object dtype rather than float.
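The behaviour can be reproduced without downloading anything. The following is a minimal sketch on toy data; the sample names and values are made up for illustration and are not taken from GSE112676. It assumes only pandas, and forces `dtype=object` on the toy column to mimic what the phenotype table looks like.

```python
import io
import pandas as pd

# Toy stand-in for a phenotype column: every value is a string, including "NA".
raw = pd.Series(
    ["72.69", "66.97", "NA"],
    index=["sample_a", "sample_b", "sample_c"],  # hypothetical sample names
    name="age_onset",
    dtype=object,
)
print(raw.dtype)  # object: the strings were never parsed

# The same values going through read_csv: "NA" is one of its default
# missing-value markers, so the column is inferred as float64.
csv_text = "id,age_onset\nsample_a,72.69\nsample_b,66.97\nsample_c,NA\n"
parsed = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(parsed["age_onset"].dtype)  # float64
```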
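My fix is to round-trip the table through CSV so that `read_csv` does the type inference, something like this:

    from io import StringIO
    import pandas as pd

    # pheno is the phenotype DataFrame
    out = StringIO()
    pheno.to_csv(out)
    pheno = pd.read_csv(StringIO(out.getvalue()), index_col=0)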
I can put in a quick PR. It feels a little crude, but I haven't been able to find a more elegant way.
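One possible alternative to the CSV round trip, offered only as a sketch and not as what GEOparse does or should do: mask the "NA" markers and try `pd.to_numeric` on each column, leaving any column that fails to parse untouched. The helper name `coerce_numeric_columns` and its `na_values` parameter are hypothetical, and the toy table below is made up for illustration.

```python
import numpy as np
import pandas as pd


def coerce_numeric_columns(df, na_values=("NA",)):
    """Hypothetical helper: convert columns that are numeric apart from
    the given missing-value markers; leave all other columns unchanged."""
    out = df.copy()
    for col in out.columns:
        # Replace the markers with NaN, then attempt a strict conversion.
        candidate = out[col].where(~out[col].isin(na_values), np.nan)
        try:
            out[col] = pd.to_numeric(candidate)
        except (ValueError, TypeError):
            pass  # not a numeric column, keep the original values
    return out


# Toy phenotype table with string values (made-up data).
pheno = pd.DataFrame(
    {"age_onset": ["72.69", "NA", "61.84"], "sex": ["F", "M", "NA"]},
    index=["sample_a", "sample_b", "sample_c"],
    dtype=object,
)
fixed = coerce_numeric_columns(pheno)
print(fixed.dtypes)
```

Unlike `read_csv`, this sketch only recognises the markers passed in `na_values`, and it leaves "NA" strings in non-numeric columns as they are, so the two approaches are not drop-in equivalents.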
Thanks for reporting. Let me think about how to do this. A PR would be good, so we can test it.