Issues for Assignment 5 #19

Open
jrmcgarvey opened this issue Feb 11, 2025 · 0 comments

The assignment asks the student to create a DataFrame with the provided data, and then gives only some data to "get you started". It is not clear where the rest of the data is supposed to come from. Again, since we are using Kaggle, it would be best to find an appropriate Kaggle dataset for the purpose.

There is an error in task 1 part 2. It asks the student to call dropna() and then make several calls to fillna(). The dropna() call eliminates every row that contains a missing value, so the subsequent fillna() calls have nothing left to fill and become no-ops.
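A minimal sketch of the problem, using a small made-up DataFrame rather than the assignment's data (the column names, values, and fill choices here are assumptions for illustration only):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; not the assignment's actual dataset.
df = pd.DataFrame({
    "name": ["Ann", "Bob", None, "Dee"],
    "age": [34.0, np.nan, 29.0, 41.0],
})

# Order used in the assignment: dropna() first, then fillna().
dropped = df.dropna()
filled_after_drop = dropped.fillna(
    {"name": "Unknown", "age": dropped["age"].mean()}
)

# After dropna() no missing values remain, so fillna() changes nothing.
fillna_was_noop = filled_after_drop.equals(dropped)

# Filling first keeps all four rows and leaves no missing values.
filled_first = df.fillna({"name": "Unknown", "age": df["age"].mean()})

print(len(dropped), fillna_was_noop, len(filled_first))
```

If the intent of the task is to practice both methods, one option would be to apply them to different columns, or to different copies of the data.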

Task 4, on outliers, has no effect on the provided data. The student would apparently have to create a dataset containing outliers in order to perform the task, but this is not described. An existing Kaggle dataset should be used, as the provided data does not support the exercises.
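One way to check whether a candidate dataset actually exercises an outlier task is sketched below. It assumes the common 1.5 × IQR rule, which may not be the method the assignment intends, and the helper name and salary values are hypothetical:

```python
import pandas as pd

def iqr_outlier_mask(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return True for entries outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)

# Hypothetical column with no extreme values: nothing is flagged,
# so an outlier-removal step would leave it unchanged.
salaries = pd.Series([50000, 52000, 58000, 61000, 65000])
has_outliers = bool(iqr_outlier_mask(salaries).any())

# The same column with one extreme value added: one entry is flagged.
salaries_with_outlier = pd.Series([50000, 52000, 58000, 61000, 65000, 500000])
flagged_count = int(iqr_outlier_mask(salaries_with_outlier).sum())

print(has_outliers, flagged_count)
```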

Task 6, on inconsistent entries, is again not supported by anything in the dataset.

Task 7, on invalid ages, is not supported by the dataset either. The assignment should also explain why one converts the invalid values to NaN in one step and then substitutes the median in the next. The idea is that the median should not be computed from invalid values, but that is not explained.
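The reasoning could be illustrated along these lines. The ages and the 0 to 120 validity range are assumptions chosen for the example, not values from the assignment:

```python
import pandas as pd

# Hypothetical ages; -5, 200 and 999 stand in for invalid entries.
ages = pd.Series([25, 31, -5, 40, 200, 28, 999], dtype="float64")

# Assumed validity range, chosen only for illustration.
valid = ages.between(0, 120)

# Median computed while the invalid values are still present.
median_with_invalid = ages.median()

# Step 1: replace invalid values with NaN.
masked = ages.where(valid)

# Step 2: median() skips NaN by default, so only valid ages contribute.
median_valid_only = masked.median()
cleaned = masked.fillna(median_valid_only)

print(median_with_invalid, median_valid_only)
print(cleaned.tolist())
```

In this sample the median shifts from 31.0 to 29.5 once the invalid entries are masked, which is the effect the two-step approach is meant to avoid.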
