The data consists of various attributes of housing societies in California, including latitude, longitude, population, households and many more. I have applied a multiple linear regression model to this data. A lot of data cleaning was needed to make appropriate predictions. Preprocessing removes the null values, derives new features such as the average number of rooms per household, and more. One-hot vectors are prepared for categorical attributes, which take one of several possible values.
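The preprocessing steps described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the project's actual code: the column names (`total_rooms`, `households`, `total_bedrooms`, `ocean_proximity`), the sample rows, and the `preprocess` helper are all assumptions made for the example, and "removing null values" is interpreted here as dropping rows with a missing entry.

```python
# Hypothetical sample rows; the column names and values are assumptions for
# illustration only and are not taken from the project's dataset.
rows = [
    {"total_rooms": 880.0, "households": 126.0,
     "total_bedrooms": 129.0, "ocean_proximity": "NEAR BAY"},
    {"total_rooms": 7099.0, "households": 1138.0,
     "total_bedrooms": None, "ocean_proximity": "NEAR BAY"},
    {"total_rooms": 1467.0, "households": 177.0,
     "total_bedrooms": 190.0, "ocean_proximity": "INLAND"},
]


def preprocess(rows, categorical="ocean_proximity"):
    """Drop rows with nulls, add rooms_per_household, one-hot encode one column."""
    # 1. Remove rows that contain a null (None) in any column.
    kept = [r for r in rows if all(v is not None for v in r.values())]

    # 2. Collect the distinct categories that remain, in a stable order.
    categories = sorted({r[categorical] for r in kept})

    cleaned = []
    for r in kept:
        # Copy every numeric column unchanged.
        out = {k: v for k, v in r.items() if k != categorical}
        # 3. Derived feature: average number of rooms per household.
        out["rooms_per_household"] = r["total_rooms"] / r["households"]
        # 4. One-hot vector: exactly one indicator column is 1 per row.
        for c in categories:
            out[f"{categorical}_{c}"] = 1 if r[categorical] == c else 0
        cleaned.append(out)
    return cleaned


cleaned = preprocess(rows)
for row in cleaned:
    print(row)
```

In a real pipeline the same steps would more commonly be done with a data-frame library; the plain-Python version is used here only to keep the sketch free of dependencies.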
Predictions are made with the model, and the root mean squared error (RMSE) is calculated to show how far these predictions differ from the actual data. The size of the error suggests that further parameters, such as latitude and longitude, need to be added to the model to bring its predictions closer to the actual data.
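For reference, the root mean squared error is the square root of the average squared difference between predicted and actual values. A minimal sketch is shown below; the `rmse` helper and the sample numbers are hypothetical and are not results from this project.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences of numbers."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of the squared residuals, then the square root.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up house values and predictions, for illustration only.
actual_values = [200000.0, 150000.0, 320000.0]
predicted_values = [210000.0, 140000.0, 300000.0]
print(rmse(actual_values, predicted_values))
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones, and the result is expressed in the same units as the target variable.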