- Anna Montoya, DataCanary. (2016). House Prices - Advanced Regression Techniques. Kaggle. https://kaggle.com/competitions/house-prices-advanced-regression-technique
The problem at hand is to predict sale prices for a set of homes. In the current housing market of fall 2023, we are witnessing a significant decline in home prices. It seems odd to pin down a price point when I know prices are shifting to levels we have not seen in 10-15 years. That being said, let’s dig into this one.
For this challenge, the dataset has a lot of columns, so I’d like to find an easy way to import and handle them. My learning goal is to get better at importing and encoding data for my machine-learning projects.
Inspecting and printing
A few settings I found useful make pandas print bigger tables in full instead of truncating them:
import pandas as pd
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
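These settings apply for the rest of the session. If I only want the wider output for one print, pandas also offers `pd.option_context`, which restores the previous values on exit. Here is a minimal sketch, assuming a made-up 70-row frame in place of the real training data:

```python
import pandas as pd

# Toy frame standing in for the real training data (values are made up).
df = pd.DataFrame({"a": range(70), "b": range(70)})

# Remember the current limit so we can confirm it is restored afterwards.
default_rows = pd.get_option("display.max_rows")

# Raise the display limits only inside this block.
with pd.option_context(
    "display.max_rows", 500,
    "display.max_columns", 500,
    "display.width", 1000,
):
    rows_inside = pd.get_option("display.max_rows")
    print(df)  # all 70 rows are shown, since 70 is below the 500-row limit

# Outside the block the earlier setting is back in effect.
rows_after = pd.get_option("display.max_rows")
```

I would reach for this when a notebook has one or two big tables and I don't want every other cell to dump hundreds of rows.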
Next, I found `.info()` useful for getting an overview of the columns. I also dropped columns such as `Id` with `.pop("Id")`, since an identifier column does little to help the model learn.
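To make those two steps concrete, here is a small sketch on a hand-built frame. The column names are only illustrative and the values are invented. I am assuming the identifier column is spelled `Id`, which I believe matches the Kaggle CSV, so adjust it if your file differs.

```python
import pandas as pd

# Hypothetical miniature of the training table; all values are invented.
df = pd.DataFrame({
    "Id": [1, 2, 3],
    "LotArea": [8000, 9500, 11000],
    "Street": ["Pave", "Pave", None],
    "SalePrice": [200000, 180000, 220000],
})

# Prints column names, non-null counts, and dtypes in one summary.
df.info()

# pop removes the column from df in place and returns it as a Series,
# so the identifiers are still available later (e.g. for a submission file).
ids = df.pop("Id")

print(df.columns.tolist())
```

Unlike `drop`, `pop` takes a single column label and hands the removed column back, which is handy when the ids are needed again later.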