airbnb

Module for loading the New York City Airbnb 2019 Open Dataset.

The New York City Airbnb 2019 Open Dataset contains various details about each listed unit. The goal is to predict the rental price of a unit.

This dataset contains the details of units listed in NYC during 2019. It was adapted from the following open Kaggle dataset: https://www.kaggle.com/datasets/dgomonov/new-york-city-airbnb-open-data, which in turn was downloaded from the Airbnb data repository at http://insideairbnb.com/get-the-data.

This dataset is licensed under the CC0 1.0 Universal License (https://creativecommons.org/publicdomain/zero/1.0/).

The typical ML task for this dataset is to build a regression model that predicts the average rental price of a unit.
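As a rough illustration of the regression task described above, the sketch below fits a naive baseline that predicts the mean `price` per `room_type` and scores it with RMSE. The rows are made up for illustration and are not taken from the dataset, and the helper names (`fit_group_mean`, `predict`, `rmse`) are hypothetical rather than part of this module.

```python
from collections import defaultdict
from math import sqrt

# Made-up rows for illustration only; they are NOT real dataset records.
train_rows = [
    {"room_type": "Entire home/apt", "price": 200},
    {"room_type": "Entire home/apt", "price": 240},
    {"room_type": "Private room", "price": 80},
    {"room_type": "Private room", "price": 100},
]
test_rows = [
    {"room_type": "Entire home/apt", "price": 210},
    {"room_type": "Private room", "price": 95},
    {"room_type": "Shared room", "price": 50},  # category unseen in training
]


def fit_group_mean(rows, key="room_type", label="price"):
    """Return (mean label per group, overall mean label)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row[key]] += row[label]
        counts[row[key]] += 1
    group_means = {group: sums[group] / counts[group] for group in sums}
    overall_mean = sum(sums.values()) / sum(counts.values())
    return group_means, overall_mean


def predict(rows, group_means, overall_mean, key="room_type"):
    """Predict the group mean, falling back to the overall mean for unseen groups."""
    return [group_means.get(row[key], overall_mean) for row in rows]


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared) / len(squared))


group_means, overall_mean = fit_group_mean(train_rows)
predictions = predict(test_rows, group_means, overall_mean)
error = rmse([row["price"] for row in test_rows], predictions)
```

A baseline like this gives a reference error that any trained model should beat. The fallback to the overall mean is one simple way to handle categories that did not appear in training.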

Dataset Shape

==============  ============
Property        Value
==============  ============
Samples Total   47.3K
Dimensionality  9
Features        real, string
Targets         int 31 - 795
==============  ============

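Dataset Description

==============================  ===========  ============================
Column name                     Column Role  Description
==============================  ===========  ============================
datestamp                       Datetime     The date of the observation
neighbourhood_group             Feature
neighbourhood                   Feature
room_type                       Feature
minimum_nights                  Feature
number_of_reviews               Feature
reviews_per_month               Feature
calculated_host_listings_count  Feature
availability_365                Feature
has_availability                Feature
price                           Label        The rental price of the unit
==============================  ===========  ============================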
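The column roles listed above can be captured in a small mapping, which makes it easy to separate the features from the label before training. This is a minimal sketch: the names `COLUMN_ROLES`, `columns_with_role`, and `split_row` are hypothetical helpers, not functions provided by this module, and the example row holds placeholder values only.

```python
# Column roles copied from the dataset description.
COLUMN_ROLES = {
    "datestamp": "Datetime",
    "neighbourhood_group": "Feature",
    "neighbourhood": "Feature",
    "room_type": "Feature",
    "minimum_nights": "Feature",
    "number_of_reviews": "Feature",
    "reviews_per_month": "Feature",
    "calculated_host_listings_count": "Feature",
    "availability_365": "Feature",
    "has_availability": "Feature",
    "price": "Label",
}


def columns_with_role(role):
    """Return the column names that have the given role, in table order."""
    return [name for name, r in COLUMN_ROLES.items() if r == role]


FEATURE_COLUMNS = columns_with_role("Feature")
LABEL_COLUMN = columns_with_role("Label")[0]
DATETIME_COLUMN = columns_with_role("Datetime")[0]


def split_row(row):
    """Split one record (a dict keyed by column name) into (features, label)."""
    features = {name: row[name] for name in FEATURE_COLUMNS}
    return features, row[LABEL_COLUMN]


# Placeholder record: every value is None except a made-up price.
example_row = {name: None for name in COLUMN_ROLES}
example_row["price"] = 100
example_features, example_label = split_row(example_row)
```

The nine columns with the Feature role match the dimensionality reported in the shape table. `datestamp` is kept out of the feature set because its role is Datetime.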

Functions

load_data_and_predictions([data_format, ...])
    Loads and returns the Airbnb NYC 2019 dataset (regression).

load_pre_calculated_feature_importance()
    Loads the pre-calculated feature importance for the Airbnb NYC 2019 dataset.