Titanic Classification

    Author Name(s)

    Ashley Skelton
    Cynthia Dean
    Norberto Huerta
    Eric Rauno

    Faculty Advisor(s)

    Alona Kryshchenko

    Abstract

    The sinking of the Titanic is one of the most famous shipwrecks in history. About 700 of the roughly 2,200 people on board survived. What led to their survival? Using data on a large number of the people on board that night, we train models to predict passenger survival with ensemble methods in Python. The Kaggle competition provides two datasets: a training set that includes the survival outcome for each passenger record, and a test set that does not. Each passenger record contains a unique passenger identifier and the passenger's name, age, sex, fare paid, port of embarkation, passenger class, whether they had family aboard the Titanic, and cabin. We used several different indicators to train these models and analyzed how the models' results relate to the survival rate of passengers on board. Four main indicators were visibly correlated with survival rate: sex, age, fare, and passenger class. We will present the results we obtained for survival rate using random forest and other ensemble methods.
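To make the ensemble idea concrete, here is a minimal sketch using only the Python standard library. It is not the models used in this project: it bags one-split decision stumps and takes a majority vote, which is a much-simplified stand-in for a random forest. The rows below are invented for illustration and are not Kaggle data, and every function name (`fit_stump`, `fit_ensemble`, `predict`) is hypothetical.

```python
import random
from collections import Counter

# Hypothetical toy rows (NOT Kaggle data): (is_female, age, fare, pclass)
X = [
    (1, 29.0, 211.3, 1), (0, 40.0, 7.9, 3), (1, 4.0, 16.7, 3),
    (0, 54.0, 51.9, 1), (1, 35.0, 53.1, 1), (0, 22.0, 7.3, 3),
    (0, 2.0, 21.1, 3), (1, 27.0, 11.1, 3), (0, 35.0, 8.1, 3),
    (1, 58.0, 26.6, 1),
]
# Hypothetical labels: 1 = survived, 0 = did not survive
y = [1, 0, 1, 0, 1, 0, 0, 1, 0, 1]


def majority(labels, default=0):
    """Most common label, or `default` when the list is empty."""
    return Counter(labels).most_common(1)[0][0] if labels else default


def fit_stump(X, y, features):
    """Find the single (feature, threshold) split with the fewest errors."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(lab != l_lab for lab in left)
            errors += sum(lab != r_lab for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]  # (feature, threshold, left label, right label)


def predict_stump(stump, row):
    f, t, l_lab, r_lab = stump
    return l_lab if row[f] <= t else r_lab


def fit_ensemble(X, y, n_estimators=25, max_features=2, seed=0):
    """Train each stump on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        features = rng.sample(range(len(X[0])), max_features)
        stumps.append(
            fit_stump([X[i] for i in idx], [y[i] for i in idx], features)
        )
    return stumps


def predict(stumps, row):
    """Majority vote over all stumps in the ensemble."""
    return majority([predict_stump(s, row) for s in stumps])


stumps = fit_ensemble(X, y)
accuracy = sum(predict(stumps, row) == lab for row, lab in zip(X, y)) / len(X)
print(f"training accuracy on the toy rows: {accuracy:.2f}")
```

Accuracy measured on the training rows says little about how a model generalizes. In practice one would hold out part of the labeled training set for validation, and would likely use a library implementation of random forest rather than this sketch.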
