Published - Sat, 15 Apr 2023
Title: Working with Airline Safety Data
Introduction: In this lecture, we will learn how to work with the airline_safety data frame included in the fivethirtyeight package. We will explore the data, clean it up, and convert it into a tidy format using the R programming language.
Step 1: Load the dataset
We start by loading the airline_safety dataset using the following commands:
library(fivethirtyeight)
data("airline_safety")
Step 2: Exploring the dataset
We can use the head() and summary() functions to get a quick overview of the dataset.
head(airline_safety)
summary(airline_safety)
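As a complement to head() and summary(), the sketch below shows a few other ways to inspect the data frame. It assumes the fivethirtyeight and dplyr packages are installed; glimpse() comes from dplyr and is not used elsewhere in this lecture.

```r
library(fivethirtyeight)
library(dplyr)

data("airline_safety")

# Number of rows and columns
dim(airline_safety)

# Column names, useful before selecting or reshaping
names(airline_safety)

# Compact overview: one line per column with its type and first values
glimpse(airline_safety)
```

glimpse() is handy for wide data frames because it prints columns down the page instead of across it.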
Step 3: Cleaning the dataset
We will remove the incl_reg_subsidiaries and avail_seat_km_per_week columns from the dataset using the select() function from the dplyr package.
library(dplyr)
airline_safety_smaller <- airline_safety %>%
  select(-c(incl_reg_subsidiaries, avail_seat_km_per_week))
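One quick way to confirm that the cleaning step did what we intended is to compare the column names before and after. This is a minimal sketch; the variable name removed is introduced here only for illustration.

```r
library(fivethirtyeight)
library(dplyr)

data("airline_safety")

airline_safety_smaller <- airline_safety %>%
  select(-c(incl_reg_subsidiaries, avail_seat_km_per_week))

# Columns present in the original data frame but not in the smaller one
removed <- setdiff(names(airline_safety), names(airline_safety_smaller))
removed
```

The result should list exactly the two columns that were dropped.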
Step 4: Converting to Tidy Format
The current format of the data frame is not tidy: the name of each count column encodes both an incident type and a time period, so information that should be stored as values sits in the column names. We can convert it to tidy format using the pivot_longer() function from the tidyr package.
library(tidyr)
airline_safety_tidy <- airline_safety_smaller %>%
  pivot_longer(
    cols = c(
      incidents_85_99, fatal_accidents_85_99, fatalities_85_99,
      incidents_00_14, fatal_accidents_00_14, fatalities_00_14
    ),
    names_to = "incident_type_years",
    values_to = "count"
  )
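The incident_type_years column still combines two pieces of information, the incident type and the period. As an optional variation that is not part of the original lecture, pivot_longer() can split the old column names into two columns through its names_pattern argument. The names airline_safety_split, incident_type, and years are assumptions made for this sketch, as is the regular expression.

```r
library(fivethirtyeight)
library(dplyr)
library(tidyr)

data("airline_safety")

airline_safety_split <- airline_safety %>%
  select(-c(incl_reg_subsidiaries, avail_seat_km_per_week)) %>%
  pivot_longer(
    # Reshape every column except the airline identifier
    cols = -airline,
    # Two output columns, one per capture group in names_pattern
    names_to = c("incident_type", "years"),
    # Group 1: everything before the final "digits_digits" suffix
    # Group 2: the period suffix, e.g. "85_99" or "00_14"
    names_pattern = "(.+)_(\\d+_\\d+)",
    values_to = "count"
  )

head(airline_safety_split)
```

A pattern is used rather than splitting on every underscore because a name such as fatal_accidents_85_99 contains underscores inside the incident type itself.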
Step 5: Viewing the Tidy Dataset
We can use the head() function to view the first few rows of the tidy dataset.
head(airline_safety_tidy)
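To illustrate why the tidy layout is convenient, here is a small sketch that totals the counts for each incident type and period with a single grouped summary. The name totals is introduced only for this example; the block rebuilds the tidy data frame so that it can be run on its own.

```r
library(fivethirtyeight)
library(dplyr)
library(tidyr)

data("airline_safety")

# Rebuild the tidy data frame from Steps 3 and 4
airline_safety_tidy <- airline_safety %>%
  select(-c(incl_reg_subsidiaries, avail_seat_km_per_week)) %>%
  pivot_longer(
    cols = -airline,
    names_to = "incident_type_years",
    values_to = "count"
  )

# One row per incident type and period, summed across all airlines
totals <- airline_safety_tidy %>%
  group_by(incident_type_years) %>%
  summarise(total = sum(count), .groups = "drop")

totals
```

With the original wide layout, the same result would require summing six columns separately.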
Conclusion: In this lecture, we learned how to work with the airline_safety data frame using the R programming language. We explored the dataset, removed two columns we did not need, and converted the result to tidy format. The resulting dataset has one count per row, which makes it easier to filter, group, and summarize in further analysis.