Load Packages
# numerical calculation & data frames
import numpy as np
import pandas as pd
# visualization
import matplotlib.pyplot as plt
import seaborn as sns
import seaborn.objects as so
# statistics
import statsmodels.api as sm
R for Data Science by Wickham & Grolemund
# pandas options
pd.set_option("mode.copy_on_write", True)
pd.options.display.precision = 2
pd.options.display.float_format = '{:.2f}'.format  # pd.reset_option('display.float_format')
pd.options.display.max_rows = 7
# Numpy options
np.set_printoptions(precision=2, suppress=True)
Use the four relational tables of nycflights13 covered in the Combine section.
Add the location of the origin and destination (i.e. the lat and lon in airports) to flights.
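One possible approach is to merge the airports table into flights twice, once keyed on origin and once on dest, renaming lat and lon each time so the two sets of coordinates do not collide. The sketch below uses tiny made-up stand-ins for flights and airports (the coordinates are rounded illustrations, not the real data), and the output column names lat_origin, lon_origin, lat_dest and lon_dest are assumptions chosen for this example.

```python
import pandas as pd

# Toy stand-ins for the nycflights13 tables (illustrative values only)
flights = pd.DataFrame({
    "origin": ["JFK", "LGA", "EWR"],
    "dest": ["LAX", "ORD", "LAX"],
})
airports = pd.DataFrame({
    "faa": ["JFK", "LGA", "EWR", "LAX", "ORD"],
    "lat": [40.6, 40.8, 40.7, 33.9, 42.0],
    "lon": [-73.8, -73.9, -74.2, -118.4, -87.9],
})

# Keep only the key and the coordinates
locations = airports[["faa", "lat", "lon"]]

flights_loc = (
    flights
    # first merge: coordinates of the origin airport
    .merge(
        locations.rename(columns={"faa": "origin", "lat": "lat_origin", "lon": "lon_origin"}),
        on="origin", how="left",
    )
    # second merge: coordinates of the destination airport
    .merge(
        locations.rename(columns={"faa": "dest", "lat": "lat_dest", "lon": "lon_dest"}),
        on="dest", how="left",
    )
)
print(flights_loc)
```

A left merge keeps every flight, so destinations missing from airports would simply get NaN coordinates.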
Is there a relationship between the age of a plane and its delays?
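A starting point for this question might look like the sketch below: join the manufacture year from planes onto flights, compute an age, and average the delay per age. The data are made up, and the names year_built and age are assumptions introduced here. Note that both tables have a year column, so the one from planes is renamed before merging.

```python
import pandas as pd

# Toy stand-ins (illustrative values only)
flights = pd.DataFrame({
    "year": [2013, 2013, 2013, 2013],
    "tailnum": ["N1", "N1", "N2", "N3"],
    "dep_delay": [10.0, 20.0, 0.0, 5.0],
})
planes = pd.DataFrame({
    "tailnum": ["N1", "N2"],
    "year": [2003, 2013],  # year the plane was manufactured
})

# Rename to avoid clashing with the flight year
plane_year = planes[["tailnum", "year"]].rename(columns={"year": "year_built"})

# Inner merge drops flights whose plane is not listed in planes (N3 here)
merged = flights.merge(plane_year, on="tailnum", how="inner")
merged["age"] = merged["year"] - merged["year_built"]

# Mean departure delay for each plane age
delay_by_age = merged.groupby("age")["dep_delay"].mean()
print(delay_by_age)
```

With the full data, plotting delay_by_age against age would be a natural next step before drawing any conclusion.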
What weather conditions make it more likely to see a delay?
From the flights table, select the flights that fall on the 10 days with the highest daily mean arrival delay (arr_delay).
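This can be treated as a filtering join: first find the worst days, then keep only the flights that match them. The sketch below uses a made-up miniature flights table and a hypothetical n_days parameter set to 2 so the result is easy to check by hand; the exercise itself would use 10.

```python
import pandas as pd

# Toy stand-in for flights (illustrative values only)
flights = pd.DataFrame({
    "month": [1, 1, 1, 1, 2, 2],
    "day": [1, 1, 2, 2, 1, 1],
    "arr_delay": [10.0, 30.0, 5.0, None, 60.0, 40.0],
})

n_days = 2  # the exercise asks for 10

# Mean arrival delay per day (missing delays are skipped), then the top n_days
worst_days = (
    flights
    .groupby(["month", "day"], as_index=False)["arr_delay"].mean()
    .nlargest(n_days, "arr_delay")[["month", "day"]]
)

# worst_days has one row per day, so an inner merge only filters flights
worst_flights = flights.merge(worst_days, on=["month", "day"], how="inner")
print(worst_flights)
```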
Which destinations (dest) in the flights table have no airport information in the airports table?
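This is an anti-join: keep the keys on one side that have no match on the other. A minimal sketch with isin, using made-up tables in which XXX and YYY are invented codes standing in for unlisted airports:

```python
import pandas as pd

# Toy stand-ins (XXX and YYY are invented codes, not real airports)
flights = pd.DataFrame({"dest": ["LAX", "ORD", "XXX", "XXX", "YYY"]})
airports = pd.DataFrame({"faa": ["LAX", "ORD", "JFK"]})

# Destinations whose code does not appear in airports["faa"]
is_missing = ~flights["dest"].isin(airports["faa"])
missing_dest = flights.loc[is_missing, "dest"].unique()
print(missing_dest)
```

An alternative would be a left merge with indicator=True, keeping the rows marked left_only.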
Filter flights to show only the flights of planes that have flown at least 100 flights.
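One way to sketch this is to count flights per tailnum and map the count back onto each row. The table below is made up, and the hypothetical min_flights threshold is set to 3 instead of 100 so the toy example stays small.

```python
import pandas as pd

# Toy stand-in for flights (illustrative values only)
flights = pd.DataFrame({
    "tailnum": ["N1", "N1", "N1", "N2", None, "N3", "N3"],
    "flight": [1, 2, 3, 4, 5, 6, 7],
})

min_flights = 3  # the exercise asks for 100

# Flights per plane; value_counts ignores missing tail numbers
counts = flights["tailnum"].value_counts()

# Rows with a missing tailnum map to NaN, which fails the comparison
frequent = flights[flights["tailnum"].map(counts) >= min_flights]
print(frequent)
```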
Find the 48 hours (over the course of the whole year) that have the worst (departure) delays.
You might expect that there’s an implicit relationship between plane and airline, because each plane is flown by a single airline. Confirm or reject this hypothesis using the tools you’ve learned above.
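A simple check is to count the distinct carriers for each tailnum: if every plane belonged to a single airline, no count would exceed 1. The sketch below runs that check on a made-up table in which plane N2 is deliberately assigned two carriers; it says nothing about what the real data show.

```python
import pandas as pd

# Toy stand-in for flights (illustrative values only)
flights = pd.DataFrame({
    "tailnum": ["N1", "N1", "N2", "N2", "N3"],
    "carrier": ["AA", "AA", "DL", "9E", "UA"],
})

# Number of distinct carriers per plane, ignoring missing tail numbers
carriers_per_plane = (
    flights
    .dropna(subset=["tailnum"])
    .groupby("tailnum")["carrier"]
    .nunique()
)

# Planes flown by more than one carrier contradict the hypothesis
multi_carrier = carriers_per_plane[carriers_per_plane > 1]
print(multi_carrier)
```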