Source: https://www.kaggle.com/code/ekrembayar/store-sales-ts-forecasting-a-comprehensive-guide/notebook
Kaggle Competition
- Store Sales - Time Series Forecasting (Use machine learning to predict grocery sales)
In this competition's Code tab, I sorted by Most Votes and am copying out the top notebook by hand, adding comments and working through it to understand and study it.
# Packages
# BASE
# ------------------------------------------------------
import numpy as np
import pandas as pd
import os
import gc
import warnings
# Statistical analysis
# ------------------------------------------------------
import statsmodels.api as sm
# Visualization
# ------------------------------------------------------
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
# CONFIGURATIONS
# ------------------------------------------------------
pd.set_option('display.max_columns', None)
pd.options.display.float_format = '{:.2f}'.format
warnings.filterwarnings('ignore')
# Connect Google Drive
from google.colab import drive
drive.mount('/content/drive')
# Load the data
DATA_PATH = '/content/drive/MyDrive/2023_Yonsei_IT/kaggle/store-sales-time-series-forecasting/'
train = pd.read_csv(DATA_PATH + "train.csv")
test = pd.read_csv(DATA_PATH + "test.csv")
stores = pd.read_csv(DATA_PATH + "stores.csv")
sub = pd.read_csv(DATA_PATH + "sample_submission.csv")
transactions = pd.read_csv(DATA_PATH + "transactions.csv").sort_values(["store_nbr", "date"])
oil = pd.read_csv(DATA_PATH + "oil.csv")
holidays = pd.read_csv(DATA_PATH + "holidays_events.csv")
# Datetime
train["date"] = pd.to_datetime(train.date)
test["date"] = pd.to_datetime(test.date)
transactions["date"] = pd.to_datetime(transactions.date)
# Data types
train.onpromotion = train.onpromotion.astype("float16")
train.sales = train.sales.astype("float32")
stores.cluster = stores.cluster.astype("int8")
# The rest is omitted
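To see for myself what the datetime conversion and the dtype casts above buy, here is a minimal sketch on synthetic data. The `demo` frame, its row count, and its value ranges are made up for illustration and are not the competition data; only the column names and target dtypes come from the code above. It also checks that small integer counts survive the cast to float16, which holds because float16 represents every integer up to 2048 exactly.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for train.csv (hypothetical values, not the real data).
rng = np.random.default_rng(0)
n_rows = 10_000
demo = pd.DataFrame({
    # Dates stored as plain strings, the way read_csv returns them by default.
    "date": pd.date_range("2017-01-01", periods=n_rows, freq="h").strftime("%Y-%m-%d"),
    "onpromotion": rng.integers(0, 700, size=n_rows),  # int64
    "sales": rng.random(n_rows) * 1000.0,              # float64
})
original_counts = demo["onpromotion"].copy()

# Memory footprint before any conversion (deep=True also counts the string contents).
before = int(demo.memory_usage(deep=True).sum())

# Same conversions as in the notebook code above.
demo["date"] = pd.to_datetime(demo["date"])
demo["onpromotion"] = demo["onpromotion"].astype("float16")
demo["sales"] = demo["sales"].astype("float32")

after = int(demo.memory_usage(deep=True).sum())
print(f"memory: {before:,} bytes -> {after:,} bytes")

# Casting back to int64 should reproduce the original counts exactly,
# because every value here is well below 2048.
roundtrip_ok = bool((demo["onpromotion"].astype("int64") == original_counts).all())
print("onpromotion survives the float16 cast:", roundtrip_ok)
```

One thing to keep in mind: float16 stops being exact for integers above 2048, so for a column with larger counts a wider type would be the safer choice.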
I will keep studying steadily so that someday I can build every dashboard, visualization, and ML model on my own.