{
"cells": [
{
"cell_type": "markdown",
"id": "41d3d48b",
"metadata": {},
"source": [
"# Regression Models: Prediction of the value of Tesla's stock"
]
},
{
"cell_type": "markdown",
"id": "4f95381b",
"metadata": {},
"source": [
"## Abstract"
]
},
{
"cell_type": "markdown",
"id": "50ceeae9",
"metadata": {},
"source": [
"Starting from a dataset with close to a year of historical data on Tesla's stock movements, this notebook attempts to predict the stock's value over the next five days. The dataset was downloaded from Yahoo! Finance (historical Tesla data).\n",
"Three prediction models are used: a simple linear model, a simple polynomial model, and a multiple linear model. Each model is reported with fit indicators such as R² and RMSE."
]
},
{
"cell_type": "markdown",
"id": "8fb9ed14",
"metadata": {},
"source": [
"## 1. Exploratory Data Analysis"
]
},
{
"cell_type": "markdown",
"id": "00b0d4f4",
"metadata": {},
"source": [
"### 1.1. Getting the Dataset"
]
},
{
"cell_type": "markdown",
"id": "4808a1ca",
"metadata": {},
"source": [
"#### Libraries"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "265913e8",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"from matplotlib import pyplot as plt\n",
"from sklearn.metrics import mean_squared_error, r2_score\n",
"from sklearn.linear_model import LinearRegression\n",
"from sklearn.preprocessing import PolynomialFeatures\n",
"from math import sqrt"
]
},
{
"cell_type": "markdown",
"id": "449a5069",
"metadata": {},
"source": [
"#### The dataset"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "d7499986",
"metadata": {},
"outputs": [],
"source": [
"# Load the complete dataset\n",
"data = pd.read_csv('datasets/TSLA.csv')\n",
"# Drop the 'Adj Close' column ('Date' is kept)\n",
"ds = data.drop(['Adj Close'], axis=1)\n",
"# Number of days to predict\n",
"periodos = 5\n",
"# Full train/test split: the last `periodos` rows are held out for testing\n",
"dsTrain = ds.iloc[:(len(ds) - periodos), :]\n",
"dsTest = ds.iloc[(len(ds) - periodos):, :]\n",
"# Simple training set ('Date' and 'Close' only) for the linear and polynomial models\n",
"dsTrainS = dsTrain.drop(['Open', 'High', 'Low', 'Volume'], axis=1)\n",
"# Simple test set ('Date' and 'Close' only)\n",
"dsTestS = dsTest.drop(['Open', 'High', 'Low', 'Volume'], axis=1)"
]
},
{
"cell_type": "markdown",
"id": "466f02ea",
"metadata": {},
"source": [
"### 1.2. Univariate Analysis"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "672df446",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"

| | Date | Open | High | Low | Close | Adj Close | Volume |
|---|---|---|---|---|---|---|---|
| 0 | 2021-03-15 | 694.090027 | 713.179993 | 684.039978 | 707.940002 | 707.940002 | 29335600 |
| 1 | 2021-03-16 | 703.349976 | 707.919983 | 671.000000 | 676.880005 | 676.880005 | 32195700 |
| 2 | 2021-03-17 | 656.869995 | 703.729980 | 651.010010 | 701.809998 | 701.809998 | 40372500 |
| 3 | 2021-03-18 | 684.289978 | 689.229980 | 652.000000 | 653.159973 | 653.159973 | 33224800 |
| 4 | 2021-03-19 | 646.599976 | 657.229980 | 624.619995 | 654.869995 | 654.869995 | 42894000 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 248 | 2022-03-08 | 795.530029 | 849.989990 | 782.169983 | 824.400024 | 824.400024 | 26799700 |
| 249 | 2022-03-09 | 839.479980 | 860.559998 | 832.010010 | 858.969971 | 858.969971 | 19728000 |
| 250 | 2022-03-10 | 851.450012 | 854.450012 | 810.359985 | 838.299988 | 838.299988 | 19549500 |
| 251 | 2022-03-11 | 840.200012 | 843.799988 | 793.770020 | 795.349976 | 795.349976 | 22272800 |
| 252 | 2022-03-14 | 780.609985 | 800.699890 | 756.344971 | 783.020020 | 783.020020 | 8910193 |

253 rows × 7 columns
\n", "