What is Polynomial Regression?
Polynomial regression is a form of regression analysis in which the relationship between the independent variable x and the dependent variable y is modeled as an nth-degree polynomial in x. It fits a nonlinear relationship between x and the corresponding conditional mean of y, denoted E(y | x).
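Written out, a degree-n polynomial model takes the form E(y | x) = b0 + b1*x + b2*x^2 + ... + bn*x^n, where the coefficients b0, ..., bn are estimated from the data. For example, a degree-2 (quadratic) model is E(y | x) = b0 + b1*x + b2*x^2.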
Why use Polynomial Regression?
Polynomial regression is used when the relationship between the independent and dependent variables is not linear and a simple linear regression model does not fit the data well. By adding higher-degree polynomial terms, the model can capture more complex relationships between the variables while remaining linear in its coefficients, so it can still be fitted with ordinary least squares.
Polynomial Regression example:
Here’s a simple example of how to perform polynomial regression using Python and the scikit-learn library:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
# Generate sample data
x = np.random.rand(20, 1)
y = 2 * (x ** 3) - 6 * (x ** 2) + 3 * x + np.random.randn(20, 1) * 0.1
# Transform the data to include polynomial features
poly_features = PolynomialFeatures(degree=3, include_bias=False)
x_poly = poly_features.fit_transform(x)
# Perform linear regression on the transformed data
lin_reg = LinearRegression()
lin_reg.fit(x_poly, y)
# Visualize the polynomial regression fit
plt.scatter(x, y, color='blue')
x_new = np.linspace(0, 1, 100).reshape(100, 1)
x_new_poly = poly_features.transform(x_new)
y_new = lin_reg.predict(x_new_poly)
plt.plot(x_new, y_new, color='red', linewidth=2)
plt.xlabel('x')
plt.ylabel('y')
plt.title('Polynomial Regression')
plt.show()
In this example, we generate sample data with a cubic relationship, expand the input x into polynomial features, and then fit an ordinary linear regression on the transformed features. The resulting plot shows the original data points and the fitted polynomial curve.
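The same workflow can also be organized as a single scikit-learn Pipeline, which chains the feature expansion and the linear fit so you never transform new inputs by hand. The sketch below is one way to do this, assuming the same kind of sample data as in the example above:
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
# Sample data with the same cubic signal plus noise as above
x = np.random.rand(20, 1)
y = 2 * (x ** 3) - 6 * (x ** 2) + 3 * x + np.random.randn(20, 1) * 0.1
# Chain the polynomial feature expansion and the linear fit into one estimator
model = make_pipeline(
    PolynomialFeatures(degree=3, include_bias=False),
    LinearRegression(),
)
model.fit(x, y)
# The fitted coefficients live on the LinearRegression step;
# make_pipeline names each step after its lowercased class name
lin_reg = model.named_steps["linearregression"]
print("Intercept:", lin_reg.intercept_)
print("Coefficients:", lin_reg.coef_)
# Predict on new inputs without transforming them manually
x_new = np.linspace(0, 1, 100).reshape(-1, 1)
y_new = model.predict(x_new)
Because the pipeline applies the transform automatically at prediction time, it is harder to accidentally pass untransformed inputs to the fitted model, and the whole thing can be cross-validated or grid-searched as a single estimator.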