What is R?
R is a programming language and software environment for statistical computing and graphics. It is widely used by statisticians, data scientists, and researchers for data analysis, statistical modeling, and data visualization. R is open-source, which means it is freely available and has a large community of contributors who develop and maintain packages for various analytics tasks.
Example of R usage: Linear Regression
Linear regression is a fundamental statistical technique used to model the relationship between a dependent variable (response) and one or more independent variables (predictors). In this example, we’ll use R to perform a simple linear regression analysis on a sample dataset.
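1. Load required packages:
To perform linear regression in R, we’ll use the built-in dataset mtcars, which ships with the datasets package. Nothing needs to be installed: datasets is part of a standard R installation and is normally attached at startup, so the library() call below only makes the dependency explicit.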
# Load the required package
library(datasets)
2. Load and explore the dataset:
Now we can load and explore the mtcars dataset, which contains information about various car models, including their miles per gallon (mpg) and horsepower (hp).
# Load the dataset
data(mtcars)
# Display the first few rows of the dataset
head(mtcars)
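Beyond head(), a few base R calls give a quicker feel for the two columns used in the regression. The following is only a sketch: the particular summaries and the plot labels are our own choices, not part of the original walkthrough.

```r
# Structure of the data frame: number of rows, columns, and column types
str(mtcars)

# Minimum, quartiles, mean, and maximum for the two columns of interest
summary(mtcars[, c("mpg", "hp")])

# Pearson correlation between horsepower and fuel economy
mpg_hp_cor <- cor(mtcars$hp, mtcars$mpg)
print(mpg_hp_cor)

# Scatter plot to check visually whether a straight line is plausible
plot(mtcars$hp, mtcars$mpg,
     xlab = "Horsepower (hp)",
     ylab = "Miles per gallon (mpg)",
     main = "mpg vs. hp in mtcars")
```

A clearly negative correlation and a roughly downward-sloping cloud of points suggest that a simple linear model is a reasonable first attempt.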
3. Perform linear regression:
To perform linear regression, we’ll use the lm() function in R. We’ll create a model where miles per gallon (mpg) is the dependent variable and horsepower (hp) is the independent variable.
# Perform linear regression
model <- lm(mpg ~ hp, data = mtcars)
# Display the model summary
summary(model)
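summary() prints everything at once, but individual values can also be pulled out of the fitted model for later use. The sketch below refits the same model so that it runs on its own; the 150-horsepower car is a hypothetical input chosen purely for illustration.

```r
# Refit the model from the step above so this block is self-contained
model <- lm(mpg ~ hp, data = mtcars)

# Named vector of estimates: "(Intercept)" and "hp"
coefs <- coef(model)
print(coefs)

# R-squared as a plain number
r_squared <- summary(model)$r.squared
print(r_squared)

# Predicted mpg for a hypothetical car with 150 horsepower
new_car <- data.frame(hp = 150)
predicted_mpg <- predict(model, newdata = new_car)
print(predicted_mpg)
```

predict() applies the fitted line, intercept plus slope times hp, to each row of newdata, so the column name in new_car has to match the predictor name used in the formula.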
4. Interpret the results:
The model summary displays the coefficients, R-squared, and other statistics. In this example, the coefficient for hp is -0.06823, indicating a negative relationship between horsepower and miles per gallon: each additional unit of horsepower is associated with a decrease of about 0.068 mpg on average. The R-squared value is 0.6024, which means that about 60.24% of the variation in mpg is explained by horsepower in this model.
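To make the meaning of R-squared concrete, it can be recomputed by hand as one minus the ratio of unexplained to total variation. This is a minimal sketch using base R only; the variable names and the red fitted line are our own choices.

```r
# Refit the model so this block is self-contained
model <- lm(mpg ~ hp, data = mtcars)

# Residual sum of squares: variation in mpg left unexplained by the model
rss <- sum(residuals(model)^2)

# Total sum of squares: variation of mpg around its mean
tss <- sum((mtcars$mpg - mean(mtcars$mpg))^2)

# R-squared is the share of total variation that the model accounts for
manual_r_squared <- 1 - rss / tss
print(manual_r_squared)

# Compare with the value reported by summary()
print(all.equal(manual_r_squared, summary(model)$r.squared))

# Slope rescaled: expected change in mpg for 100 additional horsepower
change_per_100_hp <- 100 * coef(model)[["hp"]]
print(change_per_100_hp)

# Scatter plot with the fitted regression line overlaid
plot(mtcars$hp, mtcars$mpg,
     xlab = "Horsepower (hp)",
     ylab = "Miles per gallon (mpg)")
abline(model, col = "red")
```

Rescaling the slope to a 100-horsepower difference often reads more naturally than the per-unit figure, since a single unit of horsepower is a very small change.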
Resources
To learn more about R and its applications for data analysis, you can explore the following resources: