Course Overview
What is R Language
R is a programming language used for statistical analysis. It is very similar to the S language, differing in only a few features. R is open source software, and as new statistical techniques are developed, new packages implementing them are created for R. R also includes a variety of graphing tools that make it easy to produce graphs of the computed data.
Course Objectives
After the completion of this course you will
- Learn data manipulation and statistics basics using R
- Know how to perform business analytics using R
- Understand how testing and forecasting are done in R
- Learn to use visualizations in R
Prerequisites for taking this course
Before taking this course you should have a basic knowledge of statistics and computer programming terminology. You should also have R and RStudio installed on your system.
Target Audience for this course
Web developers, software programmers, data miners, researchers and anyone who is interested in learning R can take up this course.
Course Description
Section 1: Understanding R
R is a software language used to carry out statistical analysis. It also includes graphical presentations and data modelling.
Basics of R
This chapter shows you how to start writing programs in R. Programs can be written either at the R command prompt or in an R script file. The R command prompt, R script files and comments are explained in this chapter.
Basic R Functions
A function is a group of statements collected together to perform a specific task. In R, a function is created using the keyword function. R has many built-in functions for statistical analysis and graphics. The components of an R function, built-in functions, user-defined functions and how to call a function are discussed in this lesson.
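As a minimal sketch of a user-defined function, the example below defines a function with the function keyword and then calls it. The name rescale01 is hypothetical and chosen only for illustration; range() is a built-in function.

```r
# rescale01 is an illustrative name, not a built-in R function.
# The arguments go in the parentheses; the body is the code in braces.
rescale01 <- function(x) {
  rng <- range(x)                    # built-in: returns c(min, max)
  (x - rng[1]) / (rng[2] - rng[1])   # the last evaluated value is returned
}

scaled <- rescale01(c(5, 10, 15))    # call the function by name
print(scaled)                        # 0.0 0.5 1.0
```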
Data Types
Variables in R are assigned R objects, and the data type of the R object becomes the data type of the variable. The most commonly used R objects are vectors, lists, matrices, arrays, factors and data frames.
Recycling Rule
If you add two structures with different numbers of elements, the shorter one is recycled to the length of the longer one.
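A short illustration of the recycling rule, using two made-up vectors:

```r
long  <- c(1, 2, 3, 4)
short <- c(10, 20)

# short is reused as 10, 20, 10, 20 to match the length of long
total <- long + short
print(total)   # 11 22 13 24
```

When the longer length is not a multiple of the shorter one, R still recycles but issues a warning.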
Special Numerical Values
R has four special numerical values: NA (a missing value), Inf, -Inf and NaN (not a number).
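The sketch below shows one way each special value can arise and how to test for it. The variable names are illustrative.

```r
x <- c(1, NA, 3)
is.na(x)                       # FALSE TRUE FALSE
mean(x)                        # NA, because one value is missing
avg <- mean(x, na.rm = TRUE)   # 2, after dropping the NA

pos_inf <- 1 / 0               # Inf
neg_inf <- -1 / 0              # -Inf
not_num <- 0 / 0               # NaN

is.finite(pos_inf)             # FALSE
is.nan(not_num)                # TRUE
```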
Parallel Summary Functions
Many R packages have been developed to support various paradigms of parallel computing. This chapter covers a package that supports local multi-core parallelism.
Logical Conjunctions
In this chapter you will learn about the logical operators in R and their symbols.
Pasting Strings together
String concatenation joins two or more strings into one. The method of concatenating strings in R is covered in this chapter.
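A brief sketch of string concatenation with the base functions paste() and paste0(); the strings themselves are made up:

```r
# paste() joins its arguments with a space by default
greeting <- paste("Hello", "World")                  # "Hello World"

# paste0() joins with no separator and is vectorised
files <- paste0("file", 1:3, ".csv")                 # "file1.csv" "file2.csv" "file3.csv"

# collapse merges a whole vector into a single string
joined <- paste(c("a", "b", "c"), collapse = "-")    # "a-b-c"
```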
Type Coercion
Whenever a function is called in R with an argument of the wrong type, R tries to coerce the value to a type that can be processed. Two types of coercion are explained with examples in this tutorial.
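The sketch below assumes the two types meant here are implicit coercion (done by R automatically) and explicit coercion (requested with the as.* functions); the course itself may draw the line differently.

```r
# Implicit: mixing types in one vector promotes everything to character
mixed <- c(1, "a", TRUE)
class(mixed)                          # "character"

# Implicit: logicals are treated as 0/1 in arithmetic
n_true <- sum(c(TRUE, FALSE, TRUE))   # 2

# Explicit: ask for the conversion yourself
num <- as.numeric("3.14")
int <- as.integer("7")

# A failed conversion yields NA (with a warning, suppressed here)
bad <- suppressWarnings(as.numeric("abc"))
```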
Array & Matrix
Arrays in R can store data in more than two dimensions and are created using the array() function. A matrix is an R object whose elements are arranged in a two-dimensional rectangular layout. The syntax and elements of a matrix are discussed in this chapter.
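A small sketch of creating and indexing a matrix and a three-dimensional array; the values are arbitrary:

```r
# A 2 x 3 matrix, filled column by column by default
m <- matrix(1:6, nrow = 2, ncol = 3)
m[2, 3]            # element in row 2, column 3: 6

# A 2 x 3 x 4 array built with array()
a <- array(1:24, dim = c(2, 3, 4))
dim(a)             # 2 3 4
a[1, 2, 3]         # row 1, column 2, third slice: 15
```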
Factor
Factors in R are used to categorize data and store it as levels. Factors can be built from strings or integers. In this chapter you will learn how to change the order of levels and how to generate factor levels.
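The following sketch creates a factor with an explicit level order, reorders the levels with relevel(), and generates levels with gl(). The category names are invented for illustration.

```r
sizes <- factor(c("medium", "small", "large", "small"),
                levels = c("small", "medium", "large"))
levels(sizes)                 # "small" "medium" "large"
codes <- as.integer(sizes)    # underlying integer codes: 2 1 3 1

# Move one level to the front
sizes2 <- relevel(sizes, ref = "large")
levels(sizes2)                # "large" "small" "medium"

# Generate a factor with 2 levels, each repeated 3 times
groups <- gl(2, 3, labels = c("control", "treated"))
```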
Repository & Packages
CRAN (the Comprehensive R Archive Network) is the repository from which packages can be installed in R.
Installing a Package
There are two ways to add a new package: installing it directly from the CRAN repository, or downloading the package file and installing it manually.
Importing Data
Importing data is easy in R. There are two main packages available for importing data – foreign and Hmisc.
Importing Data SPSS
The foreign package, loaded with library(foreign), provides the read.spss() function for importing SPSS data into R.
Data Aggregation
Aggregating data in R is done using one or more BY variables and a defined function. The function used is aggregate()
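A minimal sketch of aggregate() with one BY variable and sum as the function. The sales data frame is made up for illustration.

```r
sales <- data.frame(
  region = c("East", "West", "East", "West"),
  amount = c(100, 200, 150, 70)
)

# Sum amount within each region (region is the BY variable)
totals <- aggregate(amount ~ region, data = sales, FUN = sum)
print(totals)
```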
Section 2: Data Manipulation and Statistics Basics
Data Manipulation & Statistics Basics
Common data manipulation techniques in R include sorting, randomizing order, converting between vector types, deleting duplicate records, recoding data and mapping vector values.
This chapter describes the basic statistics in R which includes descriptive statistics, frequency counts, cross tabulations, t-tests, regression, ANOVA, MANOVA and others.
Merging
The merge() function is used to merge two data frames in R. Merging is explained with an example.
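As an illustration, the sketch below merges two small, invented data frames on a shared id column, first keeping only matching rows and then keeping every row of the first data frame.

```r
employees <- data.frame(id = c(1, 2, 3), name = c("Ann", "Bob", "Cy"))
salaries  <- data.frame(id = c(2, 3, 4), salary = c(50, 60, 70))

# Inner join: only ids present in both data frames (2 and 3)
inner <- merge(employees, salaries, by = "id")

# Left join: keep all employees; missing salaries become NA
left <- merge(employees, salaries, by = "id", all.x = TRUE)
```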
Data Creation
This chapter contains the common data creation methods in R
What is Statistics
This chapter gives an introduction to statistics and how statistical calculations work in R. It shows you how to calculate variance, covariance and cumulative frequency in R with examples.
Variables
In this lesson you will learn how to create, specify, recode and rename variables in R.
Quantiles
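Here you will learn how to compute the quantiles of an observation variable in R.
library(MASS)
MASS is a CRAN package that provides functions and datasets to support Venables and Ripley's book Modern Applied Statistics with S.
head(faithful)
faithful is a built-in data frame in R that records eruption durations and waiting times for the Old Faithful geyser. It is explained in detail in this chapter.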
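A short sketch using the built-in faithful data frame; treating the eruptions column as the observation variable is an assumption made for this example.

```r
duration <- faithful$eruptions

# Default call returns the 0%, 25%, 50%, 75% and 100% quantiles
q <- quantile(duration)
print(q)

# Any other probabilities can be requested with probs
tails <- quantile(duration, probs = c(0.1, 0.9))
```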
Scatter Plot
A scatter plot displays pairs of values of two quantitative variables in a data set as points. There are different ways to create a scatter plot. The basic function used in R for this is plot(x, y).
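A minimal sketch of plot(x, y), using the built-in faithful data frame as example data; the axis labels and title are illustrative choices.

```r
plot(faithful$eruptions, faithful$waiting,
     xlab = "Eruption duration (min)",
     ylab = "Waiting time (min)",
     main = "Old Faithful eruptions")

# The correlation summarises the linear relationship seen in the plot
r <- cor(faithful$eruptions, faithful$waiting)
```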
Control Flow
The control flow in R works like control statements in any other language. The usage and arguments of the control flow are given in this section.
Section 3: Statistics, Probability and Distribution
Statistics, Probability & Distribution
This chapter gives a brief introduction to statistics and probability distributions in R, with a description and an example of each.
Random Variable
R has a wide range of functions for generating random numbers from various statistical distributions. This section shows how random numbers can be generated in R, with a few examples.
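The sketch below draws random numbers from a few distributions with base R functions. The seed value is arbitrary; it only makes the draws reproducible.

```r
set.seed(42)                       # fix the seed for reproducibility

u   <- runif(5)                    # 5 draws from Uniform(0, 1)
z   <- rnorm(5, mean = 0, sd = 1)  # 5 draws from a standard normal
die <- sample(1:6, size = 10, replace = TRUE)   # 10 rolls of a fair die
```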
Discrete Example
Joint distributions in the discrete case are explained in this chapter with an example.
Continuous Case
Joint distributions in the continuous case are discussed in detail in this lesson with an example.
Exponential Distribution Practice Problem
The exponential distribution describes the arrival time of a randomly recurring sequence of independent events. This section covers its usage, arguments and details, along with a graph of the exponential density.
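A small worked sketch, assuming a hypothetical process with a mean arrival time of 3 minutes (so the rate is 1/3 per minute):

```r
rate <- 1 / 3                          # assumed: one arrival every 3 minutes on average

# Probability that the next arrival occurs within 2 minutes
p_within_2 <- pexp(2, rate = rate)     # equals 1 - exp(-2/3)

# Density at 2 minutes
d_at_2 <- dexp(2, rate = rate)

# Graph of the exponential density
curve(dexp(x, rate = rate), from = 0, to = 15,
      xlab = "Minutes", ylab = "Density")
```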
Expected Value
This section explains how to get an expected value (E-Value) for a dataset in R.
Gambling Example
Here you will learn how R can be used to simulate the Gambler’s ruin problem and to analyse games of chance.
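One possible way to simulate Gambler's ruin is sketched below. The helper name play_until_done, the starting stake, the target and the fair win probability are all assumptions made for this illustration, not the course's own setup.

```r
# Hypothetical helper: bet 1 unit per round until the gambler
# is ruined (stake 0) or reaches the target.
play_until_done <- function(stake, target, p_win = 0.5) {
  while (stake > 0 && stake < target) {
    stake <- stake + if (runif(1) < p_win) 1 else -1
  }
  stake
}

set.seed(1)
outcomes  <- replicate(1000, play_until_done(stake = 5, target = 10))
ruin_rate <- mean(outcomes == 0)

# For a fair game, theory gives a ruin probability of 1 - 5/10 = 0.5
print(ruin_rate)
```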
Deal or no deal
In this lesson you will learn how R is used in betting analysis.
Distribution details
This chapter covers the basic operations connected with distributions in R. Only a few important probability distributions are discussed.
Binomial Distribution
In this lesson you will learn which R functions are used for the binomial distribution, the parameters of the distribution and its expected value, with an example.
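A brief sketch of the binomial functions, assuming 10 tosses of a fair coin as the example:

```r
n <- 10     # number of trials (assumed)
p <- 0.5    # probability of success on each trial (assumed)

p_exactly_3 <- dbinom(3, size = n, prob = p)   # P(X = 3)  = 120/1024
p_at_most_3 <- pbinom(3, size = n, prob = p)   # P(X <= 3) = 176/1024
expected    <- n * p                           # expected value: 5
```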
Uniform Random Variables
This chapter contains the details of uniform distribution in R and the functions used in R for this distribution
Probability distributions examples
Here you will see examples of the major types of probability distributions mentioned in the previous chapters, including the normal distribution, the t distribution, the binomial distribution and the chi-squared distribution.
Section 4: Business Analytics Using R
Business Analytics using R
This chapter will help you to understand how R is used in Business analytics and what are the techniques used in R for business analytics.
Normal PDF
PDF stands for probability density function. The syntax and an example of the normal PDF are given in this tutorial.
What is Normal, Not Normal
In this lesson you will learn about the normal distribution in R: its description, usage and arguments, and the R functions used with it.
SAT Example
This section explains how the normality of SAT scores is assessed using R.
Example- Birth Weights
In this chapter you will learn how to predict the birth weights of infants using a decision tree in R. To explore the data you will need the MASS and rpart packages.
dnorm, pnorm, qnorm
In these function names d stands for density, p for cumulative probability and q for quantile. All three functions are discussed in detail in this chapter with examples.
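The three functions can be sketched for the standard normal distribution as follows:

```r
# d: height of the density curve at a point
dens <- dnorm(0)          # 1 / sqrt(2 * pi), about 0.399

# p: cumulative probability P(X <= x)
prob <- pnorm(1.96)       # about 0.975

# q: the quantile for a given cumulative probability (inverse of p)
quant <- qnorm(0.975)     # about 1.96
```

A non-standard normal distribution is handled by passing mean and sd, for example pnorm(110, mean = 100, sd = 15).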
Understanding Estimation
Empirical Bayes estimation is a statistical method which helps to estimate a large number of proportions in R.
Properties of Good Estimators
This section lists all the qualities that an estimator should possess
Central Limit Theorem
This tutorial will help you to learn the central limit theorem and how it is illustrated in R.
Kurtosis
Kurtosis describes how peaked or heavy-tailed a data distribution is compared with the normal distribution. The three types of kurtosis, platykurtic, mesokurtic and leptokurtic, are explained in this chapter with examples.
Confidence Intervals for the Mean
This section explains what a confidence interval is and how to calculate one from a normal distribution and from a t distribution, including how to calculate many confidence intervals from a t distribution at once. Examples are also given in this section.
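A minimal sketch of a 95% confidence interval for a mean based on the t distribution. The built-in faithful data are used here only as convenient example data.

```r
x  <- faithful$waiting
n  <- length(x)
m  <- mean(x)
se <- sd(x) / sqrt(n)               # standard error of the mean

# Critical value for a two-sided 95% interval with n - 1 degrees of freedom
t_crit <- qt(0.975, df = n - 1)
ci <- c(m - t_crit * se, m + t_crit * se)

# The same interval from the built-in t.test()
builtin_ci <- as.numeric(t.test(x)$conf.int)
```

For an interval based on the normal distribution, qnorm(0.975) would replace the qt() call.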
Computer Lab Example
This section explains how a single-sample t-test is applied, using a computer lab example.
t-distribution
This chapter contains the description, usage, arguments, details, values and graph of Student's t distribution with its degrees of freedom, along with examples.
Section 5: Examples, Testing and Forecasting
R Examples
In this chapter you will see a few examples of using R.
Standard error of the mean
The standard error is the standard deviation divided by the square root of the sample size.
Downloading the Package
This section explains the various methods through which the packages can be downloaded in R
Sample Differences
The comparison of two population proportions and the difference between their sample proportions are explained in this chapter.
Hypothesis Generation and Testing
In this chapter we shall see the procedure of hypothesis generation and testing in R using the intuitive critical value approach
One sided P Value
This chapter will help you to learn how to calculate a single p-value from a normal distribution, a single p-value from a t distribution and many p-values from a t distribution. It also explains the one-sided test.
Power & Sample Size
Power analysis is a part of experimental design and involves four quantities: sample size, effect size, significance level and power, where power = 1 - P(Type II error). All four quantities are explained in detail here.
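Given any three of the four quantities, the fourth can be solved for. The sketch below uses power.t.test() from base R's stats package to find the sample size for a two-sample t-test; the effect size, significance level and power are assumed values chosen for illustration.

```r
# Solve for n by leaving it unspecified
res <- power.t.test(delta = 0.5,        # assumed difference in means
                    sd = 1,             # assumed standard deviation
                    sig.level = 0.05,   # significance level
                    power = 0.80)       # desired power

n_per_group <- ceiling(res$n)           # round up to whole subjects
print(n_per_group)                      # roughly 64 per group
```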
Calculating the Z value
A Z score measures the distance of a value from the mean in units of standard deviations. This chapter shows how it is calculated and used in R.
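A short sketch of the calculation on a made-up vector of scores, checked against the built-in scale() function:

```r
scores <- c(52, 61, 70, 75, 88)

# Z score: (value - mean) / standard deviation
z <- (scores - mean(scores)) / sd(scores)

# scale() performs the same standardisation (it returns a matrix)
z_builtin <- as.numeric(scale(scores))
```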
Lower Tail test of population proportion
The lower tail test of a population proportion and its null hypothesis are explained using examples in this lesson.
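As a sketch, suppose (hypothetically) that 85 of 148 sampled items are successes and the null hypothesis is that the population proportion is at least 0.6. The counts and the hypothesised proportion are invented for this example.

```r
successes <- 85      # assumed sample count
n         <- 148     # assumed sample size
p0        <- 0.6     # hypothesised proportion under the null

p_hat <- successes / n
z     <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # test statistic

# Lower tail p-value: probability of a z this small or smaller
p_value <- pnorm(z)

# Reject the null at the 5% level only if p_value < 0.05
reject <- p_value < 0.05
```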
Time Series Analysis Applications
R has a lot of facilities for time series analysis. This section explains the creation of time series in R.
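A minimal sketch of creating a monthly time series with ts(); the sales figures and the start date are invented.

```r
monthly_sales <- c(120, 135, 150, 145, 160, 175,
                   170, 180, 165, 155, 140, 190)

# frequency = 12 marks the data as monthly; start is c(year, month)
sales_ts <- ts(monthly_sales, start = c(2023, 1), frequency = 12)

# window() extracts a sub-period, here July to December
second_half <- window(sales_ts, start = c(2023, 7))
```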
Forecasting
The forecast package is used in R for automatic selection of exponential smoothing and ARIMA models. The forecast function and its approaches are discussed here.
Observation Components
There are three observation components in Time series analysis – Trend, Seasonal and Irregular. These components are given with examples in this tutorial
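One way to separate the three components is the base R decompose() function, sketched here on the built-in AirPassengers series. decompose() calls the irregular component "random".

```r
parts <- decompose(AirPassengers)

trend     <- parts$trend      # long-term movement
seasonal  <- parts$seasonal   # repeating within-year pattern
irregular <- parts$random     # what remains after removing the other two

plot(parts)                   # one panel per component
```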
Traditional Approaches
This section explains how the traditional time series models are different from the models which are currently used.
Double Exponential Smoothing
This smoothing method is used when the observations show a trend. The procedure for double exponential smoothing is explained in this lesson.
ARIMA Steps
The ARIMA model steps in R are explained in this chapter
Forecasting Performance
This explains how forecasting is done in an ARIMA model
Univariate ARIMA
This explains how to fit an ARIMA model to a univariate time series.
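A minimal sketch of fitting and forecasting a univariate ARIMA model. It uses arima() and predict() from base R's stats package rather than the forecast package, and the model orders are an assumption (the classic "airline" specification), not a selection made by the course.

```r
# Fit a seasonal ARIMA(0,1,1)(0,1,1)[12] to the log of the series
fit <- arima(log(AirPassengers),
             order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))

# Forecast the next 12 months and convert back from the log scale
fc <- predict(fit, n.ahead = 12)
forecast_values <- exp(fc$pred)
```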
Section 6: Understanding Visualizations
R Visualization
This chapter gives an introduction to data visualization in R. R offers a wide variety of built-in functions and libraries for visualizing data. The chapter also gives a brief history of data visualization in R.
Why Visualize
The importance of data visualization is discussed in this chapter
Overlaying Plots
This section shows you how to overlay scatter plots and how to combine two graphs in R.
Graphical Representation of Data
This gives an overview of R graphics and the different types of graphs used in R for presenting data.
Advanced Graphs
Advanced graphs such as heat maps, mosaic plots, map visualizations, 3D graphs and correlograms are explained in detail in this chapter.
Bubble Charts
Bubble charts can be created in R using the ggplot2 package.
Anova
Here you will learn how to conduct analysis of variance (ANOVA) in R, including one-way ANOVA, post hoc testing and other ANOVA models, with examples for each.
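A short sketch of a one-way ANOVA followed by a post hoc test, using the built-in PlantGrowth data as example data. Tukey's HSD is one common post hoc choice; the course may use others.

```r
# One-way ANOVA: does mean weight differ between the three groups?
fit <- aov(weight ~ group, data = PlantGrowth)
summary(fit)

# Post hoc pairwise comparisons
posthoc <- TukeyHSD(fit)
print(posthoc)
```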
Estimate of Average Treatment effect
This section covers the ATE package in R: its description, functions, usage and arguments.
Factorial Anova
Factorial experiments in ANOVA are explained, with their advantages, disadvantages, examples and interaction plots in R.
Regression
Regression is one of the most often used techniques in statistics. Simple linear regression and multiple linear regression models in R are covered in this chapter with examples.
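Both kinds of model can be sketched with lm(). The built-in faithful and mtcars data sets are used here only as convenient examples.

```r
# Simple linear regression: one predictor
simple_fit <- lm(eruptions ~ waiting, data = faithful)
coef(simple_fit)                       # intercept and slope

# Multiple linear regression: two predictors
multi_fit <- lm(mpg ~ wt + hp, data = mtcars)
r2 <- summary(multi_fit)$r.squared     # proportion of variance explained

summary(multi_fit)                     # full regression output
```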
Output of Regression Model
This section contains the sample output of the regression model in R and explains each of its sections in detail
FAQs: General Questions
- Do I need to know statistics to take this course?
It is not mandatory, but basic statistical knowledge is desirable. We also offer a few courses on basic and elementary statistics which will help you to refresh your statistical knowledge.
- Why take this course on R?
Learning R will help you to perform analytics and build models on your own. It is one of the most powerful and widely used programming languages for statistical computing and graphics, which makes it a must-know language for many data scientists these days. It will help you to advance your career.
Testimonials
Jersey
I took this course from educba and it was such a useful course for me. It helped me to improve my knowledge in R language to a great extent. The course covers all the topics from basics of R to its deeper context. Each topic is explained with neat examples to make learning easy. Such a great course at a great cost. Highly recommended.
Where do our learners come from?
Professionals from around the world have benefited from eduCBA’s courses. Some of the top places that our learners come from include New York, Dubai, San Francisco, Bay Area, New Jersey, Houston, Seattle, Toronto, London, Berlin, UAE, Hong Kong, Singapore, Australia, New Zealand, Bangalore, New Delhi, Mumbai, Pune, Kolkata, Hyderabad and Gurgaon among many.