# Course Identification

Introduction to Data Analysis in R
20205081

## Lecturers and Teaching Assistants

Dr. Giora Alexandron
Dr. Tanya Nazaretsky

## Course Schedule and Location

2020
First Semester
Thursday, 09:00 - 10:30, Musher, Lab 3
07/11/2019

## Field of Study, Course Type and Credit Points

Science Teaching: Lecture; Elective; Regular; 2.00 points

## Comments

On December 12 and December 26, the course will be held in classroom 2 (Musher Lab 2).

No

20

Hebrew

## Attendance and participation

Expected and Recommended

## Grade Type

Numerical (out of 100)

10%
40%
45%
5%

Final assignment

N/A
N/A
-
N/A

2

## Syllabus

Data analysis is becoming a fundamental competency in educational research. R is a free programming language for statistical computing that is widely used by researchers and data scientists for data analysis and visualization. The course covers the basics of data analysis in R. Evaluation is based mainly on assignments and a final project.

Below is a list of the main topics that the course will touch upon (not necessarily in this order):

• Preliminaries – the very basics of R:
1. Introduction to variables, data structures, and their representation in R (variables, vectors, lists, matrices, data frames)
2. Input/output: loading data files, saving results to file
• Data Programming:
• Sub-setting and Splitting data
• Control flow: Conditions and loops
• Merging datasets
• The apply family of functions: applying a function to every item of a list or vector without writing an explicit loop
• Data Cleaning: Detecting and handling incomplete, incorrect, inaccurate or irrelevant parts of the data (e.g., missing values, outliers, etc.)
• Learning from examples: Using online forums and code bases to retrieve programming solutions
• Applied Statistics:
• Descriptive Statistics
• Hypothesis testing (t.test, Wilcoxon), parametric and non-parametric statistics, bootstrap hypothesis testing
• Basics of Supervised Machine Learning: Linear and logistic Regression*
• Basics of Unsupervised Machine Learning: Cluster Analysis*
• Visualizing Research Results (plot, barplot, boxplot, …).
• Building interactive interfaces with Shiny*

* Advanced topics; coverage depends on students’ progress in the course
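
As a rough illustration of how a few of the topics above fit together (data frames, data cleaning, splitting, and the apply family), here is a minimal sketch in base R. The `scores` data frame, its column names, and its values are invented for this example and are not course material.

```r
# Hypothetical example data: quiz scores for two small groups of students
scores <- data.frame(
  group = c("A", "A", "A", "B", "B", "B"),
  score = c(70, 85, NA, 60, 75, 90)
)

# Data cleaning: drop rows where the score is missing
clean <- scores[!is.na(scores$score), ]

# Sub-setting and splitting: one numeric vector of scores per group
by_group <- split(clean$score, clean$group)

# Apply family: compute the mean of each list element without an explicit loop
group_means <- sapply(by_group, mean)

# Descriptive statistics for the cleaned scores
score_summary <- summary(clean$score)
```

Printing `group_means` shows one mean per group. Dropping incomplete rows is only one way to handle missing values; which approach is appropriate depends on the data and the research question.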

## Learning Outcomes

Upon successful completion of this course, students will be able to use R to analyze structured datasets and extract insights from them, and to use R’s visualization capabilities to report and communicate research findings in presentations and papers.

## Reading List

Harvard’s edX MOOC on R basics: https://www.edx.org/course/data-science-r-basics-2

Stack Overflow: a Q&A forum for programmers

The R Project for Statistical Computing – the homepage of the R project.

An Introduction to Statistical Learning with Applications in R – a comprehensive textbook by James, Witten, Hastie and Tibshirani.