Course Identification

Introduction to Big Data and Machine Learning
20213121

Lecturers and Teaching Assistants

Prof. Zohar Yakhini, Dr. Keren Ouaknine, Prof. Ariel Shamir
N/A

Course Schedule and Location

2021
First Semester
N/A
04/11/2020

Field of Study, Course Type and Credit Points

Life Sciences (Computational and Systems Biology Track): Lecture; Obligatory; Core; 2.50 points

Comments

Class will take place 14:00-16:30 every Wednesday at IDC.
Recitation will take place 17:00-18:00, after which the course staff will be available for consultation (office hours) until 19:00.
The first class will take place on 4/Nov/2020 and the last class on 27/Jan/2021, for a total of 13 consecutive weeks.

Prerequisites

Basic programming course + basic calculus and linear algebra

Restrictions

20

Language of Instruction

English

Registration by

06/10/2020

Attendance and participation

Obligatory

Grade Type

Numerical (out of 100)

Grade Breakdown (in %)

60%
40%
Project

Evaluation Type

Final assignment

Scheduled date 1

N/A
N/A
-
N/A

Estimated Weekly Independent Workload (in hours)

N/A

Syllabus

Part 1 - Introduction to Machine Learning 

Introduction, linear regression 
Python libraries: pandas, numpy, visualization libraries 
Evaluation, training and test sets, ROC curves
Classification
Clustering, PCA

Part 2 - Data science and Statistics 

Density estimation, MLE, Bayes classification
Statistics for scientists: correlations, p-values and multiple testing
A mini project in analyzing high throughput data

Part 3 - Big Data 

Introduction to Hadoop
Query Languages 
Machine Learning use cases over Big Data   

Part 4 - Presentation of the mini project results
 

Learning Outcomes

Upon successful completion of this course students should be able to:

- understand machine learning algorithms and apply them to data 

- statistically assess observations in data, including correlations

- launch and use Big Data platforms to analyze large volumes of data

- understand and configure machine learning packages including Deep Learning and SVM

- analyze large volumes of experimental data and present results   

Reading List

  • An Introduction to Statistical Learning by James et al.
  • Pattern Recognition and Machine Learning by Bishop

Website

N/A