Course Identification

Introduction to Big Data and Machine Learning
20203361

Lecturers and Teaching Assistants

Prof. Zohar Yakhini, Dr. Keren Ouaknine, Prof. Ariel Shamir
Alon Oring, Ben Galili

Course Schedule and Location

2020
First Semester
Wednesday, 13:30 - 16:00
13/11/2019
05/02/2020

Field of Study, Course Type and Credit Points

Life Sciences (Computational and Systems Biology Track): Lecture; Obligatory; Core; 2.50 points
Life Sciences: Lecture; Elective; Core; 2.50 points
Life Sciences (Molecular and Cellular Neuroscience Track): Lecture; Elective; Core; 2.50 points
Life Sciences (Brain Sciences: Systems, Computational and Cognitive Neuroscience Track): Lecture; Elective; Regular; 2.50 points

Comments

Class will take place 13:30-16:00 every Wednesday at IDC.
Recitation will take place 16:30-17:30, and the course staff will be available for consultation afterwards (office hours) until 18:30.
The first class will take place on 13/Nov/2019 and the last class on 5/Feb/2020, for a total of 13 consecutive weeks.

Prerequisites

Basic programming course + basic calculus and linear algebra

Restrictions

20

Language of Instruction

English

Registration by

24/09/2019

Attendance and participation

Obligatory

Grade Type

Numerical (out of 100)

Grade Breakdown (in %)

68%
32%
Project

Evaluation Type

Final assignment

Scheduled date 1

N/A
N/A
-
N/A

Estimated Weekly Independent Workload (in hours)

N/A

Syllabus

Part 1 - Introduction to Machine Learning 

Introduction, linear regression 
Python libraries: pandas, numpy, visualization libraries 
Evaluation, training and test sets, ROC curves
Decision trees

Part 2 - Data science and Statistics 

Density estimation, MLE, Bayes classification
Clustering, PCA
Statistics for scientists: correlations, p-values and multiple testing
A mini project in analyzing high throughput data

Part 3 - Big Data 

Introduction to Hadoop
Query Languages 
Machine Learning use cases over Big Data   

Part 4 - A next step in Machine Learning 

Classifiers - SVM and kNN
A brief introduction to Deep Learning

Part 5 - Presentation of the mini project results

Further details here: 
http://kereno.com/syllabus_wis.pdf

Learning Outcomes

Upon successful completion of this course students should be able to:

- understand machine learning algorithms and apply them to data 

- statistically assess observations in data, including correlations

- launch and use Big Data platforms to analyze large volumes of data

- understand and configure machine learning packages including Deep Learning and SVM

- analyze large volumes of experimental data and present results   

Reading List

  • An Introduction to Statistical Learning, by James et al.
  • Pattern Recognition and Machine Learning, by Bishop

Website