Big Data Processing (AI60004) / Spring, 2022-23


Course Description

The course gives a comprehensive introduction to storing and processing big data using modern big data systems, such as MapReduce and Spark, that run on large commodity clusters. The primary focus is on algorithm design and programming at scale, applied across the major data domains: text, graph, streaming, and relational data. The course also introduces scalable machine learning algorithms using Spark. The Databricks cloud platform will be used for hands-on demos. Students may use Databricks or any other cluster computing framework for their assignments.

Prerequisites

  • Programming and Data Structure
  • Algorithm Design
  • Python programming language

Instructions

  • Attendance is mandatory.
  • Assignments must be submitted within the given time period; deadlines will not be extended under any circumstances.

Honor Code

Academic integrity is very important to us. You are required to follow the honor code to maintain academic integrity.

  • Your solutions to assignments and tests must be entirely your own (exception: you may collaborate if instructed by the faculty).
  • You may not share your solutions to the scheduled assignments and tests with your peers unless instructed by the faculty.

Instructor

Teaching Assistants

Animesh

ainimesh1@gmail.com

Subhajit

shubhajitdatta1988@gmail.com

RajKrishna

rajkrishanghosh@kgpian.iitkgp.ac.in

Om Prakash

omchakrabarty@gmail.com

Raktima

reddish.rb@gmail.com