DSML SYLLABUS
RATIONALE
Diploma holders in Computer Science and Engineering need to understand Data Science and Machine Learning and know how to implement Machine Learning algorithms.
They should be able to solve real-time problems using Data Science and Machine Learning techniques. Hence, this subject is introduced in the curriculum.
LEARNING OUTCOMES
After undergoing the subject, the students will be able to:
- Understand the basics of Data Science
- Understand and develop Machine Learning algorithms
- Implement Dimensionality Reduction Techniques
DETAILED CONTENTS (DSML)
1. Introduction to Data Science and Machine Learning
Fundamentals of Artificial Intelligence, need for and applications of Data Science, Data Mining, data preparation, Machine Learning, types and applications of Machine Learning
2. Data Preprocessing, Analysis and Visualization
Data Pre-processing: Pre-processing techniques (Mean Removal, Scaling, Normalization, Binarization, One-Hot Encoding, Label Encoding); Data Analysis: loading and summarizing the dataset; Data Visualization: Univariate Plots, Multivariate Plots; Training Data, Test Data, Performance Measures
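The pre-processing techniques named in this unit can be illustrated with a minimal sketch in plain Python. The function names below are hypothetical and chosen only for illustration; the syllabus does not prescribe a library, and scaling is assumed here to mean min-max scaling to the range [0, 1].

```python
import math


def mean_removal(values):
    """Centre a feature column on zero by subtracting its mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def min_max_scale(values):
    """Rescale a feature column to [0, 1] (assumed meaning of 'Scaling')."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def l2_normalize(row):
    """Scale one sample (row) so that its Euclidean length is 1."""
    norm = math.sqrt(sum(v * v for v in row))
    return [v / norm for v in row] if norm else list(row)


def binarize(values, threshold):
    """Map each value to 1 if it exceeds the threshold, otherwise 0."""
    return [1 if v > threshold else 0 for v in values]


def label_encode(labels):
    """Replace each category with an integer code (codes follow sorted order)."""
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    return [index[label] for label in labels], classes


def one_hot_encode(labels):
    """Replace each category with a 0/1 vector that has a single 1."""
    codes, classes = label_encode(labels)
    return [[1 if i == code else 0 for i in range(len(classes))] for code in codes]
```

For example, `label_encode(["red", "green", "red"])` assigns `green` the code 0 and `red` the code 1, and `one_hot_encode` turns the same labels into two-element vectors.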
3. Statistical Inference
Populations and samples, Types of Statistical modelling, Types of probability distributions, Parametric and Non-Parametric Methods, Distance Metrics
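The syllabus does not list which distance metrics are covered. As an assumption, the sketch below shows four commonly taught ones (Euclidean, Manhattan, Chebyshev and Minkowski) plus cosine distance, written in plain Python for two equal-length numeric vectors.

```python
import math


def euclidean(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """City-block distance: sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    """Largest absolute difference along any single coordinate."""
    return max(abs(x - y) for x, y in zip(a, b))


def minkowski(a, b, p):
    """Generalisation: p=1 gives Manhattan, p=2 gives Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def cosine_distance(a, b):
    """1 minus the cosine of the angle between the two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1 - dot / (norm_a * norm_b)
```

For the points (0, 0) and (3, 4), the Euclidean distance is 5, the Manhattan distance is 7 and the Chebyshev distance is 4.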
4. Exploratory Data Analysis and the Data Science Process
Basic tools (plots, graphs and summary statistics) of EDA, Philosophy of EDA, The Data Science Process
5. Machine Learning Algorithms
Introduction to Supervised Learning Algorithms: Decision Tree, Linear Regression, k-Nearest Neighbours (k-NN), SVM; Introduction to Unsupervised Learning Algorithms: K-means Clustering, Mean Shift Algorithm, Dimensionality Reduction Techniques; Introduction to Neural Networks
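As one illustration of the supervised algorithms in this unit, the following is a minimal sketch of a k-NN classifier in plain Python. It assumes Euclidean distance and a simple majority vote; the function name `knn_predict` and the sample data in the usage note are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair every training point's distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Keep the labels of the k closest points and return the most common one.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]
```

With training points clustered around (1, 1) labelled "a" and around (8, 8) labelled "b", a query at (2, 2) falls nearest to the "a" points. An odd value of k avoids tied votes in a two-class problem.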
6. Mining Social-Network Graphs
Social networks as graphs, Clustering of graphs, Direct discovery of communities in graphs, Partitioning of graphs, Neighbourhood properties in graphs
7. Data Science and Ethical Issues
Discussions on privacy, security, ethics, A look back at Data Science, Next-generation data scientists
LIST OF PRACTICALS
- WAP to implement the Decision Tree algorithm
- WAP to implement Linear Regression
- WAP to implement k-Nearest Neighbours (k-NN)
- WAP to implement the SVM algorithm
- WAP to implement K-means Clustering
- WAP to implement various Distance Metrics
- WAP to implement Dimensionality Reduction Techniques
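As a hedged starting point for the Linear Regression practical, simple (one-variable) linear regression can be fitted with the closed-form least-squares formulas. This is a sketch in plain Python under that assumption, not a prescribed solution; `fit_line` and `predict` are hypothetical names.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Evaluate the fitted line at x."""
    return slope * x + intercept
```

For data that lies exactly on the line y = 2x + 1, such as x = 1, 2, 3, 4 and y = 3, 5, 7, 9, the fit recovers a slope of 2 and an intercept of 1.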
INSTRUCTIONAL STRATEGY
The subject is both conceptual and practical. Students should be given a clear idea of the basic concepts of Data Science and Machine Learning. In practical sessions, students should be asked to explain the algorithm, then write a program for it and run it on a computer. Students are required to maintain records (files with printouts).
MEANS OF ASSESSMENT
- Assignments and quiz/class tests, mid-term and end-term written tests
- Actual laboratory and practical work, exercises and viva-voce
- Software installation, operation, development and viva-voce