
Prime Classes

Learn from Practitioners

Course Brochure
DATA SCIENCE foundation
DATA SCIENCE advanced
python

“AI will add 2.3 million jobs by 2020”
~ Gartner
data science foundation - course overview
These are state-of-the-art programs designed by Prime Classes, promoted by a team of data science
practitioners with 15+ years of experience in the IT industry.

These are experiential, application-driven programs for highly motivated working
professionals who want to become data science practitioners.

COURSE DURATION: 4 WEEKS
TOTAL SESSION HOURS: 40 HRS
TOTAL CODING HOURS: 20 HRS
EFFORT: 20 HRS / WEEK
CASE STUDIES & PROJECTS: 10+
PROSPECTIVE CAREERS: executive, data analyst

SESSION DURATION
2 HRS (2 Hr – Concept Learning and Concept Implementation)

program curriculum
Module 1 – Statistics and Probability
Define Statistical Inference.
List the terminologies of Statistics.
Illustrate the measures of center and spread.
Probability Theory.
Conditional Probability.

Bayes Theorem.
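
As a small illustration of the conditional probability and Bayes' theorem topics listed above, the following Python sketch computes a posterior probability for a screening test. The function name `posterior` and the figures used (1% prior, 95% true-positive rate, 5% false-positive rate) are assumptions made up for this example; they do not come from the course material.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) using Bayes' theorem.

    prior               -- P(H), probability of the hypothesis before evidence
    likelihood          -- P(E | H), probability of the evidence if H is true
    false_positive_rate -- P(E | not H), probability of the evidence if H is false
    """
    # Total probability of seeing the evidence, P(E).
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical screening test: 1% of people have the condition.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(f"P(condition | positive test) = {p:.3f}")
```

Even with a fairly accurate test, the posterior stays low (about 0.16 here) because the prior is small, which is the usual point of this exercise.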

Module 2 – Introduction to R
R basics, understanding of data types, functions, control structures, data
manipulations, date and string manipulations.
List the terminologies of Statistics.

“AI is the next industrial revolution”

Hands-on implementation of all the pre-processing techniques in R.
Hands-on implementation of all the visualization techniques in R.

Module 3 - Data Preprocessing and Exploration


Pre-processing techniques: Binning, Filling missing values, Standardization and
Normalization, Type conversions, train-test data split.
Hands-on implementation of all the pre-processing techniques in R.
Need for Visualizations.
Communicating with data: Issues and guiding principles; Primary ingredients of data
visualization; How to pick visual encodings such as colour, shape, size, etc.; Which chart
to use when; How to accommodate more than 2 dimensions.
Case highlighting the transition from a simple chart to a powerful visualization, complete
with storytelling.

Module 4 – Introduction to Machine Learning


Define Machine Learning.
Machine Learning Use Cases.
List the categories of Machine Learning.

Module 5 – Linear Regression, Logistic Regression and ROC Curves


Introduction to statistical concepts such as hypothesis testing, t-test and ANOVA.
Simple Linear Regression and Multiple Linear Regression.
Logistic Regression.
ROC Curves.

Module 6 – Classification
Define Classification.
Decision Trees.
Random Forest.
Naïve Bayes.
Support Vector Machines.

Module 7 - Unsupervised Learning


Define Unsupervised Learning.
K Means Clustering.
C Means Clustering.
Hierarchical Clustering.

“The world is now awash in data and we
can see consumers in a lot clearer ways.”
Module 8 – Distance Measures
Introduction to different variable types.
Cosine Distance.
Hamming Distance.
Jaccard Distance.
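
The three distance measures listed in this module can be sketched in a few lines of plain Python. This is a minimal illustration using only the standard library; the function names are chosen here for readability and are not taken from the course material.

```python
import math


def cosine_distance(a, b):
    """1 minus the cosine of the angle between two non-zero numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def jaccard_distance(a, b):
    """1 minus |intersection| / |union| of two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention used here: two empty sets are identical
    return 1.0 - len(a & b) / len(union)


print(cosine_distance([1, 0], [0, 1]))        # orthogonal vectors
print(hamming_distance("karolin", "kathrin"))  # strings of equal length
print(jaccard_distance({1, 2, 3}, {2, 3, 4}))  # overlapping sets
```

Cosine distance suits numeric vectors, Hamming distance suits equal-length categorical sequences, and Jaccard distance suits sets, which is why the module introduces variable types first.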

Module 9 – Recommendation Engines


Association Rules and Market Basket Analysis.
Content-based Recommendation Engines.
User-based Collaborative Filtering.
Item-based Collaborative Filtering.

Module 10 – Text Mining


Introduction to TF-IDF.
Bag of Words Approach.
Introduction to Sentiment Analysis.
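
To make the TF-IDF and bag-of-words topics above concrete, here is a minimal Python sketch. It assumes whitespace tokenization, non-empty documents, term frequency as a share of the document length, and the unsmoothed `log(N / df)` form of inverse document frequency; libraries often use smoothed variants, so treat the exact weighting as one choice among several. The function name `tf_idf` and the sample sentences are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Each document is turned into a bag of words (term counts), then every
    count is scaled by how rare the term is across the whole collection.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)  # the bag of words for this document
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = ["the cat sat", "the dog barked"]
w = tf_idf(docs)
print(w[0])
```

A term that appears in every document, such as "the" in this example, receives a weight of zero, while terms unique to one document receive the highest weights.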

Module 11 – Time Series Forecasting


Introduction to Time Series Forecasting.
Understanding Time Series Components - Trend, Seasonality and Random Noise.
ARIMA.
Holt-Winters and Exponential Time Series.
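
The exponential methods named above build on simple exponential smoothing, which can be sketched in plain Python as follows. This reading of the topic is an assumption, and the sketch covers only the level component; Holt-Winters adds trend and seasonal terms that are not shown here. The function name and the sample series are made up for illustration.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing.

    s[0] = x[0]
    s[t] = alpha * x[t] + (1 - alpha) * s[t - 1]
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not series:
        return []
    smoothed = [float(series[0])]
    for value in series[1:]:
        # Blend the new observation with the previous smoothed level.
        smoothed.append(alpha * value + (1.0 - alpha) * smoothed[-1])
    return smoothed


demand = [10, 20, 30, 40]  # hypothetical observations
smoothed = exponential_smoothing(demand, alpha=0.5)
print(smoothed)
```

A larger `alpha` makes the smoothed series react faster to recent observations, while a smaller `alpha` gives more weight to history.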

Module 12 – Introduction to Neural Networks


Introduction to Neural Networks.
Introduction to different types of Neural Networks:
Feedforward Networks.
Feedback Networks.
Recursive Neural Networks.
Convolutional Neural Networks.
How to design a Neural Network.

Module 13 – Fine-tuning a Machine Learning Model


How to choose an Error Metric.
Overfitting and Variance.
Underfitting and Bias.
Learning Curves.
Measures to reduce Bias and Variance.

“Google has gone from two
deep learning projects to 1000 today.”
~ Fortune
data science advanced - course overview

COURSE DURATION: 16 WEEKS
TOTAL SESSION HOURS: 180 HRS
TOTAL CODING HOURS: 150+ HRS
EFFORT: 24 HRS / WEEK
CASE STUDIES & PROJECTS: 25+
PROSPECTIVE CAREERS: business analyst, data architect, data scientist

SESSION DURATION
2 HRS 30 MINS (1 Hr 30 Mins – Concept Learning, 1 Hr – Concept Implementation)
Collaborate with mentors on coding assignments and projects in the last 1 hr of
every session.

program curriculum
1. Introduction to Probability and Statistics for Data Science
This module aims at preparing you for the essential skill of thinking like a statistician.

This module will change your analytical thinking process, and you will start
looking at data and numbers from a different perspective. This is a fundamental
module, and strong concepts in this area will enable you to differentiate yourself as a
Data Scientist.

This module covers

Probability theory and related algorithms
Descriptive statistical methods
Inferential statistical methods

From a tools perspective, you will gain confidence with tools like R and Excel.

“The big technology trend is to make systems
intelligent and data is the raw material.”
~ Amod Malviya, CTO, Flipkart
Fundamentals of Probability
Introduction to random variables
Probability theory
Conditional probability
Bayes Theorem
The Concept of a data set

Understanding the properties of an attribute: Central tendencies (Mean, Median, Mode);
Measures of spread (Range, Variance, Standard Deviation)
Basics of Probability Distributions; Expectation and Variance of a variable
Probability distribution and differences between discrete and continuous distributions
Discrete probability distributions: Binomial, Poisson
Continuous probability distributions: Normal distribution; t-distribution

Procedure for gaining inference about populations from samples.

Understand the data attributes, distributions, sample vs population
Procedure for statistical testing
Extend the understanding to analyze relationships between variables
How to conduct statistical hypothesis testing and introduction to various methods such as
chi-square test, t-test, z-test, F-test and ANOVA
Covariance and Correlation and a Precursor to Regression
Hands-on Implementation in R

2. Essential mathematical concepts


Vectors, Matrices, Eigenvalues, Eigenvectors, Orthogonality, etc.
Kernel tricks, kernel functions, PCA, SVD, LSA
Hands-on implementation in R

3. Essential Engineering skills for Data Science


Data preprocessing techniques
Python and R basics
Database Concepts
String and list objects
Exception handling
Understanding of data structures, functions, control structures, data manipulations, date
and string manipulations
Pre-processing techniques: Binning, Filling missing values, Standardization and
Normalization, Type conversions, train-test data split, ROCR1

“Data is the new science. Big data holds the answers.”
~ Pat Gelsinger, CEO, VMware
4. Data Exploration, Data Visualizations and Data Story
Need for Visualizations
How to tell a Data Story
Communicating with data: Issues and guiding principles; Primary ingredients of data
visualization; How to pick visual encodings such as color, shape, size; Which chart to
use when; How to accommodate more than 2 dimensions
A case highlighting the transition from a simple chart to a powerful visualization, complete
with storytelling
Using R ggplot2 and Qlik Sense for visualizations

5. Introduction to Planning and Architecting Data Science Solutions


Frameworks to analyze a data science problem
How to choose an error metric
Efficient ways to present the results of data science and data analytics
The different forms in which data is available

6. Introduction to Machine Learning - Methods and Algorithms


Fundamentals of Linear regression.
Linear regression
Relationship between multiple variables: regression (Linear, Multivariate Linear
Regression) in prediction.
Understanding the summary output of Linear Regression
Residual Analysis
Identifying significant features, feature reduction using AIC, multicollinearity check,
observing influential points.
Non-normality and Heteroscedasticity
Hypothesis testing of regression Model
Confidence intervals of Slope
R-square and goodness of fit
Influential observations - leverage of Multiple linear Regression
Polynomial Regression
Categorical Variables in Regression
Hands-on Linear Regression
Introduction and deep dive into logistic regression and the important concept of ROC
curves
Logistic Regression
ROC curves
Logistic regression in classification; output interpretations
Hands-on logistic Regression

“Tech giants have acquired 140
AI companies since 2011.”
~ Observer Magazine
Time Series Analysis
Decomposition of Time Series
Trend and Seasonality detection and forecasting
Smoothing Techniques
Understanding ACF & PACF plots
ARIMA Modeling
Holt-Winters Method

Principles and ideas in the field of Data Mining

Rule patterns, construction of a rule-based classifier from data, turning trees into rules, rule
growing strategy, rule evaluation and stopping criteria, and several business metrics such as
actionability and explicability; the module then turns to association rules and covers them
in detail.
Indirect: from decision trees
Direct: Sequential covering
Market Basket Analysis, Apriori, Recommendation engines, Association Rules
How to combine clustering and classification
How to measure the quality of clustering – outlier analysis
Association Analysis
FP Trees
Hands-on with R
Introduction and deep dive into logistic regression and the important concept of ROC
curves
Top-Down Induction of Decision Trees (TDIDT)
Attribute selection based on information theory approach
Recursive partitioning (binary search)
ID3, C4.5, C5.0 for pattern recognition problems, avoiding overfitting, converting trees to
rules
Hands-on with R

Distance-based classifiers.

K-Nearest Neighbor algorithm


Aspects to consider while designing K-Nearest Neighbor
Hands-on example of K-Nearest Neighbor using R
Collaborative filtering

Neural networks
Perceptron and Single Layer Neural Network.
Backpropagation algorithm and a typical Feed Forward Neural Net.
Hands-on with R with a Case.

“Without big data, companies are blind and deaf,
wandering out onto the web like deer on a freeway.”
~ Geoffrey Moore, American organizational theorist
Support vector machines (SVM).
Linear learning machines and kernel space, making kernels and working in feature space.
SVM algorithm and comparison with Neural Nets
Demonstrate the working of SVM on classification problems using a business case in R.

Ensemble methods
Bagging and boosting and their impact on bias and variance
C5.0 boosting
Random Forest
AdaBoost
Gradient boosting machines
Unsupervised learning algorithm - Clustering
Different clustering methods; review of several distance measures
Iterative distance-based clustering
Dealing with continuous, categorical values in K-Means
Constructing a hierarchical cluster, K-medoids, k-mode and density-based clustering to
handle different types in practice.
Test for stability check of clusters
Hands-on implementation of each of these methods will be conducted in R.

Bayesian belief nets, Naïve Bayes, popular techniques to handle Overfitting and
Underfitting
Introduction to generative techniques
Bayesian belief nets (BBN)
Naïve Bayes - a special case of BBN
Hands-on Naïve Bayes in R
How to avoid Overfitting and Underfitting
Refresher on all the machine learning algorithms

7. Text Mining and Natural language Processing


Text processing algorithms
Basics of search engines
Introduction to the fundamentals of information retrieval; Language modeling
N-gram models of language
Smoothing and probabilistic language models
Query likelihood model
2-stage smoothing
Text Indexing and Crawling
Inverted Indexes

“Data is the Next Intel Inside.”
~ Tim O’Reilly, Founder, O'Reilly Media
Boolean query processing
Handling phrase queries
Proximity queries
Crawling
Relevance Ranking
Need for Relevance Ranking
TF and IDF
Thinking about the math behind the text; Properties of words; Vector Space Model
Evaluation metrics for Ranking

Link Analysis Algorithms


PageRank
HITS
Topic-sensitive PageRank
Spam Detection Algorithms

Natural Language Processing


Stemming, phrase identification, word sense disambiguation
POS tagging
Parsing and semantic structures
Coreference resolution
Named Entity Recognition
What is NER?
Possible applications of NER
Evaluation and testing
NER methods

8. Deep Learning using TensorFlow


Basics of neural network
Linear algebra
Implementation of neural network in Vanilla
Basics of TensorFlow
Convolutional neural networks (CNNs)
Recurrent neural networks (RNNs)
Generative models
Semi-supervised learning using GAN
Seq-to-seq model
Encoder and decoder

“Just as consumer appliances have microprocessors,
apps will have AI.”
~ Gartner
Python programming - course overview

COURSE DURATION: 4 WEEKS
TOTAL SESSION HOURS: 30 HRS
TOTAL CODING HOURS: 30 HRS
EFFORT: 15 HRS / WEEK
PROJECTS
PROSPECTIVE CAREERS: programmer, game developers, Jr. data analyst

SESSION DURATION
3 HRS (1 Hr 30 Mins – Concept Learning, 1 Hr 30 Mins – Collaborate with mentors on
coding assignments and projects).

program curriculum

1. Introduction to Python
Overview of Python
The Companies using Python
Other applications in which Python is used
Discuss Python Scripts on UNIX/Windows
Values, Types, Variables
Operands and Expressions
Conditional Statements
Loops
Command Line Arguments
Writing to the screen

“AI is the most important technology on the
planet today.”
~ Dave Choplin
2. Sequences and File operations
Python file I/O Functions
Numbers
Strings and related operations
Tuples and related operations
Lists and related operations
Dictionaries and related operations
Sets and related operations

3. Functions, OOPs, Modules, Errors and Exceptions


Functions
Function Parameters
Global variables
Variable scope and Returning Values
Lambda Functions
Object-Oriented Concepts
Standard Libraries
Modules Used in Python
The Import statements
Module search path
Package installation ways
Errors and Exception Handling
Handling multiple exceptions
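
The last two topics above, exception handling and handling multiple exceptions, can be previewed with a short sketch. The function `safe_divide` is a hypothetical example written for this brochure's topic list, not part of the course material; it shows separate `except` clauses for different error types and one clause that groups two types together.

```python
def safe_divide(numerator, denominator):
    """Divide two values, handling several kinds of failure differently."""
    try:
        return float(numerator) / float(denominator)
    except ZeroDivisionError:
        # Division by zero is reported as "no result" rather than a crash.
        return None
    except (TypeError, ValueError) as exc:
        # One clause can catch several exception types by listing them in a tuple.
        raise ValueError(f"inputs must be numeric: {exc}") from exc


print(safe_divide("6", "3"))  # numeric strings are converted before dividing
print(safe_divide(1, 0))      # division by zero returns None

try:
    safe_divide("six", 3)
except ValueError as err:
    print("Rejected:", err)
```

Ordering matters: Python checks the `except` clauses from top to bottom and runs the first one that matches the raised exception.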

4. Introduction to NumPy and Pandas


NumPy - arrays
Operations on arrays
Indexing, slicing and iterating
Reading and writing arrays on files
Pandas - data structures & index operations
Reading and Writing data from Excel/CSV formats into Pandas

5. Data Visualization
matplotlib library
Grids, axes, plots
Markers, colours, fonts and styling
Types of plots - bar graphs, pie charts, histograms
Contour plots

“Machine learning drives almost everything at Amazon.”
~ Jeff Bezos
6. Data Manipulation
Basic Functionalities of a data object
Merging of Data objects
Concatenation of data objects
Types of Joins on data objects
Exploring a Dataset
Analyzing a dataset

7. Developing Web maps and representing information using plots


Use of Folium Library
Use of Pandas Library
Flow chart of Web Map application
Developing Web Map using Folium and Pandas
Reading information from a dataset and representing it using plots

8. Computer vision using OpenCV and visualization using Bokeh

Beautiful Soup Library
Requests Library
Scrape all hyperlinks from a webpage, using Beautiful Soup & Requests
Plotting charts using Bokeh
Plotting scatterplots using Bokeh
Image Editing using OpenCV
Face detection using OpenCV
Motion Detection and Capturing Video

CAPSTONE project
The course culminates in an enterprise project for a fictitious client that will expose you to every stage
of the data science process – from data acquisition and preparation to evaluation, interpretation,
deployment, operations, and optimization. The project is an opportunity for you to test your skills and
demonstrate your ability to invent solutions for real-world problems.

“In the next 10 years, data science and software will
do more for medicine than all of the biological sciences together.”
~ Vinod Khosla, American-Indian engineer
lead trainers

Sravan Chaitanya
IIT - Madras, IIM - Ahmedabad
A competent professional with 15+ years of experience in Consulting and
Data Science, who has successfully trained more than 4000 working professionals
to become Data Scientists.

Siddhartha
IIT - Madras, University of Alberta
Siddhartha is the Founder of Zessta. A hardcore technologist, Siddhartha's
passion lies in developing innovative and futuristic products in machine learning
and AI.

Sumith
IIT - Guwahati
A passionate engineer with 17 years of experience in driving an idea through
all phases, and a continuous thirst to serve better in Digital Signal Processing,
Computer Vision, Machine Learning and Data Sciences.

Murali Krishna
IIT - Kharagpur
Murali is a renowned data science consultant who has worked for Pearson,
Cadbury, Johnson and Johnson, and HouseJoy. He is an expert in course design,
development and delivery.

why PRIME CLASSES

world class curriculum
Our curriculum is designed to meet growing industry needs. Our lectures are
supplemented with industry-ready assignments and tasks.

learn from practitioners
Our trainers come from the IT industry, with 15+ years of experience
in architecting and building scalable tech solutions. Action workshops by industry
leaders from Silicon Valley.

experiential learning
Prime Classes offers access to a world-class lab facility. Our mentors let you work
on curated case studies and selected industry problems.

Placement support
Get an internship or placement with MNCs and startups working in cutting-edge tech
domains. Opportunity to work on your ideas at scale and get mentored by
seasoned entrepreneurs to convert them into a business.

about us
Prime Classes is a career accelerator for youth promoted by IIT and IIM Alumni. Our vision is to
create a national-level talent pool by skilling millions of youth for growing industry needs in
new age technologies, viz. Data Science, AI and Machine Learning.


“An investment in knowledge pays the best interest.”
~ Benjamin Franklin

“A breakthrough in machine learning will be worth 10 Microsofts.”
~ Bill Gates
