AND FILTERING
By Prasanna Kunchavaram
Introduction
PROBLEM:
Unsolicited commercial email, commonly known as spam, is a
pressing problem on the Internet.
GOAL:
The goal of the project is to design and develop a spam detection
system for email using four classifiers: Naive Bayes,
Naive Bayes Multinomial, K-Nearest Neighbour (KNN), and Support Vector Machine (SVM).
Data Preparation
In this phase, we train the system on data whose classes are
already known.
I use Part 1 to Part 7 of the data as training data for the system.
The input to the training step is the path to structured data that has
already been classified into two classes.
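The training input described above can be sketched as a small loader. This is a minimal illustration, not the project's actual code: the directory layout (one subfolder per class, for example `spam/` and `ham/`, each holding one text file per email) and the function name `load_labeled_emails` are assumptions, since the slides do not say how the structured data is organised.

```python
import os


def load_labeled_emails(root):
    """Return (texts, labels) read from a directory with one subfolder per class.

    Assumed (hypothetical) layout: root/spam/*.txt and root/ham/*.txt.
    The subfolder name is used as the class label.
    """
    texts, labels = [], []
    for label in sorted(os.listdir(root)):
        class_dir = os.path.join(root, label)
        if not os.path.isdir(class_dir):
            continue  # skip stray files at the top level
        for name in sorted(os.listdir(class_dir)):
            path = os.path.join(class_dir, name)
            if not os.path.isfile(path):
                continue  # ignore nested folders
            # Replace undecodable bytes instead of failing on odd encodings.
            with open(path, encoding="utf-8", errors="replace") as handle:
                texts.append(handle.read())
            labels.append(label)
    return texts, labels
```

Sorting the folder and file names keeps the order of the returned lists reproducible between runs.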
Then the system builds the model using any one of the
following algorithms:
Naive Bayes
Naive Bayes Multinomial
K-Nearest Neighbour
Support Vector Machine
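To make the model-building step concrete, here is a minimal sketch of one of the listed algorithms, multinomial Naive Bayes with add-one (Laplace) smoothing, written in plain Python. It is an illustration under stated assumptions rather than the implementation used in the project: the whitespace tokenizer, the smoothing choice, and the names `tokenize`, `train_multinomial_nb`, and `predict` are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokens are enough for illustration.
    return text.lower().split()


def train_multinomial_nb(texts, labels):
    """Count documents per class and word occurrences per class."""
    doc_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for text, label in zip(texts, labels):
        word_counts[label].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return {"doc_counts": doc_counts, "word_counts": word_counts, "vocab": vocab}


def predict(model, text):
    """Return the class with the highest log posterior score."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior
        for token in tokenize(text):
            if token not in model["vocab"]:
                continue  # ignore words never seen in training
            # Add-one smoothing avoids log(0) for words unseen in this class.
            score += math.log((counts[token] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented purely for this example.
toy_texts = [
    "win cash prize now",
    "cheap pills win money",
    "meeting agenda for monday",
    "project report attached",
]
toy_labels = ["spam", "spam", "ham", "ham"]
toy_model = train_multinomial_nb(toy_texts, toy_labels)
```

Calling `predict(toy_model, "win money now")` classifies the message using the counts learned from the four toy emails. Working in log space keeps the product of many small probabilities from underflowing.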
Training Screen
Validation
[Figure: two charts, Precision and Recall, comparing NB, NB_Multinomial, KNN, and SVM on Dataset1 to Dataset5]
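For reference, precision is TP / (TP + FP) and recall is TP / (TP + FN), with spam treated as the positive class. The sketch below computes both from lists of actual and predicted labels. The function name `precision_recall` and the label strings are assumptions made for illustration, and the sample labels are invented, not taken from the project's results.

```python
def precision_recall(actual, predicted, positive="spam"):
    """Compute (precision, recall) for the given positive class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Invented labels for demonstration only.
sample_actual = ["spam", "spam", "spam", "ham", "ham"]
sample_predicted = ["spam", "spam", "ham", "spam", "ham"]
sample_precision, sample_recall = precision_recall(sample_actual, sample_predicted)
```

In this toy case there are two true positives, one false positive, and one false negative, so both values come out to 2/3.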
Results (continued)
[Figure: two charts, Correctly classified (TP+TN) and Wrongly classified (FP+FN), comparing NB, NB_Multinomial, KNN, and SVM on Dataset1 to Dataset5]
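The two quantities charted here can be counted directly: a message is correctly classified (TP+TN) when the predicted label equals the actual one, and wrongly classified (FP+FN) otherwise. The following is a minimal sketch. The function name `classification_counts` is hypothetical and the sample labels are invented for illustration.

```python
def classification_counts(actual, predicted):
    """Return (correct, wrong, accuracy) for paired label lists."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)  # TP + TN
    wrong = len(actual) - correct                                   # FP + FN
    accuracy = correct / len(actual) if actual else 0.0
    return correct, wrong, accuracy


# Invented labels for demonstration only.
demo_actual = ["spam", "spam", "spam", "ham", "ham"]
demo_predicted = ["spam", "spam", "ham", "spam", "ham"]
demo_correct, demo_wrong, demo_accuracy = classification_counts(demo_actual, demo_predicted)
```

Dividing the correct count by the total gives accuracy, which is a convenient single number for comparing classifiers on data sets of equal size.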
Conclusion