
SMS Spam Detection

EasyChair Preprint no. 5166

35 pages. Date: March 16, 2021

Abstract

Over recent years, as the popularity of mobile phones has increased, Short Message Service (SMS) has grown into a multi-billion-dollar industry. In this project, a database of real SMS spam messages from the UCI Machine Learning Repository is used; after preprocessing and feature extraction, different machine learning techniques are applied to it. Finally, the results are compared and the best algorithm for spam filtering of text messages is identified. Final simulation results using 10-fold cross-validation show that the best classifier in this work reduces the overall error rate of the best model in the original paper citing this dataset by more than half. The algorithms used to classify spam messages in mobile device communication are logistic regression (LR), K-nearest neighbour (K-NN), and decision tree (DT). The SMS Spam Collection data set is used for testing the method.

Keyphrases: Bayes Theorem, Count Vectorization, Preprocessing
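
The workflow outlined in the abstract (preprocessing, count-based features, a classifier, and k-fold cross-validation) might look roughly like the following minimal sketch. It is not the authors' implementation. The toy messages, the function names, the cosine similarity measure, and the choice of k = 3 are all assumptions made for illustration. The preprint works with the UCI SMS Spam Collection and also evaluates logistic regression and decision trees, which are not shown here.

```python
import math
import re
from collections import Counter

# Hypothetical toy data; the preprint uses the UCI SMS Spam Collection instead.
RAW = [
    ("spam", "WIN a FREE prize now, claim your cash prize"),
    ("spam", "Free entry: text WIN to claim cash"),
    ("spam", "URGENT! You won a free prize, call now"),
    ("spam", "Claim your free cash reward now"),
    ("spam", "Congratulations, you won cash, call to claim"),
    ("ham", "Are we still meeting for lunch today"),
    ("ham", "I will call you after the meeting"),
    ("ham", "Can you pick up milk on your way home"),
    ("ham", "See you at lunch, running a bit late"),
    ("ham", "Thanks for the help with my homework"),
]


def count_vector(text):
    """Preprocess (lowercase, letters only) and return sparse token counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train, query, k=3):
    """Majority label among the k training vectors most similar to query."""
    ranked = sorted(train, key=lambda item: cosine(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def cross_validate(data, folds=10, k=3):
    """Return the overall error rate of K-NN under k-fold cross-validation."""
    errors = 0
    for f in range(folds):
        test = data[f::folds]  # every folds-th item forms the held-out fold
        train = [d for i, d in enumerate(data) if i % folds != f]
        errors += sum(knn_predict(train, vec, k) != label for vec, label in test)
    return errors / len(data)


# (vector, label) pairs used by the classifier
DATA = [(count_vector(text), label) for label, text in RAW]

if __name__ == "__main__":
    print("cross-validated error rate:", cross_validate(DATA, folds=10))
    print(knn_predict(DATA, count_vector("free cash prize")))
```

With only ten toy messages, 10-fold cross-validation reduces to leave-one-out, so the printed error rate says nothing about performance on the real data set. In practice a library such as scikit-learn would typically replace the hand-written vectorizer, classifier, and fold splitting.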

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@Booklet{EasyChair:5166,
  author = {Ambika Methre and Kavali Veena},
  title = {SMS Spam Detection},
  howpublished = {EasyChair Preprint no. 5166},
  year = {EasyChair, 2021}}