Android-Based Short Message Service Filtering using Long Short-Term Memory Classification Model

M. Laylul Mustagfirin(1*), Giri Wahyu Wiriasto(2), I Made Budi Suksmadana(3), Indira Puteri Kinasih(4)

(1) Universitas Mataram
(2) Universitas Mataram
(3) Universitas Mataram
(4) Universitas Negeri Mataram
(*) Corresponding Author
DOI: https://doi.org/10.23917/khif.v8i2.17995

Abstract

Short Message Service (SMS) is a technology for exchanging text messages between mobile phones that support it. Despite the emergence of many mobile messaging applications, SMS is still used for personal communication and for broadcasts by governments and mobile providers. SMS users often receive unsolicited messages, particularly marketing and business advertisements and fraud attempts, and many of these are irrelevant or fraudulent spam. This research aims to develop an Android-based application that filters SMS messages written in Bahasa Indonesia. We use a dataset of 1469 SMS messages and classify them into three categories: Normal, Fraudulent, and Advertisement. The classifier is a long short-term memory (LSTM) model built with TensorFlow. LSTM is suitable because its architecture includes a cell state that retains information from earlier inputs. This property suits sequential data such as SMS text, in which the words follow one another in order to form a sentence. The results show a classification accuracy of 95%. The model is then integrated into an Android-based mobile application to perform real-time classification.
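
The role of the cell state can be illustrated with a minimal sketch of one LSTM step in plain Python. This is not the authors' TensorFlow model: the scalar state, the function names (`lstm_step`, `run_sequence`), and the hand-picked weights are assumptions chosen only to show how a forget gate close to 1 lets information from an earlier token persist across later ones.

```python
import math


def sigmoid(x):
    """Logistic function used by the LSTM gates."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One LSTM step with scalar input, hidden state, and cell state.

    w maps a gate name ("f", "i", "o", "g") to a tuple
    (input weight, recurrent weight, bias). All values are illustrative.
    """
    def pre(gate):
        w_x, w_h, b = w[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("f"))      # forget gate: how much old cell state to keep
    i = sigmoid(pre("i"))      # input gate: how much new information to write
    o = sigmoid(pre("o"))      # output gate: how much cell state to expose
    g = math.tanh(pre("g"))    # candidate value to write into the cell
    c = f * c_prev + i * g     # updated cell state
    h = o * math.tanh(c)       # updated hidden state
    return h, c


def run_sequence(xs, w):
    """Feed a sequence of scalar inputs through the cell, starting from zeros."""
    h, c = 0.0, 0.0
    for x in xs:
        h, c = lstm_step(x, h, c, w)
    return h, c


# Hypothetical weights: the input gate opens only when x is large,
# and the forget bias decides whether the stored value survives.
remember = {"f": (0.0, 0.0, 10.0), "i": (10.0, 0.0, -5.0),
            "o": (0.0, 0.0, 10.0), "g": (5.0, 0.0, 0.0)}
forget = dict(remember, f=(0.0, 0.0, -10.0))

# A signal at the first position followed by two empty positions.
_, c_kept = run_sequence([1.0, 0.0, 0.0], remember)
_, c_lost = run_sequence([1.0, 0.0, 0.0], forget)
```

With the `remember` weights, the value written at the first position is still present in the cell state after two further steps, whereas with the `forget` weights it decays to almost zero. A trained model learns such gate weights from data rather than having them set by hand.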

Keywords

text filtering; recurrent neural network; long short-term memory

