Towards a cyberbullying detection approach: fine-tuned contrastive self-supervised learning for data augmentation

Research Output

Cyberbullying on social media platforms is pervasive and difficult to detect because of its linguistic subtleties and the extensive data annotation that detection models require. We introduce a Deep Contrastive Self-Supervised Learning (DCSSL) model that integrates a Natural Language Inference (NLI) dataset, a fine-tuned sentence encoder, and data augmentation to improve the understanding of cyberbullying's nuanced semantics and offensiveness. The DCSSL model captures contextual dependencies and the varied semantic implications inherent in cyberbullying instances, and it addresses the limitations of manual data annotation. Compared against established models such as BERT and Bi-LSTM, our proposed model registers a significant improvement, achieving a macro-average F1 score of 0.9231 on cyberbullying datasets, which highlights its applicability in environments where manual annotation is impractical or unavailable.
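For readers unfamiliar with the reported metric, the following is a minimal sketch of how a macro-average F1 score is computed: the F1 score is calculated separately for each class and the per-class scores are then averaged without weighting, so a minority class counts as much as a majority class. The function name `macro_f1` and the toy labels are illustrative assumptions and are not taken from the paper; the sketch does not reproduce the authors' evaluation pipeline or their data.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("macro_f1 needs at least one label")

    # Count true positives, false positives and false negatives per class.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, for illustration only (not data from the paper).
y_true = ["bully", "bully", "neutral", "neutral", "neutral", "bully"]
y_pred = ["bully", "neutral", "neutral", "neutral", "bully", "bully"]
score = macro_f1(y_true, y_pred)
```

In this toy case each class has two true positives, one false positive, and one false negative, so both per-class F1 scores and their macro average come to two thirds.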

Date: 17 July 2024
Publication Status: Published
DOI: 10.1007/s41060-024-00607-9
ISSN: 2364-415X
Funders: Edinburgh Napier Funded

http://researchrepository.napier.ac.uk/output/3702802

Citation

Al-Harigy, L. M., Al-Nuaim, H. A., Moradpoor, N., & Tan, Z. (2025). Towards a cyberbullying detection approach: fine-tuned contrastive self-supervised learning for data augmentation. International Journal of Data Science and Analytics, 19(3), 469-490. https://doi.org/10.1007/s41060-024-00607-9