Research Output
Towards a cyberbullying detection approach: fine-tuned contrastive self-supervised learning for data augmentation
  Cyberbullying on social media platforms is pervasive and hard to detect because of its linguistic subtleties and the extensive data annotation that detection requires. We introduce a Deep Contrastive Self-Supervised Learning (DCSSL) model that integrates a Natural Language Inference (NLI) dataset, a fine-tuned sentence encoder, and data augmentation to better capture the nuanced semantics and offensiveness of cyberbullying. The DCSSL model captures contextual dependencies and the varied semantic implications inherent in cyberbullying instances, addressing the limitations of manual data annotation. Compared against established models such as BERT and Bi-LSTM, the proposed model shows a significant improvement, achieving a macro-average F1 score of 0.9231 on cyberbullying datasets, which highlights its applicability in environments where manual annotation is impractical or unavailable.
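
  The macro-average F1 score reported above is the unweighted mean of the per-class F1 scores, so a minority class counts as much as the majority class. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `macro_f1` and the toy labels are assumptions made for illustration only.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("at least one sample is required")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels: 1 = cyberbullying, 0 = not cyberbullying
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
score = macro_f1(y_true, y_pred)
```

  In this toy case the per-class F1 scores are 2/3 for class 1 and 4/5 for class 0, so the macro average is their plain mean, regardless of how many samples each class has.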

  • Date: 17 July 2024

  • Publication Status: Published

  • DOI: 10.1007/s41060-024-00607-9

  • ISSN: 2364-415X

  • Funders: Edinburgh Napier Funded

Citation

Al-Harigy, L. M., Al-Nuaim, H. A., Moradpoor, N., & Tan, Z. (2025). Towards a cyberbullying detection approach: fine-tuned contrastive self-supervised learning for data augmentation. International Journal of Data Science and Analytics, 19(3), 469-490. https://doi.org/10.1007/s41060-024-00607-9

Keywords

Cyberbullying Detection, Deep Contrastive Self-Supervised Learning, Data Augmentation, Natural Language Inference, Offensive Content Detection
