DETECTION AND CLASSIFICATION OF CYBERBULLYING IN DIGITAL TEXTS USING ARTIFICIAL INTELLIGENCE
DOI: https://doi.org/10.31891/2219-9365-2024-80-18

Keywords: cyberbullying, representativeness, interpretation of results, BERT, LIME

Abstract
A comprehensive approach to detecting and classifying cyberbullying in digital texts using artificial intelligence has been developed. The approach consists of three key stages, each addressing a specific challenge in cyberbullying detection. The first stage evaluates and adjusts the representativeness of the dataset to ensure ethical balance with respect to age, ethnicity, and gender, which is essential to prevent bias in the model's decision-making. The representativeness analysis showed minimal deviations from the ideal distribution, with a maximum deviation of 0.04%, confirming the effectiveness of the data preprocessing. The second stage applies neural network models for detecting and classifying cyberbullying: a binary classifier estimates the overall level of cyberbullying in a text, and a multi-label classifier assigns it to specific types, such as age-related, religious, ethnic, and gender-based bullying. The BiLSTM model used for binary classification achieved an accuracy of 0.96, precision of 0.96, recall of 0.96, and an F1 score of 0.96. The BERT model used for multi-label classification achieved an accuracy of 0.94, precision of 0.93, recall of 0.93, and an F1 score of 0.93. The third stage provides visual interpretation of the model's decisions, with detailed explanations for each detected type of cyberbullying; this is achieved by combining the transformer-based multi-label classifier with an interpretable machine learning method (LIME), enhancing the transparency of the model's reasoning. The proposed approach not only improves the detection of cyberbullying but also promotes fairness and transparency in AI systems. By addressing ethical concerns and producing interpretable results, it helps build trust in AI, especially in sensitive applications such as cyberbullying detection, and provides a robust framework for ethical and effective cyberbullying detection in textual data.
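The first stage's representativeness check can be illustrated with a minimal sketch: compute each group's share of the dataset and report the largest absolute deviation from an ideal distribution. The uniform target and the toy sample below are illustrative assumptions, not the paper's actual data or procedure.

```python
from collections import Counter

def max_deviation(labels):
    """Largest absolute deviation (in percentage points) of observed
    group shares from a uniform ideal distribution."""
    counts = Counter(labels)
    n = len(labels)
    ideal = 100.0 / len(counts)  # assumed uniform target share per group
    return max(abs(100.0 * c / n - ideal) for c in counts.values())

# Toy, nearly balanced sensitive-attribute column (illustrative only)
sample = ["a"] * 251 + ["b"] * 249 + ["c"] * 250 + ["d"] * 250
print(max_deviation(sample))  # 0.1 percentage points
```

A deviation near zero, as reported in the abstract (0.04%), would indicate that preprocessing has balanced the groups closely.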
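The second stage's multi-label decision step can be sketched as thresholding per-type sigmoid scores, so that several cyberbullying types may fire for one text (unlike softmax multi-class). The label set, the 0.5 threshold, and the raw logits below are illustrative assumptions; in the described system the scores would come from the fine-tuned BERT classification head.

```python
import math

# Assumed label set mirroring the types named in the abstract
LABELS = ["age", "religion", "ethnicity", "gender"]

def classify(logits, threshold=0.5):
    """Return every cyberbullying type whose sigmoid score passes the
    threshold; zero, one, or several labels may be assigned."""
    probs = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    return [lab for lab, p in zip(LABELS, probs) if p >= threshold]

print(classify([2.0, -1.5, 0.3, -3.0]))  # ['age', 'ethnicity']
```

Independent per-label sigmoids are the standard formulation for multi-label text classification, which matches the abstract's description of classifying a text into multiple bullying types.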
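The third stage's interpretation step rests on the idea behind LIME-style attribution: perturb the input and observe how the classifier's score changes. A minimal leave-one-word-out sketch is shown below; the toy lexicon-based scorer is an assumption standing in for the real BERT model, and real LIME fits a local surrogate over many random perturbations rather than single deletions.

```python
def toxicity_score(text):
    """Stand-in scorer (illustrative only): fraction of flagged words."""
    bad = {"stupid", "loser"}
    words = text.lower().split()
    return sum(w in bad for w in words) / max(len(words), 1)

def word_importance(text):
    """Importance of each word = drop in score when that word is removed."""
    words = text.split()
    base = toxicity_score(text)
    importance = {}
    for i, w in enumerate(words):
        reduced = " ".join(words[:i] + words[i + 1:])
        importance[w] = base - toxicity_score(reduced)
    return importance

imp = word_importance("you are a stupid loser")
# Offensive words receive the highest importance; these weights can then
# be rendered as a per-word highlight, giving the visual explanation the
# abstract describes.
```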