Hybrid XLM-R + Character-CNN Fusion for Robust Multilingual and Code-Mixed Sentiment Classification

Authors

  • Wiwien Hadikurniawati, Universitas Stikubank
  • Veronica Lusiana, Universitas Stikubank
  • Adeyinka Ayoola Olabode, Federal College of Education

DOI:

https://doi.org/10.70062/jeci.v2i1.274

Keywords:

Code-Mixed Dataset, Hybrid Architecture, Multilingual Text, Robust Model, Sentiment Classifier

Abstract

Multilingual and code-mixed user-generated text (UGT) is noisy: spelling variants, elongations, and typos are common and can degrade transformer-only sentiment classifiers. This paper evaluates a hybrid architecture that fuses a subword transformer encoder (XLM-R) with a character-level convolutional branch (CharCNN) to improve robustness under character-level perturbations. We benchmark on two test settings: (1) NusaX, a multilingual Southeast Asian sentiment dataset, and (2) Indonglish, an Indonesian–English code-mixed sentiment dataset. We report standard clean-set metrics (Accuracy, Macro-F1) and apply a controlled robustness protocol that injects character-level noise with probability p = 0.18 (seed = 42) and measures the resulting performance drop. The hybrid model substantially reduces robustness degradation on code-mixed text (Macro-F1 drop of 0.007 vs. 0.030 for the XLM-R baseline) at a modest clean-set performance trade-off. We provide an end-to-end pipeline, an ablation analysis, and reproducible reporting artifacts (metrics JSON, confusion matrices, and error samples).
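A seeded perturbation protocol of the kind described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract fixes only the noise probability (p = 0.18) and seed (42), so the specific operators used here (neighbor swap, duplication/elongation, deletion, keyboard-style substitution) are assumptions for illustration.

```python
import random

def perturb(text: str, p: float = 0.18, seed: int = 42) -> str:
    """Apply character-level noise: each alphabetic character is, with
    probability p, swapped with its right neighbor, duplicated
    (elongation), deleted, or substituted. Seeded for reproducibility."""
    rng = random.Random(seed)
    chars = list(text)
    out = []
    i = 0
    while i < len(chars):
        c = chars[i]
        if c.isalpha() and rng.random() < p:
            op = rng.choice(["swap", "dup", "drop", "sub"])
            if op == "swap" and i + 1 < len(chars):
                out.extend([chars[i + 1], c])  # transpose with neighbor
                i += 2
                continue
            if op == "dup":
                out.extend([c, c])             # elongation, e.g. "goood"
            elif op == "drop":
                pass                           # character deletion
            elif op == "sub":
                # crude stand-in for a keyboard-adjacency substitution
                out.append(chr((ord(c.lower()) - ord("a") + 1) % 26 + ord("a")))
            else:
                out.append(c)                  # "swap" at end of string
        else:
            out.append(c)
        i += 1
    return "".join(out)
```

Because the generator is seeded per call, the same sentence always yields the same perturbed form, which is what makes the reported clean-vs-noisy Macro-F1 drops comparable across models.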



Published

2026-04-15