AI-powered natural language processing in language education: A systematic review

Daniel Murcia
https://orcid.org/0000-0002-3146-6116
Luis Felipe Jaramillo-Calderón
https://orcid.org/0009-0002-8782-4047

Abstract

This systematic review investigates the potential of Artificial Intelligence (AI) technologies based on Natural Language Processing (NLP) to enhance literacy development in higher education. We reviewed 63 documents published between 2015 and 2023, exploring how NLP has been used in language education within processes of literacy, biliteracy instruction, and language assessment. The literature reveals exploratory integrations and initial empirical evidence of the impact of these technologies on language instruction, learning, and assessment, and sheds light on the NLP software tools used and their key application areas. Our findings are organized around language instruction, assessment and feedback, existing challenges and future directions, and ethical considerations, which together reflect the ongoing debates and efforts to integrate AI-powered technologies into current curricular approaches in higher education.

Article Details

How to Cite
Murcia, D., & Jaramillo Calderón, L. F. (2026). AI-powered natural language processing in language education: A systematic review. HOW, 33(1), 43–67. https://doi.org/10.19183/how.33.1.836
Section
Review Articles
Author Biographies

Daniel Murcia, The Pennsylvania State University

Daniel Murcia (Associate Professor, Universidad Tecnológica de Pereira) is a Fulbright scholar researching AI applications in language assessment. His expertise spans Automated Writing Evaluation, Language Assessment, Human Language Technology, and Discourse Analysis, with publications in these areas.

Luis Felipe Jaramillo-Calderón, Universidad Tecnológica de Pereira

Luis Felipe Jaramillo-Calderón is a professor and researcher at Universidad Tecnológica de Pereira. He holds an M.A. in Bilingual Education from the same university and is a member of the Poliglosia research group. His research interests are Bilingual Education, Language Assessment, Natural Language Processing, and Literacy Development.
