Text Mining and Linguistic Analysis in Educational Contexts

Text mining is an artificial intelligence technique that applies computational linguistics and machine learning to large amounts of unstructured text in order to extract information, patterns and insights hidden in it. Applying these techniques to written and oral interactions in digital learning environments gives educators insights they can use to enhance learning and teaching.

What are the potential benefits of text mining in education?

The rapid expansion of online learning, as Ferreira-Mello et al. (2019) point out, has produced new opportunities (as well as challenges) to mine text data for educational purposes (p.1). Educational text mining is currently used to assist instructors in evaluating students’ performance, and the development of automated methods to support students and instructors has become a fundamental educational goal (p.10). Today’s online education environments generate massive amounts of text data from discussion platforms, student conversations, chats, blogs, essays and assignments. Consider, for example, the enormous number of student essays that a teacher may need to evaluate, and the complex task of analysing them for measures of critical thinking (p.7).

The applications of text mining to essay analysis range from automatic detection of stylistic errors, plagiarism and analytical skills to automated scoring and feedback that supports students’ writing. Essays and other educational text data (such as discussion forum transcripts, exam questions, course contents, chats and assignments) can be used to analyse the dynamics of participation and collaboration among students, automate the analysis of student writing to provide formative and summative feedback, automatically generate educational content, improve recommendation systems and detect students’ emotions, attitudes and opinions (Ferreira-Mello et al., 2019). As artificial intelligence technology progresses rapidly, developing and implementing new models, methods and tools for educational data mining remains an exciting open challenge.

How does text mining provide a means for evaluating learning processes?

In recent years, much research has been directed at automating the analysis of online learning. Emphasising the importance of establishing “pedagogically relevant measures that can aid the development of distinct, automated analysis systems”, O’Riordan et al. (2016) examine four pedagogical content analysis methods: the Digital Artefacts for Learning Engagement (DiAL-e) framework, Bloom’s Taxonomy, the Structure of Observed Learning Outcomes (SOLO) and Community of Inquiry (CoI). They conclude that “computational approaches to pedagogical analysis may provide useful insights into learning processes” and students’ cognitive activity (p.1). On a similar premise, the language students use in their writing and in online interactions can open a gateway to their cognitive, social and emotional processes as well as to their critical thinking and problem-solving skills. Text mining, computational linguistics and machine learning techniques now make an automated analysis of these processes possible.

How can text mining support student wellbeing and social interactions?

In addition to providing a means for evaluating learning processes, text mining can help educators support the wellbeing of students. Automated textual analysis can help identify the sentiment of texts produced by students, and can help detect and predict psychological, behavioural and emotional problems (such as bullying and aggressive behaviour, psychosis and depression) that manifest in language use. Automated identification of personality traits and social interaction patterns is another affordance of text mining. It can be used, for instance, in adaptive learning environments to help teachers decide how to group students for collaborative projects, based on predictions about how a selected cohort of students might collaborate and perform together.

Dowell et al. (2019) have devised an interesting computerised framework of this kind for analysing student interaction patterns in collaborative learning activities. The model gives valuable insights into the social roles that emerge during group communication by drawing on a set of metrics built on the socio-cognitive processes that can be modelled through language and discourse. Dowell et al.’s work can inform educators about how students participate in learning activities. Among other things, it can show the extent to which students contribute new information, build on their teammates’ ideas, respond to the contributions of others, and elicit responses from their partners.
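To make the idea of measuring whether a student "contributes new information" more concrete, here is a minimal sketch in Python. It is not Dowell et al.’s method: it is a much simpler lexical proxy, assumed here purely for illustration, that scores each turn by the share of its distinct words that have not yet appeared in the conversation. The function names and the three-turn exchange are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def newness_scores(turns):
    """Score each (speaker, text) turn by lexical newness.

    The score is the share of the turn's distinct words that have not
    appeared in any earlier turn. This is a simplified stand-in for
    illustration, not the metric defined by Dowell et al. (2019).
    """
    seen = set()
    scores = []
    for speaker, text in turns:
        words = set(tokenize(text))
        new_words = words - seen
        score = len(new_words) / len(words) if words else 0.0
        scores.append((speaker, score))
        seen |= words  # later turns are compared against everything so far
    return scores


# Hypothetical exchange, invented for this example.
conversation = [
    ("Ana", "Plants need light for photosynthesis"),
    ("Ben", "Plants need light"),
    ("Ana", "Water matters too"),
]

for speaker, score in newness_scores(conversation):
    print(f"{speaker}: {score:.2f}")
```

In this toy exchange, Ben’s turn only repeats words Ana has already used, so it scores lowest. A word-level proxy like this ignores paraphrase and meaning, which is one reason research frameworks tend to rely on richer semantic representations.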

What does text analysis technology hold for the future of education?

As new methods of text mining and its more common applications (such as text similarity calculation, information extraction, text clustering and sentiment analysis) continue to yield promising results in educational contexts, exciting opportunities open up for educational stakeholders to use technology to aid learning and teaching practices. Drawing on M. A. K. Halliday’s linguistic theory, Nguyen (2017) has aptly noted that “the examination of meanings created by learners is also the examination of learners’ construction of knowledge through their linguistic choices” (p.52). Measuring students’ learning progression, conceptual understanding and knowledge building over time through an automated analysis of written and verbal utterances is an exciting, though challenging, application of computerised textual analysis, with the potential to innovate educational support and the personalisation of learning.
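Of the common applications mentioned above, text similarity calculation is perhaps the easiest to illustrate. The sketch below, in Python, compares texts by the cosine similarity of their word-count vectors. This is one basic approach among many, assumed here for illustration; the function names and the three short "essays" are hypothetical, and a real system would likely add steps such as stop-word removal or term weighting.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count lowercase word tokens in a text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts.

    Returns a value between 0.0 (no shared words) and 1.0 (identical
    word distributions). Empty texts are treated as similarity 0.0.
    """
    a, b = word_counts(text_a), word_counts(text_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical snippets, invented for this example.
essay_1 = "The water cycle moves water between oceans and clouds"
essay_2 = "Clouds and oceans exchange water in the water cycle"
essay_3 = "The French Revolution began in 1789"

print(cosine_similarity(essay_1, essay_2))  # same topic: relatively high
print(cosine_similarity(essay_1, essay_3))  # different topic: relatively low
```

A score like this could feed into tasks such as grouping essays by topic or flagging unusually close pairs for a teacher to review, but word overlap alone says nothing about why two texts are similar.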

With these and the many other applications of educational text mining, we can see how the technological advances that make human language intelligible to machines continue to push education beyond its traditional boundaries. The outputs of these technological developments need to be seamlessly integrated into educational environments so that they can make sense to teachers and, from there, be readily incorporated into teaching and learning practices.

Read more and references

Dowell, N. M., Nixon, T. M., & Graesser, A. C. (2019). Group communication analysis: A computational linguistics approach for detecting sociocognitive roles in multiparty interactions. Behavior Research Methods, 51(3), 1007-1041. https://doi.org/10.3758/s13428-018-1102-z

Ferreira‐Mello, R., André, M., Pinheiro, A., Costa, E., & Romero, C. (2019). Text mining in education. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 9(6), e1332. https://doi.org/10.1002/widm.1332

Nguyen, T. H. (2017). EFL Vietnamese learners’ engagement with English language during oral classroom peer interaction [PhD thesis, University of Wollongong]. University of Wollongong Thesis Collection. https://ro.uow.edu.au/cgi/viewcontent.cgi?article=1107&context=theses1

O’Riordan, T., Millard, D. E., & Schulz, J. (2016). How should we measure online learning activity? Research in Learning Technology, 24. https://doi.org/10.3402/rlt.v24.30088
