Investigating the Dynamic Change of Pre- and In-service Teachers’ Experiences, Attitudes, and Perceptions through CS Autobiography Using Topic Modeling
Shan Zhang
University of Florida
zhangshan@ufl.edu
Hai Li
University of Florida
li.ha@ufl.edu
Hongming Li
University of Florida
hli3@ufl.edu
Anthony F. Botelho
University of Florida
abotelho@coe.ufl.edu
Maya Israel
University of Florida
misrael@coe.ufl.edu

ABSTRACT

K-12 Computer Science (CS) education has seen remarkable growth recently, driven by the increasing focus on CS and Computational Thinking (CT) integration. Despite the abundance of professional development (PD) programs designed to prepare future CS teachers with the required knowledge and skills, there is a lack of research on how teachers’ perceptions of and attitudes toward CS and CT evolve through participation in these programs. To address this gap, our exploratory study examines the dynamics of pre- and in-service teachers’ experiences, attitudes, and perceptions toward CS and CT through their participation in a K-12 CS education micro-credential program. We employed topic modeling to identify topics that emerged from teachers’ written pre- and post-CS autobiographies, conducted statistical analysis to explore how these topics evolve over time, and applied regression analysis to investigate the factors influencing these dynamics. We observed a shift from teachers’ initial feelings of fear, intimidation, and stress toward confidence, fun, and feeling competent in basic CS, reflecting a positive transformation. Regression analysis revealed that features such as experienced teacher status and CT conceptual understanding correlate with participants’ evolving views. These observed relationships highlight the micro-credential’s role in not only enhancing technical competency but also fostering an adaptive, integrative pedagogical mindset, providing new insights for course design.

Keywords

Computer Science Education, LDA, Topic Modeling, Educational Data Mining, Teacher Education

1. INTRODUCTION

K-12 Computer Science (CS) education has gained increased attention in the past decade from various stakeholders, including researchers, practitioners, and policymakers [5]. According to the national non-profit organization Code.org, to date, a total of 30 states require their schools to offer computer science, and 34 states have updated 38 policies to establish computer science as a foundational subject [7]. In addition, computational thinking (CT), a fundamental 21st-century problem-solving skill [25], has been widely studied for its integration into different K-12 subjects, such as math [26], science [24], and STEM education [15].

Despite the rapid statewide integration of CS and CT in K-12 education, a significant challenge remains: the scarcity of qualified teachers who can teach or integrate CS and CT into their classroom instruction [10]. To tackle this obstacle, scaling effective CS teacher professional development (PD) is crucial. As highlighted in the 2023 State of Computer Science Education report [7], 23 states have implemented state-level initiatives that aim to increase teachers’ CS exposure. Among different PD programs, studies have shown that micro-credentials (MCs) are a timely and effective solution due to their flexibility and accessibility, providing competency-based, personalized, and self-directed learning experiences [5, 1].

To examine the effectiveness of such PD programs, some studies have investigated teachers’ learning experiences in CS PD [14, 20]. For example, one study used qualitative coding methods to evaluate teachers’ beliefs and perceptions toward integrating CT in elementary science, finding that teachers generally hold positive views [13]; another study found that teachers gained significantly greater knowledge about CS, self-efficacy, and positive perceptions after attending structured CS PD [19]. Although prior research delves into teachers’ perceptions and attitudes toward CS and CT integration, there is a noticeable gap in studies examining how these perceptions evolve from before to after participation in PD programs, as well as in identifying the factors that drive such changes. Moreover, most existing studies use qualitative techniques that rely on human annotators [20]; such approaches become impractical for analyzing hundreds of teachers’ reflections.

To address this gap, we propose topic modeling, an unsupervised machine-learning technique, to examine the thematic structure of a collection of textual data by analyzing word probability distributions across documents [8, 3]. Because topic modeling can process large-scale written essays in a low-cost, timely, and efficient manner, it has been widely implemented in educational data mining [3] to analyze rich textual dialogue from Massive Open Online Course (MOOC) discussion forum posts [9], identify distinct themes of study in course enrollments [17], describe the underlying topics in educational leadership research literature [23], and extract topics from open-ended questions in teacher self-assessment [4].

Building on the insights from previous studies, this exploratory study aims to use topic modeling to uncover topics that emerged from pre- and in-service teachers’ written CS autobiographies and understand how topics change over time, before and after participating in a CS micro-credential. Specifically, our research questions are: 1) How do the dynamics of pre- and in-service teachers’ experiences, attitudes, and perceptions toward CS and CT evolve through participation in a micro-credential program? 2) What factors influence their evolving views on CS?

2. METHOD

Table 2: Eight Topics Extracted from Topic Modeling
Topic # | Description | Top 10 Keywords | Two Examples
1 | Teachers’ prior CS knowledge | classroom, micro-credential, one, education, first, take, however, engineering, interested, biggest | Remember first PC CD ROM single speed loaded top like record player; Remember first internet access CompuServe 14.4k modem
2 | Teachers’ feelings toward learning and teaching CS | feel, subject, like, lesson, able, would, resource, future, time, excited | Scary looking code never seen trying figure, amount time needed to teach Scratch; Really enjoyed learning CS CT know to begin to understand the field
3 | How CS education fits into students’ life | help, course, taking, technology, much, still, thing, one, even, idea | ComputerScience fit every day life right taking educational technology class; Fascinated technology appreciating problem-solving aspect computing
4 | Teachers’ motivation to teach CS and take this micro-credential | learn, think, different, life, could, teaching, concept, knowledge, credential, topic | Current life, use computer science daily various task related coursework free time; Excited learn new skill new way integrate computer science multiple format
5 | Teachers’ earliest experiences with computing and computer science | school, science, course, work, experience, class, taught, year, currently, teacher | Thinking first experience relationship computer science, would high school; Introduction CS first time used computer 6 grade
6 | Teachers’ perspectives on the promising potential of CS education | child, field, well, thought, way, something, within, importance, activity, provide | Computer science exciting undiscovered territory opportunity expand skill set knowledge future technological innovation within society; Find teaching computer science exciting allows student develop critical thinking problem-solving skill, essential today’s world
7 | Teachers’ reflection on CS skills and knowledge based on past classes and activities | also, skill, learning, way, need, ct, want, use, many, language | Believe better ‘unplugged’ exercise really illustrates need provide good input want good output; Better grasp define exactly CS, especially versus CT
8 | Teachers’ long-term goal and their feelings toward this micro-credential | thinking, computational, course, goal, basic, scratch, coding, learned, understanding, experience | Hope integrate computer science English class in a meaningful way; Overall, micro-credential eye-opening experience

Participants in the study were pre- and in-service teachers enrolled in the CS Everyone Microcredential (MC), a six-week, self-paced, asynchronous online program designed to teach CS and CT knowledge and skills and to prepare teachers to integrate CS and CT into their teaching.¹ At the beginning of the MC, teachers completed a pre-survey evaluating their attitudes and perceptions toward CT and CS, alongside a CS autobiography (pre) documenting their initial perspective [18]. Upon completing the six-week MC, they wrote a second CS autobiography (post); the prompts differed between the two to facilitate structured reflection on their CS learning journey. More specifically, the pre prompt asked teachers to reflect on their journey with CS, from their earliest experiences to the present, encouraging them to discuss the number of classes they had taken, the challenges they had faced, the successes they had achieved, and how their attitudes toward CS had evolved over time. It also asked how CS education fit into their lives, their short-term goals for the course, and their long-term goals for the future. The post prompt posed similar questions, with a particular emphasis on the present. After excluding incomplete submissions, the pre- and post-CS autobiographies of 54 teachers were included in the topic modeling analysis for RQ1. Additionally, the attitude and perception scale from the questionnaire was fully completed by 26 teachers, as shown in Table 1. These data were used to address RQ2, exploring the factors that influence topic changes. In Table 1, ‘Gender’ is coded as male = 0, female = 1, and ‘Experienced Teacher’ as yes = 1, no = 0. The remaining features are measured on a 6-point Likert scale ranging from ‘Strongly Disagree’ (1) to ‘Strongly Agree’ (6).

To address RQ1, we first preprocessed the text to obtain clean input and then applied topic modeling to derive the corresponding topics. For text preprocessing, we started by removing symbols such as parentheses, semicolons, and dashes. Then, we tokenized the text and removed stopwords and other irrelevant words [21]. Next, we performed lemmatization, guided by part-of-speech tagging, to normalize word forms so that inflected variants of the same word are treated consistently. Lastly, we split teachers’ autobiographies from paragraphs into sentences, because topic modeling was conducted at the sentence level, meaning that each sentence is associated with a specific topic.
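The preprocessing steps above can be sketched as follows. This is a minimal, standard-library illustration rather than the pipeline used in the study: the stopword list is a small hypothetical sample, lemmatization with part-of-speech tagging is omitted (it would normally come from an NLP library), and sentences are split before symbols are removed because the splitter relies on end-of-sentence punctuation.

```python
import re

# Assumption: a tiny illustrative stopword list; the study's actual list is not given.
STOPWORDS = {"a", "an", "the", "i", "my", "to", "of", "and", "in", "was", "is", "it", "with"}


def split_sentences(text):
    """Split a paragraph into sentences at ., !, or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def clean_sentence(sentence):
    """Remove symbols, lowercase, tokenize, and drop stopwords."""
    no_symbols = re.sub(r"[()\[\];:\-]", " ", sentence)
    tokens = re.findall(r"[a-z0-9']+", no_symbols.lower())
    return [t for t in tokens if t not in STOPWORDS]


def preprocess(autobiography):
    """Return one token list per sentence, skipping sentences left empty."""
    token_lists = (clean_sentence(s) for s in split_sentences(autobiography))
    return [tokens for tokens in token_lists if tokens]


if __name__ == "__main__":
    print(preprocess("I was scared of code (at first). Now I enjoy Scratch!"))
```

Each inner list corresponds to one sentence, which is the unit later assigned to a topic.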

After text preprocessing, we employed a bag-of-words representation [27] and Latent Dirichlet Allocation (LDA) for topic modeling [6]. To keep topics consistent across time points, the corpus for topic modeling included text from both the pre- and post-CS autobiographies. We then determined the number of topics by computing topic coherence for each candidate number of topics from 3 to 15. This approach computes co-occurrence statistics among words within a topic by calculating the pointwise mutual information (PMI) for all word pairs within the topic and then taking a weighted average of these values as the coherence score for the topic [22]. After finalizing the number of topics (n), we aggregated the sentence-level topic assignments to obtain the count of each topic for each autobiography.
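To make the coherence-based selection concrete, the sketch below computes a PMI coherence score for one topic's top words and aggregates sentence-level topic labels into per-autobiography counts. It is an illustration under stated assumptions, not the study's implementation: co-occurrence is counted at the document (here, sentence) level, the pairwise PMI values are averaged without weights, the smoothing constant `eps` is hypothetical, and the LDA fitting step itself is not shown.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_coherence(top_words, documents, eps=1e-12):
    """Unweighted mean PMI over all pairs of top words.

    Probabilities are document frequencies; eps (an assumption) avoids log(0).
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents containing every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        joint = prob(w1, w2)
        scores.append(math.log((joint + eps) / (prob(w1) * prob(w2) + eps)))
    return sum(scores) / len(scores)


def pick_num_topics(coherence_by_k):
    """Return the candidate topic count with the highest coherence."""
    return max(coherence_by_k, key=coherence_by_k.get)


def topic_counts(sentence_topics, n_topics):
    """Count sentences per topic (topics numbered 1..n_topics) for one autobiography."""
    counts = Counter(sentence_topics)
    return [counts.get(t, 0) for t in range(1, n_topics + 1)]
```

In practice, `coherence_by_k` would map each candidate number of topics (3 to 15) to the mean coherence of the model fitted with that many topics, and `topic_counts` would be applied to the sentence labels of each pre and post autobiography.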

To answer RQ2, we first examined how topics changed from before to after the MC. Specifically, for each of the n topics, we computed the numerical change in its occurrence in the post autobiography relative to the pre autobiography. To explore the factors that influence these changes, we conducted a linear regression analysis, treating the topic changes as dependent variables and the 18 features from the pre-survey as independent variables. Table 1 describes the 18 features in detail.
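The regression setup can be illustrated with a small ordinary least squares sketch. The assumptions here are ours: the dependent variable is taken to be the post count minus the pre count of a topic for each teacher, the function names are hypothetical, and coefficients are obtained from the normal equations by Gauss-Jordan elimination. The sketch does not compute standard errors or p-values, which a statistics package would provide.

```python
def topic_change(pre_counts, post_counts):
    """Per-teacher change in one topic's count (post minus pre)."""
    return [post - pre for pre, post in zip(pre_counts, post_counts)]


def fit_ols(features, target):
    """Fit y = b0 + b1*x1 + ... by solving the normal equations.

    Returns [intercept, coef_1, ..., coef_p].
    """
    rows = [[1.0] + [float(v) for v in row] for row in features]
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, target))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]
```

One such model would be fitted per topic, with each row of `features` holding one teacher's pre-survey values.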

3. RESULTS

After segmenting the autobiographies into sentences, the average number of sentences per autobiography was 19.68 before the MC and 10.62 after it, a decrease of 46%; that is, the post autobiographies contained roughly half as many sentences as the pre autobiographies. Upon experimentation, we found that topic coherence was highest when the number of topics was set to 8; hence, we selected n = 8 topics. The themes of these topics were identified through keywords, original texts, and prompts, as shown in Table 2. For instance, Topic 1 emphasizes initial CS engagement with terms like ‘take’, ‘one’, and ‘first’. Topic 2 focuses on personal and temporal aspects with words like ‘able’, ‘feel’, and ‘lesson’. Topic 3 addresses practical steps and support with terms such as ‘technology’, ‘course’, and ‘help’. These keywords help summarize the primary sentiments and educational experiences of the teachers.

The distribution of topics in the pre- and post-CS autobiographies is depicted at the top of Figure 1. Compared with the pre-MC distribution, topic 2 accounts for a larger share after the MC, while topic 5 is less frequent. The proportional differences in the other topics are relatively minor. The bottom of Figure 1 shows the rate of change for each topic across teachers’ autobiographies, which is calculated as follows:

\begin{equation} \text{Change Rate}_i = \frac{\text{Mean}_{i,\text{after}} - \text{Mean}_{i,\text{before}}}{\text{Mean}_{i,\text{before}}} \times 100\% \end{equation}

where \(\text{Change Rate}_i\) represents the rate of change for topic \(i\), and \(\text{Mean}_{i,\text{after}}\) and \(\text{Mean}_{i,\text{before}}\) denote the mean occurrence of topic \(i\) in the after and before MC autobiographies, respectively. This calculation captures the relative change in the prominence of each topic, normalized by its initial prevalence. It is important to note that when \(\text{Mean}_{i,\text{before}}\) is zero, the change rate is set to NaN (Not a Number) to avoid division by zero.
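The change rate, including the NaN convention for a zero baseline, can be written as a short sketch. The function names and the row-per-teacher layout of the counts are assumptions made for illustration.

```python
import math


def change_rate(mean_before, mean_after):
    """Relative change in percent; NaN when the baseline mean is zero."""
    if mean_before == 0:
        return math.nan
    return (mean_after - mean_before) / mean_before * 100.0


def topic_change_rates(pre_rows, post_rows):
    """Change rate per topic, given one row of topic counts per teacher."""
    n_topics = len(pre_rows[0])
    rates = []
    for i in range(n_topics):
        before = sum(row[i] for row in pre_rows) / len(pre_rows)
        after = sum(row[i] for row in post_rows) / len(post_rows)
        rates.append(change_rate(before, after))
    return rates
```

For example, a topic whose mean occurrence falls from 4.0 to 1.0 has a change rate of -75%, while a topic absent from all pre autobiographies yields NaN regardless of its post occurrence.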

Overall, the average rates of change are negative for all topics, indicating that every topic occurred less often after the MC, to varying degrees. Topics 5 (-73.6%), 1 (-69.8%), and 7 (-48.8%), in that order, are the three topics with the largest changes.

Based on these results, we conducted a further analysis to understand the trend of these changes. The analysis of topic distribution trends before and after the MC, summarized in Table 3, reveals several key observations. The entropy [2] decreased slightly, indicating a minor reduction in the diversity of the topic distribution. The Gini index [11] increased notably from 0.4863 to 0.5805, suggesting an increase in the inequality of topic prominence. The dominant topic [16] shifted from Topic 5 before the MC to Topic 4 afterward, highlighting a change in the primary focus of the discussion. Additionally, the total variance and standard deviation of topic occurrences both decreased, indicating a reduction in the variability of topic mentions across the dataset [12].
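The distribution measures discussed above can be illustrated with a short sketch. The exact definitions used in the study are not stated, so the choices here are assumptions: Shannon entropy with the natural logarithm, the Gini index in its mean-absolute-difference form, and the dominant topic as the topic with the largest count.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (natural log) of the distribution implied by counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in probs)


def gini_index(counts):
    """Gini index: sum of |x_i - x_j| over all ordered pairs, divided by 2 * n * total."""
    n = len(counts)
    total = sum(counts)
    if total == 0:
        return 0.0
    diff_sum = sum(abs(a - b) for a in counts for b in counts)
    return diff_sum / (2 * n * total)


def dominant_topic(counts):
    """1-based index of the topic with the largest count."""
    return max(range(len(counts)), key=lambda i: counts[i]) + 1
```

Under these definitions, a uniform distribution over 8 topics gives the maximum entropy ln 8 and a Gini index of 0, and a distribution concentrated on fewer topics lowers the entropy and raises the Gini index.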

Table 3: Topic Distribution Trends Between Pre & Post MC
Measure | Pre | Post
Entropy | 2.0505 | 2.0240
Gini index | 0.4863 | 0.5805
Dominant Topic | Topic 5 | Topic 4
Total Variance | 5.2531 | 2.8086
Standard Deviation | 2.2920 | 1.6759
Figure 1: The Change Rate in the Proportions of Each Topic in Pre & Post (Top) and the Change Rate of Each from Pre to Post (Bottom)

Using the change in each topic between the pre- and post-autobiographies as the dependent variable and the 18 features as independent variables, we fitted linear regression models to identify the features associated with topic changes. The significant relationships are summarized in Table 4.

From Table 4, ‘CSNotNeededForCareers’ appeared three times in topics 1, 6, and 8, all negatively correlated with topic changes, implying that individuals with a negative attitude toward CS knowledge are more likely to exhibit a greater decrease in these topics following the course (post) as compared to the beginning of the course (pre); that is, topics 1, 6, and 8 appear less often for these individuals following the MC. ‘CSIntegratedDisciplines’ appeared twice as associated with topics 1 and 6, negatively correlated with topic changes, suggesting that those with a positive view of integrating CS with other subjects are more likely to experience a more significant decrease in their inclusion of these topics following the course. ‘ComputingAcrossSubjects’ appeared twice as associated with topics 6 and 8, both positively correlated with topic changes, indicating that individuals holding a positive attitude towards integrating computing with other subjects are more likely to discuss these topics in their reflection following the MC. ‘ExperiencedTeacher’ appeared twice as associated with topics 6 and 8, both positively correlated with topic changes, implying that teachers with ample knowledge and teaching experience included more of these topics in their discussion following the MC.

Table 4: Summary of Significant Relationships Between Features and Topics
Feature | Topic | Coefficient
CSNotNeededForCareers | 1 | -1.2388*
CSIntegratedDisciplines | 1 | -2.9059*
BasicCTConcepts | 3 | 1.5757*
PastComputingExperience | 3 | -4.8039*
CSUnderstandable | 6 | -3.861**
CSNotNeededForCareers | 6 | -2.0343**
ComputingAcrossSubjects | 6 | 7.4965*
CSTaughtSeparately | 6 | 3.1438*
CSIntegratedDisciplines | 6 | -3.9126*
RecognizeCompConcepts | 6 | -1.9262*
ExperiencedTeacher | 6 | 5.3777*
TeachPlanCTLessons | 7 | -2.3578*
CSNotNeededForCareers | 8 | -1.1891*
ComputingAcrossSubjects | 8 | 6.9366*
ExperiencedTeacher | 8 | 5.6642*

Note. *p<0.05, **p<0.01, ***p<0.001

4. DISCUSSION

Regarding the first research question, our findings reveal notable shifts in pre- and in-service teachers’ engagement with CS and CT topics through their participation in the MC program. The changes in topic distribution, Gini index, and entropy, together with the emergence of Topic 4 as the new dominant topic post-MC, suggest a more focused reflection on newly acquired knowledge and a concentration of focus areas aligned with the program’s goal. The smaller decreases in topics related to feelings, attitudes, and motivations toward CS, compared with those related to previous experiences, further support the MC’s effectiveness in shifting focus to affective and forward-looking themes. These findings align with previous research [5, 1], which highlights the effectiveness of MC programs in enhancing pre-service teachers’ CS competencies and reshaping their perspectives on and engagement with CS and CT. The targeted and confidence-boosting learning experience fosters a reflective, integrative teaching approach that promotes the application of CS and CT in educational settings.

For the second research question, the regression analysis reveals factors related to teachers’ evolving views on CS. Notably, experienced teachers and those supporting computing across subjects maintained engagement with future-oriented topics. In particular, a pre-MC grasp of basic CT concepts correlated with a smaller drop in motivation (Topic 3), while previous computing experience correlated with a greater decrease, highlighting the importance of conceptual CT understanding over mere technology use for sustaining enthusiasm. This echoes the findings of Yadav et al. [25]. Together, these results emphasize the role of prior knowledge, teaching experience, and interdisciplinary perspectives in influencing participants’ evolving views about CS.

Analyzing autobiographies at the sentence level extends the paragraph-level analysis common in existing topic modeling studies in education [9, 17, 23], providing a more granular understanding of shifts in participants’ perspectives. Furthermore, the regression analysis indicates that factors such as acknowledging CS’s importance (‘CSNotNeededForCareers’), experienced teacher status (‘ExperiencedTeacher’), and CT conceptual understanding (‘BasicCTConcepts’) were influential in shaping participants’ evolving views. Finally, this study contributes insights into the design of professional development programs that equip educators to navigate the evolving landscape of CS education, highlighting the importance of targeted learning experiences.

5. CONCLUSION AND FUTURE WORK

This study demonstrates the effectiveness of the MC program in positively transforming pre- and in-service teachers’ perceptions of and engagement with CS and CT. The topic modeling analysis reveals a shift in focus from prior CS experiences to newly acquired knowledge, skills, and attitudes, as evidenced by the decrease in topics related to previous experiences and the emergence of motivation to learn and teach CS as the dominant theme post-MC. The successful application of sentence-level topic modeling to assess changes in teachers’ experiences, attitudes, and perceptions toward CS education over time highlights the feasibility of this approach and addresses the scalability limitations of traditional human coding methods. Regression analysis highlights the influence of features such as experienced teacher status and CT conceptual understanding in shaping teachers’ evolving views.

The current study has several limitations. First, the regression analysis was conducted on a small sample of only 26 teachers, which may limit the generalizability of the findings. Second, the different writing prompts used for the pre- and post-autobiographies might have influenced the observed topic changes. Future studies should use larger samples and consistent prompts to minimize potential confounding effects, which would allow a more nuanced comparison of the dynamic changes in themes.

6. ACKNOWLEDGMENTS

We would like to thank Griffin Catalyst, NSF (2331379, 1903304, 1822830, 1724889), as well as IES (R305B230007), Schmidt Futures, MathNet, and OpenAI for their support of this work.

7. REFERENCES

  1. I. A. Bal, F. Alvarado-Albertorio, P. Marcelle, and C. T. Oaks-Garcia. Pre-service teachers computational thinking (ct) and pedagogical growth in a micro-credential: A mixed methods study. TechTrends, 66(3):468–482, 2022.
  2. B. K. Baradwaj and S. Pal. Mining educational data to analyze students’ performance. arXiv preprint arXiv:1201.3417, 2012.
  3. J. Boyd-Graber, Y. Hu, D. Mimno, et al. Applications of topic models. Foundations and Trends® in Information Retrieval, 11(2-3):143–296, 2017.
  4. D. Buenano-Fernandez, M. Gonzalez, D. Gil, and S. Luján-Mora. Text mining of open-ended questions in self-assessment of university teachers: An lda topic modeling approach. IEEE Access, 8:35318–35330, 2020.
  5. Q. Burke, C. Angevine, C. Proctor, J. Weisgrau, and K. A. O’Donnell. Empowering teachers in computational thinking through educator microcredentials. Professional Development for In-Service Teachers: Research and Practices in Computing Education, page 341, 2022.
  6. I.-C. Chang, T.-K. Yu, Y.-J. Chang, and T.-Y. Yu. Applying text mining, clustering analysis, and latent dirichlet allocation techniques for topic classification of environmental education journals. Sustainability, 13(19):10856, 2021.
  7. Code.org, CSTA, ECEP Alliance. 2023 state of computer science education, 2023. [Online; accessed March 11, 2024].
  8. M. Cutumisu and Q. Guo. Using topic modeling to extract pre-service teachers’ understandings of computational thinking from their coding reflections. IEEE Transactions on Education, 62(4):325–332, 2019.
  9. A. Ezen-Can, K. E. Boyer, S. Kellogg, and S. Booth. Unsupervised modeling for understanding mooc discussion forums: a learning analytics approach. In Proceedings of the fifth international conference on learning analytics and knowledge, pages 146–150, 2015.
  10. M. Israel, J. N. Pearson, T. Tapia, Q. M. Wherfel, and G. Reese. Supporting all learners in school-wide computational thinking: A cross-case qualitative analysis. Computers & Education, 82:263–279, 2015.
  11. L. Jiang, H. Chen, L. Pinello, and G.-C. Yuan. Giniclust: detecting rare cell types from single-cell gene expression data with gini index. Genome biology, 17:1–13, 2016.
  12. R. W. Katz and B. G. Brown. Extreme events in a changing climate: variability is more important than averages. Climatic change, 21(3):289–302, 1992.
  13. D. J. Ketelhut, K. Mills, E. Hestness, L. Cabrera, J. Plane, and J. R. McGinnis. Teacher change following a professional development experience in integrating computational thinking into elementary science. Journal of science education and technology, 29:174–188, 2020.
  14. R. Kong and G. K. Wong. Teachers’ perception of professional development in coding education. In 2017 IEEE 6th International Conference on Teaching, Assessment, and Learning for Engineering (TALE), pages 377–380. IEEE, 2017.
  15. I. Lee, S. Grover, F. Martin, S. Pillai, and J. Malyn-Smith. Computational thinking from a disciplinary perspective: Integrating computational thinking in k-12 science, technology, engineering, and mathematics education. Journal of Science Education and Technology, 29:1–8, 2020.
  16. D. J. Lemay, C. Baek, and T. Doleck. Comparison of learning analytics and educational data mining: A topic modeling approach. Computers and Education: Artificial Intelligence, 2:100016, 2021.
  17. B. Motz, T. Busey, M. Rickert, and D. Landy. Finding topics in enrollment data. International Educational Data Mining Society, 2018.
  18. C. Mouza, A. Yadav, and A. Ottenbreit-Leftwich. Preparing pre-service teachers to teach computer science: Models, practices, and policies. IAP, 2021.
  19. G. Nugent, K. Chen, L.-K. Soh, D. Choi, G. Trainin, and W. Smith. Developing k-8 computer science teachers’ content knowledge, self-efficacy, and attitudes through evidence-based professional development. In Proceedings of the 27th ACM Conference on Innovation and Technology in Computer Science Education Vol. 1, pages 540–546, 2022.
  20. T. E. Reding and B. Dorn. Understanding the "teacher experience" in primary and secondary cs professional development. In Proceedings of the 2017 ACM conference on international computing education research, pages 155–163, 2017.
  21. R. Rivera-Bergollo, S. Baral, A. Botelho, and N. Heffernan. Leveraging auxiliary data from similar problems to improve automatic open response scoring. In Conference for Educational Data Mining, 2022.
  22. S. Syed and M. Spruit. Full-text or abstract? examining topic coherence scores using latent dirichlet allocation. In 2017 IEEE International conference on data science and advanced analytics (DSAA), pages 165–174. IEEE, 2017.
  23. Y. Wang, A. J. Bowers, and D. J. Fikis. Automated text data mining analysis of five decades of educational leadership research literature: Probabilistic topic modeling of eaq articles from 1965 to 2014. Educational administration quarterly, 53(2):289–323, 2017.
  24. K. P. Waterman, L. Goldsmith, and M. Pasquale. Integrating computational thinking into elementary science curriculum: An examination of activities that support students’ computational thinking in the service of disciplinary learning. Journal of Science Education and Technology, 29(1):53–64, 2020.
  25. A. Yadav, H. Hong, and C. Stephenson. Computational thinking for all: Pedagogical approaches to embedding 21st century problem solving in k-12 classrooms. TechTrends, 60:565–568, 2016.
  26. H. Ye, B. Liang, O.-L. Ng, and C. S. Chai. Integration of computational thinking in k-12 mathematics education: A systematic review on ct-based mathematics instruction and student learning. International Journal of STEM Education, 10(1):3, 2023.
  27. Y. Zhang, R. Jin, and Z.-H. Zhou. Understanding bag-of-words model: a statistical framework. International journal of machine learning and cybernetics, 1:43–52, 2010.

¹ This study was conducted with de-identified data shared and analyzed under the University of Florida IRB202202047. Consent forms were obtained before data collection.