Let's teach Kibot: Discovering discussion patterns between student groups and two conversational agent designs
Funding information
This research is funded by University of California, Irvine's Center for Teacher Development and Professional Practice
Abstract
Conversational agents can deepen reasoning and encourage students to build on others' knowledge in collaborative learning. Embedding agents in group work, however, presents challenges: groups may ignore the agents, which calls for designs where students perceive agents as learning partners. This study examines group interactions with two text-based agents (ie, chatbots) that posed as an expert and a less knowledgeable peer in a high school marine biology lesson. Student messages (N = 1764) from 18 groups (52 students, ages 14–15) received codes for reasoning, building on prior ideas, and responsiveness to the agents. Results indicate no differences between agents in how often each discussion move occurred. Interestingly, sequential pattern mining suggests that the less-knowledgeable-peer agent elicited questioning and building on others' ideas from groups, as if students were acting as peer tutors to the agent. Meanwhile, sequences with the expert agent resembled student-teacher exchanges, where groups responded to the agent's nudges and then provided reasoning. Findings illustrate the affordances of embedding humanized features in technology designs to promote discussion.
Practitioner notes
What is already known about this topic
- Conversational agents can facilitate group discussions, but can get abused or ignored by student groups.
- To engage students, agent designs can simulate the characteristics of familiar classroom figures, such as peers or teachers.
- There are limited explorations of how student groups adapt their interactions to different agent designs in collaborative settings.
What the paper adds
- Illustrations of the utility of diversifying agent designs in collaborative learning.
- Insights into the unique interaction patterns that student groups displayed to different designs, such as questioning and building on others' ideas with a less-knowledgeable-peer agent.
Implications for practice and/or policy
- Embedding signs of humanness in classroom facilitation and technology design can foster responsiveness among users.
- Pedagogy can consider adaptive designs to promote interaction sequences that contribute to learning (eg, questioning, expanding on prior ideas) at opportune moments.
INTRODUCTION
Collaborative knowledge construction, where students negotiate ideas to deepen group understanding, plays a central role in the exchange and production of science knowledge (Scardamalia & Bereiter, 1993). However, simply inviting students to discussion does not always yield productive learning (Dillenbourg, 2002). Conversational agents—dialogue systems that provide learning support through natural language interactions—can serve as a potential solution to facilitate discussion (Adamson et al., 2014; Diziol et al., 2010; Dyke, Adamson, et al., 2013; Walker et al., 2014).
To make agents more engaging to students, designers have experimented with agents' appearances and linguistic features to simulate characteristics from familiar figures, such as mentors and tutees (Chen et al., 2020; Graesser, 2016; Kim & Baylor, 2006). These profiles may prime users to display respective conversational norms, as if they were getting hints from a knowledgeable tutor or giving help to an inexperienced tutee (Graesser, 2016).
However, researchers have mostly considered agent designs in individual settings, where a student interacts with one or several agents (Biswas et al., 2016; Kim & Baylor, 2016). In group contexts, by contrast, researchers have observed that students may ignore or abuse the agents (Kumar et al., 2010). Exploring variations in groups' interactions with agent profiles thus sheds light on how to best support groups' social and learning needs.
This study explored these interactions in a within-subject design, where 18 student groups (n students = 52, ages 14–15) in 9th grade science classes chatted with one another and with two text-based agents: a less knowledgeable peer and an expert. Students' messages received codes for causal reasoning, transactive exchange where students built on prior ideas, and responsiveness to the agents. Analyses examined how student groups' frequencies and sequences of interactions differed between agents. This analytical focus built on the assumption that the frequencies and sequences of discussion moves were related to knowledge construction and knowledge acquisition in science classrooms (Chen et al., 2017; Wise & Chiu, 2011).
THEORETICAL BACKGROUND
Designing effective conversational agents (CAs) calls for understanding how to promote knowledge construction and human-computer interactions. The current research thus builds on frameworks in knowledge construction (Hewitt & Scardamalia, 1998; Scardamalia & Bereiter, 1993; Weinberger & Fischer, 2006) and examples of CAs in education (eg, Adamson et al., 2014; Dyke, Adamson, et al., 2013; Dyke, Howley, et al., 2013).
Reasoning and transactivity in knowledge construction
Knowledge construction is a key facet of science discussion. This process involves students actively sharing ideas and building on others' work to advance collective understanding (Hmelo-Silver & Barrows, 2008; Hoadley & Kilner, 2005; Muhonen et al., 2017; Scardamalia & Bereiter, 1993; Stahl et al., 2014). Students negotiate a fit between their ideas and those of others, with a focus on explanations and causal mechanisms (Hewitt & Scardamalia, 1998; van Aalst, 2009). In the process, they construct reasoning and extend others' ideas through social exchange (Weinberger & Fischer, 2006).
Students develop and balance reasoning in working on communal tasks (Weinberger & Fischer, 2006). For example, when constructing a scientific concept map together, they assert and reformulate ideas about connections among concepts, challenge others to find evidence, or pose questions to elicit explanations. Exploring how students make claims, employ reasoning, and use questions to guide queries provides insights into jointly constructed understanding (Hmelo-Silver & Barrows, 2008).
Knowledge construction is an inherently social process, with many opportunities for students to build on others' contributions (Hewitt & Scardamalia, 1998; Weinberger & Fischer, 2006). In their seminal work on computer-supported knowledge-building environments, Hewitt and Scardamalia (1998) called for systems that provided access to different modes of participation and idea artifacts, such as written text, diagrams and drawings. These artifacts can detail idea evolution and encourage students to acknowledge, extend, and challenge others' queries.
The extent to which students refer to prior ideas in discussion is termed transactivity (Teasley, 1997). Students may showcase lower-order transactivity, where they externalize their own ideas or request clarification without elaboration of others' reasoning. Meanwhile, higher-order transactivity involves integration and counter-arguments of others' contributions (Teasley, 1997; Wen et al., 2016). Effective knowledge construction environments should foster higher-order transactivity, which has been linked to enhanced individual and group learning (Adamson et al., 2014; Rosé et al., 2008; Wen et al., 2016).
The responsibility to promote knowledge construction does not rest solely with the learners, but also with learning facilitators (Gillies, 2014; Scardamalia & Bereiter, 1993). Facilitators can select appropriate tasks and call on certain students to initiate discussion moves. They can invite collective reasoning and support students in building on previous ideas (Hmelo-Silver & Barrows, 2008; Muhonen et al., 2017). Emergent research has explored how CAs can adopt these facilitative roles.
Designs of conversational agents
CAs can facilitate knowledge construction in real time (Adamson et al., 2014; Dyke, Adamson, et al., 2013; Tegos & Demetriadis, 2017). Using natural language understanding, the agents process students' unfolding conversations to propose relevant prompts. Agents have shown promise in encouraging transactive exchange and learning (Howley et al., 2013; Kumar et al., 2011). However, researchers have also documented cases where groups abused or ignored the agents (Kumar et al., 2010). CAs built on mechanical, task-oriented paradigms may not fit into group interactions that also value socio-emotional exchange (Kumar et al., 2010). Socially capable agents that display verbal cues such as self-disclosure, reassurance, and compliments have been associated with more effective support in tutoring than agents without these features (Kumar et al., 2010; Romero et al., 2017).
These design cues build on understanding of the Computers-Are-Social-Actors (CASA) paradigm (Nass et al., 1994). The CASA paradigm suggests that users of computer systems often associate the computers with characteristics traditionally reserved for human partners, such as trust, reciprocity, and competence (Gong, 2008; Pearson et al., 2006; Zhou et al., 2019). Users enact social norms, for example, reciprocating help if the computers offer support or stereotyping that an agent is extroverted based on its assumed tone and gender (Kim et al., 2019; Lee & Nass, 2003; Moon & Nass, 1996; Nass & Moon, 2000; Nass et al., 1997).
Applying the CASA paradigm to learning settings offers design opportunities for CAs. The verbal and physical cues of the agents can align with stereotypes of learning partners in human-human interactions. Researchers have particularly explored the potential of agents as a peer and an expert (Biswas et al., 2016; Chen et al., 2020; Graesser, 2016; Kim & Baylor, 2006; Rosenberg-Kima et al., 2008). A peer agent possesses comparable knowledge levels and discourse to those of the learners (Kim & Baylor, 2006). The profile builds on the similarity-attraction effect (Byrne & Nelson, 1965), which suggests that users find agents that resemble them in appearance, knowledge, or interests more appealing. A related design is the learn-through-teaching model (Biswas et al., 2016), where students acquire knowledge through explaining and giving feedback to an agent with less knowledge than them.
Meanwhile, an expert or tutor agent appears to possess more advanced expertise. Students may perceive this design as more trustworthy and competent, and consequently treat the agent as a teacher (Liew et al., 2013). Prior work has leveraged this dynamic to design the agents to provide guidance and feedback to students (Biswas et al., 2016; Kim, 2007).
Students' interactions vary with agent designs
Students may demonstrate different behaviors and learning outcomes when interacting with different agents (Heidig & Clarebout, 2011). For example, young learners ages 5–7 showed slightly more affective facial displays when interacting with a robot that behaved as a peer, while acquiring more vocabulary from interactions with a tutor robot (Chen et al., 2020).
Emergent work has explored interaction sequences between students and agents (Howley et al., 2013; Jeong et al., 2008). The focus on interaction processes overlaps with knowledge construction's emphasis on cycles of idea formulation and refinement (Scardamalia & Bereiter, 1993). Howley et al. (2013) examined groups' exchange in three conditions: an agent that provided direct nudges for transactivity, an agent with indirect nudges and no agent. The sequence analyses revealed consistent off-task periods in the indirect nudge condition, when the agents' prompts were untimely. This finding called for redesigning the agent to promote transactive exchange at more opportune moments (Howley et al., 2013).
Similar analyses of interaction patterns can help to explore how the peer versus expert agent contributes to knowledge construction. If students treat the agents as social partners (Nass et al., 1994), they might demonstrate behaviors similar to how they interact with teachers versus peers in tutoring (Almasi, 1995; King et al., 1998; Roscoe & Chi, 2007). Students verbalize thinking through questions more frequently when assuming the role of peer tutors, compared to when they respond to teachers (Almasi, 1995; King et al., 1998). Questioning in peer tutoring can guide the peer tutors to provide reasoning and explanation (Berghmans et al., 2013).
Interaction sequences may also vary with the assumed roles. For instance, questioning in teacher-led discussions assumes that teachers are the knowledge providers, and students simply respond to the teachers' prompts (Scardamalia & Bereiter, 1991). In contrast, when students give hints to a peer, the peer's answers may trigger different response sequences from the student tutors, from feedback and explanation to further questioning. These considerations motivate the study's two research questions:
RQ1. What discussion moves do student groups leverage in interactions with different agents (less knowledgeable peer versus expert)? How do groups' discussion moves differ between agent conditions?
RQ2. How do groups' discussion sequences differ between agent conditions?
METHODS
Study setting
The current study is part of a multi-year partnership between a local state park, education and biology researchers, and local teachers in the southwestern United States. The agents were integrated into the state park's environmental programme, entitled Marine Science Exploration (MSE). The MSE programme consisted of eight lessons to engage high school students in a participatory science curriculum that anchored scientific concepts in local contexts (McKinley et al., 2017). The MSE programme centred around the state park's marine protected area, which was created to reduce threats from human-driven ocean acidification, pollution, and overfishing. During the programme, students learned about systems elements within the local marine ecosystem. In observing those lessons, the park educators and teachers noticed that not all students participated equally in discussions. Students also focused on linear links (eg, fish eat plankton), instead of complex system processes (eg, ocean acidification influences phytoplankton's activities, disrupting food chains). The author and the state park developed the text-based CAs (Kibot less-knowledgeable-peer and Kibot expert) to address these challenges.
Participants
Participants were 18 groups of two to three students (52 students, two 9th-grade classes) taught by the same science teachers in a public high school in the southwestern United States. The school served a diverse student population that was 46% White, 38% Latinx and 9% Asian in 2019–20. Students participated in the MSE programme during their science class time. The school had a one-to-one laptop policy, and students were familiar with using chat windows. The 60-min lesson occurred while the school was enacting social distancing due to the COVID-19 pandemic. Students therefore relied mostly on the chat interface on their school laptops, rather than verbal interactions, as they worked with the agents.
Learning task
The agents were embedded into a modeling lesson early in the MSE programme. After learning about the marine protected area, students worked in groups of two or three to create a concept map of the park's marine ecosystems. Students focused on how changes in an element affected other elements and processes, for example, building connections between kelp, habitat and biodiversity. Students communicated through a chat window (Figure 1). Their messages were compared to an underlying “expert” map (described further below). To establish content validity, five park educators and six university marine biology researchers collaborated through three iterations to refine the expert map. Matched connections between students' and experts' answers appeared on the web interface next to students' chats (panel a, Figure 1). The agents kept track of the chats and provided prompts to help groups reflect on missing connections.

On average, groups interacted with each agent for 12.5 min (SD = 1.2). The interaction order with the agents was randomized to minimize practice effects with the second agent. Half of the student groups started with the less-knowledgeable-peer agent and the other half with the expert agent; all groups switched agents halfway through the lesson. Online Appendix A1 illustrates the switch from the expert to the less-knowledgeable-peer agent, where the agent's appearance and linguistic styles change accordingly.
Agents' prompt designs
Focus on reasoning
The agents' prompts focus on three dimensions of scientific reasoning: element, evidence, and causal coherence (Kang et al., 2014; Nguyen & Santagata, 2020). Element captures living organisms (eg, fish), non-living components (eg, sun), and processes (eg, global warming). Evidence describes how students use empirical data or experiences to support ideas. Finally, causal coherence refers to how students connect concepts to scientific ideas in a logical chain of reasoning, such as emphasizing feedback loops within systems.
The agents use natural language processing to parse students' messages and select appropriate responses that promote element, evidence, and causal coherence. Using the dependency parser from the Python package spaCy (Honnibal & Montani, 2017), the agents segment students' chat messages into subjects and objects. For example, a message such as “fish decreases plankton” gets parsed into “fish” (the subject) and “plankton” (the object).
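As a minimal sketch of this parsing step (assuming spaCy's en_core_web_sm model; the function name is illustrative rather than the study's exact implementation):

```python
import spacy

# Any English spaCy pipeline works; the small model is assumed here.
nlp = spacy.load("en_core_web_sm")

def extract_subject_object(message: str):
    """Segment a chat message into a (subject, object) pair via dependency labels."""
    doc = nlp(message)
    subject = next((tok.text for tok in doc if tok.dep_ in ("nsubj", "nsubjpass")), None)
    obj = next((tok.text for tok in doc if tok.dep_ in ("dobj", "pobj")), None)
    return subject, obj

print(extract_subject_object("fish decreases plankton"))  # expected: ('fish', 'plankton')
```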
The agents compare students' concept maps (the subject-object pairs) with the expert map using two algorithms: (1) fuzzy matching based on Levenshtein string similarity and (2) word embeddings. Levenshtein distance measures how far a student answer is from an expert-map term by counting the number of edits needed to turn one string into the other. The fuzzy matching disregards punctuation and word order.
A limitation of string-similarity approaches is that they do not capture cases where terms fall under similar domains but do not match exactly. To address this, the agents calculate similarity between students' and expert terms using word embeddings with spaCy (Honnibal & Montani, 2017). The subject and object from student answers are turned into high-dimensional vectors that capture their contexts (ie, the words they are surrounded by). Shorter distances between the vector for a student's term and the vector for an expert answer indicate higher similarity. If the semantic similarity is above 0.80, the terms are considered equivalent to the experts' terms. The 0.80 threshold was determined through user testing and by calculating the range of semantic similarity values between users' responses and expert terms.
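A minimal sketch of the two-step matching appears below. The rapidfuzz library and the 90-point fuzzy cutoff are illustrative assumptions; only the 0.80 semantic threshold comes from the study.

```python
import spacy
from rapidfuzz import fuzz  # assumed library for Levenshtein-based fuzzy matching

# A pipeline with word vectors (eg, en_core_web_md) is needed for similarity.
nlp = spacy.load("en_core_web_md")

def matches_expert(student_term: str, expert_term: str,
                   fuzzy_cutoff: int = 90, semantic_cutoff: float = 0.80) -> bool:
    """Match a student term to an expert term lexically, then semantically."""
    # (1) Levenshtein-based fuzzy match; token_sort_ratio ignores word order.
    if fuzz.token_sort_ratio(student_term.lower(), expert_term.lower()) >= fuzzy_cutoff:
        return True
    # (2) Word-embedding similarity; 0.80 is the study's user-tested threshold.
    return nlp(student_term).similarity(nlp(expert_term)) >= semantic_cutoff

print(matches_expert("phyto-plankton", "phytoplankton"))  # passes the lexical check
print(matches_expert("sea weed", "kelp"))  # no lexical match; falls back to embeddings
```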
If a link between a term that students mention and an expert term is missing, the agents provide hints for the missing terms. If students miss the hints, the agents follow up with a prompt that explicitly mentions the link between the target element and existing elements in students' concept maps. To foster evidence use, the agents ask students to provide reasoning. To promote causal coherence, the agents order the hints so that students finish all connections among existing terms before moving on to another. Online Appendix A2 outlines these prompts.
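One possible implementation of this hint ordering is sketched below, under the assumption that both maps are stored as sets of term pairs; the function and data structures are hypothetical.

```python
def next_hint(expert_edges: set, student_edges: set):
    """Pick the next missing connection to hint at, preferring links whose
    endpoints already appear in the students' concept map."""
    known_terms = {term for edge in student_edges for term in edge}
    missing = [edge for edge in expert_edges if edge not in student_edges]
    # Rank links by how many of their endpoints the group has already used,
    # so connections among existing terms are finished before new ones begin.
    missing.sort(key=lambda edge: -sum(term in known_terms for term in edge))
    return missing[0] if missing else None

expert = {("whales", "fish"), ("whales", "zooplankton"), ("kelp", "habitat")}
student = {("fish", "zooplankton")}
print(next_hint(expert, student))  # a "whales" link, since "fish" and "zooplankton" exist
```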
Focus on transactivity
The agents utilize transactive talk moves (Dyke, Howley, et al., 2013; Kumar et al., 2010). After every five talk turns, the agents invite students to explain, elaborate on a previous statement (made by themselves or others), or discuss why they agree or disagree with peers. Within each group, based on the chat counts per student, the agents direct the transactive nudges at those who have participated least in the conversation. Figure 1 provides example talk.
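A minimal sketch of this targeting logic follows; the chat-log format and function name are assumptions, and the nudge wording mirrors an example quoted in the Discussion.

```python
from collections import Counter

def pick_nudge_target(chat_log, group_members):
    """Direct the transactive nudge at the member with the fewest messages so far."""
    counts = Counter(sender for sender, _ in chat_log)
    return min(group_members, key=lambda member: counts.get(member, 0))

log = [("S1", "fish decreases plankton"), ("S1", "because fish eat them"), ("S2", "agreed")]
target = pick_nudge_target(log, ["S1", "S2", "S3"])
print(f"{target}, help me out. Do you agree or disagree with your friends?")  # targets S3
```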
The less-knowledgeable-peer and expert agents
Following examples from one-to-one agent-student interactions (Biswas et al., 2010; Chen et al., 2020; Kim & Baylor, 2006), this study tests two key designs: a less-knowledgeable-peer and an expert agent. The two agents deliver the same underlying prompts, but the prompts differ in wording and in the agents' expressions, displaying different emotions and levels of competence. Competence and emotions have been linked to students' affective experiences and cognitive engagement (Kim & Baylor, 2006).
The less-knowledgeable-peer agent
The less-knowledgeable-peer agent builds on the learn-through-teaching paradigm to represent a peer with lower levels of knowledge (Biswas et al., 2010). The agent explicitly states that it is learning from student chat. To simulate the agent's learning, the chat includes a “knowledge bar” that gets updated with each new link students make in the concept map. The agent uses colloquial expressions to ask students to explain concepts in ways that benefit its learning and sends animated texts when students build a correct systems connection. The agent shows multiple social expressions, such as excitement, confusion and appreciation. The animations change with these emotions, for instance, showing a frown when confused.
The expert agent
The expert agent is portrayed as a scientist. The agent enters the group chat by asking students if they are ready to learn. This agent can also answer students' questions by defining terms students may include in the concept map. The expert agent speaks in a formal tone and keeps the same expression throughout its interactions with the students. To prevent students from abusing the expert agent for hints, both agents respond to requests for hints or questions beyond term definitions by encouraging students to discuss the questions with their group members instead of providing answers.
To validate the agent designs, the author conducted semi-structured focus groups with five user groups of three to four participants each (n = 17 participants; each session lasted about 15 min). Participants were a convenience sample of high school students (n = 4), college students (n = 2), college graduates (n = 7) and park educators (n = 4). Participants interacted with each agent for 10 min and rated the agents on their role (peer or expert), emotion and competence. All participants identified the agents' characteristics as intended. Observing how test users interacted with the agents surfaced additional talk moves to improve the conversation flow, including small talk (eg, discussion about the agents' favorite sports) and opinion conformity (eg, “I love that! What do others in the group think?”). Online Appendix A2 presents examples of all agents' talk moves, grouped under Reasoning, Transactivity, Concept Definitions and Social Expressions.
Data sources
The main data source came from students' chat logs. Each row of the log consisted of a message, group ID, student username, timestamp, and agent condition (less-knowledgeable-peer or expert). The agents' messages were retained for context, but were not included in the analyses.
There were two coding iterations for discussion moves. The first iteration built on prior frameworks in knowledge construction (Fiacco & Rosé, 2018; Teasley, 1997; Weinberger & Fischer, 2006). Each message was coded for how students constructed their reasoning and engaged in transactive exchange. The reasoning dimension describes how students produce claims, reasoning to warrant claims, and questions to guide group discussion. Reasoning consists of evidence from scientific facts, observations, or personal experiences, with logical connections for how such evidence may support claims (Nguyen & Santagata, 2020; Weinberger & Fischer, 2006).
The transactive dimension accounts for the social aspects of knowledge construction and consists of transactive and externalizing moves (Roschelle & Teasley, 1995; Weinberger & Fischer, 2006). Transactive acts can be broken down into different moves, for example, when students accept the contributions of a partner, assume the perspectives of a partner, or challenge and modify a partner's stances (Weinberger & Fischer, 2006). These moves are distinguished from externalizing, when students revisit their own ideas (Teasley, 1997).
In the second iteration, I excluded or merged codes with low occurrences and used a bottom-up approach to develop emergent codes. First, I observed that most of students' transactive moves integrated their partners' perspectives into their answers, and there was only one occurrence of challenging another partner's ideas. These codes were thus merged under the transactive category. Second, responsiveness emerged as another category, denoting when students responded to nudges by the agents.
The codes were dichotomous (1 for occurrence, 0 for non-occurrence). Each message could receive codes for multiple categories; for instance, if a message showed both reasoning and transactive talk. Table 1 presents the final coding scheme.
| Code | Description | Example |
|---|---|---|
| Reasoning | | |
| Claim | A statement about an idea or concept | Fish decreases plankton |
| Reasoning | Evidence or explanation to support a claim | … because plankton is abundant in the ocean and is easy food source |
| Questioning | Guiding queries for the discussion | If the water temperature changes what would happen? |
| Transactivity | | |
| Externalizing | Articulate one's own prior thoughts to the group | As I mentioned, I think we should also regulate CO2 emissions |
| Transactive | Integrate, apply or challenge perspectives from a peer's prior ideas | I agree, whales decrease fish because fish is whales's food source. Whales eat plankton too |
| Responsiveness | Students respond to the agent's prompts | Kibot: What would happen if CO2 increase? / imastar: There'll be more acidification |
Once the coding scheme was established, the author and a research assistant separately coded 20% of the data and achieved substantial agreement across dimensions (reasoning: Cohen's κ = 0.98; transactivity: κ = 0.92; responsiveness: κ = 1.00).
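For reference, the agreement statistic can be computed as in this sketch, using illustrative (non-study) ratings for one dichotomous code:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical 0/1 codes from two coders over the same eight messages.
author_codes = [1, 0, 1, 1, 0, 1, 0, 0]
assistant_codes = [1, 0, 1, 1, 0, 1, 1, 0]
print(cohen_kappa_score(author_codes, assistant_codes))  # 0.75 for these toy ratings
```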
Analytical strategies
Student group was the unit of analysis. Group-level analyses align with the study's theoretical focus on how groups' ideas evolve in knowledge construction (Chen et al., 2017; Scardamalia & Bereiter, 1993). Analyses at the individual level were less suitable because of the low number of chat moves per individual (M = 6 utterances per individual per agent). Interaction sequences emerging from so few utterances may be less meaningful. The Limitations section outlines the constraints of this analytical decision in more detail.
Discussion moves and differences between agent conditions
The first research question examined differences between agent conditions in occurrences of discussion moves for reasoning (claim, reasoning, questioning), transactivity (externalizing, transactive), and responsiveness to agents. I calculated the ratio of each move's occurrences out of all messages that each group sent and used Wilcoxon signed-rank tests to examine whether the ratios differed significantly between conditions. To account for multiple comparisons, I applied Benjamini-Hochberg corrections at a false discovery rate of 0.05.
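The sketch below reproduces this testing procedure on simulated (non-study) ratios, pairing each group's peer-condition and expert-condition values:

```python
import numpy as np
from scipy.stats import wilcoxon
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
moves = ["claim", "reasoning", "questioning", "externalizing", "transactive", "responsiveness"]

p_values = []
for move in moves:
    peer = rng.uniform(0, 1, 18)    # ratio of the move per group, peer condition
    expert = rng.uniform(0, 1, 18)  # same groups, expert condition (paired)
    _, p = wilcoxon(peer, expert)   # paired signed-rank test
    p_values.append(p)

# Benjamini-Hochberg correction at a false discovery rate of 0.05.
_, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method="fdr_bh")
for move, p, p_adj in zip(moves, p_values, p_adjusted):
    print(f"{move}: p = {p:.3f}, adjusted p = {p_adj:.3f}")
```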
Differences in sequential patterns
To answer RQ2, I applied sequential pattern mining to examine differences in the sequences of discussion moves between agent conditions. The same codes for discussion moves were arranged in the temporal order in which they occurred. The rules for the sequential patterns were identified using the R package arulesSequences (Buchta et al., 2020). The package uses the cSPADE algorithm (Zaki, 2000) to identify temporal associations between an event and a subsequent one based on frequencies of occurrence. In this study, the students' chat logs formed a set of sequences (eg, one sequence per student group per condition), and each sequence contained a series of reasoning, transactive, and responsive moves. For example, if sequence S1 started with “claim”, there would be some likelihood that “reasoning” would follow “claim” within the same sequence and form the pattern claim -> reasoning. The discussion moves within a sequential pattern had to fall within a 1-min window to capture relevant discussion contexts.
The study considers three metrics (support, confidence and lift) to capture the likelihood of a sequential pattern and to identify candidate sequences for subsequent analyses. Support, ranging between 0 and 1, describes the proportion of sessions in which a specific pattern occurred. Confidence, also ranging between 0 and 1, indicates the likelihood that a discussion move B follows A, given that A occurred. Finally, lift is the support for a pattern (eg, A -> B) divided by the product of the support for A and the support for B. A lift value greater than 1 suggests that a pattern occurs more often than expected by chance. Prior work has noted that high lift values may indicate added value, given their high correlation with domain experts' judgments of interesting patterns (Bazaldua et al., 2014; Merceron & Yacef, 2008).
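Although the study computed these metrics in R with arulesSequences, the toy Python sketch below illustrates the three definitions on hypothetical sessions:

```python
# Each session is an ordered list of discussion moves (one per group per condition).
sessions = [
    ["claim", "reasoning", "transactive"],
    ["claim", "questioning"],
    ["claim", "reasoning"],
    ["questioning", "externalizing"],
]

def support(pattern):
    """Proportion of sessions containing the pattern's moves in order."""
    def contains(seq, pat):
        i = 0
        for move in seq:
            if move == pat[i]:
                i += 1
                if i == len(pat):
                    return True
        return False
    return sum(contains(s, pattern) for s in sessions) / len(sessions)

sup_ab = support(["claim", "reasoning"])                   # 0.50
sup_a, sup_b = support(["claim"]), support(["reasoning"])  # 0.75 and 0.50
confidence = sup_ab / sup_a            # 0.67: how often reasoning follows claim
lift = sup_ab / (sup_a * sup_b)        # 1.33 > 1: above-chance co-occurrence
print(sup_ab, round(confidence, 2), round(lift, 2))
```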
The thresholds for the parameters were set as follows: support at 0.25, confidence at 0.50, and lift at 1.25. These thresholds helped capture a wider range of representative and interesting sequences than setting high support and confidence thresholds. The findings present the highest lift-value sequential patterns for each agent condition. Findings focus on lift values to indicate levels of interestingness and include excerpts from student discussions to illustrate the patterns.
To explore the differences between agent conditions, I calculated the occurrences of the top patterns for each student group. Wilcoxon signed-rank tests were used to determine whether occurrences differed by agents, with pattern occurrences as the dependent variables and the agent conditions as the independent variable. The tests used Benjamini-Hochberg corrections (false discovery rate of 0.05) to account for multiple comparisons.
Finally, in the study's within-subject design, groups started with an agent and switched to the other agent halfway through the lesson. Thus, I compared the interaction patterns with each agent in relation to which agent a student group started the learning task with. These analyses explored whether interactions with each agent remained consistent regardless of the starting agent.
FINDINGS
Frequencies of discussion moves
The descriptive statistics present an overview of students' interactions. Student groups did not differ between agent conditions in the number of messages they sent. On average, groups sent 16.21 messages (SD = 16.01) in the peer condition and 20.64 messages (SD = 32.05) in the expert condition (W = 302.5, p = 0.52).
Within conditions, students sent the most claims (peer: 264 messages; expert: 316), followed by externalizing their own ideas (peer: 172; expert: 173), responsiveness (peer: 101; expert: 114), transactive exchange to build on peers' ideas (peer: 93; expert: 96) and questioning (peer: 89; expert: 73). Reasoning had the lowest occurrences (peer: 60; expert: 72).
Wilcoxon tests suggested no significant difference between conditions in the ratios of occurrence for reasoning, transactivity, or responsiveness (Table 2). These findings indicate that the agent designs might have resulted in similar levels of engagement for student groups. Additionally, Mann-Whitney tests suggested that discussion moves were consistent regardless of the agent profile groups started with. Recall that groups started with one agent (less-knowledgeable-peer or expert) and switched to the other during the lesson. The starting agent profile did not significantly influence the frequencies of discussion moves with each agent (Online Appendix A3).
| Code | M (peer) | SD (peer) | M (expert) | SD (expert) | W | p | Adjusted p |
|---|---|---|---|---|---|---|---|
| Reasoning | | | | | | | |
| Claim | 0.36 | 0.49 | 0.31 | 0.30 | 428.50 | 0.91 | 0.95 |
| Reasoning | 0.20 | 0.21 | 0.16 | 0.15 | 124.50 | 0.69 | 0.95 |
| Questioning | 0.22 | 0.22 | 0.25 | 0.28 | 75.50 | 0.95 | 0.95 |
| Transactivity | | | | | | | |
| Externalizing (self's ideas) | 0.39 | 0.16 | 0.35 | 0.10 | 127.50 | 0.57 | 0.95 |
| Transactive (prior ideas, friend) | 0.21 | 0.11 | 0.19 | 0.12 | 122.50 | 0.47 | 0.95 |
| Responsiveness | 0.25 | 0.15 | 0.23 | 0.13 | 123.00 | 0.88 | 0.95 |
Questioning sequences with the less-knowledgeable-peer agent
The second question explores the temporality in groups' interactions. Table 3 lists the top patterns for each agent condition, ranked by lift values to indicate interestingness. For example, the two sequences with the highest lift values in the less-knowledgeable-peer condition were (1) questioning, followed by externalizing one's own ideas and questioning; and (2) reasoning and responding to agent, followed by transactive exchange. Meanwhile, the top two sequences in the expert condition were (1) questioning, followed by another question; and (2) responding to agent and building on one's own idea, followed by transactive exchange.
| Less-knowledgeable-peer agent | Lift | Support | Confidence | n (peer) | n (expert) |
|---|---|---|---|---|---|
| Question => Externalize; Question | 1.54 | 0.33 | 0.60 | 12 | 6 |
| Reasoning; Responsiveness => Transactive | 1.50 | 0.28 | 1.00 | 10 | 8 |
| Question => Transactive | 1.50 | 0.28 | 1.00 | 10 | 0 |
| Transactive (multiple) => Question | 1.50 | 0.28 | 1.00 | 10 | 2 |
| Responsiveness; Externalize => Transactive | 1.50 | 0.33 | 1.00 | 12 | 16 |

| Expert agent | Lift | Support | Confidence | n (peer) | n (expert) |
|---|---|---|---|---|---|
| Question => Question | 1.73 | 0.29 | 0.71 | 12 | 10 |
| Responsiveness; Externalize => Transactive | 1.42 | 0.47 | 1.00 | 12 | 16 |
| Externalize; Responsiveness => Transactive | 1.42 | 0.29 | 1.00 | 10 | 10 |
| Externalize; Transactive; Responsiveness => Transactive | 1.42 | 0.35 | 1.00 | 6 | 12 |
| Responsiveness (multiple) => Transactive | 1.30 | 0.35 | 1.00 | 12 | 12 |
Notes
- “;” indicates co-occurring moves within a message; “=>” indicates a sequence.
- Lift = probability of a sequence's occurrence relative to chance. Support = proportion of sessions containing the pattern. Confidence = likelihood that move B follows A, given that A occurred.
Excerpt 1

S1: what kills fish?
S1: if there were plastic there what would happen
Kibot: I am not sure. What do you think would happen?
S1: plastic kills fish because fish gets stuck. Does plastic kill fish?
S1 began his interaction with Kibot less-knowledgeable-peer with open-ended questions. In response, Kibot expressed uncertainty to invite students to articulate their thinking. This utterance prompted S1 to provide an answer to his prior questions, and to follow with another question about the same concept.
Excerpt 2

S3: Global warming decreases oxygen.
S2: Yeah, because warm water holds less O2.
S2: And O2 is important because fish needs O2.
S3: zooplankton increases O2 because zooplankton creates O2.
S2: I think phytoplankton does.
S2: How does higher temperature effect fish?
Kibot: What would happen to elements in this system if ocean temperature increases?
S4: Higher temperature would influence the habitat because fish only lives in certain temperature range.
S5: Agreed, so fish would die off if temperature is too high. Higher temperature also influences the habitat by releasing more O2.
Wilcoxon tests provided evidence of differences between agent conditions in the occurrences of questioning sequences (questioning -> transactive: Mpeer = 0.56, SDpeer = 0.62; Mexpert = 0, SDexpert = 0; adjusted p = 0.003. Transactive (multiple) -> questioning: Mpeer = 0.56, SDpeer = 0.51; Mexpert = 0.17, SDexpert = 0.38; adjusted p = 0.03). Table 4 presents the results.
| # | Code | M (peer) | SD (peer) | M (expert) | SD (expert) | W | p | Adjusted p |
|---|---|---|---|---|---|---|---|---|
| 1 | Question => Externalize; Question | 0.67 | 0.49 | 0.33 | 0.59 | 216 | 0.05 | 0.09 |
| 2 | Reasoning; Responsiveness => Transactive | 0.56 | 0.51 | 0.47 | 0.62 | 170 | 0.53 | 0.71 |
| 3 | Question => Transactive | 0.56 | 0.62 | 0 | 0 | 224 | 0.0004*** | 0.003** |
| 4 | Transactive (multiple) => Question | 0.56 | 0.51 | 0.17 | 0.38 | 206 | 0.01* | 0.03* |
| 5 | Responsiveness; Externalize => Transactive | 0.67 | 0.59 | 1.00 | 0.37 | 99 | 0.05 | 0.09 |
| 6 | Question => Question | 0.67 | 0.59 | 0.63 | 0.50 | 147 | 0.92 | 1.00 |
| 7 | Externalize; Responsiveness => Transactive | 0.56 | 0.70 | 0.63 | 0.50 | 128 | 0.55 | 0.71 |
| 8 | Externalize; Transactive; Responsiveness => Transactive | 0.33 | 0.49 | 0.75 | 0.45 | 84 | 0.02* | 0.06 |
| 9 | Responsiveness (multiple) => Transactive | 0.67 | 0.49 | 0.67 | 0.49 | 162 | 1.00 | 1.00 |
Notes
- The “Responsiveness; Externalize => Transactive” pattern appeared in the lists for both conditions.
- *p < 0.05; **p < 0.01; ***p < 0.001.
The top sequences for each agent were consistent regardless of the agent student groups started with (Online Appendix A4). In groups who started with or switched to the less-knowledgeable-peer agent during the lesson, the top sequences with this agent primarily involved questioning. In comparison, most sequences with the expert agent included transactive and responsiveness moves. These results align with patterns from the overall sample and suggest that randomizing the starting agents and switching agents did not substantially impact interaction sequences.
In sum, the sequential pattern mining surfaced distinctive patterns, such as student groups in the less-knowledgeable-peer condition posing questions to the group and the agent, explaining ideas, and posing questions again. Such sequences may suggest “quizzing” behaviors toward the agent, which create opportunities for knowledge construction.
DISCUSSION
Designing for knowledge construction
Excerpt 3

S6: the fish eat zooplankton
S6: zooplankton gets eaten by fish
Kibot: What would happen to other elements in this system if whales decrease?
S7: if the whales go extinct there will be more fish
S6: do sharks eat fish
S8: why would whales eat fish
S7: whales eat plankton too
Here, the students started their concept map by linking fish and zooplankton. Kibot kept track of those connections and cued “whales”, an element linked to both “fish” and “zooplankton”. The conversation flowed naturally to include the target relationships (whales -> fish; whales -> zooplankton) and other predators of “fish”, such as “sharks”. As another example, if students have not created the intended connections within five talk turns of Kibot's hint about “whales”, the agent offers a more explicit prompt, eg, “Would fish increase or decrease if there were more whales?” The missing connections are tied to recently created terms in the concept map (“fish”), thus maintaining conversation coherence.
Adaptivity also manifests in the agents' awareness of ongoing group dynamics. The agents keep track of each student's participation rate to engage the least active students, eg, “S2, help me out. Do you agree or disagree with your friends?” Recognizing group dynamics positions the agents as ingroup members and reduces users' potential antagonistic treatment of the agents, such as ignoring or abusing them (Sebo et al., 2020).
Furthermore, the combination of visual and verbal cues may have “humanized” the agents and made the interactions more engaging (Feine et al., 2019; Go & Sundar, 2019; Muresan & Pohl, 2019). The agents assume roles as an expert or a less-knowledgeable peer and have human-like visual features such as eyes, expressions and clothing to reflect these profiles. For example, the expert agent wears an “expert” tag on a white coat, while the less-knowledgeable-peer agent shows dynamic expressions (confusion, excitement) to reflect its younger profile (Figure 1). These visual cues can be helpful, because people tend to reason with anthropomorphized agents in ways they act with humans (Malle et al., 2016; Nass & Moon, 2000). Users may subsequently take on responsibilities as a tutor to the less-knowledgeable-peer agent or a tutee to the expert agent.
Previous agents in learning domains have mostly focused on conceptual prompts (eg, Dyke, Adamson, et al., 2013; Heidig & Clarebout, 2011; Kim & Baylor, 2016). The Kibot agents employ additional social verbal cues, including small talk, acknowledgement, uncertainty, and opinion conformity (Online Appendix A2). In expressing uncertainty, for instance, the less-knowledgeable-peer agent reveals its vulnerability to invite students to elaborate their ideas. Students may be more willing to respond to the agents when they recognize the agents' human-like characteristics through these social cues and position the agents as conversational partners (Feine et al., 2019).
Questioning sequences with the less-knowledgeable-peer agent
Examination of sequences with high lift values suggests that student groups enacted distinct conversational norms with the two agents. Responses to the expert agent were followed by transactive exchange, similar to how students might react to a teacher's prompts. Meanwhile, several groups showed “quizzing” behaviors with the less-knowledgeable-peer agent.
The questioning sequences resemble interactions that support learning in peer tutoring. Tutors rely on elaborated explanations and questioning to guide tutees' thinking (Graesser & Person, 1994). Excerpts from the current work illustrate how questioning sequences might involve multiple students, instead of one student's interactions with the agent. Prior work with one-to-one intelligent tutors has employed the learn-through-teaching framework, where students iteratively taught and tested CAs (Biswas et al., 2010, 2016). These patterns can be extended to collaborative settings, where efforts to teach the agent no longer rest with an individual. Through these efforts, groups coordinate attention around the same concepts.
Questioning can benefit student tutors by providing opportunities for tutors to reflect on their own knowledge and move towards knowledge construction (Berghmans et al., 2013; King et al., 1998). Excerpt 2 in this study exemplifies complex connections that span across systems elements in the concept map. These questions may lead to elaborated explanations from the questioners and group members.
Currently, the Kibot agents respond to students' questions only with general answers such as “I'm not sure. Can someone in the group explain?” to prompt elaboration. Future iterations of the agents can introduce a mix of general prompts and responses that contain alternative conceptions, to engage students in idea elaboration. These prompts can trigger episodes of tutors' questioning, guidance and feedback, which can enhance tutors' learning gains (Cohen et al., 1982; Graesser et al., 1995).
Limitations and future directions
The limitations of this study can guide future investigations. First, analyses were at the group rather than the individual level. The effects of agent designs may vary with group dynamics and individual behaviors (Sebo et al., 2020). In a case study with the same Kibot agents, Nguyen (2022) found that groups composed of students with expanding (high), mixed and emergent (low) prior science understanding interacted with the agents differently. The emergent group primarily showed responsiveness to the expert agent, whereas the expanding group showed more transactive exchange with the less-knowledgeable-peer agent. Incorporating group and individual dynamics into sequential pattern mining would require more complex analyses, such as multilevel models with large samples. Future studies can employ these analyses and consider additional variables that may influence knowledge construction, such as domain understanding, perceptions of agents and participation tendency.
Second, the within-subject design did not allow for linking interaction sequences to learning outcomes. Future research can apply between-subject designs to explore the pathways between agent condition, reasoning and transactivity, and learning performance.
Overall, findings broaden the design space for learning technologies in collaborative contexts. The sequences uncovered by this research can be applied towards developing agent adaptivity. Future systems can adapt designs to promote appropriate discussion moves and sequences, given dynamic learning goals and groups' knowledge and collaboration states.
CONCLUSION
This study examines the design of two agent prototypes to facilitate students' reasoning, transactivity and responsiveness to the agents. Findings illustrate that design tweaks in agents' appearances and linguistic styles can facilitate different discussion sequences, including groups' questioning and explaining to the less-knowledgeable-peer agent. These sequences can promote students' reflection and idea elaboration. As intelligent systems become increasingly prevalent in collaborative learning, we need to consider how student groups interact with such systems. This work illustrates how embedding humanness in agents' designs, such as in the form of a less knowledgeable peer or an expert, can support this vision.
ACKNOWLEDGEMENTS
The research in this study was funded by the University of California, Irvine's Center for Teacher Development & Professional Practice. The author acknowledges the state park educators, partner teachers and students who participated in this study. The author also thanks the editor and anonymous reviewers for their invaluable feedback on the manuscript.
CONFLICT OF INTEREST
The author has no conflict of interest.
ETHICS STATEMENT
The research for this study was approved under the university's Institutional Review Board, HS#2020-6273 e-App# 15676.
Open Research
DATA AVAILABILITY STATEMENT
The data from this study cannot be made openly available due to confidentiality agreements. The coding procedures are included in the paper.