StepWise Virtual Tutor for Math Word Problems Using SRSD
Querium received a new grant in May 2022 from the U.S. Department of Education Institute of Education Sciences (IES). With this grant, we seek to develop an important new software product for use with the StepWise AI: StepWise for Math Word Problems (SW4WP). To create this novel AI-based solution for assisting students with math word problems, we plan to use a combination of Natural Language Processing (NLP) software, Self-Regulated Strategy Development (SRSD) techniques, and research-based algorithms implementing effective methods for students to tackle math word problems (schemas).
Our funding agency has requested that we test our prototype and measure student outcomes. We have designed a study to be conducted this fall in a local school district near Austin, TX.
Querium’s work on SW4WP is based in part on the research of Professor Sarah Powell of UT Austin’s Department of Special Education in the College of Education. Dr. Powell and her colleagues have performed studies to determine effective math problem-solving methods for students with learning differences. Our work on SW4WP attempts to automate these methods (called schemas) in the StepWise AI software. Dr. Powell is a consultant on our grant team.
Dr. Leslie Laud is co-PI, together with the University of Providence, on a five-year, $11M Department of Education Education Innovation and Research (EIR) grant. This grant establishes nationwide professional development programs for instruction in Self-Regulated Strategy Development (SRSD) techniques for writing. In 2021, Querium collaborated with Dr. Laud’s company ThinkAUM on a grant-funded project to develop AI software to teach SRSD for writing. Dr. Laud is a consultant on our new grant team.
Jerri LaMirand is the retired Director of STEM for the Eanes Independent School District, which is located outside Austin, TX. Ms. LaMirand is a member of this study team and is responsible for recruiting teachers for the study. Ms. LaMirand also contributed to the design of SW4WP.
None of these three consultants will have access to specific student data from the study or direct contact with study participant students.
Rationale and Background Information
In order to provide the Department of Education IES with the best possible evidence of efficacy for our prototype, we seek some indication that students who use SW4WP solve simple word problems with greater facility than students who do not use SW4WP. Because we are not providing significant instruction on schemas or SRSD to study participants at this time, we have only modest expectations for measurable differences between the control group and the treatment group. This study is therefore an exploratory study. Should our application for a larger phase 2 award receive funding, we expect to complete a full Study of Promise during the 2024-2025 school year.
In 2014, Querium began developing StepWise Virtual Tutor, a versatile and robust tool that deploys patented Artificial Intelligence (AI) technology to provide real-time, flexible, personalized tutoring and assessment for math students. The goal was to achieve outcomes comparable to those of traditional one-on-one tutoring—without the associated costs. StepWise for Algebra has now been used by 20,000 students in primary schools, high schools, and colleges.
Unlike other systems that walk students through only one path to solve each problem, StepWise allows students to explore many valid pathways to a solution. The system tracks each step in students’ work, catches algebraic and arithmetic errors, and provides hints and feedback.
StepWise AI is a cloud-hosted service that students can access from a standard web browser. For this study, StepWise software will be delivered through the open-source Open edX Learning Management System (LMS), which will govern student access and record student performance results in StepWise.
For over 10 years, Professor Sarah Powell and her collaborators have conducted research into effective techniques for teaching students to tackle math word problems [Powell 2011], [Powell et al, 2020]. Querium now wishes to implement these schemas in the StepWise AI software.
For approximately 30 years, Self-Regulated Strategy Development (SRSD) techniques have been developed, tested, and used to teach writing. Using SRSD, students develop their own strategies for planning and completing a writing assignment [Harris et al, 2008]. More recently, researchers have begun to apply SRSD concepts to the teaching of mathematics [Case et al, 1992], [Cuenca-Carlino et al, 2016], [Hughes & Lee, 2020], [Kiuhara et al, 2020], [Popham et al, 2020]. Querium now wishes to implement these SRSD techniques in the StepWise AI software.
Querium’s approach to incorporating SRSD within StepWise is known as POWER:
- Work the Problem
- Explain Your Results
- Review & Revise
We have developed a user interface for SW4WP that utilizes these five steps to guide students through the schema for solving a given word problem.
References
Case, L. P., Harris, K. R., & Graham, S. (1992). Improving the mathematical problem-solving skills of students with learning disabilities: Self-regulated strategy development. The Journal of Special Education, 26(1), 1–19.
Cuenca-Carlino, Y., Freeman-Green, S., Stephenson, G. W., & Hauth, C. (2016). Self-regulated strategy development instruction for teaching multi-step equations to middle school students struggling in math. The Journal of Special Education, 50(2), 75–85.
Harris, K. R., Graham, S., Mason, L. H., & Friedlander, B. (2008). Powerful writing strategies for all students. Brookes.
Hughes, E. M., & Lee, J. Y. (2020). Effects of a mathematical writing intervention on middle school students’ performance. Reading & Writing Quarterly, 36(2), 176–192.
Kiuhara, S. A., Gillespie Rouse, A., Dai, T., Witzel, B. S., Morphy, P., & Unker, B. (2020). Constructing written arguments to develop fraction knowledge. Journal of Educational Psychology, 112(3), 584–607.
Popham, M., Adams, S., & Hodge, J. (2020). Self-regulated strategy development to teach mathematics problem solving. Intervention in School and Clinic, 55(3), 154–161.
Powell, S. R. (2011). Solving word problems using schemas: A review of the literature. Learning Disabilities Research & Practice, 26(2), 94–108. https://doi.org/10.1111/j.1540-5826.2011.00329.x
Powell, S. R., Berry, K. A., & Benz, S. A. (2020). Analyzing the word-problem performance and strategies of students experiencing mathematics difficulty. Journal of Mathematical Behavior, 58(100759), 1–16.
Study Goals and Objectives
- Perform an initial exploratory study looking for any impact on student competence on a small set of word problem types.
- Generate data for inclusion in our Phase 2 grant proposal due February 15, 2023.
- Provide hands-on use of our prototype software and UI to real students.
Study Design
We plan to test our prototype on a treatment group of 100 5th-grade and 6th-grade students. The treatment students will solve approximately 10 math word problems using a version of StepWise AI that implements problem-solving schemas taken from the research work of Dr. Powell. This version also implements the POWER methodology based on the principles of SRSD. The control group students will solve the same word problems using the current version of StepWise AI, which lacks support for schemas and SRSD. We wish to understand during the short study period whether the treatment students have better outcomes in solving the word problems using the software that has been enhanced with schemas and SRSD. At this time we do not wish to assess independently the impact of schemas separate from the impact of SRSD. We leave that refinement for the next phase in 2023. The short duration of our grant limits the size of the study population and the amount of work we can do with any given student.
We seek to limit the number of in-class days consumed by our study, because the study uses precious classroom time for the participating teachers. We also seek to keep the number of participating teachers below 10, and ideally below 6. Finally, we will attempt to align the types of math word problems included in the study with the specific topics being taught to students during the study period.
Because this is a small exploratory study with limited time and funding, we do not plan to seek a very high confidence level or an ultra-low margin of error. We’ve chosen a 95% confidence level and a margin of error of 10%. Given a 50% population proportion for a general student population, we get a computed sample size of 97 students. We wish to have equally sized treatment and control populations. Therefore we nominally seek 100 treatment students and 100 control students. Assuming an average class size of 20-25 students, we are seeking 5 teachers, each of whom would provide 20 treatment students and 20 control students. We will also seek a few additional control and treatment students in case some students fail to complete the study, for example, due to absence on the dates of the study.
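The figure of 97 students is consistent with the standard sample-size formula for estimating a proportion in a large population, n = z² · p(1 − p) / e². The text does not name the formula that was used, so the following Python sketch rests on that assumption; the function and parameter names are illustrative only.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(confidence: float,
                               margin_of_error: float,
                               proportion: float = 0.5) -> int:
    """Minimum sample size for estimating a proportion (large population).

    Assumes the standard formula n = z^2 * p * (1 - p) / e^2, where z is the
    two-sided critical value of the standard normal distribution.
    """
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    # Round up, because a fractional student cannot be recruited.
    return math.ceil(n)


# Parameters stated in the study design: 95% confidence, 10% margin, p = 0.5.
students_needed = sample_size_for_proportion(0.95, 0.10, 0.5)
print(students_needed)
```

With the parameters stated above, the unrounded value is about 96.04, which rounds up to the 97 students quoted in the text. Tightening the margin of error to 5% under the same assumptions would raise the requirement to 385 students, which illustrates why a 10% margin suits a short exploratory study.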
This relatively small population cannot easily be subdivided for further study based on socioeconomic status, race, standardized test scores, etc. Therefore we will not have the specific knowledge of individual student attributes necessary to allow us to draw meaningful conclusions based on student subgroups. For this reason, we plan to conduct the study anonymously. Classroom teachers who participate in the study will know the names of their student participants, and they will know which students are in the control group and which are in the treatment group. Our researchers will not have access to the Personally Identifiable Information (PII) about the students.
Both control and treatment students will work during one 55-minute class period on three consecutive school days. Parental permission will be obtained prior to the study days in cooperation with the school district.
Day 1: Students complete a pre-test to assess their word problem-solving proficiency and a survey of attitudinal questions on their math confidence. Students complete a brief online training course in the use of the StepWise software they will be using on Day 2.
Day 2: Students attempt to complete an online assignment of 10-12 math word problems using StepWise on their school-supplied iPads. Control students will use an unmodified version of StepWise. Treatment students will use a modified version of StepWise that implements Dr. Powell’s schemas presented in the context of the POWER methodology based on SRSD.
Day 3: Students retake the same assessment as they did on Day 1 as a post-test. Students also complete a brief online survey to provide opinions and feedback about StepWise.
By providing consistency between the control group and the treatment group in all study aspects except for the two items we will vary (use of schemas and use of SRSD/POWER), we hope to control for as many factors as possible.
Attitudinal questions will be taken from the Math Problem Solving Assessment-Short Form (MPSA-SF). Sample math word problems for the pre-test and post-test will be taken from a standardized set used by Dr. Powell’s research group at UT Austin.
There are no physical health risks associated with the use of StepWise on iPads. There are no repetitive manual operations associated with the use of StepWise. The StepWise software itself provides positive encouragement to students and attempts to mimic the feedback that students would receive from their classroom teacher, minimizing adverse impacts on students’ mental health. By withholding student PII from the project team, we hope to avoid disclosure of PII to third parties.
Data Management and Statistical Analysis
The StepWise software will operate within the Open edX learning management system, a browser-based open-source software product developed by MIT and Harvard. Student answers to survey questions, pre-test questions, study questions, and post-test questions will be recorded in a MySQL database within Open edX. Data about student time-on-task will also be recorded. Data analysis will be performed on the MySQL data, likely through the use of Mathematica for statistical analysis.
We plan to compare the time that control and treatment students spend on each word problem, and to compare the two groups’ success in getting correct answers on each word problem. We will also look for potential correlations between students’ answers to the attitudinal questions and their time on task and number of correct answers.
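As a rough illustration of these comparisons, the Python sketch below computes per-group success rates, mean time per problem, and a Pearson correlation between an attitudinal score and accuracy. It is not the planned analysis, which is likely to be done in Mathematica. The record layout, field names, and all data values are hypothetical placeholders rather than study data.

```python
import math
from dataclasses import dataclass
from statistics import mean


@dataclass
class StudentResult:
    """One anonymized student record (hypothetical layout)."""
    group: str                 # "control" or "treatment"
    problems_correct: int
    problems_attempted: int
    seconds_on_task: float
    confidence_score: float    # attitudinal survey score


def summarize(results, group):
    """Success rate and mean seconds per problem for one group."""
    rows = [r for r in results if r.group == group]
    correct = sum(r.problems_correct for r in rows)
    attempted = sum(r.problems_attempted for r in rows)
    return {
        "n": len(rows),
        "success_rate": correct / attempted,
        "mean_seconds_per_problem": mean(
            r.seconds_on_task / r.problems_attempted for r in rows
        ),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(sxx * syy)


def confidence_vs_accuracy(results):
    """Correlation between attitudinal score and fraction of correct answers."""
    scores = [r.confidence_score for r in results]
    accuracy = [r.problems_correct / r.problems_attempted for r in results]
    return pearson(scores, accuracy)


# Made-up placeholder records, for illustration only.
example = [
    StudentResult("control", 5, 10, 600.0, 2.0),
    StudentResult("control", 6, 10, 500.0, 3.0),
    StudentResult("treatment", 7, 10, 450.0, 3.0),
    StudentResult("treatment", 8, 10, 400.0, 4.0),
]
print(summarize(example, "control"))
print(summarize(example, "treatment"))
print(confidence_vs_accuracy(example))
```

A real analysis would also need a significance test and some handling of students who miss a study day. Both are omitted here because the text does not specify them.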
Primary data analysis will be performed by April Warn, our team research member. Maria Robinson, a Ph.D. mathematician team member, will assist in the quality assurance of the resultant data. The quality assurance process will not disclose PII about the student participants.
Expected Outcomes of the Study
Given the short timeline for completing the study, we will not have a large amount of time to work with study participants, so we consider that any positive performance outcome from our sample will be a success. In a subsequent study planned for 2024, we will seek to assess a larger population and multiple subgroups of interest.
Dissemination of Results and Publication Policy
After the analysis of the study data, we must prepare and submit a final report for our funding source by January 15, 2023. This report will be a public record, i.e., subject to FOIA requests. We will also seek to publish the results in an academic journal through co-authorship with Sarah Powell (UT Austin) and/or Leslie Laud (Bank Street College of Education). Neither student PII nor the names of participating teachers will be included in our reports.
Duration of the Project
Our data collection will take place in late October and early November, 2022. Our final report is due January 15, 2023. Our application for a phase 2 award is due February 15, 2023, but is contingent on the congressional appropriation of funds for the Small Business Innovation Research (SBIR) program. Should we receive the Phase 2 award in mid-2023, the work to expand the prototype into a full product would take place from mid-2023 through mid-2024, with an expanded Study of Promise with students to take place during the 2024-2025 school year.
- Students don’t willingly do homework. We don’t believe that we could count on our test subjects to prepare at home for their participation in the study, for example, by reading background materials or watching videos at home.
- Parents often don’t complete permission slips. We plan to approach more potential study subjects than we need for our chosen sample size in order to get a sufficient number of subjects.
- The study software may not function well on the district-supplied iPads used by student participants. We plan to test our software on similar hardware and software to that used by the students in the study.
- The district-supplied iPads run ‘lock-down’ web browsers, so testing will be required to ensure their compatibility with the study software. We will work with district staff to ensure that the study participants have access to the study software website.
- Internet bandwidth within testing schools may be a scarce resource on the testing dates. Teachers participating in the study are not in control of the overall load on their campus internet connection. We will work with district IT staff to help ensure that our use of the internet during the study is harmonious with other uses of the campus internet bandwidth on the study dates.
- In past studies with students, our staff has been present in the classrooms with the participating teachers during data collection. In this study, we are working anonymously with students, so we don’t plan to have project members in the classroom during data collection. Our team will be available by phone to the study teachers to answer any questions during the study and to provide remote assistance for the study software.
The study project team of Elaine Kant (PI), Kent Fuka, and April Warn holds weekly meetings to coordinate the project. Our three software developers have no access to student PII. One software developer with a Mathematics Ph.D. will act as quality assurance on the analysis results produced by the team.
The government grant we obtained for funding this study has restrictions on conflicts of interest. Querium would seek to be awarded the phase 2 grant to provide further funding for 2023-2025. Elaine Kant and Kent Fuka are corporate officers of Querium, and Kent Fuka is a corporate director of Querium. As officers and directors of Querium, they bear liability for failures of governance. Querium corporate policy expressly prohibits ethical violations. Querium’s reputation in the market would be adversely affected if Querium demonstrated a lack of ethics. Both Elaine Kant and Kent Fuka are investors in Querium, and their investment value would be adversely affected by any findings of ethics violations.
Informed consent forms
A Parental Consent Form and a Student Assent Form have been developed and are available for review.