Solving Proof Block Problems Using Large Language Models (SIGCSE TS 2024 - Papers)

Who

Seth Poulsen, Sami Sarsa, James Prather, Juho Leinonen, Brett Becker, Arto Hellas, Paul Denny, Brent Reeves

Track

SIGCSE TS 2024 Papers

Time Zone

The program is currently displayed in (GMT-07:00) Pacific Time (US & Canada).

When

Fri 22 Mar 2024, 14:10 - 14:35, at Meeting Rooms C120-122 (session: Engaging Tools). Chair(s): Lama Hamandi

Abstract

Large language models (LLMs) have recently taken many fields, including computer science, by storm. Most recent work on LLMs in computing education has shown that they are capable of solving most introductory programming (CS1) exercises, exam questions, Parsons problems, and several other types of exercises and questions. Some work has investigated the ability of LLMs to solve CS2 problems as well. However, it remains unclear how well LLMs fare against more advanced upper-division coursework, such as proofs in algorithms courses. After all, while known to be proficient in many programming tasks, LLMs have been shown to have more difficulties in forming mathematical proofs.

In this paper, we investigate the ability of LLMs to solve mathematical proofs by using Proof Blocks, a tool previously shown to efficaciously teach proofs to students. Our results show that GPT-3.5 is almost completely unable to provide correct solutions (11.4%), while GPT-4 shows a significant increase in correctness (64.8%). However, even given the incredible leap forward, current models still struggle to correctly order lines in a proof. It remains an open question whether this represents the temporary status of LLMs or if they will keep struggling to solve these types of exercises in the future.

DOI

https://doi.org/10.1145/3626252.3630928

Seth Poulsen, Utah State University, United States

Sami Sarsa, Aalto University, Finland

James Prather, Abilene Christian University, United States

Juho Leinonen, Aalto University, Finland

Brett Becker, University College Dublin, Ireland

Arto Hellas, Aalto University, Finland

Paul Denny, The University of Auckland, New Zealand

Brent Reeves, Abilene Christian University, United States

Session Program

Fri 22 Mar
Displayed time zone: Pacific Time (US & Canada)

13:45 - 15:00	Engaging Tools (Papers) at Meeting Rooms C120-122. Chair(s): Lama Hamandi, Northeastern University

13:45 (25m) Talk: Disentangling the Learning Gains from Reading a Book Chapter and Completing Proof Blocks Problems [Papers]. Seth Poulsen (Utah State University), Yael Gertner (University of Illinois Urbana-Champaign), Hongxuan Chen (University of Illinois at Urbana-Champaign), Benjamin Cosman (University of California San Diego), Matthew West (University of Illinois at Urbana-Champaign), Geoffrey Herman (University of Illinois at Urbana-Champaign)
14:10 (25m) Talk: Solving Proof Block Problems Using Large Language Models [Global, Papers]. Seth Poulsen (Utah State University), Sami Sarsa (Aalto University), James Prather (Abilene Christian University), Juho Leinonen (Aalto University), Brett Becker (University College Dublin), Arto Hellas (Aalto University), Paul Denny (The University of Auckland), Brent Reeves (Abilene Christian University)
14:35 (25m) Talk: Using Worked Examples for Engaging in Epistemic Programming Projects [Global, Papers]. Sven Hüsing (Paderborn University), Carsten Schulte (University of Paderborn), Sören Sparmann (Paderborn University), Mario Bolte (Paderborn University)