Emergent Reasoning in Large Language Models: A Case Study on STEM Problem Solving

Wei Chen

Case Study · 2024 · DOI · Open Access · Peer Reviewed

Abstract

We present a detailed case study examining emergent reasoning capabilities in GPT-4-class models on multi-step STEM problems. Using a novel evaluation framework comprising 2,400 problems across physics, chemistry, and mathematics, we characterize the boundaries of in-context learning and chain-of-thought prompting. Structured prompting outperforms direct approaches by 31% on multi-step calculus problems.
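The abstract contrasts structured (chain-of-thought) prompting with direct prompting. The paper's exact prompt templates are not shown here, so the sketch below is a hypothetical, minimal illustration of the distinction: the direct prompt requests only a final answer, while the structured prompt asks the model to lay out intermediate steps. Prompt wording and the sample problem are assumptions, not the authors' materials.

```python
# Hypothetical sketch of the two prompting styles compared in the abstract.
# The templates here are illustrative assumptions, not the paper's prompts.

def direct_prompt(problem: str) -> str:
    """Direct approach: ask for the final answer only."""
    return f"Problem: {problem}\nAnswer:"

def chain_of_thought_prompt(problem: str) -> str:
    """Structured approach: elicit intermediate reasoning before the answer."""
    return (
        f"Problem: {problem}\n"
        "Let's solve this step by step, showing each intermediate result.\n"
        "Step 1:"
    )

problem = "Evaluate the integral of 3x^2 from 0 to 2."
print(direct_prompt(problem))
print(chain_of_thought_prompt(problem))
```

The only difference between the two is the scaffolding in the prompt text; in chain-of-thought evaluation the model's generated steps are then parsed for a final answer.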

Publication Information

Accepted: January 20, 2024

Author Information

Affiliation: MIT Laboratory for Artificial Intelligence
Keywords: emergent reasoning, LLM, chain-of-thought, STEM, in-context learning
