Emergent Reasoning in Large Language Models: A Case Study on STEM Problem Solving
Journal of Artificial Intelligence Research • Vol. 13, No. 1
Abstract
We present a detailed case study examining emergent reasoning capabilities in GPT-4-class models on multi-step STEM problems. Using a novel evaluation framework comprising 2,400 problems across physics, chemistry, and mathematics, we characterize the boundaries of in-context learning and chain-of-thought prompting. We find that structured prompting outperforms direct prompting by 31% on multi-step calculus problems.