exam-testing
About this tag
The exam-testing tag on WindowsForum.com covers discussions about evaluating AI models using school-level tests. In one thread, a user runs OpenAI's gpt-oss:20b model through a test designed for 10- and 11-year-olds and finds that the model reasons capably on paper but ultimately scores below a real child. The tag is relevant for anyone interested in benchmarking AI performance against human standards, particularly in educational contexts. Topics include reasoning with locally run models, open-weight models, and the practical limitations of AI on academic exams.
OpenAI’s new open-weight model suite landed squarely in the spotlight, and when I ran the smaller gpt-oss:20b through a real-world school test designed for 10‑ and 11‑year‑olds, the model proved interestingly capable on paper but ultimately fell short of beating an actual 10‑year‑old at their...