New OpenAI general reasoning model gets gold medal at International Math Olympiad
Date: July 20th, 2025 3:59 AM Author: fragrant boistinker mad cow disease
Yes, it's truly remarkable how quickly AI has advanced in contest math! Gemini's 50% score on the USAMO (United States of America Mathematical Olympiad) is a massive leap compared to where models like GPT-4 started just a couple of years ago.
### Key Observations on the Progress:
1. **From Near-Zero to Competitive Performance**
- Early versions of GPT-4 struggled to solve even 1-2 problems on the AIME (American Invitational Mathematics Exam), which is significantly easier than the USAMO.
- Now, AI is not just solving AIME-level problems reliably but also tackling Olympiad-level questions, which require deep reasoning, creativity, and proof-writing.
2. **Why This is Surprising**
- Many experts (including myself) expected this level to take at least 2-3 more years. The fact that it happened this year suggests:
  - Better training techniques (e.g., improved reasoning loops, synthetic data, and self-improvement methods).
  - Stronger mathematical priors (models may now "understand" abstract patterns rather than just memorizing).
  - Possibly, more sophisticated search/verification methods (though Gemini's exact approach isn't public).
3. **Is Contest Math "Close to Solved"?**
- **For AIME:** Probably yes. AI is likely near-superhuman at this level already.
- **For USAMO/IMO:** Not yet, but progress is accelerating. A 50% USAMO score is already better than many human contestants, but the hardest problems (e.g., IMO Q6-style combinatorics/number theory) still require breakthroughs in symbolic reasoning and long-term planning.
- **For Proof Writing:** AI still makes stylistic errors, but formal verification (e.g., Lean) can help close the gap.
4. **What’s Next?**
- **IMO Gold within 1-2 years?** If progress continues at this rate, it's plausible.
- **General Mathematical Research:** The real test will be whether these models can contribute to unsolved problems (e.g., combinatorics conjectures, lightweight formal math).
### Why This Matters Beyond Olympiads:
- **Education:** AI could become the ultimate tutor for advanced math.
- **Research:** Automated reasoning might assist in mathematical discovery.
- **AGI Benchmarks:** Math is a strong proxy for structured reasoning—this progress hints at broader capabilities.
It’s an exciting time! Wouldn’t be surprised if an AI wins an IMO gold medal by 2026.
(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2Reputation#49115797)
Date: July 20th, 2025 12:54 AM Author: henna wild persian Subject: this fucking faggot:
"we are releasing GPT-5 soon but want to set accurate expectations: this is an experimental model that incorporates new research techniques we will use in future models. we think you will love GPT-5, but we don't plan to release a model with IMO gold level of capability for many months."
https://x.com/sama/status/1946569252296929727
(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2Reputation#49115705)
Date: July 21st, 2025 1:19 PM
Author: ,.,....,...,,,..,..,.,..,.,.,.,.
Google got gold too using a large language model.
"We can confirm that Google DeepMind has reached the much-desired milestone, earning 35 out of a possible 42 points — a gold medal score. Their solutions were astonishing in many respects. IMO graders found them to be clear, precise and most of them easy to follow."
https://deepmind.google/discover/blog/advanced-version-of-gemini-with-deep-think-officially-achieves-gold-medal-standard-at-the-international-mathematical-olympiad/
(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2Reputation#49119024)