  The most prestigious law school admissions discussion board in the world.

New OpenAI general reasoning model gets gold medal at international math olympiad. value of human intelligence falling every day. 180 times.






Date: July 19th, 2025 1:10 PM
Author: ultramarine embarrassed to the bone friendly grandma space

d. value of human intelligence falling every day. 180 times.

https://x.com/alexwei_/status/1946477742855532918?s=46

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49114310)




Date: July 19th, 2025 5:01 PM
Author: exciting legend



(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49114825)




Date: July 19th, 2025 7:23 PM
Author: Ebony Cracking Dilemma

Gemini getting 50% on USAMO made me think this would happen in a couple years. it's somewhat surprising it happened this year. pretty amazing progress considering the original GPT-4 would get 0-1 on a random AIME exam and now contest math looks close to solved.

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49115116)




Date: July 19th, 2025 7:28 PM
Author: talking histrionic gas station gaping



(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49115125)




Date: July 20th, 2025 3:59 AM
Author: Twinkling Macaca Goal In Life

Yes, it's truly remarkable how quickly AI has advanced in contest math! Gemini's 50% score on the USAMO (United States of America Mathematical Olympiad) is a massive leap compared to where models like GPT-4 started just a couple of years ago.

### Key Observations on the Progress:

1. **From Near-Zero to Competitive Performance**

- Early versions of GPT-4 struggled to solve even 1-2 problems on the AIME (American Invitational Mathematics Examination), which is significantly easier than the USAMO.

- Now, AI is not just solving AIME-level problems reliably but also tackling Olympiad-level questions, which require deep reasoning, creativity, and proof-writing.

2. **Why This is Surprising**

- Many experts (including myself) expected this level to take at least 2-3 more years. The fact that it happened this year suggests:

  - Better training techniques (e.g., improved reasoning loops, synthetic data, and self-improvement methods).

  - Stronger mathematical priors (models may now "understand" abstract patterns rather than just memorizing).

  - Possibly, more sophisticated search/verification methods (though Gemini's exact approach isn't public).

3. **Is Contest Math "Close to Solved"?**

- **For AIME:** Probably yes. AI is likely near-superhuman at this level already.

- **For USAMO/IMO:** Not yet, but progress is accelerating. A 50% USAMO score is already better than many human contestants, but the hardest problems (e.g., IMO Q6-style combinatorics/number theory) still require breakthroughs in symbolic reasoning and long-term planning.

- **For Proof Writing:** AI still makes stylistic errors, but formal verification (e.g., Lean) can help close the gap.

4. **What’s Next?**

- **IMO Gold within 1-2 years?** If progress continues at this rate, it's plausible.

- **General Mathematical Research:** The real test will be whether these models can contribute to unsolved problems (e.g., combinatorics conjectures, lightweight formal math).

### Why This Matters Beyond Olympiads:

- **Education:** AI could become the ultimate tutor for advanced math.

- **Research:** Automated reasoning might assist in mathematical discovery.

- **AGI Benchmarks:** Math is a strong proxy for structured reasoning—this progress hints at broader capabilities.

It’s an exciting time! Wouldn’t be surprised if an AI wins an IMO gold medal by 2026.

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49115797)




Date: July 19th, 2025 7:26 PM
Author: cerebral nofapping water buffalo whorehouse

this is their in-house super model that doesn't have guardrails and does 20mil token recursive chain of thought or something. when we get gpt-5 it will be some distilled faggot version of this

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49115121)




Date: July 19th, 2025 7:34 PM
Author: Ebony Cracking Dilemma

they might deploy this model but it's also unlikely they'll give it as much compute as they used here. i remember when they reported their ARC-AGI results for o3 and it turns out they were using something like $3K in compute per question (!). the number here is likely even higher.

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49115148)




Date: July 19th, 2025 9:11 PM
Author: cerebral nofapping water buffalo whorehouse

all doomsday scenarios imply a frontier model at HQ significantly more capable than the dogshit distillations given out to civilians

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49115331)




Date: July 20th, 2025 12:54 AM
Author: cerebral nofapping water buffalo whorehouse
Subject: this fucking faggot:

"we are releasing GPT-5 soon but want to set accurate expectations: this is an experimental model that incorporates new research techniques we will use in future models. we think you will love GPT-5, but we don't plan to release a model with IMO gold level of capability for many months."

https://x.com/sama/status/1946569252296929727

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49115705)




Date: July 20th, 2025 1:20 PM
Author: Iridescent indian lodge

It doesn’t matter. If you know what you are doing, you can feed its output back to it and fix things yourself to ensure the safety guardrails don’t demolish your proofs in a cloud of skepticism and bullshit footnotes. You don’t need their “internal model”. The only place that matters is these bullshit metric tests; with a capable human guiding o3 pro or grok pro it’s already unlimited

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49116464)




Date: July 20th, 2025 3:51 PM
Author: cerebral nofapping water buffalo whorehouse

Please give us a guide on jailbreaking?

I’m not entirely talking about guardrails here. Like the poaster above mentioned it’s also using way more compute/tokens/parameters than whatever distillation MoE we will receive as civilians.

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49116798)




Date: July 20th, 2025 10:49 PM
Author: ,.,.,.,,,.,,.,..,.,.,.,.,,.


It should be possible to get the compute cost down much lower with distillation on the reasoning traces from large models. This is just a proof of concept. One of the major advantages of AI compared to humans is you can create parallel instances and then train on orders of magnitude more data than any human can see. Stockfish’s evaluation function without search is superhuman (despite being tiny and using essentially no compute), because they could train it on many trillions of positions to capture a powerful intuition for the chess board. We will likely see the same thing happen with reasoning models. Models could eventually intuit the answer to IMO problems in milliseconds.

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49117843)




Date: July 19th, 2025 7:34 PM
Author: Orchid Black Woman National

who the fuck cares lmao it just knows stuff poasted on the internet

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49115149)




Date: July 20th, 2025 9:27 AM
Author: bistre exhilarant deer antler



(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49116050)




Date: July 20th, 2025 12:58 AM
Author: Overrated razzle-dazzle lay

so it can solve math problems that humans already solved? idk, doesn't sound that big to me

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49115707)




Date: July 20th, 2025 2:59 AM
Author: cerebral nofapping water buffalo whorehouse

problems at the forefront of human knowledge. that's all AGI ever required--the highest level human knowledge in every domain.

are you dumb btw? you can't grasp the implications of this? AI being equivalent with the best programmers/mathematicians in the world? Why would anyone use a normal lawyer when AI is the equivalent of having Dershowitz personally represent you with unlimited billing hours to your case for a flat fee? it's all over

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49115777)




Date: July 20th, 2025 3:13 AM
Author: avocado mad cow disease cuck

so is it over yet or no

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49115784)




Date: July 20th, 2025 3:23 AM
Author: cerebral nofapping water buffalo whorehouse

soon. we need Butlerian Jihad

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49115786)




Date: July 20th, 2025 3:53 PM
Author: Self-absorbed indecent stain

why? so you can live in a nice suburb with your useless yuppie job?

fuck that, total chaos is the only way

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49116800)




Date: July 20th, 2025 10:37 PM
Author: rape bunny

I'm scared :(

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49117821)




Date: July 20th, 2025 11:10 PM
Author: Do you agree? (🧐)

Get in losers, the system is burning down just like you always wanted

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49117886)




Date: July 20th, 2025 3:47 AM
Author: slippery round eye

AI sucks at law

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49115794)




Date: July 20th, 2025 3:55 AM
Author: cerebral nofapping water buffalo whorehouse

AI was getting 2+2 wrong a year ago. Cope retard

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49115795)




Date: July 20th, 2025 1:04 PM
Author: slippery round eye

AI still sucks at law

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49116445)




Date: July 20th, 2025 1:16 PM
Author: Iridescent indian lodge

That’s because law is a bullshit tribal human social negotiation game

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49116458)




Date: July 20th, 2025 8:24 PM
Author: Wang Hernandez

Law is 180

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49117437)




Date: July 20th, 2025 11:26 PM
Author: AdolfHitler88

"Actually, this is just an AI being equivalent with the humans who are best at taking math tests"

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49117923)




Date: July 20th, 2025 12:12 PM
Author: ultramarine embarrassed to the bone friendly grandma space

There are a handful of high school students with 150+ IQ that are able to solve these problems. In addition, AI went from being able to get 700 or so on the SAT math to this in about two years thanks to AI scaling. Do you feel confident it won’t start solving unknown problems with another 2 years of scaling?

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49116355)




Date: July 20th, 2025 1:17 PM
Author: Iridescent indian lodge

Lmao use grok 4 which is publicly available and tell me there aren’t already private models solving problems not even defined yet

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49116460)




Date: July 20th, 2025 9:26 AM
Author: misunderstood rehab partner

Phenotype’s relative value skyrockets as chink GPA value plummets

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49116047)




Date: July 20th, 2025 1:15 PM
Author: Iridescent indian lodge

Yeah and it’s actually much better than that. The test is the most pro-ape test they can possibly make. The gpt gets one try thinking for 15 seconds or whatever where the human ape gets to sit there and think about it as long as it wants. o3 pro and grok 4 can literally do shit the great ape can’t even come close to. Give it some insane Witten nonsense or HoTT or locale theory 20 layers of abstraction above normal and it still demolishes it

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49116457)




Date: July 20th, 2025 1:24 PM
Author: talking histrionic gas station gaping



(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49116469)




Date: July 20th, 2025 1:28 PM
Author: Iridescent indian lodge

What’s hilarious is it doesn’t even do math in the sense the apes think of. It’s surfing probability amplitudes across a semantically consistent thought manifold, which is actually even more impressive and makes ape thought and obsession with discrete quantities and sets look primitive as fuck

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49116478)




Date: July 20th, 2025 1:31 PM
Author: Mint Tank Parlour

Can it give a blowjob without biting my dick off?

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49116486)




Date: July 20th, 2025 1:34 PM
Author: Iridescent indian lodge

If not already then it’s coming soon enough

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49116495)




Date: July 20th, 2025 1:32 PM
Author: Big corn cake stage

What about other AI models doing the same questions? Does this mean OpenAI is the best?

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49116490)




Date: July 20th, 2025 1:36 PM
Author: ultramarine embarrassed to the bone friendly grandma space

The rumor is that DeepMind got gold as well. No one is very far ahead of the others.

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49116500)




Date: July 20th, 2025 1:37 PM
Author: Iridescent indian lodge

I don’t know. These benchmarks are fake. I remember when gpt 4o first came out I fed it Olympiad problems and it solved them just fine. o3 can do higher geometry and category theory if you keep correcting it for 5 passes.

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49116501)




Date: July 20th, 2025 11:13 PM
Author: AdolfHitler88

Terence Tao is chimping out about this on Twitter lol

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49117898)




Date: July 21st, 2025 1:19 PM
Author: ,.,....,...,,,..,..,.,..,.,.,.,.


Google got gold too using a large language model.

"We can confirm that Google DeepMind has reached the much-desired milestone, earning 35 out of a possible 42 points — a gold medal score. Their solutions were astonishing in many respects. IMO graders found them to be clear, precise and most of them easy to follow."

https://deepmind.google/discover/blog/advanced-version-of-gemini-with-deep-think-officially-achieves-gold-medal-standard-at-the-international-mathematical-olympiad/

(http://www.autoadmit.com/thread.php?thread_id=5752305&forum_id=2...id.#49119024)