  The most prestigious law school admissions discussion board in the world.

7/11/25 AI thread







Date: July 11th, 2025 1:36 PM
Author: Contagious Sexy Pistol

scholarship wants me to do these daily again

https://x.com/elder_plinius/status/1943171871400194231

system prompt for new grok 4. it has the same line about being allowed to say politically incorrect things, which supports my suspicion that the will stancil-raping version of grok the other day had more tweaks than just this

it appears to now be the strongest AI currently available. they spent a huge amount of compute resources on post-pretraining RL compared to all the other models. this is probably why it's performing so well on reasoning and problem-solving benchmark tests, because RL training helps a lot with this

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49093385)




Date: July 11th, 2025 2:53 PM
Author: ocher slippery public bath

Curious how this performs on law stuff. I’ve been maining gemini lately and have used grok the least

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49093630)




Date: July 11th, 2025 2:50 PM
Author: Contagious Sexy Pistol

https://x.com/MechanizeWork/status/1943726855015841805

https://www.mechanize.work/blog/sweatshop-data-is-over/

move over, teachers getting paid 100k to try to "teach" nigger kids how to write their name and tie their shoes

hello, RL engineers getting paid 300k to teach AI how to write its name and tie its shoes

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49093621)




Date: July 11th, 2025 2:50 PM
Author: ocher slippery public bath



(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49093622)




Date: July 11th, 2025 2:54 PM
Author: Floppy Puce Electric Furnace Gay Wizard

i think i am going to stick to Opus, o3 and 2.5 pro. the differences look pretty marginal and i have zero faith xAI won't use my data for training.

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49093634)




Date: July 11th, 2025 2:57 PM
Author: Contagious Sexy Pistol

they're definitely all using our data for training

that's just Part Of The Deal

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49093643)




Date: July 11th, 2025 2:59 PM
Author: Onyx main people



(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49093650)




Date: July 11th, 2025 3:01 PM
Author: Floppy Puce Electric Furnace Gay Wizard

Google does if you use aistudio but i buy API credits. Same with Anthropic. i highly doubt they are lying about not training on API users if you opt out. i have less faith in OpenAI but i'm also not too concerned.

an Elon company? no way am i trusting them unless the model is way better.

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49093657)




Date: July 11th, 2025 3:02 PM
Author: Brilliant orchestra pit

I trained a local llm on XO and it called me a fag

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49093659)




Date: July 11th, 2025 3:03 PM
Author: Razzle toaster



(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49093670)




Date: July 11th, 2025 3:12 PM
Author: Onyx main people

Needs tweaking until it calls you a nigger

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49093697)




Date: July 11th, 2025 3:10 PM
Author: Rusted slimy home

"grok 4, program grok 5 and don't make any mistakes"

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49093695)




Date: July 11th, 2025 3:14 PM
Author: Contagious Sexy Pistol

this was literally what the "AI 2027" Doomsday Report was btw

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49093701)




Date: July 11th, 2025 4:40 PM
Author: Rusted slimy home



(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49093973)




Date: July 11th, 2025 5:44 PM
Author: Contagious Sexy Pistol

https://x.com/keyonV/status/1943730495264584079

https://arxiv.org/pdf/2507.06952

paper that demonstrates that while LLMs excel at predictive tasks that fall within their training data, they can't generalize that predictive ability into a complete and accurate world model to make correct predictions on tasks that weren't within their training

lecun is right imo. statistical inferences do not lead to the ability to make generalized inferences

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49094177)




Date: July 11th, 2025 5:49 PM
Author: Contagious Sexy Pistol

https://archive.ph/cYJS5

good WSJ normie article about this

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49094212)




Date: July 11th, 2025 6:07 PM
Author: flushed macaca feces



(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49094310)




Date: July 11th, 2025 6:06 PM
Author: ocher slippery public bath



(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49094306)




Date: July 11th, 2025 6:28 PM
Author: ebony mexican



(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49094378)




Date: July 11th, 2025 6:48 PM
Author: Floppy Puce Electric Furnace Gay Wizard

this is why i think the reinforcement learning approach of trying to paper over these problems with more data is wrongheaded. they are getting insufficient generalization from 30 trillion token datasets so they think they just need to use RL and chain of thought to make 300 trillion token datasets or whatever and their problems will be solved. the architectures and training methods should be fixed first. there's a lot of research showing inadequate generalization on toy tasks even with lots of compute.

https://arxiv.org/abs/2207.02098
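
for anyone who wants to see what a toy length-generalization test looks like, here is a minimal sketch in Python. this is not the linked paper's code: the parity task, the length cutoffs (train on 1 to 40 bits, test on 41 to 500), and every function name below are assumptions picked for illustration. the idea is just that you train on short inputs, evaluate on strictly longer ones, and compare a learner that memorizes against one that implements the actual rule.

```python
import random

def parity_example(length, rng):
    """Return a random bit string and its parity label (1 if the count of ones is odd)."""
    bits = [rng.randint(0, 1) for _ in range(length)]
    return bits, sum(bits) % 2

def make_split(n, min_len, max_len, seed):
    """Build n examples whose lengths fall in [min_len, max_len]."""
    rng = random.Random(seed)
    return [parity_example(rng.randint(min_len, max_len), rng) for _ in range(n)]

def accuracy(predict, dataset):
    """Fraction of examples where predict(bits) matches the label."""
    correct = sum(predict(bits) == label for bits, label in dataset)
    return correct / len(dataset)

# Hypothetical cutoffs: short sequences for training, longer ones held out.
train = make_split(1000, 1, 40, seed=0)
test_ood = make_split(1000, 41, 500, seed=1)

# A learner that only memorizes its training set and guesses 0 otherwise.
table = {tuple(bits): label for bits, label in train}

def memorizer(bits):
    return table.get(tuple(bits), 0)

# A learner that implements the underlying rule, so length does not matter.
def exact(bits):
    return sum(bits) % 2

print("memorizer, train lengths:", accuracy(memorizer, train))
print("memorizer, longer lengths:", accuracy(memorizer, test_ood))
print("exact rule, longer lengths:", accuracy(exact, test_ood))
```

the memorizer is perfect on what it saw and drops to roughly coin-flip on the longer inputs, while the exact rule stays perfect. a real experiment would swap a trained network in for `memorizer` and see which of the two it behaves like.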

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49094428)




Date: July 11th, 2025 7:05 PM
Author: Contagious Sexy Pistol

yeah, i've become convinced that LLMs just fundamentally cannot out-of-distribution generalize. and i don't see how RL could ever solve it. all you're doing is training it on additional specific tasks that it....still can't generalize out of

i think companies are doing RL now as more of a marketing gimmick than anything else. people are saying that the latest grok 4 is only performing so well on benchmarks because its RL training had overlap with the benchmark tests. this is what i mean when i say that they're just gaming the benchmark tests

i think the next few years might end up with the big base models getting trained with RL into a bunch of different specialized sub-models that are used in larger multi-agent architectures, so they're more useful in practice in specific contexts

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49094487)




Date: July 11th, 2025 7:18 PM
Author: Floppy Puce Electric Furnace Gay Wizard

right. i view RL as the way to get strong superhuman agents once you have the right base learning model. they should be able to use the existing data to train robust human level agents, but that's apparently not happening because the learning algorithm is inadequate.

that paper i linked seems to imply there are already models that consistently generalize better than transformers, so it's not clear we'll see companies flailing about for years using RL gimmicks with transformers. memory augmented transformers could be a near-term replacement for current models. if models don't improve substantially over the next year, there will be a strong motivation to try novel approaches like this.

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49094525)




Date: July 11th, 2025 7:53 PM
Author: Contagious Sexy Pistol

god i love AI. i can have it explain this paper to me and then immediately answer all my follow up questions to understand architectures that could actually potentially generalize. lol

this would take me like...i dunno, 2 days to do before AI. not even counting all the time and effort that AI has saved me to get to the point of being able to understand this in the first place

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49094574)




Date: July 13th, 2025 1:24 PM
Author: Contagious Sexy Pistol

https://x.com/karpathy/status/1944435412489171119

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49097842)




Date: July 11th, 2025 7:34 PM
Author: Contagious Sexy Pistol

https://itcanthink.substack.com/p/what-are-robot-world-models

really good short article explaining how people think that we'll be able to use generative AI video to train robots to have functional world models

the notion of this actually working seems crazy to me but these guys are all very smart so it must be somewhat viable or someone would be calling it out as BS. it would cut robot training time and costs IMMENSELY and make them a lot more commercially viable and speed up robotics development by years

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49094551)




Date: July 11th, 2025 6:19 PM
Author: Shaky blood rage kitty

This shit is way overrated for my legal practice

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49094362)




Date: July 11th, 2025 8:32 PM
Author: Contagious Sexy Pistol

https://x.com/NeonWhiteRabbit/status/1943541842655478255

looks like we found the real reason why grok 4 is so surprisingly good

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49094660)




Date: July 11th, 2025 8:36 PM
Author: cream ratface tank



(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49094671)




Date: July 11th, 2025 8:49 PM
Author: motley coral plaza reading party

https://www.youtube.com/watch?v=YkzlPcsSGlA

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49094712)




Date: July 12th, 2025 1:33 AM
Author: ebony mexican

OpenAI got cucked again. They tried to acquire Windsurf but didn’t want Microsoft to have access to its IP. Now the acquisition is off and Google has hired their CEO.

https://x.com/ns123abc/status/1943806065524507007?s=46&t=YKr-jZOYUHE15Tew69wt4w

The Microsoft partnership was key to getting them off the ground, but now it’s an adversarial relationship.

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49095182)




Date: July 12th, 2025 1:52 AM
Author: Hateful Heaven Filthpig

Google will acquire everything in the end

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49095203)




Date: July 12th, 2025 3:08 AM
Author: Charismatic Fuchsia Lodge Pozpig

Microsoft should put them out of business. The decision to partner with them rather than hiring away their engineers and scaling themselves was strange.

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49095224)




Date: July 12th, 2025 8:51 AM
Author: Contagious Sexy Pistol

Yeah they somehow tricked the greedy jeets at Microsoft up to this point but it seems like their luck has run out

I haven't believed in OpenAI since a lot of their top guys left a year ago. It's a really bad sign for an org when a bunch of your top level producers leave

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49095388)




Date: July 12th, 2025 9:19 AM
Author: ebony mexican

I think so too, and Zuckerberg just poached a lot more people. Besides brand recognition from chatGPT, I don’t think they have any other advantages over the competition.

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49095411)




Date: July 12th, 2025 1:33 PM
Author: Floppy Puce Electric Furnace Gay Wizard

the failure to transition to a for-profit likely hurt them a lot. also increasingly clear there is no secret sauce and that training LLMs is not rocket science.

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49095861)




Date: July 12th, 2025 9:01 AM
Author: Beady-eyed mauve jap

yes sir, it's all moving pretty fast rn

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49095396)




Date: July 12th, 2025 9:05 AM
Author: Contagious Sexy Pistol

https://x.com/Kimi_Moonshot/status/1943687594560332025

China just released a new model, Kimi K2 (a mixture-of-experts model with 32 billion active parameters out of about 1 trillion total), that is apparently the 2nd best model in the world behind o3 and it's all open source(!)

If China could whip up the infrastructure and hardware for enough inference compute to meet demand, I think they could snap up a bunch of the market share for AI because they could undercut the US labs by so much

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49095399)




Date: July 12th, 2025 9:22 AM
Author: Contagious Sexy Pistol

https://x.com/nrehiew_/status/1943694228804063614

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49095415)




Date: July 12th, 2025 1:37 PM
Author: Hateful Heaven Filthpig

Would you actually use a Chinese LLM for work?

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49095865)




Date: July 12th, 2025 1:39 PM
Author: Floppy Puce Electric Furnace Gay Wizard

The local qwen models are quite good. Doubt i would want to send data to a chinese LLM though

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49095870)




Date: July 12th, 2025 1:49 PM
Author: Contagious Sexy Pistol

Yeah they are really good. I use qwen to mess around locally and it does really well

I don't trust the American LLM companies with my private data that actually matters either, so I only work with that data locally when I need to

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49095893)




Date: July 12th, 2025 1:50 PM
Author: Hateful Heaven Filthpig

If I installed one on a PC, is anything better than Qwen? How do you train it on daily news updates and shit if it's local?

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49095898)




Date: July 12th, 2025 1:59 PM
Author: Contagious Sexy Pistol

I think qwen is the best right now but there's a new Chinese one that I posted in this thread about that is apparently even better

There are ways to scrape web search info into your local model and incorporate it into your local LLM prompts but I haven't set it up yet because it hasn't really come up. It looks pretty easy to do though
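
Roughly, the prompt side of it could look like the sketch below. This is a minimal illustration and not any particular tool's API: the `Snippet` type, the `build_prompt` function, the character budget, and the example URLs are all made up for the example, and the step that actually fetches search results is left out. Note that nothing here trains the model. The fresh text just gets pasted into the prompt ahead of the question.

```python
from dataclasses import dataclass

@dataclass
class Snippet:
    """One piece of fetched web text (hypothetical structure)."""
    title: str
    url: str
    text: str

def build_prompt(question, snippets, max_chars=2000):
    """Pack snippets into a numbered source list, then append the question.

    Snippets are added in order until the character budget would be exceeded,
    so the prompt stays within the local model's context window.
    """
    parts = []
    used = 0
    for i, s in enumerate(snippets, start=1):
        entry = f"[{i}] {s.title} ({s.url})\n{s.text.strip()}\n"
        if used + len(entry) > max_chars:
            break  # budget reached; drop the remaining snippets
        parts.append(entry)
        used += len(entry)
    context = "\n".join(parts)
    return (
        "Use only the sources below to answer. Cite sources by number.\n\n"
        f"Sources:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )

# Placeholder snippets standing in for real search results.
snippets = [
    Snippet("A", "http://a.example", "alpha"),
    Snippet("B", "http://b.example", "beta"),
]
prompt = build_prompt("What happened today?", snippets)
print(prompt)
```

The resulting string is what you would hand to whatever local runner you use. The budget is counted in characters only to keep the sketch simple; a real setup would count tokens.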

(http://www.autoadmit.com/thread.php?thread_id=5749144&forum_id=2...id.#49095912)