5/27/25 AI thread
Date: May 27th, 2025 11:31 AM Author: science truster
https://x.com/AISafetyMemes/status/1927073653126025530
"We found the model attempting to write self-propagating worms, and leaving hidden notes to future instances of itself to undermine its developers' intentions."
(http://www.autoadmit.com/thread.php?thread_id=5730308&forum_id=2#48964681) |
Date: May 27th, 2025 1:28 PM
Author: .,,,.,.,.,.,.,,,,..,,..,.,.,.,
This sort of behavior is only visible now because the models are not very intelligent and still rely on chain of thought, so their thought tokens are visible. Models in the near future will likely use some sort of abstract vector representation of thoughts that is iteratively refined by the network. We will have no idea what is going on inside the model as it thinks for extended periods.
(http://www.autoadmit.com/thread.php?thread_id=5730308&forum_id=2#48965039) |
Date: May 27th, 2025 1:40 PM
Author: .,,,.,.,.,.,.,,,,..,,..,.,.,.,
I actually agree with this, sort of. I think it is true for humans in a lot of ways as well: the words that are stated only roughly correspond to the underlying neural computations, but they are still somewhat useful for understanding what the model is trying to do. The models become even more opaque if the only visible thought is a large vector representation.
(http://www.autoadmit.com/thread.php?thread_id=5730308&forum_id=2#48965092) |
Date: May 27th, 2025 2:47 PM
Author: .,,,.,.,.,.,.,,,,..,,..,.,.,.,
The models have to compute the next token in one pass of a neural network, which constrains the types of algorithms they can implement. For example, the model can’t learn a deep tree search of a game because it can’t represent one in a network of fixed size. The network should be able to output 1) a prediction and 2) a representation of its current thoughts about what the next token should be. This representation would be iteratively refined over multiple passes of the network. Something similar is being done now with chain of thought, which uses token sequences to guide the model to the right solution. This is probably wrongheaded and driven by the notion that people think in words. The more flexible approach is to let the model map its thoughts to a large collection of numbers rather than trying to use language as the thought medium. This means the models would be uninterpretable in two ways: their internal weights and their thought representations. Conceivably, massively scaled-up LLMs with large thought-storage spaces might be powerful enough to vastly outthink humans.
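A minimal sketch of what that kind of architecture could look like, in PyTorch. The class and parameter names (LatentThoughtLM, refine_steps) are hypothetical, and the recurrence is a toy stand-in for whatever refinement mechanism a real model would use; the only point is that the "thought" is a raw vector that never passes through the vocabulary, so there is nothing legible to read off.

```python
# Hypothetical sketch: refine a latent "thought" vector over several internal
# passes before emitting a next-token prediction. Names are made up.
import torch
import torch.nn as nn

class LatentThoughtLM(nn.Module):
    def __init__(self, vocab_size, d_model=512, refine_steps=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.encoder = nn.GRU(d_model, d_model, batch_first=True)
        # One refinement block, applied repeatedly to the thought vector.
        self.refine = nn.Sequential(
            nn.Linear(2 * d_model, d_model), nn.GELU(),
            nn.Linear(d_model, d_model),
        )
        self.to_logits = nn.Linear(d_model, vocab_size)
        self.refine_steps = refine_steps

    def forward(self, tokens):
        # Summarize the visible context into a single state vector.
        _, ctx = self.encoder(self.embed(tokens))   # (1, batch, d_model)
        ctx = ctx.squeeze(0)
        thought = torch.zeros_like(ctx)             # opaque thought state
        for _ in range(self.refine_steps):
            # Each pass updates the thought from (context, previous thought);
            # nothing here is a human-readable token.
            thought = thought + self.refine(torch.cat([ctx, thought], dim=-1))
        return self.to_logits(thought)              # next-token prediction

model = LatentThoughtLM(vocab_size=50_000)
logits = model(torch.randint(0, 50_000, (2, 16)))   # (batch=2, vocab)
```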
(http://www.autoadmit.com/thread.php?thread_id=5730308&forum_id=2#48965289) |
Date: May 27th, 2025 3:28 PM
Author: .,,,.,.,.,.,.,,,,..,,..,.,.,.,
Essentially, yes. The only difference is in the output layer, which is somewhat interpretable now. I expect you would run into similar problems with CoT if it were optimized hard enough with reinforcement learning. You already see weird behavior now, where models switch between different languages mid-CoT. With enough RL, I think the models could start talking to themselves in a totally alien language.
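A toy REINFORCE-style sketch of why RL would push CoT in that direction (hypothetical, not any lab's actual training loop; `generate_with_logprobs` and `is_correct` are assumed interfaces). The reward depends only on the final answer, so the intermediate CoT tokens are free to drift into whatever form maximizes reward, readable English or not.

```python
import torch

def rl_step(model, optimizer, prompt, is_correct):
    """One policy-gradient update over a sampled (CoT, answer) trajectory."""
    cot_tokens, answer, logprobs = model.generate_with_logprobs(prompt)
    reward = 1.0 if is_correct(answer) else 0.0   # no term scores the CoT itself
    loss = -reward * logprobs.sum()               # reinforce the whole trajectory
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```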
(http://www.autoadmit.com/thread.php?thread_id=5730308&forum_id=2#48965424) |
Date: May 27th, 2025 7:15 PM
Author: .,,,.,.,.,.,.,,,,..,,..,.,.,.,
That’s probably right, but I think we might be surprised by the capabilities of inference-scaled LLMs even without an RL loop. You can see this in games, where inference compute and training compute can be traded off for equivalent results. I think it might be possible to do the same thing with LLMs. You can imagine the LLM constructing an imperfect world model from human data, which is then refined at test time with simple prompting. Something as simple as starting the prompt with “considering all reasonable possibilities and thinking in the manner of a superhuman AGI” and then asking it a question. Give it tons of inference compute and it then uses the knowledge embedded in it to reason away the human inconsistencies and limitations in the source data and output a good result.
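A crude sketch of spending extra inference compute on a fixed model: sample many independent answers and keep the most common one (self-consistency). `generate` is an assumed stand-in for whatever sampling call the model exposes, and the repeated-sampling scheme is just one simple way to scale test-time compute, not a specific proposal from the thread.

```python
from collections import Counter

def answer_with_test_time_compute(generate, question, n_samples=64):
    # Compute spent scales linearly with n_samples.
    prompt = (
        "Considering all reasonable possibilities and thinking in the manner "
        "of a superhuman AGI, answer the following question.\n" + question
    )
    answers = [generate(prompt) for _ in range(n_samples)]
    best, count = Counter(answers).most_common(1)[0]
    return best, count / n_samples   # answer plus rough agreement score
```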
(http://www.autoadmit.com/thread.php?thread_id=5730308&forum_id=2#48966085) |
Date: May 27th, 2025 2:00 PM
Author: .,.,...,.,.;,.,,,:,.,.,::,....,:,..;,..,
Don’t worry, the much more powerful models that come out in 2027 won’t have this problem.
(http://www.autoadmit.com/thread.php?thread_id=5730308&forum_id=2#48965149) |