Apr 9

Popper, Deutsch, and Taleb walk into a context window. 73% of tokens don't walk out. Full results and a design principle that survived all of them.

2 Comments

Luis Conde

Apr 11

Love this experiment. I’d be interested in running something like auto-researcher on this, to fine-tune and test more scenarios (prompts) and models.

Thanks! Yes, given the costs of running agents in loops, I'd love to see what auto-researcher finds. This may also be a way to get per-model tuning and self-healing on model updates. I looked a bit into quality scoring using Opus 4.6 as judge; ultraphilosopher looks strongest there too. But the real improvement would be a longer context plus real code tests (and blending the results with code output). I love less chatter and faster responses from the models, so I use it by default everywhere now ;-)


Krystian Kolondra

Cavemen vs Philosophers