<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Transformers on vnykmshr</title><link>https://blog.vnykmshr.com/writing/tags/transformers/</link><description>Recent content in Transformers on vnykmshr</description><generator>Hugo</generator><language>en</language><lastBuildDate>Wed, 18 Feb 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://blog.vnykmshr.com/writing/tags/transformers/index.xml" rel="self" type="application/rss+xml"/><item><title>Repeat yourself</title><link>https://blog.vnykmshr.com/writing/repeat-yourself/</link><pubDate>Wed, 18 Feb 2026 00:00:00 +0000</pubDate><guid>https://blog.vnykmshr.com/writing/repeat-yourself/</guid><description>&lt;p&gt;If you repeat your prompt, the model gives you a better answer. Not a smarter model, not a bigger context window, not chain of thought &amp;ndash; you say the same thing twice and it works better. &lt;a href="https://arxiv.org/abs/2512.14982"&gt;Google researchers tested this&lt;/a&gt; across Gemini, GPT, Claude, DeepSeek &amp;ndash; 47 wins out of 70 benchmarks, zero losses.&lt;/p&gt;
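&lt;p&gt;A toy sketch of why this helps (my illustration, not from the paper): count how many positions each token can attend to under a causal mask, with and without repeating a 6-token prompt. The &lt;code&gt;causal_mask&lt;/code&gt; helper is assumed, not anything from the study:&lt;/p&gt;

```python
import numpy as np

def causal_mask(n):
    # Lower-triangular boolean mask: token i may attend
    # only to positions 0..i (itself and everything earlier).
    return np.tril(np.ones((n, n), dtype=bool))

n = 6  # a single copy of a 6-token prompt
single = causal_mask(n)
# Row sums = how many tokens each position can see.
# Token 0 sees only itself; token 5 sees all 6.
print(single.sum(axis=1))  # [1 2 3 4 5 6]

# Repeat the prompt: 12 tokens total. The second copy's
# first token (position 6) now attends to the entire
# first copy plus itself.
double = causal_mask(2 * n)
print(double[n].sum())  # 7
```

&lt;p&gt;The asymmetry is the whole point: in the single copy, early tokens are context-starved; in the doubled prompt, every token of the second copy sees at least one full copy of the question.&lt;/p&gt;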
&lt;p&gt;In a transformer, token 1 can&amp;rsquo;t see token 50. Causal masking &amp;ndash; each token attends only to itself and the tokens before it. The first words of your prompt are always processed with the least context. They&amp;rsquo;re flying blind. When you repeat the prompt, the second copy&amp;rsquo;s early tokens can attend to the entire first copy. You&amp;rsquo;re giving the beginning of your question the context it never had.&lt;/p&gt;</description></item></channel></rss>