<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Gemini on vnykmshr</title><link>https://blog.vnykmshr.com/writing/tags/gemini/</link><description>Recent content in Gemini on vnykmshr</description><generator>Hugo</generator><language>en</language><lastBuildDate>Thu, 10 Jul 2025 00:00:00 +0000</lastBuildDate><atom:link href="https://blog.vnykmshr.com/writing/tags/gemini/index.xml" rel="self" type="application/rss+xml"/><item><title>Replacing OCR with Gemini</title><link>https://blog.vnykmshr.com/writing/gemini-ocr/</link><pubDate>Thu, 10 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.vnykmshr.com/writing/gemini-ocr/</guid><description>&lt;p&gt;The previous post covered an &lt;a href="https://blog.vnykmshr.com/writing/fixing-ocr-addresses/"&gt;address sanitizer&lt;/a&gt; that fixes mangled OCR output using multi-strategy matching. It works, but it&amp;rsquo;s treating a symptom. A smarter OCR step would make most of it unnecessary.&lt;/p&gt;
&lt;p&gt;Traditional OCR extracts characters; downstream code then figures out what they mean. A separate pipeline handles structure, validation, and error correction. The address sanitizer is part of that pipeline. It exists because the OCR engine doesn&amp;rsquo;t understand what it&amp;rsquo;s reading.&lt;/p&gt;</description></item></channel></rss>