<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Nik Malykhin]]></title><description><![CDATA[Reflections on platform engineering, developer experience, and the craft of modern software — plus the occasional analog side quest]]></description><link>https://www.nikmalykhin.com</link><image><url>https://substackcdn.com/image/fetch/$s_!-Ojx!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8d27381-c618-42b7-a15f-62e1d625e22d_1280x1280.png</url><title>Nik Malykhin</title><link>https://www.nikmalykhin.com</link></image><generator>Substack</generator><lastBuildDate>Fri, 10 Apr 2026 20:24:27 GMT</lastBuildDate><atom:link href="https://www.nikmalykhin.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Nik Malykhin]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[nik1379616@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[nik1379616@substack.com]]></itunes:email><itunes:name><![CDATA[Nik]]></itunes:name></itunes:owner><itunes:author><![CDATA[Nik]]></itunes:author><googleplay:owner><![CDATA[nik1379616@substack.com]]></googleplay:owner><googleplay:email><![CDATA[nik1379616@substack.com]]></googleplay:email><googleplay:author><![CDATA[Nik]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[𝗧𝗵𝗲 𝗣𝗿𝗮𝗴𝗺𝗮𝘁𝗶𝗰 𝗛𝗲𝘅𝗮𝗴𝗼𝗻: 𝗦𝗰𝗮𝗹𝗶𝗻𝗴 𝗗𝗲𝗰𝗼𝘂𝗽𝗹𝗶𝗻𝗴 𝘄𝗶𝘁𝗵𝗼𝘂𝘁 𝗖𝗼𝗺𝗽𝗹𝗲𝘅𝗶𝘁𝘆]]></title><description><![CDATA[&#120295;&#120309;&#120306; &#120295;&#120306;&#120315;&#120320;&#120310;&#120316;&#120315; &#120316;&#120315; &#120321;&#120309;&#120306; 
&#120295;&#120319;&#120302;&#120310;&#120313;]]></description><link>https://www.nikmalykhin.com/p/pragmatic-hexagon</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/pragmatic-hexagon</guid><pubDate>Tue, 24 Mar 2026 13:21:45 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Ojx!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8d27381-c618-42b7-a15f-62e1d625e22d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3><strong>&#120295;&#120309;&#120306; &#120295;&#120306;&#120315;&#120320;&#120310;&#120316;&#120315; &#120316;&#120315; &#120321;&#120309;&#120306; &#120295;&#120319;&#120302;&#120310;&#120313;</strong></h3><p>In a professional kitchen, there is a concept called <em>mise en place</em>&#8212;everything in its place. You don&#8217;t start searing the scallops until every herb is chopped and every sauce is whisked. If you skip the prep to &#8220;save time,&#8221; you end up adjusting the recipe mid-saut&#233;, usually resulting in a frantic mess, ruined ingredients, and a dish that takes twice as long to serve.</p><p>Modern software development has a similar &#8220;popular choice&#8221;: start coding the logic immediately to show &#8220;progress.&#8221; But when we skip the architectural prep&#8212;the interfaces and boundaries&#8212;we aren&#8217;t moving fast; we are just building a kitchen we&#8217;ll have to tear down while the customers are waiting. I&#8217;ve watched engineers lose sight of the goal in the pursuit of a &#8220;perfect flow&#8221; that wasn&#8217;t grounded in discipline. 
If everyone says they want &#8220;clean code,&#8221; why does the system feel like it&#8217;s fighting us the moment we add a new story?</p><h3><strong>&#120294;&#120326;&#120320;&#120321;&#120306;&#120314; &#120282;&#120306;&#120316;&#120314;&#120306;&#120321;&#120319;&#120326;</strong></h3><p>The environment of this experiment is a standard <strong>Kotlin and Spring Boot</strong> stack. The landscape is defined by three distinct zones designed to minimize the &#8220;weight&#8221; of dependencies. To navigate this space, we use a rigid directory structure that acts as our map:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;78781d66-8266-4010-b6a8-432cfa8a8d42&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">app

&#9500;&#9472;&#9472; domain      &lt;-- THE HEART (POKOs only)
&#9474;   &#9500;&#9472;&#9472; model
&#9474;   &#9474;   &#9492;&#9472;&#9472; Data.kt     &lt;-- Pure Kotlin Data Class
&#9474;   &#9492;&#9472;&#9472; ports
&#9474;       &#9492;&#9472;&#9472; outgoing    &lt;-- Interfaces defining &#8220;What&#8221; we need
&#9474;           &#9500;&#9472;&#9472; DataPersistencePort.kt    &lt;-- SQL db
&#9474;           &#9492;&#9472;&#9472; DataStoragePort.kt        &lt;-- Object storage
&#9500;&#9472;&#9472; usecases    &lt;-- THE ORCHESTRATOR
&#9474;   &#9492;&#9472;&#9472; StoreDataUseCase.kt    &lt;-- Feature logic
&#9492;&#9472;&#9472; adapter     &lt;-- THE &#8220;HOW&#8221; (Infrastructure)
    &#9500;&#9472;&#9472; web         &lt;-- Inbound Adapter
    &#9474;   &#9500;&#9472;&#9472; DataController.kt
    &#9474;   &#9492;&#9472;&#9472; dto         &lt;-- Request/Response DTOs
    &#9474;       &#9492;&#9472;&#9472; WebMapper.kt    &lt;-- DTO &lt;-&gt; Domain mapping
    &#9500;&#9472;&#9472; sqldb       &lt;-- Outbound Adapter
    &#9474;   &#9500;&#9472;&#9472; entity
    &#9474;   &#9474;   &#9492;&#9472;&#9472; DataJpaEntity.kt    &lt;-- @Entity + JPA annotation
    &#9474;   &#9500;&#9472;&#9472; DataRepository.kt        &lt;-- Spring Data/CrudRepository
    &#9474;   &#9500;&#9472;&#9472; PersistenceMapper.kt     &lt;-- Entity &lt;-&gt; Domain mapping
    &#9474;   &#9492;&#9472;&#9472; PersistenceAdapter.kt    &lt;-- Impl DataPersistencePort
    &#9492;&#9472;&#9472; cloud       &lt;-- Outbound Adapter
        &#9492;&#9472;&#9472; ObjectStorageAdapter.kt</code></pre></div><p>&#10148; <strong>The Heart (Domain):</strong> Pure Kotlin Data Classes and business logic common to all usecases.</p><p>&#10148; <strong>The Orchestrator (Usecases):</strong> Where feature-specific logic lives and adapters are coordinated.</p><p>&#10148; <strong>The Infrastructure (Adapters):</strong> The &#8220;How&#8221; of the system&#8212;web controllers, JPA entities, and cloud storage clients.</p><p>The invisible boundary here is the <strong>Port</strong>. It&#8217;s an interface that defines &#8220;what&#8221; we need without caring &#8220;how&#8221; it&#8217;s done. In theory, this geometry should be light and flexible, yet many teams find it rigid because they misunderstand the direction of the signal.</p><h3><strong>&#120280;&#120314;&#120317;&#120310;&#120319;&#120310;&#120304;&#120302;&#120313; &#120280;&#120325;&#120317;&#120313;&#120316;&#120319;&#120302;&#120321;&#120310;&#120316;&#120315;</strong></h3><p>I moved from the &#8220;theoretical path&#8221; of perfect architecture to the &#8220;actual terrain&#8221; of daily PRs. The system showed its breaking point not in a crash, but in a silent failure of discipline: the <strong>Domain Import Leak</strong>.</p><p>&#10148; <strong>The Breaking Point:</strong> It usually starts when an engineer adds a domain service that directly imports an adapter: <code>import app.adapter.NewAdapter</code>.</p><p>&#10148; <strong>The Silent Failure:</strong> The code still passes tests. It still &#8220;works.&#8221; But the &#8220;Pure Domain&#8221; has been poisoned by infrastructure concerns.</p><p>&#10148; <strong>The Result:</strong> When the time inevitably comes to move that service to a usecase, the system reacts with extreme fatigue. 
We end up with PRs requiring the renaming of tens of files, leading to typos, package mismatches, and a massive mental load on reviewers.</p><h3><strong>&#120288;&#120302;&#120315;&#120302;&#120308;&#120310;&#120315;&#120308; &#120321;&#120309;&#120306; &#120294;&#120310;&#120308;&#120315;&#120302;&#120313;</strong></h3><p>The handoff between layers is where the &#8220;spaghetti&#8221; starts or ends. In my exploration, I found that the clarity of intent is often lost because teams are afraid of the &#8220;complexity&#8221; of an extra interface.</p><p>&#10148; <strong>Cognitive Load:</strong> Trying to refactor architecture in the middle of a feature story creates a &#8220;refactoring nightmare&#8221;.</p><p>&#10148; <strong>Signal-to-Noise:</strong> If you are 100% sure a logic block belongs in the domain, put it there. If not, the &#8220;cleaner&#8221; signal is to start in a <strong>Usecase</strong> and extract downward only when the need is proven.</p><p>&#10148; <strong>Direct Translation:</strong> To keep the signal clear, I&#8217;ve found it&#8217;s even acceptable to call a Port directly from a controller for simple cases. This avoids 1:1 &#8220;pass-through&#8221; mapping while keeping the adapter decoupled through the interface.</p><h3><strong>&#120298;&#120309;&#120302;&#120321; &#120280;&#120302;&#120319;&#120315;&#120306;&#120305; &#120295;&#120319;&#120322;&#120320;&#120321;?</strong></h3><p>After the stress test of &#8220;no time to decouple,&#8221; one principle remained standing: <strong>Mandatory Ports from the Start</strong>.</p><p>&#10148; <strong>Stability:</strong> The &#8220;price&#8221; of an interface at the start is effectively zero. It provides an immediate boundary that prevents the &#8220;import leak&#8221; and allows the domain to remain pure. 
&#10148; <strong>The New Baseline:</strong> My trusted navigation strategy is now <strong>TDD-driven Hexagon</strong>.</p><p>&#8226; <strong>Step 1:</strong> Define the Domain Model.</p><p>&#8226; <strong>Step 2:</strong> Build the Adapter and verify it with <strong>Testcontainers</strong> (SQL or Object Storage).</p><p>&#8226; <strong>Step 3:</strong> Finally, orchestrate it all in the Usecase or Controller using the Port interface.</p><h3><strong>&#120276;&#120304;&#120321;&#120310;&#120316;&#120315;&#120302;&#120303;&#120313;&#120306; &#120284;&#120315;&#120320;&#120310;&#120308;&#120309;&#120321;&#120320;</strong></h3><p>&#10148; <strong>Backlog (Failed the Stress Test):</strong></p><p>&#8226; &#8220;Refactoring-in-the-middle&#8221;: Changing architecture while delivering a story leads to mess and typos.</p><p>&#8226; Direct Adapter Imports: Any import app.adapter inside app.domain is a bug, not a feature.</p><p>&#10148; <strong>Merged (Trusted Toolkit):</strong></p><p>&#8226; <strong>Ports First:</strong> Always create the interface for 3rd party services or repositories immediately.</p><p>&#8226; <strong>Adapter-First Testing:</strong> Use Testcontainers to prove your &#8220;How&#8221; works before you worry about the &#8220;What&#8221; in your orchestration.</p><p>&#8226; <strong>Minimum Layers:</strong> Only add a Usecase layer if there is actual orchestration; otherwise, call the Port from the Controller.</p><p><strong>Final Wisdom:</strong> Clean architecture isn&#8217;t about having the most layers; it&#8217;s about having the most resilient boundaries. 
The &#8220;price&#8221; of an interface is nothing compared to the cost of a messy PR that no one wants to review.</p><div><hr></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.nikmalykhin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading! Subscribe to get more practical guides on using GenAI tools effectively in software development work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[The 24-Inch Migration: Onboarding a 5-Year-Old to New Hardware]]></title><description><![CDATA[Learn how to apply software engineering principles&#8212;like look-ahead buffers and integration testing&#8212;to manage complex hardware migrations. 
Discover how to transition a junior rider to a new platform while protecting the Developer Experience (DX) and fostering long-term system ownership.]]></description><link>https://www.nikmalykhin.com/p/the-24-inch-migration-onboarding</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/the-24-inch-migration-onboarding</guid><pubDate>Tue, 17 Mar 2026 11:03:03 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!h0kn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d9ca56-b3b9-4b46-8a6c-4a109979f92a_4080x3072.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the world of software, we often talk about &#8220;breaking changes.&#8221; You upgrade a core library, and suddenly the interfaces you relied on are deprecated, the latency spikes, and the system becomes unpredictable. Last week, I attempted a major version upgrade on my 5-year-old son&#8217;s primary transport layer: we moved from a 16-inch &#8220;legacy&#8221; bike to a <strong>Specialized Hotrock 24</strong>.</p><p>Physically, he was ready. He&#8217;s tall for his age, and the metrics suggested he could handle the 24-inch wheels. But as any Tech Lead knows, just because the hardware supports the requirements doesn&#8217;t mean the operator is ready to push to production.</p><h2>The System Architecture: Specialized Hotrock 24</h2><p>In this migration, the hardware selection was about finding the right <strong>Long Term Support (LTS)</strong> release. We skipped the 20-inch version entirely; in our roadmap, a 20-inch bike was a short-term patch that would only serve us for a year or two before hitting its end-of-life.</p><p>We went straight for the 24-inch platform as our LTS. 
To make this high-performance hardware compatible with a 5-year-old&#8217;s geometry, I chose the Hotrock for its low-slung frame&#8212;think of it as a <strong>compatibility layer</strong> or a &#8220;shim&#8221; that allows a smaller user to interface with a much larger system architecture.</p><h2>The Debugging Phase: Staging Environment (Weekend 1)</h2><p>We didn&#8217;t head straight for the trails. That would be like deploying a refactored monolith to 100% of users without a staging environment. We set up a 3x3 meter &#8220;Sandbox&#8221; in a parking lot to run our first integration tests.</p><h3>1. The Look-Ahead Buffer (The Square)</h3><p>The first bug we encountered was <strong>Visual Latency</strong>. He was looking at his front wheel&#8212;the equivalent of a system only processing the data packet currently in the buffer.</p><p><strong>The Fix:</strong> I implemented a new algorithm. <em>Start at Cone 1, look at Cone 2. When the front wheel enters the zone between 1 and 2, immediately point the sensors (eyes) toward Cone 3.</em> We were teaching him to process future state while executing current operations.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!h0kn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d9ca56-b3b9-4b46-8a6c-4a109979f92a_4080x3072.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!h0kn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d9ca56-b3b9-4b46-8a6c-4a109979f92a_4080x3072.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!h0kn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d9ca56-b3b9-4b46-8a6c-4a109979f92a_4080x3072.jpeg 848w, https://substackcdn.com/image/fetch/$s_!h0kn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d9ca56-b3b9-4b46-8a6c-4a109979f92a_4080x3072.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!h0kn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d9ca56-b3b9-4b46-8a6c-4a109979f92a_4080x3072.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!h0kn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d9ca56-b3b9-4b46-8a6c-4a109979f92a_4080x3072.jpeg" width="1456" height="1096" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/97d9ca56-b3b9-4b46-8a6c-4a109979f92a_4080x3072.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1096,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5694022,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.nikmalykhin.com/i/191237160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d9ca56-b3b9-4b46-8a6c-4a109979f92a_4080x3072.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!h0kn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d9ca56-b3b9-4b46-8a6c-4a109979f92a_4080x3072.jpeg 424w, https://substackcdn.com/image/fetch/$s_!h0kn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d9ca56-b3b9-4b46-8a6c-4a109979f92a_4080x3072.jpeg 848w, https://substackcdn.com/image/fetch/$s_!h0kn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d9ca56-b3b9-4b46-8a6c-4a109979f92a_4080x3072.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!h0kn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97d9ca56-b3b9-4b46-8a6c-4a109979f92a_4080x3072.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>2. The I/O Interrupt v1.0 (Stop-on-Line)</h2><p>We tested the &#8220;Stop&#8221; command with a simple line. At this stage, we kept the requirements low: just execute a <code>HALT</code> command exactly on the line. He passed this test without issues&#8212;the braking interface was working, even if it was still a bit binary.</p><h2>Scaling the System (Weekend 2)</h2><p>Once the basic &#8220;Look-Ahead&#8221; logic was cached, we increased the complexity of our tests.</p><h3>2.1 The I/O Interrupt v1.1 (The &#8220;No-Touch&#8221; Constraint)</h3><p>We refactored the stop-and-go drill. Now, he had to stop on the line and then resume driving <em>without</em> touching the floor. 
This was about refining balance and power delivery&#8212;moving from a simple halt to a complex state transition.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hCuS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e744fb6-a17b-4f74-99d6-acd5107c51b1_3072x4080.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hCuS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e744fb6-a17b-4f74-99d6-acd5107c51b1_3072x4080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!hCuS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e744fb6-a17b-4f74-99d6-acd5107c51b1_3072x4080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!hCuS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e744fb6-a17b-4f74-99d6-acd5107c51b1_3072x4080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!hCuS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e744fb6-a17b-4f74-99d6-acd5107c51b1_3072x4080.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hCuS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e744fb6-a17b-4f74-99d6-acd5107c51b1_3072x4080.jpeg" width="1456" height="1934" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0e744fb6-a17b-4f74-99d6-acd5107c51b1_3072x4080.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1934,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4512976,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.nikmalykhin.com/i/191237160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e744fb6-a17b-4f74-99d6-acd5107c51b1_3072x4080.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hCuS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e744fb6-a17b-4f74-99d6-acd5107c51b1_3072x4080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!hCuS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e744fb6-a17b-4f74-99d6-acd5107c51b1_3072x4080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!hCuS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e744fb6-a17b-4f74-99d6-acd5107c51b1_3072x4080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!hCuS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e744fb6-a17b-4f74-99d6-acd5107c51b1_3072x4080.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>3. The Slalom (Logic Branching)</h3><p>Finally, we introduced the Slalom. This was a true logic-branching exercise: navigating a sequence of four cones. It required high-frequency adjustments to his trajectory based on the &#8220;Look-Ahead&#8221; data he was now successfully processing.</p><h2>The &#8220;Merged PR&#8221;: Managing the Developer Experience (DX)</h2><p>The first weekend wasn&#8217;t a &#8220;success&#8221; by pure performance metrics. 
He failed several drills, the &#8220;build&#8221; felt shaky, and the cones remained largely un-navigated.</p><p>But here is the most important log entry: <strong>He didn&#8217;t get frustrated.</strong> In my day job, when a Junior Developer (or an AI agent like Jules) struggles with a new stack, the worst thing a Tech Lead can do is demand they stay until midnight to &#8220;fix the build.&#8221; That is how you accrue <strong>Human Technical Debt</strong>&#8212;you might get the code merged today, but you&#8217;ve poisoned the developer&#8217;s relationship with the codebase for tomorrow.</p><p>By applying a &#8220;Freedom of Decision&#8221; protocol and capping sessions at 15 minutes, we prioritized the <strong>Developer Experience</strong>. Because I didn&#8217;t push, he didn&#8217;t associate the new hardware with stress. We maintained a high &#8220;morale-to-output&#8221; ratio, ensuring he was excited to &#8220;reboot&#8221; the training the following weekend.</p><p><strong>The Feature:</strong> By the end of the second weekend, something clicked. It wasn&#8217;t about completing the drills perfectly&#8212;it was about the <em>feel</em>. The &#8220;Look-Ahead&#8221; algorithm was finally running in the background, and he started to feel comfortable on the new hardware.</p><h2>The Post-Deployment Cleanup: Ownership</h2><p>The real sign that the migration was a success came after the training was over. Without being asked, he started cleaning the bike himself.</p><p>In engineering, we call this <strong>Full-Cycle Ownership</strong>. It&#8217;s the moment a developer stops just writing code and starts caring about the health of the system they operate. Seeing a 5-year-old wipe down his own &#8220;hardware&#8221; after a successful sprint in the sandbox is the ultimate proof of engagement. 
He wasn&#8217;t just using the tool; he was owning it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!940b!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F374b8953-b0fb-43b7-8456-6d931c84eb32_3072x4080.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!940b!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F374b8953-b0fb-43b7-8456-6d931c84eb32_3072x4080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!940b!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F374b8953-b0fb-43b7-8456-6d931c84eb32_3072x4080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!940b!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F374b8953-b0fb-43b7-8456-6d931c84eb32_3072x4080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!940b!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F374b8953-b0fb-43b7-8456-6d931c84eb32_3072x4080.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!940b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F374b8953-b0fb-43b7-8456-6d931c84eb32_3072x4080.jpeg" width="1456" height="1934" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/374b8953-b0fb-43b7-8456-6d931c84eb32_3072x4080.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1934,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3369257,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.nikmalykhin.com/i/191237160?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F374b8953-b0fb-43b7-8456-6d931c84eb32_3072x4080.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!940b!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F374b8953-b0fb-43b7-8456-6d931c84eb32_3072x4080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!940b!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F374b8953-b0fb-43b7-8456-6d931c84eb32_3072x4080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!940b!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F374b8953-b0fb-43b7-8456-6d931c84eb32_3072x4080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!940b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F374b8953-b0fb-43b7-8456-6d931c84eb32_3072x4080.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h3>The Log:</h3><ul><li><p><strong>Hardware:</strong> Specialized Hotrock 24 (LTS Migration).</p></li><li><p><strong>Total Training Time:</strong> Two 15-minute sprints.</p></li><li><p><strong>Bugs Fixed:</strong> Visual Latency (Front-wheel staring).</p></li><li><p><strong>Post-Deployment:</strong> Automatic system maintenance (he cleaned the bike).</p></li><li><p><strong>Emotional ROI:</strong> High. 
The goal isn't to go fast on day one&#8212;it's to make sure that when we finally hit the trails, the pilot feels like the system belongs to him.</p></li></ul><div><hr></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.nikmalykhin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Refactoring life, one Side Quest at a time.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Lying Tests and the Silent Swallow: Hardening Legacy Java]]></title><description><![CDATA[Is your CI/CD pipeline telling you the truth, or is it just telling you what you want to hear?]]></description><link>https://www.nikmalykhin.com/p/lying-tests-and-the-silent-swallow</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/lying-tests-and-the-silent-swallow</guid><pubDate>Tue, 17 Mar 2026 08:00:27 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Ojx!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8d27381-c618-42b7-a15f-62e1d625e22d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Is your CI/CD pipeline telling you the truth, or is it just telling you what you want to hear?</strong> </p><p>In many legacy projects, the build is &#8220;Green,&#8221; the tests pass, and the console shows no errors. Yet, the moment the application hits production, it fails. 
The culprit is often a &#8220;Lying Test&#8221;&#8212;a suite that passes not because the code works, but because the errors have been carefully hidden, logged to a void, or suppressed by a generic catch-all block.</p><p>How do you turn a &#8220;politely silent&#8221; codebase into one that fails loudly enough to be fixed?</p><h3>The &#8216;Before&#8217; State: Setting the Context</h3><p>In older Java applications (circa 2005), error handling was often synonymous with <code>e.printStackTrace()</code>. Developers used manual <code>main()</code> methods or early JUnit versions to &#8220;test&#8221; logic. When an exception occurred, the instinct was to keep the process running at all costs.</p><p>The &#8220;old way&#8221; of testing often looked like this:</p><ul><li><p><strong>The Silent Swallow:</strong> Generic <code>catch (Exception e)</code> blocks that log a message but do not rethrow or signal failure.</p></li><li><p><strong>Exit Code 0:</strong> Build scripts (Ant) that encounter a runtime error but still report a successful exit code, tricking the developer into thinking everything is fine.</p></li><li><p><strong>Manual Verification:</strong> Tests that require a human to read the console output to see if it &#8220;looks right,&#8221; rather than asserting a specific outcome.</p></li></ul><h3>Introducing the Core Concept: Honest Testing</h3><p><strong>Honest Testing</strong> is the process of stripping away the &#8220;safety blankets&#8221; of legacy error handling to force the application to <strong>Crash Loudly.</strong></p><p><strong>What is it?</strong> It is a &#8220;Hardening Phase&#8221; where you replace swallowed exceptions with meaningful failures and migrate manual checks to automated assertions.</p><p><strong>Why does it matter?</strong> You cannot refactor code you do not understand. If your tests are lying to you about the state of the system, any &#8220;improvement&#8221; you make is just a guess. 
Making the build <strong>RED</strong> is the first step toward making it truly <strong>GREEN.</strong></p><h3>Practical Applications &amp; Use Cases</h3><h4>Use Case A: Exposing the Silent Swallow</h4><p>The most common anti-pattern in legacy Java is the &#8220;Log and Forget&#8221; block. We must convert these into loud failures during the testing phase.</p><pre><code><code>// BEFORE: The Lying Code
public void storeData() {
    try {
        // critical logic
    } catch (Exception e) {
        System.out.println("Error happened, but let's keep going!");
    }
}

// AFTER: Honest Code for Testing
public void storeData() {
    try {
        // critical logic
    } catch (Exception e) {
        // Re-throwing as a RuntimeException forces the test to fail
        throw new RuntimeException("Hardened Failure: Data storage failed", e);
    }
}
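// With the hardened version above in place, a standard JUnit 5 test
// can pin the behavior down. (Sketch: the enclosing test class and the
// storeData() method under test are assumed from the example above.)
@Test
void storeDataFailsLoudly() {
    assertThrows(RuntimeException.class, () -&gt; storeData());
}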
</code></code></pre><p><em>Benefit: The test suite will now immediately catch failures that were previously invisible.</em></p><h4>Use Case B: From <code>main()</code> to JUnit 5</h4><p>Legacy projects often have &#8220;test&#8221; classes that are just <code>public static void main(String[] args)</code> methods. These don&#8217;t integrate with CI/CD.</p><pre><code><code>// Migrating to JUnit 5 Assertions
@Test
void testBackendConnection() {
    Backend b = new Backend("qbert.guba.com");
    // Instead of printing to console, we assert the state
    assertDoesNotThrow(() -&gt; b.connect(), "Connection should be stable");
    assertNotNull(b.getStatus(), "Status should be initialized");
}
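// The same pattern extends to exact-value checks: instead of reading
// console output, assert the value itself. (Sketch: the "CONNECTED"
// status string is a hypothetical example, not the real API.)
@Test
void testStatusReport() {
    Backend b = new Backend("qbert.guba.com");
    assertDoesNotThrow(() -&gt; b.connect(), "Connection should be stable");
    assertEquals("CONNECTED", b.getStatus(), "Assert the value, don't eyeball it");
}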
</code></code></pre><p><em>Benefit: Provides a quantifiable &#8220;Safety Net&#8221; that build tools like Gradle can interpret as a Pass/Fail signal.</em></p><h3>Common Pitfalls &amp; Misconceptions</h3><p><strong>The &#8220;Fear of Red&#8221; Pitfall:</strong> Many teams are terrified of a broken build. They think that if the build turns red, they&#8217;ve failed.</p><p><strong>The Truth:</strong> In legacy refactoring, a <strong>Red Build</strong> is a victory. It means you&#8217;ve finally found the boundaries of the system. You&#8217;ve moved from &#8220;unknown-unknowns&#8221; to &#8220;known-knowns.&#8221; Don&#8217;t rush to fix the red; use it as a map to find where the code is truly broken.</p><h3>Core Trade-offs &amp; Nuances</h3><ul><li><p><strong>The &#8220;Crash&#8221; Period:</strong> When you start hardening tests, the project might not compile or pass for days. This requires stakeholder buy-in&#8212;you are breaking the &#8220;illusion of stability&#8221; to find the &#8220;reality of the debt.&#8221;</p></li><li><p><strong>Log Noise:</strong> Hardening exceptions often results in massive stack traces in your logs. This is necessary labor; you have to clean the noise to find the signals.</p></li></ul><h3>Forward-Looking Conclusion</h3><p>A &#8220;Green Build&#8221; is only valuable if it is earned. By removing the &#8220;Silent Swallows&#8221; from your legacy Java project, you are performing a diagnostic surgery. It is painful, and it reveals the rot, but it is the only way to heal the codebase.</p><p>Once your tests are honest, you can finally apply modern AI tools and refactoring patterns with confidence. 
You aren&#8217;t just &#8220;hacking&#8221; anymore; you are <strong>Engineering.</strong></p><div><hr></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.nikmalykhin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading! Subscribe to get more practical guides on using GenAI tools effectively in software development work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Refactoring the Workshop]]></title><description><![CDATA[Rebuilding a bike maintenance stack from scratch&#8212;from professional roots to family essentials in Spain.]]></description><link>https://www.nikmalykhin.com/p/refactoring-the-workshop</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/refactoring-the-workshop</guid><dc:creator><![CDATA[Nik]]></dc:creator><pubDate>Thu, 05 Mar 2026 16:29:41 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Ojx!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8d27381-c618-42b7-a15f-62e1d625e22d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3>The Migration Headache</h3><p>Ever tried to migrate a massive, stateful legacy system to a new cloud region with zero downtime? That was my life in 2024. 
But here&#8217;s the thing about technical debt: it follows you.</p><p>My &#8220;Legacy System&#8221; wasn&#8217;t the physical tools&#8212;I&#8217;d sold those off before leaving Israel. The debt was in my head. My experience as a consultant in a small local MTB shop in Saint-Petersburg fifteen years ago had programmed me with a &#8220;pro-shop&#8221; bias. When we lived in Israel, I acted on that bias and built a monolith: a massive toolset, a wheel balancing stand, the works. It was classic <strong>Over-engineering</strong>.</p><p>Now, standing in my garage in Spain with two <strong>Merida Big Nine 60s</strong> and my son&#8217;s <strong>Specialized Hotrock 24</strong>, I realized I didn&#8217;t need to rebuild the data center. I needed to refactor for efficiency. I needed a <strong>modular set of microservices</strong>.</p><h3>The &#8220;System Architecture&#8221;: A Modular Toolchain</h3><p>Instead of a &#8220;buy-it-all&#8221; approach, I&#8217;ve decoupled the maintenance into three high-performance modules.</p><h4>1. Edge Computing: The &#8220;On-The-Trail&#8221; Kit</h4><p>This is for high-availability fixes. If this service fails, the &#8220;user&#8221; (my oldest son) has a total system crash 5km from the trailhead. I&#8217;ve packed this &#8220;payload&#8221; into a <strong>SKYSPER 20L</strong> backpack, organized in <strong>Zip-lock bags</strong> for modular access:</p><ul><li><p><strong>The Processor</strong>: Crankbrothers M17 multi-tool.</p></li><li><p><strong>Error Handling</strong>: KMC Missing Links (9 and 7-speed) + 2x Pedro&#8217;s or Park Tool tire levers.</p></li><li><p><strong>Redundancy</strong>: Kenda tubes (29x2.2 and 24x1.95) + Park Tool GP-2 pre-glued patches.</p></li><li><p><strong>Hardware Peripherals</strong>: hand pump with manometer + 2x small microfiber towels (Dirty/Clean).</p></li><li><p><strong>On-the-fly Patches</strong>: Small 60ml Finish Line Dry Lube + a travel-size Teflon spray (MO-94/GT85).</p></li></ul><h4>2. 
Maintenance Scripts: The &#8220;Dry-Clean&#8221; Routine</h4><p>Think of this as your <code>cron</code> jobs. It runs weekly to prevent system degradation. Here is the <strong>deployment logic</strong>:</p><ol><li><p><strong>Mechanical Cleaning</strong>: Back-pedal the chain through a dry rag to remove &#8220;big&#8221; grit.</p></li><li><p><strong>Rinse (Optional)</strong>: If you hear &#8220;sand grinding&#8221; in the gears, flush it with water.</p></li><li><p><strong>Stanchion Wipe</strong>: Clean the shiny bits of the fork with a dedicated rag.</p></li><li><p><strong>The Teflon Interface (Conditional Logic)</strong>:</p><ul><li><p><code>if (no_rinse)</code>: Spray Teflon onto a rag (not the bike) to wipe the chain/bolts.</p></li><li><p><code>else if (rinse_performed)</code>: Protect the brakes and spray Teflon <strong>directly</strong> onto the chain for water displacement.</p></li></ul></li><li><p><strong>The Wipe-Down</strong>: Use that Teflon-soaked rag to wipe the chain and bolt heads. This microscopic film stops the Spanish salt air from &#8220;bit-rotting&#8221; your hardware.</p></li><li><p><strong>Re-Lube</strong>: Apply Finish Line Dry Lube to the rollers.</p></li><li><p><strong>Final Wipe</strong>: Wait 60 seconds for penetration, then wipe off excess.</p></li></ol><h4>3. Core Infrastructure: The &#8220;Yearly Service&#8221;</h4><p>This is the &#8220;bare metal&#8221; hardware needed for the deep dives.</p><ul><li><p><strong>Health Monitoring</strong>: A <strong>Chain Wear Indicator</strong>. If it hits 0.75, the chain is &#8220;deprecated&#8221; and needs replacement.</p></li><li><p><strong>The Interface</strong>: A thin-profile <strong>15mm Pedal Wrench</strong>. You can&#8217;t hack this with a standard DIY wrench.</p></li><li><p><strong>Environment Setup</strong>: A <strong>Floor-to-Frame Stand</strong>. 
I found one for &#8364;30 on Vinted&#8212;a small investment for a massive increase in &#8220;developer comfort.&#8221;</p></li><li><p><strong>JIT (Just-In-Time) Dependencies</strong>: Specialized tools like the Cassette Lockring Tool and Cable Cutters are in the &#8220;backlog.&#8221; I won&#8217;t buy them until the specific part needs a &#8220;version upgrade.&#8221;</p></li></ul><h3>The Bonus: &#8220;Season Deep Clean&#8221; (System Integrity Audit)</h3><p>Once a season, we need more than a script; we need a full <strong>System Audit</strong>. This is where we check for &#8220;memory leaks&#8221; and hardware degradation.</p><h4>The Audit Kit</h4><ul><li><p><strong>Garbage Collector</strong>: Bio-Degreaser (Finish Line EcoTech).</p></li><li><p><strong>The &#8220;Gherkin&#8221; Brush</strong>: A drivetrain detail brush with a &#8220;claw&#8221; for digging out grit.</p></li><li><p><strong>Linter Tool</strong>: Chain Wear Indicator.</p></li></ul><h4>The Protocol</h4><ol><li><p><strong>Pre-Wash &amp; Degrease</strong>: Remove the mud, then spray degreaser on the gears. Let the &#8220;Garbage Collector&#8221; run for 3 minutes.</p></li><li><p><strong>Scrub &amp; Rinse</strong>: Use the &#8220;Gherkin&#8221; claw to dig out grit. Rinse with low-pressure water.</p></li><li><p><strong>Water Displacement</strong>: While wet, spray Teflon on the chain, bolts, and derailleur springs to prevent oxidation.</p></li><li><p><strong>Dry</strong>: Use a microfiber towel. <strong>Crucial</strong>: If the chain isn&#8217;t dry, your lube won&#8217;t &#8220;deploy&#8221; correctly into the metal.</p></li><li><p><strong>Re-Lubrication</strong>: Apply one drop of Finish Line Dry Lube to each roller on the <strong>inside</strong> of the chain while back-pedaling.</p></li><li><p><strong>The Wipe-Down</strong>: Wait 60 seconds for the lube to soak into the &#8220;inner pins.&#8221; Then, use a clean rag to wipe off the excess.
The chain should be lubricated on the inside, but dry to the touch on the outside to prevent sand from sticking to the surface.</p></li></ol><h4>The Health Check (Static Analysis)</h4><ul><li><p><strong>Dependency Check</strong>: Use the Chain Wear Indicator. If it hits 0.75, the chain is <strong>deprecated</strong>&#8212;replace it.</p></li><li><p><strong>Brake Validation</strong>: Check for 1mm thickness. Safety is a non-negotiable fail-safe.</p></li><li><p><strong>Indexing</strong>: Shift through all gears. If it &#8220;clicks,&#8221; adjust the barrel adjuster by 0.5 turns (like fine-tuning a config file).</p></li><li><p><strong>Cable Integrity</strong>: Look for &#8220;blooming&#8221; silver wires. If a cable is untwisting, it&#8217;s about to <strong>crash</strong>. If shifting is &#8220;crunchy,&#8221; the cable is &#8220;dragging&#8221; in the housing&#8212;likely a rust/dirt bottleneck.</p></li><li><p><strong>Load Balancing</strong>: Spin the wheels. If they wobble &gt;3mm, they need balancing (truing).</p></li></ul><h3>The Debugging Phase: Ego vs. Reality</h3><p>The biggest &#8220;bug&#8221; I encountered was my own <strong>Professional Ego</strong>. Because I worked in that shop in Saint-Petersburg and maintained a &#8220;perfect&#8221; setup in Israel, I felt like a &#8220;junior&#8221; by not having every professional tool immediately.</p><p>I had to debug that thought process. In software, we call this <strong>YAGNI</strong> (You Ain&#8217;t Gonna Need It). For a Merida Big Nine 60, I can &#8220;debug&#8221; a wobbly wheel by watching it against the frame. I don&#8217;t need a $300 truing stand to verify a fix.</p><p>The real challenge is <strong>Onboarding the Junior Dev</strong> (my son). When his Hotrock 24 starts &#8220;clicking,&#8221; the <strong>latency</strong> between my coaching cue and his execution is high. Keeping his bike &#8220;clean&#8221; via these scripts reduces the &#8220;noise&#8221; in his learning process. 
A smooth drivetrain is just a better UI for a kid.</p><h3>The &#8220;Merged PR&#8221;: Log Summary</h3><p>The &#8220;monolith&#8221; workshop is officially decommissioned. It&#8217;s been replaced by a streamlined, purpose-built kit, neatly &#8220;containerized&#8221; in Zip-lock bags within a single backpack.</p><ul><li><p><strong>Status</strong>: Healthy.</p></li><li><p><strong>Packaging</strong>: All trail tools isolated in Zip-locks for weatherproofing.</p></li><li><p><strong>Uptime</strong>: All family bikes are 100% operational.</p></li><li><p><strong>Backlog</strong>: Need to keep an eye on the brake pads; we&#8217;re approaching a &#8220;major version&#8221; update there.</p></li></ul><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.nikmalykhin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Refactoring life, one Side Quest at a time.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Environment Emulation: Using Docker as a Time Machine for Legacy Java]]></title><description><![CDATA[What do you do when the code is right, but the world has changed too much to run it? 
You&#8217;ve successfully compiled a 20-year-old Java app, but the moment you hit &#8220;Run,&#8221; it crashes.]]></description><link>https://www.nikmalykhin.com/p/environment-emulation-using-docker</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/environment-emulation-using-docker</guid><dc:creator><![CDATA[Nik]]></dc:creator><pubDate>Tue, 03 Mar 2026 08:01:11 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Ojx!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8d27381-c618-42b7-a15f-62e1d625e22d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>What do you do when the code is right, but the world has changed too much to run it?</strong> You&#8217;ve successfully compiled a 20-year-old Java app, but the moment you hit &#8220;Run,&#8221; it crashes. It&#8217;s looking for a server named <code>qbert.guba.com</code> that was decommissioned in 2011. It&#8217;s searching for a local directory belonging to a developer who left the company fifteen years ago.</p><p>How do you convince a digital &#8220;antique&#8221; that it&#8217;s still living in 2005?</p><h3>The &#8216;Before&#8217; State: Setting the Context</h3><p>In the early days of Java development, &#8220;Environment Variables&#8221; and &#8220;Configuration as Code&#8221; were often ignored in favor of hardcoded assumptions. 
Developers wrote code that relied on:</p><ul><li><p><strong>Static Network Topologies:</strong> Hardcoded hostnames in <code>.properties</code> files or even inside <code>.class</code> files.</p></li><li><p><strong>Personalized File Paths:</strong> Logic that pointed to <code>/Users/ericlambrecht/data</code>, making the code physically impossible to run on any other machine.</p></li><li><p><strong>Specific Hardware Quirks:</strong> Reliance on the way Intel processors handled certain operations, which breaks on modern ARM-based chips like Apple&#8217;s M-series.</p></li></ul><p>The &#8220;old way&#8221; to fix this was a massive refactoring effort to externalize configuration. But when you have thousands of lines of &#8220;spaghetti&#8221; code, you risk introducing more bugs than you fix.</p><h3>Introducing the Core Concept: Environment Emulation</h3><p><strong>Environment Emulation</strong> is the practice of using containerization to recreate a specific historical &#8220;reality&#8221; for your application. Instead of changing the code to fit the modern world, you change the world to fit the code.</p><p><strong>What is it?</strong> It&#8217;s a &#8220;Time Capsule&#8221; strategy where Docker mimics the network, filesystem, and CPU architecture the application expects.</p><p><strong>Why does it matter?</strong> It allows you to achieve a &#8220;Green Start&#8221; without touching a single line of legacy business logic. By stabilizing the environment first, you can verify that the code <em>can</em> work before you begin the dangerous work of refactoring it.</p><h3>Practical Applications &amp; Use Cases</h3><h4>Use Case A: Network Trickery (Docker Aliases)</h4><p>If your legacy code is hardcoded to look for <code>qbert.guba.com</code>, you don&#8217;t need to hunt through the source code. 
You can use Docker&#8217;s network aliases to point that &#8220;ghost&#8221; hostname to a local container or a mock service.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;markdown&quot;,&quot;nodeId&quot;:&quot;0aefdc7d-db5f-40c4-8841-fc3209dcea12&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-markdown"># docker-compose.yml
services:
  legacy-app:
    image: my-ancient-app:latest
    networks:
      backend:
        aliases:
          - qbert.guba.com  # The app thinks it found its long-lost server
networks:
  backend:</code></pre></div><p><em>Benefit: The application connects successfully without any code changes or </em><code>/etc/hosts</code><em> hacking on your host machine.</em></p><h4>Use Case B: Filesystem Mimicry (Volume Mapping)</h4><p>When code is locked to a specific path like <code>/Users/eric/data</code>, Docker volumes can &#8220;teleport&#8221; your modern project directory into that exact location inside the container.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;833c3f29-cd4e-4610-95cf-d1ca05c4eb25&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">docker run -v $(pwd)/data:/Users/ericlambrecht/data my-legacy-java-app</code></pre></div><p><em>Benefit: You satisfy hardcoded file requirements immediately, allowing the app to boot and pass its initial I/O checks.</em></p><h4>Use Case C: Hardware Realities (x86 on ARM)</h4><p>Older binaries or specific versions of the JVM (like early Java 6 or 8 builds) may behave unpredictably on Apple Silicon (ARM64). You can force Docker to emulate the original Intel environment.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;dockerfile&quot;,&quot;nodeId&quot;:&quot;16a91638-e465-40fc-8ce9-b94533cdf233&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-dockerfile"># Specify the platform to ensure 100% compatibility with legacy binaries
FROM --platform=linux/amd64 eclipse-temurin:8-jdk</code></pre></div><p><em>Benefit: Eliminates subtle &#8220;Heisenbugs&#8221; caused by CPU architecture differences.</em></p><h3>Common Pitfalls &amp; Misconceptions</h3><p><strong>The "Config-First" Trap:</strong> Many engineers think they must "clean up" the configuration files before they can run the app in Docker.</p><p><strong>The Fix:</strong> Don&#8217;t clean. <strong>Emulate.</strong> Use Docker to satisfy the app&#8217;s current (even if &#8220;ugly&#8221;) requirements. Once you have a running, testable container, you can then refactor the configuration into modern environment variables as a second, safer step.</p><h3>Core Trade-offs &amp; Nuances</h3><ul><li><p><strong>The &#8220;Magic&#8221; Burden:</strong> Environment emulation can feel like &#8220;magic&#8221; to new developers. If the <code>docker-compose.yml</code> isn&#8217;t well-documented, a newcomer won&#8217;t understand why the app is looking for a server that doesn&#8217;t exist.</p></li><li><p><strong>Performance:</strong> Running x86 images on ARM64 via emulation (QEMU) is slower than native execution. This is acceptable for refactoring and testing, but may not be ideal for high-performance production needs.</p></li></ul><h3>Forward-Looking Conclusion</h3><p>Modernization is an act of engineering, not just coding. 
By using Docker as a &#8220;Time Machine,&#8221; you stop fighting the environment and start observing the application&#8217;s actual behavior.</p><p>Once the &#8220;Time Capsule&#8221; is built, you have achieved the ultimate goal of the software archaeologist: <strong>Reproducibility.</strong> From here, you can move forward with confidence, knowing that any changes you make to the code are being tested against a stable, predictable reality.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.nikmalykhin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading! Subscribe to get more practical guides on using GenAI tools effectively in software development work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[The Strangler Build: Modernizing Java Tooling with Gradle 7.6]]></title><description><![CDATA[What do you do when your build system is the primary blocker to your modernization? 
You want to introduce automated testing and containerized deployments, but your project is locked inside an opaque build.xml file.]]></description><link>https://www.nikmalykhin.com/p/the-strangler-build-modernizing-java</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/the-strangler-build-modernizing-java</guid><dc:creator><![CDATA[Nik]]></dc:creator><pubDate>Tue, 17 Feb 2026 08:03:21 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Ojx!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8d27381-c618-42b7-a15f-62e1d625e22d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>What do you do when your build system is the primary blocker to your modernization?</strong> You want to introduce automated testing and containerized deployments, but your project is locked inside an opaque <code>build.xml</code> file. It&#8217;s not necessarily that the file is thousands of lines long&#8212;it&#8217;s that it represents a &#8220;frozen&#8221; process. The fear of breaking a specific, undocumented Ant target often keeps teams stuck in the past, manually running builds because they don&#8217;t trust the automation.</p><h3>The &#8216;Before&#8217; State: Setting the Context</h3><p>In the early 2000s, <strong>Apache Ant</strong> was the industry standard. It was purely imperative: you wrote a &#8220;script&#8221; telling the computer exactly how to delete folders, copy files, and compile classes.</p><p>The problem isn&#8217;t just the age of the tool; it&#8217;s the <strong>lack of lifecycle</strong>. Unlike Maven or Gradle, Ant has no built-in concept of a &#8220;test&#8221; phase or a &#8220;package&#8221; phase unless someone manually scripted them. 
For many legacy projects, this resulted in a build process that is fragile, hard to replicate in CI/CD, and completely disconnected from modern dependency management.</p><h3>Introducing the Core Concept: The Tooling Strangler</h3><p>The <strong>Tooling Strangler</strong> applies the Strangler Fig pattern to your build infrastructure. Instead of attempting a &#8220;Big Bang&#8221; migration where you delete Ant and spend a week debugging a new Gradle script, you <strong>wrap</strong> the old logic.</p><p><strong>What is it?</strong> Using Gradle&#8217;s <code>ant.importBuild</code>, you surface your legacy Ant targets as native Gradle tasks.</p><p><strong>Why does it matter?</strong> It allows you to move to a modern CLI immediately. You get the benefits of the Gradle Wrapper (<code>./gradlew</code>), advanced caching, and build scans, while the actual heavy lifting is still performed by the original, proven Ant logic.</p><h3>Practical Applications &amp; Use Cases</h3><h4>Use Case A: The &#8220;Wrapper&#8221; Migration</h4><p>By importing the build, you can start adding modern features (like dependency management) around the old Ant tasks without changing the Ant file itself.</p><pre><code>// build.gradle
// Import the existing Ant logic
ant.importBuild 'build.xml'

// Add a modern dependency that Ant didn't know about
dependencies {
    implementation 'org.slf4j:slf4j-api:1.7.36'
    testImplementation 'org.junit.jupiter:junit-jupiter:5.9.1'
}
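// Wire the new JUnit 5 dependency into Gradle's test task.
// (Sketch: assumes the 'java' plugin is applied in this build; without
// useJUnitPlatform(), Gradle 7.6 won't discover JUnit 5 tests.)
test {
    useJUnitPlatform()
}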

// "Hook" a modern task into an old Ant target
tasks.named('compile') {
    doLast {
        println "Ant finished compiling. Gradle is now verifying the output..."
    }
}</code></pre><p><em>Benefit: Risk-free modernization. Your build stays &#8220;green&#8221; throughout the entire transition.</em></p><h4>Use Case B: The 7.6 &#8220;Goldilocks&#8221; Version</h4><p>In my experiments, I found that <strong>Gradle 7.6</strong> is the specific &#8220;sweet spot&#8221; for this work. Why?</p><ol><li><p><strong>JDK 8 Compatibility:</strong> It is the last major version that runs its own background processes (the daemon) natively on Java 8.</p></li><li><p><strong>Modern Features:</strong> It still supports the latest JUnit 5 platforms and Docker-ready plugins.</p></li><li><p><strong>The Bridge:</strong> It allows you to bridge the gap between a 2005 build logic and a 2026 deployment pipeline.</p></li></ol><h3>Common Pitfalls &amp; Misconceptions</h3><p><strong>The "Pure Gradle" Obsession:</strong> A common mistake is trying to make the <code>build.gradle</code> file "perfect" from day one. Developers often get stuck trying to replicate a weird Ant <code>copy</code> task in Gradle's DSL.</p><p><strong>The Fix:</strong> If the Ant task works, <strong>leave it in Ant.</strong> Use the Strangler Fig approach: only move tasks to Gradle when you actually need to change their logic or improve their performance.</p><h3>Core Trade-offs &amp; Nuances</h3><ul><li><p><strong>Dual Maintenance:</strong> For a period, you have both <code>build.xml</code> and <code>build.gradle</code>. You must treat the Gradle file as the new &#8220;entry point&#8221; for the team.</p></li><li><p><strong>Mindset Shift:</strong> You are moving from a &#8220;Scripting&#8221; mindset (Ant) to a &#8220;Task Graph&#8221; mindset (Gradle). 
Understanding how tasks depend on one another is more important than knowing the syntax.</p></li></ul><h3>Forward-Looking Conclusion</h3><p>Modernizing a build system doesn&#8217;t require a &#8220;demolition and rebuild.&#8221; By using <strong>Gradle 7.6</strong> as a wrapper for your legacy Ant scripts, you buy yourself the most valuable asset in refactoring: <strong>time.</strong> You get the project into a modern CI/CD pipeline on day one. Once the build is stabilized and automated, you can &#8220;strangle&#8221; the remaining Ant targets at your own pace.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.nikmalykhin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading! Subscribe to get more practical guides on using GenAI tools effectively in software development work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[The Golden Bridge: Why Java 8 is the Ultimate Tool for Legacy Refactoring]]></title><description><![CDATA[When does &#8220;latest and greatest&#8221; become a liability? 
Imagine you&#8217;ve just inherited a &#8220;Big Ball of Mud&#8221;: a 20-year-old repository built with Ant, running on Java 1.5, and filled with raw types and swallowed exceptions.]]></description><link>https://www.nikmalykhin.com/p/the-golden-bridge-why-java-8-is-the</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/the-golden-bridge-why-java-8-is-the</guid><dc:creator><![CDATA[Nik]]></dc:creator><pubDate>Mon, 16 Feb 2026 08:02:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Ojx!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8d27381-c618-42b7-a15f-62e1d625e22d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>When does &#8220;latest and greatest&#8221; become a liability?</strong> Imagine you&#8217;ve just inherited a &#8220;Big Ball of Mud&#8221;: a 20-year-old repository built with Ant, running on Java 1.5, and filled with raw types and swallowed exceptions. Your instinct is to jump to Java 21 to get the latest performance gains and features. But when you try to compile, you&#8217;re met with thousands of breaking changes, deleted APIs, and a build system that refuses to acknowledge modern hardware.</p><p>How do you modernize a system that is too old to run, but too critical to fail?</p><h3>The &#8216;Before&#8217; State: Setting the Context</h3><p>In the world of &#8220;Software Archaeology,&#8221; we often encounter projects stuck in the mid-2000s. 
These applications are often:</p><ul><li><p><strong>Compiler-Locked:</strong> They rely on syntax (like certain raw-type configurations) that modern JDKs (11, 17, 21) simply won&#8217;t compile anymore.</p></li><li><p><strong>Environment-Fragile:</strong> They only &#8220;work on Bob&#8217;s machine&#8221; because Bob has a specific 2008-era Intel laptop and a prehistoric version of the JDK.</p></li><li><p><strong>Tooling-Limited:</strong> They use Ant or early Maven versions that don&#8217;t understand modern CI/CD pipelines or containerization.</p></li></ul><p>The &#8220;old way&#8221; of fixing this was the <strong>Big Bang Migration</strong>: a grueling six-month rewrite where you try to jump 15 years of evolution in one go. Most of these attempts end in failure, reverted commits, and exhausted teams.</p><h3>Introducing the Core Concept: The Golden Bridge</h3><p>The <strong>Golden Bridge</strong> methodology uses Java 8 not as a final destination, but as a strategic <strong>"Field Hospital."</strong> <strong>What is it?</strong> It is the practice of migrating ancient code (Java 1.4 - 1.6) specifically to Java 8 first, rather than the current LTS.<br><strong>Why does it matter?</strong> Java 8 sits at a unique historical intersection. 
It is the &#8220;Last of the Ancients&#8221; and the &#8220;First of the Moderns.&#8221; It provides a stable environment where you can fix the internal architecture of the code without the external environment fighting you.</p><p><strong>How does it work?</strong> </p><ol><li><p><strong>Dual-Compatibility:</strong> It supports the <code>-source 1.5</code> flag to compile ancient syntax while allowing you to use modern IDEs.</p></li><li><p><strong>Architecture Neutrality:</strong> It is the first version that runs natively on Apple Silicon (ARM64) via Zulu or Temurin builds, ending the reliance on old hardware.</p></li><li><p><strong>Tooling Support:</strong> It is fully supported by Gradle 7.6, which acts as the "Strangler Fig" for old Ant builds.</p></li></ol><h3>Practical Applications &amp; Use Cases</h3><h4>Use Case A: Compiling the &#8220;Uncompilable&#8221;</h4><p>Modern JDKs have removed many internal APIs and tightened the rules on source compatibility. Java 8 allows you to keep the old code running while you transition the build system.</p><pre><code>// In your build.gradle, you can target the past while living in the present
java {
    toolchain {
        languageVersion = JavaLanguageVersion.of(8)
    }
}</code></pre><p><em>Benefit: You get a green build in hours, not weeks.</em></p><h4>Use Case B: The Docker &#8220;Time Machine&#8221;</h4><p>By using Java 8, you can create a Docker image that mirrors the production environment exactly, but runs on a 2024 MacBook.</p><pre><code>FROM eclipse-temurin:8-jdk
# Declare the mount point for the 20-year-old hardcoded data path.
# VOLUME takes no host path; map it at run time, e.g.:
#   docker run -v /Users/original_dev/data:/data &lt;image&gt;
VOLUME /data
COPY . /app
WORKDIR /app
# The base image ships only the JDK; Ant itself must be installed
# (package name assumed for the Ubuntu-based Temurin image)
RUN apt-get update &amp;&amp; apt-get install -y --no-install-recommends ant
CMD ["ant", "test"]</code></pre><p><em>Benefit: Eliminates &#8220;Works on my machine&#8221; bugs immediately.</em></p><h3>Common Pitfalls &amp; Misconceptions</h3><p><strong>The "Destination" Trap:</strong> The biggest mistake is thinking that moving to Java 8 is "enough."</p><p>Java 8 is a <strong>bridge</strong>, not a home. If you stay there, you are still accumulating technical debt. The goal of the Golden Bridge is to get the code clean enough (removing raw types, fixing tests) so that the jump to Java 17 or 21 becomes a simple compiler flag change rather than a structural nightmare.</p><h3>Core Trade-offs &amp; Nuances</h3><ul><li><p><strong>The Cost:</strong> You have to maintain a specific legacy toolchain (like Gradle 7.6) because the newest versions of build tools have dropped support for Java 8.</p></li><li><p><strong>The Mindset:</strong> You must resist the urge to use Java 8 features (like Streams or Optionals) immediately. Your first goal is <strong>stabilization</strong>, not modernization. Adding new syntax to a &#8220;muddy&#8221; codebase only makes the archaeology harder.</p></li></ul><h3>Forward-Looking Conclusion</h3><p>Java 8 is the unique &#8220;Goldilocks&#8221; zone of the Java ecosystem. It&#8217;s old enough to understand where the code came from, and modern enough to work with the tools of today.</p><p>By treating Java 8 as your <strong>Golden Bridge</strong>, you turn a high-risk &#8220;archaeological dig&#8221; into a controlled engineering project. Use it to stabilize your build, containerize your environment, and harden your tests. 
Once the mud is washed away, the path to Java 21 will be wide open.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.nikmalykhin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading! Subscribe to get more practical guides on using GenAI tools effectively in software development work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Does Delegating to AI Mean We Can Finally Be Lazy Managers?]]></title><description><![CDATA[I tested Google's Jules agent with two approaches: a vague "lazy" prompt and a detailed technical spec. The results reveal a paradox about AI autonomy and technical debt.]]></description><link>https://www.nikmalykhin.com/p/does-delegating-to-ai-mean-we-can</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/does-delegating-to-ai-mean-we-can</guid><dc:creator><![CDATA[Nik]]></dc:creator><pubDate>Tue, 20 Jan 2026 08:00:59 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Ojx!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8d27381-c618-42b7-a15f-62e1d625e22d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3>1. The Hook</h3><p>We often sell AI adoption to our bosses (and ourselves) with the promise of speed. 
We imagine a future where we toss a vague request over the wall&#8212;&#8220;fix the build,&#8221; &#8220;export the data,&#8221; &#8220;optimize the query&#8221;&#8212;and the AI handles the rest while we grab a coffee.</p><p>But my recent experiments with Jules, Google&#8217;s new AI agent, suggest the opposite is true. The more &#8220;autonomy&#8221; I gave the AI, the more mediocre the code became. This leads to an uncomfortable question: <strong>Does effective AI delegation actually require </strong><em><strong>more</strong></em><strong> management overhead, not less?</strong></p><h3>2. Context &amp; Tools</h3><p>I&#8217;ve been experimenting with <strong><a href="https://jules.google/">Jules</a></strong>, testing its ability to act as a &#8220;Junior Developer&#8221; in my Spring Boot repository, <strong><a href="https://github.com/nikmalykhin/joyofenergy-java-jules/">joyofenergy-java</a></strong>.</p><p>In my previous explorations, I looked at <a href="https://www.nikmalykhin.com/p/pair-authoring-with-an-ai-a-case">Pair-Authoring with an AI</a> and the <a href="https://www.nikmalykhin.com/p/the-context-window-paradox-to-get?utm_source=publication-search">Context Window Paradox</a>. This time, I wanted to test the difference between <strong>Abdication</strong> (lazy delegation) and <strong>Navigation</strong> (structured delegation) when asking an agent to build a feature from scratch.</p><h3>3. The Failed Experiment: The &#8220;Friday Afternoon&#8221; Prompt</h3><p>I set up a scenario we&#8217;ve all faced: It&#8217;s Friday afternoon, I want a new feature shipped, and I don&#8217;t want to think about the implementation details.</p><p>I gave Jules the &#8220;Lazy Manager&#8221; prompt:</p><blockquote><p>&#8220;Jules, create an endpoint to export meter readings as a CSV file. Use the existing MeterReadingService.&#8221;</p></blockquote><p>I intentionally withheld constraints. 
I didn&#8217;t mention memory usage, libraries, or formatting.</p><p>The Result?</p><p>Technically, it worked. Jules created a CsvService, updated the controller, and passed the tests. But structurally, it was a time-bomb.</p><ul><li><p><strong>Memory Unsafety:</strong> It loaded the entire dataset into a <code>List</code> in memory before writing the response. For a smart meter with 100,000 readings, this is an <code>OutOfMemoryError</code> waiting to happen.</p></li><li><p><strong>Library Bloat:</strong> It generated a new service class (<code>CsvService</code>)  where a simple stream in the controller would have sufficed.</p></li><li><p><strong>Junior Mistakes:</strong> It used standard Java formatting without considering how a user would actually open the file in Excel.</p></li></ul><p>The &#8220;lazy&#8221; prompt produced &#8220;lazy&#8221; code: functional, but dangerous at scale. It validated my fear that <a href="https://www.nikmalykhin.com/p/does-more-powerful-ai-mean-slower?utm_source=publication-search">More Powerful AI Doesn&#8217;t Always Mean Faster Fixes</a>.</p><h3>4. Principles That Actually Work: The &#8220;Brief&#8221;</h3><p>I reset the experiment. This time, I treated Jules like a Senior Engineer would treat a Junior: I wrote a spec.</p><p>I uploaded a file named <a href="https://github.com/nikmalykhin/joyofenergy-java-jules/blob/add-specs-feature-csv-export/specs/feature-csv-export.md">feature-csv-export.md</a> containing strict constraints:</p><ol><li><p><strong>No New Dependencies:</strong> Do not add <code>apache-commons</code> or <code>opencsv</code>.</p></li><li><p><strong>Memory Safety:</strong> Do not load lists into memory; stream directly to the <code>HttpServletResponse</code>.</p></li><li><p><strong>Strict Formatting:</strong> Use <code>yyyy-MM-dd HH:mm</code>.</p></li></ol><p>I then prompted:</p><blockquote><p>&#8220;Jules, I&#8217;ve uploaded a spec file... 
Please refactor the implementation to strictly follow these constraints.&#8221; </p></blockquote><p>The Outcome:</p><p>The difference was night and day.</p><ul><li><p><strong>Architectural Safety:</strong> Jules implemented a streaming solution using <code>PrintWriter</code>, avoiding the memory bottleneck entirely.</p></li><li><p><strong>Dependency Management:</strong> It correctly added <code>jakarta.servlet-api</code> as a <code>compileOnly</code> dependency, respecting the &#8220;no runtime bloat&#8221; rule.</p></li><li><p><strong>Test Integrity:</strong> It initially failed to test the controller response correctly, but because I had defined the &#8220;correct&#8221; output in the spec, I could guide it to fix the assertion logic.</p></li></ul><h3>5. Unexpected Discovery: The &#8220;Spec&#8221; as a Guardrail</h3><p>The most surprising insight was that Jules didn&#8217;t just follow the instructions&#8212;it used the spec file as a defense mechanism against bad code.</p><p>When I ran the &#8220;Lazy&#8221; experiment, Jules defaulted to the path of least resistance (loading data into memory). When I provided the &#8220;Brief,&#8221; Jules shifted behavior entirely. It didn&#8217;t just write code; it <strong>navigated the constraints</strong>.</p><p>This confirms a theory I touched on in <a href="https://www.nikmalykhin.com/p/can-we-make-ai-code-assistants-smarter?utm_source=publication-search">Can We Make AI Code Assistants Smarter by Asking Them to Write Their Own Rules?</a> The AI performs best not when it has &#8220;creative freedom,&#8221; but when it is boxed in by rigid technical constraints. The &#8220;Senior Engineer&#8221; input  wasn&#8217;t the code I wrote, but the boundaries I set.</p><h3>6. 
The Central Paradox</h3><p>This brings us to the Delegation Paradox:</p><p>To get an AI agent to work autonomously, you must micromanage the requirements.</p><p>If you want to be &#8220;lazy&#8221; during the implementation phase (execution), you must be hyper-active during the definition phase (specification). You cannot abdicate both.</p><ul><li><p><strong>Abdication</strong> (Vague prompt) -&gt; Requires heavy code review and refactoring later.</p></li><li><p><strong>Navigation</strong> (Detailed spec) -&gt; Requires heavy upfront thought, but produces near-production-ready code.</p></li></ul><p>We aren&#8217;t thinking <em>less</em> with AI; we are shifting <em>when</em> we think.</p><h3>7. Forward-Looking Conclusion</h3><p>Tools like Jules are shifting the developer&#8217;s role from &#8220;writer of code&#8221; to &#8220;architect of constraints.&#8221;</p><p>If you treat your AI agent like a magic wand that reads your mind, you will build technical debt at record speeds. But if you treat it like a talented but literal-minded junior developer who needs a solid brief, it becomes a powerful force multiplier.</p><p>The future of engineering isn&#8217;t about writing the perfect function; it&#8217;s about writing the perfect spec.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.nikmalykhin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading! 
Subscribe to get more practical guides on using GenAI tools effectively in software development work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Can We Skip TDD with Modern AI? A Context Experiment]]></title><description><![CDATA[The Hook]]></description><link>https://www.nikmalykhin.com/p/can-we-skip-tdd-with-modern-ai-a</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/can-we-skip-tdd-with-modern-ai-a</guid><dc:creator><![CDATA[Nik]]></dc:creator><pubDate>Tue, 09 Dec 2025 08:01:03 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Ojx!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8d27381-c618-42b7-a15f-62e1d625e22d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3>The Hook</h3><p>Recently, some colleagues pitched me an idea: &#8220;Today, LLMs are so powerful, you can start exactly from implementation and it will work well. No need to use TDD or other more complicated XP techniques&#8221;.</p><p>It is a tempting thought. If an AI can generate a complete feature in seconds, is my approach&#8212;always start from a test&#8212;still relevant?</p><p>I decided to put it to the test. I ran an experiment to see if I could implement a complex feature by describing the task and letting GenAI create the application. My hypothesis was that TDD is still vital, but I wanted to see if the &#8220;Just Do It&#8221; method could prove me wrong.</p><p>The result? 
I confirmed exactly what I expected: <strong>TDD is one of the best ways to create context for an LLM.</strong></p><h3>Personal Context &amp; Tools</h3><p>For this experiment, I returned to a project I started in a previous article: <a href="https://www.nikmalykhin.com/p/does-ai-need-clear-goals-my-experiment">&#8220;Does AI Need Clear Goals? My Experiment in Turning Vague Ideas into Code&#8221;</a>.</p><p>My tool of choice was <strong>GPT-4.1</strong> (via GitHub Copilot), utilizing its Agent mode to handle multi-file context. Usually, I treat the AI as a pair programmer, following structured collaboration methods I&#8217;ve discussed in <a href="https://www.nikmalykhin.com/p/pair-authoring-with-an-ai-a-case">&#8220;Pair-Authoring with an AI: A Case Study in Structured Collaboration&#8221;</a>.</p><p>But for this session, I acted as a &#8220;manager,&#8221; giving requirements and approving plans, but explicitly skipping the &#8220;Red&#8221; phase of TDD. I let the AI write the code first.</p><h3>The Failed Experiment</h3><p>The task was <strong>Story #2346</strong>: Implement a &#8220;Day of Week Pricing Plan&#8221;. The requirements were clear: users needed to compare power usage costs based on the day of the week and rank price plans accordingly.</p><p>I approved the AI&#8217;s plan and let it generate the implementation. Here is where the &#8220;No TDD&#8221; approach started to show its cracks.</p><p><strong>1. The &#8220;Ghost Method&#8221; Problem</strong> After the AI implemented the service layer, my IDE lit up with errors. The AI used a method <code>getDayOfWeekMultiplier(DayOfWeek)</code> that didn&#8217;t exist. It &#8220;hallucinated&#8221; a method on the domain object because it was writing the service in isolation. I am usually fine with &#8220;Red&#8221; code, but this wasn&#8217;t TDD &#8220;Red&#8221;&#8212;this was just broken code requiring immediate fixes.</p><p><strong>2. 
The Regression Nightmare</strong> When we fixed the missing method, we broke the existing logic.</p><blockquote><p>PricePlanTest &gt; shouldReceiveMultipleExceptionalDateTimes() FAILED</p></blockquote><p>Because we implemented the new logic <em>over</em> the old logic without a guiding test, the AI introduced regressions. We had to do several iterations just to get back to a baseline.</p><p><strong>3. The Context Disconnect</strong> The real struggle happened during Functional Testing. I asked the AI to verify the endpoints. It generated a test that tried to hit the API, but it returned a <strong>404 Not Found</strong>. Why? The AI created a test that queried a Smart Meter ID, but &#8220;it didn&#8217;t have a context!&#8221;. It forgot that in this application, a Smart Meter must be linked to a Price Plan via the <code>AccountService</code> first. The AI tried to guess the solution, attempting to call an API <code>/account/link/{smart-metter-id}</code> that didn&#8217;t even exist.</p><h3>Principles That Actually Work</h3><p>I eventually finished the task without TDD, but it required multiple rollbacks and context corrections. Through this struggle, I confirmed why TDD works:</p><p><strong>Principle 1: Tests Are Context Anchors</strong> The reason the AI failed the functional test setup was a lack of context. If I had written the test <em>first</em>, I would have been forced to set up the <code>AccountService</code> association immediately. The failing test provides the AI with a strict &#8220;Context Window&#8221; of what is required, as I explored in <a href="https://www.nikmalykhin.com/p/the-context-window-paradox-to-get?utm_source=publication-search">&#8220;The Context Window Paradox&#8221;</a>.</p><p><strong>Principle 2: Small Steps Prevent &#8220;Imagination&#8221;</strong> When the AI doesn&#8217;t have enough context, it tries to imagine the answer. TDD forces small, verifiable steps. 
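</p><p>To make that concrete, here is a minimal, self-contained sketch of such a step, using the &#8220;ghost method&#8221; from earlier (the weekend-multiplier rule and class layout are my assumptions, not code from the repository):</p><pre><code>import java.math.BigDecimal;
import java.time.DayOfWeek;

// Sketch: stating the expectation first anchors getDayOfWeekMultiplier
// to the domain object instead of letting it be "hallucinated" in a service.
public class PricePlanExpectation {

    static class PricePlan {
        BigDecimal getDayOfWeekMultiplier(DayOfWeek day) {
            // Assumed rule for this sketch only: weekends cost double
            return (day == DayOfWeek.SATURDAY || day == DayOfWeek.SUNDAY)
                    ? new BigDecimal("2") : BigDecimal.ONE;
        }
    }

    public static void main(String[] args) {
        PricePlan plan = new PricePlan();
        // The "red" step: these expectations are written before the method exists
        if (!plan.getDayOfWeekMultiplier(DayOfWeek.SATURDAY).equals(new BigDecimal("2")))
            throw new AssertionError("weekend multiplier");
        if (!plan.getDayOfWeekMultiplier(DayOfWeek.MONDAY).equals(BigDecimal.ONE))
            throw new AssertionError("weekday multiplier");
    }
}</code></pre><p>A test this small gives the model exactly one verifiable fact to satisfy at a time. 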
By skipping the test, I forced the AI to generate a large chunk of logic (Controller + Service) at once, increasing the surface area for hallucinations.</p><h3>Unexpected Discovery</h3><p>The most painful part of skipping TDD wasn&#8217;t the coding&#8212;it was the debugging.</p><p>When I finally added tests <em>after</em> the implementation to verify the logic, one failed with a confusing error:</p><blockquote><p>Expecting actual: {FRIDAY=[...]} to contain key: MONDAY</p></blockquote><p>This revealed a critical weakness of the &#8220;Test After&#8221; approach. When a test fails, you don&#8217;t know where the problem is: in the tests or in the business logic. It turned out to be an error in the test data (the date provided was a Friday, not a Monday). If I had written the test first, the AI would have generated the implementation <em>based</em> on that test data. We wouldn&#8217;t have had this problem at all.</p><h3>The Central Paradox</h3><p>We tend to think that as AI gets smarter, we can think less. I touched on this in <a href="https://www.nikmalykhin.com/p/can-we-think-less-with-ai?utm_source=publication-search">&#8220;Can We Think Less with AI?&#8221;</a>.</p><p>But this experiment confirmed a paradox: <strong>To move faster with AI, you must slow down enough to write the test.</strong></p><p>Can we avoid the loops of small context errors? Yes. TDD reduces complexity and creates trust between us and the AI. The test acts as a contract. Without it, you are just hoping the AI guesses your architectural constraints correctly.</p><h3>Forward-Looking Conclusion</h3><p>So, can we skip TDD? Yes, but you will spend more time adding additional context manually.</p><p>The power of TDD is approaching a new peak in the AI era: tests create a <strong>POWERFUL CONTEXT</strong> for LLMs. 
Modern models like GPT-4 are powerful, but a better LLM doesn&#8217;t remove the need for context.</p><p>If you want to get the most out of your AI teammate, don&#8217;t just ask it to write code. Give it a failing test.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.nikmalykhin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading! Subscribe to get more practical guides on using GenAI tools effectively in software development work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Does "Extract Method" Actually Hurt Your Readability?]]></title><description><![CDATA[We&#8217;ve all been there.]]></description><link>https://www.nikmalykhin.com/p/does-extract-method-actually-hurt</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/does-extract-method-actually-hurt</guid><dc:creator><![CDATA[Nik]]></dc:creator><pubDate>Tue, 25 Nov 2025 08:01:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Ojx!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8d27381-c618-42b7-a15f-62e1d625e22d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We&#8217;ve all been there. A feature starts simple, maybe 20 lines. But after three or four iterations, that same function has ballooned to 200 lines, a tangled mess of nested <code>if-else</code> blocks. 
</p><p>Does that reality sound familiar?</p><p>When faced with this, we have two main choices. One way is to file it as tech debt, a task we&#8217;ll never <em>really</em> get to because we will always have more urgent priorities from the business. The other way was shown in the foundational book, <strong><a href="https://martinfowler.com/books/refactoring.html">Refactoring by Martin Fowler, with Kent Beck</a></strong>. This path treats refactoring as a continuous action, not a tech debt item in the backlog.</p><p>But if we choose to refactor continuously, what does that <em>really</em> mean, and are our tools helping or hurting?</p><h3>My Context and the &#8220;Easy&#8221; Button</h3><p>Working in a <strong>Java/Kotlin</strong> environment, my tool of choice is <strong>IntelliJ IDEA</strong>. It&#8217;s an incredibly powerful IDE with a host of features designed to help.</p><p>When facing a 200-line monster method, the most obvious solution is right in the refactoring menu: <strong>&#8220;<a href="https://www.jetbrains.com/help/idea/extract-method.html">Extract Method</a>&#8221;</strong>. It seems perfect. It makes the original method smaller, which is exactly what I want.</p><p>Right?</p><h3>Introducing the Core Concept: Readability-Driven Refactoring</h3><p>The main goal of refactoring shouldn&#8217;t just be &#8220;smaller methods.&#8221; For me, the main goals are <strong>readability</strong> and, secondarily, <strong>decoupling</strong>.</p><p>In fact, readability is arguably more important than adhering to a specific architecture or design pattern. While good architecture often improves readability, it&#8217;s not its primary goal. If I have a choice between perfect pattern adherence and readability, I will prefer readability. Working on a typical web application, it&#8217;s readability that helps me daily when I look at different parts of the code.</p><p>This is where the simple &#8220;Extract Method&#8221; tool falls short. 
It often just moves the mess, failing to improve readability.</p><p>A more powerful <em>technique</em> for guiding this process is <strong>Test-Driven Development (TDD)</strong>. Instead of just extracting code, we use TDD to <em>describe our expectations</em> for the new, refactored code <em>before</em> we write it. This small shift in process fundamentally changes the quality of the refactoring.</p><h3>Practical Application: A TDD-Led Refactoring</h3><p>Let&#8217;s look at a practical example.</p><h4>The Problem Code</h4><p>Imagine we have this block of code in a method. It&#8217;s searching for properties, then mapping them to calculate Avios points, with error handling mixed in .</p><pre><code>summaries =
    shc
    .psSearch(
        startDate = startDate,
        nights = nights,
        hotelCodes = it,
        adults = adultsParam,
        children = childrenParam,
        infants = infantsParam,
    ).toTypedArray()
    .mapNotNull { tbh -&gt;
        kotlin
            .runCatching {
                aviosEarn = aviosAdapter.calculateAviosEarn(BigDecimal(tbh.summary!!.totalPrice!!))
                tbh.toAccommodationSummary(aviosEarn)
            }.onFailure { e -&gt;
                SASAdapter.Companion.log.warn("Skipping", e)
            }
            .getOrNull()
    }.toList()</code></pre><h4>Common Pitfall: The &#8216;Extract Method&#8217; Trap</h4><p>If we use the &#8220;Extract Method&#8221; feature in our IDE, we get this:</p><p><strong>Original method:</strong></p><pre><code>summaries = requestSummariesAndCalculateAviosEarn(startDate, nights, it, adultsParam, childrenParam, infantsParam)</code></pre><p><strong>New private method:</strong></p><pre><code>private fun requestSummariesAndCalculateAviosEarn(
    startDate: LocalDate,
    nights: Int,
    it: List&lt;String&gt;,
    adultsParam: String,
    childrenParam: String,
    infantsParam: String,
): List&lt;AccommodationSummary&gt; =
    shc
        .psSearch(
            startDate = startDate,
            nights = nights,
            hotelCodes = it,
            adults = adultsParam,
            children = childrenParam,
            infants = infantsParam,
        ).toTypedArray()
        .mapNotNull { tbh -&gt;
            calculateAviosEarnAndMapToAccommodationSummary(tbh)
        }.toList()

private fun calculateAviosEarnAndMapToAccommodationSummary(tbh: TBH): AccommodationSummary? {
    var aviosEarn: Int 
    return runCatching { 
        aviosEarn =
            aviosAdapter.calculateAviosEarn(BigDecimal(tbh.summary!!.totalPrice!!)) 
        tbh.toAccommodationSummary(aviosEarn) 
    }.onFailure { e -&gt; 
log.warn("Skipping", e) 
    }
        .getOrNull() 
}</code></pre><p>Is this good? Not exactly. It makes the original method smaller, but it doesn&#8217;t improve readability. We&#8217;ve just created a new private method that takes a <em>mess</em> of parameters.</p><h3>The Better Way: The TDD-Led Flow</h3><p>Instead of using the IDE tool, let&#8217;s use the TDD <em>technique</em>.</p><ol><li><p><strong>Describe Expectations:</strong> We start by writing a test for the logic we <em>want</em> to have. We don&#8217;t want to just test a private method; this logic feels like it belongs in its own service.</p></li><li><p><strong>Define the &#8220;To-Be&#8221; Service:</strong> We&#8217;ll create a test for a new <code>SummaryAdapter</code>. At first, this service is &#8220;red&#8221; (it doesn&#8217;t exist).</p></li><li><p><strong>Discover the Parameter Problem:</strong> As we write the test and describe the method we want to call, we see the problem clearly: it needs too many parameters.</p></li><li><p><strong>The Solution:</strong> The test itself shows us what we need. Instead of passing 6 individual parameters, we should pass a single <code>SearchCriteria</code> object. We define this object as an expectation of our test.</p></li><li><p><strong>Implement:</strong> We now implement the new service, moving the logic from the old method.</p></li></ol><p><strong>The Result:</strong></p><p>By extracting the logic to a new service and passing a parameter object, the original code now looks like this:</p><pre><code>summaries = SummaryAdapter.requestSummariesAndCalculateAviosEarn(searchCriteria, it)</code></pre><p>Did we improve readability? Yes. 
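</p><p>As an illustration (only the parameter names come from the code above; the type and adapter shape are a sketch of what the test drives out, not the actual project code):</p><pre><code>import java.time.LocalDate

// One value object replaces six loose parameters
data class SearchCriteria(
    val startDate: LocalDate,
    val nights: Int,
    val adults: String,
    val children: String,
    val infants: String,
)

// The signature the test describes for the new service:
// fun requestSummariesAndCalculateAviosEarn(criteria: SearchCriteria, hotelCodes: List&lt;String&gt;): List&lt;AccommodationSummary&gt;</code></pre><p>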
And not just because the method is smaller, but because we are no longer passing an excessive number of parameters, as we were with the simple &#8220;Extract Method&#8221;.</p><h3>A Technique Over a Tool</h3><p>IDE tools are wonderful, and techniques like TDD are powerful.</p><p>Of course, we <em>could</em> have used the IDE tools to change the method signature, create a new class, and move the method there. What the tool <em>can&#8217;t</em> do is help us understand what we want to do in the first place. We can&#8217;t describe our expectations to the tool.</p><p>TDD gives us that option: <strong>we describe our expectations before the work</strong>. This key difference is what truly changes the quality of our refactoring.</p><p>By knowing different techniques, we can understand when and which tool to use. Don&#8217;t let the tool lead the refactoring; let your <em>technique</em> guide the tool.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.nikmalykhin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading! Subscribe to get more practical guides on using GenAI tools effectively in software development work</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Does AI Need Clear Goals? 
My Experiment in Turning Vague Ideas into Code]]></title><description><![CDATA[We&#8217;re all told the same thing: AI needs clear, specific, and context-rich prompts to be useful.]]></description><link>https://www.nikmalykhin.com/p/does-ai-need-clear-goals-my-experiment</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/does-ai-need-clear-goals-my-experiment</guid><dc:creator><![CDATA[Nik]]></dc:creator><pubDate>Tue, 11 Nov 2025 08:00:50 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Ojx!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8d27381-c618-42b7-a15f-62e1d625e22d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We&#8217;re all told the same thing: AI needs clear, specific, and context-rich prompts to be useful. &#8220;Garbage in, garbage out.&#8221; This is especially true in engineering.</p><p>But what if your job isn&#8217;t to execute a clear task, but to <em>find</em> the task?</p><p>In my current work, we do a lot of research. Goals are not clear. We receive highly abstract, one-sentence ideas that need to be explored. This research is a necessary, messy process of discovery, and it&#8217;s full of &#8220;boilerplate&#8221; actions.</p><p>This got me thinking. We assume AI is for <em>execution</em>, but can we use it for <em>exploration</em>? 
What happens when you feed an AI a problem that you, the engineer, don&#8217;t even fully understand yet?</p><p>I ran an experiment to find out, starting with nothing but a single, vague sentence.</p><h3>My Setup: From Vague Idea to Boilerplate</h3><p>My goal was to see if I could use Generative AI to shepherd a &#8220;one-sentence idea&#8221; all the way to a foundational, runnable piece of code.</p><p>My toolkit was straightforward:</p><ul><li><p><strong>The Idea:</strong> A vague user story, &#8220;#2348: As an administrator I want to add a new tariff so that it can be advertised to users who may benefit&#8221;. This was perfect because it was so vague&#8212;what&#8217;s a &#8220;tariff&#8221;? How is it &#8220;advertised&#8221;?</p></li><li><p><strong>The &#8220;Analyst&#8221; AI:</strong> I used <strong>Gemini 2.5 Pro</strong> to act as a Product Owner and flesh out this vague idea.</p></li><li><p><strong>The &#8220;Developer&#8221; AI:</strong> I then used <strong>GitHub Copilot (GPT-4.1)</strong> in <strong>IntelliJ</strong> to write the boilerplate code.</p></li><li><p><strong>The Project:</strong> All this was done in the context of the Thoughtworks &#8220;<a href="https://www.thoughtworks.com/en-es/insights/blog/careers-at-thoughtworks/joi_application_process">Joy of Energy</a>&#8221; project, a Java Spring Boot application.</p></li></ul><p>The plan was a two-part workflow:</p><ol><li><p><strong>Part 1: AI as Business Analyst.</strong> Feed the vague story to Gemini and ask it to define the requirement.</p></li><li><p><strong>Part 2: AI as Boilerplate Generator.</strong> Feed the <em>AI-generated spec</em> to Copilot and ask it to write the code.</p></li></ol><h3>The Failed Experiment (That Was Actually a Success)</h3><p>My first attempts were a perfect illustration of the &#8220;AI is context-blind&#8221; problem. 
The &#8220;failure&#8221; wasn&#8217;t that the AI was useless; it&#8217;s that its first drafts were wrong in very specific, instructive ways.</p><p><strong>Failure 1: The AI &#8220;Product Owner&#8221; Became a Tech Lead</strong> I asked Gemini to act as a Product Owner and flesh out the story. It made a &#8220;very popular mistake&#8221;: it skipped the &#8220;what&#8221; and &#8220;why&#8221; and jumped straight to the &#8220;how.&#8221;</p><p>The <em>very first draft</em> of the spec it gave me wasn&#8217;t a user story; it was a technical task. It immediately suggested a <code>JPA @Entity</code> and defined fields like <code>id</code> as a <code>UUID</code>. It was already designing the database schema.</p><p>This is exactly what you <em>don&#8217;t</em> want from a user story, and it&#8217;s a common trap where the AI tries to be the engineer, not the analyst. As I&#8217;ve written before, the AI&#8217;s job is to reflect our needs, not just give us a technical answer (you can read more on that idea here: <a href="https://www.nikmalykhin.com/p/how-genai-helps-engineers-write-better">How GenAI Helps Engineers Write Better</a>).</p><p>I had to intervene, critique the output, and explicitly ask it to &#8220;Change database to more abstract system&#8221; to get the clean, implementation-agnostic user story and Acceptance Criteria (ACs) I actually needed.</p><p><strong>Failure 2: The AI &#8220;Developer&#8221; Was a Clumsy New Hire</strong> After I had a clean spec, I gave it to GitHub Copilot with a clear prompt: generate a POJO, an in-memory Service, and a Controller.</p><p>The code it generated was not &#8220;copy-paste and run&#8221;.</p><ul><li><p><strong>Wrong Package Structure:</strong> It invented a &#8220;by-feature&#8221; package structure (<code>com.joi.energy.tariff</code>). My project uses a &#8220;by-layer&#8221; structure (<code>uk.tw.energy.domain</code>, <code>uk.tw.energy.service</code>, etc.).</p></li><li><p><strong>Missing Dependencies:</strong> It correctly suggested using <code>jakarta.validation</code> annotations&#8212;a great idea!&#8212;but my project didn&#8217;t have that dependency.</p></li><li><p><strong>Minor (Human) Errors:</strong> It even forgot the <code>@Service</code> annotation on the <code>TariffService</code>, a simple mistake I&#8217;ve made myself a dozen times.</p></li></ul><p>If I were a junior engineer, I would have been blocked or, worse, just pasted it all in, breaking the project&#8217;s architecture.</p><h3>Principles That Actually Work</h3><p>These &#8220;failures&#8221; led me to the real principles of using AI for this kind of work.</p><p><strong>1. The AI is a &#8220;Demultiplicator,&#8221; Not a Supercharger</strong> This was my single most important insight. A supercharger just makes the engine spin <em>faster</em>. A demultiplicator (like a reduction gear) <em>changes the nature</em> of the work, trading raw speed for torque.</p><p>The AI is a demultiplicator for my brain.</p><p>When I was iterating on the user story, I didn&#8217;t think about &#8220;how to write these words or if it sounds good&#8221;. I was 100% focused on the <em>business goals</em>. The AI handled the <em>typing</em>, and I handled the <em>validating</em>. This is a profound shift. It took me 30 minutes to get a solid user story, not because I typed fast, but because I <em>thought</em> fast, using the AI&#8217;s draft as a disposable starting point.</p><p><strong>2. The Engineer&#8217;s New Job: Strategist and Context-Provider</strong> The AI&#8217;s mistakes weren&#8217;t stupid; they were <em>context-blind</em>. This reveals the engineer&#8217;s true role in an AI-augmented workflow: we are the &#8220;Reviewer and Strategist&#8221;.</p><p>My job wasn&#8217;t to write getters and setters. My job was to make two high-level strategic decisions:</p><ol><li><p>&#8220;The AI is right, <code>jakarta.validation</code> is a good idea. 
I will add that dependency&#8221;.</p></li><li><p>&#8220;The AI is wrong about the package structure. I will correct it to follow our existing pattern&#8221;.</p></li></ol><p>The AI&#8217;s &#8220;flawed&#8221; draft actually <em>forced</em> me to think strategically about my project&#8217;s architecture and dependencies.</p><p><strong>3. Embrace the &#8220;90% Win&#8221; and the Iterative Loop</strong> The AI&#8217;s output doesn&#8217;t need to be 100% perfect to be valuable. The boilerplate it generated, despite its flaws, was a &#8220;90% win&#8221;. It saved me from the &#8220;boring boilerplate&#8221; and the hours I would have spent on Stack Overflow as a junior engineer.</p><p>More importantly, the AI&#8217;s <em>mistakes</em> are part of the value. That wrong package structure? It&#8217;s a great &#8220;recommendation for reorganizing your project&#8221; and a perfect topic to bring to a team huddle.</p><h3>My Unexpected Discovery: &#8220;1:0 to AI&#8221;</h3><p>The most surprising moment came during the boilerplate generation. I asked for <em>three</em> files (POJO, Service, Controller). The AI gave me <em>four</em>.</p><p>It proactively and correctly created a <code>TariffType.java</code> Enum (<code>FLAT_RATE</code>, <code>TIME_OF_USE</code>).</p><p>This was a perfect &#8220;micro-improvement&#8221;. I called it &#8220;1:0 to AI&#8221;. I was so focused on the &#8220;big picture&#8221; of the architecture that I missed this small, obvious detail. This &#8220;separating of responsibilities&#8221; is incredibly powerful: the AI handles the small details while I focus on the larger strategic goals.</p><h3>The Central Paradox: AI&#8217;s Flaws Are Its Greatest Strength</h3><p>This leads to the central paradox: <strong>The AI is terrible at handling vague, abstract ideas... and yet, it&#8217;s the best tool I have for the job.</strong></p><p>Why? Because its value isn&#8217;t in <em>giving you the right answer</em>. 
Its value is in its ability to <em>instantly turn a &#8220;blank page&#8221; into a flawed, tangible draft that you can critique</em>.</p><p>The AI&#8217;s initial, flawed responses&#8212;the over-technical user story, the context-blind package structure&#8212;are its most valuable feature. They act as a mirror, forcing the engineer to <em>define</em> the context and <em>make</em> the strategic decisions. It can&#8217;t read your mind, so it forces you to figure out what&#8217;s in it.</p><p>Effective use doesn&#8217;t require a perfect prompt. It requires an engineer to stop acting like a <em>typist</em> and start acting like an <em>editor, a critic, and a strategist</em>.</p><h3>Conclusion: From Vague to Validated</h3><p>The AI didn&#8217;t <em>solve</em> my vague problem. It gave me the tools to solve it myself, faster and at a higher level of abstraction.</p><p>By delegating the &#8220;boring boilerplate code&#8221;, I was able to stay focused on the &#8220;big picture&#8221; and &#8220;business needs&#8221;. This workflow is a powerful way to accelerate research, allowing us to build, test, and throw away foundational ideas at a speed we couldn&#8217;t before.</p><p>The AI isn&#8217;t here to replace us. It&#8217;s here to take the routine work and free us to focus on the hard parts. It&#8217;s a &#8220;demultiplicator&#8221; that gives us the torque to move from a one-sentence idea to a validated, runnable foundation&#8212;flaws and all.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.nikmalykhin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading! 
Subscribe to get more practical guides on using GenAI tools effectively in software development work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[What If the ‘Cleanest’ Code Is the Wrong Solution?]]></title><description><![CDATA[In our continuing experiment with Trio Programming&#8212;two engineers and an AI&#8212;we decided to level up.]]></description><link>https://www.nikmalykhin.com/p/what-if-the-cleanest-code-is-the</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/what-if-the-cleanest-code-is-the</guid><dc:creator><![CDATA[Nik]]></dc:creator><pubDate>Tue, 28 Oct 2025 08:00:38 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Ojx!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8d27381-c618-42b7-a15f-62e1d625e22d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In our continuing experiment with <strong>Trio Programming</strong>&#8212;two engineers and an AI&#8212;we decided to level up. Our first session was a slow, painful grind of fixing our environment. This time, with a stable foundation, we aimed for speed. Our new strategy: write comprehensive tests ourselves, then give the AI the freedom to implement the solution in one big step.</p><p>The initial results were promising. The AI produced working code that passed our tests. But then, our instincts as seasoned developers kicked in. We saw the AI&#8217;s implementation&#8212;a simple <code>Map&lt;String, Object&gt;</code>&#8212;and reflexively identified it as a &#8220;code smell&#8221;. 
We spent the next hour trying to refactor it into a &#8220;cleaner,&#8221; more object-oriented design using the Composite pattern.</p><p>That&#8217;s when we fell into a trap. Our pursuit of clean code was leading us toward a solution that was elegant, sophisticated, and completely wrong. This led us to our second major discovery: <strong>In AI-augmented development, the biggest risk isn&#8217;t bad AI code, but good human intuition applied to the wrong problem.</strong></p><div><hr></div><h3>Our Setup: Aiming for a Bigger Step</h3><p>Our team remained the same: I (Nik) acted as the driver for <strong>GitHub Copilot</strong>, while Javier served as the strategic navigator. Having stabilized our Java, Spring Boot, and Gradle environment in the last session, we were ready to test a new hypothesis: if we write strong, expectation-focused tests, we can trust the AI with a larger implementation scope and move much faster.</p><p>The flow was simple:</p><ol><li><p>Human engineers write a small, focused test with clear assertions.</p></li><li><p>Let the AI generate the implementation code in a single, larger step to make the test pass.</p></li><li><p>Trust the tests to validate the AI&#8217;s work, rather than meticulously reviewing every line of generated code.</p></li></ol><h3>The Failed Experiment: Refactoring into a Corner</h3><p>The first part of the experiment worked. We added two tests for our hierarchy API, one for a root-only employee and one for a simple employee-supervisor relationship. We then prompted the AI: &#8220;tests looks good, let&#8217;s make postHierarchy method for passing all of them&#8221;.</p><p>The AI&#8217;s implementation worked, save for one minor edge case we quickly fixed. But we weren&#8217;t satisfied. 
The code returned a <code>Map&lt;String, Object&gt;</code>, and our developer brains screamed for type safety and better design.</p><ol><li><p><strong>The &#8220;Code Smell&#8221; Diagnosis:</strong> We prompted the AI with our concern: &#8220;maybe, response object will make the readability of the code better and will reduce smell of code?&#8221;. This initiated a refactoring plan to introduce a dedicated <code>HierarchyNode</code> class.</p></li><li><p><strong>Applying a Design Pattern:</strong> We pushed further, suggesting a more formal structure: &#8220;maybe we can apply composite pattern... to our response?&#8221;. The goal was to create a pure, object-oriented hierarchy and eliminate the <code>Map</code> entirely.</p></li><li><p><strong>The Collision with Reality:</strong> Our final prompt revealed the fatal flaw in our logic: &#8220;can we avoid to use Map if we will use Spring Boot which we have in our project?&#8221;.</p></li></ol><p>The AI&#8217;s response was the turning point. It patiently explained that given our requirement for dynamic JSON keys (e.g., <code>&#8220;Jonas&#8221;: { &#8220;Sophie&#8221;: ... }</code>), a <code>Map</code> or a structure that serializes like one was <strong>unavoidable</strong> with Spring Boot and its default Jackson serializer.</p><p>We had spent a significant part of our session chasing an elegant design that was fundamentally incompatible with the constraints of our framework and the explicit requirements of the kata. As I noted in my log, &#8220;we spend time trying to add something not workable to the code&#8221;. 
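To make the constraint concrete: in the required JSON, every employee name is a key, not a field value, so no class with fixed properties can model it, while a nested map can. A minimal sketch in Java (the extra &#8220;Nick&#8221; level and the hand-rolled renderer, standing in for Jackson&#8217;s default Map serialization, are illustrative assumptions):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HierarchySketch {
    // Build {"Jonas": {"Sophie": {"Nick": {}}}} - the names are JSON *keys*,
    // so a class with fixed fields cannot model them; a nested Map can.
    static Map<String, Object> hierarchy() {
        Map<String, Object> nick = new LinkedHashMap<>();
        Map<String, Object> sophie = new LinkedHashMap<>();
        sophie.put("Nick", nick);
        Map<String, Object> jonas = new LinkedHashMap<>();
        jonas.put("Sophie", sophie);
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("Jonas", jonas);
        return root;
    }

    // Minimal renderer standing in for Jackson's default Map serialization.
    @SuppressWarnings("unchecked")
    static String toJson(Map<String, Object> node) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : node.entrySet()) {
            if (!first) sb.append(",");
            first = false;
            sb.append("\"").append(e.getKey()).append("\":")
              .append(toJson((Map<String, Object>) e.getValue()));
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        System.out.println(toJson(hierarchy())); // {"Jonas":{"Sophie":{"Nick":{}}}}
    }
}
```

Any Composite-style class would still have to serialize back into exactly this Map-like shape.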
The AI&#8217;s initial, simpler solution wasn&#8217;t a code smell; it was the correct, pragmatic solution from the start.</p><div><hr></div><h3>Principles That Actually Work</h3><p>This humbling experience confirmed our new hypothesis and revealed principles for a more effective human-AI workflow.</p><ol><li><p><strong>Focus on &#8220;What,&#8221; Not &#8220;How&#8221; (Test-Focused Development).</strong> Our initial strategy was correct. The most valuable role for the human developers is to define the <em>behavior</em> of the system through precise, comprehensive tests. When we focused on the expected JSON output, the AI produced correct code. When we focused on our preconceived notions of &#8220;good&#8221; internal implementation, we wasted time. The tests are the contract; the AI&#8217;s job is to fulfill it.</p></li><li><p><strong>The AI is a Mirror for System Constraints.</strong> The AI is more than a code generator; it&#8217;s an interactive expert on the toolchain. It didn&#8217;t just reject our idea; it explained <em>why</em> it wouldn&#8217;t work within the Spring Boot ecosystem. This prevented us from going further down a dead-end path. Use the AI not just to write code, but to validate your architectural assumptions against the framework&#8217;s reality.</p></li><li><p><strong>Codify Your Learnings into the System.</strong> A failed experiment is only a waste if you don&#8217;t learn from it. The most productive outcome of our refactoring dead-end was updating our <code>.github/copilot-instructions.md</code> file. We added an explicit refactoring protocol and guidance on when to challenge the AI&#8217;s use of patterns versus accepting framework constraints. This turns a session&#8217;s lesson into a permanent upgrade for the trio&#8217;s workflow.</p></li></ol><h3>Unexpected Discovery: AI Generalizes from Specifics</h3><p>After our refactoring detour, we returned to our Test-Focused workflow. 
We added much more complex tests, including one with multiple employees reporting to the same supervisor and another with a full four-level hierarchy.</p><p>The surprising part? <strong>The AI&#8217;s existing implementation passed these complex tests without any modifications</strong>. This revealed a powerful insight: the AI is remarkably good at generalizing a solution. It needed a few simple, specific test cases to establish the core logic. Once that logic was in place, it was robust enough to handle more complex scenarios automatically. The &#8220;big step&#8221; works, but it needs to be built on a foundation of small, clear examples.</p><h3>The Central Paradox of AI-Driven Speed</h3><p>This leads to the central paradox we uncovered in this session: <strong>To move faster with big, AI-generated implementation steps, you must first slow down and write smaller, more precise human-guided tests.</strong></p><p>Our desire for speed was not at odds with the discipline of TDD; it was enabled by it. The quality of the AI&#8217;s large-scale contribution was directly proportional to the quality of the small-scale expectations we defined. You cannot achieve reliable speed by simply telling the AI &#8220;build this feature.&#8221; You achieve it by saying &#8220;build something that satisfies these very specific, verifiable behaviors.&#8221;</p><h3>Conclusion: We Are Architects of Behavior, Not Just Code</h3><p>Our second session was a success, but not because we wrote code faster. It was a success because we learned how to trust our tests more than our own implementation habits. The &#8220;Test-Focused Development&#8221; rhythm&#8212;small tests by humans, big implementation by AI&#8212;feels right.</p><p>The dynamic is shifting. Our job is becoming less about crafting the perfect implementation and more about architecting the perfect set of expectations. 
We define the contract with rigorous tests, and the AI, our tireless third programmer, finds the most direct way to fulfill it&#8212;even if it&#8217;s not the way we would have written it ourselves.</p>]]></content:encoded></item><item><title><![CDATA[Does an AI Teammate Mean You Write Less Code?]]></title><description><![CDATA[We embarked on an experiment called Trio Programming: two engineers and an AI assistant building software together.]]></description><link>https://www.nikmalykhin.com/p/does-an-ai-teammate-mean-you-write</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/does-an-ai-teammate-mean-you-write</guid><dc:creator><![CDATA[Nik]]></dc:creator><pubDate>Tue, 14 Oct 2025 07:00:37 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Ojx!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8d27381-c618-42b7-a15f-62e1d625e22d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We embarked on an experiment called <strong>Trio Programming</strong>: two engineers and an AI assistant building software together. Our goal was to discover effective workflows for this new dynamic. We started with a simple code kata, a clear set of rules for our AI, and a straightforward tech stack. Our assumption was that with a powerful AI coder, we&#8217;d move through the logic faster than ever.</p><p>Instead, we spent almost the entire session without writing a single line of business logic. The AI wrote plenty of code, but it was all in service of fixing a development environment that kept breaking. 
This led us to a counterintuitive conclusion: <strong>adding an AI to the team doesn&#8217;t accelerate feature development; it brutally exposes foundational weaknesses in your environment and workflow.</strong></p><h3>Our Setup: An Experiment in Trio Programming</h3><p>Our team consisted of myself (Nik) acting as the &#8220;driver&#8221;&#8212;the one interacting directly with the AI&#8212;and my colleague Javier as the &#8220;navigator,&#8221; providing high-level direction and quality control. Our third programmer was <strong>GitHub Copilot</strong>, guided by a detailed set of custom instructions emphasizing a strict Test-Driven Development (TDD) cycle, small incremental changes, and explicit permissions before writing any code.</p><p>The plan was to tackle the &#8220;Hierarchy Kata&#8221;&#8212;a REST API for managing an employee hierarchy&#8212;using a pure stack: <strong>Core Java</strong>, <strong>JUnit 5</strong>, and <strong>Gradle</strong>. We wanted to keep things simple and avoid framework magic.</p><h3>The Experiment That Failed: A Cascade of Configuration Errors</h3><p>Our first mistake was idealism. We started with Core Java to avoid frameworks, but quickly realized the sheer amount of boilerplate needed for a simple REST endpoint was distracting us from the actual kata. We pivoted.</p><p>&#8220;Let&#8217;s delegate that work to Spring,&#8221; we decided, thinking it would get us back on track. This is where the real trouble began. Our session devolved into a frustrating, iterative battle with our own setup, guided by an AI that was helpful but lacked strategic oversight.</p><ol><li><p><strong>Missing Dependencies:</strong> We asked Copilot to generate a test for a Spring Boot controller. It correctly produced a test using <code>@WebMvcTest</code> and <code>MockMvc</code>. But when we ran <code>./gradlew build</code>, the build failed spectacularly with dozens of <code>cannot find symbol</code> and <code>package does not exist</code> errors. 
Our <code>build.gradle</code> file had JUnit, but none of the required Spring Boot test dependencies.</p></li><li><p><strong>Incorrect Dependency Configuration:</strong> We then asked Copilot to fix our Gradle file. It suggested adding the Spring dependencies, but the first attempt failed because we hadn&#8217;t defined a version number, leading to a <code>Could not find org.springframework.boot:spring-boot-starter-web:.</code> error. The next fix involved adding the dependencies to the <code>subprojects</code> block in our root <code>build.gradle</code>, as they weren&#8217;t being inherited by the kata&#8217;s module. Each step was a tiny, painful discovery.</p></li><li><p><strong>Classpath and Package Structure Hell:</strong> After fixing the build file, the errors persisted. The problem? Our test file, <code>HelloWorldControllerTest.java</code>, was in <code>src/main/java</code> instead of <code>src/test/java</code>. The test dependencies weren&#8217;t on the main classpath. Once we moved it, we hit yet another wall: <code>Unable to find a @SpringBootConfiguration</code>. Our test in the <code>com.kata.hierarchy</code> package couldn&#8217;t find the main application class located in <code>com.example.helloworld</code> because of how Spring&#8217;s component scanning works.</p></li></ol><p>The entire session was a cycle of: ask for code, watch the build fail, feed the error log back to the AI, and apply the suggested micro-fix. We weren&#8217;t programming; we were performing highly-structured, AI-assisted debugging on our own environment.</p><div><hr></div><h3>Principles That Actually Work</h3><p>This frustrating experience revealed three principles that are critical for effective AI-augmented development.</p><ol><li><p><strong>The Environment is Non-Negotiable.</strong> An unstable or poorly understood development environment will completely derail any attempt at Trio Programming. 
The AI can suggest fixes, but it can&#8217;t reason about your setup holistically. Before you can ask an AI to write a feature, the entire team&#8212;humans and AI&#8212;must operate on a rock-solid foundation where builds, tests, and dependencies are flawless.</p></li><li><p><strong>Human Navigation is Paramount.</strong> The session would have been a total failure without a human navigator. Javier&#8217;s role was crucial for steering the ship. He spotted issues in prompts, provided strategic direction (&#8220;let&#8217;s put it in a new package&#8221;), and kept the focus on the larger goal while I was in the weeds prompting the AI. As I noted in my log, &#8220;Speak, not only think - it&#8217;s a very strong pattern&#8221;. The AI is a powerful tool, but it needs a human strategist to be effective.</p></li><li><p><strong>Treat the AI as a System, Not Just a Coder.</strong> We started by giving the AI rules for writing code (TDD, small steps). But the real value came from using it as a diagnostic tool for a complex system that included our code, our build tool, and our framework. The prompts that worked best weren&#8217;t &#8220;implement this feature,&#8221; but rather &#8220;here is an error log, diagnose the problem and propose a minimal fix&#8221;.</p></li></ol><div><hr></div><h3>The Unexpected Discovery: The AI Reshapes Human Roles</h3><p>The most surprising insight was how the AI&#8217;s presence changed our own roles. My job as the &#8220;driver&#8221; became less about writing code and more about <strong>prompt engineering and AI flow control</strong>. I was focused on translating our navigator&#8217;s intent into precise instructions and context for the AI.</p><p>Javier&#8217;s &#8220;navigator&#8221; role expanded from guiding the code&#8217;s logic to <strong>managing the overall strategy and quality-controlling both my prompts and the AI&#8217;s output</strong>. This division of labor was incredibly effective. 
Having one person focused on the high-level goal while the other managed the human-AI interface prevented us from getting stuck. The AI didn&#8217;t just add a third programmer; it created a new, more specialized dynamic between the two human programmers.</p><h3>The Central Paradox of AI Collaboration</h3><p>Herein lies the paradox: <strong>The goal of using an AI is to abstract away complexity, but its immediate effect is to surface hidden complexities you&#8217;ve been ignoring.</strong></p><p>We thought we had a working Java setup. But the AI, by trying to follow our commands precisely and rapidly, immediately ran into every single flaw in our Gradle configuration and package structure. A human programmer might have found these issues slowly over time. The AI found them all at once, forcing a full stop.</p><p>Effective use of an AI programmer therefore requires:</p><ul><li><p>An <strong>impeccably configured and automated</strong> development environment.</p></li><li><p><strong>Deep human expertise</strong> in the underlying tools (Gradle, Spring), as the AI&#8217;s suggestions still need validation.</p></li><li><p>A workflow where humans provide <strong>strategic intent</strong>, not just tactical instructions.</p></li></ul><h3>Conclusion: Build Your Pipeline Before You Start the Assembly Line</h3><p>Our first Trio Programming session felt slow and, at times, unproductive. We wanted to build an API, but we ended up building a robust, multi-module Spring Boot Gradle configuration. But as Javier aptly put it, this process is like building a good CI/CD pipeline: it &#8220;reduces the price of mistakes&#8221; and gives you the confidence &#8220;to move forward faster&#8221;.</p><p>The lesson is clear. You can&#8217;t just drop an AI into an existing workflow and expect a productivity boost. You must first use the AI to stress-test and harden your foundations. 
The initial time investment is not spent on writing features, but on creating an environment so solid that the AI can finally be unleashed on the work you actually want it to do. We ended the day in a much safer, more robust place, ready for the real work to begin.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.nikmalykhin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading! Subscribe to get more practical guides on using GenAI tools effectively in software development work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[From Buzzword to Practical Tool: A Developer's Guide to Generative AI]]></title><description><![CDATA[It seems like every week there&#8217;s a new AI tool that promises to change everything.]]></description><link>https://www.nikmalykhin.com/p/from-buzzword-to-practical-tool-a</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/from-buzzword-to-practical-tool-a</guid><dc:creator><![CDATA[Nik]]></dc:creator><pubDate>Mon, 29 Sep 2025 09:29:08 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!NqMm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ed9aaf9-db9b-4fdc-9a53-44e4d55b72dc_915x337.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>It seems like every week there&#8217;s a new AI tool that promises to change everything. The hype is impossible to ignore. 
But behind the marketing, what are we actually dealing with? How do Large Language Models (LLMs) really work, and more importantly, what are the practical limitations that we, as developers, QAs, and analysts, need to understand to use them effectively and responsibly?</p><p>This article cuts through the noise to explain the core mechanics of Generative AI. We'll explore how these models "think," where they fail, and provide a set of practical heuristics for applying them in our work.</p><div><hr></div><h3>The 'Before' State: From Hard-Coded Rules to Learned Patterns</h3><p>To understand today's Generative AI, we have to look at its conceptual ancestors. The original dream of <strong>Artificial Intelligence</strong> (beginning in the 1950s) was about logic and explicit rules. The idea was to encode expert knowledge into a series of <code>IF &lt;condition&gt; THEN &lt;action&gt;</code> statements. This approach is far from obsolete; it&#8217;s still the backbone of many complex systems.</p><p>For example, I previously worked on a phishing detection team at a <strong>global cybersecurity company</strong>, where our core detection engine was a sophisticated, rule-based AI. We analyzed an email&#8217;s characteristics, and if the combined weighted score of all rules triggered a threshold, we marked it as malicious. That was our production system.</p><p>The first major evolution of this paradigm was <strong>Machine Learning</strong> (ML), which gained traction in the 1980s. Instead of engineers hand-crafting every rule, we could feed a system massive amounts of data and let it discover the patterns on its own. We don't tell a spam filter every possible suspicious word; we show it thousands of examples, and it <em>learns</em> the statistical characteristics of spam. These two ideas&#8212;rules and learning&#8212;are often used together. 
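The weighted-score mechanism described above can be sketched in a few lines; the rules, weights, and threshold below are invented for illustration, not the real detection engine:

```java
import java.util.List;
import java.util.function.Predicate;

public class RuleEngineSketch {
    // A rule: IF <condition> THEN contribute <weight> to the score.
    record Rule(String name, double weight, Predicate<String> condition) {}

    // Invented example rules - a production engine has many hundreds.
    static final List<Rule> RULES = List.of(
            new Rule("urgent-language", 0.4, email -> email.contains("urgent")),
            new Rule("suspicious-link", 0.5, email -> email.contains("http://")),
            new Rule("asks-for-password", 0.6, email -> email.contains("password"))
    );

    // Sum the weights of every rule that fires; flag if over the threshold.
    static boolean isMalicious(String email, double threshold) {
        double score = RULES.stream()
                .filter(r -> r.condition().test(email))
                .mapToDouble(Rule::weight)
                .sum();
        return score >= threshold;
    }

    public static void main(String[] args) {
        System.out.println(isMalicious("urgent: confirm your password", 0.8)); // true
        System.out.println(isMalicious("lunch at noon?", 0.8));                // false
    }
}
```

The ML layer mentioned next would, in effect, learn new rules and weights from labeled examples instead of waiting for an engineer to hand-write them.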
Our plan at the cybersecurity company was to layer ML on top of our rule engine to automatically spot new threats, rather than waiting for an engineer to write a new rule.</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!NqMm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ed9aaf9-db9b-4fdc-9a53-44e4d55b72dc_915x337.png" width="915" height="337" alt=""></figure></div><div><hr></div><h3>Introducing the Core Concept: Generative AI</h3><p>The next leap came with <strong>Deep Learning</strong> in the 2010s, a type of ML that uses complex, multi-layered "neural networks" to find incredibly subtle patterns in data. This is the technology that powered the huge advances we saw in image recognition and speech-to-text.</p><p>That brings us to today's breakthrough: <strong>Generative AI</strong>.</p><ul><li><p><strong>What is it?</strong> Generative AI takes powerful Deep Learning models and flips their function. Instead of just <em>recognizing</em> patterns (e.g., "this image contains a cat"), it uses its understanding of those patterns to <em><strong>create</strong></em> new, original content (e.g., "generate a picture of a cat").
Large Language Models are a prime example of this capability.</p></li><li><p><strong>Why does it matter?</strong> The impact is massive because an estimated 80% of the world's data is unstructured text&#8212;emails, documents, support tickets, etc. LLMs are the first technology that can both process and generate human language at scale, creating a new human-computer interface where we can use <strong>natural language to express intent</strong>.</p></li><li><p><strong>How does it work?</strong> At its core, an LLM is a sophisticated pattern-matching machine built on a technology called the <strong>Transformer architecture</strong>. Its fundamental job is surprisingly simple: <strong>to predict the most statistically probable next word in a sequence</strong>. It's essentially a very powerful autocomplete. To do this, it relies on two key concepts:</p><ol><li><p><strong>Tokens</strong>: The model doesn't see words; it sees "tokens". Text is broken down into these building blocks&#8212;which can be words, parts of words, or punctuation. For example, <code>Generative AI is powerful</code> might become <code>["Gener", "ative", " AI", " is", " powerful"]</code>. A model's limits and API costs are all measured in tokens.</p></li><li><p><strong>The Context Window</strong>: This is the model's short-term memory. LLMs are <strong>stateless</strong>; they don't truly "remember" past conversations. With each prompt, the application sends the <em>entire conversation history</em> back to the model. This entire block of text must fit within the context window, which has a fixed token limit (e.g., 8k or 128k). 
If a conversation gets too long, the oldest messages are dropped, which is why the model seems to "forget" what was said earlier.</p></li></ol></li></ul><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!WrMT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a82ca8-6b77-4f6f-b912-af406d66209c_547x325.png" width="547" height="325" alt=""></figure></div><div><hr></div><h3>Practical Applications: Choosing the Right Tool for the Job</h3><p>Different LLMs are trained with different goals, giving them unique strengths. Choosing the right one is a key engineering decision. The following list is not exhaustive, but it reflects the tools my team and I rely on for our daily work.</p><p>The main model families you'll encounter are:</p><ol><li><p><strong>OpenAI's GPT Series (</strong><code>GPT-4o</code><strong>, etc.)</strong>: Best known as a powerful all-rounder, excelling at tasks requiring strong <strong>logical reasoning</strong> and complex <strong>code generation</strong>.
This is often the go-to for debugging a tricky algorithm or scaffolding a new service.</p></li><li><p><strong>Anthropic's Claude Series (</strong><code>Claude 3.5 Sonnet</code><strong>, etc.)</strong>: Built with a heavy emphasis on safety and "Constitutional AI". Claude often produces more careful, <strong>nuanced writing</strong> and is a great choice for tasks like drafting detailed technical documentation or analyzing sensitive user feedback where tone and safety are paramount.</p></li><li><p><strong>Google's Gemini Series (</strong><code>Gemini 1.5 Pro &amp; Flash</code><strong>)</strong>: This family offers a trade-off. <strong>Gemini Pro</strong> is the high-power version focused on top-tier reasoning and advanced multi-modal capabilities. Its sibling, <strong>Gemini Flash</strong>, is optimized for speed and cost-efficiency, making it ideal for high-volume, lower-complexity tasks like chatbots or data extraction where low latency is critical.</p></li></ol><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!L3M0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8c86c05-5966-4ec1-bc4f-1fbed57c2126_1176x332.png" width="1176" height="332" alt=""></figure></div><div><hr></div><h3>Common Pitfalls &amp; Misconceptions</h3><p>The architecture of LLMs leads to two fundamental limitations that every user must understand.</p><h4>1. Hallucinations: Plausible vs. Truthful</h4><p>Because an LLM's only job is to predict the next most probable token, it is optimized to generate text that is <strong>plausible</strong>, not text that is factually <strong>true</strong>. It has no internal knowledge base or concept of truth. If you ask it to find sources for a claim, it will generate a list of references that <em>looks</em> perfect&#8212;with authors, titles, and journals that fit the pattern&#8212;but the sources themselves may be completely fabricated.</p><p><strong>How to avoid it</strong>: Be <strong>professionally skeptical</strong>. Treat all outputs as a first draft. Always verify facts, test all code, and check any sources it provides.</p><h4>2. The Black Box Problem: Why vs. What</h4><p>We can make an LLM's output <strong>deterministic</strong> by setting a parameter called "temperature" to zero, meaning it will give the same output for the same input every time. So we can see <em>what</em> it did. However, we can't see <em>why</em> it chose one token over another in a way that is humanly understandable. The decision is a result of calculations across billions of parameters, not a logical decision tree we can audit.</p><p><strong>Why it matters</strong>: This makes it nearly impossible to debug why a model gives a strange answer. In high-stakes domains like finance or healthcare, it's difficult to trust a system when there is no transparent reasoning path.</p><div><hr></div><h3>Core Trade-offs: Free vs. Paid Models</h3><p>The difference between free and paid AI tools is not just about features; it's about the entire engine. The primary trade-off is <strong>cost vs.
capability</strong>.</p><ul><li><p><strong>Underlying Model</strong>: Free tiers typically use older, smaller, and less powerful models. Paid tiers give you access to the flagship models.</p></li><li><p><strong>Context Window</strong>: Paid models have much larger context windows (e.g., 128k+ tokens vs. 4k-16k), allowing you to work with larger documents and maintain longer conversations.</p></li><li><p><strong>Reasoning Ability</strong>: Premium models are significantly better at following complex, multi-step instructions. Less capable models are more prone to "laziness"&#8212;giving simplified answers, writing placeholder code, or telling you to do it yourself.</p></li></ul><p>For simple tasks, a free model may suffice. For complex development work, the limitations of a less capable model can become a significant bottleneck.</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!zxVN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f85a36d-1c7b-4352-8f0b-64123b439909_523x204.png" width="523" height="204" alt=""></figure></div><div><hr></div><h3>Conclusion: 4 Heuristics for Using AI Responsibly</h3><p>Generative AI is a powerful tool, not magic. Understanding its mechanism&#8212;next-token prediction within a limited context window&#8212;is key to using it well. To ensure we are using these tools in a safe, effective, and responsible way, our team should always ask four questions before starting a task:</p><ol><li><p><strong>Do we have permission?</strong> Is the use of AI approved for this task by both the client and our company's policies? This is a non-negotiable first step.</p></li><li><p><strong>Are we exposing sensitive data?</strong> Does the prompt contain any client secrets, personal information, or confidential data? The answer must be no.</p></li><li><p><strong>How will we verify the output?</strong> What is our strategy for human review and testing? Whether it's a peer code review or a QA testing plan, a verification process is essential.</p></li><li><p><strong>Is this the right tool for the job?</strong> Is the model's speed, cost, and capability a good fit for this task?
This is about making a deliberate engineering trade-off.</p></li></ol><p>By embracing professional skepticism and applying these simple heuristics, we can move beyond the hype and begin using Generative AI as what it is: a powerful, imperfect, but profoundly useful new tool in our professional toolkit.</p>]]></content:encoded></item><item><title><![CDATA[Jules, My AI Junior Developer]]></title><description><![CDATA[Here&#8217;s a question that emerged from my recent work with AI coding agents: to get better, more autonomous results, do you need to treat the AI less like a senior peer and more like a junior developer?]]></description><link>https://www.nikmalykhin.com/p/jules-my-ai-junior-developer</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/jules-my-ai-junior-developer</guid><dc:creator><![CDATA[Nik]]></dc:creator><pubDate>Mon, 15 Sep 2025 19:36:38 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Ojx!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8d27381-c618-42b7-a15f-62e1d625e22d_1280x1280.png" length="0"
type="image/jpeg"/><content:encoded><![CDATA[<p>Here&#8217;s a question that emerged from my recent work with AI coding agents: to get better, more autonomous results, do you need to treat the AI less like a senior peer and more like a junior developer?</p><p>I recently spent time experimenting with Google's Jules, an AI agent designed to operate with a high degree of autonomy. The initial assumption was that I could delegate a series of tasks to a "senior" AI, provide it with a well-documented repository and clear instructions, and expect efficient execution. The experiment, however, surfaced a different, more nuanced reality about the operational model required to work effectively with today's AI agents.<br><br>My plan was to partition the work for a new project, <a href="https://github.com/nikmalykhin-tw/jules-foundation/tree/main">jules-foundation</a>, in a collaborative way:</p><ul><li><p><strong>Backend:</strong> Fully delegated to Jules.</p></li><li><p><strong>CI/CD:</strong> I would initiate the setup, and Jules would continue it.</p></li><li><p><strong>Frontend:</strong> A "ping-pong" approach where Jules would start, I would take over for a task using VSCode and GitHub Copilot, and then hand it back.</p></li></ul><p>This process was underpinned by a detailed <a href="https://github.com/nikmalykhin-tw/jules-foundation/blob/main/AGENTS.md">AGENTS.md</a> file, which codified principles from foundational software engineering texts to guide the agent's behavior. The results were illuminating, but not for the reasons I expected.</p><div><hr></div><h2>The Experiment's Stumbles</h2><p>The initial attempts at delegation quickly ran into issues that revealed the agent's limitations, not in its ability to write code, but in its judgment and awareness of context.</p><h3>1. 
The Hallucinating Generalist</h3><p>In the very first backend task&#8212;setting up a Kotlin and Micronaut project&#8212;Jules briefly defaulted to a completely different stack, attempting to implement the solution using Python and Poetry. It seemed to fall back on its generalized training data, where Python is a common choice for initial project setups. To its credit, the agent caught its own mistake and asked for confirmation before proceeding, but it was a stark reminder that even with specific instructions, the agent can be swayed by the statistical weight of its training data. It behaves like a junior developer who has a lot of theoretical knowledge but lacks the experience to apply it consistently in a specific context.</p><h3>2. The Context Pollution Problem</h3><p>My most significant error was continuing with the frontend task (<a href="https://github.com/nikmalykhin-tw/jules-foundation/issues/3">Task 3</a>) in the same chat I used for the backend (Task 1). After the first task was completed and merged, the <code>main</code> branch of the repository was updated. However, Jules, operating within its isolated chat context, was working off a stale version of the repository.</p><p>When asked to proceed, its lack of environmental awareness became clear. It stated: "I do not have a direct git pull command. My process is to complete the work and then use submit to propose the changes." Its proposed solution was to start over, re-implementing <em>both the backend and frontend tasks</em> from scratch. This demonstrated that long-running conversations spanning multiple, distinct tasks are unworkable. The context from previous work pollutes the agent's understanding of the current state.</p><h3>3. The Over-Eager Assistant</h3><p>During the frontend task, the instructions specified using simple HTML, Tailwind CSS, and Alpine.js, with Vite mentioned as an <em>optional</em> tool. 
Jules immediately planned to set up a full Vite project, concluding this was "the most professional and efficient way to approach this task." While a reasonable conclusion for a human engineer, it was a deviation from the core requirement of simplicity. It prioritized an optimized solution over adhering to the task's constraints, forcing me to update the <code>AGENTS.md</code> file with a strict <strong>"Technology Constraint Mandate"</strong> to prevent such deviations.</p><div><hr></div><h2>An Effective Operating Model</h2><p>Through these failures, a more effective workflow emerged. It centered on providing a rigid, well-defined operational framework rather than relying on the agent's "senior" judgment.</p><h3>1. One Task, One Context</h3><p>The <code>git pull</code> fiasco taught me the most important lesson: <strong>every new task requires a new, clean context</strong>. The effective workflow is atomic and mirrors standard development practice:</p><ol><li><p>Start a new "Jules task" for each new GitHub issue.</p></li><li><p>Provide the prompt, linking to the repository and the specific issue.</p></li><li><p>Let the agent fork the <em>current</em> main branch, implement the changes, and submit a pull request.</p></li><li><p>Review, merge, and close the task.</p></li><li><p>Repeat from step 1 for the next issue.</p></li></ol><p>This approach prevents context pollution and also mitigates the risk of git conflicts, as the agent is never in a position where it needs to reconcile its work with other changes made in parallel. The tasks must be designed to be sequential and independent.</p><h3>2. Define the Goal, Not Every Step</h3><p>My most successful interaction was with the first task, where the goal was clear and concise: "Create a Kotlin-based Micronaut application with a single GET endpoint that returns 'Hello, World!'." 
I didn't over-specify the steps, which allowed the agent to complete the task in just 7 minutes.</p><p>In contrast, my more prescriptive frontend task created blind spots. By trying to detail the steps, I inadvertently omitted small but crucial details, leading to initial friction. The key is to provide a <strong>clear objective and firm constraints</strong> but grant the agent the autonomy to handle the implementation details within those boundaries.</p><h3>3. The "Rules" Are the Scaffolding</h3><p>The foundational <code>AGENTS.md</code> file, which summarized core software engineering principles, was critical. Much like a <a href="https://nik1379616.substack.com/p/can-we-make-ai-code-assistants-smarter">well-crafted context can steer GitHub Copilot's suggestions</a>, these initial instructions act as a firm scaffolding for the agent's behavior. When failures occurred, I didn't just correct the agent in the chat; I updated the foundational rules. This ensures the learning is persistent and benefits all future tasks.</p><div><hr></div><h2>Unexpected Discovery: Simulating a Workflow</h2><p>A fascinating insight was how to enforce quality gates without giving the agent direct access to our environment. Jules can't <em>run</em> a pre-commit hook, but it can be instructed to <strong>simulate one</strong>.</p><p>I created a <strong>"Pre-Flight Simulation"</strong> mandate in the rules. 
Before submitting code, the agent must:</p><ol><li><p>Analyze the project's pre-commit configuration files.</p></li><li><p>Mentally review its generated code against every check defined in those files.</p></li><li><p>Provide a report confirming it performed the simulation.</p></li></ol><p>This approach improves code quality and reduces the cost of failed CI runs by shifting quality checks earlier in the process, even if only in simulation.</p><div><hr></div><h2>The Core Insight: Constraints Unlock Autonomy</h2><p>This leads to the core realization: <strong>to unlock the autonomy of an AI agent, you must constrain it with a rigid, machine-readable process.</strong></p><p>You can't treat it like a senior engineer with whom you can have a nuanced conversation. You have to manage it like a brilliant, lightning-fast, but utterly naive junior developer. It needs a "manager" to provide:</p><ul><li><p><strong>A Clear Definition of Done:</strong> The GitHub issue.</p></li><li><p><strong>Strict Rules of Engagement:</strong> The <code>AGENTS.md</code> file.</p></li><li><p><strong>An Isolated Work Environment:</strong> A new task for each new unit of work.</p></li></ul><p>The agent's value isn't in its judgment or experience, but in its speed and its ability to flawlessly execute a well-defined process within a tightly controlled environment.</p><div><hr></div><h2>Conclusion: We Are Becoming Architects of AI Workflows</h2><p>My experiment with Jules was a success, though not in the way I initially envisioned. The true leverage of these tools isn't just in code generation&#8212;it's in <strong>automating a workflow</strong>.</p><p>The real engineering work is shifting from pure implementation to architecting the system of rules, constraints, and processes that guide the AI. This has implications beyond just how developers work. 
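</p><p>One concrete shape this architecting can take is making task definitions machine-checkable before anything is delegated. A minimal sketch in Python (my own illustration; the required section names are hypothetical, not a Jules or GitHub convention):</p>

```python
# Gate delegation on a well-formed issue body (illustrative only;
# the section headers below are an invented convention).
REQUIRED_SECTIONS = ("## Goal", "## Constraints", "## Definition of Done")

def ready_for_agent(issue_body: str) -> list[str]:
    """Return the section headers the issue is still missing."""
    return [s for s in REQUIRED_SECTIONS if s not in issue_body]

issue = """## Goal
Create a Kotlin-based Micronaut application with a single GET endpoint.

## Constraints
Use only the technologies listed in AGENTS.md.
"""
print(ready_for_agent(issue))  # ['## Definition of Done'] -> not ready to delegate yet
```

<p>A gate like this turns the "clear definition of done" requirement into code rather than convention.</p><p>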
It elevates the importance of the <strong>Business Analyst</strong> function, as creating well-defined, atomized, and unambiguous tasks is now a prerequisite for effective AI delegation. We must not only learn new skills for interacting with AI but also adapt our entire development workflow to match the capabilities of these new tools. The future isn't about replacing developers but about providing them with powerful new forms of leverage, provided we are willing to become the architects and managers of our new AI team members.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.nikmalykhin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading! Subscribe to get more practical guides on using GenAI tools effectively in software development work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Does More Powerful AI Mean Slower Fixes?]]></title><description><![CDATA[Is it possible that our most advanced AI coding assistants are actually slowing us down?]]></description><link>https://www.nikmalykhin.com/p/does-more-powerful-ai-mean-slower</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/does-more-powerful-ai-mean-slower</guid><dc:creator><![CDATA[Nik]]></dc:creator><pubDate>Tue, 26 Aug 2025 15:16:04 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!-Ojx!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8d27381-c618-42b7-a15f-62e1d625e22d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Is it possible that our most advanced AI coding assistants are actually slowing us down? This question felt absurd as my team was heads-down, polishing our UI for a major release. We were in the final stretch, tackling a long list of small, cosmetic changes&#8212;the kind of work that should be quick. Yet, I found my workflow clogged, not by the complexity of the tasks, but by the "helpfulness" of my AI partner.</p><div><hr></div><h3>My Setup: The Final Polish</h3><p>Our environment was standard: a React codebase, a Git workflow with peer reviews, and an integrated AI coding assistant. My goal was to rapidly work through a backlog of minor UI tickets. Any UI update is a form of refactoring, and for that, I strictly follow a philosophy of making changes in what Javi L&#243;pez aptly calls "<a href="https://www.google.com/search?q=https://javil.substack.com/p/a-lot-of-tiny-steps-16eaac27acb4">a lot of tiny steps</a>," a pattern also known in classic terms as <a href="https://wiki.c2.com/?RefactoringInVerySmallSteps">Refactoring In Very Small Steps</a>. This ensures each commit is atomic and easy for my teammates to review. I was relying on the AI&#8217;s "Agent mode"&#8212;<strong>its capability to autonomously modify the codebase</strong>&#8212;expecting it to align with this micro-step approach. The reality was quite different.</p><div><hr></div><h3>When 'Help' Became a Hindrance</h3><p>The core problem was that the AI agent consistently over-engineered solutions for trivial problems. It treated every request for a small change as an invitation to refactor the entire component. 
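</p><p>The damage shows up directly in diff size. One way to keep an AI suggestion honest against the "very small steps" principle is to measure it before accepting it — a sketch using Python's <code>difflib</code> (my own tooling idea, not a feature of any assistant; the ten-line budget is arbitrary):</p>

```python
import difflib

TINY_STEP_BUDGET = 10  # max changed lines for a "tiny step" (arbitrary threshold)

def changed_lines(before: str, after: str) -> int:
    """Count added and removed lines between two versions of a file."""
    diff = difflib.unified_diff(before.splitlines(), after.splitlines(), lineterm="")
    # Skip the '---'/'+++' file headers; count only real +/- hunk lines.
    return sum(
        1
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    )

original = "button { color: red; }\n"
suggested = original + "@media (max-width: 50rem) { button { width: 100%; } }\n"
exceeds = changed_lines(original, suggested) > TINY_STEP_BUDGET
print(exceeds)  # False: a one-line addition stays within the budget
```

<p>Anything over the budget is a signal to reject the suggestion and re-prompt for a narrower change.</p><p>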
This isn't a failure of intelligence, but a <strong>misalignment of goals</strong>: my goal was a minimal diff, whereas the agent's goal is often holistic file correctness, aiming to fix all potential issues it identifies in one pass. Crucially, even when I gave it explicit, TDD-style instructions to <em>only</em> perform a single, minimal action, it still defaulted to making broad, sweeping changes.</p><h4>Example 1: A Simple CSS Tweak</h4><p>I needed to make a submit button full-width on mobile devices. A straightforward task.</p><p>The fix that was actually needed:</p><p>CSS</p><pre><code><code>@media (max-width: 50rem) {
  .formSubmitMobileWrapper button {
    width: 100%;
  }
}
</code></code></pre><p>I prompted the AI agent: "<em>Only add a new media query for screens under 50rem to the </em><code>.formSubmitMobileWrapper button</code><em> class to set its width to 100%. Do not touch any other code.</em>"</p><p>Despite the clear instruction, the agent generated a massive diff, rewriting existing desktop styles and restructuring the entire CSS class.</p><ul><li><p><strong>Time Wasted:</strong> I spent 15 minutes untangling the AI's suggestion, versus the 2 minutes it would have taken to write the CSS myself.</p></li><li><p><strong>Quality Issues:</strong> The generated code created a high cognitive load for code review. A teammate would have to ask, "Why did we refactor all the button styles just to change one mobile property?"</p></li><li><p><strong>Structural Problems:</strong> This approach created bloated commits, making our Git history noisy and directly violating the "very small steps" principle.</p></li></ul><h4>Example 2: A Minor Accessibility Improvement</h4><p>Next, I picked up a ticket to improve the accessibility of our card components. Again, I gave a precise instruction: "<em>Add a </em><code>role='region'</code><em> attribute to the parent div of the Card component.</em>"</p><p>Instead of a one-line change, the agent tried to rewrite half the component's JSX structure, arguing it was for "better semantic clarity" and completely ignoring my focused instruction.</p><div><hr></div><h3>Principles That Actually Work</h3><p>This friction forced me to re-evaluate how I was using the tool. I realized the key is to <strong>match the tool's capability to the task's scope</strong>. This led me to two guiding principles.</p><h4>1. Use AI Chat for Suggestions, Not Implementation</h4><p>For micro-changes, the AI's "Chat mode" is far more effective. 
By treating it as a context-aware search engine, I can ask for targeted advice.</p><ul><li><p><strong>Prompt:</strong> "<em>What's the best CSS to make this button full-width on mobile?</em>"</p></li><li><p><strong>Result:</strong> It gives me the precise, minimal code snippet I need. I copy, paste, and commit. The change is atomic and review is trivial.</p></li></ul><p>This keeps the developer in control and prevents the AI from making unsolicited "improvements." The benefits are clear: smaller pull requests and faster review cycles. This aligns with research from <a href="https://www.faros.ai/blog/ai-software-engineering">Faros AI</a>, which notes that while AI can boost individual developer throughput, it often leads to ballooning review queues. I've written more about this in my article, "<a href="https://nik1379616.substack.com/p/can-we-make-ai-code-assistants-smarter">Can we make AI code assistants smarter?</a>".</p><h4>2. Reserve AI Agents for Scaffolding and True Refactoring</h4><p>The autonomous "Agent mode" is incredibly powerful, but its strength lies in larger, well-defined tasks, not surgical strikes.</p><ul><li><p><strong>Good use case:</strong> "<em>Create a new React component for a user profile page with an avatar, name, and bio section. Include Storybook stories and a basic test.</em>"</p></li><li><p><strong>Bad use case:</strong> "<em>Add a </em><code>margin-top</code><em> to the avatar in the user profile component.</em>"</p></li></ul><p>Using an agent is best when the expected outcome is a significant amount of new or changed code.</p><p><em>This simple matrix illustrates the core principle: for small-scoped tasks, a suggestion-based AI interaction is most effective, while large-scoped tasks are better suited for autonomous AI execution.</em></p><div><hr></div><h3>Unexpected Discovery: AI Forced Me to Define "Small"</h3><p>The most surprising insight was that the AI forced me to be more precise in defining a "small change." 
My heuristic is now this: <strong>if the task's description is longer than the code I expect to write, use Chat mode.</strong></p><p>A task like "Make the button full-width on mobile" is a perfect example. The description is simple, and the code is just a few lines. The AI agent, however, interprets this as a symptom of a larger problem ("This component is not fully responsive") and tries to solve that instead. This mental checkpoint prevents me from accidentally turning a 5-minute task into a 30-minute ordeal.</p><div><hr></div><h3>The Autonomy vs. Precision Trade-Off</h3><p>This leads to a central, counterintuitive truth: <strong>the more autonomy you grant an AI coding assistant, the less precision you may get for small, targeted tasks.</strong></p><p>This isn't a paradox; it's a trade-off. Autonomous agents are optimized for holistic correctness. They don't just see the three lines of CSS you want to add; they see the entire file and its potential imperfections. Their goal is to bring the whole file into a state of grace, which directly conflicts with the goal of making a minimal, targeted change.</p><p>Effective use, therefore, requires the developer to:</p><ul><li><p><strong>Explicitly define the scope</strong> of the change before starting.</p></li><li><p><strong>Choose the right mode</strong> for the job (Chat vs. Agent).</p></li><li><p><strong>Maintain control</strong> and view the AI as a suggester, not an infallible executor, for routine work.</p></li></ul><div><hr></div><h3>A More Thoughtful Partnership</h3><p>My journey through pre-release UI tweaks taught me a crucial lesson. AI coding tools aren't a simple "on/off" switch for productivity. They are a suite of capabilities, each with an appropriate use case. 
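</p><p>That matching of tool to scope is simple enough to state as a toy decision rule (a sketch of the heuristic above, not a real tool; both example tasks are invented):</p>

```python
def choose_mode(task_description: str, expected_code: str) -> str:
    """Toy rule of thumb: when describing the change takes more text than
    the change itself, ask Chat mode for a snippet instead of the agent."""
    return "chat" if len(task_description) > len(expected_code) else "agent"

# A surgical accessibility tweak: the description outweighs the code.
print(choose_mode("Add role='region' to the Card component's parent div",
                  "role='region'"))  # chat

# Scaffolding a new component: the expected code dwarfs the description.
new_component = "function ProfilePage() { /* avatar, name, bio, stories, tests... */ }"
print(choose_mode("Create a user profile page component", new_component))  # agent
```

<p>Character counts are a crude proxy; the value of the rule is the mental checkpoint, not the arithmetic.</p><p>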
An autonomous agent is a powerful ally for building new things from the ground up, but for the delicate art of finishing and polishing, a simple chat-based suggestion is often faster, cleaner, and more respectful of my teammates' time. The real skill in this new era of software development is not just in writing clever prompts, but in having the wisdom to choose the right tool for the job.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.nikmalykhin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading! Subscribe to get more practical guides on using GenAI tools effectively in software development work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[The Context Window Paradox: To Get More From Your AI, Give It Less]]></title><description><![CDATA[When I started working with LLMs that have massive, million-token context windows, I fell into a trap.]]></description><link>https://www.nikmalykhin.com/p/the-context-window-paradox-to-get</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/the-context-window-paradox-to-get</guid><dc:creator><![CDATA[Nik]]></dc:creator><pubDate>Mon, 18 Aug 2025 14:02:12 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Ojx!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8d27381-c618-42b7-a15f-62e1d625e22d_1280x1280.png" length="0" 
type="image/jpeg"/><content:encoded><![CDATA[<p>When I started working with LLMs that have massive, million-token context windows, I fell into a trap. I figured that more context meant less work for me. I could just dump a project's entire history into the chat and trust the model to keep everything straight. It seemed logical. But as I found out while co-authoring a technical presentation with an AI, this assumption is not just wrong; it&#8217;s backward. My attempts to treat the AI like a partner with a perfect memory led to subtle but persistent quality issues. It was only when I started being deliberately restrictive with the context that the results genuinely improved. This led me to a counterintuitive conclusion: to get the most out of a large context window, you have to actively manage and constrain it.</p><div><hr></div><h2>My Setup: Crafting a Presentation with an LLM</h2><p>For this experiment, I set out to create a new slide deck, "Introduction to Agents," using <strong>Gemini 2.5 Pro</strong> as my authoring partner. My methodology is rooted in agile principles&#8212;starting with a high-level outline and iteratively building out the content. The goal is to offload the heavy lifting of drafting, allowing me to focus on the core narrative and technical accuracy.</p><p>I began with a highly structured initial prompt. This approach, refined from my <a href="https://nik1379616.substack.com/p/pair-authoring-with-an-ai-a-case">previous work on pair-authoring with AI</a>, was designed to establish clear rules of engagement for our collaboration from the start.</p><pre><code><code>Hello,

I need your help preparing a presentation. Here is the context and the way I would like to work with you.

1. The Goal:
My objective is to create a [e.g., 15-minute] presentation for [e.g., the AI For Software Delivery festival]. The final output should be a complete script, ready to be put onto slides.

2. Source Materials:
Here are the links and documents you should use...

3. Our Workflow (How We Will Work):
We will follow a structured, step-by-step process. I want to approve each step before we move to the next.
  Step 1: High-Level Plan...
  Step 2: Section Planning...
  Step 3: Slide-by-Slide Drafting...
  Step 4: Finalizing Sections...

4. Content Structure (What I Want for Each Slide):
A) On-Slide Text: ... minimalist and concise.
B) Accompanying Speech: ... comprehensive and explain the concepts...

5. Tone and Style:
... Professional and Neutral ... Not Promotional...

Your First Task:
Please review the source materials I provided and propose a high-level outline for the presentation.
</code></code></pre><p>This prompt was my attempt to front-load the process with as much structure as possible.</p><div><hr></div><h2>The Subtle Failure of a Well-Laid Plan</h2><p>Even with this detailed plan, I began noticing a subtle degradation in quality as our conversation grew longer. The failure wasn't catastrophic, but it manifested as a slow accumulation of small inconsistencies and extra work.</p><p>My chat history filled up with micro-corrections. At one point, the AI generated speech that claimed I worked at Thoughtworks on a project I had explicitly described from a previous job at Mimecast. The fix was a simple prompt:</p><pre><code><code>fix for B) Accompanying Speech (Revised):
###
For example, I previously worked on a phishing detection team here at Thoughtworks.
@@@
I worked in Mimecast, not in Thoughtworks.
###
</code></code></pre><p>Other times, I had to refine the AI's understanding of the scope: <code>We analyzed an email</code>, not <code>We'd analyze a URL</code>. While individually minor, the frequency of these small edits increased as the context grew. The AI's attention seemed to drift, anchored more by recent turns in the conversation than by the foundational context we had established. The presentation began to feel less like a single, coherent story and more like a collection of loosely related talking points that required my constant, vigilant correction.</p><div><hr></div><h2>Principles That Actually Work</h2><p>This experience led me to refine my approach, focusing on two principles that proved to be more effective.</p><h3>1. Enforce a Disciplined Pace (Decompose and Verify)</h3><p>My initial prompt already specified a slide-by-slide workflow. However, I discovered that I had to actively enforce this pace. At times, the AI would try to rush ahead, offering to generate multiple slides or an entire section at once. I learned to be firm, using prompts like <code>always wait my approve</code> and <code>before we will move to section 2, combine all approved slides and speeches for section 1</code>.</p><p>The key insight here was that <strong>it wasn't enough to state the plan; I had to actively rein in the AI's eagerness to generate</strong>. This forced discipline kept each unit of work small and verifiable. By confirming the correctness of each individual slide before moving on, I ensured the foundation for the next one was solid.</p><h3>2. Proactively Distill the Context</h3><p>This was the most impactful change. Acknowledging the limitations of the AI's attention, I began a practice of "proactive context distillation." After completing and approving a section, I would instruct the AI to summarize what we had just created.</p><pre><code><code>"Excellent. Please summarize the content of the two slides we just created for Section 1 into a concise overview."
</code></code></pre><p>This technique serves two purposes. First, it acts as a "context refresher" for the AI, collapsing the detailed turn-by-turn history into a dense, high-signal summary. As explained by IBM, a <a href="https://www.ibm.com/think/topics/context-window">context window</a> isn't a perfect memory; a clean summary places the most important information front and center. Second, it keeps the most relevant information active, preventing the model's attention from being diluted by the noise of a long conversation.</p><div><hr></div><h2>An Interesting Realization: It&#8217;s For Me, Not Just the AI</h2><p>Initially, I viewed context distillation as a trick to manage the AI. But a valuable realization was how much it improved my own thinking. The act of requesting and reviewing these summaries forced me to constantly re-evaluate the presentation's narrative arc. Was Section 2 a logical continuation of Section 1? Did the key messages connect?</p><p>This process mirrors what we know about human cognition. Our own <a href="https://en.wikipedia.org/wiki/Working_memory#Theories">working memory</a> is limited. By creating these summaries, I wasn't just helping the AI; I was building a better mental model for myself. The distillation process became a forcing function for clarity, ensuring that I, the human author, remained in firm control of the narrative.</p><div><hr></div><h2>Closing the Loop: Refining the Process with the AI</h2><p>After the presentation script was complete, I tried one final experiment. I asked the AI to become my process consultant.</p><pre><code><code>"Now, analyze our chat for understanding our pattern of communication and work under the presentation. After that, check my initial prompt... Do we need to improve the original prompt or does it look good enough?"
</code></code></pre><p>The AI analyzed our entire interaction and proposed an improved V2 of my starting prompt. The new version added subtle but important clarifications, such as describing the process as <code>iterative</code> and <code>flexible</code>. This idea of having the tool refine its own operating instructions isn't new to my workflow; I'd explored a <a href="https://nik1379616.substack.com/p/can-we-make-ai-code-assistants-smarter">similar pattern when getting Copilot to improve its own rules</a>. Applying it to a conversational authoring process, however, felt like a significant step forward. It&#8217;s a powerful demonstration of using the tool not just to execute a task, but to reflect on and improve the very process of collaboration itself.</p><div><hr></div><h2>The Central Paradox of AI Collaboration</h2><p>This leads to the central, counterintuitive truth I discovered: <strong>To effectively leverage an AI's massive context window for a complex project, you must actively manage and constrain the context you provide it.</strong></p><p>This paradox exists because a large context window is not a perfect memory. It is a probabilistic field of attention. Without deliberate guidance, the AI can get lost in the details, overweighting recent conversation turns and losing the foundational plot.</p><p>Effective use, therefore, requires more than just good prompting. It requires:</p><ul><li><p><strong>Active Context Management</strong>: You must act as the session manager.</p></li><li><p><strong>Enforced Discipline</strong>: A structured, step-by-step pace that you actively maintain.</p></li><li><p><strong>Strategic Summarization</strong>: Periodically compressing the state of the project to maintain focus.</p></li></ul><div><hr></div><h2>Conclusion: From Prompter to Director</h2><p>Treating a generative AI as a simple instruction-follower for complex work is a path to mediocre results. The initial promise of "thinking less" is a mirage. 
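</p><p>Mechanically, though, the discipline that works is small. A sketch of the distill-after-each-section loop, with a stub standing in for the model call and invented section titles (my own illustration, not code from the project):</p>

```python
# Distill each approved section; never carry the raw transcript forward.
foundation = "Goal: 'Introduction to Agents' deck. Tone: neutral. Approve each slide."

def summarize(draft: str) -> str:
    """Stub: pretend the model compresses an approved section to its gist."""
    return draft.splitlines()[0]

summaries = []
for section in ("Section 1: What is an agent?", "Section 2: Tools and loops"):
    draft = section + "\n...full slide text and speech, approved slide by slide..."
    summaries.append(summarize(draft))             # distill the approved work
    context = "\n".join([foundation] + summaries)  # what the next prompt sees
print(context)  # the foundation plus a few dense lines, not the whole transcript
```

<p>Each new prompt then sees the foundational instructions plus a handful of dense summaries instead of the full turn-by-turn history.</p><p>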
Instead, these tools invite us to think <em>differently</em>.</p><p>The real leverage comes from shifting our role from a mere prompter to that of a director. Our job is not just to provide instructions but to manage the state, curate the context, and guide the narrative. By enforcing a disciplined pace, proactively distilling the context, and even using the AI to refine our methods, we don't just mitigate the tool's weaknesses; we sharpen our own thinking. The AI provides the immense power of generation, but human oversight provides the coherence and intent that turns raw output into a valuable, finished product. This collaborative dance is the future of knowledge work.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.nikmalykhin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading! 
Subscribe to get more practical guides on using GenAI tools effectively in software development work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Can We Make AI Code Assistants Smarter by Asking Them to Write Their Own Rules?]]></title><description><![CDATA[It seems logical, doesn't it?]]></description><link>https://www.nikmalykhin.com/p/can-we-make-ai-code-assistants-smarter</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/can-we-make-ai-code-assistants-smarter</guid><dc:creator><![CDATA[Nik]]></dc:creator><pubDate>Wed, 30 Jul 2025 10:12:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Ojx!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8d27381-c618-42b7-a15f-62e1d625e22d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>It seems logical, doesn't it? To get better, more consistent output from an AI pair programmer, you should give it a clear set of instructions. My thinking went a step further: why reinvent the wheel? Why not just grab a comprehensive, battle-tested list of rules from another advanced tool and plug it into my setup?</p><p>This experiment was a continuation of my explorations into <a href="https://nik1379616.substack.com/p/ai-assisted-terraform-refactoring">AI-assisted development</a>, but it led me somewhere unexpected. The attempt to create a "plug-and-play" expert failed, but the lessons learned revealed a far more effective and collaborative way of working with AI.</p><h3>Personal Context &amp; Tools</h3><p>My daily workflow is centered in Visual Studio Code. 
The key players in this experiment were:</p><ul><li><p><strong>GitHub Copilot:</strong> The core AI assistant, used in both inline and chat modes.</p></li><li><p><strong>Copilot Instructions Feature:</strong> The ability to provide custom guidance via a <code>.vscode/copilot-instructions.md</code> file.</p></li><li><p><strong>My Goal:</strong> To make Copilot's suggestions adhere to my project's specific coding standards without constant manual correction.</p></li></ul><p>I typically follow a pragmatic approach to development, valuing consistency and clarity over dogmatic adherence to any single methodology. The goal was to codify this pragmatism into rules for my AI partner.</p><h3>The Failed Experiment: The "Universal Rulebook" Fallacy</h3><p>My hypothesis was simple: a good set of rules for one AI agent should be good for another. I had been using the Cursor editor with the popular <code>awesome-cursorrules</code> repository, and it worked well <em>in that environment</em>. The rules helped Cursor generate clean, consistent code. I assumed I could port this success over to GitHub Copilot in VS Code.</p><p>I copied a large chunk of these proven rules directly into my <code>.vscode/copilot-instructions.md</code> file. The result was a lesson in context. Rules are not universally portable.</p><ol><li><p><strong>Context-Blind Enforcement:</strong> Even on a relatively fresh project, the agent became overly aggressive. I had a rule like <code>"Always use arrow functions for React components."</code> While my project did use arrow functions, Copilot began to apply this rule with a sledgehammer, suggesting aggressive refactors on any function it encountered, often in ways that broke the subtle stylistic patterns my team had established. 
It lacked judgment, creating noise and unnecessary churn.</p></li><li><p><strong>Verbose and Noisy Suggestions:</strong> Another rule I implemented was, <code>"Ensure all functions are documented with detailed JSDoc comments."</code> When I then asked Copilot to help with a simple utility function, the output was technically correct but practically absurd. An illustrative example would look something like this:</p><p>TypeScript</p></li></ol><pre><code><code>/**
 * @param {string} str The string to capitalize.
 * @returns {string} The capitalized string.
 */
const capitalize = (str) =&gt; str.charAt(0).toUpperCase() + str.slice(1);
</code></code></pre><p>The documentation, forced by the rule, was longer than the code itself. This added cognitive overhead for simple, self-explanatory functions.</p><p>This approach abandoned the core idea of context-aware assistance. The specific negative outcomes were clear:</p><ul><li><p><strong>Quality issues:</strong> The AI-generated code felt alien. It followed the new, transplanted rules but ignored the project's existing patterns, creating a maintenance headache.</p></li><li><p><strong>Impact on iteration:</strong> Instead of accelerating development, the rigid rule set became a bottleneck. It was like working with an overzealous assistant who had memorized a textbook but had zero practical field experience.</p></li><li><p><strong>Quantifiable problems:</strong> I spent more time fighting, deleting, or manually editing Copilot's "helpful" but misguided suggestions than I would have spent just writing the code myself. My flow state was constantly broken.</p></li></ul><h3>Principles That Actually Work</h3><p>After deleting the entire instruction file in frustration, I started over. This time, I discovered a couple of principles that transformed the AI from a dogmatic rule-follower into a genuine collaborator.</p><h4>1. Co-Author Your Instructions <em>with</em> the AI</h4><p>Instead of pasting in a foreign set of rules, I used a built-in VS Code feature. In my empty <code>.vscode/copilot-instructions.md</code> file, I used the <strong>"Generate instructions..."</strong> command. Copilot analyzed the code in my current workspace and then proposed a set of instructions tailored to my project's reality.</p><p>It identified existing patterns and suggested rules to reinforce them. It was the complete opposite of my first experiment: instead of forcing my code to conform to the rules, the rules were generated to conform to my code. 
This aligns with the core tenets of <a href="https://martinfowler.com/bliki/TestDrivenDevelopment.html">Test-Driven Development (TDD)</a>, where the desired outcome (in this case, the existing code style) defines the path forward.</p><p><strong>Benefit:</strong> The rules are organic, project-specific, and context-aware from day one.</p><h4>2. Use the AI to Refine Its Own Instructions</h4><p>My initial mistake was treating instructions as a static document. The effective approach is to make the AI itself a partner in refining them. My new workflow is a continuous, AI-driven feedback loop:</p><p>After a pairing session with the agent, I switch from the inline mode to the <strong>Copilot Chat view</strong>. There, I prompt it to perform a self-assessment:</p><p><code>"Analyze our recent conversation. Based on the guidance and corrections I provided, suggest improvements to my .vscode/copilot-instructions.md file."</code></p><p>The AI then reviews our interaction&#8212;including the times I had to give it procedural hints like <code>"let's make a short overview before writing the code"</code>&#8212;and suggests new, more effective rules. This delegates the work of refining the instructions to the tool that will be using them. It's a powerful and efficient way to make the AI a better collaborator over time.</p><h3>Example: My Evolved Instruction File</h3><p>After a few of these refinement cycles, my <code>.vscode/copilot-instructions.md</code> file started to look less like a generic style guide and more like a practical collaboration agreement. It's a living document, but here&#8217;s a snapshot of what it contains:</p><pre><code># AWS Serverless Infrastructure as Code Guidelines

This project implements a serverless application using AWS Lambda and API Gateway, with infrastructure defined in Terraform. Follow these guidelines when making changes:

## Project Architecture

### Component Structure

```
src/              # Lambda function implementations
&#9500;&#9472;&#9472; hello_world.py     # Example Lambda handler
&#9492;&#9472;&#9472; [other_functions]  # Additional Lambda functions
terraform/        # Infrastructure definition
&#9500;&#9472;&#9472; modules/      # Reusable Terraform modules
&#9492;&#9472;&#9472; main.tf       # Main infrastructure configuration
tests/            # Integration tests
&#9492;&#9472;&#9472; test_api.py   # API endpoint tests
```

### Key Design Patterns

1. **Lambda Function Structure**:

   ```python
   # Standard Lambda handler pattern - follow this structure
   def lambda_handler(event, context):
       logger.info("Processing request")  # Always log entry
       # ... function logic ...
       return {
           'statusCode': 200,
           'body': json.dumps(result)
       }
   ```

2. **Terraform Module Usage**:
   ```hcl
   # Follow this pattern when adding new Lambda functions
   module "my_function" {
     source = "./modules/lambda"
     function_name = "&lt;service&gt;-&lt;action&gt;"
     source_file   = "../src/&lt;filename&gt;.py"
     handler       = "&lt;filename&gt;.lambda_handler"
     runtime       = "python3.9"
   }
   ```

## Code Quality Standards

- Prefer explicit, descriptive variable names over short, ambiguous ones
- Follow the existing project's coding style for consistency
- Use named constants instead of hardcoded values
  Example:

```python
# Good
MAX_API_RETRIES = 3
is_api_healthy = retry_count &lt; MAX_API_RETRIES

# Avoid
m = 3
healthy = n &lt; m
```

## Development Approach

- Don't invent changes beyond what's explicitly requested
- Follow security-first approach in all code modifications
- Don't modify files outside the requested scope
- Don't suggest improvements to files not mentioned in the task
  Example of focused scope:

```typescript
// Request: "Add email validation to User class"
// Good - only modifying requested file
class User {
  validateEmail(email: string): boolean {
    const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
    return emailRegex.test(email);
  }
}

// Avoid - suggesting changes to other files
// &#10060; "We should also update UserRepository.ts"
// &#10060; "Let's improve the existing validation in Utils.ts"
```

## Communication Style &amp; Protocol

## Step-by-Step Communication Pattern

```
1. User Request
   User: "Need to implement email validation"

2. Copilot Overview
   Copilot: "Overview: Adding email validation
   - Will create test for invalid email (15 lines)
   - Will implement validator (20 lines)
   Let's start with the test?"

3. User Approval
   User: "Looks good"

4. Implementation
   Copilot: *provides code in proper format*

5. Confirmation
   User: "OK" or "Looks good"
```

## Test-Driven Development Workflow

## TDD Cycle

```
&#9484;&#9472;&#9472; 1. Discuss Test Requirements
&#9474;   User: "Need password validation"
&#9474;   Copilot: "Let's test minimum length first"
&#9474;
&#9500;&#9472;&#9472; 2. Write Test (Red)
&#9474;   describe('PasswordValidator', () =&gt; {
&#9474;     it('requires minimum 8 characters', () =&gt; {...}
&#9474;   });
&#9474;
&#9500;&#9472;&#9472; 3. Implement Code (Green)
&#9474;   class PasswordValidator {
&#9474;     isValid(password: string): boolean {...}
&#9474;   }
&#9474;
&#9500;&#9472;&#9472; 4. Optional: Refactor
&#9474;   - Improve naming
&#9474;   - Remove duplication
&#9474;   - Enhance readability
&#9474;
&#9492;&#9472;&#9472; 5. Next Test or Complete
    - User approval required before proceeding
```

### Test Guidelines

- Write focused, single-purpose API integration tests
- Always use `get_api_url()` to fetch endpoints dynamically
- Test edge cases explicitly and include proper delays
- Use descriptive test names
  Example:

```python
# Good test names
def test_endpoint_returns_200_on_valid_input():
def test_endpoint_handles_empty_payload():
def test_endpoint_returns_404_on_invalid_path():

# Good patterns
def test_new_endpoint():
    api_url = get_api_url()
    time.sleep(5)  # Allow API Gateway to propagate
    response = requests.get(f"{api_url}/path")
    assert response.status_code == 200
    assert "expected_value" in response.text
```

### Running Tests

```bash
cd tests
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pytest -v
```</code></pre><h3>Unexpected Discovery: Guidance Trumps Raw Power</h3><p>Here's the most surprising insight: a well-instructed GitHub Copilot, even using the standard (and technically free) models, consistently produces more useful, contextually-aware code than a more advanced model like Claude 3.5 Sonnet working <em>without instructions</em>.</p><p>The raw power of a cutting-edge LLM often leads to more "creative" but less relevant code for the specific task at hand. My carefully curated instruction set, running on a standard model, was simply a better pair programmer for <em>my</em> project.</p><p><strong>Why this matters:</strong> It proves that effective AI assistance is less about the raw intelligence of the Large Language Model and more about the quality of the guidance you provide. Thoughtful engineering and context-setting can be more valuable than simply paying for a more powerful brain.</p><h3>The Central Paradox: To Think Less, You Must First Think More</h3><p>This leads to the central paradox of using AI assistants effectively: <strong>to offload cognitive work to an AI, you must first do the meta-cognitive work of codifying your own development philosophy and collaboration style.</strong></p><p>You can't just install a tool and expect it to read your mind. This paradox exists because AI assistants are not colleagues; they are incredibly sophisticated pattern matchers. 
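</p><p>To make that concrete, here is a sketch (hypothetical code, not taken from the project) of the difference instructions make for the Lambda handler pattern described earlier:</p><pre><code>import json
import logging

logger = logging.getLogger(__name__)

# Without instructions: a generic handler from the training data's
# most common patterns. It works, but ignores project conventions.
def handler_generic(event, context):
    return {"statusCode": 200, "body": json.dumps({"message": "ok"})}

# With instructions: matches the project's documented pattern --
# entry logging, explicit status code, JSON-encoded body.
def lambda_handler(event, context):
    logger.info("Processing request")  # always log entry, per the rules
    result = {"message": "ok"}
    return {"statusCode": 200, "body": json.dumps(result)}
</code></pre><p>Both functions satisfy the request; only one looks like it belongs in the codebase.</p><p>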
Without your explicit context, they will default to the most generic patterns from their training data.</p><p>Effective use actually requires:</p><ol><li><p><strong>Self-Awareness:</strong> A clear understanding of your own coding patterns and project conventions.</p></li><li><p><strong>Iterative Refinement:</strong> Treating the AI's instruction set as a project artifact that evolves over time.</p></li><li><p><strong>Collaborative Mindset:</strong> Shifting from "commanding" the AI to "guiding" its process.</p></li></ol><h3>Forward-Looking Conclusion</h3><p>The dream of a universal, plug-and-play rulebook for AI assistants is a dead end. As I found in my previous reflections on whether we can <a href="https://thoughtworks.medium.com/https-www-thoughtworks-com-insights-blog-generative-ai-do-developers-need-think-less-ai-203f608de4bb">think less with AI</a>, the answer is no&#8212;we must think differently.</p><p>The <code>copilot-instructions.md</code> file is not just a configuration; it's the DNA of your AI collaborator for a specific project. It should be checked into version control and evolve alongside your <code>README.md</code> and <code>package.json</code>.</p><p>Stop searching for the perfect list of rules to copy and paste. Start with an empty file, click "Generate instructions...", and then use the chat to refine them after each session. The most effective AI assistant isn't the one with the most powerful model, but the one you&#8217;ve taken the time to teach.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.nikmalykhin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading! 
Subscribe to get more practical guides on using GenAI tools effectively in software development work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Pair-Authoring with an AI: A Case Study in Structured Collaboration]]></title><description><![CDATA[Moving beyond simple text generation to a collaborative partnership with a Large Language Model.]]></description><link>https://www.nikmalykhin.com/p/pair-authoring-with-an-ai-a-case</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/pair-authoring-with-an-ai-a-case</guid><dc:creator><![CDATA[Nik]]></dc:creator><pubDate>Thu, 10 Jul 2025 14:13:11 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Ojx!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8d27381-c618-42b7-a15f-62e1d625e22d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Moving beyond simple text generation to a collaborative partnership with a Large Language Model.</p><h3><strong>1. The Tempting Proposition: "Just Write the Presentation"</strong></h3><p>Every complex project begins with a familiar challenge: a folder full of raw material and a blank canvas. In my case, it was a collection of dense Google Docs, a GitHub repository, and a clear goal&#8212;to create a polished 15-minute presentation for an upcoming conference.</p><p>The temptation, of course, was to turn to a Large Language Model and use what I call a "lazy prompt": <em>"Here are the documents, write me a 15-minute presentation."</em></p><p>But my past experiments in AI-assisted development have taught me a crucial lesson. 
This approach almost inevitably leads to what I've previously described as a "structural mess" in a recent article, <a href="https://www.thoughtworks.com/insights/blog/generative-ai/do-developers-need-think-less-ai">Do developers need to think less with AI?</a> The AI can generate content, but it cannot, on its own, generate a compelling structure from ambiguous source material.</p><p>Knowing this, I adopted a different role&#8212;not of a micromanager, but of a director and critic. My job wasn't to define every expectation in a vacuum. Instead, our process became a rapid, iterative loop: the AI would propose a draft, and I would review it, using my critique to constantly clarify and refine my expectations for the next generation, from the high-level outline down to the tone of a single sentence.</p><h3><strong>2. Two Principles for a Better Partnership</strong></h3><p>To avoid the disappointing results of lazy prompting, I applied a more disciplined approach based on two core principles. These principles weren't just about getting a better output; they were about creating a more effective and predictable collaborative process with the AI.</p><p><strong>1. Define the "Pairing Contract" Upfront</strong></p><p>Before generating any content, we established a clear "contract" that would govern our entire interaction. This wasn't a formal document, but a set of initial instructions that aligned the AI with my expectations. The key clauses of this contract were:</p><ul><li><p><strong>A Step-by-Step Workflow:</strong> We explicitly agreed to work in small, approved increments. The AI would propose a plan for a section or a draft for a slide, and I would approve or critique it before we moved on. 
This prevented the AI from getting too far ahead on a wrong path.</p></li><li><p><strong>A Dual-Output Structure:</strong> For every slide, I specified two distinct deliverables: minimalist <strong>on-slide text</strong> for readability and a more detailed <strong>accompanying speech</strong> for the narrative. This separation of concerns is critical for creating effective presentations.</p></li><li><p><strong>A Pre-Defined Tone:</strong> We established that the voice should be "professional and neutral," not "promotional." This simple directive guided the tone of every piece of generated text.</p></li></ul><p>This upfront alignment ensured the AI's "creative freedom" was always channeled within the bounds of my strategic goals for the project.</p><p><strong>2. A Lot of Tiny Steps</strong></p><p>Borrowing a key principle from effective software development, I resisted the urge to ask the AI to solve large problems in one go. Instead, we broke down the task of creating the presentation into a hierarchy of tiny, manageable steps:</p><ul><li><p>First, we debated and finalized the <strong>high-level outline</strong> of the entire presentation.</p></li><li><p>Next, we planned the narrative flow for <strong>one section at a time</strong>.</p></li><li><p>Then, we drafted the content for only <strong>one slide at a time</strong>.</p></li><li><p>Finally, once the core content was set, we refined the <strong>smallest details</strong>, like the wording of a single bullet point or the line breaks in a sentence for better visual balance.</p></li></ul><p>This micro-iterative approach kept the AI's output focused and reviewable. It allowed for constant course correction and ensured that every component, from the overall structure down to the final polish, met my expectations before we considered it complete.</p><h3><strong>3. 
An Unexpected Discovery: The Power of Separated Concerns</strong></h3><p>In any deep-dive process, some of the most valuable insights are the ones you don't anticipate. While I expected the step-by-step workflow to be effective, I underestimated the impact of one particular clause from our initial "Pairing Contract": the strict separation of <strong>On-Slide Text</strong> and <strong>Accompanying Speech</strong>.</p><p>Initially, this seemed like a simple, practical rule for creating a presentation. However, it quickly revealed itself to be a powerful disciplinary tool that forced a higher level of clarity throughout the entire process. Here's how:</p><ul><li><p><strong>It Forced Conciseness.</strong> The most common failure mode of presentations is a slide cluttered with text. By having a dedicated place for the detailed narrative (the speech), we were forced to be ruthless with the on-slide text. The question was no longer "What should this slide say?" but "What is the absolute minimum text needed on this slide while I am speaking?" This led to cleaner, more impactful visuals.</p></li><li><p><strong>It Clarified the Narrative.</strong> The dual-output structure forced us to constantly distinguish between the <em>visual aid</em> (the slide) and the <em>story</em> (the speech). At every step, we had to decide what the audience should <em>see</em> to anchor them, and what they should <em>hear</em> to understand the deeper context. This clarified the purpose of every single element we created.</p></li><li><p><strong>It Improved the AI's Output.</strong> Giving the AI two smaller, distinct tasks consistently produced better results than one larger, ambiguous task. A prompt like "Write three concise bullet points for this slide" yielded more useful output than "Create content for a slide about Topic X."</p></li></ul><p>What began as a simple formatting rule for a presentation revealed itself to be a core principle for effective AI collaboration in general. 
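</p><p>In practice, the dual-output request can be captured in a compact prompt template. The wording below is a hypothetical reconstruction, not the exact contract we used:</p><pre><code>For each slide, produce two deliverables:

1. On-slide text: at most 3 bullet points, at most 6 words each.
2. Accompanying speech: 60-90 seconds, professional and neutral tone,
   carrying the narrative detail the slide omits.
</code></pre><p>Two small, well-bounded tasks per slide, instead of one ambiguous one.</p><p>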
It is a powerful strategy for ensuring clarity and quality in any complex, multi-layered project. Ultimately, the AI didn't reduce my cognitive load; it shifted it. It took on the heavy lifting of drafting text, which freed me up&#8212;and required me&#8212;to focus entirely on the higher-level strategic thinking involved in shaping the narrative, structure, and tone.</p><h3><strong>Conclusion: Thinking Differently</strong></h3><p>This journey of creating a presentation with an AI partner confirmed a critical insight: the promise of these tools is not to reduce the need for human thought, but to change its nature. The most effective approach I've found combines the speed and pattern-recognition of the AI with the architectural thinking and quality standards that an experienced professional brings to the table.</p><p>Our success didn't come from a single, brilliant prompt. It came from a disciplined process: establishing a "pairing contract," taking a lot of tiny, deliberate steps, and enforcing a clear separation of concerns.</p><p>Ultimately, these tools are powerful amplifiers of our own capabilities, not replacements for our judgment. They excel at generating drafts, suggesting patterns, and handling the tactical work of filling in the details. However, they require a human director to provide thoughtful integration, strategic thinking, and a clear vision for the final product.</p><p>This isn't about thinking less&#8212;it's about learning to think more architecturally about the creative process itself. In my experience, the key to success is learning to direct these powerful tools effectively. 
As with any powerful tool, the final quality of the work lies not in the tool itself, but in the wisdom and discipline of its user.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.nikmalykhin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading! Subscribe to get more practical guides on using GenAI tools effectively in software development work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Can We Think Less with AI?]]></title><description><![CDATA[A reflection on AI-assisted software development and the myth of effortless coding]]></description><link>https://www.nikmalykhin.com/p/can-we-think-less-with-ai</link><guid isPermaLink="false">https://www.nikmalykhin.com/p/can-we-think-less-with-ai</guid><dc:creator><![CDATA[Nik]]></dc:creator><pubDate>Thu, 03 Jul 2025 08:32:44 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-Ojx!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8d27381-c618-42b7-a15f-62e1d625e22d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3>The Tempting Proposition</h3><p>The promise of AI-powered development tools is seductive: write natural language descriptions, get working code, move fast, ship features. Tools like GitHub Copilot and Cursor have become staples in many developers' workflows, mine included. 
I use VSCode with Copilot as my primary environment, with Cursor as my secondary platform for cross-checking and refining solutions.</p><p>But after several weeks of experimenting with different approaches to AI-assisted development, I've come to a counterintuitive conclusion: <strong>we cannot think less with AI.</strong></p><p>My initial approach was what I'd call "lazy prompting"&#8212;throwing poorly constructed, vague requests at AI tools and expecting magic. Here are examples of the kind of prompts I was using:</p><ol><li><p>"The resources created in the <code>rds.tf</code> file could be placed in their own module. How would you go about it? Which variables and outputs are necessary for the module to deliver the functionality?"</p></li><li><p>"Expose the ECS module output for ecr_url in root <code>outputs.tf</code> as well."</p></li></ol><p>These prompts, while technically clear, were asking the AI to make too many architectural decisions at once without proper context. Despite having established <a href="https://nik1379616.substack.com/i/166244759/setting-the-foundation-rules-based-ai-development">foundation rules for AI development</a>, I wasn't applying them consistently.</p><p>The results were consistently disappointing:</p><ul><li><p><strong>Information overload</strong>: AI would generate vast amounts of code that technically worked but was difficult to comprehend</p></li><li><p><strong>Constant rollbacks</strong>: Every 20 minutes, I found myself undoing changes and starting over</p></li><li><p><strong>Non-iterative code</strong>: The output was functional but rigid, making incremental improvements nearly impossible</p></li><li><p><strong>Structural mess</strong>: While the code fulfilled requirements, it lacked coherent architecture</p></li></ul><p>The code worked, but it wasn't <em>good</em> code. 
More importantly, it wasn't code I could build upon.</p><h3>Two Principles That Work</h3><p>I had been using these principles in my earlier AI development work, but my recent experiment with "lazy prompting" confirmed just how essential they are for effective AI assistance:</p><h4>1. Small Steps</h4><p>Rather than asking AI to solve large, complex problems in one go, I've learned to break work into smaller, focused increments. This aligns with the principle of taking <a href="https://blog.devgenius.io/a-lot-of-tiny-steps-16eaac27acb4">a lot of tiny steps</a> in software development. This approach:</p><ul><li><p>Keeps the AI's output manageable and reviewable</p></li><li><p>Allows for course corrections before investing too much time</p></li><li><p>Maintains code quality by preventing architectural drift</p></li><li><p>Enables better understanding of each component</p></li></ul><h4>2. Pair Programming with AI</h4><p>The most significant shift in my thinking came from treating AI as a pair programming partner rather than a code generator. This builds on the established benefits of <a href="https://martinfowler.com/articles/on-pair-programming.html">pair programming</a> while adapting them for AI collaboration. 
This means:</p><ul><li><p><strong>Active engagement</strong>: Continuously reviewing and questioning the AI's suggestions</p></li><li><p><strong>Collaborative iteration</strong>: Building solutions together rather than accepting wholesale output</p></li><li><p><strong>Maintained agency</strong>: Staying in control of architectural decisions and code quality</p></li><li><p><strong>Continuous learning</strong>: Understanding what the AI produces rather than blindly accepting it</p></li></ul><p>This approach mirrors the benefits of traditional pair programming&#8212;better code quality, knowledge sharing, and reduced bugs&#8212;while leveraging AI's strengths in pattern recognition and rapid prototyping.</p><h3>The Pomodoro Connection</h3><p>An unexpected discovery was how well the <a href="https://en.wikipedia.org/wiki/Pomodoro_Technique">Pomodoro Technique</a> complements AI-assisted development. In traditional pair programming, natural breaks occur when your partner steps away for coffee or other needs. These interruptions, while sometimes frustrating, provide valuable thinking time.</p><p>When pairing with AI, these natural breaks disappear. The AI never gets tired, never needs coffee, and never suggests taking a step back. This can lead to tunnel vision and mental fatigue. The Pomodoro Technique artificially introduces these crucial breaks, providing time to:</p><ul><li><p>Reflect on the direction of the work</p></li><li><p>Assess code quality objectively</p></li><li><p>Consider alternative approaches</p></li><li><p>Prevent the cognitive overload that comes from continuous AI interaction</p></li></ul><p>For VSCode users, extensions like "Pomodoro Timer" can help integrate these breaks directly into your development workflow.</p><h3>The Thinking Paradox</h3><p>The central paradox of AI-assisted development is this: tools that promise to reduce cognitive load actually require more disciplined thinking to use effectively. 
Success with AI development tools depends on:</p><ul><li><p><strong>Clear problem articulation</strong>: Better prompts lead to better solutions</p></li><li><p><strong>Architectural awareness</strong>: Understanding how generated code fits into the larger system</p></li><li><p><strong>Quality assessment</strong>: Evaluating AI output against engineering standards</p></li><li><p><strong>Strategic thinking</strong>: Knowing when to accept, modify, or reject AI suggestions</p></li></ul><h3>Moving Forward</h3><p>AI tools for software development are powerful amplifiers of human capability, not replacements for human judgment. They excel at generating boilerplate, suggesting patterns, and rapid prototyping. However, they require thoughtful integration into development workflows.</p><p>The most effective approach I've found combines the speed and pattern recognition of AI with the architectural thinking and quality standards that experienced developers bring. This isn't about thinking less&#8212;it's about thinking differently and more strategically.</p><p>The future of AI-assisted development isn't about replacing developer intelligence but augmenting it. The developers who thrive will be those who learn to think clearly about how to direct these powerful tools toward creating maintainable, understandable, and robust software.</p><p>As with any powerful tool, the key lies not in the tool itself but in the wisdom and discipline of its user.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.nikmalykhin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading! 
Subscribe to get more practical guides on using GenAI tools effectively in software development work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item></channel></rss>