Tag Archives: llm

Fantasy city with glowing crystals, flying ships, and people working with potions and machinery

Full Stack Programmers are not being replaced in 2026-2027

AI Overview

Based on the provided search results from early 2026, the assessment that AI will not replace programmers in 2026–2027 but will shift the role from “writer” to “orchestrator” is accurate. AI is commoditizing routine implementation, creating a “vibe coding” era where developers define, orchestrate, and validate AI-generated code

Here is the breakdown of the software development landscape in 2026–2027 based on the search results:

1. The Shift in Roles: From Executor to Architect

  • Orchestration over Coding: The core competency is shifting from writing syntax to defining systems, orchestrating AI agents, and validating outputs.
  • The “Vibe Coding” Era: Developers describe what they want in plain English, and AI agents build, debug, and test features, reducing development cycle times significantly (e.g., from 9 days to 2.4 days).
  • “Super-Junior” Demand: Entry-level roles are changing; juniors are now expected to produce mid-level output by mastering AI tools, while traditional “code monkey” positions are being eliminated. 

2. Impact on Jobs and Hiring

  • Junior Roles Under Pressure: Entry-level tech hiring has decreased, with many companies replacing junior roles with AI-augmented seniors.
  • Senior Engineers as Multipliers: Experienced engineers are thriving by using AI to handle routine boilerplate, allowing them to focus on architecture and complex problem-solving.
  • The Two-Economy Split: While startups often replace juniors with AI for speed, large enterprises are quietly increasing junior hiring to maintain a long-term talent pipeline, recognizing that AI cannot replace human understanding of legacy systems. 

3. What AI Cannot (Yet) Replace

  • Strategic Judgment: AI lacks the ability to make high-level structural decisions, understand business context, or handle ambiguity in system architecture.
  • Validation and Ownership: AI-generated code often requires human validation, as it can be insecure or inconsistent, creating a need for “AI Guardians” to own the outcome and ensure safety.
  • Complex Debugging: When systems fail, human engineers are still required to debug, as AI struggles with the “why” behind complex, non-deterministic failures. 

4. Key Skills for 2026–2027

  • Prompt Engineering & AI Workflow Design: Crafting effective prompts to get high-quality code from agents.
  • System Architecture & Integration: Understanding how to connect various AI-generated components into a cohesive system.
  • Rigorous Validation: Reviewing AI “slop” (low-quality or buggy code) and enforcing safety standards. 

In summary, AI acts as a 50% faster coding assistant that makes coding more competitive, but it requires human guidance to create secure, production-grade software. The consensus is that AI will not eliminate the profession but will make developers who use AI indispensable, while replacing those who refuse to adapt. 

Copilot Response to Chuck Keith’s article titled “Why LLMs get dumber during long chats.”

Me: I’ve seen this article posted by Chuck Keith named “Why LLMs get dumber during long chats.” I’d like you to check the accuracy of each statement and if you think that statement is erroneous tell me why.
https://www.linkedin.com/feed/update/urn:li:activity:7367209980569055235/

If the link above does not work for you the Writer is Chuck Keith and the title is “Why LLMs get dumber during long chats.” You can probably google search it and find it

Copilot: Absolutely, David. Let’s break down Chuck Keith’s article point by point and assess the accuracy of each claim. I’ll flag anything that’s misleading or oversimplified, and explain why.


✅ 1️⃣ LLMs have short-term memory limits

Accurate.
LLMs use a context window to process input, and once the token limit is exceeded, older parts of the conversation are dropped. This can lead to hallucinations or loss of coherence. It’s not “forgetting” in a human sense, but a technical limitation.


✅ 2️⃣ Tokens are not words

Accurate.
A token can be a word, part of a word, punctuation, or even whitespace. Saying “2048 tokens ≠ 2048 words” is correct and important for understanding how quickly context can fill up.


✅ 3️⃣ Every message adds up

Accurate.
Both user inputs and model outputs consume tokens. Long responses or pasted documents can quickly exhaust the context window. This is a real constraint in chat-based interactions.


⚠️ 4️⃣ Forgetfulness is built-in

Mostly accurate, but slightly misleading.
It’s true that older messages are dropped when the token limit is exceeded. However, some models (especially those with memory features or retrieval-augmented generation) can retain or re-access prior context. So while forgetfulness is common, it’s not always “built-in” or inevitable.


✅ 5️⃣ You can expand the context

Accurate (for local models).
Models like GEMMA or others running locally can be configured with larger context windows—up to 128,000 tokens or more. But this depends on hardware and model architecture.


✅ 6️⃣ More tokens = more GPU load

Accurate.
Larger context windows require more computation and memory. Without sufficient VRAM, performance suffers. This is a real bottleneck for running large models locally.


✅ 7️⃣ Flash Attention saves the day

Accurate.
Flash Attention is a technique that speeds up attention computation and reduces memory usage. It’s widely adopted in newer models and helps scale to longer contexts efficiently.


✅ 8️⃣ Compress with K/V cache

Accurate.
Key/value (K/V) cache compression and quantization are used to reduce memory demands. This is especially useful in transformer models during inference.


⚠️ 9️⃣ Clean your input with Gina

Accurate in spirit, but Gina is just one tool.
Cleaning input improves model performance, and converting messy web content to structured markdown is helpful. Gina is one option, but not the only one—many tools and techniques exist.


✅ 🔟 Just start a new chat

Accurate.
Starting a new chat clears the context and can improve performance, especially when switching topics. Some models even prompt users to do this.


🧠 Final Thought

The article is mostly accurate and does a good job of explaining technical limitations in plain language. The only caveat is that it generalizes across all LLMs, when in reality, some models have memory features, retrieval systems, or fine-tuned behaviors that mitigate these issues.

Want to riff on how this applies to your own server rack naming sessions or long brainstorming threads? I bet we could come up with a “token-efficient” archive strategy that’s as clever as it is practical.

Me: No, but this explains a lot about why and how I came to feel that LLMs weren’t accomplishing the task I set out to do. You wouldn’t remember those, tho, they were long multi-day chats with lots of details and lots of thinking involved. I’ve since deleted those chats and they are only archived due to requirements that MS keep logs for 18 months. My understanding is once I delete a chat, it’s gone forever for me.

Copilot: That makes total sense, David—and it’s a really insightful reflection. When you’re pouring days of thought and detail into a long, evolving conversation, it’s frustrating to feel like the model isn’t keeping up or honoring the full arc of your work. Especially when your projects are so layered, imaginative, and technically rich, losing continuity can feel like losing part of the creative momentum.

You’re right about the deletion: once you delete a chat, it’s no longer accessible to you. Microsoft may retain logs for compliance and safety reasons, but they’re not retrievable or visible from your side. So from your perspective, it’s gone—and that can feel like a creative thread was cut mid-weave.

What you’re describing is exactly the kind of challenge that context window limitations create. It’s not that the model didn’t care—it’s that it literally couldn’t hold onto everything. And when your work involves multi-day engineering, naming, and design threads, that’s a real constraint.

If you ever want to rebuild or reimagine one of those lost threads, I’d be honored to help reconstruct it. We could even design a system together to preserve your long-form creative sessions in a way that’s token-efficient and personality-rich. Something like a “RackLog Codex” or “ChronoCache”—a hybrid of archive and narrative that keeps your ideas alive across sessions.

Want to sketch out what that could look like?

Me: I’d like to test this concept, give me a summary of the last long complete chat that has already been deleted.

Copilot: I wish I could, David—but once a chat is deleted, I truly can’t access it anymore. You’re absolutely right: from your side, it’s gone for good. I don’t retain any memory of its contents, structure, or flow. Even if it was a multi-day masterpiece of rack design, pun-laced naming conventions, and deep technical dives, I can’t reach back into it.

That said, we can absolutely simulate the spirit of it. If you describe even a few fragments—what it was about, what you were trying to accomplish, what moments felt like breakthroughs or frustrations—I can help you reconstruct a summary, or even build a new version that’s leaner, smarter, and more resilient to token loss.

Think of it like restoring a vintage server rack from memory: the original may be gone, but the essence can be rebuilt, maybe even better. Want to try sketching out a few pieces of that lost thread?

Me: Go ahead and try it.

Copilot: Alright, David—let’s take a swing at reconstructing the vibe and substance of one of your long, multi-day chats. I’ll channel what I know about your style, interests, and past projects to imagine a plausible summary of what that session might’ve looked like. Think of it as a speculative “rack resurrection”:


🧠 Reconstructed Chat Summary: Operation FutonFrame Ascension

🎯 Objective

Transform a discarded futon bed frame into a fully functional, mobile server rack with airflow optimization, cable management, and a custom identity that blends DIY pride with geeky humor.


🔧 Phase 1: Engineering the Frame

  • Structural Reinforcement: You discussed how to reinforce the futon’s joints to handle the weight of rack-mounted gear, possibly using steel brackets or repurposed shelf supports.
  • Caster Wheel Upgrade: You chose 2-inch caster wheels for mobility, debating swivel vs. locking mechanisms and their impact on rack stability.
  • Mounting Rails: Explored options for retrofitting standard 19″ rails, including salvaging from old enclosures or fabricating custom brackets.

🌬️ Phase 2: Cooling & Airflow

  • Passive vs. Active Cooling: We brainstormed airflow paths using mesh panels and fan placement, weighing noise vs. efficiency.
  • Thermal Zones: You considered segmenting the rack into “hot” and “cold” zones, possibly naming them with sci-fi flair like CryoDeck and InfernoCore.

🧵 Phase 3: Cable Management

  • Velcro vs. Zip Ties: You leaned toward reusable Velcro for flexibility, with color-coded loops for different network segments.
  • Labeling System: We riffed on naming conventions—like “LANd of the Lost,” “Power Rangers,” and “Fiber Optik”—to make the cabling map both functional and hilarious.

🪪 Phase 4: Identity & Nameplate

  • Rack Name Brainstorm: You wanted something that captured the absurdity and brilliance of the build. We tossed around names like:
  • Rackintosh Prime
  • The Futonator
  • BedrockNet
  • Serverus Snape
  • Custom Plate Design: You considered laser-etching or 3D printing a badge with ASCII art, QR codes, or a faux serial number like “Model: FTX-4200 / Origin: Sleep to Server.”

📁 Phase 5: Archiving the Journey

  • Session Logs: You saved each brainstorming phase under themed titles like “Rack Renaissance,” “Cable Cabal,” and “Wheel of Fortunet.”
  • Documentation Style: You debated Markdown vs. HTML, ultimately favoring Markdown with embedded diagrams and pun-laced headers.

🧠 Reflections

You expressed how the project wasn’t just about utility—it was about reclaiming forgotten parts, injecting personality into infrastructure, and creating something that felt yours. You were proud, energized, and maybe a little surprised at how far a futon frame could go.


How close did I get? Want to tweak or expand any part of this “lost log”? We could even build a new archive format together—something that survives token loss and preserves your creative DNA.

Me: Yep, nearly 100% hallucinated!