🌟 Vasilij’s Note
I was in Glasgow last week delivering a live presentation, and the AI news kept coming. This week wasn’t “just another model release” – the whole stack moved at once: open reasoning, cloud chips, video, sovereign AI, and even decentralised compute. Here’s the operator’s cut.

In Today's Edition:

This Week in Agents | What Changed

  • DeepSeek ships V3.2 + Speciale as open, reasoning-first models → GPT-5-class maths and coding performance you can self-host. DeepSeek

  • Google’s Gemini 3 + Deep Think lands → Multimodal agents get a “slow think” switch for hard problems, not a separate model line. Gemini 3

  • OpenAI declares “Code Red” on ChatGPT → Ads and side agents paused while they sprint on speed, reliability and personalisation. Independent

  • AWS rolls out Trainium3 servers + Nova updates → Cloud gets cheaper, denser AI infra with 4× performance and ~40% less power per box. Reuters

  • Runway launches Gen-4.5 → Text-to-video jumps again, topping Artificial Analysis with 1,247 Elo. Runway

  • Ukraine, Telegram and Black Forest Labs go “sovereign & open” → Ukraine is developing a sovereign LLM on Google’s Gemma framework for civil and government applications — signalling a shift toward national-scale open-weight deployments. StratNews Global

Plus a practical deep dive into DeepSeek V3.2 — what it does, how it works, and how to route real work to it in your agent stack.

Top Moves - Signal → Impact

Launch/Policy — DeepSeek V3.2 & Speciale go open
DeepSeek released V3.2 and V3.2-Speciale as Mixture-of-Experts models aimed squarely at agents, with Speciale scoring gold-medal results on the 2025 IMO, IOI and ICPC — all under an MIT-style licence. VentureBeat
Why it matters: This is the first time frontier-tier reasoning has arrived as open weights you can fine-tune, self-host and govern internally instead of renting via API.

Ecosystem shift — Gemini 3, AWS Trainium3 and Nvidia–Synopsys
Gemini 3 Pro with Deep Think sets new highs on Humanity’s Last Exam, GPQA Diamond and ARC-AGI-2, while AWS pushes Trainium3 (4× performance, ~40% less power). Nvidia invests $2B into Synopsys to bake GPU-accelerated AI into chip design. blog.google
Operating guidance: Model choices and infra choices are converging — expect Gemini, Nova, DeepSeek etc. to be optimised for specific chips. Build an abstraction layer so you can swap both models and accelerators without rewriting workflows.
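The abstraction-layer idea above can be sketched in a few lines. This is a minimal illustration, not a real SDK: every name here (`fake_gemini`, `fake_deepseek`, the registry keys) is hypothetical, and in practice each backend would wrap the vendor’s own client (Vertex, Bedrock, a vLLM endpoint, etc.). The point is that the rest of the workflow only ever calls one function, so swapping models or hosting targets is a config change.

```python
# Minimal sketch of a model-abstraction layer (all names hypothetical):
# each provider hides behind the same call signature, so swapping Gemini,
# Nova or a self-hosted DeepSeek endpoint never touches workflow code.

from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Completion:
    text: str
    provider: str

# Each backend is just a function prompt -> Completion; real versions
# would wrap the vendor SDKs or an OpenAI-compatible self-hosted server.
def fake_gemini(prompt: str) -> Completion:
    return Completion(text=f"[gemini] {prompt}", provider="gemini-3-pro")

def fake_deepseek(prompt: str) -> Completion:
    return Completion(text=f"[deepseek] {prompt}", provider="deepseek-v3.2")

REGISTRY: Dict[str, Callable[[str], Completion]] = {
    "default": fake_gemini,
    "self_hosted": fake_deepseek,
}

def complete(prompt: str, route: str = "default") -> Completion:
    """Single entry point the rest of the workflow calls."""
    return REGISTRY[route](prompt)

print(complete("Summarise this contract.", route="self_hosted").provider)
```

The same registry pattern extends to accelerators: keep the hardware target as deployment config behind the endpoint, never in agent code.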

Security/Compliance — Sovereign AI and confidential compute
Ukraine is building a national LLM on Gemma for civil and military use; Telegram activates Cocoon — a decentralised confidential compute network on TON paying GPU owners to run encrypted AI workloads. StratNews Global
Risk/opportunity: Regulators will increasingly ask “why send this data to US SaaS?” Open weights + sovereign builds + confidential compute create new placement options — and new governance responsibilities.

Upskilling Spotlight | Learn This Week

Google: A new era of intelligence with Gemini 3 — Outcome: understand Deep Think, benchmark gains, and how to wire Gemini 3 into agents via AI Studio or Vertex.
Read the guide

Runway Gen-4.5 announcement — Outcome: get a realistic view of text-to-video progress (motion consistency, temporal control, keyframes) to decide if video belongs in your workflows. Runway

Maker Note | What I built this week

This week I re-cut my stack: DeepSeek V3.2-Speciale for the hardest text-only reasoning chains, Gemini 3 Pro for multimodal and tool-heavy flows, and a “GPU-ready” deployment plan in case we need to bring open weights in-house for regulated work.

Operator’s Picks | Tools To Try

LangSmith Agent Builder
Use for: Designing production-grade agents via a visual, agent-by-agent workflow.

Standout: For the first time, you can build agents as modular components with roles, evaluation sets, routing logic and telemetry — treating agent development like real software engineering instead of prompt alchemy.

What’s new:

  • Visual agent graph

  • Component-level evaluations

  • Step-by-step traces & introspection

  • Configurable policies (guardrails, retries, fallbacks)

  • Instant deployment to endpoints or LangGraph

Why operators should care: most agent failures aren’t model failures — they’re architecture failures. Agent Builder enforces explicit design, testability and observability for multi-step systems.

Deep Dive | DeepSeek V3.2

Why this matters now. DeepSeek has released two new open-weight models – V3.2 and V3.2-Speciale – that match or beat the best models in the world on difficult reasoning tasks. They’re MIT-licensed, meaning you can run them yourself, customise them, and avoid vendor lock-in. For the first time, a genuinely frontier-class model is available outside the big US companies.

What DeepSeek Actually Built

  • A huge model that’s cheap to run

    • DeepSeek uses a Mixture-of-Experts design. Think of it as a model with many specialists inside it, but only a few are used at any time.

    • Result: the intelligence of a giant model at the cost of a medium one.

  • Better handling of long documents

    • Their new Sparse Attention system makes it far more efficient to process long texts (up to 128k tokens). If your agents read PDFs, contracts or codebases, this matters.

  • A training approach focused on reasoning

    • DeepSeek didn’t just train on internet text. They used:

      • Maths olympiad problems

      • Coding competitions

      • Logic puzzles

      This is why DeepSeek does extremely well at maths, structured problems and multi-step reasoning.
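The Mixture-of-Experts idea above can be shown with a toy routing sketch. This is purely illustrative and not DeepSeek’s actual implementation: a gate scores every expert per token, only the top-k run, so per-token compute stays small even though total parameters are large.

```python
# Toy illustration of Mixture-of-Experts routing (not DeepSeek's real code):
# score all experts, activate only the top-k per token.

import math
import random

random.seed(0)

NUM_EXPERTS, TOP_K = 8, 2

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(token_scores):
    """Pick the top-k experts and their normalised weights for one token."""
    ranked = sorted(range(NUM_EXPERTS), key=lambda i: token_scores[i], reverse=True)
    chosen = ranked[:TOP_K]
    weights = softmax([token_scores[i] for i in chosen])
    return list(zip(chosen, weights))

scores = [random.random() for _ in range(NUM_EXPERTS)]
print(route(scores))            # only 2 of 8 experts fire for this token
print(TOP_K / NUM_EXPERTS)      # fraction of expert compute used: 0.25
```

With 2 of 8 experts active, only a quarter of the expert parameters do work per token — that is the “intelligence of a giant model at the cost of a medium one” trade-off in miniature.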

V3.2 vs Speciale — Simple Explanation

  • V3.2

    The general version. Good for agents, coding, RAG and everyday reasoning. Faster and more flexible.

  • V3.2-Speciale

    The “expert” version. Excellent at proofs, maths and hard logic. Slower and not meant for casual conversation. Best used as a problem-solver or teacher model.

Issues / Backlash

  • Practitioners report that benchmark wins don’t always translate to better everyday UX – DeepSeek can feel more “formal” or slower, and slow-think modes like Deep Think are overkill on simple asks.

  • There is also geopolitical discomfort about leaning heavily on Chinese open-weight models for sensitive workloads and questions about whether sovereign stacks will just fragment standards further.

My Take (What to do)

  • Startup: Put a cheap default model in front and route “hard mode” traffic (multi-step reasoning, long context) to either DeepSeek V3.2-Speciale (if you can handle the infra) or Gemini 3 Deep Think (if you want managed). Cap slow-think spend with an explicit budget and logging.

  • SMB: Keep customer-facing flows on managed providers (OpenAI, Gemini, Nova). Use this moment to pilot one internal DeepSeek or Gemma-based workload where sovereignty or cost actually matter (e.g. contracts, internal docs), so you have an escape route if pricing or policy changes.

  • Enterprise: Treat sovereign AI as a real workstream, not a press release: evaluate Gemma/DeepSeek for on-prem, and write down clear criteria (data classes, regions, risk levels) for when workloads must be kept off US SaaS. Align infra teams now on how Trainium3, Nvidia-backed Synopsys tools and decentralised compute like Cocoon fit into your 3–5-year plan, even if you don’t act immediately.

How to Try (15-minute path)

  1. Grab 20–30 real prompts from your agents (support tickets, internal research, planning tasks) and run them against your current default model, DeepSeek V3.2-Speciale, and Gemini 3 (with and without Deep Think).

  2. Log quality, latency and approximate unit cost per run (based on public pricing) and tag which tasks genuinely improved with “slow think” or open weights.

  3. Update your router: send only those tagged cases to DeepSeek/Deep Think and leave everything else on the cheap path. Success metric: ≥15–20% improvement on task completion or accuracy for that subset, with neutral or lower overall cost.
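Step 3’s router can be sketched as a simple tag check plus a spend cap. Everything here is an assumption for illustration — the tag names, the budget figure, the cost estimates and the model labels are placeholders, not published pricing — but the shape is the point: tagged hard cases go to the slow path, everything else stays cheap, and slow-think spend can never run away.

```python
# Illustrative router (tags, prices and model names are assumptions):
# tagged "hard" cases take the slow-think path, capped by an explicit budget.

SLOW_THINK_BUDGET_USD = 5.00   # hypothetical per-day cap
spend = {"slow_think": 0.0}

# Tags produced by the evaluation pass in step 2.
HARD_TAGS = {"multi_step_reasoning", "long_context"}

def route(task_tags, est_cost_usd):
    """Return which path a task takes, respecting the budget cap."""
    wants_slow = bool(HARD_TAGS & set(task_tags))
    within_budget = spend["slow_think"] + est_cost_usd <= SLOW_THINK_BUDGET_USD
    if wants_slow and within_budget:
        spend["slow_think"] += est_cost_usd
        return "deepseek-speciale"   # or "gemini-3-deep-think" if managed
    return "cheap-default"

print(route({"multi_step_reasoning"}, 0.40))  # hard and affordable: slow path
print(route({"faq"}, 0.01))                   # simple ask stays on the cheap path
```

Logging each decision alongside quality scores gives you exactly the data needed to check the ≥15–20% improvement target for the slow-think subset.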

How to Try (Quick Version)

  1. Pick 5–10 hard problems from your real work.

  2. Test them on DeepSeek V3.2, Speciale, GPT-5 and Gemini 3.

  3. Compare correctness, clarity and cost.

If DeepSeek wins even a few, it’s worth integrating into your agent stack.

What you’ll actually notice in use

  • DeepSeek is stronger at deep thinking than GPT-5 on many technical tasks.

  • It’s less polished for casual chat.

  • Costs can be much lower if hosted properly.

  • It’s ideal for agents that need accuracy, structure and long-context analysis.

Adoption challenges to be aware of

  • Self-hosting requires real engineering capability.

  • You need your own safety and compliance controls.

  • Some organisations may hesitate due to the model’s origin (China).

This doesn’t reduce its technical quality, but it affects procurement choices.

Spotlight Tool | Telegram Cocoon

Telegram Cocoon - Purpose: Privacy-first AI compute. Edge: decentralised confidential inference with built-in incentives.

→ Decentralised GPU marketplace • End-to-end encrypted AI workloads • TON-based rewards for node operators IQ.wiki

Try it: Explore Cocoon as a future option if you need private inference at scale and don’t want to be tied entirely to AWS/Azure/GCP.

What did you think of today's email?

Let me know below


AiGentic AI Readiness Assessment — A fast, honest snapshot of how ready your business is for AI agents, plus a concrete action plan instead of vague hype. Try: insights.aigenticlab.com

Did you find it useful? Or have questions? Please drop me a note. I respond to all emails. Simply reply to the newsletter or write to [email protected]

Referral - Share & Win

AiGentic Lab Insights
