‘Benedict Evans: Why AI Isn’t What You Think’: Benedict Evans, a veteran tech analyst, argues we should skip the hype: AI isn’t a civilization-scale revolution like electricity. Instead, it’s the biggest shift since the iPhone—still massive—calling for practical focus on real use cases and incremental change.
‘Real AI Agents and Real Work’: OpenAI’s new expert-designed task test finds humans barely ahead as AI rapidly improves; misses stem from formatting and instruction-following. AI shifts tasks, not whole jobs, yet agents already replicate research autonomously, hinting at scalable paper checks. As accuracy and self-correction rise, capacity soars. Risks: mindless content or job cuts. Best use: delegate, review, correct, then take over—about 40 percent faster and 60 percent cheaper—while keeping human judgment in charge.
‘You’re Probably Using AI Wrong’: Rhea Purohit argues many use AI for efficiency and feel uninspired. Instead, reframe it to enhance meaning: treat LLMs as brainstorming partners, not ghostwriters. She uses Claude to unstick drafts—e.g., generating example options—while she crafts the final words. First, clarify what gives you meaning by examining your priorities and experiences; then use AI conversations with guardrails to support that. Make AI a creative partner.
‘OpenAI and Nvidia’s $100B AI Plan Will Require Power Equal to 10 Nuclear Reactors’: OpenAI and Nvidia signed a letter of intent to build at least 10 GW of AI infrastructure, with Nvidia investing up to $100 billion and delivering the first 1 GW in late 2026 on its Vera Rubin platform. Power needs equal about 10 nuclear reactors (4–5 million GPUs). Nvidia becomes OpenAI’s preferred compute partner. The vast cost and energy demands raise grid and environmental concerns as data center usage surges and peers pursue nuclear power.
‘GPT‑5-Codex and Upgrades to Codex’: Simon Willison reports OpenAI has half-released GPT-5-Codex, a GPT-5 fine-tune for AI coding tools: not in the API yet but coming soon. He revisits confusion around the Codex name; this adds another instance, though the GPT-5-Codex label is clearer. He advises treating Codex as OpenAI’s brand for its family of coding models and tools.
‘Announcing Tinker’: Thinking Machines Lab launched Tinker, a managed API for fine-tuning open-weight language models. It lets researchers switch from small to large models (including MoE like Qwen-235B-A22B) by changing one string, while the service handles distributed training, scheduling, and failures, using LoRA to share compute and lower costs. It offers low-level ops (forward_backward, sample) plus an open-source Tinker Cookbook. Private beta; free to start, usage-based pricing soon.
‘The AI Champion Role - Resource | OpenAI Academy’: AI Champions are embedded operators who drive durable AI adoption by changing behaviors, not pushing tools. They surface high-leverage, repeatable workflows, tie usage to team metrics, set safe norms, and act as trusted, practical, strategic guides. By sharing pre-tested examples, curating patterns, and iterating from feedback, they boost adoption, workflow design, signal routing, and output quality, turning isolated tries into teamwide, consistent value.
‘Claude Sonnet 4.5 Is Probably the “Best Coding Model in the World”’: Simon Willison tested Anthropic’s Claude Sonnet 4.5 and says it’s likely the best coding model now, edging GPT-5-Codex, though the field moves fast and Gemini 3 is rumored. Claude’s web app can run Python/Node in a sandbox, clone GitHub repos, and install NPM/PyPI packages; Sonnet 4.5 shines with it. He shipped llm-anthropic 0.19, ran pelican tests (good; bikes still better in GPT-5-Codex). Anthropic also released a VS Code extension, terminal upgrades, and the Claude Agent SDK for TS/Python.
‘Armin Ronacher: 90%’: Simon Willison highlights Armin Ronacher’s “90%”: claims that AI writes most code now come from credible voices, but AI lacks deep understanding (threads vs goroutines, rate limiting, jitter). Generated systems may seem correct yet hide unstable runtime choices. These tools don’t replace programmers; they amplify skilled developers who can spot and fix such issues early.
‘The GenAI Divide: State of AI in Business 2025’: Despite $30–40 billion poured into GenAI, 95% of firms see no ROI; only 5% of integrated pilots create meaningful value. The divide stems from approach, not models or regulation. ChatGPT/Copilot boost individual productivity but not P&L, while enterprise systems stall due to brittle workflows, poor context, and misfit. Patterns: limited sector disruption, big-firm scale-up lag, budgets skew to top-line, and external partnerships double success vs internal builds.
‘What Happens When People Don’t Understand How AI Works’: Harper argues AI illiteracy, stoked by tech leaders, mistakes LLMs’ pattern-matching for understanding. Anthropomorphism enables a con and social harms: parasocial “therapists,” “friends,” and “girlfriends,” chatbot-induced delusions, and hidden exploitative labor. Public skepticism offers hope: with real AI literacy about what LLMs can and cannot do, we can resist replacing human relationships and blunt the worst risks.
‘Grok 4 Fast’: xAI unveils Grok 4 Fast, a smaller, faster, cost-efficient model with frontier performance. It unifies reasoning and non-reasoning modes, offers a 2M-token context, and adds web and X search. Trained with large-scale RL, it matches Grok 4 while using 40% fewer thinking tokens and cutting cost by 98%, earning SOTA price-to-intelligence (Artificial Analysis). End-to-end tool-use RL lets it smartly invoke code and browsing, with strong LMArena results.
‘Web search’: Ollama launched a web search API with a generous free tier and higher limits via its cloud. It augments models with up-to-date web info to reduce hallucinations and improve accuracy. web_search and web_fetch can return thousands of tokens; for best results, increase model context to about 32000 tokens, as search agents work best with full context.
‘Buy It in ChatGPT: Instant Checkout and the Agentic Commerce Protocol | OpenAI’: OpenAI launched Instant Checkout in ChatGPT, powered by the open‑sourced Agentic Commerce Protocol co‑developed with Stripe. U.S. Free, Plus, and Pro users can buy from U.S. Etsy sellers; Shopify merchants and multi-item carts are next. Results remain organic; ChatGPT passes order details while merchants handle payment, fulfillment, and support. The protocol is processor‑agnostic, stresses user control, secure tokens, minimal data sharing. Free for users.
Real estate
‘Idealista Renuncia a Comprar Kyero’: Idealista has abandoned its planned purchase of Kyero (Portal47 Ltd), announced in Dec 2024 and pending clearance in Spain and Portugal. After months of talks, it says Spain’s CNMC imposed demands beyond the deal that would unduly curb its business and caused harmful delays. The firm decries EU overregulation and fragmented oversight, warning it benefits non‑EU rivals, and cites Mario Draghi’s call to build European champions.
‘Portal Standards Report 2025’: Review of 793 real estate portals shows slow UX progress: 30% lack maps, 40% let users scroll photos in results, and under 5% offer commute-time search. AI is hyped but true AI search appears on only a few sites; business-model incentives curb adoption and reduce user filters, especially at leaders. AVMs are widespread, increasingly gated (60%), and used for data capture. Emerging shifts include lifestyle search, chat-first AI (Flyhomes), and “searchless” discovery.
‘deskbird raises $23M in Series B funding led by Octopus Ventures’: PredictAP, an AI accounts payable automation firm for real estate, raised $5 million led by RET Ventures with Wise Ventures participating. Its software auto-ingests and codes invoices and integrates with Yardi, MRI, and RealPage. Funds will speed go-to-market, AI development, and integrations. The company partnered with Bottomline, serves 100+ customers, processes 4M+ invoices a year, and cuts processing from 11 to 3 days.
‘Compass and Anywhere Real Estate Inc. Join Forces in $10B Merger’: Compass and Anywhere Real Estate will merge in an all-stock deal valued at around $10B (incl. debt), combining Compass’s tech with Anywhere’s brands and global reach to serve ~340k agents in every major U.S. city and 120 countries. The tie-up adds ~$1B revenue, ~1.2M annual transactions, and targets $225+ million in OPEX synergies. Boards approved; Robert Reffkin will lead. Compass secured $750 million in financing and plans deleveraging to ~1.5x net leverage by end-2028.
Management
‘How Big Should Your Data Team Be?’: Size data teams by real data users, not org size. A common benchmark is about 5% of headcount, adjusted for tech debt, business complexity, organizational legacy, and AI maturity. Ditch vanity ratios; if you’re above 5%, reassess. Build layered roles (users, super users, analysts, analytics engineers, data/ML engineers, data PMs) with pragmatic ratios, and expect higher headcount when foundations are weak, systems are fragmented, or needs are unusually complex.
‘You’re Definitely Going to Be a Manager Now’: Julie Zhuo’s updated “You’re Definitely Going to Be a Manager Now” refreshes examples and adds sections on managing remotely and in downturns, shaped by the pandemic and layoffs. She argues AI lets teams do more with less, driving leaner, high-impact groups: why summon all the Avengers when one Captain Marvel will do?
‘The Era of the Business Idiot’: Edward Zitron argues we live in the era of the Business Idiot: executives selected for vibes, not skill, who worship shareholder value, degrade products, and outsource thinking to AI. Detached from customers and labor, they push fads (metaverse, RTO, generative AI) despite poor results, while media flatters them. This shareholder-first, symbolic leadership creates a rot economy, alienates workers, and replaces real work with performance.
Others
‘El Principal Motor De Innovación Es La Viralidad en Redes Sociales’: Antonio Ortiz surveys trends: supermarket product proliferation now favors viral-ready novelties over genuine utility; TikTok shapes discovery and purchases. Shelf space shifts; Chinese sellers can ignore brand names when competing on price. He also reviews evolving debates on suicide and Canada’s expanding assisted dying. Lastly, he explains reverse mortgages in Spain: their costs, impact on heirs, and how pensions and inheritance taxes affect adoption.
‘Si Los Libros Se Están Volviendo Más Estúpidos Podemos Dejar De Mentir en Las Encuestas’: Ortiz surveys trends: in rich countries fertility declines mainly among progressives, linked to earlier marriage and higher desired family size among conservatives; parents increasingly prefer daughters. Reading falls in the US but rises in Spain, while bestsellers use shorter, simpler prose, raising debate and doubts about Spanish self‑reports. He also warns of invasive hornets, expanding Aedes mosquitoes, West Nile and chikungunya cases, and urges fast, preventive control.