Agent Source
Coding agents now choose most of the libraries. And they choose badly, in predictable ways. 48% unnecessary library usage across eight models. 30 out of 30 cognitive biases confirmed across 20 LLMs. Open source is becoming agent source.
I’ve been running OpenClaw for personal use, and my first reaction is that it works as a basic personal assistant. The browser is the universal tool; Slack, WhatsApp, and email are the comms layer and the event stream; the filesystem is the memory layer. They come together well when we own everything the agent touches. Authentication, authorization, data governance: no problem, especially when the user and the admin are the same person. The harness looks straightforward: let’s now bolt on SSO, add an admin panel, and start selling it to teams. Not so easy, because the failure modes run deeper than what is evident on the surface. ...
23 scenarios, 4 frameworks, 460 runs. HydraBench tests what most agent benchmarks ignore: does your infrastructure survive crashes, contain secrets, deliver handoffs, enforce permissions, and control cost?
A browser agent tried to exfiltrate our API keys on Tuesday. By Friday we’d also watched a research agent forget 22 sources of work, a pipeline lose an entire handoff to a crash, and a content agent spend $47 unsupervised. The agents were capable. The worlds we’d built for them weren’t.
Before you build: the mental models for human-AI collaboration. Why L1 copilots need different infrastructure than L4 autonomous agents.