The Last Hardware Problem

PUBLISHED OCTOBER 16, 2025

Introduction

The First Steps

In 1969, we put humans on the Moon using computers less powerful than a parking meter. Today, we have machines that can explain quantum mechanics in iambic pentameter, but we still can’t build a robot that can reliably fold our laundry.

This is a well-known paradox of the AI field, and it’s also a blind spot for the West.

The AI revolution has achieved miracles in the digital realm: language models that can write and reason, superhuman vision systems, and game-playing agents that discover new, superior strategies after millennia of human play. But every breathless headline, trillion-dollar valuation, and x-risk manifesto rests on an assumption so fundamental we’ve forgotten to question it: that intelligence can exist without a body. That cognition can be divorced from interaction. That you can understand the world without touching it.

And you’ll notice a curious asymmetry in our progress. We’ve been teaching machines to think, while others teach them to work. China is perhaps the most enthusiastic (and credible) proponent of this alternate path. It is not chasing AGI through chatbots or the cloud, but with mechanical hands learning to grasp; robotic eyes learning to see objects, not pixels; and steel bodies learning the only lesson that matters for embodied AI: how to manipulate physical reality.

America isn’t a complete stranger to these challenges. We have the brightest researchers, the most brilliant labs, and well-capitalized startups working on embodied AI.

But, by and large, we are solving for elegance in controlled conditions.

China’s path is not necessarily better, yet, but it’s definitely bigger and faster. They install one of every two industrial robots, ship humanoids at laughably low price points, control supermajority shares of key upstream materials (and regularly threaten to turn off the tap), and have filed 4x more robotics patents in the last five years than us.

Pure research pushes boundaries, yes, but in technology, scale has a way of becoming quality. Every deployed robot helps conquer the torso and tail of edge cases you’ll face through the messy complexity of real-world deployment.

This is a story about two different bets on machine intelligence: perfection in the lab versus proficiency in the world. Both paths could lead us somewhere important, but only one will win the robotics race.

And we need to ask ourselves: are we on the winning path?

Section 001

The Dexterity Dilemma

dex·ter·i·ty /dekˈsterədē/ noun

readiness and grace in physical activity; especially: skill and ease in using the hands
mental skill or quickness; ADROITNESS

See also: the trillion-dollar problem we haven’t been able to solve; what every toddler masters but no machine can replicate.

There are two videos that Rodney Brooks, legendary MIT roboticist and co-founder of iRobot, likes to show those who think we’re close to solving robotic manipulation.

In the first, a woman picks up a match and lights it. Takes about seven seconds. Normal stuff. In the second, researchers at Umeå University anesthetize just her fingertips and ask her to try again. She still has perfect vision. Full muscle control. Complete awareness of where her hands are in space. She’s only lost sensation in her fingertips. The seven-second task takes nearly half a minute. She fumbles for the match, drops it, can’t orient properly between her fingers.

Without touch, just touch, the simple becomes impossible.

Therein lies the secret to the grand challenge ahead of us. We’re building robots that are fundamentally blind to the most important sense for manipulation. Worse, we’re trying to teach them to see what must be felt.

The hard problems are easy, and the easy problems are hard.

Hans Moravec discovered something in 1988 that has humbled us ever since. Today’s AI can beat Kasparov at chess. It can solve math proofs that would stump a Nobel laureate. It can generate images that fool art critics.

But ask it to pick up a grape with grace? To tie a shoelace?? To fold a fitted sheet???

This is Moravec’s Paradox, that it’s “comparatively easy to make computers exhibit adult level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility.”

Said more bluntly: Evolution spent 500 million years perfecting sensorimotor intelligence, while chess is only about 1,500 years old.

Abstract reasoning is really just a recent software patch running on ancient hardware. Your cerebellum alone, which handles movement and balance, contains half your brain’s neurons despite being only a tenth of its volume. We dedicate more neural real estate to controlling our thumbs than our entire backs.

Most humans can catch a ball but can’t calculate its trajectory. We are all physics savants who can’t do physics.

What makes human dexterity so extraordinary?

Start with the hardware. Our hands are engineering marvels, with 27 degrees of freedom (DoF) and ~17,000 mechanoreceptors in the hairless skin alone. Each fingertip has ~1,000 specialized sensors: not simple on/off switches, but 15 different types of neurons firing in concert, detecting pressure, vibration, stretch, slip, temperature. They fire in milliseconds, adjusting your grip before your conscious mind even registers a problem.

The most advanced robotic hands, if they’re very, very, very expensive, might have 94 tactile sensors. Total. Most have far fewer.

We’re seeing research prototypes of “sensory skins” and fingertip sensors – and I’m excited by those developments – but integrating these sensors with control algorithms is extraordinarily hard. A high-resolution tactile sensor generates a firehose of data that the robot must interpret in real time. It’s an immense perception and compute problem to even approach the feedback bandwidth human hands possess.

-Dan Goldin

And…raw sensing is only half the story.

Elasticity is where the magic happens.

Human muscles and tendons form a sophisticated spring system, storing and releasing energy with every movement. When you jump and land, your tendons absorb the impact through elastic stretch, then release it gradually. This decouples joint movement from muscle lengthening, reducing peak forces and creating the smooth, controlled motions we never think about.

Robotic actuators are rigid electric motors that know two states: on or off. They can be strong and precise, but they’re unforgiving. A minor miscalculation strips gears or crushes whatever they’re holding. Recent “series elastic actuators” add springs to absorb shock, giving robots some “give” and inching closer to the finesse of biological muscle. But achieving the combined strength and delicacy of a human limb remains a long way off.

I learned this firsthand at NASA. The Mars Pathfinder’s ‘advanced’ robotic arm had exactly one degree of freedom: a simple spectrometer mount. We didn’t dare attempt anything more complex because we knew it wouldn’t survive the journey, let alone function reliably once there.

-Dan Goldin

Robotically reproducing even a tiny fraction of the human hand’s capability, then, is one of the grand challenges of our times.

Why can’t our robots cross the uncanny valley of touch?

Dexterity, in the end, is the keystone capability that would transform robots from specialized tools into general partners in human environments. We won’t cross this frontier without fundamentally new approaches in both learning (the robot’s “brain” to handle the unexpected) and hardware design (the “body” to act with nuance). This leads directly to the next major barrier: the data needed to teach a robot to be dexterous.

Section 002

Data

If intelligence is fueled by data, robots are running on fumes.

GPT-4 devoured trillions of words scraped freely from the internet: a reported 13T tokens, representing just 1.8 hours of all human speech worldwide. Yet this ‘microscopic’ slice was more than sufficient to train and deploy one of our most capable AI systems.
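The 1.8-hour comparison can be reproduced with a quick back-of-the-envelope calculation. The sketch below is one plausible reconstruction, not necessarily how the figure was derived: only the 13T-token count comes from the text, while the words-per-token ratio, world population, and words spoken per person per day are assumptions chosen for illustration.

```python
# Back-of-the-envelope check of the "1.8 hours of human speech" comparison.
# Only TRAINING_TOKENS comes from the article; the other inputs are assumptions.

TRAINING_TOKENS = 13e12          # from the text: ~13T tokens
WORDS_PER_TOKEN = 0.75           # assumption: common rule of thumb for English
WORLD_POPULATION = 8e9           # assumption: ~8 billion people
WORDS_SPOKEN_PER_DAY = 16_000    # assumption: average words spoken per person per day


def hours_of_global_speech(tokens: float) -> float:
    """Convert a token count into hours of worldwide human speech."""
    words = tokens * WORDS_PER_TOKEN
    global_words_per_hour = WORLD_POPULATION * WORDS_SPOKEN_PER_DAY / 24
    return words / global_words_per_hour


gpt4_hours = hours_of_global_speech(TRAINING_TOKENS)
print(f"{gpt4_hours:.1f} hours of global speech")  # prints "1.8 hours of global speech"
```

Every input scales the answer linearly, so halving the assumed words spoken per day would double the result to roughly 3.7 hours. The order of magnitude, hours rather than years, is what matters for the argument.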

And the scale of AI inference today is staggering. Google’s Gemini alone now processes over 1.3 quadrillion tokens per month (an annual run rate of ~16 quadrillion tokens). To put this in perspective: if GPT-4’s training represented 1.8 hours of human speech, Gemini alone now processes about 100 times that, the equivalent of roughly a week of global human conversation, every single month.

Meanwhile, robots starve. There’s no “Internet of physical interactions” for them to learn from. Every robotic experience must be generated through real-world trials or high-fidelity simulations, which are slow, expensive, and typically proprietary. We have to generate this data painstakingly, one interaction at a time, often via teleoperation, where humans guide robots through tasks, recording each movement as training data.

Eight hours of robot teleoperation produces exactly eight hours of training data. No parallelization. No shortcuts. While LLMs feast on centuries of text in minutes, teaching a robot to fold laundry requires a human operator to physically guide it through the motion, reset the laundry, and repeat…hundreds of times.

The cost asymmetry is tough. Tesla pays $25-48/hour for motion-capture teleoperation data. Industry estimates suggest robots need 500,000 to millions of hours of diverse interaction data. At the midpoint of that wage range, even the low end of 500,000 hours comes to roughly $18M in labor costs alone. Meanwhile, LLM training data costs converge towards $0 (compute, of course, is another story entirely).
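The $18M figure is consistent with applying the midpoint of the quoted wage range to the low-end estimate of 500,000 hours. The sketch below works through that arithmetic and shows how the bill grows with hours. The wage range and the 500,000-hour figure come from the text. The 5,000,000-hour case is a hypothetical stand-in for "millions of hours", and wages are only one part of the real cost, which also includes hardware, resets, and facilities.

```python
# Rough labor-cost range for teleoperation data collection.
# Wage range and the 500,000-hour low end come from the text; the 5M-hour
# figure is a hypothetical stand-in for "millions of hours".

WAGE_LOW, WAGE_HIGH = 25.0, 48.0      # USD per hour, from the text
HOURS_LOW = 500_000                   # from the text
HOURS_HIGH = 5_000_000                # assumption, for illustration only


def labor_cost(hours: float, wage: float) -> float:
    """Labor cost in USD for a given number of teleoperation hours."""
    return hours * wage


wage_mid = (WAGE_LOW + WAGE_HIGH) / 2
for hours in (HOURS_LOW, HOURS_HIGH):
    low = labor_cost(hours, WAGE_LOW)
    mid = labor_cost(hours, wage_mid)
    high = labor_cost(hours, WAGE_HIGH)
    print(f"{hours:>9,} h: ${low/1e6:.1f}M to ${high/1e6:.1f}M (midpoint ${mid/1e6:.2f}M)")
```

At 500,000 hours the range is $12.5M to $24.0M with a midpoint of $18.25M. Because cost scales linearly with hours, a tenfold increase in data means a tenfold increase in labor spend.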

Heroic Attempts That Aren’t Enough

Open X-Embodiment represents perhaps the most ambitious attempt to mitigate the data deficit challenge: 21 research labs pooling data from 22 different robot types into a single dataset, producing 1M+ real-world robotic trajectories covering hundreds of skills.

It’s the largest open robotics dataset ever assembled. It took dozens of institutions years to gather. GPT-4 processes that much text while you’re reading this sentence.

Berkeley and Google’s DROID project tells a similar story: 76,000 human-teleoperated trajectories required 350+ hours of human operator time and sophisticated crowdsourcing. The cost per useful data point makes even the most expensive human labeling look cheap. Sadly, we are not just data-poor…we’re data-destitute.

America’s Cleverness vs. China’s Brute Force

The United States excels at algorithmic efficiency: squeezing insights from minimal data, using simulation to fill gaps, and developing one-shot learning techniques. It’s intellectually elegant but insufficient.

China, by comparison, chooses massive deployment. Last year, it installed 280,000 new industrial robots (51% of the global total) versus America’s 34,000.

China’s robot density in manufacturing has surged to 470 robots per 10,000 workers, making it third globally. America sits at tenth place with 295 per 10,000.

Learning by Doing, and Vicious Cycles

Each one of those quarter-million Chinese robots deployed last year is generating real-world data. Every shift, every failure, and every success is captured at industrial scale. Even uncurated, this ocean of experience teaches what no simulation can.

Meanwhile, American robots remain stuck in pilot projects or lab demos. This is a cruel catch-22, a vicious cycle that goes something like this:

No deployment → No data → No improvement → No deployment.

We’re in demo loops, waiting for the perfect robots. Competitors ship imperfect units by the hundreds of thousands. Guess who learns faster?

Compute and capital may not fix this alone. Breaking out of the cycle is partly a data problem, but it’s also a simulation and validation problem…which brings us to the next section.

Section 003

Simulation ⇄ Reality

Simulation promises to help solve our data crisis, and it does help in one specific way: multiplying real-world demonstrations into synthetic training data.

Nvidia, for example, recently turned 60 actual human demonstrations of a task into 20,000 synthetic ones using simulation and AI (a multiplier of more than 300×). It’s a powerful concept: use a small amount of real data to seed vast amounts of permutations in silicon. Techniques such as Real2Sim2Real show we can augment the trickle of real data with a flood of fake data.

But ultimately, all those virtual trials have to be validated and fine-tuned on real hardware.

Digital Twins’ Dirty Secret

Today, we design airliners via computer models. The Air Force has invested heavily in digital fighter jet simulators. Automakers tout “virtual factories,” while biotech companies talk longingly about digital labs. So, why can’t we just simulate our way to robotic dexterity?

Because simulations can’t capture the gritty, contact-rich physics that dexterity requires. The world has many physical phenomena we simply cannot model with sufficient fidelity, or that would require impractical computing levels to comprehensively simulate.

Subtle frictional effects, small material deformations, wear and tear over time, and sensor noise characteristics are often oversimplified or ignored in simulation.

A recent MIT review was blunt on this: simulating realistic contact physics remains an open problem. Not merely challenging or computationally intensive, but open. As in: we don’t know how to do it.

Even distinguishing between soft rubber and hard plastic in a robot’s grip defeats our best physics engines. A human instantly feels the difference, but our simulations can’t model it properly.

I’ve seen million-dollar aerospace simulations be wrong about something as “simple” as a gasket’s friction coefficient, leading to nasty surprises in real hardware tests.

-Dan Goldin

Then, there’s also the curse of dimensionality.

Reality is fractal in its complexity. Every surface has micro-textures. Every material deforms uniquely. Temperature changes everything. Humidity changes everything else. We can’t simulate all of that, so we have to approximate. But enough approximations can compound into fantasy.

It’s like training a boxer on a heavy bag, then throwing them into a street fight. The fundamentals might transfer but the chaos will eat them alive. Roboticists call this the sim-to-real gap. An algorithm that aces every virtual test can crumble when reality introduces one unmodeled variable: 5% more friction in a joint, a slightly sticky surface, or a cable that wasn’t in the CAD file.

This is why serious robotics tends to include real-world fine-tuning after simulation training: you’re using actual experience to correct simulation’s distortions.
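To make the gap concrete, here is a toy illustration, not a model of any real robot. A block is pushed with a fixed initial speed and slides to rest under Coulomb friction, so its stopping distance is v²/(2μg). A push planned with the simulator's friction coefficient lands short when the real surface has 5% more friction. The target distance and friction values are hypothetical, chosen only for illustration.

```python
# Toy sim-to-real gap: an open-loop push planned with the simulator's
# friction coefficient misses its target when real friction is 5% higher.
# All numeric values are hypothetical, chosen only for illustration.

G = 9.81  # gravitational acceleration, m/s^2


def sliding_distance(speed: float, mu: float) -> float:
    """Distance (m) a block slides before stopping under Coulomb friction."""
    return speed**2 / (2 * mu * G)


def push_speed_for(distance: float, mu: float) -> float:
    """Initial speed (m/s) needed to slide exactly `distance` metres."""
    return (2 * mu * G * distance) ** 0.5


MU_SIM = 0.30               # friction coefficient assumed in simulation
MU_REAL = MU_SIM * 1.05     # reality: 5% more friction
TARGET = 0.50               # desired slide distance in metres

planned_speed = push_speed_for(TARGET, MU_SIM)       # tuned in simulation
actual = sliding_distance(planned_speed, MU_REAL)    # executed in reality
error_mm = (TARGET - actual) * 1000

print(f"planned {TARGET:.3f} m, achieved {actual:.3f} m, short by {error_mm:.1f} mm")
```

In this sketch the block stops a couple of centimetres short. That is harmless when shoving a box across a table and fatal when seating a connector, which is the kind of error that real-world feedback has to correct.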

What Simulation Can’t Simulate

You can’t simulate 10,000 hours of mechanical wear. You can’t predict when a sensor will drift out of calibration. The only way to know if a robotic component will survive years of use is to test it for years of use (or find clever accelerated life-test regimes).

In aerospace and automotive engineering, we still put hardware on test rigs and run it to failure. There’s no substitute for metal fatigue, thermal cycling, and the million tiny degradations that separate pilots from products.

Simulation is seductive because it’s fast, safe, and cheap. Reality is comparatively slow, dangerous, and expensive. To crack robotics, we must embrace both: the speed of virtual iteration and the truth of real-world testing.

Section 004

Supply Chain + Scaling

America’s history is full of brilliant inventions that end up being scaled elsewhere. We already can and do design the world’s most sophisticated robots. But can we build them at scale?

If you want to assemble a cutting-edge humanoid in America, your shopping list will look something like this:

rare-earth magnets (90% from China)
harmonic drives (Japan)
precision bearings (Germany/Japan)
high-resolution encoders (Asia)
battery cells (East Asia)
power electronics (mostly imported)

Recall that China installed 280,000 industrial robots last year to our 34,000. A decade ago, Chinese robot makers were negligible. Now they supply nearly half their domestic market. Meanwhile, we import most of our robots from Japan, Germany, and increasingly China. This dependency could become as dangerous as our reliance on foreign hydrocarbons in decades past.

Alas, we are missing entire industries and processes.

The net effect is that an American company trying to build a humanoid robot finds itself with a shopping list heavily reliant on imports. And if those imports are slow, expensive, or politically sensitive, you’ve got a problem.

Manufacturing processes are similarly instructive.

Building advanced robots requires precision casting, micron-tolerance assembly, exotic alloys, and clean-room semiconductor processes. The U.S. maintains world-class capabilities in aerospace and semiconductors, but we’ve let other critical manufacturing know-how atrophy and slip overseas, especially in the electromechanical components, small electric motors, and precision gears that robots demand.

The Vertical Integration Trap

These gaps force American robotics startups into extreme vertical integration. Founders frequently discover that almost every critical component either doesn’t exist off the shelf or comes from an immature supply chain. They end up developing custom actuators and manufacturing many parts in-house, not from ego but from necessity. “If there had been a vendor, we would have bought it,” they often tell us. This builds expertise but demands enormous capital. Meanwhile, Chinese companies source everything domestically from suppliers built around consumer electronics and EVs.

The $5,900 Reality Check

Earlier this year, Hangzhou-based Unitree unveiled the R1, a small humanoid robot that retails for $5,900. That’s no typo. For under $6K, you can get a humanoid platform that’s an order of magnitude cheaper than anything you can buy from a Western developer.

How is this so? It should be a familiar story at this point: Chinese OEMs leverage existing ecosystems, repurposing motors from electric scooters and drones, tapping inexpensive sensor supplies, and assembling with the efficiency that comes from building tens of thousands of units.

Chinese firms provide off-the-shelf what U.S. teams must custom-build. They accept thinner margins, often with strategic subsidies, prioritizing market share and rapid iteration over profits.

So, the <$6,000 R1 may not be high-margin, and is in all likelihood loss-making, but it puts units in the field, generates data, forces competitors to react, and collapses adoption timelines. This should be a wake-up call: without a robust supply chain and aggressive scaling, Western efforts will face cost curves they can’t easily bend.

The Scaling Imperative

Mass production transforms economics. This is the dynamic Tesla captured with batteries: scaling production dramatically cut the cost per kWh. For robotics, an expensive $10,000 LiDAR sensor might drop to $1,000 at mass-production volumes.

DJI proved this with drones: they reached high-volume production, achieved unbeatable price-performance levels, and captured as much as 90% share in the consumer drone market. And never looked back.
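One common way to reason about this dynamic is an experience curve, often called Wright's law, in which unit cost falls by a fixed fraction each time cumulative production doubles. The sketch below is illustrative only: the 20% learning rate is an assumption rather than a figure from this piece, and real learning rates vary widely by technology. It asks how many doublings of cumulative volume would take the $10,000 sensor in the example above down to $1,000.

```python
import math

# Experience-curve (Wright's law) sketch: cost falls by LEARNING_RATE with
# every doubling of cumulative production. The learning rate is an assumption.

LEARNING_RATE = 0.20      # assumption: 20% cost reduction per doubling
START_COST = 10_000.0     # from the text: expensive LiDAR unit, USD
TARGET_COST = 1_000.0     # from the text: mass-production price, USD


def unit_cost(first_unit_cost: float, volume_multiple: float) -> float:
    """Unit cost after cumulative volume grows by `volume_multiple`."""
    exponent = math.log2(1 - LEARNING_RATE)   # negative progress exponent
    return first_unit_cost * volume_multiple ** exponent


def doublings_needed(start: float, target: float) -> float:
    """Doublings of cumulative volume needed to reach `target` cost."""
    return math.log(target / start) / math.log(1 - LEARNING_RATE)


n = doublings_needed(START_COST, TARGET_COST)
print(f"{n:.1f} doublings, i.e. about {2 ** n:,.0f}x cumulative volume")
print(f"check: ${unit_cost(START_COST, 2 ** n):,.0f} per unit")
```

Under this assumed learning rate, a tenfold cost reduction takes roughly ten doublings, on the order of a thousandfold increase in cumulative volume. That is why the cost gains accrue to whoever is actually shipping units, not to whoever has the best prototype.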

Image: Boston Dynamics

Boston Dynamics’ humanoids and quadrupeds, for all their jaw-dropping capabilities, remain extremely expensive because they’re made in very low quantities with extensive hand assembly. Scale those same designs (or improved ones) to tens of thousands of units, and costs plummet, markets unlock, and new applications become viable.

Scaling doesn’t just mean cranking out widgets in the factory. You also learn to develop the necessary support ecosystem: maintenance networks, training programs, spare parts availability, continuous improvement loops, and so on. A deployed robot will inevitably sometimes fail, and each failure teaches lessons. (Consumer robo-vacuums are instructive here: newer models are dramatically more capable, robust, and performant, informed by a decade-plus of experience and well over 100M units shipped.)

Specialists vs. Generalists

At NASA, I (Dan) championed the Faster, Better, Cheaper (FBC) management philosophy, pushing my teams to build many smaller spacecraft rather than a few ultra-expensive, exquisite ones. Not every mission succeeded, but overall, we learned faster and achieved more per dollar.

American robotics could use a dose of this FBC thinking. Instead of pouring everything into a single $1M humanoid that aims for full human parity…and never leaves the lab…we might field fleets of $50,000 robots, each tackling narrower tasks, earning revenue, and learning from the field.

Plenty of U.S. companies are smartly pursuing this approach: deploying “simpler” robots (such as sidewalk delivery bots, shelf-scanning retail robots, or warehouse tote carts) in significant numbers. These specialty bots are limited in scope, but generate focused data and prove out pieces of the puzzle. This incremental scaling strategy might lack the sci-fi splash of an omnipotent humanoid unveiling, but it builds real industries and markets.

The Platform Moment

When only a few dozen robots exist in R&D labs, everyone builds bespoke systems. When you’re deploying 10,000+ robots, you need standards: common operating systems, modular components, plug-and-play integration. The PC industry exploded once IBM architecture, Windows OS, and USB peripherals emerged.

Robotics hasn’t had its PC moment yet, but it could. And with a healthy open-source culture and vibrant developer communities, America can and should lead this. But we can only do so if we commit to scaling the industry rather than keeping everything proprietary.

We are big believers in platforms and network effects: from the ISS’s standardized module interfaces to smartphones spawning the app economy. A similar platform play in robotics could unlock innovation and de-duplicate efforts.

Wrap

The Price of Leadership

We don’t need prettier reels of dancing humanoids or carefully choreographed demos. We need fleets that fail, log, learn, and improve.

The above barriers (dexterity, data, sim, supply chain) are known, defined, and theoretically breakable.

China understands something that we’ve spent decades forgetting and surrendering. Their embodied AI advantage, then, is policy + investment + national appetite to use robots wherever possible. The country is proving what dense supply chains and brute-force deployment can achieve: sub-$6K humanoids, armies of industrial robots, and a flywheel of experience that compounds by the day.

America has everything needed to lead: the deepest capital markets, unmatched talent, and a world-leading founder/engineer ecosystem. But we need to escape the demo trap, which requires a return to our manufacturing roots and a willingness to play the long game that hardware demands.

A six-step modest proposal to break free of our current robotics deadlock:

001 // CREATE A DATA COMMONS. Build a National Robotics Data Commons where companies pool basic physics interactions (how materials deform, surfaces interact, forces propagate) while keeping algorithms proprietary. Every American robotics company is spending millions to rediscover the same things. The Human Genome Project cost $3B and generated $1T in economic value by making genomic data public. Apply the same model: fund it through public-private partnerships and run it like the agricultural cooperatives that achieved 10x yield improvements through shared knowledge.

002 // DESIGN FOR IMPERFECT DEXTERITY. Instead of losing another decade chasing perfect human hands, redesign tasks around robots’ capabilities. Amazon already robot-optimizes packages and warehouses. Expand this thinking to standardized industrial connectors, robot-readable components, and environments that meet specialized machines halfway. The PC revolution didn’t happen when computers learned to read handwriting; it happened when we learned to type.

003 // MUSTER THE CLUSTER. Pick Detroit, or Pittsburgh, or another place. Pour everything there until critical mass forms. American robotics needs its Shenzhen, where suppliers, assembly plants, and testing facilities cluster within miles of each other. When your actuator fails, the vendor is 20 minutes (and not 20 days) away. When you need custom machining, the shop is down the street. This proximity turns months-long iteration cycles into days. Silicon Valley worked because talent and knowledge concentrated in one place. And hardware needs this even more than software, because tribal knowledge doesn’t transfer over Zoom. We need to commit serious capital to one region, not political sprinkles everywhere. The winning city needs a strong robotics university, existing industrial base, and cheap infrastructure for startups.

004 // DEPLOY AT 80% READINESS. Our obsession with perfection is paralyzing. We polish prototypes to 95% while Chinese robots that barely work learn from millions of operational hours. Instead, we should deploy at 80% ready and let reality teach what simulation can’t. The last 20% of performance may take 80% of the effort. Ship the adequate, and iterate to excellence.

005 // CREATE DEMAND TO CREATE SUPPLY. Anchor customers (in military, industry, or logistics) could commit to buying 100,000 robots over ten years for actual deployment. There’s precedent: guaranteed aircraft purchases created aerospace, space contracts built commercial launch, and government networking needs spawned the internet. When institutions guarantee demand, industry builds supply chains. When suppliers know they’ll sell thousands, they invest in capacity. When capacity exists, costs plummet.

006 // SECURE THE CRITICAL STUFF FIRST. We need not waste our time and precious attention on the pipe dream of full supply chain independence. Rather, we should focus on the most critical bottlenecks: high-torque rare-earth motors, precision reducers, advanced tactile sensors, edge AI processors, power management systems. Accept paying 3x more initially. This is the tax you must pay up front to control your robotic destiny. Everything else can remain globally sourced (ideally, friendshored) while domestic capacity scales. Same logic as the Strategic Petroleum Reserve: secure what’s critical, source what’s commodity.

The lesson of history is clear: technological revolutions are won by those who build, not just those who invent. We invented the transistor and largely lost electronics. Created the solar panel and gave that industry away. Pioneered the internet and surrendered hardware manufacturing.

With robotics, we’re a few innings into the same ballgame. We don’t need to accept our current trajectory, but every day we debate, 767 more robots go to work in China. Robots will not build themselves (yet). It’s time to move past acting as the world’s R&D department, and start building the future we’ve designed ourselves.

If America wants to lead in the age of intelligent machines, we must invest not only in algorithms but in actuators; not only in AI models but in the mines, foundries, and fabs that feed them. That is the price of leadership.