Teaching Machines to Fold: How Invisible Workers Are Building the Robot Future
From Lagos to Los Angeles, thousands of gig workers are recording themselves doing chores for $5-$20 an hour—training the AI systems that will eventually replace them
The Simulation Problem: Why Robots Need to See Humans First
For decades, roboticists have faced a frustrating paradox: robots perform beautifully in digital simulations but stumble awkwardly in the real world. This gap between the virtual and the physical—known as the sim-to-real problem—has been robotics’ most stubborn unsolved challenge, holding back the entire field from practical breakthroughs.
The reason is deceptively simple: digital simulations cannot capture the complexity of reality. A computer model can predict how a rigid metal cube will move, but it struggles with the subtle physics of folding fabric, the unpredictable resistance of a human body, or the thousands of micro-corrections a hand makes while grasping an object. These infinitesimal details, invisible to the eye, are what separate a successful robot from one that fails repeatedly at basic tasks.

Then came a surprising discovery. Nvidia’s team found that adding over 20,000 hours of first-person human video footage to their training data improved robot task success rates by approximately 50 percent. The robots weren’t just memorizing human movements—they were learning something fundamental about how the physical world actually works. By watching humans manipulate objects from a first-person perspective, the robots gained intuition about forces, textures, and dynamics that no simulation could provide.
This breakthrough has shifted how the robotics industry thinks about artificial intelligence. Human movement data is now recognized as the critical missing ingredient in physical AI development. Rather than robots learning exclusively from digital worlds, they need to learn from human bodies operating in real environments. It’s a humbling realization: before machines can replace us at physical tasks, they must first watch us and learn from our expertise.
The Global Workforce: Who’s Training Robots and What They’re Earning
Across the globe, ordinary people are performing extraordinary work—and most don’t realize it. Over 100,000 hours of footage has been collected from workers in more than 50 countries, all engaged in the same seemingly mundane task: recording themselves doing household chores. Folding laundry, loading dishwashers, organizing pantries—these everyday activities are becoming the raw material fueling the artificial intelligence revolution.
The setup is surprisingly simple. Workers strap iPhones to their heads and attach wrist sensors to capture natural, unrehearsed movement as they go about their daily routines. This body-as-training-data approach has created an entirely new gig economy segment, one where physical labor itself becomes digital currency. The footage feeds machine learning algorithms, teaching robots how humans actually move through space and interact with objects—knowledge that’s impossible to program directly.

But there’s a troubling reality beneath this innovation: large pay disparities. A worker in the United States might earn $15 to $20 per hour for this training work, while counterparts in Vietnam and India receive $5 to $8 per hour. In Nigeria and other lower-income countries, compensation can sit at the bottom of that range or lower. It’s a stark reminder that the AI economy, despite its global scope, perpetuates the same economic inequalities that have long characterized technology development.
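To put the gap in proportion, the hourly figures quoted above can be turned into a simple ratio. The sketch below is only back-of-the-envelope arithmetic on the article's own ranges; the function name `pay_gap` and the dictionary layout are illustrative choices, and no adjustment is made for cost of living or purchasing power.

```python
# Hourly pay ranges in USD, as quoted in the article (low, high).
PAY_RANGES = {
    "United States": (15.0, 20.0),
    "Vietnam and India": (5.0, 8.0),
}


def pay_gap(higher_paid, lower_paid):
    """Return the (smallest, largest) pay multiple between two (low, high) ranges."""
    # Narrowest gap: bottom of the higher range vs. top of the lower range.
    smallest = higher_paid[0] / lower_paid[1]
    # Widest gap: top of the higher range vs. bottom of the lower range.
    largest = higher_paid[1] / lower_paid[0]
    return smallest, largest


smallest, largest = pay_gap(
    PAY_RANGES["United States"], PAY_RANGES["Vietnam and India"]
)
print(f"US rates are roughly {smallest:.1f}x to {largest:.1f}x the Vietnam/India rates")
```

On these figures, the same recorded hour is worth somewhere between just under two and four times as much depending on where the worker lives.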
Most problematic is what many workers don’t understand: they’re training the very machines designed to replace their jobs. The irony is particularly sharp for gig workers, who already operate in precarious employment conditions. While they film themselves performing tasks, the industry is scaling up: China is planning 60 or more dedicated robot training centers, and firms like Micro1 are rapidly hiring thousands of workers across the Global South.
This represents a crucial inflection point in human-machine relations. The robots learning today will reshape tomorrow’s workforce. Yet the people teaching them—often from economically vulnerable countries—may be the last to benefit from, or even understand, the transformation they’re helping to create.
The Work Itself: Inside the Economy of Recording Your Labor
A typical shift is straightforward. Workers strap a smartphone to their chest or head using an elastic band, open a task list on their screen, and begin recording. For roughly two hours of work, they can earn between $40 and $80 per completed submission. The tasks themselves are mundane: folding laundry, washing dishes, loading dishwashers, sweeping floors, cooking meals, organizing closets. It is the kind of work most people do at home without a second thought.
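The per-submission figure is easier to compare with the hourly rates quoted elsewhere in this article once it is converted to an hourly equivalent. The sketch below does only that conversion. It assumes the two hours covers all the time a worker spends, with no unpaid setup, uploads, or rejected submissions, which the article does not confirm; `implied_hourly_rate` is a hypothetical helper name.

```python
def implied_hourly_rate(payout_usd, hours_worked):
    """Convert a per-submission payout into an hourly equivalent."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return payout_usd / hours_worked


# Figures from the article: $40 to $80 for roughly two hours of work.
HOURS_PER_SUBMISSION = 2.0
low_rate = implied_hourly_rate(40.0, HOURS_PER_SUBMISSION)
high_rate = implied_hourly_rate(80.0, HOURS_PER_SUBMISSION)

print(f"Implied rate: ${low_rate:.0f} to ${high_rate:.0f} per hour")
```

Under that assumption the submission rate works out to $20 to $40 per hour, which sits at or above the top of the $5 to $20 hourly range in the headline. Any unpaid time around a submission would pull the effective rate down.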

But there’s a critical paradox embedded in this economy: workers must perform these ordinary activities as naturally as possible while being filmed. The more conscious someone becomes of the camera, the worse the footage becomes. Robots learn from authentic human movement, which means the most valuable performances are those where workers essentially forget they’re being recorded. It’s a strange psychological tightrope: workers have to stay attentive enough to follow the task list while putting the camera out of mind.
The barrier to entry is almost nonexistent. An iPhone, an elastic strap, and internet access are all that’s required to begin. No special certification, no previous experience, no formal qualifications. This accessibility has made it attractive to workers globally, from Lagos to Los Angeles, who can monetize movements they’d perform anyway.
Yet quality varies dramatically by worker. Performance depends less on technical skill and more on one’s ability to relax in front of a lens—to let muscle memory take over and suppress the self-consciousness that naturally emerges under surveillance. In this economy, your body’s authenticity becomes your product, and your comfort with being watched determines your earning potential.
DoorDash and the Repurposing of Gig Infrastructure
In March 2025, DoorDash launched its Tasks platform with a deceptively simple premise: earn money by recording yourself doing household chores. What makes this unremarkable-sounding feature significant is what happens to that footage afterward. Video recordings from 8 million US couriers performing everyday tasks like folding laundry, washing dishes, and organizing pantries now flow directly into robotics training pipelines serving Tesla, Google, and Figure AI.
This is a fascinating, and troubling, case of infrastructure repurposing. DoorDash built its courier network to take on the delivery work that restaurants once staffed themselves. Now that same network of gig workers is being leveraged to train the machines that could ultimately replace delivery workers. The workers creating this training data are compensated only for their time, not for the downstream value of the footage or the employment it displaces.

Consider the economic asymmetry. A courier records thirty minutes of folding clothes and is paid for that half hour. That same footage, combined with millions of hours from other workers, trains a robot that could eventually eliminate delivery jobs. The workers are paid once for their time, while technology companies capture the compounding value that machine learning creates at scale.
This model reveals how gig infrastructure—built on the premise of human flexibility and low wages—becomes perfectly suited for feeding artificial intelligence systems. Workers aren’t just performing jobs; they’re unknowingly documenting the movements and techniques needed to replace themselves. The platform economy, designed to minimize labor costs, has found an even more efficient application: turning human workers into research assistants for their own obsolescence.
The Extraction Problem: Colonialism Through Data
There is a modern form of extraction happening quietly across the Global South, one that mirrors colonial patterns of old. Workers in Nigeria, Kenya, India, and the Philippines are training artificial intelligence systems that will eventually be sold back to their own markets at premium prices—often displacing the very workers who built them.
Consider an illustrative scenario: a Nigerian worker earns roughly $5 per hour labeling images, annotating videos, and demonstrating physical tasks to train robots. The company paying them sells access to the trained model for thousands of dollars to businesses worldwide. Before long, robots built on the labor of Nigerian workers begin replacing Nigerian jobs in warehouses, delivery services, and manufacturing. The worker who trained the system now competes against it, at a disadvantage.
Tech companies frame this arrangement as democratization and bridging the digital divide. They celebrate bringing economic opportunity to developing nations. But the reality tells a different story. Value flows one direction: upward, toward wealthy corporations and shareholders in the Global North. Workers in the Global South provide the raw material—their labor, their time, their movements—while bearing the displaced job risk.
The asymmetry is stark. Companies profit from footage and data; workers absorb the economic shock of automation. A software engineer in San Francisco might earn $200,000 yearly building AI systems. A data labeler in Lagos earns a fraction of that for providing the essential human intelligence that makes those systems possible.
This arrangement is not inevitable. It is not the natural outcome of technological progress. It is a choice—a deliberate decision about how value gets distributed between multinational corporations and the workers whose labor creates it. Other models are possible: equitable revenue sharing, mandatory transition assistance, or technology transfer that gives Global South nations ownership stakes in the systems built from their workers’ contributions. Until we demand these alternatives, we are simply watching colonialism evolve for the digital age.
What Comes Next: The Future of Physical AI Labor
The physical AI market stands at an inflection point. Industry analysts project exponential growth through 2026 and beyond, with robotics and autonomous systems poised to transform how we approach everything from warehouse logistics to last-mile delivery. Yet this explosive growth trajectory faces a critical constraint: training data.
Currently, physical AI systems require vast amounts of real-world video footage and movement data to learn complex tasks. Unlike language models that can be trained on existing internet text, robots learning to fold laundry or navigate unpredictable environments need humans to demonstrate these skills repeatedly. This demand for training footage has created an unexpected economic opportunity for gig workers—people willing to be recorded performing mundane tasks that will teach machines to eventually replace them.
As the physical AI boom intensifies, demand for these human trainers will likely skyrocket. But this raises uncomfortable questions about sustainability. What happens when labor supply catches up with demand? If thousands of workers compete to provide training data, wages for this work could plummet, even as the robots they help build command premium prices.
These economic pressures have sparked broader policy debates. Who owns the training footage generated by workers’ bodies and movements? Should people be compensated beyond standard gig rates for contributing to AI systems that will displace them? How do we ensure gig workers aren’t simply funding their own obsolescence?
Meanwhile, a parallel track is emerging. Some researchers are exploring synthetic data generation and reduced-data training methods, attempting to minimize human labor requirements altogether. Whether this technological solution will succeed remains uncertain—but the race is on, and the stakes for workers couldn’t be higher.