Say Hello to Stanley

Stanford's souped-up Volkswagen blasted through the Mojave Desert, blew away the competition, and won Darpa's $2 million Grand Challenge. Buckle up, human - the driverless car of the future is gaining on you.

Sebastian Thrun is sitting in the passenger seat of a 2004 Volkswagen Touareg that's trying to kill him.

The car hurtles down a rutted dirt road at 35 miles per hour somewhere in the Mojave Desert, bucking and swerving, kicking up a cloud of dust. Thrun, the youngest person ever to head Stanford's famed artificial intelligence laboratory, clings to an armrest. Mike Montemerlo, a speed-coding computer programmer and postdoc, is wedged in the backseat amid a tangle of wires and cables.

No one is driving. Or more precisely, the Touareg is trying to drive itself. But despite 635 pounds of gear - roof-mounted radar, laser range finders, video cameras, a seven-processor shock-mounted computer - the car is doing a lousy job. Thrun tightens his grip on the armrest. He's built plenty of robots, but he's never entrusted his life to one of his creations. He's scared, confused, and above all furious that his algorithms are failing.

Suddenly the steering wheel spins itself hard to the left and the car speeds toward a ditch. David Stavens, a programmer who is stationed in the driver's seat in case of emergency, grabs the wheel and fights the pull of the robotic autopilot, which is insisting on a plunge into the gulley. Stavens slams his foot down on the computer-controlled brake. Thrun hits the big red button on the console that disables the vehicle's navigation computers. The SUV skids to a halt. "Hey, that was exciting," Thrun says, trying to sound upbeat.

It wasn't supposed to be this way. In 2003, the Defense Advanced Research Projects Agency offered $1 million to anyone who could build a self-driving vehicle capable of navigating 300 miles of desert. Dubbed the Grand Challenge, the robot-vehicle race was hyped for months. It was going to be as important as the 1997 Kasparov-Deep Blue chess match. But on race day in March 2004, the cars performed like frightened animals. One veered off the road to avoid a shadow. The largest vehicle - a 15-ton truck - mistook small bushes for enormous boulders and slowly backed away. The favorite was a CMU team that, fueled by multimillion-dollar military grants, had been working on unmanned vehicles for two decades. Its car went 7.4 miles, hit a berm, and caught fire. Not a single car finished.

Back at Stanford, Thrun logged on to check the progress of the race and couldn't believe what he was seeing. It was a humiliation for the entire field of robotics - a field Thrun was now at the center of. Only a year before, he'd been named head of Stanford's AI program. In the quiet halls of the university's Gates Computer Science Building, the suntanned 36-year-old German was a whirlwind of excitement, ideas, and brightly colored shirts. He was determined to show what intelligent machines could contribute to society. And though he had never considered building a self-driving car before, the sorry results of the first Grand Challenge inspired him to give it a try.

He assembled a first-rate team of researchers, attracted the attention of Volkswagen's Palo Alto R&D team, and charged ahead. But here in the desert, he's facing the reality that the Touareg - dubbed Stanley, a nod to Stanford - is totally inadequate. With only three months to go before the second Grand Challenge, he realizes that some basic problems remain unsolved.

Thrun gets out to kick the dirt on the side of the road and think. While the car idles, he squints at the uneven terrain ahead. This was his chance to lead the way toward his vision of the new vehicular order. But for now, all he sees is mountains, sagebrush, and sky.

It started with a black-and-white videogame in 1979. Thrun, then 12, was spending most of his free time at a local pub in Hannover, Germany. The place had one of the first coin-operated videogames in town, and 20 pfennig bought him three lives driving at high speed through a stark landscape of oil slicks and oncoming cars. It was thrilling - and much too expensive. For weeks, Thrun scrutinized the graphics and then decided that he could re-create the game on his Northstar Horizon, a primitive home computer that his father, a chemical engineer, had bought for him. He shut himself in his room and devoted his young life to coding the Northstar. It ran at 4 MHz and had only 16 Kbytes of RAM, but somehow he coaxed a driving game out of the machine.

Though he didn't study or do much homework over the next seven years, Thrun ended up graduating near the top of his high school class. He wasn't sure what was next. He figured he'd think about it during his mandatory two-year stint in the German army. But on June 15, 1986 - the last day to apply for university admissions - military authorities told him he wouldn't be needed that year. Two hours later, he arrived at the centralized admission headquarters in Dortmund with only 20 minutes to file his application. The woman behind the counter asked him what he wanted to study - in Germany, students declare a major before arriving on campus. He looked down the list of options: law, medicine, engineering, and computer science. Though he didn't know much about computer science, he had fond memories of programming his Northstar. "Why not?" he thought, and decided his future by checking the box next to computer science.

Within five years, he was a rising star in the field. After posting perfect scores on his final undergraduate exams, he went on to graduate school at the University of Bonn, where he wrote a paper showing for the first time how a robotic cart, in motion, could balance a pole. It revealed an instinct for creating robots that taught themselves. He went on to code a bot that mapped obstacles in a nursing home and then alerted its elderly user to dangers. He programmed robots that slithered into abandoned mines and came back hours later with detailed maps of the interior. Roboticists in the US began to take note. Carnegie Mellon offered the 31-year-old a faculty position and then gave him an endowed chair. But he still hadn't found a research area to focus all his energy and skills on.

While Thrun was settling in at CMU, the hot topic in robotics was self-driving cars. The field was led by Ernst Dickmanns, a professor of aerospace technology at the University of the Bundeswehr. He liked to point out that planes had been flying themselves since the 1970s. The public was clearly willing to accept being flown by autopilot, but nobody had tried the same on the ground. Dickmanns decided to do something about that.

With help from the German military and Daimler-Benz, he spent seven years retrofitting a boxy Mercedes van, equipping it with video cameras and a bunch of early Intel processors. On a Daimler-Benz test track in December 1986, the driverless van accelerated to 20 miles per hour and, using data supplied by the videocams, successfully stayed on a curving road. Though generally forgotten, this was the Kitty Hawk moment of autonomous driving.

It sparked a 10-year international dash to develop self-driving cars that could navigate city streets and freeways. In the US, engineers at Carnegie Mellon led the charge with funding from the Army. On both sides of the Atlantic, the approach relied on data-intensive classification - a so-called rule-based system. The researchers assembled a list of easily identifiable objects (solid white lines, dotted white lines, trees, boulders) and told the car what to do when it encountered them. Before long, though, two main problems emerged. First, processing power was anemic, so the vehicle's computer quickly became overwhelmed when confronted with too much data (a boulder beside a tree, for instance). The car would slow to a crawl while trying to apply all the rules. Second, the team couldn't code for every combination of conditions. The real world of streets, intersections, alleys, and highways was too complex.

In 1991, a CMU computer science PhD student named Dean Pomerleau had a critical insight. The best way to teach cars to drive, he suspected, was to have them learn from the experts: humans. He got behind the wheel of CMU's sensor-covered, self-driving Humvee, flipped on all the computers, and ran a program that tracked his reactions as he sped down a freeway in Pittsburgh. In minutes, the computers had developed algorithms that codified Pomerleau's driving decisions. He then let the Humvee take over. It calmly maneuvered itself on Pittsburgh's interstates at 55 miles per hour.
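
Pomerleau's actual system (the work that became ALVINN) used a small neural network; the sketch below captures only the general learn-from-the-expert idea with a plain linear fit in Python. The feature vectors and variable names are illustrative assumptions, not CMU's code or data.

```python
import numpy as np

# Hypothetical training log: each row is a coarse sensor/image feature vector
# recorded while a human drove; steering_angles holds the human's wheel angle
# at the same instant. (Stand-in data, not CMU's.)
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 30))            # 500 frames, 30 features each
hidden_mapping = rng.normal(size=30)
steering_angles = features @ hidden_mapping + rng.normal(scale=0.1, size=500)

# "Codify the expert": fit a model that maps what the sensors saw to what the
# human did. A least-squares linear model stands in for ALVINN's network.
weights, *_ = np.linalg.lstsq(features, steering_angles, rcond=None)

def steer(frame_features):
    """Predict a steering command for a new sensor frame."""
    return float(frame_features @ weights)

print(steer(features[0]), steering_angles[0])
```

The catch is that such a model latches onto whatever features happened to correlate with staying on the road during training, whether or not they actually matter - which is exactly what Pomerleau was about to discover.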

Everything worked perfectly until Pomerleau got to a bridge. The Humvee swerved dangerously, and he was forced to grab the wheel. It took him weeks of analyzing the data to figure out what had gone wrong: When he was "teaching" the car to drive, he had been on roads with grass alongside them. The computer had determined that this was among the most important factors in staying on the road: Keep the grass at a certain distance and all will be well. When the grass suddenly disappeared, the computer panicked.

It was a fundamental problem. In the mid-'90s, microchips weren't fast enough to process all the potential options, especially not at 55 miles per hour. In 1996, Dickmanns proclaimed that real-world autonomous driving could "only be realized with the increase in computer performance … With Moore's law still valid, this means a time period of more than one decade." He was right, and everyone knew it. Research funding dried up, programs shut down, and autonomous driving receded back to the future.

Eight years later, when Darpa held its first Grand Challenge, processors had in fact become 25 times faster, outpacing Moore's law. Highly accurate GPS instruments had also become widely available. Laser sensors were more reliable and less expensive. Most of the conditions Dickmanns had said were necessary had been met or exceeded. More than 100 contestants signed up, including a resurgent CMU squad. Darpa officials couldn't hide their excitement. The breakthrough moment in autonomous driving was, they thought, at hand. In truth, some of the field's biggest challenges had yet to be overcome.

Once Thrun decided to take a crack at the second Grand Challenge, he found himself consumed by the project. It was as though he were 12 again, shut up in his room, coding driving games. But this time a Northstar home computer wasn't going to cut it. He needed serious hardware and a sturdy vehicle.

That's when he got a call from Cedric Dupont, a scientist at Volkswagen's Electronics Research Laboratory, just a few miles from the Stanford campus. The Volkswagen researchers wanted in on the Grand Challenge. They'd heard that Thrun was planning to enter the event, and they offered him three Touaregs - one to race, another as a backup, and a third for spare parts. The VW lab would outfit them with steering, acceleration, and braking control systems custom-built to link to Thrun's computers. Thrun had his vehicle, and Volkswagen executives had a chance to be part of automotive history.

It was history, however, that Red Whittaker planned on writing himself. Whittaker, the imposing, bald, bombastic chief of CMU's eponymous Red Team, had been working on self-driving vehicles since the '80s. Whittaker's approach to problem solving was to use as much technological and automotive firepower as possible. Until now, the firepower hadn't been enough. This time, he would make sure that it was.

First, he entered two vehicles in the race: a 1986 Humvee and a 1999 Hummer. Both were chosen for their ruggedness. Whittaker also stabilized the sensors on the trucks with gyroscopes to ensure more reliable data. Then he sent three men in a laser-studded, ground-scanning truck into the desert for 28 days. Their mission: create a digital map of the race area's topography. The team logged 2,000 miles and built a detailed model of the desolate sagebrush expanses of the Mojave.

That was only the beginning. The Red Team purchased high-resolution satellite imagery of the desert and, when Darpa revealed the course on race day, Whittaker had 12 analysts in a tent beside the start line scrutinize the terrain. The analysts identified boulders, fence posts, and ditches so that the two vehicles would not have to wonder whether a fence was a fence. Humans would have already coded it into the map.

The CMU team also used Pomerleau's approach. They drove their Humvees through as many different types of desert terrain as they could find in an attempt to teach the vehicles how to handle varied environments. Both SUVs boasted seven Intel M processors and 40 Gbytes of flash memory - enough to store a world road atlas. CMU had a budget of $3 million. Given enough time, manpower, and access to the course, the CMU team could prepare their vehicles for any environment and drive safely through it.

It didn't cut it. Despite that 28-day, 2,000-mile sojourn in the desert, CMU's premapping operation overlapped with only 2 percent of the actual race course. The vehicles had to rely on their desert training sessions. But even those didn't fully deliver. A robot might, for example, learn what a tumbleweed looks like at 10 am, but with the movement of the sun and changing shadows, it might mistake that same tumbleweed for a boulder later in the day.

Thrun faced these same problems. Small bumps would rattle the Touareg's sensors, causing the onboard computer to swerve away from an imagined boulder. It couldn't distinguish between sensor error, new terrain, its own shadow, and the actual state of the road. The robot just wasn't smart enough.

And then, as Thrun sat on the side of that rutted dirt road, an idea came to him. Maybe the problem was a lot simpler than everyone had been making it out to be. To date, cars had not critically assessed the data their sensors gathered. Researchers had instead devoted themselves to improving the quality of that data, either by stabilizing cameras, lasers, and radar with gyroscopes or by improving the software that interpreted the sensor data. Thrun realized that if cars were going to get smarter, they needed to appreciate how incomplete and ambiguous perception can be. They needed the algorithmic equivalent of self-awareness.

Together with Montemerlo, his lead programmer, Thrun set about recoding Stanley's brain. They asked the computer to assess each pixel of data generated by the sensors and then assign it an accuracy value based on how a human drove the car through the desert. Rather than logging the identifying characteristics of the terrain, the computer was told to observe how its interpretation of the road either conformed to or varied from the way a human drove. The robot began to discard information it had previously accepted - it realized, for instance, that the bouncing of its sensors was just turbulence and did not indicate the sudden appearance of a boulder. It started to ignore shadows and accelerated along roads it had once perceived as being crisscrossed with ditches. Stanley began to drive like a human.
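
A minimal sketch of that reassessment, assuming a log in which each lidar reading is paired with whether the human-driven path actually crossed the spot it flagged; the threshold-fitting and every variable name here are illustrative stand-ins, not the Stanford team's published probabilistic model.

```python
import numpy as np

# Hypothetical log: for each lidar return, the apparent height step it saw,
# and whether the human driver subsequently drove straight over that spot
# (meaning it was really drivable despite what the sensor suggested).
rng = np.random.default_rng(1)
height_steps = np.concatenate([
    np.abs(rng.normal(0.05, 0.04, 2000)),   # mostly sensor bounce / rough road
    np.abs(rng.normal(0.40, 0.10, 60)),      # a few genuine obstacles
])
human_drove_over = np.concatenate([
    np.ones(2000, dtype=bool),               # human treated these as road
    np.zeros(60, dtype=bool),                # human steered around these
])

# Learn how big a height jump really signals an obstacle, instead of trusting
# a hand-tuned constant: pick the threshold that best matches the human.
candidates = np.linspace(0.0, 1.0, 200)
error = [np.mean((height_steps > t) == human_drove_over) for t in candidates]
obstacle_threshold = candidates[int(np.argmin(error))]

def is_obstacle(height_step):
    # Readings below the learned threshold are written off as noise/turbulence.
    return height_step > obstacle_threshold

print(f"learned obstacle threshold: {obstacle_threshold:.2f} m")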

Thrun decided to take the car's newfound understanding of the world a step further. Stanley was equipped with two main types of sensors: laser range finders and video cameras. The lasers were good at sensing ground within 30 meters of the car, but beyond that the data quality deteriorated. The video camera was good at looking farther away but was less accurate in the foreground. Maybe, Thrun thought, the laser's findings could inform how the computer interpreted the faraway video. If the laser identified drivable road, it could ask the video to search for similar patterns ahead. In other words, the computer could teach itself.

It worked. Stanley's vision extended far down the road now, allowing it to steer confidently at speeds of up to 45 miles per hour on dirt roads in the desert. And because of its ability to question its own data, the accuracy of Stanley's perception improved by four orders of magnitude. Before the recoding, Stanley incorrectly identified objects 12 percent of the time. After the recoding, the error rate dropped to 1 in 50,000.

It's half past 6 in the morning on October 8, 2005, outside of Primm, Nevada. Twenty-three vehicles are here for the second Grand Challenge. Festooned with corporate logos, lasers, radars, GPS transponders, and video cameras, they're parked on the edge of the gray-brown desert and ready to roll. The early morning light clashes with the garish glow of the nearby Buffalo Bill's Resort and Casino.

Red Whittaker is beaming. His 12 terrain analysts have completed their two-hour premapping of the route, and the data has been uploaded to the two CMU vehicles via a USB flash drive. The stakes are high this year: Darpa has doubled the prize money to $2 million, and Whittaker is ready to win it and erase the memory of the 2004 debacle. Last night, he pointed out to the press that Thrun had been a junior faculty member in Whittaker's robotics lab at CMU. "My DNA is all over this race," he boasted. Thrun won't be baited by Whittaker's grandstanding. He focuses on trying to calm his own frayed nerves.

The race begins quietly: One by one, the vehicles drive off into the hills. A few hours later, the critical moment is captured in grainy footage. CMU's H1 is in the middle of a dusty white desert expanse. The camera slowly approaches - the image is pixelated and overexposed. It's the view from Stanley's rooftop camera. For the past 100 miles, the Touareg has been tailgating the H1, and now it pulls close. Its lasers scan the exterior of its competitor, revealing a ghostly green outline of side panels and a giant, sensor-stabilizing gyroscope. And then the VW rotates its steering wheel and passes.

Darpa has imposed speed limits of 5 to 25 miles per hour, depending on conditions. Stanley wants to go faster. Its lasers are constantly teaching its video cameras how to identify drivable terrain, and it knows that it could accelerate more. For the rest of the race, Stanley pushes up against the speed limits as it navigates through open desert and curving mountain roads. After six hours of driving, it exits the final mountain pass ahead of every other team. When Stanley crosses the finish line, Thrun catches his first sight of an undiscovered country, a place where robots do all the driving.

The 132-mile race is a success. Four other vehicles, including both of CMU's entries, complete the course behind Stanley. The message is clear: Autonomous vehicles have arrived, and Stanley is their prophet. "This is a watershed moment - much more so than Deep Blue versus Kasparov," says Justin Rattner, Intel's R&D director. "Deep Blue was just processing power. It didn't think. Stanley thinks. We've moved away from rule-based thinking in artificial intelligence. The new paradigm is based on probabilities. It's based on statistical analysis of patterns. It is a better reflection of how our minds work."

The breakthrough comes just as carmakers are embracing a host of self-driving technologies, many of them barely recognizable as robotic. Take, for example, a new feature known as adaptive cruise control, which allows the driver to select the distance to be maintained between the vehicle and the car in front of it. On the Toyota Sienna minivan, this is simply another button on the steering wheel. What that button represents, however, is a laser that surveys the distance to the vehicle ahead of it. The minivan's computer interprets the data and then controls the acceleration and braking to keep the distance constant. The computer has, in essence, taken over part of the driving.
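
In control terms, that button closes a simple feedback loop: measure the gap with the laser, compare it to the gap the driver selected, and nudge the speed to shrink the error. The proportional gain and numbers below are made-up illustrations of the idea, not any manufacturer's tuning.

```python
def adaptive_cruise_step(gap_m, desired_gap_m, speed_mps, dt=0.1, k_gap=0.2):
    """One control update: return the new speed command in meters per second.

    gap_m         - distance to the lead car reported by the laser
    desired_gap_m - following distance the driver selected on the wheel button
    speed_mps     - the car's current speed
    Too close -> ease off; too far -> speed up. (Illustrative proportional
    control only; real systems also bound acceleration and jerk.)
    """
    gap_error = gap_m - desired_gap_m
    return max(0.0, speed_mps + k_gap * gap_error * dt)

# Example: following at 25 m when the driver asked for 40 m -> slow slightly.
print(adaptive_cruise_step(gap_m=25.0, desired_gap_m=40.0, speed_mps=27.0))
```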

But even as vehicles are being produced with sensors that perceive the world, they have, until now, lacked the intelligence to comprehensively interpret what they see. Thanks to Thrun, that problem is being solved. Computers are nearly ready to take the wheel. But are humans ready to let them?

Jay Gowdy doesn't think so. A highly regarded roboticist, he has worked for nearly two decades to build self-driving cars, first with CMU and, more recently, with SAIC, a Fortune 500 defense contractor. He notes that in the US, about 43,000 people die in traffic accidents every year. Robot-driven cars would radically reduce the number of fatalities, he says, but there would still be accidents, and those deaths would be attributable to computer error. "The perception is that in the majority of accidents today, those who die are drunk, lazy, or stupid and bring it on themselves," Gowdy says. "If computers take over the driving, any deaths are likely to be perceived as the loss of people who did nothing wrong."

The resulting liability issues are a major hurdle. If a robotically driven car gets in an accident, who is to blame? If a software bug causes a car to swerve off the road, should the programmer be sued, or the manufacturer? Or is the accident victim at fault for accepting the driving decisions of the onboard computer? Would Ford or GM be to blame for selling a "faulty" product, even if, in the larger view, that product reduced traffic deaths by tens of thousands?

This morass of liability questions would need to be addressed before robot cars could be practical. And even then, Americans would have to be willing to give up control of the steering wheel.

Which is not something they're likely to do, even if it means saving 40,000 lives a year. So the challenge for carmakers will be to develop interfaces that make people feel like they're in control even when the car is really doing most of the thinking. In other words, that small adaptive cruise control button in Toyota's minivan is a Trojan horse.

"OK, we're two of two, two of two, and one of one, no U-turn, speed advisory 25, large divider, POI gas station on left."

Michael Loconte and Bill Wong are creeping through a quiet suburb just north of San Jose, California. They are driving a white Ford Taurus with a 6-inch antenna on the roof. Loconte wears a headset and mumbles coded descriptions of the surroundings into the microphone - "two of two" means that he's in the right lane on a street with two lanes, and "POI" means point of interest. Wong scribbles with a digital pen, making landmark and street address notations on a scrolling map. "People think we're with the CIA," Loconte says. "I know it kind of looks like that."

But they aren't spies. They're field analysts working for the GPS mapping company Navteq, and they're laying the foundation for the future of driving. On this Friday afternoon, they're doing a huge commercial extension of CMU's ditch-and-fence mapping operation. Navteq has 500 such analysts driving US neighborhoods, mapping them foot by foot. Though Thrun has proven that extensive mapping isn't needed to get from A to B, maps are critical when it comes to communicating with robotic vehicles. As automotive engineers build cars with increasing autonomy, the human interface with the vehicle will migrate from the steering wheel to the map. Instead of turning a wheel, drivers will make decisions by touching destinations on an interactive display.

"We want to move up the food chain," says Bob Denaro, Navteq's VP of business development. The company sees itself moving beyond the help-me-I'm-lost gizmo business and into the center of the new driving experience. That's not to say that the steering wheel will disappear; it will just be gradually de-emphasized. We will continue to sit in the driver's seat and have the option of intervening if we choose. As Denaro notes: "A person's role in the car is changing. People will become more planners than drivers."

And why not - since the car is going to be a better driver than a human anyway. With the addition of map information, a car will know the angle of a turn that's still 300 feet away. Navteq is in the process of collecting slope information, road width, and speed limits - all things that bathe the vehicle in more data than a human could ever handle.

Denaro believes that the key to making people comfortable with the shift from driver to planner will be the same thing that made pilots comfortable accepting autopilot in the cockpit: situational awareness. If a robot simply says it wants to go left instead of right, we feel uncomfortable. But if a map shows a traffic jam to the right and the machine lists its reasons for rerouting, we have no problem pressing the Accept Route Change icon. We feel like we are still in control.

"Autopilot in the cockpit greatly extended the pilots' skills," Denaro says. Automation in driving will do the same thing.

Sebastian Thrun is standing in front of about a hundred of his colleagues and teammates at a winery overlooking Silicon Valley. He has a glass of champagne in one hand and a microphone in the other, and everyone is in a festive mood. Darpa just gave Stanford a $2 million check for winning the desert race, and Thrun is going to use a portion of the money to endow the Stanley fellowship for graduate students in computer science.

"Some people refer to us as the Wright brothers," he says, holding up his champagne. "But I prefer to think of us as Charles Lindbergh, because he was better-looking."

Everyone laughs and toasts to that. "A year ago, people said this couldn't be done," Thrun continues. "Now everything is possible." There is more applause, and then the AI experts, programmers, and engineers take small, conservative sips of the champagne. The drive home is curvy and dark. If only the party were happening in Thrun's future - then the champagne could flow unimpeded and the cars would take everyone safely home.

How Stanley Sees the Road

The SUV's hard drives boot up, its sensors come to life, and it's ready to roll. Here's how Stanley works. - J.D.

1. GPS antenna
The rooftop GPS antenna receives data that has actually traveled twice into space - once to receive an initial position that is accurate to within a meter, and a second time to make corrections. The final reading is accurate to within 1 centimeter.

2. Laser Range Finder
So-called lidar scans the terrain 30 meters ahead and to either side of the grille five times a second. The data is used to build a map of the road.

3. Video camera
The video camera scans the road beyond the lidar's range and pipes the data back to the computer. If the lasers have identified drivable ground, software looks for the same characteristics in the video data, extending Stanley's vision to 80 meters and permitting safe acceleration.

4. Odometry
To contend with signals blocked by, say, a tunnel or mountain, a photo sensor in the wheel well monitors a pattern imprinted on Stanley's wheels. The data is used to determine how far Stanley has moved since the blackout. The onboard computer can then track the vehicle's position based on its last known GPS location.
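
A bare-bones sketch of that dead-reckoning fallback, assuming a wheel encoder that reports counts from the imprinted pattern and a heading estimate from the inertial sensors; the tick count, wheel size, and function names are illustrative guesses, not Stanley's actual specifications.

```python
import math

WHEEL_CIRCUMFERENCE_M = 2.1   # illustrative value, not Stanley's real wheel size
TICKS_PER_REVOLUTION = 64     # assumed marks in the imprinted wheel pattern

def dead_reckon(last_gps_fix, encoder_ticks, heading_rad):
    """Propagate the last known GPS position using wheel odometry.

    last_gps_fix  - (x, y) in meters from the last good GPS reading
    encoder_ticks - wheel-pattern counts seen since the GPS signal dropped
    heading_rad   - current heading from the inertial measurement unit
    """
    distance = encoder_ticks / TICKS_PER_REVOLUTION * WHEEL_CIRCUMFERENCE_M
    x, y = last_gps_fix
    return (x + distance * math.cos(heading_rad),
            y + distance * math.sin(heading_rad))

# Example: 128 ticks while heading due east after losing GPS in a tunnel.
print(dead_reckon((1000.0, 2000.0), encoder_ticks=128, heading_rad=0.0))
```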

Taking the Wheel

Seven ways today's cars are already robots. - Brian Lam

1. Road Condition Reporting
When a car using BMW's hazard system slips on ice, its sensors activate traction control. Meantime, wireless technology alerts other cars in the area to the hazard.

2. Adaptive Cruise Control
Luxury cars made by Audi, BMW, Infiniti, and others now use radar-guided cruise control to keep pace with the car ahead.

3. Omnidirectional Collision System
GM has built an inexpensive collision detection system that allows GPS-equipped cars to identify each other and communicate wirelessly.

4. Lane-Departure Prevention
Nissan has a prototype that uses cameras and software to detect white lines and reflective markers. If the system determines the vehicle is drifting, it will steer the car back into the proper lane.

5. Auto Parallel Park
Toyota has a technology that uses a camera to identify a curbside parking space and turns the wheel automatically to reverse you into the spot.

6. Blind-Spot Sensors
GM's GPS-based collision detectors can warn you when another car enters your blind spot.

7. Corner Speed
An experimental Honda navigation computer anticipates upcoming turns and, if necessary, slows the vehicle to match predetermined safe speeds.

Contributing editor Joshua Davis (jd@joshuadavis.net) is the author of The Underdog. He wrote about DVD bootlegging in issue 13.10.
credit Ian White
Stanley: The Stanford Racing Team's autonomous vehicle is a modified Volkswagen Touareg that can scan any terrain and pick out a drivable course to a preset destination. Cup holders optional.

credit Joe Pugliese
Team Stanley: From left, Sven Strohband, Sebastian Thrun, David Stavens, Hendrik Dahlkamp, Mike Montemerlo.

credit Jesse Jensen


credit Jameson Simpson
