Mars Curiosity self portrait

The tech behind NASA's Martian chronicles

UPDATE: This article has been updated to correct a reference to Amazon's regions and Availability Zones.

Mars. The red planet. Similar to Earth in many ways, it may be the only planet within the solar system capable of sustaining life. Sitting 154 million miles away from Earth (on the day Curiosity landed), countless books, movies and TV shows have starred the planet, and little green Martians have become a wacky staple of American culture. Suffice it to say, humans have often dreamt of their celestial neighbor and longed to explore its surface. But it doesn’t give up its secrets easily.

On Aug. 5, 2012, NASA dared to pierce its mysterious veil. The space agency had sent probes to Mars before, but never with an instrument as advanced as the minivan-sized Curiosity rover, and never using such a complicated delivery system. A unique sky crane transported the precious cargo down onto the red, rocky ground.

The world was watching, literally, and the manager of data services for tactical operations at NASA’s Jet Propulsion Laboratory, Khawaja Shams, and his team stood ready to broadcast one of the agency’s greatest successes — or perhaps most dismal failures — to millions of computers, tablets and smart phones around the globe.


To people quickly becoming accustomed to real-time streaming on the Web, seeing the images sent back from Curiosity might have seemed routine. But that belies the behind-the-scenes complexity of receiving signals over such a great distance via a low-bandwidth space network and making them instantly available to scientists and the public around the world — and dealing with heavy and fluctuating demand on bandwidth. Along with showing the world what Curiosity was finding, NASA was demonstrating the power of highly scalable, on-demand cloud computing.

For the operations team, the big events of Aug. 5 started with "seven minutes of terror"   when Curiosity entered the Martian atmosphere. There was nothing NASA could do but hope that everything went according to the well-researched plan. Because it takes an average of 14 minutes to send a command or receive data back from Mars, if anything went wrong with the landing, NASA wouldn’t know until it was far too late.

But Shams and his team had been already working all day. NASA was broadcasting videos about the landings on NASA TV, featuring educators, scientists and celebrities talking about Mars and the pending landing. These grew more popular as the actual landing time approached and as people around the world visited the Curiosity and Mars Exploration Web pages to see how things were going.

Shams admits he was a little nervous. His heart was pounding despite feeling confident in his equipment. NASA had enlisted a whole suite of programs and servers from Amazon Web Services to ensure that everyone around the world could tune in without any danger of crashing the network. Capacity could be added or removed on the fly and, as it turned out, the system was even able to fix an emergency bandwidth overflow happening at a website that wasn’t even part of the original plan.

But before all that happened, Shams checked his control panel and saw that all of the AWS programs NASA was using were running in the green. Amazon CloudWatch was a go. CloudFormation was green. Simple Workflow Service was ready to go when called upon. Simple Storage Service (S3) and Elastic Compute Cloud (EC2) were already being used. Even Route 53 was ready to provide Domain Name System management.

“The Amazon Cloud solution can scale to handle over a terabyte per second but only requires us to provision and pay for exactly how much capacity we need,” Shams said. In addition to being a pay-as-you-need system, it was highly redundant, able to suffer the unlikely loss of over a dozen data centers without ever disrupting the flow of video, data and information to its worldwide audience.

The closer Curiosity got to Mars and its much-publicized rendezvous with the planet, the more people tuned in to monitor its status. As it entered the Marian atmosphere, Shams noticed that bandwidth usage had grown to over 40 gigabits/sec, according to the Amazon CloudWatch program. CloudWatch monitors usage within the Amazon Cloud and can send out warnings and messages when certain thresholds are met. Shams said he was impressed with CloudWatch because it can send out e-mails and text messages, making sure that everyone is alerted to any signs of trouble. It instantly springs to work in troublesome situations too, such as when any of the servers that make up the cloud begin to have I/O errors.

However, CloudWatch never actually sent out any warnings during the Mars landing, because the CloudFormation stack was designed to seamlessly add more bandwidth and servers when needed. In fact, Shams was able to click to add a new 25 gigabit/sec stack to the JPL cloud, and register those new machines to the project on the fly with the Route 53 program.

Back on Mars, Curiosity was hanging just a few meters above the surface of the planet, suspended in air by its rocket-powered sky crane. It would soon be dropped a few feet to the soil below, whereupon the crane would fly off and land a safe distance away, a move designed to prevent dust from kicking up around, and possibility damaging, the rover. Bandwidth usage worldwide was topping 70 gigabits/sec, when an unexpected crisis happened.

“The main JPL website, still running under traditional infrastructure, was crumbling under the load of millions of excited users,” Shams said. “We built the Mars site for the mission, but people were coming in through the main JPL site.”

Acting quickly, Shams directed the IT staff to change the DNS entries for the JPL main site to dump all of its traffic into the cloud. This process had to be done manually because it wasn’t part of the system, so Route 53 couldn’t be used. Even so, staff members at JPL were able to make the changes within five minutes. This could have caused a perfect storm of bandwidth problems for the cloud, with millions of users getting directed into the Amazon system all at once, joining millions of others who were already there and those who were tuning in late to see if the ship made it to Mars.


Featured

  • business meeting (Monkey Business Images/Shutterstock.com)

    Civic tech volunteers help states with legacy systems

    As COVID-19 exposed vulnerabilities in state and local government IT systems, the newly formed U.S. Digital Response stepped in to help. Its successes offer insight into existing barriers and the future of the civic tech movement.

  • data analytics (Shutterstock.com)

    More visible data helps drive DOD decision-making

    CDOs in the Defense Department are opening up their data to take advantage of artificial intelligence and machine learning tools that help surface insights and improve decision-making.

Stay Connected