Microsoft’s undersea datacenter helps the hunt for a COVID-19 vaccine

An experimental Microsoft datacenter submerged beneath the sea in Scotland’s Orkney Islands is processing workloads for a global, distributed computing project to understand the viral proteins that cause COVID-19 and design therapeutics to stop them.

Distributed computing projects harness otherwise idle computer processing power to perform specific tasks for big science research. Ongoing projects include efforts to understand climate change, map cancer markers and fight infectious disease. The trend started in the late 1990s when tens of thousands of people downloaded the SETI@home screensaver to hunt for extraterrestrial radio signals.

The Folding@home distributed computing project launched in October 2000 to simulate protein dynamics. Proteins are molecular machines that perform many functions essential to life, ranging from providing a sense of taste and smell to muscle contraction and hair growth. How proteins – chains of amino acids – fold into structures determines their function. The Folding@home simulations can lead to breakthroughs such as identifying sites on a viral protein that a therapeutic drug could bind to.

“Folding@home was one of the first distributed computing groups to start working on COVID-related problems and immediately came out with a bunch of workloads that were geared toward finding antibodies and figuring out ways they could create immunizations,” said Spencer Fowers, a principal member of technical staff for Microsoft’s special projects research group.

Fowers is the technical lead for Project Natick, a years-long research effort to investigate the feasibility of manufacturing and operating environmentally sustainable, prepacked datacenter units that can be ordered to size, rapidly deployed and left to operate lights out on the seafloor for years. The project’s Northern Isles datacenter, which is about the size of a shipping container, has been humming away 117 feet under the sea in Scotland since June 2018.

A major goal for the Northern Isles deployment is to study how well the system’s plumbing maintains operational temperatures while the 864 standard datacenter servers are running all the time. To keep the servers humming even when they are not processing workloads for Microsoft, Fowers runs distributed computing jobs on them.

In 2018, Fowers set up the Northern Isles servers to run jobs from the World Community Grid, an IBM-sponsored distributed computing effort with projects tackling big science problems such as how to better forecast rainfall in sub-Saharan Africa, understand how bacteria in human bodies may help cause disease, and find ways to detect and treat cancer.

When Folding@home announced the COVID-19-related research effort, Fowers jumped on the opportunity and deployed the software across the Northern Isles servers.

In addition, Fowers worked with colleagues at Microsoft to enable Microsoft employees currently working from home to deploy the project on their office computers, and worked with the Folding@home community to improve the ability to install the software remotely.

These efforts are in addition to the contributions Microsoft is making to Folding@home via the AI for Health initiative, which recently granted Azure computing resources to help Folding@home run simulations of COVID-19 proteins. That effort has already revealed sites on the virus that potential drugs could bind to, Greg Bowman, the director of Folding@home and an associate professor at Washington University in St. Louis, noted in a video prepared for Microsoft’s virtual Build developers conference.

Distributed computing projects are designed to consume spare processing power of personal computers, which makes Project Natick uniquely positioned to make outsized contributions, noted Fowers.

“We just got into the top 1% of contributors in the world,” said Fowers in reference to the contribution from the Northern Isles. That’s largely because the servers are “100% of the time dedicated to this project. They are constantly working on workloads and it allows us to do a big contribution.”

Unlike commercial Microsoft datacenters that run the full Azure infrastructure, including artificial intelligence frameworks tailored to meet specific needs, Project Natick is a research datacenter and its servers are generic, similar to several thousand high-end personal computers.

“This COVID-19 pandemic is an example of why the distributed computing platform is still relevant today,” said Fowers, explaining that “it makes it quick for adoption, and it gives people the opportunity to feel like they are contributing.”