From its inception in the mainframe era until around a decade ago, scientific computing had the reputation of a cloistered art: an arcane, guarded practice understood only by a select crowd in the highest reaches of government and academia. Its machines were monolithic, its software dense and overtly mathematical, and its management frameworks unwieldy.
The evolution from massive mainframes to more comprehensible clusters, and now to remote resources accessible to all with the hardware complexity abstracted away, is one of the great technology tales of our times. It's the story of the democratization of science, commercial research, and the data-driven tasks of daily life. It's the dawn of the era of the "scientific entrepreneur" who, with the right skills, data, and an internet connection, is limited only by imagination.
It's difficult not to be swept away by hyperbole when thinking about what this access means. The small genomics researcher with a reference sample and a big idea is freed from the burdens of owning and operating a cluster. The same benefits apply across research arenas, from neuroscience and energy to materials science, biology, and beyond. Science can now be delivered as a service, removing the complexity and the weight of cluster ownership from research and letting domain specialists focus on what matters most.
The phrase "science as a service" first circulated back in 2005, just before the Amazon cloud hit the mainstream. It was discussed in the context of grid computing, the precursor to the idea of shared, remote resources, which gained considerable traction, particularly in academia. The idea evolved with each passing year, gathering a host of new tools, platforms, scientific gateways, and collaborative development frameworks.
The originator of the "science as a service" concept was one of the early pioneers of grid computing, Argonne National Laboratory's Dr. Ian Foster. He is still delivering on the promise of distributed, remote computational tools as a driver of the next generation of discovery, but now with the public cloud as a key enabling mechanism. Access to on-demand infrastructure can "reduce the competitive disadvantage that smaller labs may experience relative to large, well-funded labs by making tools formerly available only to large labs and specialist researchers accessible to all," contends Foster. Access, in other words, is the great leveler.
“In a sense, the growing number of science cloud services are analogous to building blocks. As we assemble bigger and better building blocks — blocks that are well-managed and maintained, and reliable and scalable — scientific entrepreneurs can more easily build applications that solve specific scientific problems without having to create and support monolithic software stacks themselves.”
Chances are, especially for those of you already using Amazon Web Services, this story is not unfamiliar. Perhaps you are one of the scientific entrepreneurs who brought a new question, or a new approach to an old problem, into the cloud for resolution. You built your own tools or leveraged one of the many open source frameworks primed for AWS; you defined your own hardware, selected your ideal instance type configuration and Intel® Xeon® processor, and ran your application your way.
And although it might seem ordinary and routine these days to be able to do this, to be the scientific entrepreneur, it is worth pausing over these developments from time to time: the ability to spin up a cluster on demand with the exact processor, storage, memory, and other resources you need, and to shut it down when you're finished. Just like that.
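For those who have yet to try it, the sketch below shows roughly what "just like that" looks like in practice, using the boto3 Python SDK to launch a single compute instance and terminate it when the work is done. The region, AMI ID, and instance type are illustrative placeholders chosen for this example (C5 instances happen to be backed by Intel Xeon processors); a real workflow would also configure key pairs, security groups, IAM roles, and error handling.

```python
# Minimal sketch: on-demand compute with the boto3 Python SDK.
# Region, AMI ID, and instance type are placeholders for illustration.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Launch a single instance with the hardware profile you choose.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI with your tools pre-installed
    InstanceType="c5.4xlarge",        # example Intel Xeon-backed instance type
    MinCount=1,
    MaxCount=1,
)
instance_id = response["Instances"][0]["InstanceId"]

# Wait until the instance is running, then hand it your workload.
ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
print(f"{instance_id} is running -- submit jobs, copy data, do science.")

# ... run your analysis ...

# Shut it down when you're finished.
ec2.terminate_instances(InstanceIds=[instance_id])
ec2.get_waiter("instance_terminated").wait(InstanceIds=[instance_id])
print(f"{instance_id} terminated; the meter stops here.")
```

The same pattern scales from one instance to a whole cluster: the capacity exists only for as long as the science needs it.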
As the first and most widely used service of its kind, Amazon Web Services has made history, and it will continue doing so as the next generation of developers, researchers, and data explorers find their way here. The stories are limitless; the promise and potential, boundless.