
Data Management Trends: Machine Learning, Artificial Intelligence, and High Performance Computing in the Life Sciences Industry

by Catherine Chiang – April 10, 2018

Explosive data growth in the life sciences industry is nothing new; what’s truly exciting is how this data can be used!

Research and healthcare organizations have been generating huge amounts of data due to developments in scientific equipment, but for years lacked the tools to use their data to its fullest potential. Now, thanks to technological advancements, including machine learning, artificial intelligence, and high performance computing, scientists are harnessing the power of all that data.

ML, AI, and HPC are Revolutionizing the Life Sciences

Enormous amounts of data enable organizations to perform deeper analytics, build more accurate machine learning algorithms, and develop better artificial intelligence models.

One area of the life sciences where using more data has improved outcomes is healthcare. A Fortune article describes how patients can use personalized health data to treat chronic diseases, hospitals use AI to send crisis victims to facilities best prepared to treat them, and pharmaceutical companies use vast stores of genetic data to validate new drugs.

The huge scale of data and the computational power required for ML/AI have pushed life sciences organizations to utilize HPC.

Aaron Gardner, Director of Technology at BioTeam, says, “People are processing more samples, doing more studies, more genomes, more data sets that are larger and larger. All this is creating scale pushing for larger HPC resources in core compute, storage, and networking even though analytics is the key application.”

Data Management for ML/HPC Workflows

We have customers utilizing Igneous Hybrid Storage Cloud to manage petabytes of scientific data, such as tumor imaging and brain scan data, that must be processed and analyzed. Automating data protection and data movement, and making analysis easier, yields faster time to market and quicker results for our users.

Managing the massive scale of data, administering machine learning processes, and having the IT skill set to manage an HPC environment are all common challenges for life sciences organizations. Furthermore, the cost of IT resources at this scale, combined with workflows that are new to many organizations, creates a need to align resource consumption with each project or pipeline.

Gardner says, “I think for consumers of HPC, the skills required to kind of get going and do your science are decreasing, but [I] would say that to properly provide HPC on premise or in the cloud, the skills and the knowledge required are increasing.”

In the case of machine learning workflows, Igneous Hybrid Storage Cloud acts as a strong, API-driven archive tier that not only ingests enormous amounts of data, but also enables the data to be used. This means that Igneous Hybrid Storage Cloud can plug into customer applications that utilize the data, or easily send data to public cloud for computing.

Importantly, Igneous Hybrid Storage Cloud is delivered as-a-Service. Our remotely managed infrastructure and software reduce management overhead for our customers, so they can keep their IT departments lean and focus on research.

In addition, having a secondary storage solution capable of highly efficient data movement, like Igneous Hybrid Storage Cloud, enables organizations to take advantage of pay-as-you-go cloud economics for HPC in public cloud.

If you would like to learn more about how Igneous Hybrid Storage Cloud can support your machine learning and high performance computing workflows, contact us!

