Computing Biology: How Bioinformatics is Making a Difference at Coriell

06/2024

The more we learn about our DNA and the role it plays in our health, the more sheer information this work entails. There are billions of base pairs in the human genetic code, and identifying meaningful variations in that code is akin to finding a needle in a million haystacks.

Fortunately, as with so many previously impossible tasks, advances in computer technology now allow machines and algorithms to analyze great troves of data. Among other things, they can compare the efficacy of drugs in the trial stage, identify new biomarkers that allow for earlier detection of cancer, or investigate alterations in histone proteins that may affect gene expression.

Coriell’s bioinformatics scientists are an essential part of our team. They play a critical role in our cancer research, help make the samples in our care more accessible, and collaborate with scientists all over the world. 

Gennaro Calendo is the Associate Director of Bioinformatics at Coriell and answered a few questions about his team and their work.


I am sure many are unfamiliar with the field. When friends or family ask what bioinformatics is, how do you explain it?

Usually I start with a question: “How much do you know about DNA?” Then I gauge how quickly their eyes glaze over and adjust my answer.

I answer with something along the lines of, “Bioinformatics is using techniques developed in computer science, mathematics, and statistics to answer questions about biological data. That could be anything from working out how proteins fold to determining how a particular drug influences the expression of genes in breast cancer.”

This 10-step guide is a good laugh, but it really isn’t too far from how it often goes.

Can you give an overview of some of those techniques? How does computer science/mathematics/statistics come into play at Coriell?

In bioinformatics, most of the techniques you use in practice were actually developed for different purposes in fields like math, biology, or computer science. Clever biologists then recognized that they could use those tools and techniques to solve their own problems.

For example, the Burrows-Wheeler transform is an algorithm invented for data compression, but it has also proven useful for genomic alignment (working out where in the genome each sequencing read originated).
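To make that concrete, here is a minimal Python sketch of the transform and of the "backward search" trick that lets it be used for exact pattern matching. It is an illustration only, not the code used at Coriell or in any production aligner: the function names, the `$` end-of-text marker, and the naive rotation sort are all assumptions chosen for readability, and real tools rely on far more memory-efficient indexes.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler transform via a naive sort of all rotations.

    Assumes `sentinel` does not occur in `text` and sorts before every
    character in it (true for "$" versus the letters A, C, G, T).
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation table.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed: str, sentinel: str = "$") -> str:
    """Recover the original text by repeatedly prepending and re-sorting."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        table = sorted(ch + row for ch, row in zip(transformed, table))
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


def count_occurrences(transformed: str, pattern: str) -> int:
    """Count exact matches of `pattern` using backward search on the BWT."""
    first_column = sorted(transformed)
    # first_index[ch] = position of the first row that starts with ch
    first_index = {}
    for i, ch in enumerate(first_column):
        first_index.setdefault(ch, i)

    def occ(ch: str, k: int) -> int:
        # Number of times ch appears in the first k characters of the BWT.
        return transformed[:k].count(ch)

    lo, hi = 0, len(transformed)
    # Walk the pattern right to left, narrowing the range of matching rows.
    for ch in reversed(pattern):
        if ch not in first_index:
            return 0
        lo = first_index[ch] + occ(ch, lo)
        hi = first_index[ch] + occ(ch, hi)
        if lo >= hi:
            return 0
    return hi - lo
```

For instance, `bwt("banana")` yields `"annb$aa"`, which groups repeated letters together (the property that helps compression), and `count_occurrences(bwt("ACGTACGTTACG"), "ACG")` reports how many times the short "read" `ACG` appears in that toy "genome" without scanning the original text.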

At Coriell, we do a lot of analyses that involve these tools and then we check to see if the results we get are reliable. This is where statistical techniques come into play. Similarly, many of the statistical techniques we use were developed in other fields and then adapted to biology. 

The bioinformatics team here works closely with our scientists studying cancer. What roles does bioinformatics play in that work?

Bioinformatics plays a crucial role in the study of cancer biology. The cost of sequencing a genome has sharply declined since the human genome was first sequenced and, as a result, everyone uses next-generation sequencing in their experiments. Whether that means sequencing cancer cells treated with a particular epigenetic drug, examining DNA methylation changes with aging, or detecting mutations in tumors, it’s all done with sequencing.

However, sequencing produces a massive amount of data, far too much to load into Excel or analyze with any desktop tool. This is where bioinformatics comes into play. We can take those mounds of data and (hopefully) extract meaningful results.

The problems in biology are growing very large, and it takes a new set of skills to handle problems at that scale. That is where bioinformaticians come in.

And the technology is now sufficiently advanced to help?

Having very powerful computers definitely plays a role, since the data keeps getting larger and larger.

Again, it’s the mathematicians, statisticians, and computer science folks that we heavily rely on to develop new algorithms for these huge datasets, but bioinformaticians can build on this work and adapt it for our problems. It’s an ever-evolving process that benefits all sides.

Aside from research, Coriell is a biobank with collections that have enabled genetic research for generations. How does the bioinformatics team assist that side of Coriell? 

I have helped a bit with the biobanking side of Coriell and it’s been exciting. We have a small working group led by Laura Scheinfeldt, PhD, Coriell’s Director of Repository Science, that consists of me and a few others here at Coriell.

Over the last several years, we implemented things like gene expression search, SNP search, and star allele search, all tools designed to help non-computational scientists find samples based on genetic characteristics. Without these tools, it can be tough to parse this data, so we’ve helped make it easier for people to find the samples they’re looking for.

Your team often collaborates with scientists outside of Coriell too?

Yes, we offer analysis services broadly, and also collaborate with scientists and clinicians involved with the Camden Cancer Research Center, the three-way partnership to study cancer we launched with Cooper University Health Care and Cooper Medical School of Rowan University.

We’ve collaborated many times with researchers at other organizations, including the University of Pennsylvania, Hackensack Meridian Health, and Van Andel Research Institute through the Epigenetic SPORE, to list just a few.

This is such a new and rapidly changing field. What first drew you to it?

Computer programming was my introduction to bioinformatics. In my first research job out of college, there were many tedious tasks that needed to be done manually and I thought there must be an easier way. I taught myself to code and automated a lot of these tasks. That, to me, was very rewarding, and I realized that what I really wanted to do was combine this new skill with my love of biology, which led me to bioinformatics.

As far as skills are concerned, a bioinformatics scientist needs to know a little about programming, a little about statistics, a little about mathematics (mostly linear algebra), a little about computer science, and a decent amount about biology. The field is growing and changing, and it’s good to know at least a little about many things.

