Single-cell RNA-seq technologies have opened up a completely new era for transcriptomic studies. For the first time ever, scientists can look at the individual transcriptomic profiles of millions of cells and better understand how each cell functions in a tissue. Yet scientists now face bigger challenges in analyzing these massive amounts of… Read more »

“I’m clueless,” you say, looking at an ocean of unlabeled data waving in front of you. It is true that the lack of labels can sometimes freak us out, leaving us wondering how to group the data together. But luckily, the k-means clustering algorithm is here to the rescue, one of the… Read more »

Today we finished the first version of Hera-T, a new single-cell RNA-seq quantification algorithm. We developed Hera-T by addressing challenging alignment errors found in Cell Ranger. As a result, Hera-T is more accurate than Cell Ranger. Hera-T is also more than 10 times faster than Cell Ranger, while consuming just a small amount of… Read more »

Heatmaps. We see them all over the scientific journals — the shading matrices that convey meaningful stories. Wanna interpret gene expressions? Draw a heatmap. Gotta examine the correlation among variables? Get it done with a heatmap. It seems that the shading has breathed such life into those cheerless matrices of… Read more »

Principal component analysis (PCA) has been gaining popularity as a tool to bring out strong patterns from complex biological datasets. We have answered the question “What is a PCA?” in this jargon-free blog post — check it out for a simple explanation of how PCA works. In a nutshell, PCA capture… Read more »

As we are entering the era of Big Data, everyone and their moms seem to be talking about PCA. All the papers you read mention PCA (with lots of jargon, of course). Half of the seminars you’ve been to this month touch on PCA. Your boss/collaborators suggest trying PCA on your data. “What… Read more »

We showed a quick and easy way to compare box plots in a previous post. Let’s dig deeper into what information you can use to compare two box plots. Overlapping boxes and medians: It gets tricky when the boxes overlap and their median lines are inside the overlap range. As always, math comes to… Read more »

Box plots, a.k.a. box-and-whiskers plots, are an excellent way to visualize differences among groups. They manage to carry a lot of statistical details — medians, ranges, outliers — without looking intimidating. But box plots are not always intuitive to read. How do you compare two box plots? The key information… Read more »

1. Violin graph is like a box plot, but better. Box-and-whisker plots are great. They show medians, ranges and variability effectively. They allow comparing groups of different sizes. They are super simple to create and read, so naturally, they are all over the place. But box plots can be misleading. They are not… Read more »

Box plots, a.k.a. box-and-whisker plots, are an excellent way to compare groups. They are intuitive: viewers can see samples’ medians, distribution, and variability at a quick glance. They allow comparing samples of different sizes; for example, a set of 50 data points with a set of 30. They don’t even… Read more »
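
To make the "50 data points versus 30" idea concrete, here is a minimal Python sketch of the numbers a box plot typically draws for each group. It is an illustration, not code from any of the posts above: the function name `box_plot_summary`, the simulated groups, and the choice of the common 1.5 × IQR whisker rule are all assumptions made for this example.

```python
import random
import statistics


def box_plot_summary(values):
    """Return the numbers a typical box plot draws for one group.

    Assumes the common 1.5 * IQR rule for whiskers and outliers;
    plotting tools may use slightly different quartile conventions.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "n": len(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        # Whiskers reach the most extreme points that are still inside the fences.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        # Points beyond the fences are drawn individually as outliers.
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


# Two made-up groups of different sizes (50 and 30 points), seeded for repeatability.
rng = random.Random(0)
group_a = [rng.gauss(10, 2) for _ in range(50)]
group_b = [rng.gauss(12, 3) for _ in range(30)]

for name, group in (("A", group_a), ("B", group_b)):
    s = box_plot_summary(group)
    print(f"group {name}: n={s['n']}, median={s['median']:.2f}, "
          f"IQR={s['iqr']:.2f}, outliers={len(s['outliers'])}")
```

Because every value in the summary is a position within the group (a median, a quartile, a fence) rather than a count, the two summaries can be read side by side even though the groups differ in size.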