Like many people, I rely on the reviews of others to find the best products on any website. This is especially true on Sephora’s website. I don’t want to take a chance on an 85-dollar face cream unless I’m pretty sure it’s going to be great (and perhaps bring about world peace). However, when I search for products on their site and sort by rating, I’m usually disappointed. The highest-ranked product will often have a five-star rating but only a single review. Meanwhile, a product that has been reviewed hundreds of times and has an average customer rating of 4.8 stars is ranked much lower down the list. Shouldn’t the number of reviews have some bearing on how a product gets ranked? How do you compare the average ratings of products with differing numbers of reviews?

This is where Bayesian estimation comes in. With Bayesian estimation, we can use both the average customer rating and the number of reviews to come up with a better ranking system. To do so, we need to select two parameters: a prior, *p*, and a confidence score, *C*. Every rating is adjusted towards the prior, and the strength of that adjustment is determined by the confidence score. The formula is:

$\text{Bayesian estimate} = \frac{C \times p + \text{total number of stars}}{C + \text{number of reviews}}$

To compare a product with 1 review of 5 stars to a product with 100 reviews and an average rating of 4.8, let’s say we choose *C* = 10 and *p* = 3. The results would look like this:

$\frac{10 \times 3 + 5 \times 1}{10 + 1} \approx 3.2$

$\frac{10 \times 3 + 4.8 \times 100}{10 + 100} \approx 4.6$
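The same arithmetic can be checked with a few lines of Python. This is only a minimal sketch: the function name `bayesian_estimate` and its parameter names are illustrative assumptions, not necessarily what the code for this post uses.

```python
def bayesian_estimate(total_stars, num_reviews, prior=3.0, confidence=10.0):
    """Pull a product's average rating toward `prior` with weight `confidence`."""
    return (confidence * prior + total_stars) / (confidence + num_reviews)


# Product with a single 5-star review.
one_review = bayesian_estimate(total_stars=5 * 1, num_reviews=1)

# Product with 100 reviews averaging 4.8 stars.
many_reviews = bayesian_estimate(total_stars=4.8 * 100, num_reviews=100)

print(round(one_review, 1), round(many_reviews, 1))  # prints: 3.2 4.6
```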

The product with 100 reviews is now ranked higher than the product with only one review. Now let’s think about what’s going on in this equation to figure out how to pick values for *p* and *C*.

Essentially, the equation adds 10 extra reviews, each rated 3 stars, to every product. Both adjusted ratings move closer to 3, but the first product moves much closer because it has only one review of its own. In general, the more reviews a product has, the smaller the adjustment to its rating.
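One way to convince yourself of this "extra reviews" reading is to literally append the imaginary reviews and take a plain average. The sketch below does that, assuming *p* = 3 and *C* = 10 as above. The helper names and the particular mix of 4- and 5-star ratings (chosen only so that the average comes out to 4.8) are made up for illustration.

```python
from statistics import mean


def with_pseudo_reviews(ratings, prior=3, confidence=10):
    """Plain average of the real ratings plus `confidence` imaginary reviews of `prior` stars.

    `confidence` must be a whole number here because it is used as a list length.
    """
    return mean(list(ratings) + [prior] * confidence)


def bayesian_estimate(ratings, prior=3, confidence=10):
    """The formula from the post, applied to a list of individual star ratings."""
    return (confidence * prior + sum(ratings)) / (confidence + len(ratings))


few = [5]                    # one 5-star review
many = [5] * 80 + [4] * 20   # 100 reviews averaging 4.8 (illustrative mix)

for ratings in (few, many):
    raw = mean(ratings)
    adjusted = bayesian_estimate(ratings)
    # Both views of the calculation agree.
    assert abs(adjusted - with_pseudo_reviews(ratings)) < 1e-9
    print(f"n={len(ratings):3d}  raw={raw:.2f}  adjusted={adjusted:.2f}  shift={raw - adjusted:.2f}")
```

With these inputs the single-review product drops by about 1.82 stars, while the 100-review product drops by only about 0.16.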

To demonstrate the effect Bayesian estimation can have on ranking, I scraped data on the ratings and number of reviews for all of the eye creams listed on Sephora’s website. This totaled 116 products, some with over 1300 reviews. Next, I computed the Bayesian estimate using 3 as my prior and 10 as my confidence score. Another option to consider for the prior might be the average rating of all reviews across the site. It would take a lot more scraping to gather that information, so I settled on the middle score (users cannot review a product without rating it at least one star, so ratings run from 1 to 5 and 3 is the midpoint).
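If the site-wide data were available, the alternative prior could be computed as the total number of stars divided by the total number of reviews. Here is a rough sketch of that idea. The function name `site_wide_prior` and the sample numbers are hypothetical and are not the scraped Sephora data.

```python
def site_wide_prior(products):
    """Review-weighted mean rating: total stars divided by total reviews.

    `products` is an iterable of (average_rating, number_of_reviews) pairs.
    """
    products = list(products)
    total_reviews = sum(n for _, n in products)
    if total_reviews == 0:
        raise ValueError("no reviews to average")
    total_stars = sum(avg * n for avg, n in products)
    return total_stars / total_reviews


# Hypothetical (average rating, number of reviews) pairs for illustration only.
sample = [(5.0, 1), (4.8, 100), (3.9, 40)]
prior = site_wide_prior(sample)
print(round(prior, 2))
```

Weighting by review count matters: a simple mean of the three averages would let the single 5-star review count as much as the 100-review product.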

Let’s look at the products with the 15 highest ratings, according to Sephora. If you instead sort by Bayesian estimate, the rankings change drastically. The product that was ranked first by Sephora is now ranked 15th.
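To see how re-sorting can reshuffle a list, here is a toy version in Python with *p* = 3 and *C* = 10. The four products and their numbers are invented for illustration and are not taken from the Sephora data. The function name is likewise an assumption.

```python
def bayesian_estimate(avg_rating, num_reviews, prior=3.0, confidence=10.0):
    """Bayesian estimate from an average rating and a review count."""
    return (confidence * prior + avg_rating * num_reviews) / (confidence + num_reviews)


# Made-up products: (name, average rating, number of reviews).
products = [
    ("Cream A", 5.0, 1),
    ("Cream B", 4.8, 100),
    ("Cream C", 4.9, 12),
    ("Cream D", 4.5, 1300),
]

# Ranking by raw average rating, highest first.
by_rating = sorted(products, key=lambda p: p[1], reverse=True)

# Ranking by Bayesian estimate, highest first.
by_estimate = sorted(products, key=lambda p: bayesian_estimate(p[1], p[2]), reverse=True)

print([name for name, _, _ in by_rating])    # A, C, B, D
print([name for name, _, _ in by_estimate])  # B, D, C, A
```

In this toy list the product with a single perfect review falls from first place to last, which is the same kind of reversal described above.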

After comparing the rankings when sorted by Bayesian estimate, I did notice something interesting. My new rankings are quite similar to those on Sephora’s site when sorted by ‘bestselling’ products. Perhaps Sephora is using some sort of weighted ranking, or perhaps bestselling products simply receive more reviews, and products with more reviews are pulled less towards the prior by the Bayesian estimation equation.

You can find all the code for this post **here** and read about Bayesian estimation in more detail **here**. Below are all 116 eye products available from Sephora ranked by Bayesian estimate.