Understanding the Effectiveness of Ensembles in DL

Dissecting ensembles, one at a time.

This report explores the ideas presented in Deep Ensembles: A Loss Landscape Perspective by Stanislav Fort, Huiyi Hu, and Balaji Lakshminarayanan.

In the paper, the authors investigate the question: why do deep ensembles work better than single deep neural networks?

In their investigation, the authors find that:

  • Different snapshots of the same training run (e.g., the model after 1, 10, and 100 epochs) are functionally similar. Hence, an ensemble of them is less likely to explore the different modes (distinct local optima) of the loss landscape.

  • Different solutions of the same model (i.e., networks trained from different random initializations) are functionally dissimilar. Hence, their ensemble is more likely to explore the different modes of the loss landscape (see the sketch after this list).

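To make "functional (dis)similarity" concrete, a simple proxy is the disagreement fraction: the share of test points on which two models predict different labels. Below is a minimal sketch of that idea; the variable names and the dummy prediction arrays are purely illustrative, not the paper's actual code.

```python
import numpy as np

def disagreement(preds_a: np.ndarray, preds_b: np.ndarray) -> float:
    """Fraction of test points where two models predict different labels.

    Low disagreement -> functionally similar (e.g., snapshots of one run);
    high disagreement -> functionally dissimilar (e.g., independent runs).
    """
    return float(np.mean(preds_a != preds_b))

# Hypothetical usage: these arrays stand in for argmax predictions of the
# respective models on a shared test set (10 classes, 1,000 examples).
rng = np.random.default_rng(0)
labels_epoch_10 = rng.integers(0, 10, size=1000)
labels_epoch_100 = labels_epoch_10.copy()
labels_epoch_100[:50] = rng.integers(0, 10, size=50)      # snapshots mostly agree
labels_independent_run = rng.integers(0, 10, size=1000)   # separate random init

print(disagreement(labels_epoch_10, labels_epoch_100))       # small
print(disagreement(labels_epoch_10, labels_independent_run)) # large
```

In the paper's framing, members with high disagreement cover different modes of the loss landscape, which is exactly what makes averaging their predictions worthwhile.
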
Inspired by these findings, this report presents several insights that are useful for understanding the training dynamics of deep neural networks in general.

😼 Check out the GitHub repo here.

This was co-written and implemented with Sayak Paul.
