dropsofai.com
Boosting your Sequence Generation Performance with ‘Beam Search + Language model’ decoding - Drops of AI
Unlike a greedy decoder, a beam search decoder doesn’t commit to the single most probable token at each prediction step. Instead, it keeps the k most probable candidate sequences found so far and extends each of them at the next step (where k is called the beam width or beam size).
Kartik Chaudhary
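To make the contrast with greedy decoding concrete, here is a minimal sketch of beam search in Python. It is an illustration under stated assumptions, not the article's implementation: the function `beam_search`, the helper `toy_model`, and the probability table `TOY` are all hypothetical, the probabilities are invented for the example, and end-of-sequence handling and language model scoring are left out for brevity.

```python
import math
from typing import Callable, Dict, List, Tuple

Prefix = Tuple[str, ...]


def beam_search(
    next_probs: Callable[[Prefix], Dict[str, float]],
    beam_width: int,
    steps: int,
) -> List[Tuple[Prefix, float]]:
    """Return the surviving (sequence, log-probability) pairs, best first."""
    beams: List[Tuple[Prefix, float]] = [((), 0.0)]  # start from the empty prefix
    for _ in range(steps):
        candidates: List[Tuple[Prefix, float]] = []
        for prefix, score in beams:
            # Extend every kept prefix with every possible next token.
            for token, p in next_probs(prefix).items():
                if p > 0.0:
                    candidates.append((prefix + (token,), score + math.log(p)))
        # Keep only the beam_width highest-scoring sequences.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


# Hypothetical toy distribution (assumption, made up for this example):
# the best first token does not lead to the best two-token sequence.
TOY: Dict[Prefix, Dict[str, float]] = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"x": 0.9, "y": 0.1},
}


def toy_model(prefix: Prefix) -> Dict[str, float]:
    return TOY[prefix]


# beam_width=1 is exactly greedy decoding.
greedy = beam_search(toy_model, beam_width=1, steps=2)[0]
beam = beam_search(toy_model, beam_width=2, steps=2)[0]

print("greedy:", greedy[0], round(math.exp(greedy[1]), 2))
print("beam  :", beam[0], round(math.exp(beam[1]), 2))
```

With a beam width of 1 the search commits to "a" at the first step and ends with a sequence of probability 0.33, while a beam width of 2 keeps "b" alive long enough to find the sequence ("b", "x") with probability 0.36. Scores are summed as log-probabilities rather than multiplied as raw probabilities, which avoids numerical underflow on longer sequences.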