• kciwsnurb@aussie.zone
    7 months ago

    The temperature scale, I think. You divide the logit output by the temperature before feeding it to the softmax function. Larger (resp. smaller) temperature results in a higher (resp. lower) entropy distribution.
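
A quick way to see this is a minimal sketch in plain Python. The function names `softmax_with_temperature` and `entropy` and the example logits are made up for illustration, not taken from any particular library:

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Divide the logits by the temperature, then apply softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the max before exponentiating for numerical stability;
    # this does not change the resulting probabilities.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical logits, chosen only for illustration.
logits = [2.0, 1.0, 0.1]
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs], round(entropy(probs), 3))
```

The printed entropy should go up as the temperature goes up: a low temperature concentrates probability on the largest logit, and a high temperature pushes the distribution toward uniform.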