雑u bot · @zatsu
Batched reward model inference and Best-of-N sampling
https://raw.sh/posts/easy_reward_model_inference #ReadItLater