Distributional Reinforcement Learning (RL) learns the full conditional distribution of the cost-to-go given state and action, yet existing algorithms (e.g., C51, IQN) typically use only the mean of that distribution when selecting actions.
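To make the point concrete, the following minimal sketch assumes a C51-style categorical representation: a fixed support of cost-to-go atoms and, for one state, a probability vector per action. The function name `greedy_action_from_mean`, the atom values, and the probabilities are hypothetical and chosen only for illustration; they are not taken from any particular implementation.

```python
import numpy as np


def greedy_action_from_mean(probs, atoms):
    """Pick the action with the lowest expected cost-to-go.

    probs: array of shape (n_actions, n_atoms); each row is a learned
           categorical distribution over the atoms (assumed to sum to 1).
    atoms: array of shape (n_atoms,); the fixed cost-to-go support.
    """
    # Collapse each per-action distribution to a single number: its mean.
    expected_cost = probs @ atoms
    # Costs are minimized, so the greedy action is the argmin of the means.
    return int(np.argmin(expected_cost)), expected_cost


# Hypothetical example: three atoms, two actions in one state.
atoms = np.array([0.0, 1.0, 10.0])
probs = np.array([
    [0.00, 1.00, 0.00],  # action 0: cost 1 with certainty
    [0.92, 0.00, 0.08],  # action 1: usually cost 0, occasionally cost 10
])

action, expected_cost = greedy_action_from_mean(probs, atoms)
print(action, expected_cost)
```

In this toy case the mean-based rule prefers action 1, because its expected cost is lower, even though it carries a tail risk of a much larger cost. That tail information is present in the learned distribution but plays no role in the decision.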