On the computational and neural characterisation of reward learning behaviour