Tracking-by-selection pose estimation in videos