Leveraging Structure For Optimization In Deep Learning