This is an updated posting: the deadline is approaching.


Understanding the learning dynamics of Neural Audio models using Linear Algebra



Since around 2016, most research in Digital Music and Digital Audio has adopted
Deep Learning techniques. These have brought important advances in performance
in applications such as Music Source Separation, Automatic Music Transcription
and Timbre Transfer. On the downside, the models keep getting larger: they
consume increasingly large amounts of power for training and inference, require
more data, and become less understandable and explainable. These are the issues
that underpin the research in this PhD.



A fundamental building block in Deep Learning models is Matrix (or Linear)
Algebra. Through training, the matrix that represents each layer is
progressively modified to reduce the error between the model's predictions and
the training data. By examining what happens to these matrices during training,
it should be possible to engineer them to learn faster and more efficiently, and
to build DL models that are more compact. Here we turn to Low Rank matrices: we
wish to explore what happens when Low Rank is imposed as a training constraint
in Neural Audio models. Is the model better trained, or not? Is the model
easier and cheaper to train, or not? Early results in non-audio/music
applications suggest that Low Rank models are both better trained and cheaper
to train. This work needs developing further for this PhD.
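
As an illustration only (not the project's method), one common way to impose
Low Rank on a layer is to store its weight matrix as a product of two thin
factors, W = U V, so that its rank can never exceed the chosen value. The NumPy
sketch below compares parameter counts and checks that the factored forward
pass matches the dense one. The layer sizes, the rank of 16 and the helper name
low_rank_layer_params are assumptions made for this example.

```python
import numpy as np


def low_rank_layer_params(n_out, n_in, rank):
    """Return (dense, factored) parameter counts for an n_out x n_in layer."""
    dense = n_out * n_in                 # full weight matrix W
    factored = rank * (n_out + n_in)     # U is n_out x rank, V is rank x n_in
    return dense, factored


rng = np.random.default_rng(0)
n_out, n_in, rank = 512, 1024, 16        # assumed sizes, for illustration only

# Low Rank by construction: W = U @ V has rank at most `rank`.
U = rng.standard_normal((n_out, rank))
V = rng.standard_normal((rank, n_in))
W = U @ V

x = rng.standard_normal(n_in)
y_dense = W @ x                          # roughly n_out * n_in multiplies
y_factored = U @ (V @ x)                 # roughly rank * (n_out + n_in) multiplies

dense, factored = low_rank_layer_params(n_out, n_in, rank)
print(f"dense parameters:    {dense}")
print(f"factored parameters: {factored}")
print("outputs match:", np.allclose(y_dense, y_factored))
```

In a trainable model, U and V would be the learned parameters in place of W, so
the rank constraint holds throughout training rather than being applied
afterwards.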



Research will start with Music Source Separation, exploring the learning
dynamics of established models like DeMucs. It will then use the knowledge of
these dynamics to intelligently prune the models using the Low Rank approach
above [1]. This should speed up learning and inference and improve
performance. Next, the work could shift to other Neural Audio models and
applications, or could become more immersed in the field of Mechanistic
Interpretability [2], to reveal the hidden, innermost structures that emerge
within trained Neural Networks. Other lines of enquiry could include the
trade-off between training data set size and the Ideal Rank of the various
layers in the model. Again, early results surprisingly suggest that Low Rank
layers can be trained with less data!
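
A minimal sketch of SVD-based pruning of a single dense weight matrix, in the
spirit of the Low Rank approach mentioned above: keep the largest singular
values and discard the rest. The helper names truncate_rank and
rank_for_energy, the 99% energy threshold and the synthetic matrix are all
hypothetical choices for illustration; they are not taken from [1] or from
DeMucs.

```python
import numpy as np


def truncate_rank(W, rank):
    """Best rank-`rank` approximation of W in the Frobenius norm (via SVD)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]


def rank_for_energy(W, energy=0.99):
    """Smallest rank whose singular values hold `energy` of the squared spectrum."""
    s = np.linalg.svd(W, compute_uv=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(cumulative, energy)) + 1
    return min(r, len(s))


rng = np.random.default_rng(0)

# Synthetic stand-in for a trained layer: a rank-8 signal plus small noise.
signal = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 128))
W = signal + 0.01 * rng.standard_normal((64, 128))

r = rank_for_energy(W, energy=0.99)
W_pruned = truncate_rank(W, 8)
rel_error = np.linalg.norm(W - W_pruned) / np.linalg.norm(W)

print("rank needed for 99% energy:", r)
print("relative error after rank-8 truncation:", rel_error)
```

Tracking a quantity like rank_for_energy for each layer over training epochs is
one simple way to observe how the effective rank of a layer evolves.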



Candidates will have an excellent background in Linear Algebra (e.g.
Eigenvectors, Singular Value Decomposition, Tensor Analysis) as well as a strong
interest in some aspect of music or audio. They will also need a background in
Deep Learning and a sound knowledge of appropriate programming tools. Knowledge
of Mathematica and the Wolfram Language would be a bonus. Candidates will need a
strong undergraduate degree and preferably a Master's degree to a high level.



Please note that a studentship is only available for those qualifying for China 
Scholarship Council awards or those qualifying for our faculty’s S&E Doctoral 
Research Studentships for Underrepresented Groups. Self-funded candidates are
also welcome.



Full application guidelines can be found here:
 
https://www.c4dm.eecs.qmul.ac.uk/news/2024-11-12.PhD-call-2025/
 



For further details of this research topic, contact Mark Sandler
([email protected]) by email.


[1] B. Bermeitinger, T. Hrycej, and S. Handschuh, ‘Singular Value Decomposition
and Neural Networks’, Jun. 2019, doi: 10.1007/978-3-030-30484-3_13
(https://doi.org/10.1007/978-3-030-30484-3_13).
[2] N. Cammarata et al., ‘Thread: Circuits’, Distill, vol. 5, no. 3, p. e24,
Mar. 2020, doi: 10.23915/distill.00024
(https://doi.org/10.23915/distill.00024).
[3] V. S. Paul and P. A. Nelson, ‘Matrix analysis for fast learning of neural
networks with application to the classification of acoustic spectra’, The
Journal of the Acoustical Society of America, vol. 149, no. 6, pp. 4119–4133,
Jun. 2021, doi: 10.1121/10.0005126
(https://doi.org/10.1121/10.0005126).



--
Please note I work part time Monday - Thursday so there may be a delay to my 
email response.

professor mark sandler, FREng, CEng, FIEEE, FAES, FIET
director of the centre for digital music (c4dm)

school of electronic engineering and computer science, queen mary university of 
london
[email protected] | +44 (0)20 7882 7680

