One of the things I wanted to do is improve beat detection. Several music apps have some sort of tempo detection (you tap on the desk, the microphone catches it, and the app figures out the tempo).
While I can certainly use audio analysis to do that, it has its limits. If I wanted to detect a full drum pattern (the user taps on different objects for kick and snare, and the app fills in the matching drum sounds), something machine-learny sounds much more appropriate for the job.
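
To make the "plain audio analysis" half concrete, here is a rough sketch of the tempo part only. It assumes the tap onsets have already been pulled out of the microphone signal as timestamps in seconds (that step isn't shown), and `estimate_bpm` is just a name I made up for illustration. It takes the median gap between taps, so one sloppy tap doesn't throw the result off.

```python
from statistics import median

def estimate_bpm(onset_times):
    """Estimate tempo in BPM from tap onset times given in seconds.

    Hypothetical helper: assumes onsets were already detected elsewhere.
    """
    times = sorted(onset_times)
    # Gaps between consecutive taps; drop zero gaps from duplicate timestamps.
    intervals = [b - a for a, b in zip(times, times[1:]) if b > a]
    if not intervals:
        raise ValueError("need at least two distinct tap times")
    # The median is less sensitive to a single early or late tap than the mean.
    return 60.0 / median(intervals)

taps = [0.0, 0.5, 1.0, 1.5, 2.0]
print(estimate_bpm(taps))  # 120.0
```

This is about where the simple approach stops: it gives you a tempo, but it can't tell which tap was meant as a kick and which as a snare.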
---
Your poke at the issue ("for what? summarising web pages?") is valid though. While I don't have the resources to train the kind of models I mentioned, the resulting weights should run reasonably well on today's consumer hardware.
I blame the complete and utter lack of imagination of small-to-mid AI labs for the missing variety in that space.
It results in people not being very creative in imagining valid ways of using AI that aren't shitty, spammy marketing. They exist though.