{250724}
Following the quick algorithm sketch from Italy: correlation test with a 1-second microphone signal against a 'database' of at most 10 minutes. Eight seconds on the laptop, 60 seconds on the Pi, which would mean 7 minutes if we scanned all seven pieces. I think two minutes would be fine, and we can reduce the MFCC size and the FFT step factor by a factor of two each, which should give roughly a factor of four in total.
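A quick back-of-the-envelope check of those numbers. This is only a sketch: it assumes the run time scales linearly with both the number of coefficients and the number of frames, which has not been measured, and the function name is made up for illustration.

```python
def estimated_scan_seconds(per_piece_s, pieces, coeff_factor=1.0, frame_factor=1.0):
    """Estimate total scan time, assuming linear scaling (an assumption)."""
    return per_piece_s * pieces * coeff_factor * frame_factor

# Pi timing from the first test: 60 s per piece, seven pieces.
full_scan = estimated_scan_seconds(60, 7)

# Halve the MFCC size and halve the number of frames.
reduced_scan = estimated_scan_seconds(60, 7, coeff_factor=0.5, frame_factor=0.5)

print(full_scan / 60, "min for the full scan")
print(reduced_scan / 60, "min after both reductions")
```

Under that assumption the full scan drops from 7 minutes to a bit under 2 minutes.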
Second test, with only 14 coefficients and 50% overlap: 24 seconds. The 'database' could also be stored as precomputed coefficients, which would save more time.
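A minimal sketch of what matching against stored coefficients could look like. This is not the code used in the tests above: it assumes the coefficients are kept as a NumPy array of shape (frames, coefficients), uses random numbers in place of real MFCCs, and `best_match_offset` is a hypothetical helper name.

```python
import numpy as np

def best_match_offset(db, query):
    """Slide `query` over `db` frame by frame and return (offset, score)
    of the window with the highest normalized correlation."""
    m = query.shape[0]
    q = query - query.mean()
    q_norm = np.linalg.norm(q)
    best_offset, best_score = -1, -np.inf
    for offset in range(db.shape[0] - m + 1):
        w = db[offset:offset + m]
        w = w - w.mean()
        denom = np.linalg.norm(w) * q_norm
        score = float((w * q).sum() / denom) if denom > 0 else 0.0
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score

# Stand-in data: 200 frames of 14 coefficients (random, not real audio).
rng = np.random.default_rng(0)
db = rng.normal(size=(200, 14))

# Pretend the microphone snippet corresponds to frames 40..49 of the piece.
query = db[40:50].copy()

offset, score = best_match_offset(db, query)
print(offset, round(score, 3))
```

The database array would only need to be computed once per piece, for example written with `np.save` and read back with `np.load`, so that only the 1-second microphone signal has to be transformed at run time.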