Hope you're well! The instrumentalness is defined as:
'Predicts whether a track contains no vocals. “Ooh” and “aah” sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly “vocal”. The closer the instrumentalness value is to 1.0, the greater likelihood the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0.'
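To make the thresholds in that definition a bit more concrete, here's a minimal sketch of how you might read the value. The helper name `describe_instrumentalness` and the sample scores are made up for illustration, and I'm assuming the value always falls between 0.0 and 1.0. The only cutoff used is the 0.5 from the definition above.

```python
def describe_instrumentalness(value: float) -> str:
    """Hypothetical helper: turn an instrumentalness score into a rough label."""
    # Assumption: the score is a float in the range 0.0 to 1.0.
    if not 0.0 <= value <= 1.0:
        raise ValueError("instrumentalness is expected to be between 0.0 and 1.0")
    # Per the definition, values above 0.5 are intended to represent instrumental tracks.
    if value > 0.5:
        return "likely instrumental"
    return "likely contains vocals"


# Made-up example scores, not real track data.
for score in (0.02, 0.55, 0.97):
    print(score, describe_instrumentalness(score))
```

Keep in mind that a score just above 0.5 is a much weaker signal than one close to 1.0, so you may want a stricter cutoff depending on your use case.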
Going by this definition, the value seems to describe the track as a whole rather than individual sections of it. Hope this answers your question. Let me know if you still need help with this one!
Happy coding! 🙂
Hubo