#### Controlling for errant pitch values

Ian Howell

Hi friends,
I'm running a study in which we auto-segment a recorded session into two halves by playing a C7 on a piano (much higher than any pitch sung), and then run a separate statistical analysis of pitch on each half. We are running into an issue where the To Pitch (cc) function returns errant high pitch measurements from the singing voice. So rather than returning the piano C7 as the highest-pitch moment, it picks out an incorrect part of the singing.

This is the code we're using:

To Pitch (cc): 0, 75, 15, "no", 0.03, 0.70, 0.01, 0.35, 0.14, 2200

I've tried it with "yes" for the very accurate option as well.

This is the portion of the wave file that Praat returned as containing the highest pitch (it's capturing periodicity from the strong higher formants in the singing voice).

I do not yet understand the nuance of the settings in the To Pitch (cc) function to make it "less" sensitive to this kind of sound. Would someone mind taking a look and making a recommendation (or suggesting readings) that would eliminate these errant measures?

Ian

Boersma Paul

These are not "errant". The algorithm looks for periodicity, not for vocal folds. If you specify that the algorithm can look as high as 2200 Hz for periodicity, it is likely to find reverberating formants.

On 8 Mar 2023, at 20:06, Ian Howell via groups.io <Ian.howell@...> wrote:

<003_AMAB_1ipodedit_part.wav>

_____

Paul Boersma
Professor of Phonetic Sciences
University of Amsterdam
Spuistraat 134, room 632
1012VB Amsterdam, The Netherlands
http://www.fon.hum.uva.nl/paul/

Ian Howell

Hi Paul, can you recommend a workflow that would capture that high piano pitch but not identify higher formants as fundamentals? Unfortunately we don’t have EGG signals, and I can’t reason my way through it.

Boersma Paul

dear Ian,

if you have multiple simultaneous signals, you will have little success anyway, because (1) our pitch algorithms can detect only one pitch at a time, and (2) time-based algorithms (such as AC and CC) get confused if the pitches interact. A spectrum-based algorithm (such as SHS, also in Praat) might have more luck finding one of them.

In general, what you would like, i.e. determining vocal-fold movements directly from the sound, does not seem possible. For instance, we had a recording of a small child whose pitch rose at once from 300 to 900 Hz, and this was a sine wave. Should we interpret this as pitch or as a formant? Is it a formant that causes the vocal folds to resonate at the same frequency (900 Hz, i.e. a vocal-fold mode switch), or are the vocal folds at 300 Hz exciting an almost lossless 900 Hz formant exactly in phase? Whoever knows this is invited to reply.

My point is that such cases can have multiple causes, and the acoustics alone can provide no certain solutions (in this example, spectral methods would not help either). The pitch algorithm just finds periodicity; the standard setting for the pitch ceiling is 600 Hz and not higher, in order to maximize the likelihood that an attested periodicity is due to vocal-fold vibration rather than to reverberating formants or background whistles. A setting of e.g. 300 Hz would be even better at that (at the cost of not capturing a 400 Hz vocal-fold vibration).

Perhaps we can create an AI that does it better, trained on the basis of combinations of sound and EGG.

best wishes,
Paul

Daniel McCloy

some off-the-cuff ideas:

1. Instead of taking the highest pitch as your first-half / second-half cut point, look for a pitch that is very close to 2093 Hz (the fundamental of C7), has some expected duration, and has some expected amount of (near) silence on either side.
2. Manually inserting one boundary into a file and saving the TextGrid is pretty fast. I don't know how many files you're dealing with here, but the manual method might be faster than writing and debugging the code to automate it.
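
The first idea could be sketched roughly as follows. This is only an illustration in plain Python, not a tested Praat workflow. It assumes the frame times and F0 values have already been exported from a Pitch object analysed with a ceiling above C7 (as in the original call, 2200 Hz), with unvoiced frames written as 0. The function name `find_marker`, the half-semitone tolerance, and the duration and gap thresholds are all hypothetical and would need tuning on real recordings. Unvoiced frames stand in for "(near) silence" here, which is itself an assumption.

```python
import math
from typing import List, Optional, Tuple

C7_HZ = 2093.0  # equal-tempered C7 with A4 = 440 Hz


def find_marker(
    times: List[float],
    f0: List[float],
    target_hz: float = C7_HZ,
    tolerance_semitones: float = 0.5,  # hypothetical tolerance around the target
    min_duration: float = 0.3,         # hypothetical minimum marker length (s)
    min_silence: float = 0.2,          # hypothetical unvoiced gap on each side (s)
) -> Optional[Tuple[float, float]]:
    """Return (start, end) times of the first sustained run near target_hz
    that is flanked by unvoiced frames, or None if no such run exists.
    Assumes evenly spaced frames and f0 == 0 for unvoiced frames."""
    n = len(f0)
    if n < 2:
        return None
    step = times[1] - times[0]
    min_frames = max(1, round(min_duration / step))
    gap_frames = max(1, round(min_silence / step))

    def near(i: int) -> bool:
        # Distance from the target in semitones, voiced frames only.
        return f0[i] > 0 and abs(12 * math.log2(f0[i] / target_hz)) <= tolerance_semitones

    i = 0
    while i < n:
        if not near(i):
            i += 1
            continue
        start = i
        while i < n and near(i):
            i += 1
        end = i - 1
        long_enough = (end - start + 1) >= min_frames
        has_room = start - gap_frames >= 0 and end + gap_frames < n
        if long_enough and has_room:
            before = f0[start - gap_frames:start]
            after = f0[end + 1:end + 1 + gap_frames]
            if all(v == 0 for v in before) and all(v == 0 for v in after):
                return times[start], times[end]
    return None


# Synthetic demo: singing at 220 Hz containing a 5-frame formant-like blip
# at 2100 Hz, then a gap, a 0.5 s piano C7, a gap, and more singing.
f0 = [220.0] * 100 + [0.0] * 30 + [2093.0] * 50 + [0.0] * 30 + [330.0] * 100
f0[40:45] = [2100.0] * 5
times = [i * 0.01 for i in range(len(f0))]
marker = find_marker(times, f0)
print(marker)
```

In the synthetic data, the single highest F0 value belongs to the short blip inside the singing, not to the piano note, which is the failure described at the top of the thread. The duration and gap conditions are what reject it.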
