Add test for various length mask functions (code from Samsung, AI Center, Cambridge) by rogiervd · Pull Request #2894 · speechbrain/speechbrain

rogiervd · 2025-05-02T12:35:13Z

This tests five different functions in SpeechBrain that do essentially the same thing.

What does this PR do?

Test functions that turn a list of lengths into a mask:

speechbrain.dataio.dataio.length_to_mask
speechbrain.lobes.models.transformer.Transformer.get_mask_from_lengths
speechbrain.nnet.losses.get_mask
speechbrain.nnet.losses.compute_length_mask
speechbrain.processing.features.make_padding_mask
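For reference, the operation these five functions share can be sketched in plain Python. This is an illustrative sketch only, not any of the SpeechBrain implementations: the name `lengths_to_mask` is hypothetical, it assumes absolute integer lengths, and it returns nested lists rather than tensors.

```python
def lengths_to_mask(lengths, max_len=None):
    """Return one row per sequence; True marks a valid (non-padding) position.

    Hypothetical helper for illustration; assumes absolute integer lengths.
    """
    if max_len is None:
        max_len = max(lengths)
    # Position i of a sequence is valid exactly when i < length.
    return [[position < length for position in range(max_len)] for length in lengths]


mask = lengths_to_mask([1, 3, 2])
for row in mask:
    print(row)
```

Some of the functions listed above take relative lengths (fractions of the longest sequence) instead, which is where the rounding questions discussed in the review come from.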

Are there any other functions in SpeechBrain with the same functionality? Maybe I ought to add tests for them too.

Ultimately, it would be good to merge these functions into one or two, I think; these tests would be a first step towards that.

Commit d14ed49 causes tests to fail, and 6eb6828 should fix them.

I caused one of these failures with a mistaken comment in #2835 (comment).

Before submitting

Did you read the contributor guideline?
Did you make sure your PR does only one thing, instead of bundling different changes together?
Did you make sure to update the documentation with your changes? (if necessary)
Did you write any new necessary tests? (not for typos and docs)
Did you verify new and existing tests pass locally with your changes?
Did you list all the breaking changes introduced by this pull request?
Does your code adhere to project-specific code style and conventions?

PR review

Reviewer checklist

Is this pull request ready for review? (if not, please submit in draft mode)
Check that all items from Before submitting are resolved
Make sure the title is self-explanatory and the description concisely explains the PR
Add labels and milestones (and optionally projects) to the PR so it can be classified
Confirm that the changes adhere to compatibility requirements (e.g., Python version, platform)
Review the self-review checklist to ensure the code is ready for review


TParcollet · 2025-05-08T10:04:11Z

I think we have discussed this merge with @Adel-Moumen and @pplantinga and we agree, they should be merged. I also think that the ones found here are the most used ones.

TParcollet

Thanks Rogier. I honestly don't have much to say, it looks good to me and I don't think it will impact existing models. I'd like a second opinion from @Adel-Moumen however.

TParcollet · 2025-05-08T19:10:47Z



-def make_padding_mask(x, lengths=None, length_dim=1, eps=1e-8):
+def make_padding_mask(x, lengths=None, length_dim=1, eps=1e-5):


What is the reason for going from 1e-8 to 1e-5?

1e-8 is less than 1 unit in the last place in float32. I chose 1e-5 without much thinking.

    >>> torch.tensor(1., dtype=torch.float32) - 1e-8 - 1
    tensor(0.)
    >>> torch.tensor(1., dtype=torch.float32) - 1e-7 - 1
    tensor(-1.1921e-07)
    >>> torch.tensor(1., dtype=torch.float32) - 1e-6 - 1
    tensor(-1.0133e-06)
    >>> torch.tensor(1., dtype=torch.float32) - 1e-5 - 1
    tensor(-1.0014e-05)

Should I make it 1e-7 instead you think?
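The same effect can be reproduced without torch. The sketch below emulates float32 rounding with the standard `struct` module; the helper names `to_float32` and `float32_subtract` are hypothetical and exist only for this illustration. It shows that subtracting 1e-8 from 1.0 is lost entirely, because the gap between 1.0 and the next float32 below it is 2**-24, roughly 6e-8.

```python
import struct


def to_float32(value):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def float32_subtract(a, b):
    """Emulate a float32 subtraction: round both inputs, then round the result."""
    return to_float32(to_float32(a) - to_float32(b))


# Gap between 1.0 and the next float32 below it.
gap_below_one = 2.0 ** -24

for eps in (1e-8, 1e-7, 1e-6, 1e-5):
    difference = float32_subtract(1.0, eps) - 1.0
    print(f"eps={eps:g}: (1 - eps) - 1 = {difference:.4e}")
```

Any `eps` well below `gap_below_one` has no effect on a value near 1.0, and the effect shrinks further as the value grows.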


pplantinga

Everything LGTM. Thanks for this nice contribution beginning to harmonize the implementations.

pplantinga · 2025-05-19T13:11:48Z



-def make_padding_mask(x, lengths=None, length_dim=1, eps=1e-8):
+def make_padding_mask(x, lengths=None, length_dim=1, eps=1e-5):


What is the risk here if eps is too large? Could there be an error if a tensor had a length of 1e5?

BTW, at a sampling rate of 16000 that's about 6 seconds of audio, so we do actually have cases where tensors are that long. But I doubt that it ever matters if we drop the last sample.

pplantinga · 2025-05-19T13:16:28Z

    # Convert relative lengths to absolute lengths, then compute boolean mask
    max_len = x.size(length_dim)
-    abs_lengths = (lengths * max_len + eps).unsqueeze(1)
+    abs_lengths = (lengths * max_len - eps).unsqueeze(1)


Good catch here, I assume this is the cause of the failing tests?
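To illustrate why the sign matters, here is a minimal sketch in plain Python. It assumes that position `i` is marked valid when `i < abs_length`; that comparison is not shown in the diff above, so it is an assumption. The helper `valid_positions` and the rounded values are hypothetical, chosen to be about one float32 step away from 5.

```python
def valid_positions(abs_length, max_len):
    """Count positions i in range(max_len) with i < abs_length (assumed rule)."""
    return sum(1 for i in range(max_len) if i < abs_length)


max_len = 8
eps = 1e-6

# Hypothetical products lengths * max_len whose intended value is 5.
rounded_up = 5.0000005    # rounding pushed the product slightly above 5
rounded_down = 4.9999995  # rounding pushed the product slightly below 5

# Adding eps keeps position 5 valid in both cases: one element too many.
count_plus_up = valid_positions(rounded_up + eps, max_len)
count_plus_down = valid_positions(rounded_down + eps, max_len)

# Subtracting eps gives the intended count of 5 in both cases.
count_minus_up = valid_positions(rounded_up - eps, max_len)
count_minus_down = valid_positions(rounded_down - eps, max_len)

print(count_plus_up, count_plus_down, count_minus_up, count_minus_down)
```

Under this assumption, a positive `eps` can only make the mask longer, so it cannot correct a product that was rounded up.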

rogiervd · 2025-05-19T15:58:58Z

What is the risk here if eps is too large? Could there be an error if a tensor had a length of 1e5?

BTW, at a sampling rate of 16000 that's about 6 seconds of audio, so we do actually have cases where tensors are that long. But I doubt that it ever matters if we drop the last sample.

The test fails at 4e-7 but succeeds at 5e-7 so I'll make it 1e-6, which gives a better safety margin.

Float32 has 24 bits in the significand. 2**-24 ≈ 6e-8, so there could be an argument for doing these fractions in float64.
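The arithmetic behind the 1e5 example can be checked with a short sketch. It uses only the standard `struct` module to emulate float32; the helper names are hypothetical. At 16000 samples per second, 1e5 samples is 6.25 seconds, and near 1e5 adjacent float32 values are 2**-7 apart, so an `eps` of 1e-6 applied to an absolute length of that size is rounded away.

```python
import struct


def to_float32(value):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def float32_spacing(value):
    """Gap from float32(value) to the next larger float32 (positive finite input)."""
    bits = struct.unpack("I", struct.pack("f", value))[0]
    next_up = struct.unpack("f", struct.pack("I", bits + 1))[0]
    return next_up - to_float32(value)


sample_rate = 16000
num_samples = 100_000

duration_seconds = num_samples / sample_rate
spacing = float32_spacing(float(num_samples))

# True when subtracting eps = 1e-6 leaves the float32 value unchanged.
absorbed = to_float32(num_samples - 1e-6) == float(num_samples)

print(duration_seconds, spacing, absorbed)
```

This is one way to see the case for float64 mentioned above: a fixed `eps` that works for short tensors may have no effect on long ones in float32.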

Adel-Moumen

LGTM! You did very nice work, @rogiervd. I am quite surprised that we actually didn't have careful checks for length/masking functions. Thanks a lot!

rogiervd · 2025-05-20T17:11:38Z

Indeed, good catch! Sorry (I wanted to free you from modifying the docstring by yourself since you did an amazing job already). PS: I am running pre-commit etc.

Wouldn't have been a problem. I believe the docstring is now both correct and well-formatted.

…ter, Cambridge) (speechbrain#2894) Co-authored-by: Rogier van Dalen <[email protected]> Co-authored-by: Adel Moumen <[email protected]>

Add test for various length mask functions

d14ed49


rogiervd changed the title ~~Add test for various length mask functions~~ Add test for various length mask functions (code from Samsung, AI Center, Cambridge) May 2, 2025

rogiervd mentioned this pull request May 2, 2025

Change test for make_padding_mask to expose mistake #2895

Merged

13 tasks

Rogier van Dalen added 2 commits May 2, 2025 13:50

Fix rounding error in make_padding_mask

0e718fb

Fix rounding error in compute_length_mask

6eb6828

TParcollet approved these changes May 8, 2025


pplantinga approved these changes May 19, 2025


Rogier van Dalen and others added 3 commits May 19, 2025 17:00

Better safety margin for length mask

279deb6

Update test_length_mask.py

c395669

docstrings

291b03e

rogiervd commented May 20, 2025


Comment thread tests/unittests/test_length_mask.py Outdated

Adel-Moumen reviewed May 20, 2025


Comment thread tests/unittests/test_length_mask.py Outdated

Comment thread tests/unittests/test_length_mask.py

Comment thread tests/unittests/test_length_mask.py Outdated

Rogier van Dalen added 2 commits May 20, 2025 18:07

Remove max_value

2e1365e

Correct docstring

a01b88f

pplantinga merged commit ba9364a into speechbrain:develop May 20, 2025
5 checks passed



