Consistently muffled sound for some multiple speaker sessions #446
Comments
List of possibly impacted sessions:
Immediate recommendation: Avoid these sessions in validation until we re-process them. This includes all sessions from 2018. Sessions from 2017 without the issue are likely
We need a way to re-process the audio recordings that have **not** yet been addressed. Currently we are not doing anything other than replacing them, although adding a note might be nice as well.
We could add a field to every recording to mark when the audio has been reset, so that one could filter to show only the entries that need to be revalidated.
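As a rough sketch of what that could look like, assuming the validation interface stores recordings in a Django model (the model and field names here are hypothetical, not the project's actual schema):

```python
from django.db import models

class Recording(models.Model):
    # ... existing fields elided ...

    # Hypothetical field: set when the audio snippet has been re-extracted,
    # so validators can filter for entries that need another look.
    audio_reset_on = models.DateTimeField(null=True, blank=True)
```

Filtering for the entries to revalidate would then be something like `Recording.objects.filter(audio_reset_on__isnull=False)`.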
I have updated the scripts to replace sounds, and have run them locally. I have not tried the scripts in production yet, as we should first decide whether we want some mark that the recording should be revisited (or maybe reset the annotations for it). Once we decide and that's implemented, I could re-import a not-yet-validated test session to check how things are working (and to ensure there's usable 2018 content for upcoming validation sessions), but I think it's important to first decide what the process for replacing recordings will be.
@fbanados When I have reviewed the recordings, and then followed this up with Rose, we have judged the audio snippets as good as long as what is being said can be identified (perhaps with some noise, or pronounced slightly less loudly) and is judged to be spoken correctly (as judged by a Cree speaker, and matching the transcription). It's when the recording is clipped at either end, or the speaker doesn't say the entire word, or pronounces it sloppily (adding or removing an -h-) or faintly, that it has been marked as bad. We have also marked as bad audio where there is significant noise resulting from the primary speaker coughing or whispering out loud on top of the secondary speaker.

Thus, what has been judged as good would probably remain judged as such, even if we replaced the less-than-optimal current audio with revised, improved snippets. What this would have some impact on is starring: the crappier audio, even if pronounced properly, has rarely been starred as an exemplary pronunciation, and that judgment might change with the improved snippets. I probably wouldn't have the speaker revalidate the improved snippets; that is something we would take on ourselves, using the original best snippets as the reference point for what is good.

It would be good to have some indicator showing when recordings that have already been judged as bad or good have been replaced by the improved snippets, as you suggest. I'm not sure we'd want to keep the crappier audio when there is a better snippet; how to rule them out is another matter (e.g. adding a new button like
This is a duplicate of #156.
I have run the script on production for session
I've added four extra sessions to work with Rose. See the extra field in the Google spreadsheet.
Symptom: In some sessions, only one speaker's recordings are heard clearly. The others are noisy: sometimes the speakers can barely be heard, sometimes there is background noise or noise from another speaker's microphone.
Hypothesis: Although the time annotations in the ELAN files used for import are correct, they are not always being matched to the appropriate sound file track. This would likely mean that for some speakers we are hearing them through the incorrect microphone/track, and that is muddling the sound. Likely an extraction issue.
Diagnostic: The hypothesis is borne out by the code. The import scripts try to handle inconsistencies in folder naming, and between `.eaf` and `.wav` file names, themselves, as seen in `extract_phrases.py:find_audio_oddities`. The bug manifests at line 406 (introduced Jan 2021). If the underscore in the `.eaf` file name appears before the track number, the regexp will not filter out any `.wav` files in the session folder (they are only filtered for a variation of `Track` in the name, not for a track number), and the script always takes the first file from the filtered non-empty collection, without checking whether there is more than one option. If the track sequence appears exactly in the `.wav` filename this is not an issue, as that case is detected earlier, but it becomes a problem when, for example, the number in the `.wav` filename has a leading 0 that is not present in the `.eaf` name, or vice versa.
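To make the failure mode concrete, here is a minimal sketch of the matching pitfall as described above, together with one possible fix. The function names and exact patterns are reconstructions for illustration, not the actual code in `extract_phrases.py`:

```python
import re
from pathlib import Path

def pick_wav_buggy(eaf_name: str, session_dir: Path) -> Path:
    """Reconstruction of the pitfall: with an .eaf name like
    '2017-05-11pm-US-Track_01.eaf', the fallback filter only looks for a
    variation of 'Track' in the .wav names, never for the track number."""
    candidates = [wav for wav in sorted(session_dir.glob("*.wav"))
                  if re.search(r"[Tt]rack", wav.stem)]
    # Every track in the folder matches, and taking the first one without
    # checking len(candidates) > 1 silently pairs the annotations with the
    # wrong speaker's microphone track.
    return candidates[0]

def pick_wav_fixed(eaf_name: str, session_dir: Path) -> Path:
    """One possible fix: require the track number (ignoring leading zeros)
    in the .wav name too, and refuse to guess when the match is ambiguous."""
    number = re.search(r"[Tt]rack[ _-]*0*(\d+)", eaf_name).group(1)
    pattern = re.compile(rf"[Tt]rack[ _-]*0*{number}(\D|$)")
    matches = [wav for wav in sorted(session_dir.glob("*.wav"))
               if pattern.search(wav.stem)]
    if len(matches) != 1:
        raise ValueError(f"ambiguous or missing track for {eaf_name}: {matches}")
    return matches[0]
```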
Possible Solution: The simplest solution would be to re-import the sound files for every entry whenever the freshly extracted audio differs from what is already stored for that entry. This comparison is relatively expensive, though: it requires re-compressing the `.wav` data to bytewise-compare it with the compressed entry stored in the database, for each candidate that was previously just skipped because the entry was already present (a sketch of this check follows below).

However, this might invalidate previous validations. We could assume that the new entries should be "strictly better", so entries already marked as good should remain as such, but it is unclear whether the bad ones should remain bad or be revalidated.
An alternative solution would be to re-import the sound files only for the entries yet to be validated, and generate a new recording for those that have already been validated. I will generate some stats about the scope of the problem and its impact.
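A minimal sketch of the comparison cost mentioned above, assuming compressed snippets are stored as bytes; the `compress` callable stands in for whatever codec the database entries actually use:

```python
from typing import Callable

def needs_reimport(extracted_wav: bytes,
                   stored_compressed: bytes,
                   compress: Callable[[bytes], bytes]) -> bool:
    # Previously, a candidate already present in the database was skipped
    # outright. Detecting wrong-track audio instead requires re-compressing
    # the freshly extracted snippet and bytewise-comparing it with the
    # stored entry, paying the compression cost for every candidate.
    return compress(extracted_wav) != stored_compressed
```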
Current Impact: TBD. Must write a script to check the exact number of entries impacted. Likely all entries where the underscore appears before the number, that is, for example, `2017-05-11pm-US-Track_01.eaf` instead of, say, `2016-01-13am-Track 3_001.eaf`. This represents 123 out of the 426 folders in `/data/maskwacis-recordings`.
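A first pass at that count could look like the sketch below; the pattern is an assumption based on the two example names and would need to be confirmed against the actual folder contents:

```python
import re
from pathlib import Path

RECORDINGS = Path("/data/maskwacis-recordings")

# Assumed shape of the problem: an underscore immediately before the track
# number, as in '2017-05-11pm-US-Track_01.eaf' (affected), versus
# '2016-01-13am-Track 3_001.eaf', where the number precedes the underscore.
UNDERSCORE_BEFORE_NUMBER = re.compile(r"[Tt]rack_\d+")

impacted = [
    folder.name
    for folder in sorted(RECORDINGS.iterdir())
    if folder.is_dir()
    and any(UNDERSCORE_BEFORE_NUMBER.search(eaf.name)
            for eaf in folder.glob("*.eaf"))
]
print(f"{len(impacted)} folders contain at least one affected .eaf file")
```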