Decoding Gender in Cough Sounds: A Transformer-Based Analysis.
Academic Article
Abstract
OBJECTIVE: Various components of speech, such as pitch, volume, and resonance, influence gender perception, but little is known about gender differences in non-speech upper airway sounds such as cough. This gap has implications for gender-affirming voice care, as coughs are harder to modulate. We aimed to explore how cough acoustics differ by gender using a transformer model with self-attention to identify salient cough features for gender classification.

METHODS: We analyzed 327 cough recordings (154 male, 173 female) from the Coswara dataset, using a 70/15/15 split for model training, validation, and testing. Preprocessing included resampling, silence removal, normalization, and trimming to uniform length. The HuBERT transformer model was used for its ability to handle unstructured audio. Gender balance was verified through SMD (standardized mean difference) screening across seven variables, all of which showed negligible imbalance.

RESULTS: On the held-out test set, the model achieved an accuracy of 84.0% with an F1 score of 0.8462 when classifying gender from cough series, compared to 71.4% accuracy and an F1 score of 0.7308 when using single-cough/first-cough samples. Attention-aligned cough visualization revealed the highest attention on the explosive phases of the cough, suggesting that these segments encapsulate the most salient gender-distinct acoustic cues.

CONCLUSION: Cough sounds contain gender-discriminative features detectable by transformer models. Attention to specific cough phases reveals physiologically meaningful segments in cough sounds supporting gender classification. These insights may inform gender-affirming interventions, particularly for non-speech sound production. Future research should explore further socio-demographic factors shaping cough acoustics.
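The SMD screening mentioned in the methods can be illustrated with a minimal sketch. The abstract does not state which SMD variant, which seven variables, or which cutoff was used, so everything below is an assumption for illustration: the pooled-standard-deviation form for a continuous variable, a 0.1 cutoff (a common convention for "negligible" imbalance), and made-up age values. The function names are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, variance


def standardized_mean_difference(group_a, group_b):
    """SMD for one continuous variable, using the pooled standard deviation.

    Assumed form: (mean_a - mean_b) / sqrt((var_a + var_b) / 2).
    Each group needs at least two values for the sample variance.
    """
    mean_a, mean_b = mean(group_a), mean(group_b)
    var_a, var_b = variance(group_a), variance(group_b)  # n - 1 denominator
    pooled_sd = math.sqrt((var_a + var_b) / 2.0)
    if pooled_sd == 0.0:
        # Both groups are constant: balanced if equal, otherwise unbounded.
        return 0.0 if mean_a == mean_b else math.inf
    return (mean_a - mean_b) / pooled_sd


# Assumed conventional cutoff; the abstract does not report the one used.
NEGLIGIBLE_SMD = 0.1


def is_negligible(smd, threshold=NEGLIGIBLE_SMD):
    """True when the absolute SMD falls below the chosen cutoff."""
    return abs(smd) < threshold


# Hypothetical ages for illustration only; not data from the Coswara dataset.
male_ages = [25, 31, 40, 36, 28]
female_ages = [26, 30, 41, 35, 29]

smd = standardized_mean_difference(male_ages, female_ages)
print(f"SMD = {smd:.4f}, negligible = {is_negligible(smd)}")
```

In a screening like the one described, a check of this kind would be repeated for each candidate covariate, with a variable flagged when its absolute SMD exceeds the cutoff. Categorical variables would need a proportion-based variant, which this sketch does not cover.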