Columbus, OH, USA
A trans voice training app.
A gamified voice training tool designed to help transgender people stay motivated throughout their vocal transition. Track accomplishments and long-term progress with awards and easy-to-digest data visualizations.
Tools: Figma, Google Suite, Adobe Photoshop, Zoom
Lead UX Designer (research, interaction design, visual design). With feedback from my mentors at Thinkful and input from real trans people working on their voices, I orchestrated the entire project, from generative research through usability testing of a high-fidelity prototype.
Constrained by a brief timeline, a non-existent supplementary budget, and the limitations of our prototyping software. Designing an MVP that takes audio input and returns visual analytic feedback requires some sophisticated back-end development that was not within the scope of the project, so the goal of the MVP was to validate that the types of data returned satisfy user needs.
Surveys for people voice training
Interviews with trans people on voice
Interview with a speech professional
Auxiliary scholarly research on voice
Primary and secondary personas
User flows & user stories
Branding & style guide
Usability tests with trans women
For many trans people, changing their voice to better align with their gender is an essential piece of their transition. Depending on an individual's voice goals, finding their best voice can take anywhere from several months to several years. Existing apps that claim to serve as voice training tools fail to comprehensively meet the needs of trans people, leading to minimal long-term engagement, high drop-off rates, and ultimately a less efficient voice training journey. How can an app encourage long-term adoption and promote vocal health while offering appropriate voice measuring tools, progress tracking, and informed goal-setting?
Transgender people need an informed and comprehensive voice training tool that encourages long-term engagement.
While it seems like the world is growing ever more accepting of gender & sexuality minorities, living as a transgender person is still very dangerous, even in the US. Trans people are not protected against discrimination by federal law, and in 28 states it is still legal to deny someone housing or employment for their gender identity. Laws in several states, nicknamed “trans panic” laws, can even get a murderer of a trans person acquitted on the basis of “self-defense,” backed by an idea that trans people are dangerous deviants. As a whole, the trans population is at extremely disproportionate risk for physical assault, sexual abuse, harassment, mental health issues, and murder relative to their cisgender, or not-transgender, counterparts.
Among trans people, trans women are most at risk for violence and marginalization. One way for some trans people to evade violence and discrimination is to strive toward “passing,” that is, being read as their correct gender by strangers according to societal norms. For example, women, transgender or not, do not need to wear makeup or dresses to still be women, but even today we have expectations of a particular kind of femininity from women. By “passing” as a cisgender woman, trans women are often safer in public settings including bars, restaurants, and their jobs. Voice is a major component of this; we have very specific ideas of how men and women talk, even if most people couldn’t name the components of voice that they gender in other people.
While “passing” has safety benefits for transmasculine people, too, their paths are very different. Many, but not all, trans people use gender-affirming hormone replacement therapy (HRT) to change their bodies, and for a transmasculine person, adding testosterone to their body naturally lowers the pitch of their voice by thickening the vocal folds. Most trans women went through a traditionally masculine puberty where their bodies were pumped with testosterone for several years. The changes testosterone makes to the body are largely permanent, and these women are then stuck with their “masculinized” vocal cords. Feminizing hormones used in HRT, like estrogen, can’t undo the impact of testosterone on the vocal cords at this point, and women typically have to make conscious, concerted efforts to change the pitch of their voice, with or without the help of a professional. Altering the voice is often seen as a necessity for trans women, not only to feel more personally comfortable with how they sound as a woman, but also as a means of self-defense in a transphobic world. Gender-affirming voice training can literally save a woman’s life.
The vocal folds in action
There are a handful of speech modification apps on the market, but each misses the mark. One of the more popular apps is EvaF, which has training exercises, videos, and a multitude of tools for trans women working on voice, but has been chastised for its steep paywalls to access even the most basic features. In addition to all their other risks, trans people in the US are twice as likely as their cisgender counterparts to live in poverty and a pricey subscription service like EvaF is inaccessible to them.
Other apps promote unhealthy voice habits, such as Vocular, an app targeted at cis men to lower their voice at any cost as a means of getting more sexual partners. Another app, InFormant, is a cool desktop tool designed by a trans woman which gives users all kinds of live feedback data on their voices, but offers not even a basic user interface. Users are left to their own devices to interpret complicated audio data. (If you can do this, check out Clo's project on GitHub!)
The most popular voice training app for trans people is the freeware Voice Pitch Analyzer, or VPA. The Android version has some past data storage, but the iOS version is still bare bones. Users read a passage of English literature and a simple graph shows where their non-explicit pitch range landed on a male/female chart. There is no playback or detailed statistics. Despite its popularity, of 24 VPA users I surveyed, 75% rated their satisfaction with it as just a 1 or 2 on a five-point scale.
75% of surveyed users dissatisfied with Voice Pitch Analyzer
S.W.O.T. Analysis for Voice Pitch Analyzer
Strengths: Takes audio input; Quantitative voice data
Weaknesses: Difficult speech prompts; No storage or playback; No progress tracking
Opportunities: Replace reading prompts; Archive data entries for playback; More data metrics
Using Google Forms, I surveyed 51 people who have or had interest in modifying their voice, with four others screened out for having no interest in voice training, past or present. As I expected, the majority (nearly 57%) had never seen a professional at all to work on their voice, with just 15% having had regular ongoing sessions. Among resources online, 65% of respondents had turned to YouTube for voice modification advice and education. What surprised me was that when asked to rate their understanding of their own voice on a five-point scale, more than 70% self-rated a 3 or higher, indicating that the people working on their voices are putting in the time and effort to research how to modify their voices even without professional assistance. This would later influence what kinds of educational copy I needed to include in my MVP.
51 total participants
About 2/3 trans women
Nearly 57% have never worked with a speech professional
65% have used YouTube as a voice training resource
More than 70% rated understanding of voice at 3+ on five-point scale
From my pool of survey respondents, I interviewed four trans people who reported actively working on their voice training goals to get more in-depth information about what their routines are like, what feedback they value, and what resources they find helpful. Between these users, who were a mix of genders with different exposures to professional training, one of the most common asks for an app was just more in-depth feedback on the shape of their voice in general. Even trans people who have modified their voice to successfully “pass” for their gender can get caught up in overvaluing the pitch level of the voice, where other elements of voice such as resonance and inflection patterns also play heavily into how we culturally gender voice.
Barriers these users faced along their voice training journey included difficulty establishing practice routines, staying motivated to work on their voice long-term, and finding the encouragement to learn to love their voice throughout the process.
Well-rounded feedback data
Tips for healthy goal-setting
Audio storage & playback
Clear data visualizations
A way to connect with others
Kind & comprehensive copy
Varying prompt difficulty
Less dependence on the gender binary
Professionally-informed educational content
Formant density as a potential measure of resonance
Examples of Speech Levels
Word level: Naming state capitals.
Sentence level: Answering a simple question in a complete sentence.
Conversational level: Expressing thoughts articulately as you have them.
I myself am a transmasculine person who underwent about a year of speech therapy in 2019. Even with her professional background, my speech-language pathologist (SLP) had trouble recommending one specific app for her trans clients to use outside of formal sessions. I was lucky to interview her for an expert-level perspective on this design. She has specialized in trans speech therapy in central Ohio for the last six years.
An app built for trans people should be sensitive to its userbase and do what it can to minimize gender dysphoria during use, which is most easily done by avoiding firm binary language or gender standards.
Other apps or online communities are inconsistent in using accurate information, encouraging good vocal habits, and acknowledging the complexity of voice and the many components that make a person's unique sound, such as resonance and inflection patterns. Resonance is hard to quantitatively measure, even for trained professionals. Pitch, the fundamental frequency, is labeled “F0.” F1 and F2 are higher formants that women who self-study try to use as a measure of resonance, but they actually say more about the health and stability of your voice than about a gendered quality of resonance. She suggested that there are patterns in formant density for men versus women, where women have more loosely distributed formants, and that a density ratio might be an effective way to get quantitative resonance feedback.
Personas & User Stories
From this research, I created a series of deliverables to inspire the design moving forward, including user stories, common user flows, and a set of personas for different app use cases. With trans women being the most in need of voice training help, our primary persona is Victoria, an out woman working on long-term goals feminizing her voice. Like 70% of my surveyed users, she has confidence in knowing how her voice works and she has had some exposure to professional voice training. Her priorities are to get detailed quantitative feedback on her voice, and then be able to easily store, organize, and play back those audio entries to monitor her progress over time through different metrics.
As someone working on long-term voice goals, Victoria wants to record and save voice clips so she can compare them over time to track her progress.
Wireframing is essential to establishing the information architecture of the app. Here are some of my original sketches for the wireframes. Not all of these early ideas made it into my later prototype. For example, the recording state (slide 2) was meant to include a live visual pitch feedback feature to help direct the user's voice while they record a new entry. This animation just wasn't feasible within the technical constraints of the MVP.
I took an ambitious approach to try to make as many screens as possible to test in my later prototype to validate chosen metrics, navigation, organization, and the educational copy. Based on my interviews, it is important that users have digestible descriptions of what the app measures and why, so even these first digital wireframes include fully-written drafts of the information page.
To help keep our users motivated, I wanted to experiment with gamification. To do this, I included an Awards system alongside goal-setting, where users can collect medals and trophies for meeting their goals for consecutive sessions or maintaining a voice practice routine within the app. Though this MVP-level implementation is very simple, I could see other types of gamification coming into play, such as giving users actual rewards of some sort, like collectable phone wallpapers, shareable graphics for social media, or unlocking audio-based mini games.
I took inspiration from retro '80s arcade games for my colors and typography to add to the game theme of the app. I initially accepted the risk of using 8-bit font styles for my titles and headers, coupled with the contemporary sans serif Roboto for body and caption elements. Ultimately I don't think the 8-bit typefaces test well for accessibility and I'd like to find more legible alternatives where I can for future iterations.
The color scheme was built on a foundation of purples and greys, where I chose purple as a subtle nod to the pink and blue of the trans Pride flag and greys to add dimension and respectability to this largely-scientific app. Ombré gradients throughout signify movement and transition as you work through your voice training goals. The accent green used to represent goal zones is meant to echo the dirty green of early Nintendo and other 8-bit arcade games, but I elevated the color to give it a more contemporary punch. The gold accent is used sparingly to correspond with the awards system.
I prioritized discretion in the logo by focusing on the microphone and audio aspects of the app over the transition part. I wanted a simple and innocuous name so that it wouldn’t arouse suspicion if a closeted trans person was caught with it on their phone. After toying with ideas to try to make the app sound like a game or voice workout, I decided on simply Voice Goals to emphasize the goal-oriented system. Exercising your voice is much like exercising any other muscle in your body, so the type of retro arcade game I had in mind was a sporty game like Track and Field. This retro style of microphone in the logo looks almost human-shaped, so to lean into that sporty idea, the microphone “person” wears a star medal around their “neck.”
After building out my branding materials, I applied the new colors and typefaces to the wireframes and made adjustments to better meet iOS design conventions of spacing and sizing. These three sample screens illustrate the most common screen flow: starting a new voice entry from the dashboard, recording 30 seconds of speech through interactive prompting, and then receiving instant data visualization feedback and quantitative statistics on a selection of voice metrics.
Video Tour of the Prototype
I tested the prototype with five transfeminine participants sourced from my original survey. These women varied in age, technical experience, voice knowledge, and voice training experience. I began each session with an expectancy test on the main nav bar. Users had little trouble predicting how most of the navigation worked, where “Live” gives real-time audio feedback, “Past” shows data and allows playback for stored entries, and “Progress” shows visual representations of your goal metrics over time as set on the “Goals” page. The plus-sign button stumped all five, and only one user later tapped it to record a new entry in the full prototype. Another user struggled with its active state, where she believed it to be more clickable when active, which wasn’t an issue on any other page. To help avoid errors in the future, I could add a text label to this button and tone down the brightness of the active state button.
"Goal Zone" box placement made it look like a "completed page" to one user
The primary goal of the usability tests was to see if the app was easy to navigate and understand; would users intuit how to open the pages they were looking for and interpret the data returned to them? I was also looking to validate that the voice metrics I chose to include based on the survey and previous interviews were actually relevant to these users’ voice goals.
The most common issue users came up against was in part caused by the prototyping software. Scrolling a virtual phone on a desktop browser isn't intuitive, and all five users struggled to understand that the entry data pages were scrollable. While the problem may disappear on a mobile prototype, one solution on the design end would be to enlarge the graph, which would both make the graph more readable and push the Goal Zone down a few pixels. If the box were not wholly exposed above the fold, it would be more clear to scroll and users would find the rest of the page.
I was happy to learn that users understood the color connections between the graph highlight and the Goal Zone box. I would like to do further testing to see how else this correlation could be communicated to color blind users, such as a demonstration through an onboarding tutorial.
As far as validating my metric choices, among the users who had experience with other voice training apps or professional intervention, each called out a different measure they were excited to see displayed. From the survey results where so many participants reported being dissatisfied with Voice Pitch Analyzer for not including a resonance measurement, I expected my test users to ask for more data metrics, but for those who mentioned resonance at all, they were just excited to see that formant density was in the works.
Despite the original struggles with finding the scroll, once users figured it out for one page, they quickly adopted it for other pages and could navigate the rest of the app with ease. Users said that they really appreciated the consistency of page layouts and once they overcame these initial barriers to understand the design, they felt confident in how to find what they were looking for.
I also took this opportunity to interview these women on their voice training needs in addition to my earlier interviews and learn about the resources that they’ve found most helpful in their work so far.
The main barriers to these users working on their voices were establishing practice routines and finding credible resources on identifying vocal components to modify or how to set realistic incremental goals. From this, I had some great conversations about the Settings option for push notifications and how that could be expanded, the awards system for reaching and maintaining goals (where awards are explicitly about long-term accomplishments), and getting some general information on the Info page about setting smart steady goals that prioritize sustainable results and your long-term vocal health over fast gains. The awards shown in the prototype are ones like “Spent more than 60% time in goal zone for three consecutive entries,” with the idea being that it would continue to encourage users to increase that percentage until they achieved the appropriate awards to alter their pitch goal zone.
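For developers picking this up, the sample award above can be read as a simple rule over stored entries. The following is only a minimal sketch of that rule, not an implementation from the prototype: the `Entry` type, the per-frame pitch list, and the function names are hypothetical, and it assumes the back end already produces pitch estimates in hertz for each recording.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    """Hypothetical stored voice entry: one pitch estimate (Hz) per analysis frame."""
    pitches_hz: List[float]


def fraction_in_zone(entry: Entry, low_hz: float, high_hz: float) -> float:
    """Share of frames whose pitch falls inside the user's goal zone (0.0 to 1.0)."""
    if not entry.pitches_hz:
        return 0.0
    inside = sum(1 for p in entry.pitches_hz if low_hz <= p <= high_hz)
    return inside / len(entry.pitches_hz)


def earns_streak_award(entries: List[Entry], low_hz: float, high_hz: float,
                       threshold: float = 0.6, streak: int = 3) -> bool:
    """True when the most recent `streak` entries each spent more than
    `threshold` of their time in the goal zone."""
    if len(entries) < streak:
        return False
    recent = entries[-streak:]
    return all(fraction_in_zone(e, low_hz, high_hz) > threshold for e in recent)


# Example with made-up numbers and a goal zone of 170 to 220 Hz.
good = Entry([180.0, 190.0, 200.0, 150.0])   # 3 of 4 frames in the zone
weak = Entry([150.0, 150.0, 180.0, 190.0])   # 2 of 4 frames in the zone
print(earns_streak_award([weak, good, good, good], 170.0, 220.0))
```

Keeping the threshold and streak length as parameters would let the same rule cover harder awards later, which fits the idea of nudging users to raise their percentage before adjusting the goal zone.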
Voice control and other audio technologies are growing rapidly, and there’s opportunity for some exciting integrations for an app like this. With machine learning, we could teach our program to understand the difference between desirable and undesirable qualities for an individual user’s voice to give more in-depth measurements and scoring throughout voice training. For example, identifying if a woman is using her true speaking voice or if she is relying on an artificial falsetto to reach higher pitches.
In line with machine learning, while I didn’t have a favorable review of the voice app Vocular, it offers celebrity pitch comparisons. This idea used well could be really fun for users. It would also be reassuring for trans women to compare their voices to cis women who may also fall on the lower end of the pitch spectrum instead of standardized ideas of female ranges. There are a number of open-source libraries for celebrity voice clips to collect this data; I did the most research into the VoxCeleb dataset, which features over one million celebrity utterances and counting.
Given the booming prevalence of virtual assistants, a user's Alexa or Google Home could remind them to record regular entries and even record entries for them on the go through its own microphone.
Accessibility is very important to me, and there are a number of inclusivity features I’d like to incorporate with the help of a team. For one, I want my chart data to generate a standardized report for a screen reader to read back the analytics for each voice entry. Typically an HTML table takes the place of the visual graph for a screen reader, but I believe it would be more personable and meaningful to have paragraph-level descriptions of graphs as well.
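To make the paragraph-level idea concrete, here is a rough sketch of how an entry's statistics could be turned into a sentence-based summary for a screen reader. Everything in it is an assumption for illustration: the function name, the parameters, and the wording are hypothetical, and the real report would depend on which metrics the finished app actually records.

```python
def describe_entry(date_label: str, duration_s: int, mean_pitch_hz: float,
                   min_pitch_hz: float, max_pitch_hz: float,
                   zone_low_hz: float, zone_high_hz: float,
                   fraction_in_zone: float) -> str:
    """Build a plain-language summary of one voice entry for a screen reader.

    `fraction_in_zone` is a value from 0.0 to 1.0. Units are spelled out
    ("hertz", "percent") so they are read aloud unambiguously.
    """
    percent = round(fraction_in_zone * 100)
    return (
        f"Voice entry from {date_label}, {duration_s} seconds long. "
        f"Your average pitch was {mean_pitch_hz:.0f} hertz, "
        f"ranging from {min_pitch_hz:.0f} to {max_pitch_hz:.0f} hertz. "
        f"Your goal zone is {zone_low_hz:.0f} to {zone_high_hz:.0f} hertz, "
        f"and you spent {percent} percent of this entry inside it."
    )


# Example with made-up numbers.
summary = describe_entry("March 3", 30, 185.4, 142.0, 231.0, 170.0, 220.0, 0.75)
print(summary)
```

A summary like this could sit alongside the usual data table rather than replace it, so users can choose between a quick narrative and the full numbers.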
One thing I really love about studying voice is seeing the vast ways that language affects culture. I would like for users to be able to set their geographical region and preferred speaking language such that the goals and metrics take into account any cultural differences in how different groups of people gender elements of voice. Women in the southern US speak differently than women in south Asia speak differently than women in South America, etc., and every language has its own unique sounds and vocabularies that can be gendered in a speaking voice.
I would also like to more formally address my accessibility concerns about my 8-bit typography choices and test some other color schemes to reach AAA compliance.
Next Steps & Final Thoughts
The main hurdle to moving forward with Voice Goals is partnering with developers so we can begin building and testing the audio input, generating data analysis of the speech, and outputting the graphs automatically. This is the most essential piece before launch, of course. Then, as promised in the prototype, we need to identify useful formant density ratios to give our users a simplified reading for resonance measures. Beyond that, there’s still the matter of building in further accessibility and gamification down the line.
Overall, I am very proud of this project and I am excited to see what we can do with Voice Goals in the future. A product like this is greatly needed and I would be thrilled to talk with any interested developers or grant writers on next steps.