Textio’s team has been building with language models since 2014, and several large language models (LLMs) power Textio today. As new generative AI tools built on LLMs become available, it is increasingly difficult to evaluate their differences, benefits, and inherent risks.
In this post, we’ll share a few best practices for working with LLMs and generative AI in a safe and intentional way, based on Textio’s expertise in both language technologies and bias mitigation.
What you’ll learn:
- What bias means in the context of workplace communication
- How AI can reduce bias, or make it worse
- Three steps to mitigating bias in generative AI
First, let’s define “bias,” particularly in the context of workplace communication. Bias doesn’t necessarily mean saying outright harmful or prejudiced words. Rather, we’re more often talking about unconscious bias, which UCSF defines as “social stereotypes about certain groups of people that individuals form outside their own conscious awareness.”
Unconscious bias is present when different groups of people, over time, get paid and promoted differently for exactly the same work. Language, including communication in the workplace, is biased if it leads to biased outcomes.
Bias is deeply embedded in workplace culture. It can be hard to spot when it’s happening, but the impact it has on people’s careers is not. Did you know that women of color make up 19% of entry-level roles but only 5% of C-suite roles? In comparison, white men make up 33% of entry-level roles and 61% of C-suite roles.
Small discrepancies at key moments in people’s careers—when they’re getting hired and when they’re getting promoted, for example—can have an outsized impact over time on career growth and lifetime income.
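To make the compounding concrete, here is a toy calculation. The salaries and raise rates are invented for illustration, not Textio data; the point is only that a one-point difference in annual raises grows into a large lifetime gap.

```python
# Illustrative only: hypothetical salaries and raise rates, not real data.
# A small gap in annual raises compounds into a large lifetime difference.

def career_earnings(start_salary: float, annual_raise: float, years: int) -> float:
    """Total earnings over a career with a fixed annual raise rate."""
    total = 0.0
    salary = start_salary
    for _ in range(years):
        total += salary
        salary *= 1 + annual_raise
    return total

# Two people start at the same salary; one gets 3% raises, the other 4%.
a = career_earnings(60_000, 0.03, 30)
b = career_earnings(60_000, 0.04, 30)
print(f"3% raises: ${a:,.0f}")
print(f"4% raises: ${b:,.0f}")
print(f"Lifetime gap: ${b - a:,.0f}")
```

Over a 30-year career, that single percentage point amounts to roughly half a million dollars in this toy scenario.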
When teams use Textio in recruiting and performance management, they have better tools to communicate fairly and consistently. As a result, Textio customers hire and promote the best possible talent regardless of background, reduce bias in the process, and build better products for the diverse customers they serve.
Computer algorithms are very good at identifying patterns in data, which makes them an especially effective tool for understanding bias.
Textio has been building AI to detect and mitigate bias in HR since 2014. For example, Textio is very good at predicting who will apply to a job based on language in the job post. When companies achieve a Textio Score of 90 or higher on their posts, they get different outcomes: They attract 33% more women and 17% more people of color to their open jobs.
Without the right safeguards in place, however, AI also has the potential to make patterns of bias much worse. A classic example is the resume-screening tool Amazon built: it learned an association between the word “women’s” and past no-hire decisions, so the algorithm started disqualifying resumes from women engineers.
The cultural cues that AI picks up can be much more subtle than the word “women’s” on a resume. Textio has assembled a diverse team of data scientists, linguists, DEI experts, and HR professionals who deeply understand how language can encode and perpetuate bias.
We also understand it’s not possible to build AI entirely without bias, which means we’re intentional about what outcomes we’re trying to create. Our goal is ensuring everyone has access to opportunity, including marginalized people who often haven’t had access in the past. That’s our bias, and we embrace it in how we build.
Here are the criteria we use to ensure that we’re not amplifying the bias that we’re trying to fix.
Three steps to mitigating bias in generative AI
Large language models learn by looking at what’s been written in the past and guessing, based on those patterns, what word comes next.
Data sources matter a lot! Imagine the difference between a model trained with Reddit comments versus New York Times articles. These sources have very different authors, they use very different words, and they’re addressing different audiences.
Another popular data source is out-of-copyright literature. But this corpus skews toward text authored by educated Western men, and its descriptions of women and underrepresented ethnic groups may reflect unfair stereotypes from earlier time periods.
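To see why data sources matter so much, here is a minimal sketch of next-word prediction: a toy bigram model that counts which word follows which in the training text and predicts the most frequent continuation. The two corpora below are invented examples; the same prompt gets different predictions depending on what the model was trained on.

```python
# A minimal sketch of next-word prediction: count which word follows which
# in the training text, then predict the most frequent continuation.
# Both corpora below are invented toy examples, not real training data.
from collections import Counter, defaultdict

def train_bigrams(text: str) -> dict:
    """Build a map from each word to a Counter of the words that follow it."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows

def predict_next(model: dict, word: str) -> str:
    """Return the most frequent continuation seen in training."""
    return model[word.lower()].most_common(1)[0][0]

corpus_a = ("the engineer fixed the bug and the engineer fixed the build "
            "then the engineer shipped the release")
corpus_b = ("the engineer wrote the memo and the engineer wrote the report "
            "then the engineer joined the call")

model_a = train_bigrams(corpus_a)
model_b = train_bigrams(corpus_b)

# Same prompt, different training data, different prediction:
print(predict_next(model_a, "engineer"))
print(predict_next(model_b, "engineer"))
```

Real LLMs are vastly more sophisticated, but the core dynamic is the same: the model can only echo the patterns, and the biases, present in its training text.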
How we approach this at Textio:
Data annotation is what enables models to learn patterns and make predictions. The way you annotate the data (or fail to) can encode bias into the model itself. Annotation is genuinely hard and requires specialized linguistic expertise. When you get it wrong, you not only propagate bias; your model also produces worse results.
The way we approach this at Textio depends on whether it’s a subjective or objective annotation task. A subjective annotation will vary depending on the point of view of the annotator. For example, the data powering Textio’s age graph is subjective annotation from people across different age groups. An objective annotation should be consistent. Think: labeling grammatical structures.
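For objective tasks, one standard consistency check is inter-annotator agreement, commonly measured with Cohen's kappa, which corrects raw agreement for the agreement you would expect by chance. Here is a small sketch with invented part-of-speech labels; this is a generic illustration, not Textio's pipeline.

```python
# A sketch of checking annotator consistency with Cohen's kappa.
# For objective tasks (e.g. grammatical labels), two annotators should
# agree far beyond chance. The labels below are invented for illustration.
from collections import Counter

def cohens_kappa(labels_a: list, labels_b: list) -> float:
    n = len(labels_a)
    # Observed agreement: fraction of items where both annotators match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Chance agreement: probability both pick the same label at random,
    # given each annotator's label frequencies.
    expected = sum(
        (counts_a[label] / n) * (counts_b[label] / n)
        for label in set(labels_a) | set(labels_b)
    )
    return (observed - expected) / (1 - expected)

annotator_1 = ["noun", "verb", "noun", "adj", "verb", "noun"]
annotator_2 = ["noun", "verb", "noun", "verb", "verb", "noun"]
print(round(cohens_kappa(annotator_1, annotator_2), 2))
```

Low kappa on an objective task is a signal that the instructions are ambiguous or an annotator needs retraining, before the labels ever reach a model.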
Best practices for subjective annotation tasks:
Best practices for objective annotation tasks:
Always keep records of who annotated your data, what instructions they followed, how you controlled for annotator demographics, how you assessed data quality, and which future uses of the data set are approved.
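A record like that can be sketched as a simple data structure. The field names and values below are illustrative only, not a Textio schema.

```python
# A sketch of an annotation provenance record capturing who annotated the
# data, under what instructions, and what uses are approved.
# All names and values are invented for illustration.
from dataclasses import dataclass

@dataclass
class AnnotationRecord:
    dataset: str
    annotators: list           # who annotated the data
    instructions_version: str  # which guidelines they followed
    demographic_controls: str  # how annotator demographics were balanced
    quality_checks: list       # e.g. agreement scores, spot audits
    approved_uses: list        # what the labels may be used for

record = AnnotationRecord(
    dataset="feedback-tone-v2",
    annotators=["annotator_07", "annotator_12"],
    instructions_version="guidelines-2023-04",
    demographic_controls="balanced across age groups and regions",
    quality_checks=["cohens_kappa=0.78", "10% spot audit"],
    approved_uses=["tone model training"],
)
print(record.dataset)
```

Keeping this alongside the data set makes it possible to audit a model's labels long after the annotation project ends.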
Even if you follow all the best practices, bias can hide in any large data set. This is especially true for generative AI, which brings in new data and produces different results over time. Maybe you missed a vector of bias in your original data set or among your annotators.
AI is “smart,” and it will pick up whatever patterns it finds. If you’re using an external provider, “model drift” can even introduce biases over time. At Textio, we carefully design tests to uncover when the algorithm is producing biased results.
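One common form such a test can take is a counterfactual check: swap a demographic term in an otherwise identical input and see whether the model's output changes. The sketch below uses a deliberately biased stub scorer, invented for illustration; it is not Textio's model or test suite.

```python
# A sketch of a counterfactual bias test: swap a demographic term in the
# input and check whether the model's score changes. The scoring function
# is a deliberately biased stub, invented to show how the test flags it.

def score_resume(text: str) -> float:
    """Stub model: penalizes the word 'women's' (a biased learned pattern)."""
    score = 1.0
    if "women's" in text.lower():
        score -= 0.3  # the kind of learned association the test should catch
    return score

def counterfactual_gap(template: str, terms: tuple) -> float:
    """Score the same text with each term swapped in; return the difference."""
    a = score_resume(template.format(term=terms[0]))
    b = score_resume(template.format(term=terms[1]))
    return abs(a - b)

template = "Captain of the {term} chess team; senior software engineer."
gap = counterfactual_gap(template, ("men's", "women's"))
THRESHOLD = 0.05  # maximum acceptable score difference, chosen for illustration
print("biased" if gap > THRESHOLD else "ok")
```

Run over many templates and term pairs, checks like this can catch biased behavior even when the training data looked clean, and can be re-run regularly to catch drift in an external provider's model.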
How we approach this at Textio:
These are exciting times! New generative AI technologies will transform a range of tools and processes in ways that we can’t imagine yet. However, these new technologies carry risks. If you just use them off the shelf, they can generate harmful content and might actually undermine your goals and lead to worse outcomes.
What’s the good of writing a job post faster if it attracts the wrong candidates and makes the interview cycle longer instead of shorter? What’s the good of rushing through performance feedback if it fails to help people on your team develop and grow?
Luckily, there are well-established best practices for mitigating bias and achieving the outcomes you want—and using generative AI to do it.
If you’re looking to make an impact with generative AI, apply these best practices and keep bias in mind every step of the way. That’s how real progress happens.
At Textio, we believe that AI can be a force for good and help us all build better products and workplaces.