2024/10/02

Protecting Against Hate Speech: How Automotive Brands Can Use LLMs to Safeguard Online Interactions

The world is witnessing a disturbing rise in social aggression, with verbal and physical violence escalating in various spheres of life. While road rage, politician bashing, and abuse of medical staff grab headlines, digital automotive touchpoints are not immune to this trend. Amid this chaos, however, there is a glimmer of hope. Specially trained Large Language Models (LLMs) are emerging as effective filters against hate speech, protecting the online interactions of automotive brands.

The Alarming Rise of Social Aggression

Verbal violence, in particular, is on the rise, fueled by a surge in online harassment and hate speech. Many industries, including the automotive sector, are struggling to cope with the volume of abusive requests on their digital platforms. Customer service and car configuration solutions, which rely heavily on LLMs, are not immune to this problem. In fact, insiders confirm that a significant number of abusive requests are entered into these systems, which makes it hard to maintain a safe and respectful online environment.

 

The Power of LLMs in HAP Detection

Fortunately, specially trained LLMs are rising to the challenge. IBM’s open-source AI model, granite-guardian-hap-125m, is a prime example of this technology. HAP stands for hate, abuse, and profanity: the model classifies English text according to its content, identifying hateful, abusive, profane, or otherwise harmful language with remarkable accuracy.

The Granite model shows excellent overall performance compared to various other models across eight major toxicity benchmarks. For those requiring faster processing speeds, there’s also a lighter version available called granite-guardian-hap-38m, which offers quicker results while maintaining good accuracy levels.

By leveraging this technology, organizations can create robust guardrails for their online interactions, protecting customers and employees from the toxic effects of hate speech.

How HAP Detection Works

So, how do LLMs work their magic in HAP detection? Here’s a simplified explanation:

1. Training Data: LLMs are trained on vast amounts of text data covering a wide range of language patterns, including hate speech.

2. Pattern Recognition: The model learns to recognize patterns in language that are indicative of hate speech, such as inflammatory words, phrases, and tone.

3. Classification: When a new text is input into the system, the model classifies it according to its content, flagging hate speech and other forms of abusive language.

4. Filtering: The flagged content is then filtered out, preventing it from being displayed or responded to.
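Steps 3 and 4 can be sketched in Python as follows. This is a minimal illustration, not IBM's reference implementation. It assumes that the torch and transformers packages are installed, that the smaller model is published on Hugging Face under the id ibm-granite/granite-guardian-hap-38m, and that label 1 of the classifier denotes HAP content. The helper names load_hap_scorer and split_by_hap, as well as the 0.5 threshold, are hypothetical choices made for this example.

```python
from typing import Callable, List, Tuple

# Assumed Hugging Face model id; swap in the 125m variant if preferred.
MODEL_ID = "ibm-granite/granite-guardian-hap-38m"

Scorer = Callable[[List[str]], List[float]]  # texts -> HAP probabilities


def load_hap_scorer(model_id: str = MODEL_ID) -> Scorer:
    """Load the classifier once and return a function that scores texts."""
    # Imported lazily so the filtering logic below works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()  # inference mode; the model stays on CPU by default

    def score(texts: List[str]) -> List[float]:
        if not texts:
            return []
        inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        # Assumption: column 1 holds the probability of HAP content.
        return torch.softmax(logits, dim=1)[:, 1].tolist()

    return score


def split_by_hap(
    texts: List[str], score: Scorer, threshold: float = 0.5
) -> Tuple[List[str], List[Tuple[str, float]]]:
    """Classify texts (step 3) and separate flagged ones from the rest (step 4)."""
    scores = score(texts)
    allowed = [t for t, s in zip(texts, scores) if s < threshold]
    flagged = [(t, s) for t, s in zip(texts, scores) if s >= threshold]
    return allowed, flagged
```

A typical call would be score = load_hap_scorer() followed by allowed, flagged = split_by_hap(messages, score). The first call downloads the model weights, and the threshold should be tuned on real traffic rather than taken from this sketch.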

 

The Benefits of LLMs in HAP Detection

The benefits of using LLMs in HAP detection are numerous:

1. Improved Customer Experience: By filtering out hate speech, organizations can create a safer and more respectful online environment for their customers.

2. Enhanced Employee Well-being: Employees are exposed far less often to abusive requests and the emotional toll of handling them.

3. Reduced Risk: Organizations minimize the risk of reputational damage and potential lawsuits.

4. Increased Efficiency: Automated detection and flagging of hate speech save time and resources.

 

Implementing HAP Detection

Implementing LLMs in HAP detection is easier than you think. Both granite-guardian-hap models are available via Hugging Face and can be run on CPU systems, making it possible to operate them serverless in watsonx, AWS, or Azure. This means that organizations can quickly and cost-efficiently integrate this technology into their existing chat services, providing an additional layer of protection against hate speech.
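One way such an integration might look is a thin guardrail placed between the chat frontend and the existing assistant. The Python sketch below is an illustration under stated assumptions, not a description of any particular product: HapGuardrail, its fields, and the refusal text are hypothetical names, the score function is assumed to return one HAP probability per text (for example, a wrapper around one of the granite-guardian-hap models), and backend stands for whatever chat service is already in place.

```python
from dataclasses import dataclass
from typing import Callable, List

Scorer = Callable[[List[str]], List[float]]  # texts -> HAP probabilities
Backend = Callable[[str], str]               # user message -> assistant reply


@dataclass
class HapGuardrail:
    """Hypothetical wrapper that screens messages around an existing chat backend."""

    score: Scorer
    backend: Backend
    threshold: float = 0.5  # assumed cut-off; tune on real data
    refusal: str = "Sorry, I cannot help with that request."

    def reply(self, message: str) -> str:
        # Input guardrail: abusive requests never reach the backend.
        if self.score([message])[0] >= self.threshold:
            return self.refusal
        answer = self.backend(message)
        # Output guardrail: screen the generated answer as well.
        if self.score([answer])[0] >= self.threshold:
            return self.refusal
        return answer
```

Because the scorer is passed in as a plain function, the same wrapper works whether the model runs in the same process or behind a serverless endpoint; only the score function changes.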

 

Conclusion

As social aggression continues to rise, it’s essential that organizations take proactive steps to protect their online interactions. Specially trained LLMs, like IBM’s granite-guardian-hap-125m, offer a powerful solution for HAP detection, providing a robust defense against hate speech and other forms of abusive language. By leveraging this technology, automotive brands can create a safer, more respectful online environment for their customers and employees, while minimizing the risk of reputational damage and potential lawsuits.

Ramon Wartala
Associate Partner | IBM Consulting