2024/11/21

AI in Cars for Enhanced In-Vehicle Operating Systems, Powered by LLMs

Chat with the SPARQ OS voice assistant, powered by IBM's watsonx large language models

As more and more Software Defined Vehicles (SDVs) are hitting the road, the complexity of vehicle features and functionalities has increased, and software has become an integral part of the driving experience and a key differentiator for brands. By integrating Large Language Models (LLMs) into the vehicle, OEMs can take the user experience to the next level, improve technical diagnostics, provide new digital services, and improve automated driving.

Background

Communicating with your vehicle is no longer a novelty; it is a standard feature. Integrated systems can help you find the nearest charging station or restaurant, provide real-time traffic information, or control the vehicle’s air conditioning and entertainment system. The next generation of Artificial Intelligence (AI) based Operating Systems (OS) will offer even more complex interactions, supporting the driver, for example, in the management of daily tasks.

In the future, your car could proactively manage service appointments at your preferred workshop, book a hotel room along the route, or keep you informed about the latest software updates with features picked according to your preferences, all while driving. With AI-powered chatbots, your car will become a trusted companion that anticipates your needs and enhances your overall driving experience.

As part of this fundamental change, the implementation of LLM-based in-vehicle applications requires the consideration of three key aspects: processing speed, scalability, and security. These aspects can be balanced with state-of-the-art technology approaches:

  1. In-vehicle LLMs offer fast processing speeds and reduced latency but may require more memory and processing power.
  2. Cloud-based LLMs provide scalability and flexibility but may introduce latency and security risks.

It is now up to the automotive OEMs to shape their in-vehicle AI strategy, weighing the pros and cons that come with either an in-vehicle or a cloud-based LLM setup.

 

Integration challenges and considerations of AI and chatbots in automotive systems

The rapid advancement of AI and chatbot technology in the automotive sector presents several challenges and limitations.

  • Internet Connectivity Dependence: The reliance on internet connectivity poses a significant risk. Disruptions in service and compromised safety are potential issues, especially in areas with poor network coverage. Even in well-developed regions, internet connectivity may be poor or unavailable, such as in multi-story parking garages where mobile network signals are blocked by solid concrete structures.
  • AI-generated errors: Biases, hallucinations, and other AI-generated errors can have severe consequences on the road. Functional safety is crucial in automotive systems, and safety-related functions require deterministic behavior, which LLMs cannot provide.
  • Cost and Vendor Lock-in: Using cloud-based LLMs for every user interaction can result in high network and hosting costs. The use of proprietary protocols and systems, like Google Automotive Services (GAS), can lead to vendor lock-in, limiting OEMs’ flexibility. The resulting lack of transparency can also extend to costs, data usage, and decision-making processes.
  • Complexity and Human Error: The complexity of AI systems makes them difficult to govern and control, leaving room for human error and unintended consequences.

To mitigate some of these challenges, a combination of classical rule-based systems and LLMs can be used. Topics relevant to functional safety should be delegated directly to the rule-based system to rule out hallucinations. Functions that do not require deterministic behavior can be supported by in-vehicle or cloud-based LLMs, depending on the weight given to the three key aspects mentioned above: processing speed, scalability, and security. A minimal sketch of such a hybrid dispatcher follows below.
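The following Kotlin sketch illustrates this split. The intent names, the allowlist of safety-critical functions, and the LlmClient interface are illustrative assumptions, not part of any specific OEM stack; the point is that safety-relevant intents never reach the LLM.

```kotlin
// Hybrid dispatcher sketch: safety-relevant intents are handled by a
// deterministic, auditable rule-based path; everything else may go to an LLM.

interface LlmClient { fun complete(prompt: String): String }

// Intents that touch functional safety are never sent to the LLM (assumed names).
private val SAFETY_CRITICAL = setOf("BRAKE_ASSIST", "AIRBAG_STATUS", "LANE_KEEPING")

class HybridDispatcher(private val llm: LlmClient) {

    fun handle(intent: String, utterance: String): String =
        if (intent in SAFETY_CRITICAL) {
            handleDeterministically(intent)   // rule-based, deterministic path
        } else {
            llm.complete(utterance)           // free-form, non-safety path
        }

    private fun handleDeterministically(intent: String): String =
        when (intent) {
            "BRAKE_ASSIST" -> "Brake assist is active."
            "AIRBAG_STATUS" -> "All airbags are operational."
            "LANE_KEEPING" -> "Lane keeping is enabled."
            else -> "Unknown safety function."
        }
}
```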

 

Balancing remote cloud and localized in-vehicle AI for enhanced performance and security

When implementing speech models in cars, there are two fundamental approaches: remote models that run on cloud infrastructure and local models that are executed directly in the vehicle. When comparing these two approaches, it again comes down to the three key aspects of processing speed, scalability, and security.

While remote models have the advantage of being constantly updated and improved, they are dependent on the network connection and therefore susceptible to latency and interruptions. Local models, on the other hand, offer a faster and more secure alternative, as they run directly in the vehicle and do not require an external connection.

To decide between remote and local models, an LLM proxy and an LLM cache can be used. The proxy selects a model based on the current network latency: if the latency is too high, the local model is used to ensure a faster and more reliable interaction. The LLM cache can answer recurring requests without token prediction, making them faster and more cost-effective. A sketch of this pattern follows below.
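A minimal sketch of the proxy-plus-cache pattern, assuming a simple TCP-connect latency probe, a placeholder endpoint hostname, and hypothetical local and cloud client implementations:

```kotlin
import java.net.InetSocketAddress
import java.net.Socket

interface LlmClient { fun complete(prompt: String): String }

class LlmProxy(
    private val local: LlmClient,
    private val cloud: LlmClient,
    private val maxCloudLatencyMs: Long = 150      // threshold is an assumption
) {
    // Recurring prompts are answered from the cache, skipping token prediction.
    private val cache = HashMap<String, String>()

    fun complete(prompt: String): String =
        cache.getOrPut(prompt.trim().lowercase()) {
            if (probeLatencyMs("example-llm-endpoint.cloud", 443) <= maxCloudLatencyMs)
                cloud.complete(prompt)
            else
                local.complete(prompt)             // fall back to in-vehicle model
        }

    // Rough latency probe: time a TCP connect; Long.MAX_VALUE means unreachable.
    private fun probeLatencyMs(host: String, port: Int): Long = try {
        val start = System.nanoTime()
        Socket().use { it.connect(InetSocketAddress(host, port), 500) }
        (System.nanoTime() - start) / 1_000_000
    } catch (e: Exception) {
        Long.MAX_VALUE
    }
}
```

In production the cache key would also need to account for paraphrased requests (for example via embedding similarity), but the exact-match version already captures the cost argument.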

To execute local language models, High-Performance Computers (HPCs) come into play: powerful computing systems specifically designed for use in vehicles.

HPCs can not only execute machine learning models for level 2-4 assistance systems but also run local LLMs to avoid latency or network interruptions, enabling faster and more secure interaction between driver and vehicle. Executing LLMs on an HPC requires careful memory and processing planning, as LLMs demand significant hardware resources, including GPU and NPU processing units.
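As a back-of-the-envelope illustration of those memory considerations, the weight footprint of a model scales with parameter count and numeric precision; the model size below is an assumption for illustration, and real budgets also need headroom for the KV cache and runtime overhead:

```kotlin
// Approximate memory needed just for the model weights, in GiB.
fun weightMemoryGiB(paramsBillions: Double, bitsPerWeight: Int): Double =
    paramsBillions * 1e9 * bitsPerWeight / 8 / (1 shl 30)

fun main() {
    // An 8B-parameter model: ~14.9 GiB at FP16, ~3.7 GiB at 4-bit quantization,
    // which is why quantized models are attractive for in-vehicle HPCs.
    println("FP16: %.1f GiB".format(weightMemoryGiB(8.0, 16)))
    println("INT4: %.1f GiB".format(weightMemoryGiB(8.0, 4)))
}
```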

With watsonx, IBM provides a cloud-based infrastructure to train, host, and govern LLMs in private, secure environments for our automotive OEM customers. In watsonx, OEMs can train their own models based on available open-source models; for example, a model can be trained on the user manual content and the vehicle functions (air conditioning, infotainment, navigation, etc.) specific to a car series.

IBM offers Granite, an open-source model family that can be specialized for various applications and is available in different model sizes. Through fine-tuning, a model can be adapted to the specific needs of the vehicle, optimizing its performance and ensuring a faster and more secure interaction between driver and vehicle.
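A hedged sketch of calling a Granite model hosted on watsonx.ai over REST from the vehicle backend: the endpoint path, version date, and model_id follow the public watsonx.ai documentation at the time of writing and should be verified against the current docs; the region URL, project ID, and token handling are assumptions.

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

fun generate(iamToken: String, projectId: String, prompt: String): String {
    // NOTE: in real use the prompt must be JSON-escaped before interpolation.
    val body = """
        {
          "model_id": "ibm/granite-13b-instruct-v2",
          "project_id": "$projectId",
          "input": "$prompt",
          "parameters": { "max_new_tokens": 200 }
        }
    """.trimIndent()

    val request = HttpRequest.newBuilder()
        .uri(URI.create("https://us-south.ml.cloud.ibm.com/ml/v1/text/generation?version=2023-05-29"))
        .header("Authorization", "Bearer $iamToken")
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build()

    return HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())
        .body()   // JSON response; generated text sits under results[0].generated_text
}
```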

Retrieval-Augmented Generation (RAG) can support local LLMs in querying vehicle and user data in real time, grounding answers in current documents rather than in model weights alone. This offers a new dimension of vehicle interaction; a minimal sketch follows below.
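The sketch retrieves the most relevant user-manual chunks via cosine similarity of embeddings and prepends them to the prompt. The Embedder and LlmClient interfaces, and the pre-split manual chunks, are illustrative assumptions:

```kotlin
interface Embedder { fun embed(text: String): FloatArray }
interface LlmClient { fun complete(prompt: String): String }

class ManualRag(
    private val embedder: Embedder,
    private val llm: LlmClient,
    private val chunks: List<String>            // pre-split user-manual passages
) {
    // Embed the manual once at startup; the index lives in vehicle memory.
    private val index = chunks.map { embedder.embed(it) }

    fun answer(question: String, topK: Int = 3): String {
        val q = embedder.embed(question)
        val context = index.indices
            .sortedByDescending { cosine(q, index[it]) }   // most similar first
            .take(topK)
            .joinToString("\n") { chunks[it] }
        return llm.complete(
            "Answer using only this manual excerpt:\n$context\n\nQuestion: $question"
        )
    }

    private fun cosine(a: FloatArray, b: FloatArray): Double {
        var dot = 0.0; var na = 0.0; var nb = 0.0
        for (i in a.indices) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i] }
        return dot / (Math.sqrt(na) * Math.sqrt(nb))
    }
}
```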

 

How to keep up and win the OEM race in AI enhanced in-vehicle operating systems

It can be assumed that software providers such as Apple and Google will soon integrate LLMs into their Apple CarPlay and Android Automotive solutions. Depending on the application, either local or remote LLMs could be used. These major tech players have the technology to make LLMs available in the cloud as well as to run them locally in smaller, more specialized versions: at Apple this is Apple Intelligence, at Google it is Gemini.

The integration of Google’s Gemini into Android Automotive and the automotive sector at large could redefine the in-vehicle experience, bringing intelligence, personalization, and convenience directly to the driver. By incorporating Gemini into Android Automotive, drivers could enjoy enhanced voice assistants that support natural, conversation-like interactions, allowing them to safely navigate, control music, or send messages through seamless voice commands. Local media and comfort controls would also improve, as Gemini could personalize settings such as seat position or climate preferences based on the driver’s past behavior.

In addition to personalization, real-time information and assistance would be at drivers’ fingertips, with up-to-the-minute route updates, traffic changes, and nearby points of interest delivered instantly. Imagine also having quick answers to vehicle-specific questions, like tire pressure status or USB port locations, all retrieved directly from the vehicle’s manual.

Multi-modal capabilities in Gemini could take this a step further by combining text and visual cues to guide users with images or step-by-step instructions, ideal for tasks like changing a tire. Furthermore, remote and cloud-based integration with Google Cloud could support Gemini’s learning and adaptability, continuously refining the experience based on driver interactions while keeping data current.

For autonomous or semi-autonomous driving, Gemini’s LLM capabilities could enhance scene understanding, potentially narrating driving actions to explain decision-making in real-time. With additional benefits for vehicle maintenance and support, Gemini could even analyze vehicle data to predict servicing needs or troubleshoot issues, all while maintaining security protocols that prioritize driver privacy.

Apple and Google have not yet announced such an integration into their car software, so there is still plenty of room for integration with other language models.

This also makes it possible to connect operating systems such as SPARQ OS, P3’s Android Automotive turnkey solution, with a corresponding in-vehicle LLM or a remotely connected version of watsonx Granite. By engaging now, the functionalities described above can be delivered even earlier than by Google or Apple, helping the automotive OEM to position itself as a strong, future-oriented brand in the AI space. In addition to this first-mover advantage, the adaptability of open-source models such as Granite brings further benefits.

Fine-tuning Granite allows OEMs to realize their specific use cases and therefore meet their customers’ needs and expectations more precisely. While an in-vehicle LLM trained on specific vehicle manuals and documents could answer manual-based questions, the remotely connected watsonx Granite would allow for questions and answers beyond the provided documents.

With P3’s SPARQ OS, access to the vehicle hardware abstraction layer (VHAL) allows deep, bi-directional control of the interior, with use cases like “Hi Jane, I am cold.” or “Hi Jane, activate all road assistance for the offroad.”, while at the same time leveraging language models such as Granite to respond to use cases beyond the vehicle, like “Hi Jane, tell me about the sightseeing opportunities around me.” A sketch of the VHAL side follows below.
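SPARQ OS internals are not public, so the sketch below shows the generic Android Automotive route: a resolved voice intent is mapped onto a VHAL property via CarPropertyManager. The intent names, zone ID, and target temperature are illustrative assumptions.

```kotlin
import android.car.Car
import android.car.VehiclePropertyIds
import android.car.hardware.property.CarPropertyManager
import android.content.Context

fun onIntentResolved(context: Context, intent: String) {
    val car = Car.createCar(context)
    val props = car.getCarManager(Car.PROPERTY_SERVICE) as CarPropertyManager

    when (intent) {
        // "Hi Jane, I am cold." -> raise cabin temperature in the driver zone
        "RAISE_TEMPERATURE" -> props.setFloatProperty(
            VehiclePropertyIds.HVAC_TEMPERATURE_SET,
            /* areaId = */ 1,        // HVAC zone ID is vehicle-specific; assumption
            /* value  = */ 24.0f     // target temperature in °C; assumption
        )
        // Questions beyond the vehicle are delegated to the connected LLM instead,
        // e.g. via the proxy-plus-cache pattern sketched earlier.
        else -> { /* forward utterance to Granite */ }
    }
}
```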

In brief, the cloud-based integration of IBM watsonx Granite into P3’s SPARQ OS, answering outgoing API calls, will not only improve the current user experience and raise customer satisfaction but also finally bring intelligence into the modern vehicle.


Contact


Any questions?

Let's connect and find out how we can make things happen.

Gregor Resing
Executive IT Architect, IBM Germany
Ramon Wartala
Associate Partner | IBM Consulting
Ramon Di Canio
Senior Strategy Consultant, IBM Consulting
Marius Mailat
CEO, P3 Digital Services GmbH