
The ascent of the large language model (LLM) hype train has been nothing short of remarkable; it has captivated researchers, businesses and enthusiasts alike. Few expected LLMs, and generative AI more broadly, to take off as quickly as they did. Just look at the capabilities demonstrated by models like GPT-3 and GPT-4.
Yet, as the excitement continues to mount, a critical query comes to the fore:
How can organizations effectively transition LLMs from experimental novelties to impactful production tools that yield substantial value?
Well, one key point found in Predibase’s new report is that enterprises are concerned about sharing sensitive data with commercial LLMs.
This has sparked interest in self-hosted or open-source LLM alternatives that let enterprises retain control of their data. Nearly 77% of respondents don't use (or don't plan to use) commercial LLMs beyond prototypes in production, citing concerns about privacy, cost and lack of customization. Meta, for example, moved away from the restricted licensing of LLaMA-1, releasing its successor, LLaMA-2, as open source and free for both commercial and research applications.
For those not familiar, self-hosted or open-source LLM alternatives refer to deploying and using LLMs on an organization's own servers or infrastructure, or adopting publicly available models that can be customized as needed. These options allow organizations to have greater control over their data, as the models operate within their own environment without the need to share sensitive information with external vendors. This approach empowers enterprises to tailor the models to their specific requirements while mitigating concerns about data privacy and proprietary information.
Organizations are turning to customized LLMs to achieve more accurate and tailored results. Most teams plan to customize their LLMs through fine-tuning or reinforcement learning from human feedback (RLHF).
Still, roadblocks stand in the way of fulfilling these plans. When it comes to fine-tuning, for example, teams often lack sufficient training data and must contend with the overall complexity of the process, such as managing infrastructure.
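To make the fine-tuning discussion concrete: one widely used way to customize an open-source LLM without retraining all of its weights is parameter-efficient fine-tuning, such as LoRA (Low-Rank Adaptation). LoRA is not named in the report; it is offered here only as an illustration of the general idea, in a toy-sized, plain-Python sketch:

```python
# Minimal sketch of the LoRA (Low-Rank Adaptation) idea: instead of
# updating a model's full weight matrix W during fine-tuning, freeze W
# and train a small low-rank update B @ A that is added on top.
# Shapes here are toy-sized for illustration; real LLM layers are large.

def matmul(X, Y):
    """Plain-Python matrix multiply, used only for this sketch."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_forward(W, A, B, x):
    """Compute (W + B @ A) @ x without materializing the full update."""
    col = [[v] for v in x]                 # column vector
    base = matmul(W, col)                  # frozen pretrained weights
    delta = matmul(B, matmul(A, col))      # trainable low-rank adapter
    return [b[0] + d[0] for b, d in zip(base, delta)]

# Toy example: 2x2 frozen identity weights, rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]   # pretrained weights (frozen)
A = [[0.5, 0.5]]               # rank-1 down-projection (trainable)
B = [[1.0], [1.0]]             # rank-1 up-projection (trainable)

print(lora_forward(W, A, B, [2.0, 4.0]))  # base [2, 4] + delta [3, 3] -> [5.0, 7.0]
```

Because only the small A and B matrices are trained, this kind of approach sharply reduces the compute and data a team needs, which is part of why managed platforms lean on it.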
Platforms like Predibase address these challenges, as they are designed to help developers build AI-powered applications with custom open-source LLMs. Built on managed infrastructure, these platforms provide a way for teams to deploy, operationalize and customize open-source LLMs on data in the cloud.
“Thanks to the widespread recognition of OpenAI’s ChatGPT, businesses are in an arms race to gain a competitive edge using the latest AI capabilities. Still, they require more customized LLMs to meet domain-specific use cases,” said Piero Molino, co-founder and CEO of Predibase. “This report highlights the need for the industry to focus on the real opportunities and challenges as opposed to blindly following the hype.”
Edited by
Alex Passett