This article is an introduction to our “Ask Anupam” series: Join 113 Industries Co-Founder and President as he tackles commonly asked questions and explores trending technology topics.
Written by Anupam Singh, Co-Founder and President, 113 Industries. He guest lectures on NLP and AI at Carnegie Mellon University and UC Berkeley.
ChatGPT and Large Language Models (LLMs) are everywhere. As a result, many of our customers ask me: Should I use them or create my own custom AI model?
The answer, as in many other real-life and business situations, is ‘it depends.’ Before I expand on that, let us cover some basics.
The Rise of Artificial Intelligence (AI):
Artificial intelligence (AI) has exploded in capability and availability over the past few years. Tools like ChatGPT (built on GPT-3.5 and subsequent models) and Claude, powered by increasingly capable AI models, have democratized access to capabilities that were once the domain of programmers, engineers, and domain experts. The barrier between the common man and machine is shrinking by the minute. All you need is a browser, and voilà… you can summarize complex documents, generate professional-grade copy, create your own pictures and videos, write code, and much, much more. And this is happening at breakneck speed.
Companies Transforming the AI Industry:
This field exploded after Google Research released the seminal paper “Attention Is All You Need,” which introduced the “transformer architecture,” in 2017. Subsequently, the release of ChatGPT by OpenAI in late 2022 was an inflection point. How many of you knew the term “LLM” or “Large Language Model” three years ago?
The spikes in search volumes are illustrated in this chart from Google Trends:
Expanding Datasets Increase AI’s Knowledge:
AI models like GPT-3, GPT-4, Claude 2, and the myriad other models accessible to the public are pre-trained on huge and ever-expanding datasets. Full details are often not disclosed, and estimates of the raw data behind such models range from tens of terabytes to petabytes. For reference, one petabyte is equal to 1,000 terabytes. For GPT-3, OpenAI reported that a filtered version of the Common Crawl corpus, built from years of web-scraped pages, made up around 60% of the training mix, with the remainder drawn from sources such as books and Wikipedia. The raw Common Crawl archive alone runs to petabytes, although the filtered portion used for training was far smaller (roughly 570 GB, drawn from about 45 TB of compressed text).
This broad training enables the model to understand and interpret language (we are only focusing on text data for this article), making it possible to provide tools to converse, write, and reason about a wide range of topics. However, without an understanding of a business’s specific products, customers, or data, this understanding is very likely to remain generic.
The Benefits and Drawbacks of Public Models:
Instant availability and broad capabilities make it very enticing to use public AI models and the tools built on top of them. Within seconds, anyone can generate paragraphs of human-sounding text on nearly any topic or prompt for creative writing. This can provide great value for general research, brainstorming, or creating and testing hypotheses. However, public models may fall short when more precise analysis is crucial.
*Note that a related issue with these models is whether they are open source or not. This is a topic deserving its own article.
Without training on domain-specific data, public models lack the necessary precision. Their knowledge and understanding are deep and powerful but not necessarily derived from data specific to your company, industry, domain, or use cases. This can lead to outputs that sound intelligent but may miss key details or nuances. Additionally, public platforms likely won’t align with your terminology, language, brand voice, and visual style. You need to ensure that the model you are using knows and understands the data relevant to your business and your use cases.
Furthermore, if you are in a field where data privacy, rights, and regulatory governance are important (such as healthcare or finance), you may not have the required control over what information you share with the model or what information the model outputs, which could violate your privacy rules and policies. A related issue with public models is the lack of transparency around exactly what data was used to train them and any inherent biases they may have learned.
Then there is the challenge of seamlessly integrating the model and applications built on top of it into your company’s workflow and business systems. You may not have the flexibility and control you need to design and architect these systems optimally. Many details need to be taken into consideration, including control, model efficiency, performance, latency, flexibility, and customization, which might necessitate exploring alternatives to public models.
When Custom AI Models Add More Value:
So, what are your choices? Broadly speaking, you can either fine-tune an existing model on your own data or build one from scratch. Although building a model from scratch provides ultimate control, flexibility, and performance, it’s not for the faint-hearted. It can take a prohibitive amount of time and money, along with a vast amount of data, and that assumes you already have the required expertise and skill sets. A more popular strategy is to fine-tune an existing model with your data. Providers of Large Language Models (LLMs) are investing a significant amount of money and resources to make this process easier. Additionally, there are entire companies dedicated to developing tools to help you structure and organize your data, label and annotate it, manage your training datasets, assess and track model performance, and manage your custom AI model.
In conclusion, the plug-and-play availability of public AI models generates justified excitement. However, for business objectives requiring reliable, optimized, and precise systems, customizing a model is often the better choice. Just as tailored clothing fits better than off-the-rack, bespoke models trained on specific data transform vague AI promises into strategic advantages. While easy access now puts basic intelligence into more hands than ever, customization unlocks AI’s full potential.
Interested in learning more about building a custom AI model with 113 Industries’ proprietary software, Jacquard AI? Reach out to me, Anupam Singh, directly for a one-on-one discussion!