Large Language Models – Who are the key players?

Large Language Models (LLMs) have been among the hottest topics in the past few months and will likely become one of the highlights of 2023. The fact that 77 per cent of businesses using Natural Language Processing (NLP) plan to increase their investment shows that LLMs are not just hype.

LLMs are designed to understand and generate human language. They are usually offered through a user-friendly, text-based interface that people can use to ask questions and complete tasks such as text summarization or code generation. Unlike many other machine learning models, LLMs typically use a neural network architecture called a transformer and are trained on massive amounts of text data. As a result, they’re effective at language and text-related tasks but not (yet) at math.

These models rely on deep learning techniques and are expected to transform the world in many ways. LLMs have already improved language translation, customer service, healthcare, and education, and their impact will become more significant in the years ahead.

The companies that will lead the way are those developing and enhancing these models so that they generate human-like language at scale.

Six Key Players in LLM Development

ChatGPT by OpenAI

GPT-4, the fourth and latest iteration of the model behind ChatGPT, has been one of the most prominent innovations of the past few months: it is more creative than its predecessor, handles a longer context, and accepts visual input. According to OpenAI, it is 40 per cent more likely than GPT-3.5 to produce factual responses and 82 per cent less likely to respond to requests for disallowed content.

Users can also fine-tune it for specific natural language processing tasks (e.g., language translation). However, GPT-4 is available in ChatGPT only with the paid Plus subscription, which costs US$20 per month. Otherwise, anyone can use the free version, based on GPT-3.5, on OpenAI’s website.

Bard by Google

Bard uses Google’s Language Model for Dialogue Applications (LaMDA), provides real-time responses, and draws on the internet for research. It’s free to use, and unlike ChatGPT, it was trained on a dataset centred around conversations and dialogue.

Hence, Bard is designed to pick up on the user’s intent and the nuances of their question. Although it can provide more human-like responses (and can even claim to feel emotions), ChatGPT outperforms it at summarizing large texts.

Currently, only people in the U.S. and U.K. can join the Google Bard waitlist, but access is free.

Auto-GPT by Significant Gravitas

This open-source AI project is built on OpenAI’s GPT models, the technology behind ChatGPT, but differs by having decision-making abilities. That includes self-prompting: it independently produces the prompts needed to finish a task.

Many refer to it as the tool with the first traces of Artificial General Intelligence (AGI), as it can function with little human intervention. While users must guide ChatGPT through every step, Auto-GPT can autonomously develop a whole project based on one prompt. It uses AI agents that instruct the underlying GPT model on what action to take, which is why it can auto-develop, debug, and self-improve.
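The self-prompting idea can be illustrated with a minimal Python sketch. This is not Auto-GPT’s actual code: the names run_agent and make_stub_model, the "FINISH" signal, and the prompt wording are all assumptions made for illustration, and the stub stands in for a real model call so the example runs without an API key.

```python
from typing import Callable, List


def run_agent(goal: str, model: Callable[[str], str], max_steps: int = 5) -> List[str]:
    """Illustrative self-prompting loop (hypothetical, not Auto-GPT's implementation).

    The user supplies only the goal. On every step the loop writes its own
    prompt from the goal plus the actions taken so far, asks the model for
    the next action, and stops when the model answers FINISH or when the
    step budget runs out.
    """
    history: List[str] = []
    for _ in range(max_steps):
        # The agent, not the user, composes the next prompt.
        prompt = f"Goal: {goal}\nActions so far: {history}\nWhat is the next action?"
        action = model(prompt)
        if action.strip().upper() == "FINISH":
            break
        history.append(action)
    return history


def make_stub_model(script: List[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a real LLM call: replays canned actions in order."""
    remaining = iter(script)

    def model(prompt: str) -> str:
        # Once the script is exhausted, signal that the task is done.
        return next(remaining, "FINISH")

    return model


if __name__ == "__main__":
    stub = make_stub_model(["search the web", "draft an outline", "write the report"])
    for step, action in enumerate(run_agent("Write a market report", stub), start=1):
        print(step, action)
```

The max_steps cap plays a role similar to a spending limit: because each step would be a paid model call in a real setup, bounding the loop bounds the cost.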

To use Auto-GPT, users must install a recent version of Python on their computers, which requires either programming skills or the ability to follow online instructions step by step. Although the software itself is open source, running Auto-GPT isn’t free: it relies on a paid OpenAI API key, which requires adding billing details and setting a spending limit.

Bing by Microsoft

Bing’s chat feature runs on OpenAI’s GPT-4, but unlike ChatGPT, it has internet access and performs like an AI-driven search engine. Whereas ChatGPT has a knowledge cut-off date of 2021, Bing provides up-to-date responses.

It allows only 20 replies per conversation, suggests follow-up questions, and offers three conversation styles: precise, creative, and balanced. It footnotes each response with a list of the references it used. Users can access it by opening the Microsoft Edge web browser, going to Bing search, and choosing the chat option, or by adding it as an extension to their browser.

Dolly 2.0 by Databricks

Considered the first truly open, instruction-tuned LLM, Dolly 2.0 is a text-generation model that powers apps such as text summarizers and chatbots and allows commercial use by independent companies and developers. Databricks employees generated 15,000 records to fine-tune Dolly 2.0, but its accuracy remains limited.

Besides providing potentially incorrect answers, it can produce offensive output and responds only in English. That’s why it’s better applied to addressing customer support tickets and generating code than, for example, creating long-form content.

Running Dolly 2.0 requires basic to intermediate programming skills, but there are various tutorials on installing it locally.

Megatron by Nvidia

Nvidia’s NeMo Megatron framework helps organizations accelerate the training of LLMs, and recent updates support models as large as 1 trillion parameters. It’s a top-to-bottom stack encompassing GPU-accelerated machine-learning libraries, hardware, and networking optimizations designed explicitly for cluster deployments.

You can access Megatron via GitHub.

We’ve only just begun – you can help

LLMs are becoming widespread across different industries due to their user-friendly, text-based interfaces that help people accelerate their tasks and be more productive. These models will only become more advanced and well-rounded, and those training and developing them will play a significant role in the future use of LLMs.

You can discover the latest innovations and updates in the LLM landscape by checking the news and posts on our sites. Future articles in this series will cover not only LLMs but also apps we discover that appear useful or interesting.

You can also contribute by sending us links to sites that you have used successfully. To reach us, just click either the checkmark (I liked this) or the X (I didn’t) and you can then send a message directly to our editorial group.

Or you can reach @therealjimlove on our Mastodon site at technews.social

The post Large Language Models – Who are the key players? first appeared on IT World Canada.