LLM Chat
AI chat products are apps that manage our interactions with LLMs. They look much like any other chat application, but the replies are generated by an LLM and streamed to us.
Most popular services are hosted online or available through a mobile app, though some can be run locally on our own machines.
You probably already use some of these services, but here are a few curated hosted ones that we can use in these tutorials:
ChatGPT
The famous chat application that started the craze, released in November 2022 by OpenAI.
You can sign up for a free account here: https://chatgpt.com/
Perplexity.AI
Perplexity.AI takes an interesting approach to AI chat, attempting to address LLMs' tendency to "hallucinate," or generate plausible-sounding results without regard for their truthfulness. Perplexity.AI combines text generation with web search to find credible sources relevant to a user's prompt.
It's available for public use here: https://www.perplexity.ai/
Hugging Chat
Hugging Face hosts its own chat service that lets you pick from a variety of models. It's a good free, flexible option for testing models of different kinds.
Sign in with a free Hugging Face account: https://huggingface.co/chat/
Meta AI Chat
Meta AI Chat is backed by Meta's LLM family, Llama. Llama is one of the most popular open-weight models available; it is downloadable in different sizes and flavors and can be run locally on your own machine, given sufficient hardware.
Find it here: https://www.meta.ai/
Google Gemini
"Gemini" is the name for both Google's LLM collection and their chat service. It's the successor to "Bard," which you may have heard of.
Find it here: https://gemini.google.com/app
Google Gemini via Google Vertex AI
Vertex AI is Google Cloud's machine-learning platform. It lets you run text completion on multiple models with multiple kinds of input, and it exposes more of the inference parameters, such as temperature.
Find it here: https://console.cloud.google.com/vertex-ai/generative/multimodal/create/text
Anthropic's Claude
Anthropic is a significant company in the AI space, with their flagship LLM called "Claude."
Anthropic is a public-benefit corporation that claims to be focused on AI safety. In 2023, they announced an upcoming investment from Amazon of up to $4 billion USD.
Find it here: https://claude.ai/
How LLM Chats Work
LLM chat applications are just web apps that do the following:
- manage a list of "messages"
- submit prompts to a hosted LLM API
- receive completions from the hosted LLM API
- manage a set of "tools" for functionality beyond text generation
And that's pretty much it. It's relatively easy to code your own LLM chat app if you have access to an API for performing the LLM completion itself.
Whenever you submit a prompt, it is wrapped in a "message" object and added to the managed list of messages, and then the entire list is sent to the LLM API.
That last point is key: every message in a chat session, all the way back to the start, is sent with each new prompt. The LLM then generates a completion one token at a time. This is how the LLM can base completions on the entire conversation, up to the limit of its context window.
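The loop above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: `fake_completion` is a hypothetical stand-in for a real hosted LLM API call, and the `{"role": ..., "content": ...}` message shape is a common convention, not a requirement of any particular API.

```python
def fake_completion(messages):
    # Hypothetical stand-in for a hosted LLM API. A real app would
    # POST the full `messages` list here and return the generated
    # assistant message.
    last = messages[-1]["content"]
    return {"role": "assistant", "content": f"You said: {last}"}

class ChatSession:
    def __init__(self, system_prompt="You are a helpful assistant."):
        # All messages for this chat accumulate in this list and are
        # re-sent in full on every turn.
        self.messages = [{"role": "system", "content": system_prompt}]

    def send(self, user_text):
        # Wrap the prompt in a "message" object...
        self.messages.append({"role": "user", "content": user_text})
        # ...then send the ENTIRE history, not just the new prompt.
        reply = fake_completion(self.messages)
        self.messages.append(reply)
        return reply["content"]

chat = ChatSession()
print(chat.send("Hello!"))   # → You said: Hello!
print(len(chat.messages))    # → 3 (system + user + assistant)
```

Note that the session object itself holds all the conversational "memory"; the completion call is stateless, which is why the whole list must travel with each request.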
This also means that prompting techniques such as asking the LLM to expound on its reasoning first can often produce more relevant, higher-quality completions, since earlier generated tokens condition the tokens that follow.
The effect of seeing each token appear in near real time is called "streaming," and it's a standard feature of LLM APIs and SDKs.
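A toy sketch of the client side of streaming, with no real API involved: `stream_completion` is a hypothetical generator standing in for an API that yields tokens one at a time (real APIs typically deliver them over HTTP, e.g. as server-sent events), and the "UI" just accumulates tokens as they arrive.

```python
def stream_completion(text):
    # Hypothetical stand-in for a streaming LLM API: instead of
    # returning the whole completion at once, yield one token at a time.
    for token in text.split():
        yield token + " "

def render_stream(token_iter):
    # A chat UI appends each token to the display as it arrives;
    # here we just accumulate into a string.
    rendered = ""
    for token in token_iter:
        rendered += token
    return rendered.rstrip()

print(render_stream(stream_completion("Hello there, how can I help?")))
```

The point of the generator is that the consumer can show partial output immediately rather than waiting for the full completion, which is exactly the effect you see in hosted chat apps.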