My thoughts on AI, for my fellow lawyers.
I’ve been seeing a lot about AI lately, including some misunderstandings by fellow lawyers that have the potential to get them into trouble. Since I have some background in the area, I figured I would do a public service for my brothers and sisters (do you all say that or is it just a Massachusetts thing?) and provide a knowledge drop.
This will be in at least two parts. Today’s entry is the technical material: background on the technology that will set up what I want to write about its uses. If you’re not interested in the technical side, you can skip this. My next entry will be on how AI is (and is not) useful for lawyers - in my opinion, of course.
DISCLAIMER: The following is not meant to be comprehensive or precise. It is my attempt to explain some things, as I understand them. It has been fact checked by ChatGPT, not by a human expert. While the writing is my own, I have taken suggestions and corrections from ChatGPT itself.
What is an AI?
First things first. What is AI? It stands for artificial intelligence, and while it’s probably artificial, it’s arguably not intelligent. What most of us think of as AI, at least in the law practice context, is a chatbot built on a large language model (LLM). I’ll use ChatGPT as my example because it’s the most popular one, but this all applies to similar platforms like Microsoft’s Copilot and Google’s “AI Overviews” feature. It presents itself to us as a website where we can ask questions and get answers on nearly any topic.
From a technical perspective, it is a SaaS (software as a service) platform. It runs natural language processing software to interpret your question, pulls together information from a combination of its own database (referred to as a “pre-trained model”) and, when required, web searches, and then uses that same language processing capability to generate a response in natural language (e.g., English).
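For the technically curious, the “service” part is literal: the same engine behind the chat website can be called directly from code. Here is a minimal sketch using OpenAI’s Python library - the model name shown is just one example, and these details change often, so treat it as illustrative rather than gospel:

```python
# A minimal sketch of calling the service directly via OpenAI's
# Python library. Assumes you have an account and have set the
# OPENAI_API_KEY environment variable; "gpt-4o" is one example model.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is a holographic will?"}],
)
print(response.choices[0].message.content)
```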
From a less technical perspective, ChatGPT is an advanced tool for collecting, collating and presenting information in order to answer questions. But it’s still a tool built on software and databases - not humans - and its capabilities are therefore limited. At times it has the look and feel of a very smart human, but that’s because it has a huge amount of knowledge - not common sense, creativity, or what I like to call a “bullshit filter.”
What are the components of an AI?
An AI, by which I mean an LLM chatbot like ChatGPT, is software with four necessary capabilities, which I’ve already sort of touched on:
Collect, categorize and collate information to create a database. This is often referred to as “training” and is how an AI “learns” facts - really shorthand for a more involved process of finding not just facts but also patterns. The sources are textual, primarily books and websites. The training process does not distinguish reliable sources from unreliable ones, and it does ingest some bad information, like outdated studies, fake news, and posts on r/legaladvice. The information collected is broken down into components, the specifics of which we don’t need to get into, and these are tagged by subject areas and collated. Think of notes taken on index cards, each representing a single piece of information, with each card tagged with references to sources and categories that you can look up in an index. (That’s overly simplified and doesn’t fully capture the sophistication of the system, which heavily incorporates relationships between pieces of information and not just raw facts. There’s a toy sketch of the whole process after this list.) The ability to take information from textual sources, decide what it is, and store it in a useful way has been in development for decades. Without this ability it is not possible to proceed to the next item, which is:
The database itself, which represents the collected knowledge of the GPT LLM. This is enormous, and “database” isn’t really an adequate term, but it’s a nice shorthand for “data structure that encodes relationships between pieces of textual information.” OpenAI has not publicly commented on the size of the GPT-4 database, but presumably it’s larger than the GPT-3 database, which is approximately 350 gigabytes. 350 gigabytes is hard to describe in plain language, but as a rule of thumb, a page of text is about a kilobyte, so 350 gigs is about 350 million pages of printed text. If you’re like me, a client will get annoyed at you for handing them a 100-page estate planning binder, so the GPT-3 database is roughly equivalent to enough pages of estate plan binder to annoy the entire population of Connecticut.
Interpreting queries and retrieving information to respond. The same language processing techniques described in item 1 are used to interpret the user’s question and then decide what information to look for in the database. Some LLM tools will also perform a web search for additional information.
Natural language output. After the query has been interpreted and the system has pulled together the chunks of information (index cards) that can come together to form a response, the language processing technology effectively runs in reverse, translating the data into sentences that answer the question.
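To make those four steps concrete, here is the toy sketch I promised, in Python. It “trains” on a single sentence by counting which word follows which, stores those counts (the “database”), and then “generates” text from the learned patterns. A real LLM does something conceptually similar with billions of learned numbers instead of a little table - this is a cartoon of the idea, not the actual technique:

```python
from collections import defaultdict, Counter

# Step 1, "training": scan the text and record which word tends to
# follow which. (A real model learns far richer patterns than pairs.)
corpus = "the testator signs the will and the witnesses sign the will".split()
model = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    model[current_word][next_word] += 1

# Step 2, the "database": the learned patterns are just counts here,
# e.g. model["the"] == Counter({"will": 2, "testator": 1, "witnesses": 1}).
# The real thing is hundreds of gigabytes of numeric weights.

# Steps 3 and 4, query and output: starting from a prompt word, keep
# appending whichever word most often followed the previous one.
word = "the"
output = [word]
for _ in range(4):
    word = model[word].most_common(1)[0][0]
    output.append(word)

print(" ".join(output))  # -> "the will and the will"
```

Silly as the output is, the shape of the process - learn patterns from text, store them, generate new text from them - is the same shape ChatGPT follows, at an incomprehensibly larger scale.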
Why are we seeing all this come up so quickly right now?
AI chatbots have been around for a long time, going back to a system called ELIZA in the 1960s, but it’s only in the last few years that they’ve advanced enough to be useful to the average person. A lot of the tech behind this was being developed in the 90s, but the hardware, and our access to data, needed to catch up.
Here’s how far we’ve come since then. In 1998, the federal government paid about $47 million (in 1998 dollars) for the Blue Mountain supercomputer. It was installed at Los Alamos and used to simulate nuclear weapons as a replacement for real-world testing, which was banned by treaty. It filled its own large building. Blue Mountain had over 6,000 CPUs (48 clusters of 128 CPUs each) and consumed a megawatt of electricity (about 750 US homes’ worth) for the computer alone; it needed still more power for cooling. At the time it was the second most powerful computer in the world, rated at 3.1 teraflops (a teraflop is a trillion calculations per second, a standard unit of computing power).
Today you can get 3.1 teraflops from a single chip consuming about 5 watts, and you can buy the whole computer for about $800 and keep it in your pocket, because I’m talking about an iPhone 15. Not even the Pro. Now scale that up, bearing in mind that OpenAI wants its own nuclear power plants, and you start to understand how much computing power we’re talking about.
Along with computing power, storage has come a very, very long way. If you’re like me and you were in high school in the 90s, you might have been lucky enough to have a Pentium computer with a hard drive. A comparable consumer machine today has an SSD with 1,000x the storage capacity (4 GB used to be very good; now it’s 4 TB), and the data transfer rate has also increased 1,000x (from about 5 MB/s to 5 GB/s). And that’s just on the consumer side. I won’t even try to describe what data centers use, because the numbers stop being relatable.
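If you like seeing the arithmetic behind those comparisons, here it is, using the round numbers quoted above:

```python
# Back-of-the-envelope math for the hardware comparisons above,
# using the round figures from the text.
blue_mountain_watts = 1_000_000  # ~1 megawatt for the computer alone
phone_chip_watts = 5             # roughly the same 3.1 teraflops today
print(blue_mountain_watts // phone_chip_watts)  # 200,000x less power

hard_drive_gb, ssd_gb = 4, 4_000        # 4 GB then, 4 TB now
print(ssd_gb // hard_drive_gb)          # 1,000x the capacity

old_mb_per_s, new_mb_per_s = 5, 5_000   # 5 MB/s then, 5 GB/s now
print(new_mb_per_s // old_mb_per_s)     # 1,000x the transfer rate
```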
I say all this because the enabling technology that’s allowed software developed over decades to finally become a really good consumer tool is the ability to store, access and process an absolutely enormous amount of data very quickly. You can ask ChatGPT almost anything and get an answer based on a sizable portion of the entire written knowledge of humanity in a few seconds. Millions of users can ask it questions simultaneously, and for each user it’s drawing on a database of over 350 gigs. That’s something we couldn’t do without the most recent generations of equipment. The realization that this scale of processing, combined with the last few decades of software development, could produce this result is what triggered the AI boom.
To be continued…