Large Language Models Explained

0:00 Welcome to Energy 101, where we ask the dumb questions. So you don't have to. So what are we talking about today? So I wanted to bring back the show and start with something that's pretty topical,

0:12 which is LLMs, large language models, think ChatGPT, Copilot, Claude, all that. It's kind of something that everyone is aware of, and even the boomers know about. But there is a lot behind how it

0:28 works, maybe the history of it, and what's in the future. It's not just simply the next Google, it's so much more. So I think it would be super interesting to have our own co-worker,

0:45 John, one of our young colleagues. John can really break it down for us. He can explain how he does it as a job and just his understanding in general. Yeah, I hope I can be sufficient,

0:58 sufficiently insufficient, as far as how we break this down so that we don't get too deep. I'm going to try and keep it at a higher level, but of course, you know, I'm an expert via usage,

1:12 not training or academia in any sense of the form as far as that goes, but. So yeah, let's start off with something basic, like describe what LLMs are or even describe like the industry, what you

1:28 do and what the industry is. Really break it down like I'm five. Also, let's not use acronyms, and if you do, explain them. So, large language models:

1:40 I think the best description I've heard or seen

1:46 of a large language model, and what it is, is that it's a statistical

1:49 next-word predictor. It's a model that basically predicts what the next word or words should be based off of a bunch of inputs.

1:58 As for AI models as a whole, there's a lot of confusion around AI, because people use AI and machine learning and language models and a bunch of other stuff all interchangeably. But there are

2:08 differences. But ultimately, all of them are kind of based off of training a model using some kind of data to then predict outputs, essentially, right? So you can, a traditional machine learning

2:22 model would be like, I give it a bunch of production data, keeping it oil and gas relevant. I give it years worth of production data across the same field, and then I train the model based off the

2:34 historic data to predict what should come next. That's essentially what large language models are doing, but just on language instead of numerical data. So traditionally, most people think of

2:46 machine learning. They think of doing that on numerical data, so statistics and all kinds of additional stuff that you do on that. But it's, you know, trying to predict outcomes or outputs of

2:58 machines, sensors, et cetera. With language models, you're predicting, based off what the user asks, what the answer should be, essentially. And so, what Google and Anthropic and OpenAI have

3:12 done is they've gone out and they've essentially scraped the internet, including videos, which a lot of people don't think about, but there's a huge wealth of information on YouTube, if you can

3:21 transcribe

3:24 all of those videos,

3:26 to be able to train these models to answer questions, essentially. And they do that, as with all things under the hood with AI, through math. There's a ton of math that I

3:39 can't even begin to describe, 'cause I don't understand it, behind all of these things. And so ultimately it all boils down to numerical computation and statistical prediction, that sort of thing.
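To make the "statistical next word" idea concrete, here is a toy sketch in Python. It is not how real LLMs work internally (they use neural networks, not lookup tables), and the training text is made up, but it shows the basic move: learn from data which word tends to follow which, then predict.

```python
# Toy next-word predictor: count which word follows which in some sample text,
# then return the statistically most likely continuation.
from collections import Counter, defaultdict

training_text = "the well produced oil the well produced gas the well was shut in"
words = training_text.split()

# Build a bigram table: for each word, count the words seen right after it.
next_word_counts = defaultdict(Counter)
for current, following in zip(words, words[1:]):
    next_word_counts[current][following] += 1

def predict_next(word):
    """Return the most frequent next word observed after `word` in training."""
    counts = next_word_counts[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("well"))  # -> "produced" ("produced" followed "well" twice, "was" once)
```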

3:54 So a language model is just taking what you asked it and then using those words to basically generate an answer from the knowledge base that it was trained off of, AKA the internet. So the first

4:07 question is kind of obvious. It's like, well, how is that different from Google? How is this better? 'Cause it

4:15 sounds like that was very detailed, but it kind of just sounds like, oh, I guess that's what I just assumed Google does. So what is actually happening that's

4:25 different? Yeah. So, okay, so a large language model is trained to give you an answer. So I think that's a very important concept, because it's not trying to give you the right answer, it's trying

4:33 to give you what it thinks is the answer, whether it's right or not. And so... In our language, like, how we would answer a question, yes, rather than Google giving it in

4:47 just, like, articles. Well, that's a thing too, right? Like, yeah, it truly does kind of change the way the internet looks in the future, because, you know, we,

5:00 I guess my generation, our generation, has gotten very accustomed to: you go Google it, then you start looking through the results, then you might, you know, go to two or three sites to come

5:11 to your conclusion. The concept of language models is you ask it a question and it just immediately gives you the answer. Or even better, it can source it; some of them do, some

5:22 of them cite sources, and you can ask for the source. So it's kind of just the same steps in a different order, a better order in my opinion, which is why it's changing things so radically, right? Like

5:32 that's the, that's the whole idea. Um, the scary part about that, you know, the flip side of that coin is that it's all based off how they train their models and what data they use to train their

5:41 models. And, you know, now you've got a bunch of AI-generated content that's flooding the internet. And now you have this potential for this weird infinite loop of, AI is generating content that

5:50 goes onto the internet. The AI is scraping the content that the AI generated. That goes into the training sets of the models. And we'll see how this all shakes out. I'm glad I'm not doing any kind

6:01 of foundational model training and stuff, but it's a very interesting thing. But it does take away that like human discernment element of just going in Google searching and then looking through

6:14 different sources and stuff There's also, to your point, the Google -

6:20 a lot of people still don't think of it this way, but you've got sponsored ads and you've got - there's ways to kind of game, I guess, Google, in a sense, to make your stuff show up higher, even

6:31 if it may or may not be the exact answer. And that's absolutely coming on the language model side. I would not be surprised at all to see them introduce ads and things like that into the language

6:44 models or, you know, a lot of them, I've noticed, are kind of getting into this retail space. I think Perplexity added, like, Stripe. You can buy shit directly off of Perplexity now. And so you

6:56 start seeing like it's integrating all of these different data sources into one place. And so on one hand, it's nice. Like I was looking at, we just moved the offices and I was getting a new

7:06 monitor and I asked perplexity. I was like, oh, go tell me the best monitor that, you know, under this price, this big, et cetera. And it gave me a couple with their Amazon links and all of

7:16 that stuff And so I was like, oh, this is a nice place to kind of start.

7:21 And then I could compare them, right? Like it could pull the specs and actually compare them instead of me just clicking through the different links and then having to do that myself. But yeah, it

7:31 definitely is going to change the way that people use the internet, both for good and bad, right? So theoretically, you're getting answers faster. I mean, your kids can use it, like it can

7:43 teach you so much if you're just looking for general information. You also, in my opinion, have to always take that with a grain of salt if it's not citing a source or telling you exactly where

7:54 it's from, because you don't really know. There's a really good paper out there called "ChatGPT Is Bullshit." And it just talks about how the definition of a bullshitter is that they don't care if

8:06 the answer is correct. They just care that they give you an answer. Sounds familiar. And it's like, well, it's like, hey, that's an interesting way to think about this because that is kind of

8:15 what the language models do is they're trying to give an answer and not give the right answer. And so they don't have a way to truly know whether it is the right answer or not outside of the training.

8:26 And that takes a ton of time and money. And I can't even comprehend how much effort that takes when you're doing it on the entire internet and all of YouTube, right? Like that's, there's a reason

8:37 they've raised a ton of money, because it takes a ton of time and a ton of people to train these things. So I think there's one more big question before we kind of bounce back and forth, but you

8:49 did great laying out the

8:52 groundwork for everything. Some things you brought up were the future of ads and stuff like that, and it made me think, how do they make money and stuff like that? And, a major question,

9:06 this is called Energy 101, so what do LLMs have to do with energy usage? Everyone kind of has heard something about it, whether you want to call it myths or just not accurate. There's not like a

9:21 defining answer, but it's like, you compare the output of a Google search to a ChatGPT search. You hear about NFTs and

9:31 crypto, all the energy usage that comes with these things that seem just ubiquitous with day-to-day life but are actually super energy intensive. The cost, how are they profiting, how much are

9:42 they spending, is it all just losing money? How is that all tying together with energy? Yeah, so kind of on two ends of that, right? On the front end, to train these models, you need a ton of

9:53 compute, specifically GPU power. This is why NVIDIA's stock is continuing to crush it, because they're the dominant player. They basically have a monopoly, I would say, in that space. But you need a ton

10:08 of compute, both hardware and then energy to power said hardware, to train the models. And then on the back side, every time you ask it a question, it uses compute to give you the answer. And so

10:19 that's called the inference side. And so that's really the biggest driver. But I saw there was an article recently they were talking about, I don't remember the number, but how many millions of

10:31 dollars are wasted on a weekly or daily basis, whatever it is, by people telling GPT "please." Just that extra word, you don't need to tell it please, it's a robot, but just adding that

10:45 one extra word adds that much extra to the cost of processing. Is this your PSA to tell people to stop being polite to ChatGPT? Because they're coming

10:58 for you first, AI, when they're taking over. Well, it's funny. There were also articles recently, like this week or last week, that were talking about how people have started saying that they

11:08 give better answers when you threaten them. And so it's like, well, I don't know, I don't want to be the one threatening the AI models when they take over and Terminator becomes real life. But

11:21 no, that's a... so yeah, it's an interesting element there. But you, I mean, you've already seen it, Microsoft is buying power generation. You know, they need more data centers as it is. And then

11:32 now they're adding additional compute, call it horsepower, with all the AI stuff to those data centers. And so that's really the main tie to energy: it is very energy-demanding

11:45 Um, to run these things and, you know, when you've got millions and millions of people all over the world, using them every single day, multiple times a day. Yeah. So this all comes from

11:56 computing power. That's kind of a big, overlooked thing: in the last ten years, GPUs blew up out of nowhere. You know, it was great for gamers, it was great for

12:11 crypto. It was this multi-use hardware, and now it's the reason LLMs are around. Like, I guess there's been this idea forever, for decades,

12:24 like, oh, in the future there'll be something you could just ask a question and it'll answer it. So everyone had this idea, but the tech literally didn't exist. Yeah. Ask Jeeves, right?

12:32 Like, yeah, do you even remember that? Yeah, I did. I remember using it in, like, third grade. It was like the homepage of every school computer. Exactly. Yeah. So it's kind of like a funny

12:44 conundrum when you look at people who, like you, make a living doing this stuff. So before, let's say, 2017, which was a big year where all the hardware

12:57 like really stepped up, all the AI really stepped up. Like, and now you make a living, you and so many others have these huge important jobs. And like y'all have been on standby basically waiting

13:10 for this to happen. And now the hardware and the tech have caught up and now y'all are rolling. It's like, what were you and other professionals doing before this existed? Like, how did y'all create

13:22 this industry from like 2017 where nothing existed to, you know, less than a decade later, everything is like moving at a billion miles per hour? I feel like there's a conspiracy going on right

13:36 now that the minute we start talking about that, they start doing construction outside the office. Yeah, so that's a good secret.

13:43 One big misconception, I think, is that AI or machine learning or any of this stuff is relatively new. It's not; it's been around for decades, all the way back to, like, the '50s and '60s. The

13:56 thing that I've learned about a lot of this stuff is that it's all very

14:01 componentized, to an extent. So it's like, this one foundational algorithm was developed, you know, way back when, and then someone else evolved a different algorithm that did something else, and then

14:13 somebody else came along and made another one. And what a language model is is basically a bunch of them stacked on top of each other. And so over time, it's kind of evolved into what we call

14:24 language models today, but there's been all these different elements of language models historically in different ways. They just were never kind of wrapped together like they are today. And so

14:34 that's that's a big thing. But another big element of this, just like with big data and, analytics and all of the buzzwords that came out around that a couple years ago, that was a, that was a

14:48 need that came about because we had so much numeric data that humans can't, we physically can't process it all at once. And so how do you do that? Or how do you manage that? Well, you get a

15:02 computer to do all the repetitive things that humans can't do. And so that's where

15:10 AI and machine learning originally kind of big data, data analytics, all that stuff spawned out of is, hey, we've got so much data now that we literally can't process all of this as humans by

15:20 ourselves and we need computers to do it. But there's also a chicken and egg situation with that is any kind of machine learning or AI model, you have to have a bunch of data to train them to begin

15:31 with. And so that was another big kind of element of the language model piece: we finally had enough just text-based data that people could go use. And, you know, yes, it's obviously from

15:43 the internet, but then there's things like YouTube, right, where it's like, Hey, if we could accurately transcribe audio data, then we have an immense amount of new text based data that is also

15:56 informative. That's also directly from humans, you know, it's very human based, so to speak. And so

16:03 that there's a couple of things there that I think really helped trigger that But yeah, being able to, you know, accurately transcribe audio is a huge one, right? Like that literally, I don't

16:14 know how much,

16:16 you know, volume wise, YouTube transcripts would be compared to like the rest of the internet, but I have to believe it's a ton of text data. And so you have to have enough data in order to train

16:28 the models. And so it's this chicken-and-egg thing of, like, well, we know we have models, but we just need more data. And that technology to transcribe, like, in the past seven

16:37 years, has really evolved, yeah.

16:41 It's, yeah, like how accurate it is now, how fast it is, versus how long it used to take, yeah, and how much money and how much frustration, because all the words would be wrong and then

16:52 you have to go in and fix them. And yes, yes. And yeah. So now, I mean, it's a very simple API call for me to transcribe a, you know, two-hour-long podcast, and it's 90-plus percent accurate.
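For reference, the kind of "very simple API call" being described can look like the sketch below. This uses OpenAI's hosted Whisper transcription endpoint as one example, and the file name is made up; any hosted speech-to-text service follows the same pattern of sending audio and getting text back.

```python
# Minimal transcription sketch using OpenAI's Whisper API (one option among many).
from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

with open("episode_042.mp3", "rb") as audio_file:  # hypothetical podcast file
    result = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

print(result.text[:500])  # the transcript comes back as plain text
```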

17:06 And so those are the really foundational things there, right? You had to have enough data, both on, you know, the true machine learning side as well as the language model side, before you could truly

17:17 build them to where they're effective enough for people to want to use. And so we've hit that, um, kind of inflection point. And that's why they're all kind of taken off now. But like I said, it

17:28 really evolved out of this problem that we as people have: we can only process so much data at a time as humans, and so how else do we do that when we have thousands or hundreds

17:43 of thousands or millions of documents within our company, organization, et cetera. So we talked about evolution of this in general, but let's be like more specific. So I'm on chat GPT, you know,

17:56 kind of since day one, playing around, and I've seen it evolve and whatever. And shortly after, they released DALL-E, which was a separate entity by OpenAI. So OpenAI owns ChatGPT, and OpenAI

18:09 also runs DALL-E. ChatGPT would be, like, the text prompt and DALL-E would be the image generator. And of course, all these other things were happening and companies were competing. But if I wanted to

18:20 generate an image, I would go to DALL-E through OpenAI. But within less than a year, now you can just go on ChatGPT and text-prompt it there. So how, why were they two different

18:34 entities and now they're in one? No, for sure.

18:39 It's a really creative and smart way of essentially having your users train your data for you. So a really good example is CAPTCHA. Everyone knows what CAPTCHAs are. They're the little security

18:53 verification things that pop up and ask you to pick, you know, where's the bicycle in this picture, or the bus, or whatever it is. But a lot of people don't realize that's Google's; you're training

19:03 Google's computer vision model, and we have been for decades now. They pass it off as security, and there is a sense of security to that. But what's really happening is they're taking images

19:13 that their models have some relative uncertainty about, about what is in the image, and they're getting millions of people to validate the actual image. Never thought about that. That's crazy. That's

19:23 exactly it, people don't think about it. It's so crazy. Oh yeah, well, we can get into all the crazy, scary shit Google's doing another time. But I mean, Google Chrome was literally

19:33 built because they just wanted to track your mouse and all your clicks and stuff. Here we are, everyone uses it. But

19:42 going back, what they do is if you have a specific feature that you want to allow a user to use, and they did the same thing with the code interpreter, the coding agent, or whatever you want to

19:56 call it in chat GPT, when it initially came out, you had a separate dropdown that you had to pick and say code interpreter and then ask it to do whatever you wanted to do with the code or to

20:06 generate code for you. And what happens is they get enough people doing that over time, then they can take all the prompts and questions that people asked to the code interpreter, and they can

20:17 train another model that basically distinguishes your intent from your question. And so because they had, God knows how many millions and millions of data points from their users coming in and

20:30 saying, I know that I want to code something, so I'm gonna select Code Interpreter, they're basically labeling all of those prompts as having code intent. So then they go back, they take

20:40 all of that data, and they train a model that they layer on top of the underlying workings, one that basically just looks at your prompt and then effectively routes it. It's called a router in the AI world,

20:53 but that's all they're doing: they're using the data that the users are generating by using the product to turn around and train the router, so that any time someone asks that type of

21:05 question, it knows to go use the coding agent to answer it.
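A rough sketch of that "router" idea, using scikit-learn as one simple way to do it. The labeled prompts below are invented; a real router would be trained on millions of them and would likely be layered onto the language model itself rather than a separate classifier.

```python
# Train a tiny intent classifier ("router") from prompts that users implicitly
# labeled by picking the Code Interpreter option themselves.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

labeled_prompts = [
    ("write code to clean up this spreadsheet", "code"),
    ("write a python script to parse a csv file", "code"),
    ("fix the bug in this python function", "code"),
    ("summarize this lease agreement for me", "general"),
    ("what is the capital of france", "general"),
    ("draft an email to the landowner", "general"),
]
texts, intents = zip(*labeled_prompts)

router = make_pipeline(TfidfVectorizer(), LogisticRegression())
router.fit(texts, intents)

# A new prompt comes in: guess its intent and send it to the right specialized model.
new_prompt = "generate python code to merge two spreadsheets"
print(router.predict([new_prompt])[0])  # most likely "code", given the examples above
```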

21:19 And you know, the thing with language models, or especially the RAG-type models, is that they can get confused very quickly if you throw a ton of data at them. But if you can set up pre-filtering and different things that reduce the data it's searching across to a very narrow, focused, scoped kind of

21:32 subset, they're much more accurate. And that's kind of what is happening there. But you ultimately just have specialized models, right? Like DALL-E is specialized to generate images, Code

21:42 Interpreter was specialized to generate code. We're gonna have more and more of those that pop up. But it's just kind of a cheat code to getting labeled data that you can use to train your models

21:55 to make them even better. I feel like, as a product person, I respect that so much. Well, yeah, it's the easiest way to do it, and it also

22:05 prompts the users on how to use it or what to use it for, 'cause it's endless. And you kinda notice that now where it's, there's not really any of those options anymore. You just tell it what to

22:17 do 'cause it has all the data it needs to be able to route, just genius. Yeah, no, even the web search piece, right? It's like, hey, instead of us having to manually go find and train the data,

22:31 let's enable the users to search the web, and then we'll base our new training on their interactions with that data and those sources. And if those sources aren't in our sources, then we'll go scrape

22:42 those sources. And, you know, so it's all ultimately set up to try and make them better, more efficient, smarter. Yeah. We use what's called a RAG model. So explain the difference between an

22:55 LLM and a RAG model. So a traditional language model, or a foundational model, is, again, trained to give an answer based off of the training data that it was trained on. A RAG model, retrieval-augmented generation,

23:12 uses a foundational LLM, a language model, but it only uses it to generate the answer. The way it's generating the answer is you essentially do a search across a limited set of documents,

23:28 essentially the documents that you give it

23:32 It's bounded to just giving an answer from those documents. So you ask your question, it goes and searches, it pulls... there's a parameter in language models called top K, which is essentially how

23:46 many results from the search you wanna pass into the language model to generate your answer. And so how that ultimately breaks down is: you have your documents, we go in and we chunk them, it's

23:58 called chunking, we break those documents down, you can do it by page, you can do it by text length, there's a million different ways to do that. But just for simplicity sake, let's say we have

24:07 a 10 page document, we chunk it by page, so there's 10 different chunks within that document, you go and you search, it goes and it looks through each of those chunks and finds the best match or

24:17 the most relevant chunk that it thinks based off your question. And then it takes that chunk, it passes all of that information into the language model itself, and then tells the language model

24:27 based off of this information that I'm passing you, generate an answer. And there's a lot more to the system prompt and everything that

24:36 goes into that, but that's effectively what's happening. So a RAG allows you to search through a kind of

24:44 private or sub-segmented set of documents or data without having to train a foundational model from scratch.
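A bare-bones sketch of that retrieve-then-generate flow. The chunking is one page per chunk, the "search" is naive keyword overlap (a real system would use an embeddings model), and call_llm is a stand-in for whichever language model API you actually use.

```python
# Minimal RAG flow: chunk documents, find the top_k most relevant chunks for the
# question, and hand only those chunks to the language model to write the answer.
def call_llm(prompt):
    # Placeholder so the sketch runs end to end; swap in a real model call here.
    return f"(answer generated from a {len(prompt)}-character prompt)"

def score(chunk, question):
    # Toy relevance score: how many of the question's words appear in the chunk.
    question_words = set(question.lower().split())
    return sum(1 for word in chunk.lower().split() if word in question_words)

def answer(question, pages, top_k=3):
    # "Chunking" by page: each page is one chunk.
    chunks = list(pages)
    # Rank every chunk against the question and keep only the top_k best matches.
    best = sorted(chunks, key=lambda c: score(c, question), reverse=True)[:top_k]
    prompt = (
        "Answer the question using ONLY the information below.\n\n"
        + "\n\n".join(best)
        + f"\n\nQuestion: {question}"
    )
    return call_llm(prompt)

pages = [
    "Lease A covers Section 12 and expires in 2027.",
    "Payment terms for all contracts are net 30 days.",
    "Lease B covers Section 7 and includes water rights.",
]
print(answer("Which lease covers Section 7?", pages, top_k=2))
```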

24:55 And it's much faster than training a foundational model, it's much cheaper. It's generally very effective for a lot of, you know, just general business stuff where it's like, hey, I know we have

25:08 this, I know we have a contract on these wells, but I don't know what the contract says. I need to find the contract, show me the contract that contains these leases or whatever. And then it just

25:17 gives you the contract with the citation, you can click on the document and there's the actual contract you're looking for and you can read it yourself. You can go obviously much deeper into that,

25:26 but that's generally the concept. A RAG allows you to have basically dynamic data coming in all of the time, whereas, you know, GPT and the foundational models, they have a cutoff date. And

25:40 so every time

25:43 anything comes out after that cutoff date, they're obviously currently downloading, scraping, and training on it, but it takes a very significant amount of time and money to go do that and constantly update it. So

25:53 for a corporation, a RAG is much more effective, much simpler, much more straightforward, because you constantly have new documents and data coming in all the time. But RAGs can also be hosted

26:06 privately, keep your data secure. There's a lot more flexibility to that. And so we at Collide, we have no intention of building a foundational model for anything specifically. We do think that

26:19 there's opportunity to fine-tune some of these models for specific use cases, but... What does that mean, though? Not a foundational model, but a fine-tuned model? Yeah, so the big difference is

26:32 a fine-tuned model basically takes a foundational model and then trains it on a specific thing. So we could take Llama or Mistral or one of these open-source models, download it, and essentially not

26:44 retrain it, but train it even further on a different data set, and then go and use it,

26:55 wherever we want it to. And so

26:59 it also takes time to do that more so than just a traditional rag, which is why it's trickier and not as common. But there are pretty, I think, decent results and papers and experiments out there

27:15 that show a RAG with a fine-tuned model, generally speaking, yields better results than just a plain RAG.
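As a sketch of what "train it even further on a different data set" can look like in practice, here is a minimal fine-tuning loop using the Hugging Face transformers and datasets libraries. The model name, the training file, and the hyperparameters are placeholders, not a recipe; real fine-tunes usually add techniques like LoRA to keep cost and memory down.

```python
# Minimal sketch: further train an open-weights base model on domain-specific text.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "mistralai/Mistral-7B-v0.1"              # any open-weights base model
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token       # many base models ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base)

# Hypothetical file of in-house text the base model has never seen.
data = load_dataset("text", data_files={"train": "field_reports.txt"})
data = data.map(lambda row: tokenizer(row["text"], truncation=True, max_length=512),
                batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-model",
                           num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=data["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # nudges the existing weights toward the new domain text
```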

27:24 But one thing that we will probably not talk about in detail today is that there's a lot that goes into a RAG model. It's not just uploading PDFs to a space and

27:37 magically it's done and I can just talk to all of them. That'd be nice. It would. I mean, doing that on a single document, or two or three documents, is great, right? Like, use

27:47 GPT, Claude, whatever you want for that all day. But when you have thousands or tens of thousands or hundreds of thousands of documents and they're all generally related to each other, they will,

27:57 those models will fail miserably and not yield very good results. And there's a bunch of data stuff on the back end that goes into that, but

28:05 I think that's probably one of my biggest gripes of the industry is that all the marketing stuff is framed to make this all look so easy. And this is also a function of just technology being better

28:17 and humans being better at UI and interface and the psychology behind software. But

28:24 going from a prototype that only I use on a very small data set or number of files to something an entire company uses with all of their datasets is a very, very big jump, and requires a lot more

28:40 kind of effort and intensity to set up. So, I mean, this whole thing makes me think, it's kind of like your own niche ChatGPT, right? Yes, that's essentially what a RAG

28:55 is.

28:57 So like, let's say I'm a camera guy, a video editor guy. I have these super specific questions about cameras, lenses, settings. Like, there's no way ChatGPT knows, oh, if I go

29:08 under the settings on this camera, what this one setting does. Like,

29:14 you know, it's actually gotten really good at answering stuff like that. But I think it's just pulling from community forums, like a Sony camera forum. So wouldn't it be a great

29:25 product to make, like, a camera GPT, a cooking recipe GPT? Do these exist? Is this something that people use? Because I feel like people just still use the blanket, everything ChatGPT,

29:38 Claude, stuff like that. Yeah, I mean, they, I would say, on one hand, you have a bunch of people that are focused on kind of niche, GPT, knowledge bases, so to speak, and then you've got

29:49 the foundational model guys that, you know, seem hell bent on making sure that none of those companies are successful because they're scraping the entire internet, right? They want to be the go-to

30:03 one-stop shop for everything, but

30:07 it's not necessarily, we haven't really seen yet if that can be the case. And so, yeah, I think there's plenty of room for specialty kind of language models. I mean, there's a bunch of companies,

30:19 I think Voyage is one of them, they're hyper-focused on like the legal space, right? So they have like a fine-tuned model for legal, they have an embeddings model for legal, they have all this

30:28 stuff, but it's all focused around legal, right? And it's like, that makes a ton of sense.

30:39 because legal uses a lot of, you know, specific jargon, specific terms that no one outside of legal would ever use. It would be hard to find even... you know, think about the percentage of legal

30:45 documents on the internet versus blogs, right? And so you have to think about it that way, right? And like empirical evidence. Yes. And 'cause there's, we've heard in the last five years, like

30:56 all the goofy lawyers who've gotten in trouble and disbarred for using ChatGPT, but they're kind of low-key ahead of their time, 'cause it's like, well, this should exist. Like, if

31:07 you could feed them all this text data... I mean, that's what lawyers do, right? They just read a bunch of old case studies and then apply them. So

31:18 why is there not a model with all of them loaded into it? I have a question coming up soon about this; like, GPT-3 was trained on billions of words of text or whatever. Like, it

31:30 doesn't sound so crazy to do for lawyers and all these other industries, nursing, and so on. Yeah, no, I mean... Therapy. There are plenty of people doing that. So even, like, the GPTs within Chat

31:40 GPT, right? Like I was testing one the other day for product management stuff and there's a bunch of them and it's like, hey, that's interesting, right? Like just they're kind of, I think of

31:51 GPTs and language models and all that as tools, right? Like they're not, to me, they're not always the source of infinite truth and they're not always right and you always have to check them and

32:03 validate them, but they're tools in a tool belt that you can use to do a bunch of things with. And so, you know, I think that's where, you know, everyone thinks about GPT or LLMs for search or

32:15 getting answers, but there's all kinds of stuff around the data side, the extraction side, the coding stuff obviously has blown up. Tons of people are coding using language models and stuff. And

32:25 so, you know, they're not this mystery. I mean, you know, there's a black-box element to some of it, but they're very powerful, and there are a lot of nuanced use cases that I don't

32:41 think a lot of people are really doing these days or thinking about these days. Yeah, it sounds exciting. I'm excited for the future. I'm also excited to get hit with ads on GPT. That's going to

32:51 happen literally any day now. I am not. All right, let's, let's move it to the on screen segments. This is going to prompt some more questions anyway. So I'm just going to go in order here. So

33:01 I think we're going to start off with some trivia. So I think it'd be fun if like me and Julie or Julie could attempt to answer these first. No, I wrote the answers though.

33:13 Okay. So

33:15 the ding ding. Yeah. The gloves, graphic. I know we're going to go ahead. No, that's actually fun. All right, I'll think about it Here we go.

33:25 All right, Julie. What does GPT stand for? John, do you know?

33:33 I'm going to say, you know.

33:40 I'm

33:44 going to say generative prompt technology. What if I told you it was like Wi-Fi where like it doesn't mean anything? You know what that, like Wi-Fi isn't an acronym. It just, it's like a, it

33:49 isn't like wireless. No, but is this an acronym? Yeah, it's an acronym. Okay. The answer. I'm going to say, no, no, no, no, wait. It's going to be generative. I'm pretty sure she's

33:56 definitely a generative. Generative. Prompt? I don't feel like it's prompt, but I feel like it's prompt. I don't remember. I have the - Generative, something technology. Generative.

34:09 Generative prompt technology doesn't sound right, but - Hold on, don't go yet. Tea could be tokenization. It could be - Text. Generative. Text, awesome. God, I really don't remember anything.

34:20 Prompt text. Okay, here we go. Generative. Pre-trained. Transformer. Pre-trained, that makes sense. What does transformer mean in this context? So a transformer, in a language model context, is the root

34:33 technology that basically established LLMs. Okay. You'll hear people talking about transformers all the time when you get into the LLM space. The transformer is that combination of a

34:45 bunch of different algorithms that people have used historically, all wrapped together in one, and that's what came out in, I think, '17, '18 originally, and that's where everything just exploded.
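For anyone who wants to chase the reference: the transformer architecture comes from the 2017 Google paper "Attention Is All You Need," and its core operation, scaled dot-product attention, is one compact formula that scores how strongly each word in the input should attend to every other word:

\[
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V
\]

Here Q, K, and V are learned "query," "key," and "value" projections of the input tokens, and d_k is their dimension; stacking many of these attention layers, along with the other algorithmic pieces mentioned above, is what makes up a modern language model.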

34:55 Right. All right, question two: what's the term for when a large language model confidently generates information that isn't true or real? Julie? It hallucinates. That's right. We kind of touched

35:08 on this earlier. Yeah. I mean, if you want to just sum this up in a sentence or two, and also the future of it; maybe also, are there models that don't do it as much as others? So hallucinations

35:21 are, again, a function of the fact that these are trained to give an answer, not give the right answer. And so it's gonna give its statistically best-guess answer every time you ask it a question.

35:36 And so, again, that's why you have to take these with a grain of salt. But that's where some of the nuances in the different models come from, right? Like, Perplexity was written by academics. And

35:46 there's some really good podcasts, if you ever care to really geek out, where you can hear their philosophies on all of this stuff. But the Perplexity guys, they were all coming from academia, and in academia

35:56 you cite everything, right? Like, I think he said on the podcast that if it wasn't his own new perspective, it had to be cited, in all of his PhD work, all of that stuff. So I use Copilot

36:08 when I want to see the source, 'cause they always put the source, like, automatically at the bottom. 'Cause for some information, it's like, I don't need an opinion and I don't need help,

36:19 I need a literal answer. Yep, and so that's ultimately what hallucinations are. It either got confused, because there's a lot of related data and it doesn't know exactly what it's trying to find, or it

36:33 may not have the best or the correct answer and it's just giving you this again, the statistically most likely outcome there. One really interesting thing about the RAG setup is that you can turn

36:47 the temperature, which is another parameter in the settings of the language models, all the way down to zero. And the temperature, what the temperature is, is basically its ability to be creative with

36:57 the answer. And so if I pass it a bunch of text and put the temperature to zero, it can only generate an answer from the text that I give it. Whereas, the temperature goes from zero to

37:10 one, so if I made it 0.7, it could take that text and then reword it how it thought it should be reworded to give an answer, put its own words in there, change it around, all of those sorts of things.
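A small illustration of what the temperature knob does mathematically (the word scores below are made up; real APIs just expose temperature as a setting): it rescales the model's scores before they are turned into probabilities, so near zero the top word wins almost every time, and higher values spread the probability out so the model can "get creative."

```python
import math

def softmax_with_temperature(scores, temperature):
    t = max(temperature, 1e-6)                 # guard against dividing by zero at temperature 0
    exps = [math.exp(s / t) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

word_scores = {"contract": 2.0, "agreement": 1.5, "banana": -1.0}   # made-up model scores
for temp in (0.01, 0.7, 1.5):
    probs = softmax_with_temperature(list(word_scores.values()), temp)
    print(temp, {w: round(p, 3) for w, p in zip(word_scores, probs)})
# Near 0, "contract" gets essentially all the probability; at 1.5 the options blur together.
```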

37:22 And so that's one of the, I would say the biggest kind of control elements around RAG versus a traditional foundational model is that you have the ability to tweak the temperature and some of those

37:33 parameters on the back end. But yeah, where I was going with perplexity, their whole thing is, you know, everything has to be cited. And so their concept is if they can find multiple instances

37:45 of the same story or article or topic or subject on the internet, then that validates that it's truth, so to speak, right? And so, you know, you could see that, which is both a good way to

37:59 think about it, but also not, in my opinion, the most, you've got to have a lot of, like, filtering in order to make sure that that works correctly, because, I mean, you see it with politics

38:09 all the time, right, like, both sides are putting out all this content, shitting all over the other one, and so it's like, okay, well, you can find multiple sources of the same thing that is

38:19 probably biased, significantly biased, one way or the other, but that doesn't mean that it's the right answer. And so, yeah, there's a lot of nuance to this. And it's very tricky from that

38:29 perspective to, it's always trust but verify, in my opinion. Wouldn't it be so cool if there was a platform where if it didn't know the answer, it just didn't hallucinate? It just told you I

38:40 don't know. Yeah, wouldn't that be so cool? That's what ours does. Who's that? Collide. Oh, interesting. Anyway, next question

38:51 What do we call it when an AI is asked to complete a task without being given any examples beforehand? An agent? I

38:59 would, my guess would be either an agent or a one shot.

39:05 A zero-shot. So I picked this question out 'cause I came across the term and I was like, oh, that's interesting. Like, how does that work exactly? Yeah, generally speaking,

39:16 when I think of a zero-shot, it's like... so

39:21 Cursor is the coding tool that I use. It's a coding interface that has the language models kind of built into it and wrapped around it, integrated very nicely into it and stuff. And so,

39:34 for example, I was trying to show

39:38 this older gentleman kind of the power of it. And I couldn't think of, like, a good demo to just have it write real quick. So I just told it to give me a map of all the Whataburgers,

39:50 write a script to give me a map of all the Whataburgers in Texas. And it does its thing, that's all I told it. It does its thing, I hit the run button, and it sure as shit spit it out. That's really

40:00 perfect. Did it have orange pins? It did. Was it like an image or, like, an interactive... It was an HTML, whoa, file that I could pull up in the browser, and it was a full-on interactive map with

40:11 all the addresses and everything. It was like, yeah, the icons were branded. That's amazing. And so like that to me is an example of a zero shot, right? I asked it one time, it generated.

40:21 exactly what I wanted. There was no iteration through it. Well, that, that's, I would say becoming a little bit easier, but it's not common unless you have a very narrowly scoped thing that

40:34 you're trying to get it to do. One very explicit thing. All right. Next question. So you actually said this word a lot, so why don't you explain, like we're five, parameters. It seems to be like a

40:44 very universal term that you need to use to explain other things. So once you say "within the parameter of this," I'm, like, already lost. When she said parameters... Yeah. So it's like a setting,

40:53 right? So in the language model, there's a bunch of different settings. Again, for, like, a RAG, there's the top K. So essentially, how many matches from the chunks do you want

41:08 to pass into the model to use for the answer, right? So if I had a hundred chunks and my top K was five, it would only return five of those, every single time I asked a question, to use to generate

41:18 the answer.
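To put numbers on that top K example: the search might score a hundred chunks, but a top K of five means only the five best-scoring chunks ever reach the model. A tiny sketch with made-up scores:

```python
import random

random.seed(0)
chunk_scores = {f"chunk_{i}": random.random() for i in range(100)}  # pretend search scores

top_k = 5
selected = sorted(chunk_scores, key=chunk_scores.get, reverse=True)[:top_k]
print(selected)  # only these five chunks get passed to the model to write the answer
```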

41:20 The temperature is another one. Again, that's the creativity of it. There's a bunch of other ones that I really don't tend to mess with, but they're just all essentially settings around how it

41:33 processes your question. Yeah, I feel like all this stuff's way more explainable, like in infographics, the way they like visualize it is very interesting, so, you know, anyway. I think this

41:45 one's more of a fun question, so answer it like a Jeopardy question, basically: Arthur Samuel's groundbreaking 1959 self-learning program played this classic game, marking one of the earliest

41:55 demonstrations of AI. Chess. Close. Checkers. That's right. In fact, this guy supposedly coined a term... language model? Modeling? What's it called? Machine learning.

42:10 There's no language involved in checkers.

42:13 Is it algorithms? Yeah, it's just algorithms. It's like tic-tac-toe. Right. Yeah, it's just constantly looking at the moves predictions of what it thinks is gonna come next and all the possible

42:23 outcomes. Yeah. Modeling the statistical best next move. Yeah, and you mentioned like, you literally mentioned like the '50s or '60s something earlier about AI. Like this guy in 1959 or whatever

42:35 had coined a term, machine learning back then. It's like that term existed for 60, 70 years and it means something till this day, so I don't know. Yeah, no, like I said, there's a ton of, you

42:48 know, there's a couple like key points that people point to in history, but a lot of it is very evolutionary and it's like, this thing came out and then somebody else developed this thing and then

42:58 another person decided to put those together and it made this and it just this kind of iterative process over time. So it's definitely fascinating to kind of reflect on because before 2015, I had no

43:13 idea what big data analytics any of this stuff was, like I didn't, I didn't have a need to, and so. Right, and now you know everything about it. I do not know everything about it. I just use it

43:25 a lot and have had to learn because I use it a lot. You and everyone else, that's crazy. I got one more question here. If a person read nonstop, 24/7, at an average pace, approximately how many years

43:37 would it take them to read all the text used to train GPT-3? A billion years. Yeah, so this is like the first big model that came out, right? Where it was usable commercially, yada yada. By the

43:49 way, this was 570 gigs of text.

43:53 So I pulled this from a few sources. You were actually doing the math in your head. So I just have a range, basically, like, a number dash number. So let's see if you can get

44:01 in that ballpark. How many years would it take? Yeah, let's go with 1,000.

44:08 1,700 to 2,600 are the numbers I saw, so somewhere in between those. So I don't know. That's crazy.
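For the curious, here is roughly how that trivia range falls out. The token count and reading speed are rough public estimates (GPT-3's training run is usually cited at around 300 billion tokens, with the full filtered dataset closer to 500 billion), so treat the answer as a ballpark.

```python
training_tokens = 300e9        # ~300 billion tokens commonly cited for GPT-3's training run
words = training_tokens * 0.75 # rough rule of thumb: about 0.75 English words per token
words_per_minute = 250         # brisk adult reading pace, nonstop, 24/7

years = words / words_per_minute / 60 / 24 / 365
print(round(years))            # ~1,700 years; counting the full ~500B-token dataset
                               # pushes it toward the upper end of the range
```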

44:16 Where are the GPT models at now, if this was 570 gigs? Oh, I guarantee you it's over a terabyte. Yeah, just of text, and we all know a text file is, like, nothing. Yeah, and that's just the language model. It's

44:28 like, then they have DALL-E, which has all the image stuff, which is completely different and trained off of images, and then they do the video. It's so crazy, right? But yeah. All right. Well, you

44:37 passed a test. I guess I don't know anyway I need you to explain what this is

44:44 I Do not participate in those types of keyboard. That's actually not accurate It's not supposed to have the letters on it or anything on it I don't I haven't taken a good look at him But there's a

44:54 there's a lot of people who have this including people in our office where it's but he doesn't have

44:60 Letters on them. I'm assuming that's it's coding completely bare. Is it a coding? No, it's an ergonomic. Oh, just comfort But I feel like some mods is coding because it has it doesn't have

45:10 anything on it. Well, he's just got hot. It's still a keyboard. I know. Buttons of a keyboard. They just don't have a numbers on them. Does something. Well, he has a smaller keyboard. He

45:22 doesn't have a number.

45:24 His keyboard is wild. Yeah, it is wild. All right. Is there a name? Can we show John's mouse? Because it is also confusing. Oh,

45:36 well, John has

45:38 a very typical or

45:42 but ergonomic mouse. It's confusing. It's a choice. It's a choice Just turned. Yeah. I wanted to make sure my wrists were not getting a carpal tunnel. All right. Two more segments. One is a

45:49 would you rather? I got three. So let's go over to first one. So would you rather just pick left or right?

45:56 If you can only have one. Can I choose? I choose ChatGPT all day. I mean, these are the two big boys, right? So for me, it's probably Claude, because I code more with language

46:10 models than generally anything else. And so the Claude models, at least these last two, have consistently been better than GPT at code in my experience. But I like the interface of ChatGPT a lot

46:25 better. That's the thing. I almost never use the actual interfaces of any of these tools. I'm using either Cursor, which goes through the API, or I'm using the API directly. And so it's like, I never mess with the

46:35 actual interfaces. It's kind of weird, actually, 'cause I'll get on there after like a month or two and I'm like, well, where did all this stuff come from? Like, constantly update it. I just

46:43 use Claude when GPT's down, or like, it's just like not helping me. But I think I could say it, I could say it a reverse. Like, if I just grew up with GPT, like most people did, if I would,

46:55 like, Claude, like every time I use Claude, grew up. That's not the right way. In the last two years, he's been using

47:03 GPT. I feel like Claude is B-Team. If I was four, I feel like if I was four, so joining Claude would be like, totally fine. like I get used to it. You would be totally fine. I don't like the

47:12 colors. Again. And fine. If you're not using the interface, none of this matters. I will say, one of the big differences in Google's model is that it has a million-token context window. So you

47:24 can put a lot more text in there for context. And if you take anything away from this episode, that would be my biggest piece of advice is give it context. Give it sample data sets, give it the

47:37 actual data sets, give it as much information as possible about what you're trying to do so that it understands what you're trying to do because that's how these work. It's all based off the prompt

47:50 that you give it. That's how you're gonna get your answer. So the better the prompt, the more defined, the more information that's in there, the better the answer will get. And so I think GPT and

47:59 Claude are up to, like, 100-ish thousand tokens

48:05 as far as their context goes, but the Gemini one is at a million now. And so you could pass like very large documents into it and have it do all kinds of fun stuff with that. My favorite is there's

48:16 AI for your AI, like prompt engineering. So if you want it to give you a really good prompt to give your AI, you prompt the prompt. So anyway. That's actually the first step in

48:26 most of the foundational models: when they get your prompt, they have a model, like a hidden one, that rewrites your prompt. That's exactly what they do

48:38 would you rather be fluent in this? Oh man. What does this call it again? Python? Python, or would you rather be fluent in Spanish? I didn't know that was a symbol for Python. It's a logo for

48:50 Python. Would you rather be fluent in Python or Spanish? That's a hard, that's really hard. I would honestly say Spanish because of language models. If this was two or three years ago, all day

49:05 taken Python, but now

49:10 I have language models, right? Most of my scripts and code and stuff. You can do the same with

49:17 learning Spanish. Right, but if I'm in Mexico and I'm just talking to someone, I'm not breaking my, I don't wanna be able to break my phone out. I feel like you could, with them, yeah. But it

49:27 would be way cooler if we could just have, and my kids both speak. So you'd rather not touch grass. No Well, no, touching grass is learning Spanish, and traveling to all the places that speak

49:37 Spanish, and being able to have - But you're choosing Python. No. No, no, you chose Spanish. Spanish all day. So you are touching the grass. He's touching all the grass. Or whatever grass and

49:45 Spanish is. I don't know, actually. Well,

49:51 we'll color, say the color. Well, once you choose it, you will magically know. So that is country-southern. Bay of day. Bay of day, as my daughter. Okay, I lied to one more, would you

49:60 rather,

50:03 Would you rather a custom Cybertruck or a free trip on Blue Origin, but Katy Perry has to be there.

50:13 You wouldn't like looking at her? She is hot. I mean, sure, yeah. I think she's funny too. But she's not funny. Oh, she's not funny. I mean, do you not hear the recent coverage of her album

50:22 release? She didn't even look like her. Well, that's my thing. I still don't know that I believe that Blue Origin did anything based off of all the photo shoots stuff that happened on them getting

50:22 out of the

50:34 capsule. But I also have no desire to have a Cybertruck. So you're going to space? So we're going to space. And you're speaking in Spanish. In space, in Spanish. Okay, last thing. Last thing, yes.

50:47 That's right. We're wrapping up on time here perfectly. We have one more little thing. This is like a bang, kill, marry, right? So. Bang, kill, marry? FMK, to put it appropriately. I heard my

51:00 kids playing this.

51:03 Would you rather, the Ghibli, the Studio Ghibli animation, the what's behind a flat image thing, or - I don't understand what's happening. The toy, these are like things that AI does that are

51:22 like current trends,

51:27 and - Is that your family? Which one, that is my family actually, that's me. And then this is Tim from Midnight Marketing. Yeah, I recognize Tim And then there's this, the trend in the middle

51:34 is a TikTok trend where you get a flat image and it generates what's behind it and spins it. And if you know it, it plays a microwave sound effect. Tim looks like Jimmy Neutron. He does go like

51:42 Jimmy Neutron.

51:45 So which one would you kill, which one would you bang, and which one would you marry? So I'm definitely banging Tim. Great looking dude. Good guy. I'm

51:56 killing the middle one because of how bad the leash piece is. Like, if you replay it, the leash is just completely disconnected the whole time. And then, yeah, we'll marry the Ghibli one because

52:11 image generation's pretty dope. It is good. Like, when I did this, my heart warmed. I was like, oh my God, this is so amazing. So it's like, I spent way too much time creating

52:24 all kinds of different versions of my LinkedIn profile in cartoons because of when all that came out Yeah, so Julie dipped, but I think that's just about it, you know? How is that? You like the

52:39 little segments we do at the end there? Yeah, these are perfect, man. I like that you've got some structure and change of pace and all of that stuff built in. It's, I like it. Yeah, all right,

52:50 so I appreciate you coming on and explaining LLMs. I think you used a lot of big words, but you did break it all down. I think this is a great video to send grandma and let her know... I hope so. ...why

53:02 ChatGPT is, you know, controlling her life in the future or whatever. Yeah. No man, I appreciate it. It's been fun. I hope that I kept it at a high enough level. I know I used some jargon,

53:14 but just ask GPT what those things mean; it'll probably give you a pretty good answer. All right.
