

Hear about the “invisible” side of AI—context and context windows, aka the memory of LLMs—in the final episode of our three-part miniseries on Chroniqle™ and AI-ready data. Host Erin Narloch and Fred D’Silva explore what these are, why they matter and how Chroniqle is structured so that it gives verified, cited information with every response.



Transcript:

Erin Narloch  00:11

Hey, History Factory listeners. Today we’re concluding our three-part series on Chroniqle and AI-ready archives with a conversation about context and context windows, AKA the memory of an LLM. I hope you enjoy our conversation, where Fred shares additional insights on how we approach this topic, why it’s so meaningful, and what other systems get wrong about it. So, let’s go ahead and get into it.

 

Erin Narloch  00:51

Hey, Fred, great to have you again, talking Chroniqle and all things AI in this three-part series. I thought, for today’s conversation, we can talk a bit about context, and maybe context windows, as it relates to LLMs and, specifically, Chroniqle. You know, I would love to get your kind of high-level take on: what is a context window, and why is it so important to the efficacy of an LLM? 

 

Fred D’Silva  01:32

For sure. And this is actually a great topic, because a lot of this is invisible to the user, so not many people really know what it is or how it works. Essentially, a context window is your AI’s short-term memory. These AI models are still machines, and they operate on how many words they’ve taken in and can relate to one another. A sentence might have one word, five words, 10 words. So a context window is, essentially: how many words can you fit that the AI is able to remember, and then act on? The best way to describe it is: you have a table in your dining room, and you’re trying to figure out how many dishes you can fit on that table before something falls off. That fall-off point is when you have too many things in your AI’s context window and it’s no longer able to answer meaningfully. It might start acting erratically. It might start repeating words, or hallucinating information you gave it, meaning it’s mixing up one word in its memory with another and conflating the two ideas. So what a lot of these model providers do, say, if you’re using something like ChatGPT, is maintain only a sliver of your conversation history, because these models can only hold so much information, so many words, inside that context window. When there’s too much information, the model starts to not give you the right info back. It might start sounding more like a machine. It might not be as conversational.
And generally, what a lot of people recommend in that case, especially when you have a very long conversation, is to clear context or start a new chat session. With Chroniqle, we wanted to make the information usable, and we also wanted to make sure the information you’re getting is grounded in truth, so it’s verified information cited from the sources it’s drawing on. The challenge is, you have to fit all of that information, from the various articles Chroniqle searched and the various paragraphs it got back, inside that context window, and you have to optimize how much information you’re putting in versus giving the model enough space to answer your question. So it’s always a delicate tug-and-pull: how much information do you keep inside your LLM, and how much do you have to cut away or trim out to make sure it’s still giving you an optimized answer? And that’s a balance basically everyone is playing with, so that, from an experience standpoint, when you ask the AI a question about something from 10 or 15 chats ago, it still has a way of remembering or recalling that information.
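[Editor’s note: the “dining table” budgeting Fred describes can be sketched in a few lines of Python. This is a hypothetical illustration, not Chroniqle’s implementation; real systems count tokens with a model-specific tokenizer, while this sketch approximates with a word count.]

```python
# Hypothetical sketch of context-window budgeting: keep only as much
# conversation history as fits a fixed budget, dropping the oldest turns
# first. Word count stands in for a real tokenizer here.

def trim_history(messages, max_tokens=100):
    """Return the most recent messages whose combined size fits the budget."""
    kept = []
    used = 0
    for msg in reversed(messages):      # walk newest-first
        size = len(msg.split())         # crude stand-in for token counting
        if used + size > max_tokens:
            break                       # the "dish" that would fall off
        kept.append(msg)
        used += size
    return list(reversed(kept))         # restore chronological order

history = [
    "turn 1: a very old exchange " + "word " * 80,
    "turn 2: a medium-sized follow-up " + "word " * 30,
    "turn 3: the latest question",
]
print(trim_history(history, max_tokens=50))
```

With a budget of 50, the oldest turn no longer fits, so only the two most recent turns survive, which is why long chats eventually “forget” their beginnings.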

 

Erin Narloch  06:12

So, thinking of the knowledge base that makes up Chroniqle for different clients, how do we keep context part of the response when we’re talking about, essentially, thousands and thousands of original pages of knowledge? How do we do that?

 

Fred D’Silva  06:38

Yeah, that’s a good question. Essentially, you want an AI model that has enough information in its short-term memory to answer the question you’re asking at that moment, but is also able to recall previous information, or things that might have been discussed earlier, in a meaningful way. One of the things we do with Chroniqle is instruct the AI to always base its answers on the searches it runs against its knowledge base, so that we’re able to meaningfully track the sources and the data chunks being referenced in each answer. The challenge is, when you have lengthy responses with cited information from previous answers or previous queries in the same chat session, if you don’t put a fence around what you’re telling the model to cite, other machines might start citing information from a previous sentence that isn’t fully accurate for what you’re asking. Because it saw cited information from a previous source, it might say, ‘Okay, I can reuse this cited source or this statement I already wrote.’ Without giving too many details about the inner workings of Chroniqle, we basically base context on ensuring that if Chroniqle is going to use a cited fact, it has to first find that fact in its knowledge base. So if you ask a follow-up question about what someone did in the context of the previous answer, it has to search for that information, find it, and then present it as a fact. But it still knows what to search for, because it’s given enough context to know what we previously talked about. It can search the knowledge base again using keywords it knows from its previous searches, but it has to find that same fact chunk or fact source to cite it again.
And it’s basically this delicate balance of: how much information do we keep in Chroniqle, versus how do we summarize a lot of that information and give the model enough of what’s been discussed so far in far fewer words? That’s the yin and yang of keeping what’s used in the context window versus what’s freed up.
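[Editor’s note: the “cite only what you can find again” rule Fred outlines can be sketched as follows. This is a hypothetical illustration, not Chroniqle’s code; the knowledge base here is a plain dict of source id to text, where a real system would use a keyword or vector search index.]

```python
# Hypothetical sketch of citation grounding: a claim may only carry a
# citation if a supporting chunk is re-retrieved from the knowledge base
# in the current turn, rather than reused from an earlier answer.

KNOWLEDGE_BASE = {
    "doc-1947-03": "The company opened its first overseas office in 1947.",
    "doc-1962-11": "The founder retired from the board in 1962.",
}

def retrieve(query_terms):
    """Return ids of chunks containing any of the query terms."""
    hits = []
    for source_id, text in KNOWLEDGE_BASE.items():
        if any(term.lower() in text.lower() for term in query_terms):
            hits.append(source_id)
    return hits

def cite_fact(claim, query_terms):
    """Attach a citation only if the claim is grounded in a fresh retrieval."""
    sources = retrieve(query_terms)
    if sources:
        return f"{claim} [{', '.join(sources)}]"
    return claim + " [uncited: not found in knowledge base]"

print(cite_fact("The first overseas office opened in 1947.", ["overseas", "1947"]))
print(cite_fact("Revenue doubled in 1980.", ["revenue", "1980"]))
```

The second claim gets no citation because nothing in the knowledge base supports it, which is the fence that keeps the model from recycling a citation onto a statement it never verified.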

 

Erin Narloch  10:03

Yeah, interesting. I think of that context window as, like, Dory from Finding Nemo: how do we feed it the most information and maintain that context without losing it? I think the analogy you provided earlier about the table is a really great one. Well, once again, Fred, I always learn so much when we sit down together and talk technology. So, thank you for your time today. I thought that was really great insight into context.

 

Fred D’Silva  10:38

Appreciate it. Thanks, Erin. 

 

Erin Narloch  10:40

Yeah.

 

Erin Narloch  10:46

Alright, and that officially concludes our three-part series on Chroniqle and AI-ready archives. I really hope you enjoyed today’s conversation. And if you haven’t already tuned in, check out the three previous episodes we have in this little mini-series. Alright, thanks for listening, and until next time.
