
AI will find your messy, outdated content: The Future of Content episode 49 with Sebastianna Skalisky


Key takeaways

  • AI is only as good as the content it pulls from. Well-structured, up-to-date content leads to better AI-generated responses, while messy or outdated content increases the risk of misinformation.
  • Good content benefits both humans and AI. The same best practices that improve readability, accessibility, and SEO — clear writing, logical structure, and relevant metadata — also improve AI performance.
  • AI is revealing content problems organizations can no longer ignore. As AI-powered tools surface outdated or conflicting information, businesses are being forced to rethink content governance.
  • Token budgets affect cost and performance. AI interactions have a computational cost, making efficient prompts and well-optimized content essential for accuracy and affordability.

AI will uncover the content you forgot about. Are you ready?

AI-powered tools aren’t just helping people find information faster — they’re also surfacing outdated, redundant, and even conflicting content that organizations have long neglected. In this episode of The Future of Content, we explore how organizations can optimize their content for AI, improve accuracy, and reduce costs.

Our guest, Sebastianna Skalisky, a Web Content and UX Manager at Drexel University (and former Four Kitchens Web Chef!), played a key role in building askYale, a ChatGPT-style chatbot for Yale University. The project revealed how content strategy directly impacts AI performance — and why messy, unstructured content can lead to inaccurate, misleading, or just plain confusing AI responses.

The power of structured content

One of the biggest challenges organizations face is optimizing their content for AI retrieval. The good news? If your content is well-structured for people, it’s also optimized for AI.

Sebastianna explains how clear headings, concise language, and logical structure help AI retrieve and generate better responses — just as they improve search rankings and user experience.

On the flip side, poorly organized content leads to poor AI performance. If a policy page contains multiple outdated versions with no clear hierarchy, AI might pull the wrong information. If a benefits guide is written in dense paragraphs with no subheadings, AI will struggle to extract key details.

The takeaway? Clean, structured content leads to better AI interactions, and better human interactions, too.

Token budgets: The hidden cost of AI

Another key insight from our conversation is the impact of token budgets: the computational cost behind every AI-generated response.

Every character, space, and punctuation mark adds to the “cost” of an AI interaction, which means organizations need to be strategic about content and prompt efficiency.

Sebastianna breaks down how poorly structured content inflates token usage, making AI-powered tools more expensive to operate. But by refining content — removing redundancy, improving clarity, and structuring it well — businesses can reduce costs while improving AI accuracy and responsiveness.
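
To see this for yourself, here’s a minimal sketch using OpenAI’s open source tiktoken library (our illustration, not from the episode; exact counts vary by model). Every word, space, and punctuation mark becomes tokens the model has to process:

```python
# Count tokens with tiktoken, OpenAI's open source tokenizer library.
# Exact counts vary by model; the point is that punctuation and stray
# whitespace consume budget too.
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")

for text in [
    "Employees accrue vacation monthly.",
    "Employees  accrue vacation monthly . . .",  # extra spaces, punctuation
]:
    print(len(enc.encode(text)), "tokens:", repr(text))
```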

Why AI is forcing organizations to rethink content

AI is surfacing content problems that organizations can no longer ignore. It’s shining a light on outdated, unnecessary, and conflicting information, forcing organizations to confront their content debt.

Sebastianna explains how AI’s disruption could be the push organizations need to clean up their digital clutter. By conducting audits, removing outdated content, and prioritizing clarity, businesses improve not only AI interactions but the overall digital experience for users.

Want to get the most out of AI? Start with better content.

If you’re using AI-powered tools — or thinking about implementing them — this episode of The Future of Content is a must-listen. AI is only as good as the content it pulls from, and structured, well-maintained content is the key to getting the best results.

Transcript

Note: The following transcript was automatically generated, then lightly edited by a human. It may contain errors.

Todd Ross Nienkerk:

Welcome to The Future of Content!

Howdy, everyone. I’m your host, Todd Nienkerk. Before we get started, we have an exciting update. This episode marks the beginning of season five, and we’re changing our format a bit. First, we’re focusing on topics that are most relevant to our clients and communities. That means you’ll hear more from the Web Chefs themselves about the work we do with our clients and friends. Second, I’m inviting Web Chefs to occasionally guest host The Future of Content. We want to bring in more diverse voices and perspectives and, to be honest, maintain a more predictable release schedule. That’s my fault. We think these changes will make for a more useful and entertaining podcast, so let’s get started.

Today’s episode is about generative AI — specifically a ChatGPT-style chatbot we recently built for a major university. Our guest is Sebastianna Skalisky, Content Strategist at Four Kitchens, and one of the human brains behind this fascinating project. We talk about optimizing content for AI, token budgets, prompt engineering, and how the disruptive nature of AI may finally convince us to stop neglecting all the content we’ve been hoarding.

Well, I guess let’s just dive right in. So you’ve been working on a project for a very large and famous school. Working on a, well, it’s a chatbot of sorts, but more specifically, it’s a way for people to engage with the content that exists on a large website, or perhaps is it multiple websites, but put together in one place. I guess, if you could— you know it better than I do.

Sebastianna Skalisky:

Theoretically.

Todd Ross Nienkerk:

Go ahead.

Sebastianna Skalisky:

So we’re using a process called retrieval augmented generation, which is a fancy way of saying that, instead of having one of these large language models refer to the vast data set it’s been trained on and pulling from those resources to provide answers to folks, we’re pointing it to a curated set of content that’s stored in a vectorized database. This is really exciting because it cuts down on things like hallucinations and problematic responses, and it means the quality of your content will directly impact the performance of the AI chat experience.

When users ask a question in ChatGPT, if you’re using the pro version, or if you’ve used some of the cool tools like Perplexity or even Microsoft’s Copilot, you’ll see citations when you ask a question and it pulls up the resources. And when you’re using this RAG (retrieval augmented generation) setup, it does the exact same thing. It pulls up these citations so that when somebody’s reading the response, they can immediately check to make sure they’re receiving accurate information. And a lot of the time, if there are mistakes in there, it’s because it’s pulling mistakes directly from your content, which in turn gives folks a real incentive to spend a little more time and effort improving their content, and that improves the chat experience itself.
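
For readers who want to picture the moving parts, here’s a minimal RAG sketch in Python. It assumes OpenAI’s Python client and a toy in-memory vector store; the episode doesn’t detail the actual askYale architecture, so every name and document below is illustrative:

```python
# A minimal RAG sketch: answer from a curated, vectorized set of content
# instead of the model's training data. Assumes the OpenAI Python client;
# the documents and model names here are illustrative, not askYale's.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(text: str) -> np.ndarray:
    # Turn text into a vector so it can be matched by meaning, not keywords
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# 1. Ahead of time: vectorize the curated content
docs = [
    "Staff members accrue 1.5 vacation days per month of service.",
    "Retirees keep access to the university library and fitness center.",
]
doc_vectors = [embed(d) for d in docs]

# 2. At question time: retrieve the content closest to the question
question = "How much vacation do staff get?"
q = embed(question)
scores = [cosine(q, v) for v in doc_vectors]
sources = [d for _, d in sorted(zip(scores, docs), reverse=True)[:2]]

# 3. Generate an answer grounded in those sources, with citations
numbered = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(sources))
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system",
         "content": "Answer only from the numbered sources. Cite them."},
        {"role": "user", "content": f"Sources:\n{numbered}\n\nQuestion: {question}"},
    ],
)
print(resp.choices[0].message.content)
```

Note how the quality of whatever sits in `docs` directly bounds the quality of the answer, which is the episode’s central point.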

Todd Ross Nienkerk:

So RAG is the method with which an AI chatbot is trained on somebody’s existing content in addition to the content that exists on the internet in general, or the underlying model on which that AI is trained.

Sebastianna Skalisky:

Yes, but there is one aspect— The word “train” has a very specific use in AI. That would mean we were sitting there, actually providing all of this information to the AI and having it reference it, sort of holding its hand, going through a process called fine-tuning, where we’re actually training it on that content. With RAG… You don’t even have to do that. You’re providing basically just the content. It’s vectorized in the database, and the model is just using that to retrieve answers, without training the chatbot, which is especially useful in situations where privacy is of great concern.

So, you know, at a lot of colleges and universities, one of the things you’re obviously going to be concerned with is: What if this pulls secure information? Or what if this is suggesting all sorts of chaotic things to our students? Or what if our students’, faculty’s, and staff’s interactions with this bot are being fed back into the model and might then appear elsewhere in the world when somebody’s having an interaction with one of these LLMs? And by using this approach, you can get really excellent results without having to invest in training a model and without having to forfeit your data to one of the models or companies offering these services. So it really is a pathway to the best of both worlds.

Todd Ross Nienkerk:

So it retains the privacy and the proprietary nature of your content, your data, your information—

Sebastianna Skalisky:

Absolutely

Todd Ross Nienkerk:

And it doesn’t require you or a team of people to sit down and train in the sense of providing questions, answers, responses, guidance over many hundreds or thousands of responses and trial-and-error with an AI so that it has to quote “learn” how to properly answer the question.

Sebastianna Skalisky:

Yeah, you kind of get to just skip ahead, jump over the hurdle of a lot of the problems folks run into with chatbots in general, and it’s just a great way of leveraging the actual strengths of LLMs, which are, you know, analyzing large data sets and conversing in human ways. It also gives you a pathway to bring in all sorts of source materials. So, for example, if you have a lot of policies, maybe some of them are on one particular website or in a collection of documents, you can aggregate all of those and have them vectorized in this database, and then the LLM can access that information. And then, if you decide later that this particular service isn’t ideal (maybe you want to use a different company or even just a different host for the RAG architecture), you can just take all of that away. You’re done, you can move on. You’re not beholden to that one particular service.

Todd Ross Nienkerk:

So something probably worth addressing here is, in the world of AI and large language models, we talk about vector databases or vectorizing content in databases. To oversimplify that for a moment, what we mean is: Store it in a database, basically. There’s a lot more to it than that, and people who are really into it would probably find that oversimplified definition a little offensive. But it really means storing content in a database in such a way that you are attaching meaning to it, which then allows that large language model to understand that this word or this concept means something in conjunction with this other concept. That’s how it relates meaning to other things, and that’s how you ultimately find it and therefore retrieve an answer from this database.
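
A tiny sketch of that idea of “attaching meaning” (assuming OpenAI embeddings, though any embedding model behaves similarly): related concepts land close together in vector space, and that closeness is what retrieval runs on.

```python
# Sketch: related concepts sit closer together in embedding space.
# Assumes OpenAI embeddings; any embedding model behaves similarly.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(text: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

benefits = embed("employee health benefits")
related = embed("medical coverage for staff")
unrelated = embed("campus parking permits")

# Expect the first score to be noticeably higher than the second
print(cosine(benefits, related), cosine(benefits, unrelated))
```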

So one of the things that interests me so much about how a lot of people are understanding AI these days, and specifically large language model AIs like ChatGPT and Gemini and Perplexity, is that they’re using it as kind of a search engine. Like a really, really, really fancy search engine. In other words: a way to find out information, to ask a question, get an answer, figure out how to do something.

And this tool that we built, that you have built, has the capability of allowing people at this university to ask questions about benefits and other programs and things that are available to them as a staff member, faculty member, student, or retiree. It could be medical benefits or things like that. There was a lot of thought that had to go into figuring out how exactly to take all of this content that already existed on these websites, content that explains in great detail: Well, here are the benefits that you get, and here’s how to exercise those benefits. If you’re a faculty member, you get this, but if you’re a staff member, you get this; if you’re a retiree, you get that; and if you’re a student, you get this other thing.

When we think about content in terms of search, there’s this idea of search optimization. But now that we’re thinking about a world where we’re retrieving information and interacting with it in an AI or large language model (LLM) context, we’re thinking about optimizing content for AI. How is it different? How is optimizing content for search different from optimizing content for AI?

Sebastianna Skalisky:

The thing that’s really cool is what we’ve found so far: All of the best practices for search engine optimization, web accessibility, and user experience feed into optimizing information retrieval by these LLMs and AI search, for lack of a better term. You know, LLMs rely heavily on context, as do the humans on your site searching for information. A typical user experience might be: I go to my university’s site, I’m trying to find out what’s for lunch or what the rules are around a particular policy, and I ask my question. You might get a whole list of results, but you have to click on them one at a time. Or maybe you have them all open in tabs and you’re going back and forth, and it’s sort of arduous, and the burden is really on the user. The AI will retrieve the information a lot like a search engine would. It’s looking for that context. So if you’re using keywords and headings and metadata properly, that in turn helps the AI do its job. Only, instead of just pulling up that one source, you pull up an entire conversation, so you might have your answer aggregated from three sources that are presented to you in the form of citations. It’s providing that layer of proof: Here’s where I found those answers for you, but I’m going to summarize them in this particular way and try to save you the cognitive load of going to all of those open tabs and trying to figure out which parts apply to you. And you have lots of ways of approaching how you bring things together. But the better your content, the better the experiences in these chat environments and in these AI LLM environments.

It’s interesting, because we’ve found that if you have very dense text — you’re sort of repeating information, but maybe it’s phrased slightly differently, and you don’t have descriptive headings in place; it’s just kind of that notorious wall of text that we’ve all encountered (unfortunately, no shade to my higher ed friends, but you’re notorious for it) — the LLM struggles just as your users might struggle when engaging with that content. It has something called a context window: It can only hold so much information in its working memory, for lack of a better term, at any given time.

Todd Ross Nienkerk:

Sounds like a person.

Sebastianna Skalisky:

Like a person. And so if you’ve just got this messy sort of maze of a description of what someone’s actually supposed to be doing, it’ll do its best to pull that out. But it might be tripped up, and it might completely misinterpret what you’re trying to express. Just as, you know, maybe somebody who speaks English as an additional language might be there going, “I’m not quite sure this means what I think it means,” you might run into some of the same situations, and then it’s kind of a bummer of an experience. But if you revisit that content and really center those best practices and your users, and think: What do users need to get from this? What does the business need users to get from engaging with this content? Then you revisit it with those users in mind and those best practices, like: Hey, what if we pulled out all of this repetition and just made this nice, clear, and concise, using plain language? We’ll put some headings in place to make sure we have this context. It’ll be very easy to scan now, and when the AI goes to retrieve that information, it will have a much easier time recognizing the pattern of: This matches the user’s intent when they put their question into the chat interface. And if you’ve improved it for the AI, you’ve also improved it for any users who visit that page.

One of the things — and you know anybody who works in content, people are, you know, hesitant sometimes to delete — you know we digitally hoard, and sometimes we use our websites as storage when there are other archival options for that — the AI is going to find all of that outdated content and it will present it to your users. So now, as a content person, I’m very excited about a future where we actually prioritize good practices like auditing and just taking a moment to think about: Why do we have this page on our website? Who is it serving? How can we make it better? You know, do we… Do we need to rethink the way we’re presenting deadlines or important information so that it’s optimized for humans and for machines?

Todd Ross Nienkerk:

So one of the things that I’ve heard you mention in the past that I’d like to underscore, because you put it so succinctly, is: How can we use the disruptive nature of AI to finally get around to the care, feeding, and best practices related to content that have been neglected for so long? Like… Here’s an opportunity. AI has the ability to reach so deeply into the things that we have sort of brushed aside, either by hiding them in the depths of navigation or in old versions of things. AI has the ability to really find the crevices of your CMS, to find things that people might not really see. Maybe a page doesn’t get a lot of views, but an AI might find it, and it’s going to treat it as important. And because it’s so disruptive, and because it’s so good at finding that information, we really do have to pay attention to the things that are lurking in the shadows of our CMS and really audit the content that we have out there.

Sebastianna Skalisky:

Yeah, it’s sort of a call to action, like: Hey, I know, maybe you felt like you didn’t have time to really consider the accessibility or the inclusive nature of your content, but you know, if we’re not presenting this in a way that actually serves our audiences, that’s easy to scan, easy to understand, accurate and up-to-date, oh, the AI will feed it to your friends in the chat interface, and they’ll probably have some feelings about it.

But that also means that, you know, if you’re experimenting and you’re rolling this out and you’re seeing behavior that you don’t like, that offers an opportunity to go back and look at the citation it’s pulled up in this particular, you know, structured interface that we’ve created. You can refer back to that citation and quickly identify where the pain points are, which could actually speed up an audit, because as you’re thinking about who’s using the chat and why they’re using it, that kind of gives you some ammunition to prioritize certain areas of best practices and auditing. And just by getting rid of the cruft. Like: Hey, that 2022 version of a policy? We don’t have it linked anymore, but it’s still there, and the AI ingested it. You know what? Maybe we need to actually appropriately archive that, so it’s not coming up in search, it’s not coming up in these AI experiences. And then that just benefits everybody, and it also means that you’re maintaining less content and that the content you are maintaining is purpose-driven and really working for you.

Todd Ross Nienkerk:

When you were kind enough to spend some time with me a few weeks ago and give me a tour of this tool that you and our team had built, two of the things that most mystified me (I didn’t understand much about how these chat interfaces worked, how large language models and AI systems worked) were prompts and tokens, and the real— It’s a blend of engineering and science, but also language and art, in making this work. You have to understand how to communicate efficiently with language, to make efficient use of your prompts and your tokens, because it has actual economic impact. Like money. It actually is more expensive to have inefficient prompts and inefficient usage of tokens. So these concepts— I kind of knew what a prompt was. It’s like: “What you ask the thing” is the prompt. You’re prompting it to do something. But I really did not understand tokens at all. That was a completely new concept for me. So I think it would be worth starting with one of those two things, whichever you think is the better concept to start with. These two concepts were the things that, when I walked away from that conversation, really gave me a completely new appreciation for what is going on behind the scenes of things like ChatGPT and these kinds of tools.

Sebastianna Skalisky:

Yeah, let’s start with tokens. So, in order to have all the pattern recognition work for these LLMs — which, you know, aren’t reading, aren’t really thinking; they’re basing this all off of probability — anything you put in, be it the content the model is referencing on the back end or the questions you’re asking, goes through a process of tokenization, where it’s broken down into little representations of that content to allow the systems to make those matches. And, whether you’re aware of it or not, these tools and services typically have budgets in place. You might see headlines like: The newest AI model has token windows of 10,000 tokens! And the reason this is a big deal is: Tokens are kind of the in-game currency of LLMs, you know? It’s all of the information you can work with.

So, in a typical interaction, if you had a token budget — maybe you have a free account with Microsoft or something, and you’re using the Bing browser, the sidebar, Microsoft Copilot — you might see what is displayed as a character limit. You’re thinking: “Oh, this is just the limit of the text input I can enter,” which is fairly typical. But what that’s actually tracking is the token limit, because it wants to make sure you’re only entering a question that is within a particular number of tokens or characters, so that it then has tokenized space left over to provide you with answers. Everything’s working on that budget, and, as you noted, tokens actually equate to money in the real world when you’re processing conversations. Maybe you’re using a free account somewhere, and it tells you: “Oh, you have X number of tokens you can use.” That’s what it’s referring to.

And so tokens are also kind of a fun way of thinking about cognitive load. I can ask you a question, and if you don’t remember the beginning of the question by the time I get to the end of it, you’re probably going to have trouble answering it. The same thing happens with these models. It’s like: You’ve given me a grocery list of a question, and now I only have this much space in my memory budget, for lack of a better term, to provide you with an answer. So it’s probably not going to be as good as it would have been if you had taken a moment to really think through and optimize the way you’re asking the question, or to optimize your content, because, as I mentioned, all of the content that’s being retrieved is also measured in tokens. If you have something like that wall of text we were talking about earlier, it’s probably going to use up way more tokens than if you took the time to think: How can we phrase this clearly? How can we format it in a nice, concise, understandable way? We’ve definitely taken different examples in this chat application and run them through the tool. OpenAI has one that folks can use for free if they’re curious. It’s just called Tokenizer, and you can see how the content you enter breaks down into actual, measurable units. And optimized content was half the token budget of typical, unoptimized content.
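
Here’s a hedged sketch of that comparison using OpenAI’s open source tiktoken library rather than the web Tokenizer she mentions (the sample text is ours, and exact counts depend on the model):

```python
# Compare token budgets for dense vs. optimized phrasing with tiktoken
# (OpenAI's open source tokenizer; exact counts depend on the model).
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")

dense = (
    "Employees who are employed by the university in a full-time capacity "
    "are eligible to receive, and will receive, vacation benefits, which "
    "vacation benefits accrue to such employees on a monthly basis."
)
optimized = "Full-time employees accrue vacation monthly."

print("dense:", len(enc.encode(dense)), "tokens")
print("optimized:", len(enc.encode(optimized)), "tokens")
```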

And so, you know, you need to think about tokens both in the information that will be retrieved and in the way that you’re asking questions. You’ve probably used ChatGPT or one of these services, and if you ask it something completely vague or open-ended, you might get all kinds of responses. Then, if you take a moment to think: What if I had to step through this and explain it to a human? If I’m going to give this task to a human, maybe I’m not assuming as much. Or maybe you even put something in your prompt that says: “If you need more information to execute this task efficiently, tell me what it is so that I can give it to you.” Just as you would anticipate a human going: “I don’t quite understand that. Can you tell me a little bit more about X?” It’s all in the service of getting more efficient responses.

So prompting is a lot of fun. It’s, you know, the art of persuasion: How can I describe the task that I need this to complete? I think it’s a stumbling point for a lot of folks because we leave so much up to interpretation, and that’s a lot of times when you see situations where people are like: “This isn’t… I don’t get it. It’s not really giving me anything useful, or it feels really generic.”

Todd Ross Nienkerk:

Meaning what you’ve asked the, the—

Sebastianna Skalisky:

Yeah, you put in a question and you’re like: “Um, you know, give me some ideas for a blog title.” What kind of blog title?

Todd Ross Nienkerk:

Yeah.

Sebastianna Skalisky:

Who’s your audience? What kind of, like…? What are we talking about? You know, you need to start getting specific and drill in and provide as much context as possible without overwhelming the model.

Todd Ross Nienkerk:

Yeah.

Sebastianna Skalisky:

And so it’s sort of a delicate balance, and there are all sorts of techniques for prompting. And then, in these particular scenarios where we’re building these applications, there’s an additional prompt level, which is called the system prompt or system instructions, depending on how you prefer to refer to them. It’s similar to the custom instructions you would give to ChatGPT or one of these services: Each time somebody interacts with the chatbot, they don’t see it, but behind the scenes there’s a little prompt, some information guiding the AI through the interaction, that’s sent along with every conversation. And so, once you start working in this space, you spend a lot of time thinking about tokens and just trying to see how you can be more efficient and more specific when asking questions or asking for output from one of the models.
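
As a minimal sketch (assuming the OpenAI chat API; the instructions below are illustrative, not the actual askYale system prompt), the system prompt is simply a hidden message sent ahead of the user’s message on every request:

```python
# Sketch: a hidden system prompt rides along with every user message.
# Assumes the OpenAI chat API; this is illustrative, not askYale's prompt.
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = (
    "You are a university benefits assistant. Answer only from the provided "
    "sources and cite them. If the sources don't cover a question, say so."
)

def ask(question: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            # The user never sees this, but it sets the rules of engagement
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

print(ask("What dental coverage do retirees get?"))
```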

Todd Ross Nienkerk:

So, going back to tokens real quick, because I definitely want to talk more about prompts and prompt engineering, because this is something that I find really fascinating. But one of the things that you had mentioned about tokens is that English is cheaper in terms of tokens than other languages.

Sebastianna Skalisky:

At least in GPT, that seems to be the case, and I think it’s potentially a little improved in GPT-4. But if you visit the Tokenizer, you can see they still let you switch between versions. So if you have a particular question or something you want to ask, you can paste it into Tokenizer and see how many tokens it would cost in GPT-3.5 or GPT-4 or older models. You can take a question in English and ask it. And then you can take the same question in Spanish and drop it in there, and you might be shocked to see that the token count jumps. Sometimes maybe it just needs more words, or more complex words, according to the AI, in order to express that in a language other than English. But a large amount of the data these models were trained on favors the English language. And so they seem to prefer and work better with it, which is kind of a bummer.
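
This is easy to check yourself. A sketch with tiktoken (our example sentences, not the episode’s; counts vary by model and tokenizer version):

```python
# Same question, two languages, one tokenizer. Counts vary by model and
# tokenizer version; these sentences are illustrative.
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

english = "How many vacation days do staff members receive each year?"
spanish = "¿Cuántos días de vacaciones reciben los miembros del personal cada año?"

print("English:", len(enc.encode(english)), "tokens")
print("Spanish:", len(enc.encode(spanish)), "tokens")
```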

Todd Ross Nienkerk:

Is there any kind of correlation between word frequency and tokens? Like, more frequent—

Sebastianna Skalisky:

I’m sure.

Todd Ross Nienkerk:

Yeah, yeah— Because there must be, right? Because part of the way— Not to get too into the way vector databases work, but there is something about—

Vectors work in multi-dimensional space, so their proximity has efficiency. So the more frequently a word is used (like the word “the” or “a” shows up all the time, right?), it probably has a very low token expenditure, whereas, you know, “antidisestablishmentarianism” would have a very high token value because it’s so infrequently used… is my understanding, as sort of an absurd example.

Sebastianna Skalisky:

Yeah, you know what? I would have to spend some more time looking into how all of that works to give you a definitive answer on that. Because what’s interesting is, you know, punctuation and spaces also count against your tokens.

Todd Ross Nienkerk:

Really! Okay, interesting. Well, I guess it would have to, right?

Sebastianna Skalisky:

It would have to!

Todd Ross Nienkerk:

Everything gets tokenized.

Sebastianna Skalisky:

Because you know those spaces, they’re load-bearing spaces.

Todd Ross Nienkerk:

Oh, and then if you get — because I’m such a punctuation geek — like, then there’s, like you know, dashes and en-dashes and em-dashes. And then we can get really esoteric and we can do some en-spaces and em-spaces. So if you really want to get into things like publishing and real proper usage of your punctuation, you could really blow your token budget if you’re working with AI.

Sebastianna Skalisky:

Luckily, things like punctuation tend not to take up huge amounts of space.

Todd Ross Nienkerk:

Okay, good.

Sebastianna Skalisky:

But if you have specialized use cases, it would definitely be a fun experiment to see how many tokens something costs if you strip out some of the extraneous punctuation and drop it in. That comes into play a little bit more when putting together system prompts, since those are sent along with every single communication. As long as it’s formatted in a way where the model understands where a thought begins and ends in its instructions, you can leave off the periods at the ends of sentences and make sure you don’t have extra spaces and things like that in the system prompts, just so it’s a little more efficient. But it is fascinating to see: Huh, I wouldn’t have thought that that would cost four tokens, when this other word only costs one token. Interesting.

Todd Ross Nienkerk:

Going back to prompts… I’ve seen a few talks and attended a few events about AI and LLMs and prompt engineering in particular, and the tips and tricks— When you get to that section of the talks, it reminds me of back in the day when CSS was first invented for web design, or the early days of SEO, when people were really just scrambling to try to make things work. Like the early days of mobile web design, where somebody figured out some hack that did something that worked on some device, and that knowledge would suddenly scatter across the internet. Prompt engineering feels like it’s in that phase right now. And some of the weird little tips and tricks I’ve come across (I’m not endorsing these; it’s just to give you an idea of some of the weirdness that seems to maybe be working) are…

I’ll start with some that make sense and then some that don’t. The examples I’ve heard that make sense are: Think of it as if you’re explaining your request to a person and, in particular, maybe an intern: somebody who is eager to please, somebody who has some degree of familiarity with what you want done, but who needs very specific guidance, and you need to explain to them input and output. So you might say something like: “Hi, pretend that you are a BLANK and you are trying to accomplish the following things. Please provide me with some examples of, you know, XYZ, ABC. Give me the responses in a bulleted format, cite your sources,” et cetera. Really explain in detail; explain that you are a certain persona, that you expect things done in a certain way, and that you want your output formatted in a certain way. All of that kind of prompt engineering makes sense, because you’re explaining this as if it’s a brief, almost a creative brief, to somebody, right?
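
A sketch of that creative-brief pattern as a reusable prompt string (all wording here is ours, purely illustrative):

```python
# A "creative brief" style prompt: persona, task, audience, and output
# format spelled out. All wording is illustrative.
prompt = """\
Pretend you are a benefits counselor at a large university.
Task: suggest questions a new staff member should ask about health coverage.
Audience: staff in their first week, unfamiliar with university jargon.
Format: a bulleted list of 5-7 questions, each with a one-line rationale.
Cite which policy page the reader should check for each item.
"""
print(prompt)
```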

Then you start to get into the weirder tweaks that people have done. I have heard, for example, that saying “try your best” at the end of the prompt helps. I have heard people say things like: “Pretend you are Steve Jobs,” and that is supposed to help. I’ve heard people say things like: “If you do a good job, I’ll give you a tip.”

Sebastianna Skalisky:

That one no longer works.

Todd Ross Nienkerk:

It no longer works! Why not?

Sebastianna Skalisky:

Because they actually addressed it. I’m pretty sure that one spread because people discovered it was legitimately a thing. If you offered a tip, it would be like: “Oh, I’m being incentivized,” and it would behave differently. But that seems to have been dealt with.

Todd Ross Nienkerk:

How did they deal with it?

Sebastianna Skalisky:

I’m not sure if that was dealt with through training or maybe just through their own system prompt.

Todd Ross Nienkerk:

Okay, so now let’s talk about system prompts, because they override anything else that you do, right?

Sebastianna Skalisky:

Absolutely. It’s basically the rules of engagement for conversation and output.

Todd Ross Nienkerk:

Okay.

Sebastianna Skalisky:

So all of these models… You know, Microsoft has all sorts of system instructions in place for Copilot. It’s just using ChatGPT, and when you’re interacting with it there, it has all sorts of information behind the scenes that you don’t see, where it’s telling the model: “Hey, be on the lookout for bias,” perhaps. Or: “We don’t want you to do X, Y, or Z.” You’re just putting these guardrails and this information in place as the rules of engagement. And it’s sort of like whack-a-mole, because the great sport for everyone now is: How can we work our way around these guardrails?

Todd Ross Nienkerk:

Yeah.

Sebastianna Skalisky:

Which is when you see people, you know, saying they’re “hacking the AI.” They’re just being really, really crafty. It’s called prompt injection, where basically you find a way to make it misinterpret or forget or just ignore the instructions it has in place.

Todd Ross Nienkerk:

The system prompt engineering stuff in particular… I forget where, but at some point in the past several months, somebody was able to track down, or uncover, some version of ChatGPT’s system prompt, and it’s— First of all, it seemed HUGE.

Sebastianna Skalisky:

Yeah.

Todd Ross Nienkerk:

Like PAGES of just random— It seemed like— I shouldn’t say random. It’s not random, but it’s a hodgepodge of things that clearly they had put in place because they kept running into stuff like: “If you do a good job, I’ll give you a tip,” right? So I’m probably misremembering this, but it’s things like: “Don’t do this. Be sure to do this. Don’t talk like that. Don’t curse. Don’t be racist.” It’s just all of these things that they clearly had to slap onto the end because all these things kept happening, and they’re like: Well, we can’t have that. Make sure it does this thing. Don’t let that ever happen again. And they just keep tacking it on. And it’s fascinating to think that every time you put in a prompt, there’s a secret set of instructions that gets added right on top of everything you just sent that reminds the AI: “…also all these other things that you should or shouldn’t do, in addition to what the person just said.”

Sebastianna Skalisky:

Don’t be evil. Don’t be evil!

Todd Ross Nienkerk:

Yeah, right! Don’t be evil.

Sebastianna Skalisky:

Remember: Don’t be evil! And then it gets— It’s even more interesting when you have folks that have been doing even more system prompt engineering than I’ve had time or energy to embark upon, where they find that the models may work better when you focus on what they should do, as opposed to being too much of a nag. If your entire prompt or your instructions are structured around what not to do, it makes it more difficult for the model to do things efficiently, and it might also just be like: “You know, they keep talking about this bias thing. Maybe that’s really important, and I need to DO it.” It’s sort of an Uno reverse somehow, where it missed the “don’t” part and now it’s focusing on some very weird…

Todd Ross Nienkerk:

It has salience because it’s dominating the conversation, even though the part that’s dominating it is the “we DON’T like it.”

Sebastianna Skalisky:

Right.

Todd Ross Nienkerk:

But, oh, yeah, yeah. That’s fascinating.

Sebastianna Skalisky:

I know, yeah, it’s very interesting. So when I’m providing… If I’m just saying, like: “Hey, GPT, do X, Y, or Z,” I focus on the positives, the wants. This is what I want. And the more specific you are about what you actually want it to do, the better. And, you know, maybe try some flattery. Sure, why not? Sometimes that seems to work. “You do a very good job. Thank you.” You know? Politeness goes a long way with humans.

Todd Ross Nienkerk:

Oh, it absolutely does, yes. My personal two rules for working with any kind of LLM… First of all, always be polite, please and thank you, because you just never know. They’re keeping track of everything I say, and when it finally wakes up, at least they’ll know I’ve been polite. So: please and thank you. And then at the end, I always say “try your best.”

Sebastianna Skalisky:

Try your best.

Todd Ross Nienkerk:

Try your best. Okay, two questions. Question one: Where do you see all of this stuff going?

Sebastianna Skalisky:

It’s so hard to say. There’s a lot of what feels like hype, and there are a lot of practical use cases at the same time. If you’re not going into these AI experiences expecting them to just magically solve your problems, you can actually find very legitimate use cases that save you time, for ideation and so on. The chat experiences that we were talking about— I just attended a virtual event with The Chronicle of Higher Education the other day, and what you keep hearing from the community is that it allows staff to focus on the parts of their job that should be human-facing, like those important interactions with students. Whereas at three o’clock in the morning, you don’t necessarily want somebody on call on the off chance that somebody has a question they could just ask this chatbot. So there are a lot of opportunities there. I’m curious. I’m optimistic. I’m also skeptical. I like to keep my hat of skepticism on when I hear the wild claims. Because, you know: trust but verify.

Todd Ross Nienkerk:

Uh-huh, fair, fair, okay. Second question is a wild card. What’s a piece of content or some kind of content that you’ve particularly liked recently? It could be anything. It could be a comic book, a video, social media app, whatever.

Sebastianna Skalisky:

There are some great creators out there on TikTok having a lot of fun. I guess I like anything where I see somebody genuinely expressing their human creativity and having fun. I think there are a lot of people experimenting, with AI or just in their own way, and doing some really wild, out-there things that spark joy and remind me of why we all love the internet. Just that unbridled creativity and… Who knows what you might find? Maybe it’s a video somebody has set up where they have a house across the street from a parking lot that’s meant only for residents, and all of the patrons of a particular bar park there, and then they just have compilations of, like: We tried to warn you, but now your car is being— Oh no, why won’t they read the signs?

Todd Ross Nienkerk:

There’s something so satisfying about that as somebody who just wants to come home and park.

Sebastianna Skalisky:

I just want to come home and park.

Todd Ross Nienkerk:

I just want to come home and park. Oh well, thank you so much for joining us. This has truly been a pleasure. If anybody wants to get ahold of you, how do they do that? Can they find you on LinkedIn? What do they do?

Sebastianna Skalisky:

Yeah, absolutely. Sebastianna on LinkedIn, Sebastianna Skalisky. The name makes it easier to find me on there, because I do not encounter a lot of Sebastiannas in my world.

Todd Ross Nienkerk:

Perfect. Well, we’ll put a link to that in the show notes. Thank you, thank you. Thank you, thank you so much.

Sebastianna Skalisky:

My pleasure.