Fandom, AI, and Chainlit: How I Built a Chat Experience for Novel Lovers

Abdul Hersi

28 Nov 2024 • 13 min read

Hey everyone, it’s me again… When I first started this blog, my goal was to be consistent and share my interests along the way. I’ll admit, I’ve been anything but consistent—but hey, better late than never, right? And honestly, I got a little push from Paul Graham’s latest essay, “Writes and Write-Nots.” It’s a great read, and it got me thinking that I definitely don’t want to be a “Write-Not.” (It’ll make more sense if you check out the essay.)

On a lighter note, if Beyoncé has the Beyhive, what do we call the folks waiting for a Paul Graham essay drop like it’s a new album? Just something to chew on.

First, what did I develop, and why? As the title suggests, I've developed a chat experience for novel lovers. More specifically, it's an opportunity for fans of a specific novel to chat with their favourite characters in an immersive experience. It feels like a rite of passage for anyone developing with LLMs to start by creating a chatbot. Usually these chatbots are meant to let users chat with a specific set of information or a knowledge base. However, after stumbling across some TikToks from fans of a specific novel, I was surprised by the level of fandom they had. These people were crying, screaming, confused, and upset, but more importantly they were super immersed in the novel and wanted to share the emotions they felt after reading specific chapters. I noticed they would also constantly ask the author for alternate endings and for answers to hypotheticals that some authors wouldn't want to answer (some authors don't want to undermine their carefully created work). So I thought to myself, what if I allowed fans of these novels to chat with their favourite characters? I would gamify it a bit, of course. For example, this fictional character might appear on a late night television show, and the user would get to be the host. Or maybe this character got himself into some trouble, and you're supposed to calm him down before he makes a mess of things. How well do you know this character? Do you know what makes them tick? Can you prove how much of a fan you are relative to others? It seemed like something interesting to tackle, so I went ahead and got started.

First things first, tech stack. After a bit of research, I stumbled across Chainlit. It's an open-source Python framework that simplifies the development of scalable conversational AI applications. Perfect. It has a pretty active developer community, with a helpful Discord group. So I got started right away and had a simple chatbot demo running in minutes. It does a lot of the heavy lifting on the front end, allowing me to focus on the meat of the project. Chainlit does give you the ability to modify a few things on the front end with some custom JavaScript and HTML, but there are a lot of limitations on customization.

Since Chainlit applications are publicly accessible by default, I had to think about authentication. I used Google OAuth as well as Auth0 to manage users who wanted to create new accounts as opposed to using their private Gmail accounts. As for my choice of LLM, I stuck with OpenAI's GPT-3.5 and later GPT-4. I'll go into a bit more detail as to why I made these specific choices.

As for my database, I chose to go with Supabase this time. I've used Firebase plenty of times for different projects, but I've heard many good things about Supabase, so I decided to give it a shot. For now, I decided not to save users' chats if they exit the application, so the only things I needed to save were basic user information, some login info, whether they are a paid user and when their subscription would expire, and the number of tokens they used. I figured that would give me a basic metric to see how much a user was using the application.

Lastly, I used Stripe as my payment provider. I've been hearing a lot of buzz on Twitter about another option called Lemon Squeezy, but I decided to stick with Stripe. No real reason, I just felt more comfortable using a more recognizable company.

Going into this project, I saw a clear challenge, which might also be clear to you guys. This isn't going to be a regular chatbot application, where you can have the LLM answer questions based on a specific knowledge base that you provide using RAG. This chatbot has to represent the character from the novel as closely as possible. It has to completely embody this character in terms of personality. It would have to remember relationships they had with others, remember important moments in the novel, and be able to answer questions about the novel just like a fan would. That might sound straightforward, but in fact it's not. Let me give an example of how an LLM chatbot would work if you just wanted it to answer questions on a specific set of data, and not to represent a specific character from a novel. If you were talking to an LLM and wanted to ask it questions about a PDF that contains up-to-date stock market information, you might find yourself asking questions like the following:

Do you know what the current stock price of AAPL is?

What happens here is that the user's query gets vectorized, meaning it is embedded as a point in a multidimensional space. Meanwhile, the PDF that contains the important information is also vectorized and stored in some sort of vector database. This basically means the PDF is broken up into smaller chunks and each chunk is vectorized. A vector search technique is then used to find the chunks in the database that are most closely related to the vectorized query. The chunk vector that is closest to the query vector is identified, and the original text of that chunk is returned so it can be used to answer the original query. So walking through this example at a higher level, when the user asks for the current stock price of AAPL, the portion of the PDF that relates to this question the most might include a description of today's AAPL stock price. There might be other portions of the PDF that also describe AAPL's price, but the one most closely related to the query might be a portion describing "today's" price, or the "recent" price. That portion gets returned, and then we can simply tell the LLM to answer the user's question based on the text retrieved from the PDF. Pretty simple, right? Now let's try replicating this with what a user might ask an LLM embodying a character from a novel:
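To make the retrieval step above concrete, here is a minimal sketch in plain Python. It is not the code from my application: the bag-of-words `embed` function is a toy stand-in for a real embedding model, a plain list stands in for the vector database, and the function names are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector.

    Assumption: a real system would call an embedding model here instead.
    """
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def chunk(text: str, size: int = 12) -> list[str]:
    """Split a document into consecutive chunks of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks whose vectors are closest to the query vector."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Ask the LLM to answer using only the retrieved text."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"
```

In use, you would run `chunk` over the PDF text once, then call `retrieve` with each user question and pass the result of `build_prompt` to the LLM.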

Hey Marcus. Do you regret not telling your best friend about your secret sooner?

This might be very difficult for an LLM to answer. It requires the LLM to reflect on a fictional character's emotional journey and their motivations within the story. The chatbot needs to generate an answer that aligns with the character's internal struggle, personality, and storyline context. The LLM has to understand the nuances of the character's emotions and convey them authentically, while also staying consistent with the character's established traits and relationships. It's more about creating a realistic emotional response than retrieving factual information, which makes it quite complex. If we were to use the same technique as in the previous example, which portion of the novel would be returned as highly related to the query? To answer this question, we would need to maintain context, keep traits consistent, and understand nuanced interactions, which is more akin to role-playing or acting. It requires a deep, almost "alive" presence. Context comes from many different parts of the novel, and together those parts provide the complete picture needed to answer a question of this nature. This simply goes beyond retrieval augmented generation (RAG).

So the million-dollar question is: how can we have the LLM answer questions consistently and accurately, especially when asked about things that occurred in the NOVEL? Is it even possible? To be honest, I'm not sure. One thing I am sure of is that when my users ask questions that are specific to the novel, the LLM hallucinates quite a bit. The information I pull from the novel isn't always enough to answer the questions. Novels have too much depth and history, and RAG is simply not enough. Also, all it takes is one answer that is far from accurate for users to immediately lose interest, and I can't have that happen. So I decided to limit what the LLM would talk about (for now). I decided to create scenarios that are completely new and not directly related to the novel, and to set specific parameters for the LLM. This would allow me to focus on creating an LLM that embodies the character from the novel in terms of feel and personality. I would focus on getting the personality on point, and not on making sure the LLM can answer any question about the novel. For example, let's say I did this for Harry Potter. The idea is that you would be able to chat with him and the experience would be believable, but asking him questions about the novel would be the sticky part. The scenario-based use cases I create would steer the user away from this. I would also guide the LLM not to go into detail when asked questions about the novel. The goal of the user is to participate in the created scenario and attempt to achieve a specific goal. For example, I might instruct the LLM to participate in a late night talk show with the user being the host. The goal might be for the user to get the LLM to reveal a deep dark secret never heard before. Another scenario might be a coffee date with this specific character where you discuss modern day issues. The goal was to steer the user away from discussing things that are specific to the novel.

The reason users might find this engaging is that even though they aren't talking about the novel, the LLM would ideally embody the characteristics, tone, behaviour, and attitude of the character in the novel, making it feel like the character was brought to life.
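As a rough illustration of the scenario idea, a system prompt could be assembled from a few fields like this. It's a hypothetical sketch, not my actual prompt: the `Scenario` fields, the `build_system_prompt` helper, the wording of the instructions, and the example traits are all made up for this post.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    setting: str    # where the conversation takes place
    user_role: str  # who the user plays in the scenario
    goal: str       # what the user is trying to achieve


def build_system_prompt(character: str, traits: list[str], scenario: Scenario) -> str:
    """Combine the character's personality with the scenario rules."""
    lines = [
        f"You are {character}. Stay in character at all times.",
        "Personality: " + ", ".join(traits) + ".",
        f"Setting: {scenario.setting}",
        f"The user plays: {scenario.user_role}",
        f"The user's goal: {scenario.goal}",
        # Guardrail from the approach above: avoid detailed novel questions.
        "If asked about specific events from the novel, answer briefly and "
        "steer the conversation back to the current scenario.",
    ]
    return "\n".join(lines)


# Example: the late night talk show scenario, with invented traits.
talk_show = Scenario(
    setting="A late night talk show",
    user_role="the host",
    goal="get the guest to reveal a secret never heard before",
)
prompt = build_system_prompt("Marcus", ["guarded", "loyal"], talk_show)
```

Keeping the scenario separate from the character description means the same personality can be reused across the talk show, the coffee date, and any other scenario.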

How Was This Achieved

Since I decided to focus only on having the LLM embody the character from the novel, the question was how I could best achieve this. My first approach was to include as much information about the character as possible in the system prompt for the LLM, which surprisingly worked pretty well, but there was definitely room for improvement. The character didn't feel quite convincing enough. The next option was to fine-tune the model. OpenAI has some pretty straightforward instructions on how to do that, which is a bit simpler than fine-tuning some open source models. To fine-tune, you have to prepare a dataset that demonstrates conversations similar to the ones you expect the LLM to have. You can find more details on the structure of the dataset on OpenAI's website, but it essentially has the following structure:

{"messages":[
{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, 
{"role": "user", "content": "What's the capital of France?"}, 
{"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}

If you look at the fine-tuning snippet above, you'll notice there are 3 roles: a system, a user, and an assistant role. The system role defines the behaviour and personality of the LLM by providing a prompt that describes its role. The user role represents the individual interacting with the LLM, asking questions or providing input, and the assistant role is the LLM itself, responding to the user's input based on the instructions given in the system prompt. To get the LLM to embody the character, I did the following: I iterated through the novel, and at any point where I came across the name of the character, I would take 150 words before and after that location in the novel and save that excerpt in another file. After doing that for the entire novel, I was left with quite a few excerpts. I then manually created as many fine-tuning pairs as I could, like the example above, from each individual excerpt. I made sure to keep the system prompt consistent, but the user and assistant content was specific to the interactions in the excerpts. This way I would have plenty of interactions that the character participated in. To simplify this process, I dabbled with a few different LLMs to help me create these. Providing the excerpts along with instructions on what the fine-tuning pairs should look like was almost good enough, but it required a lot of manual intervention to clean up. Once I had created all my fine-tuning pairs, I used them to create my fine-tuned model, which would eventually be used in my application!
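The excerpt step described above can be sketched roughly like this. It is a reconstruction under assumptions rather than my original script: the function names are made up, matching is a simple word-level comparison, and excerpts will overlap when the name appears several times close together. The second helper writes one training example per line, which is the JSONL layout OpenAI's fine-tuning docs describe.

```python
import json
import re


def extract_excerpts(text: str, name: str, window: int = 150) -> list[str]:
    """Collect `window` words before and after every mention of `name`."""
    words = text.split()
    # Match "Marcus", "Marcus," or "Marcus's", but not a longer word.
    pattern = re.compile(rf"\W*{re.escape(name)}\b")
    excerpts = []
    for i, word in enumerate(words):
        if pattern.match(word):
            start = max(0, i - window)
            excerpts.append(" ".join(words[start:i + window + 1]))
    return excerpts


def to_jsonl_line(system_prompt: str, user_text: str, assistant_text: str) -> str:
    """Serialize one fine-tuning example as a single JSON line."""
    record = {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
            {"role": "assistant", "content": assistant_text},
        ]
    }
    return json.dumps(record, ensure_ascii=False)
```

The user and assistant text for each line still has to be written from the excerpt, either by hand or with an LLM's help followed by manual cleanup, while the same system prompt is reused on every line.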

Bumps Along The Way

Getting a quick version 1 of this application running with Chainlit was pretty simple. It was up and running in essentially minutes. However, I really started feeling the struggles of platform dependency. Updates to APIs or features introduced breaking changes several times. Even worse, sometimes certain features weren't available, so I had to do some unconventional things to achieve certain goals.

There was this one time when there was a bug in the authentication process that wasn't going to get an official fix, at least not immediately. This bug was specific to the Auth0 integration, which wouldn't log users out completely unless they cleared their cache. Since I didn't have complete control of the stack, I had to create a workaround with some custom JavaScript, which was pretty clever if I do say so myself. It took me a few sessions to figure out, but ironically the fix got rolled out a month or so after I figured out a workaround. It was an interesting experience though, especially since there were others experiencing the same issue and I was able to provide a solution on the Discord channel. It was great to see how much others appreciated a solution to a problem they were having trouble figuring out. I guess that's what open source software is all about!

One thing I do enjoy about developing an application from start to finish is that you get to touch every part of the stack, and you learn something new every time. I deployed my application on GCP, and I'm not sure if it was related to the framework I was using, but I noticed my website was pretty slow, especially when first loading it. I decided to take my time diving deep into a rabbit hole on how deployment works on GCP, to get a better overall understanding of best practices. The solution was pretty simple: my issue had to do with cold starts. A cold start occurs in Google Cloud Platform when a new instance of a Cloud Function or Cloud Run service handles its first request, resulting in a longer response time. To fix this, all I needed to do was set a minimum number of instances. This helped avoid cold starts, but I was billed for the idle time of those instances, which was definitely worth it for the performance boost.

Decisions Decisions Decisions...

When I was working on this project, I had a thought: "To charge or not to charge". Well, not exactly. I knew I wanted to charge, but I wasn't sure if I should start charging right away. I like the idea of minimizing friction for my users, letting them use the application as quickly as possible so they get a taste of what they would be paying for. I would then prompt them to pay sometime later, but they need to see what they might be missing out on if they don't pay. So my idea was to give them some free credits to play with until I made that adjustment.

I was about to make that change, where they would have free credits, but I realized I was starting to do too much. I should prioritize getting this into the hands of users as fast as possible, with no need to wait. So I just removed the Stripe portion of the application and made it temporarily free. I obviously put a cap on my own OpenAI credits in case someone abused the usage, but I decided to prioritize getting this out to the public.

Now that I had gotten it out to the public, on day 1, I had 25,000 users join IMMEDIATELY! ...is what I would have loved to say, but no, not a single user. I DM'd people on TikTok, Reddit, and Facebook inviting them to try this out, but my approach wasn't the best. I was so focused on the development side of things that I forgot how important the marketing side of releasing an application is. It takes a surprising amount of effort. I managed to get a few users to use it, but not a whole lot of people. Literally a handful, like fewer than 5... and more than 1. Funny thing is, the dopamine hit I got when I saw someone sign up on a random Thursday night and actually use my application was amazing. One hit of that, and I knew I needed more. Like a lot more. But it hasn't been easy. My target audience isn't just book lovers, it's fans of the specific novel that I created this app for, and they are LOYAL to the author. Some fans told me directly that if it's not coming from the author themself, they aren't using it, which makes sense. I can understand that. A random TikTok account with 1 follower is messaging you to use this random app that asks you to provide an email and password to try it out? I can't blame them at all.

Time To Pivot

My initial marketing idea was to reach out to fans of this novel, either through a specific Facebook group or by partnering with TikTok creators who love this book and have somewhat decent followings to create a post about this application, but this proved to be time-consuming and slow. My first pivot was to reach out to the author themself and partner with them. That also wasn't as easy as I thought. I haven't gotten a response from them at all. So I thought to myself, why don't I reach out directly to publishing companies, who house a bunch of authors and want to grow their following and increase sales? I can talk to them directly about a potential opportunity to help their authors. I could then use this application I developed as a demo of what I can do for them. One publishing company might house more than 20 authors, and all I need is a yes from one publishing company to get a chance to speak with a ton of authors! So I went ahead and shot my shot. I cold emailed about 3 publishing companies, with low expectations, and, drum roll... two responded! No one has officially agreed yet, but there was interest. One company actually cc'd several people in the company, including the CEO, about getting a meeting started. Another dopamine hit for me! I haven't scheduled a sit-down, but I'll be following up with them soon to see where this goes.

What's Next?

There is still so much to explore in this space, and this idea is limited only by the creativity an author might have. I'm looking to schedule a meeting with a publishing company, and hopefully have the opportunity to chat with an actual author to see if this is something they might be interested in. My assumption is that authors would love to leverage an innovative digital format to build stronger fan relationships, and I'd love the chance to discuss a potential collaboration that can lead to something special for them and their readers. While I've noticed some authors express concerns about AI, particularly in relation to their novels, these conversations can help uncover how they really feel and open the door to unique possibilities. Like the old adage says, 'only time will tell.' That's it for now. Hopefully, the next time I share an update, it will be to celebrate some incredible progress!