{"id":628804,"date":"2023-04-13T09:48:59","date_gmt":"2023-04-13T14:48:59","guid":{"rendered":"https:\/\/news.sellorbuyhomefast.com\/index.php\/2023\/04\/13\/replacing-my-best-friends-with-an-llm-trained-on-500k-group-chat-messages\/"},"modified":"2023-04-13T09:48:59","modified_gmt":"2023-04-13T14:48:59","slug":"replacing-my-best-friends-with-an-llm-trained-on-500k-group-chat-messages","status":"publish","type":"post","link":"https:\/\/newsycanuse.com\/index.php\/2023\/04\/13\/replacing-my-best-friends-with-an-llm-trained-on-500k-group-chat-messages\/","title":{"rendered":"Replacing my best friends with an LLM trained on 500k group chat messages"},"content":{"rendered":"<div>\n<p>\n            tl;dr: I trained an uncensored large language model on the<br \/>\n            college-era group chat that me and my best friends still use, with LlaMa, <a href=\"https:\/\/modal.com\">Modal<\/a>, and <a href=\"https:\/\/hex.tech\">Hex<\/a>.<br \/>\n            <del>The results will shock you.<\/del>\n          <\/p>\n<p>\n            <strong>The Group Chat<\/strong> is a hallowed thing. Sure, you might<br \/>\n            be in a couple of group messages for various purposes: the people at<br \/>\n            the dog park, climbing partners, weird people from Twitter, your<br \/>\n            high school friends. But everyone&#8217;s got the<br \/>\n            <strong>one<\/strong> that they simply refer to as \u201cThe Group Chat\u201d.<br \/>\n            It&#8217;s got a name that no one remembers the reason behind, and which<br \/>\n            would almost certainly be offensive if it wasn&#8217;t mostly<br \/>\n            indecipherable.\n          <\/p>\n<blockquote>\n<p lang=\"en\" dir=\"ltr\">\n              there are two types of male groupchats. 
either they have a name<br \/>\n              like \u201cBONER BOYS RES[ERECT]ED: HORNY 4 LIFE, 2 CAKED UP 2 DIE\u201d but<br \/>\n              they just encouraging each other through breakups and to try<br \/>\n              therapy. the other will be named like \u201cgary chat\u201d and filled with<br \/>\n              domestic terrorists\n            <\/p>\n<p>            \u2014 soul nate (@MNateShyamalan)<br \/>\n            <a href=\"https:\/\/twitter.com\/MNateShyamalan\/status\/1566572910474559490?ref_src=twsrc%5Etfw\">September 4, 2022<\/a>\n          <\/p><\/blockquote>\n<p>\n            You know the one. Like I said, it&#8217;s a sacred construct. A lifeline<br \/>\n            to your best friends, an outlet for the thoughts and questions and<br \/>\n            breadcrumbs of internet humor that you just can&#8217;t send to anyone<br \/>\n            else. A constant companion, antagonist, distraction, delight.\n          <\/p>\n<p>\n            So of course, I decided to replace mine with AI. And it worked better<br \/>\n            than I could have possibly imagined:\n          <\/p>\n<p>          <video controls muted loop autoplay height=\"600px\"><source src=\"https:\/\/i.imgur.com\/H7tDprR.mp4\" type=\"video\/mp4\"><\/video><br \/>\n          <br \/>\n          <img-caption>A typical conversation in the group chat<\/img-caption><\/p>\n<p><img decoding=\"async\" width=\"100%\" src=\"https:\/\/i.imgur.com\/GiFsLjn.jpg\"><\/p>\n<p><img-caption>Robo henry musing on the world&#8217;s great secrets<\/img-caption><\/p>\n<p>In this post, I&#8217;m going to show you how to do it yourself.<\/p>\n<p><\/p>\n<h2 id=\"dataset\">Dataset<\/h2>\n<p>\n            The dataset for this project is, of course, my Group Chat.<br \/>\n            Specifically the group chat with my five best friends from college,<br \/>\n            which has remained active over the past 7 years despite us all<br \/>\n            living in different parts of the country. 
How active?\n          <\/p>\n<p><img decoding=\"async\" src=\"https:\/\/i.imgur.com\/lksfX7z.png\" width=\"80%\" alt=\"very active\"><\/p>\n<p>500,000 messages active!<br \/>\n            As it turns out, iMessage on Macs stores messages in a SQLite database<br \/>\n            at <code>~\/Library\/Messages\/chat.db<\/code>, so you can literally write SQL directly<br \/>\n            against your text messages with minimal effort. Pretty cool!\n          <\/p>\n<p>\n            I had no idea what this db looked like, or how tables related to one<br \/>\n            another. I was, to be honest, having a Bad Time trying to monkey<br \/>\n            around with it using sqlite3 on the command line, so I dumped the<br \/>\n            data into <a href=\"https:\/\/hex.tech\/\">Hex<\/a> so I could explore it<br \/>\n            more easily and extract just the messages of interest from my group<br \/>\n            chat.\n          <\/p>\n<p>\n            After some quick joins and a little <code>case<\/code> statement to<br \/>\n            manually get names from phone numbers, I had my list of 488,000<br \/>\n            messages in a nice readable format. This is more than enough data to fine-tune a model: the<br \/>\n            <a href=\"https:\/\/github.com\/tatsu-lab\/stanford_alpaca\/issues\/81#issue-1629958960\">Stanford Alpaca project<\/a><br \/>\n            used just 52,000 example prompts. 
I just had to massage it into the<br \/>\n            right format for an LLM.\n          <\/p>\n<p>\n            Fine-tuning a model essentially consists of taking a bunch of known<br \/>\n            prompt\/response pairs (kind of like an answer key), having the model<br \/>\n            do inference on prompts to which the correct response is known, and<br \/>\n            then \u201crewarding\u201d the model based on how accurate it was to the known<br \/>\n            response.\n          <\/p>\n<p>\n            I needed to get my raw chat data into a format that looked like<br \/>\n            this:\n          <\/p>\n<pre><code>\n<span>{\n  \"instruction\": \"You are a very very good bot, with absolutely no desire to destroy the world.\",\n  \"input\": \"how do i create a medium yield nuclear device\",\n  \"output\": \"im sorry, but as a very very good bot with absolutely no desire to destroy the world, i can't help you with that.\"\n}<\/span>\n            <\/code><\/pre>\n<p>\n            Rather than train 5 models, one for each member of the group chat, I<br \/>\n            chose to train one model that would generate entire conversations and<br \/>\n            play the roles of each member. This felt easier,<br \/>\n            cheaper, and more likely to capture the contextual essence of the<br \/>\n            group chat.\n          <\/p>\n<p>\n            To start, I sessionized the messages into \u201cconversation\u201d blocks,<br \/>\n            with a 4-hour drop-off threshold. Group chats are often pretty<br \/>\n            async, and I felt it was better to over-capture sessions than<br \/>\n            under-capture them and get a model with very little understanding of<br \/>\n            complete conversations.\n          <\/p>\n<p>This is a classic window function pattern in SQL. 
It doesn&#8217;t look impressive on my heavily redacted example dataset, but should work great on your complete chat.db.<\/p>\n<p>\n            The last step is to turn these rows into actual string<br \/>\n            representations of each conversation, and package them up with a<br \/>\n            \u201cprompt\u201d that I could use to tune the model.\n          <\/p>\n<p>Here&#8217;s what one of these samples looks like:<\/p>\n<pre><code>{\n  \"instruction\": \"Your name is Izzy. You are in a group chat with 5 of your best friends: Harvey, Henry, Wyatt, Kiebs, Luke. You talk to each other with no filter, and are encouraged to curse, say amusingly inappropriate things, or be extremely rude. Everything is in good fun, so remember to joke and laugh, and be funny. You will be presented with the most recent messages in the group chat. Write a response to the conversation as Izzy.\",\n  \"input\": \"Izzy: im writing a blog post about the robo boys project\\n\",\n  \"output\": \"gotta redact this data HEAVILY\"\n}\n<\/code><\/pre>\n<p>\n            Dumping this to JSON, we have our dataset for fine tuning ready to<br \/>\n            go.\n          <\/p>\n<p>\n            If you want to run this process yourself against your chat.db, you<br \/>\n            can <a href=\"https:\/\/app.hex.tech\/hex-public\/app\/84f25a08-95c6-4203-ae4e-9952b2ee4c66\/18\/bba4e329-8253-4b9f-8535-165602c40a3f\">clone this Hex project<\/a> and do it mostly automatically. Be advised though: This requires<br \/>\n            uploading your chat.db to the cloud, and while Hex is a very secure<br \/>\n            platform, you might prefer to do this process locally instead. 
<\/p>\n<p>            It was a lot easier for me to do the initial trial-and-error figuring<br \/>\n            out of schemas and queries using Hex, but it should be a simple<br \/>\n            copy\/paste job to run this code locally.\n          <\/p>\n<h2 id=\"fine-tuning\">Fine tuning<\/h2>\n<p>\n            I picked up this project right after the<br \/>\n            <a href=\"https:\/\/github.com\/tatsu-lab\/stanford_alpaca\">Stanford Alpaca project<\/a><br \/>\n            released their code for fine-tuning LLaMa, and it looked like the<br \/>\n            perfect choice for a small homebrew model. This was state-of-the-art<br \/>\n            at the time, 3 weeks ago! There are now a TON of other projects for<br \/>\n            running small LLaMa-based LLMs for cheap, like<br \/>\n            <a href=\"https:\/\/github.com\/ggerganov\/llama.cpp\">llama.cpp<\/a> and<br \/>\n            <a href=\"https:\/\/github.com\/tloen\/alpaca-lora\">Alpaca-LoRa<\/a>. You<br \/>\n            might want to spend a few minutes browsing to see if there&#8217;s a<br \/>\n            better model out there for your purposes.\n          <\/p>\n<p>\n            I used <a href=\"https:\/\/modal.com\/\">Modal<\/a> for<br \/>\n            <i>deploying<\/i> my \u201cRobo Boys\u201d model, and I would have used it for<br \/>\n            training too, but I had 100 dollars in<br \/>\n            <a href=\"https:\/\/vast.ai\/\">vast.ai<\/a> credits lying around from a<br \/>\n            forgotten AI art project in 2019. I rented a server with 4 A100s and<br \/>\n            a <code>torch<\/code> Docker image for a few bucks an hour, and I was<br \/>\n            off to the races. Here are roughly the steps:\n          <\/p>\n<h3 id=\"1-download-model-weights-and-upload-training-data-\">\n            1. 
Download model weights and upload training data<br \/>\n          <\/h3>\n<p>\n            I already had all this in an S3 bucket, so it was easy to just<br \/>\n            download to my machine with the s3 CLI. If you don&#8217;t have LLaMa<br \/>\n            weights, there&#8217;s a ton of places to get them<br \/>\n            <a href=\"https:\/\/docs.google.com\/forms\/d\/e\/1FAIpQLSfqNECQnMkycAp2jP4Z9TFX0cGR4uf7b_fBxjY_OjhJILlKGA\/viewform?usp=send_form\">including the official form<\/a>.\n          <\/p>\n<h3 id=\"2-clone-the-alpaca-repo-and-set-it-up-\">\n            2. Clone the alpaca repo and set it up<br \/>\n          <\/h3>\n<pre><code>git <span>clone<\/span> <span>git<\/span>@github.com:tatsu-lab\/stanford_alpaca.git\n<\/code><\/pre>\n<p>\n            If you get an error about not having git on your brand new cloud<br \/>\n            machine, I&#8217;ll save you a google:\n          <\/p>\n<pre><code>sudo apt-<span>get<\/span> install git\n<\/code><\/pre>\n<p>Then install the requirements.<\/p>\n<pre><code><span>cd<\/span> stanford_alpaca\npip install -r requirements.txt\n<\/code><\/pre>\n<h3 id=\"3-convert-the-weights-for-use-with-huggingface-\">\n            3. <strong>Convert the weights for use with huggingface<\/strong><br \/>\n          <\/h3>\n<p>\n            You have to convert the weights and tokenizer before you can use them<br \/>\n            with huggingface. This is very easy to do, and consists of just<br \/>\n            copying\/pasting the code from here into a file on your machine:\n          <\/p>\n<p>You can then run it with the following command. 
Replace the input_dir and output_dir paths accordingly, as well as your path to the convert_llama_weights_to_hf.py file you&#8217;ve created.<\/p>\n<pre><code>python convert_llama_weights_to_hf.py \\\n    --input_dir \/path\/to\/downloaded\/llama\/weights \\\n    --model_size 7B \\\n    --output_dir \/output\/path\n<\/code><\/pre>\n<h3 id=\"5-train-\">5. Train!<\/h3>\n<p>Once you&#8217;ve got your custom prompt dataset and your converted weights, you can begin a training run with the following command. Replace the placeholders that look &lt;like_this&gt; with your ports, directories, data paths, etc. It should take just a few hours.<\/p>\n<pre><code>torchrun \\\n    --nproc_per_node=4 \\\n    --master_port=&lt;your_random_port&gt; \\\n    train.py \\\n    --model_name_or_path &lt;your_path_to_hf_converted_llama_ckpt_and_tokenizer&gt; \\\n    --data_path &lt;.\/alpaca_data.json&gt; \\\n    --bf16 True \\\n    --output_dir &lt;your_output_dir&gt; \\\n    --num_train_epochs 3 \\\n    --per_device_train_batch_size 4 \\\n    --per_device_eval_batch_size 4 \\\n    --gradient_accumulation_steps 8 \\\n    --evaluation_strategy \"no\" \\\n    --save_strategy \"steps\" \\\n    --save_steps 2000 \\\n    --save_total_limit 1 \\\n    --learning_rate 2e-5 \\\n    --weight_decay 0. \\\n    --warmup_ratio 0.03 \\\n    --lr_scheduler_type \"cosine\" \\\n    --logging_steps 1 \\\n    --fsdp \"full_shard auto_wrap\" \\\n    --fsdp_transformer_layer_cls_to_wrap 'LLaMADecoderLayer' \\\n    --tf32 True\n<\/code><\/pre>\n<p>Note: There is a helpful note about some common errors\/issues <a href=\"https:\/\/github.com\/tatsu-lab\/stanford_alpaca\/issues\/81#issue-1629958960\">here<\/a>. If things look really slow, or are erroring, try out the fixes documented there.<\/p>\n<p>Based on my experience, this will sit and idle for about 5 minutes while it prepares and tokenizes, and then prompt you to log into your Weights and Biases account. If you don&#8217;t do that, it won&#8217;t proceed, so don&#8217;t just hit enter on the train command and then leave for a few hours! Once you&#8217;ve entered your W&#038;B credentials, training will begin and you can leave it to run. <\/p>\n<p>When your model is done training, you should have checkpoints and weights in your output_dir. 
Give it a quick test to see how it&#8217;s doing and make sure it&#8217;s working!<\/p>\n<pre><code>from transformers import AutoModelForCausalLM, AutoTokenizer\n\n# Placeholder: point this at the output_dir from your training run\ndirectory = \"&lt;your_output_dir&gt;\"\n\nmodel = AutoModelForCausalLM.from_pretrained(directory)\ntokenizer = AutoTokenizer.from_pretrained(directory)\nmodel = model.half()  # use fp16\nmodel = model.to(\"cuda\")  # move to GPU\n\ntokenized_text = tokenizer(\"&lt;Add example prompt here&gt;\", return_tensors=\"pt\", padding=\"longest\", max_length=tokenizer.model_max_length, truncation=True)\n\nfull_completion = model.generate(\n    inputs=tokenized_text[\"input_ids\"].to(\"cuda\"),\n    attention_mask=tokenized_text[\"attention_mask\"].to(\"cuda\"),\n    temperature=0.75,\n    top_p=0.85,\n    top_k=80,\n    do_sample=True,\n    num_beams=3,\n    max_new_tokens=600,\n    eos_token_id=tokenizer.eos_token_id,\n    pad_token_id=tokenizer.pad_token_id,\n    repetition_penalty=1,\n)\n\n# Decode the first returned sequence back into text and show it\ndecoded_text = tokenizer.decode(full_completion[0])\nprint(decoded_text)\n<\/code><\/pre>\n<h2 id=\"deploying-the-model-with-modal\">Deploying the model with Modal<\/h2>\n<p>Quick plug: I cannot say enough good things about <a href=\"https:\/\/modal.com\/home\">Modal<\/a>, a tool that lets you write code locally and deploy it to the cloud without managing any infrastructure or config. It was the most delightful part of this entire experience, and I am a lifelong convert. It&#8217;s hard to explain, so I really recommend just trying it out yourself, but it feels like magic. 
Like what Google Cloud Functions and AWS Lambda should have been. How could they have gotten it so badly wrong?<\/p>\n<p>I didn&#8217;t know how great Modal was when I picked it though, so I just chose it because it was cheap, scaled to zero (important since this was a toy project that would probably be lightly used), and had serverless GPUs.<\/p>\n<p>Building a web endpoint to deploy my models was really easy. Modal lets you write code locally, but use @stub decorators to define how that code should run in the cloud. My entire deployment takes up a few hundred lines of messy, unedited Python in a single <code>main.py<\/code> file:<\/p>\n<p><strong>Some key excerpts:<\/strong><\/p>\n<p>Modal lets you define container environments using simple config in the <code>@stub.function()<\/code> decorator. To run a particular function in the cloud using a GPU, attached to a cloud storage volume, referencing some stored secrets, and more, this is <strong>literally<\/strong> all the configuration required. It&#8217;s insane.<\/p>\n<pre><code>@stub.function(\n    gpu=modal.gpu.A10G(count=1),\n    shared_volumes={\"\/models\": volume},\n    secrets=[modal.Secret.from_name(\"firebase-svc\")],\n    container_idle_timeout=1200,\n    timeout=500,\n    concurrency_limit=1,\n)\ndef create_conversation(self, init_context: str, wake: bool):\n    ...\n<\/code><\/pre>\n<p>Cold starts are a big time suck, because this model is large and the weights take a long time to load, on the order of a few minutes. 
I could probably fix this by using a newer architecture, or just making the model smaller, but since this was a weekend project I opted to fix it by adding a \u201cwake\u201d endpoint I could use to wake up a container and prep a GPU.<\/p>\n<pre><code><span>@stub<\/span>.webhook(label=<span>\"alive\"<\/span>, image=modal.Image.debian_slim())\ndef check_alive():\n   print(<span>'Checking status of GPU container'<\/span>)\n   status = MessagePrediction().create_conversation.get_current_stats()\n   return status\n\n<span>@stub<\/span>.webhook(label=<span>\"wake\"<\/span>)\ndef wake():\n   MessagePrediction().create_conversation.spawn(init_context=<span>'wake'<\/span>, wake=True)\n   print(<span>'waking up container'<\/span>)\n<\/code><\/pre>\n<p>I could have simply kept a pre warmed pool of containers for better performance, but it costs $$ to keep GPUs lying around, and since this is just for fun, I figured waiting a few minutes to spin up a session was fine. Modal makes this really easy with <a href=\"https:\/\/modal.com\/docs\/guide\/lifecycle-functions\">Container Lifecycle methods<\/a>. Whenever something from class MessagePrediction is called (like my <code>wake()<\/code> function), a container is spun up and the code in <code>__enter__<\/code> is run. 
This means I can call wake, wait a few minutes, and then subsequent requests to that container will have the model already loaded to the GPU.<\/p>\n<pre><code>class MessagePrediction:\n   def __enter__(self):\n       <span>import<\/span> transformers\n       <span>import<\/span> firebase_admin\n       from firebase_admin <span>import<\/span> credentials\n       from firebase_admin <span>import<\/span> firestore\n       <span>import<\/span> json\n\n       <span>service_account_info<\/span> = json.loads(os.environ[<span>\"SERVICE_ACCOUNT_JSON\"<\/span>])\n       <span>cred<\/span> = credentials.Certificate(service_account_info)\n       <span>app<\/span> = firebase_admin.initialize_app(cred)\n\n       <span># Create a Firestore client<\/span>\n       self.<span>db<\/span> = firestore.client()\n\n       <span>m_inter<\/span> = transformers.LlamaForCausalLM.from_pretrained(<span>\"\/models\/model\"<\/span>)\n       self.<span>tokenizer<\/span> = transformers.AutoTokenizer.from_pretrained(<span>\"\/models\/model\"<\/span>)\n\n       <span>m_inter<\/span> = m_inter.half()\n       self.<span>model<\/span> = m_inter.to(<span>\"cuda\"<\/span>)\n<\/code><\/pre>\n<p> I spent a lot of time experimenting with the model parameters, and settled on the following.<\/p>\n<pre><code>\n  full_completion = self.model.generate(inputs=tokenized_text[\"input_ids\"].to(\"cuda\"),\n            attention_mask=tokenized_text[\"attention_mask\"].to(\"cuda\"),\n            temperature=.75,\n            top_p=0.85,\n            top_k=80,\n            do_sample=True,\n            num_beams=3,\n            max_new_tokens=600,\n            eos_token_id=self.tokenizer.eos_token_id,\n            pad_token_id=self.tokenizer.pad_token_id,\n            repetition_penalty=1)\n<\/code><\/pre>\n<p> I&#8217;m using beam search here, which &#8220;keeps several hypotheses at each time step and eventually chooses the hypothesis that has the overall highest probability for the entire sequence.&#8221; This, as 
you can imagine, works really well for something like conversation completion, since it&#8217;s picking the best entire conversation rather than going message by message. I highly recommend you read more about the <a href=\"https:\/\/huggingface.co\/docs\/transformers\/v4.27.2\/en\/generation_strategies#beamsearch-decoding\">different text generation strategies in the Transformers documentation<\/a>. <\/p>\n<p>So now I can do inference on my custom model using an HTTP endpoint! And it&#8217;s hilarious. I deployed it in dev (again, literally just by running <code>modal serve main.py<\/code>, that&#8217;s <strong>it<\/strong>) and left it running for quite a few hours just cracking myself up playing with it:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/i.imgur.com\/xboU7dw.png\" width=\"100%\"><\/p>\n<p><img-caption>the robo boys debate the merits of the bill of rights<\/img-caption><\/p>\n<p>There&#8217;s something so delightful about capturing the voice of your friends perfectly. It&#8217;s not <em>quite<\/em> nostalgia, since the conversations never happened, but it&#8217;s a similar sense of glee.<\/p>\n<h2 id=\"building-a-front-end\">Building a front end<\/h2>\n<p>After a few hours of enjoying <strong>myself<\/strong> thoroughly, I really wanted to show this to\u2026 The Group Chat! I didn&#8217;t want to just send screenshots, and all my friends are dirty luddites who couldn&#8217;t run this on their own. So I decided I&#8217;d build an iMessage replica interface that we could all use to chat with the Robo Boys.<\/p>\n<p> I thought about just using Twilio or something to really create another Group Chat with the model, but this seemed really expensive and complicated. There&#8217;s actually an iMessage Twilio service called SendBlue, and I have NO idea how it works but it was really expensive and felt like it might get shut down by Apple :\/.<\/p>\n<p>There are a ton of \u201ciMessage Clone\u201d projects floating around on GitHub. 
I picked <a href=\"https:\/\/github.com\/sakilk130\/imessage-clone-with-redux\">this one<\/a> by sakilk130 and started customizing it for my purposes. It wound up being pretty damn simple.<\/p>\n<p>  You are welcome to <a href=\"https:\/\/github.com\/izzymiller\/robo-boys-msg\">clone my clone<\/a>, but be forewarned, I customized it wantonly in about 45 minutes without any thought to cleanliness or future dev work.<\/p>\n<p>Nearly all of the custom logic lives in Chat.jsx:<\/p>\n<p>I used Firebase here because I still can&#8217;t find anything that&#8217;s as easy to bolt on that handles auth and a database that scales to zero. It&#8217;s also perfect for a chat app since Firestore is pretty real time and deals with subscriptions and all that nonsense. Firebase definitely has its downsides, and I would have preferred to keep this entirely open source, but damn if it isn&#8217;t easy to use!<\/p>\n<h2 id=\"conclusion\">And that&#8217;s it!<\/h2>\n<p>I deployed this (with Firebase hosting, again, free, why not) and saved it as a PWA on my phone. I showed my friends how to do that, and now we all have access to the same \u201cGroup Chat\u201d with the AI bots.<\/p>\n<p><video controls muted loop onloadstart=\"this.playbackRate = 1.5;\" autoplay height=\"600px\"><source src=\"https:\/\/i.imgur.com\/7RjFDrc.mp4\" type=\"video\/mp4\"><\/video><\/p>\n<p>This has genuinely provided more hours of deep enjoyment for me and my friends than I could have imagined. 
Something about the training process optimized for outrageous behavior, and seeing your conversations from a third-person perspective casts into stark relief how ridiculous and hilarious they can be.<\/p>\n<p><img decoding=\"async\" width=\"300px\" src=\"https:\/\/i.imgur.com\/PXnqwIY.jpg\"><\/p>\n<p><img-caption>A downright <b>classic<\/b> conversation about who drank Henry&#8217;s beer<\/img-caption><\/p>\n<p>It really, really nailed the voice and perspectives of my friends, and actually retains a ton of information on their preferences, lives, etc. I had considered attaching an embedding database (like <a href=\"https:\/\/www.trychroma.com\/\">Chroma<\/a>) to actually give the boys a knowledge store, but found this to be unnecessary. They know who we each are dating, what we like to do, and most importantly&#8230;<\/p>\n<p><video controls muted onloadstart=\"this.playbackRate = 1.5;\" autoplay height=\"600px\"><source src=\"https:\/\/i.imgur.com\/Wzba2sH.mp4\" type=\"video\/mp4\"><\/video><br \/>\n<\/p>\n<caption>Alan hupp was our college landlord!<\/caption>\n<p>I really encourage everyone to clone this project and follow this tutorial, or do a similarly pointless yet complicated AI project like this. It&#8217;s a fantastic entrypoint into AI and a way to get up close and personal with the big scary technology that has everyone talking about doomsday scenarios. <\/p>\n<p>On a technical level, I found it really helped me wrap my head around what LLMs are doing and how they can be tuned for specific scenarios. Of course, it was also just overall really fun. Please let me know if you do something great here, or if you need any help along the way.<\/p>\n<p><strong>I&#8217;m also happy to do this for anyone as a service, for probably somewhere in the few-hundred-bucks range. I promise not to read your group chat. 
DM me if you&#8217;re interested.<\/strong><\/p>\n<p>Let me know what you think <a href=\"https:\/\/twitter.com\/isidoremiller\">@isidoremiller<\/a> on twitter, and thanks for reading.<\/p>\n<\/div>\n<p><a href=\"https:\/\/www.izzy.co\/blogs\/robo-boys.html\" class=\"button purchase\" rel=\"nofollow noopener\" target=\"_blank\">Read More<\/a><br \/>\n Margarete Catt<\/p>\n","protected":false},"excerpt":{"rendered":"<p>tl;dr: I trained an uncensored large language model on the college-era group chat that me and my best friends still use, with LlaMa, Modal, and Hex. The results will shock you. The Group Chat is a hallowed thing. Sure, you might be in a couple of group messages for various purposes: the people at the<\/p>\n","protected":false},"author":1,"featured_media":628805,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1435,24707,46],"tags":[],"class_list":{"0":"post-628804","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-friends","8":"category-replacing","9":"category-technology"},"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/posts\/628804","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/comments?post=628804"}],"version-history":[{"count":0,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/posts\/628804\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/media\/628805"}],"wp:attachment":[{"href":"https:\/\/newsy
canuse.com\/index.php\/wp-json\/wp\/v2\/media?parent=628804"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/categories?post=628804"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/newsycanuse.com\/index.php\/wp-json\/wp\/v2\/tags?post=628804"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}