Building up the Rusty Llama (NixOS)

Justin Bender

July 27, 2023

Contents

Building up the Rusty Llama (NixOS)

💬 This is from a video created and produced by Code to the Moon. I appreciate the time he put into learning these technologies and I recommend watching the video yourself. If this little guide can help you, perfect!

This project involves Rust 🦀 to you have to make sure that you have it downloaded as well as all of the other packages.

Project links

Before we begin let's start with a little background of what we are building and what I am using at this moment for my operation system. I'm not using a normal distribution on Linux. We are using a form of NixOS. There might be some pieces missing from this explanation like git or other packages that are expected to already be on your machine.

Inside of your /etc/nixos/configuration.nix we need to add a few packages to make everything work.

pkg-config
openssl
binaryen
sass

Not 100% sure if needed but added in and don't plan to debug to see if I can take it out.

libiconv
libgudev

Now why do we need to have these packages? Well it's because the cargo crates that we plan to install will require them.

Trunk
cargo-leptos
cargo-generate

❯ cargo install trunk cargo-leptos cargo-generate

Packages included in our configuration file may download automatically on the system that you are using. Since I'm on NixOS I have to handle these. This is fine but I also needed to add a path to openssl because the pkg-config could not locate everything as needed. Using find /nix/store -name "openssl*" I was able to find the location that I needed to use.

The path that I found for example was something like this:

❯ export PKG_CONFIG_PATH=/nix/store/z4vl5vjz1a479wmbm10mhdpqygg5b3vl-system-path/lib/pkgconfig:$PKG_CONFIG_PATH

Why couldn't I just use:

❯ /run/current-system/sw/lib/pkgconfig

I'm not 100% sure but it did not work that way and accurately find the path. It would keep failing. So we just have to set it to the /nix/store path.

At this point we should be able to download cargo-leptos and it should download. Now it's time to start working on the project.

Let's start building out frontend

Create the project and go into the project folder. Using the basic leptos setup and actix

❯ cargo leptos new --git leptos-rs/start

This will ask us the name we would like to use. Let's just pick rusty-llama

What is happening here? We are using a lot of the crates downloaded already cargo generate & cargo leptos to generate our project.

🛑 Explaining the internals of Leptos is out of scope for this project. There is both a server and a frontend. Lot's of code going on. So please look through it and maybe watch the youtube video connected to this tutorial

The basic file structure of our project looks like this:

❯ tree --gitignore
.
├── assets
│   └── favicon.ico
├── Cargo.lock
├── Cargo.toml
├── end2end
│   ├── package.json
│   ├── package-lock.json
│   ├── playwright.config.ts
│   └── tests
│       └── example.spec.ts
├── LICENSE
├── README.md
├── rust-toolchain.toml
├── src
│   ├── app.rs
│   ├── lib.rs
│   └── main.rs
└── style
    └── main.scss

6 directories, 16 files

We want to add a few files conversation.rs and mod.rs inside of a model directory. That lives inside of the src directory.

└── src
    ├── app.rs
    ├── lib.rs
    ├── main.rs
    └── model
        ├── conversation.rs
        └── mod.rs

Inside of the mod.rs file should just include the conversation to make it accessible and public to the lib.rs. So we will have to include a few lines of code inside of both mod.rs and lib.rs like so:

mod.rs:

pub mod conversation;

There is only one line that is added to this next file right now. mod model. This allows the project to know that the model module exists for this project. lib.rs:

pub mod app;
mod model;

use cfg_if::cfg_if;

cfg_if! {
if #[cfg(feature = "hydrate")] {

  use wasm_bindgen::prelude::wasm_bindgen;

    #[wasm_bindgen]
    pub fn hydrate() {
      use app::*;
      use leptos::*;

      console_error_panic_hook::set_once();

      leptos::mount_to_body(move |cx| {
          view! { cx, <App/> }
      });
    }
}
}

Now that we have the module working. We want to build up our structs to handle the conversations we will have with the Large Language Model.

model/conversation.rs:

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Clone, Debug)]
pub struct Conversation {
    pub messages: Vec<Message>
}

impl Conversation {
    pub fn new() -> Conversation {
        Conversation {
            messages: Vec::new()
        }
    }
}

#[derive(Serialize, Deserialize, Clone, Debug)]
pub struct Message {
    pub user: bool,
    pub text: String,
}

If you notice we are using serde so we must include that in our project with:

❯ cargo add serde -F derive

Looking in our conversation.rs file we have a few different structures.

Conversation
Message

This is a very simple design. We will have text inside of the message and we will have a boolean that tells us who is talking. The user or the model response. We also include an implementation of Conversation to have a new function. That will return a Converstation. Which is a vector of messages.

Now that we have created the structures for our project we can start working on the frontend for a bit. Let's go into the /src/app.rs and slim down the file a bit before we build it back up.

app.rs:

use leptos::*;
use leptos_meta::*;
use leptos_router::*;

use crate::model::conversation::Conversation;

#[component]
pub fn App(cx: Scope) -> impl IntoView {
    // Provides context that manages stylesheets, titles, meta tags, etc.
    provide_meta_context(cx);

    view! { cx,
        // injects a stylesheet into the document <head>
        // id=leptos means cargo-leptos will hot-reload this stylesheet
        <Stylesheet id="leptos" href="/pkg/leptos_start.css"/>

        // sets the document title
        <Title text="Rusty Llama"/>
        // contents
        <ChatArea/>
        <TypeArea/>
    }
}

This does not run at this moment. There is no ChatArea or TypeArea. What are we actually building here.

Chat starts
User enters text
Click submit button
Send to backend
Generate Response
Send to frontend
Display response

Now that we have our basic setup we will start adding in the functionality that is required to accomplish our tasks. Two functions called create_signal and create_action

let (conversation, set_conversation) = create_signal(cx, Conversation::new());

let send = create_action(cx, move |new_message: &String| {
    let user_message = Message {
        text: new_message.clone(),
        user: true,
    };
    set_conversation.update(move |c| {
        c.messages.push(user_message);
    });

    // TODO converse
});

This will setup the ability to send a signal to set the conversation and action that will take the entered message. The TODO will be the sending to the server, digesting and sending back to update the messages.

Another addition will be connecting conversation and send to the frontend components.

<ChatArea conversation/>
<TypeArea send/>

Building up the API

Now we will start with out API by creating a new file called api.rs inside of your src directory.

First we want to add a few lines into our Cargo.toml file. We choose to use github for this crate. Just so we know that we have the most updated version possible. If this does not work. Go to something more stable. We also want to include the rand crate for use later on.

[dependencies]
llm = { git = "https://github.com/rustformers/llm.git", branch = "main", optional = true }
rand = { version = "0.8.5", optional = true }

[features]
ssr = [
  "dep:llm",
  "dep:rand",
]

Now we can go into our api.rs file to start building it up.

use leptos::*;
use crate::model::conversation::Conversation;


#[server(Converse "/api")]
pub async fn converse(cx: Scope, prompt: Conversation) -> Result<String, ServerFnError> {
    use llm::models::Llama;
    use leptos_actix::extract;
    use actix_web::web::Data;
    use actix_web::dev::ConnectionInfo;

    let model = extract(cx, |data: Data<Llama>, _connection: ConnectionInfo| async {
        data.into_inner()
    })
    .await.unwrap();

    use llm::KnownModel;
    let character_name = "### Assistant";
    let user_name = "### Human";
    let persona = "A chat between a human and an assistant";
    let mut history = format!(
        "{character_name:Hello - How may I help you today?\n\
        {user_name}:What is the capital of France?\n\
        {character_name}:Paris is the capital of France.\n"
    );

    for message in prompt.messages.into_iter() {
        let msg = message.text;
        let curr_line = if message.user {
            format!("{character_name}:{msg}\n")
        } else {
            format!("{user_name}:{msg}\n")
        };
    }

    let mut res = String::new();
    let mut rng = rand::thread_rng();
    let mut buf = String::new();

    let mut session = model.start_session(Default::default());

    session.infer(
        model.as_ref(),
        &mut rng,
        &llm::InferenceRequest {
            prompt: format!("{persona}\n{history}\n{character_name}:")
                .as_str()
                .into(),
            parameters: Some(&llm::InferenceParameters::default()),
            play_back_previous_tokens: false,
            maximum_token_count: None,
        },
        &mut Default::default(),
        inference_callback(String::from(user_name), &mut buf, &mut res),
    )
    .unwrap_or_else(|e| panic!("{e}"));


    Ok(String::from(""))
}

We do need to make sure