Building up the Rusty Llama (NixOS)


Building up the Rusty Llama (NixOS)
π¬ This is from a video created and produced by Code to the Moon. I appreciate the time he put into learning these technologies and I recommend watching the video yourself. If this little guide can help you, perfect!
This project involves Rust π¦ to you have to make sure that you have it downloaded as well as all of the other packages.
Project links
- Build a Full Stack Chatbot in Rust (feat. Leptos & Rustformeres) - Code to the Moon
- Rustπ¦
- Wizard-Vicuna-7b-Uncensored-GGML
- Cargo Generate
- Leptos
- trunk
- NixOS
- WASM
- Rustformers/llm
- Tailwind CSS
Before we begin let's start with a little background of what we are building and what I am using at this moment for my operation system. I'm not using a normal distribution on Linux. We are using a form of NixOS. There might be some pieces missing from this explanation like git
or other packages that are expected to already be on your machine.
Inside of your /etc/nixos/configuration.nix
we need to add a few packages to make everything work.
pkg-config
openssl
binaryen
sass
Not 100% sure if needed but added in and don't plan to debug to see if I can take it out.
libiconv
libgudev
Now why do we need to have these packages? Well it's because the cargo crates that we plan to install will require them.
Trunk
cargo-leptos
cargo-generate
β― cargo install trunk cargo-leptos cargo-generate
Packages included in our configuration file may download automatically on the system that you are using. Since I'm on NixOS I have to handle these. This is fine but I also needed to add a path to openssl
because the pkg-config
could not locate everything as needed. Using find /nix/store -name "openssl*"
I was able to find the location that I needed to use.
The path that I found for example was something like this:
β― export PKG_CONFIG_PATH=/nix/store/z4vl5vjz1a479wmbm10mhdpqygg5b3vl-system-path/lib/pkgconfig:$PKG_CONFIG_PATH
Why couldn't I just use:
β― /run/current-system/sw/lib/pkgconfig
I'm not 100% sure but it did not work that way and accurately find the path. It would keep failing. So we just have to set it to the /nix/store
path.
At this point we should be able to download cargo-leptos
and it should download. Now it's time to start working on the project.
Let's start building out frontend
- Create the project and go into the project folder. Using the basic
leptos
setup andactix
β― cargo leptos new --git leptos-rs/start
This will ask us the name we would like to use. Let's just pick rusty-llama
What is happening here? We are using a lot of the crates downloaded already cargo generate
& cargo leptos
to generate our project.
π Explaining the internals of Leptos is out of scope for this project. There is both a server and a frontend. Lot's of code going on. So please look through it and maybe watch the youtube video connected to this tutorial
The basic file structure of our project looks like this:
β― tree --gitignore
.
βββ assets
βΒ Β βββ favicon.ico
βββ Cargo.lock
βββ Cargo.toml
βββ end2end
βΒ Β βββ package.json
βΒ Β βββ package-lock.json
βΒ Β βββ playwright.config.ts
βΒ Β βββ tests
βΒ Β βββ example.spec.ts
βββ LICENSE
βββ README.md
βββ rust-toolchain.toml
βββ src
βΒ Β βββ app.rs
βΒ Β βββ lib.rs
βΒ Β βββ main.rs
βββ style
βββ main.scss
6 directories, 16 files
We want to add a few files conversation.rs
and mod.rs
inside of a model
directory. That lives inside of the src
directory.
βββ src
Β Β βββ app.rs
Β Β βββ lib.rs
Β Β βββ main.rs
Β Β βββ model
Β Β βββ conversation.rs
Β Β βββ mod.rs
Inside of the mod.rs
file should just include the conversation
to make it accessible and public to the lib.rs
. So we will have to include a few lines of code inside of both mod.rs
and lib.rs
like so:
mod.rs
:
pub mod conversation;
There is only one line that is added to this next file right now. mod model
. This allows the project to know that the model module exists for this project.
lib.rs
:
pub mod app;
mod model;
use cfg_if::cfg_if;
cfg_if! {
if #[cfg(feature = "hydrate")] {
use wasm_bindgen::prelude::wasm_bindgen;
#[wasm_bindgen]
pub fn hydrate() {
use app::*;
use leptos::*;
console_error_panic_hook::set_once();
leptos::mount_to_body(move |cx| {
view! { cx, <App/> }
});
}
}
}
Now that we have the module working. We want to build up our structs to handle the conversations we will have with the Large Language Model.
model/conversation.rs
:
use serde::{Deserialize, Serialize};
#[derive(Serialize, Deserialize, Clone, Debug)]
pub struct Conversation {
pub messages: Vec<Message>
}
impl Conversation {
pub fn new() -> Conversation {
Conversation {
messages: Vec::new()
}
}
}
#[derive(Serialize, Deserialize, Clone, Debug)]
pub struct Message {
pub user: bool,
pub text: String,
}
If you notice we are using serde
so we must include that in our project with:
β― cargo add serde -F derive
Looking in our conversation.rs
file we have a few different structures.
- Conversation
- Message
This is a very simple design. We will have text inside of the message and we will have a boolean that tells us who is talking. The user or the model response. We also include an implementation of Conversation
to have a new
function. That will return a Converstation
. Which is a vector of messages.
Now that we have created the structures for our project we can start working on the frontend for a bit. Let's go into the /src/app.rs
and slim down the file a bit before we build it back up.
app.rs
:
use leptos::*;
use leptos_meta::*;
use leptos_router::*;
use crate::model::conversation::Conversation;
#[component]
pub fn App(cx: Scope) -> impl IntoView {
// Provides context that manages stylesheets, titles, meta tags, etc.
provide_meta_context(cx);
view! { cx,
// injects a stylesheet into the document <head>
// id=leptos means cargo-leptos will hot-reload this stylesheet
<Stylesheet id="leptos" href="/pkg/leptos_start.css"/>
// sets the document title
<Title text="Rusty Llama"/>
// contents
<ChatArea/>
<TypeArea/>
}
}
This does not run at this moment. There is no ChatArea
or TypeArea
. What are
we actually building here.
- Chat starts
- User enters text
- Click submit button
- Send to backend
- Generate Response
- Send to frontend
- Display response
Now that we have our basic setup we will start adding in the functionality that
is required to accomplish our tasks. Two functions called create_signal
and
create_action
let (conversation, set_conversation) = create_signal(cx, Conversation::new());
let send = create_action(cx, move |new_message: &String| {
let user_message = Message {
text: new_message.clone(),
user: true,
};
set_conversation.update(move |c| {
c.messages.push(user_message);
});
// TODO converse
});
This will setup the ability to send a signal to set the conversation and action
that will take the entered message. The TODO
will be the sending to the
server, digesting and sending back to update the messages.
Another addition will be connecting conversation
and send
to the frontend
components.
<ChatArea conversation/>
<TypeArea send/>
Building up the API
Now we will start with out API by creating a new file called api.rs
inside of
your src
directory.
First we want to add a few lines into our Cargo.toml
file. We choose to use
github for this crate. Just so we know that we have the most updated version
possible. If this does not work. Go to something more stable. We also want to include the rand
crate for use later on.
[dependencies]
llm = { git = "https://github.com/rustformers/llm.git", branch = "main", optional = true }
rand = { version = "0.8.5", optional = true }
[features]
ssr = [
"dep:llm",
"dep:rand",
]
Now we can go into our api.rs
file to start building it up.
use leptos::*;
use crate::model::conversation::Conversation;
#[server(Converse "/api")]
pub async fn converse(cx: Scope, prompt: Conversation) -> Result<String, ServerFnError> {
use llm::models::Llama;
use leptos_actix::extract;
use actix_web::web::Data;
use actix_web::dev::ConnectionInfo;
let model = extract(cx, |data: Data<Llama>, _connection: ConnectionInfo| async {
data.into_inner()
})
.await.unwrap();
use llm::KnownModel;
let character_name = "### Assistant";
let user_name = "### Human";
let persona = "A chat between a human and an assistant";
let mut history = format!(
"{character_name:Hello - How may I help you today?\n\
{user_name}:What is the capital of France?\n\
{character_name}:Paris is the capital of France.\n"
);
for message in prompt.messages.into_iter() {
let msg = message.text;
let curr_line = if message.user {
format!("{character_name}:{msg}\n")
} else {
format!("{user_name}:{msg}\n")
};
}
let mut res = String::new();
let mut rng = rand::thread_rng();
let mut buf = String::new();
let mut session = model.start_session(Default::default());
session.infer(
model.as_ref(),
&mut rng,
&llm::InferenceRequest {
prompt: format!("{persona}\n{history}\n{character_name}:")
.as_str()
.into(),
parameters: Some(&llm::InferenceParameters::default()),
play_back_previous_tokens: false,
maximum_token_count: None,
},
&mut Default::default(),
inference_callback(String::from(user_name), &mut buf, &mut res),
)
.unwrap_or_else(|e| panic!("{e}"));
Ok(String::from(""))
}
We do need to make sure