Abstract
In an era of heightened data privacy concerns, the development of local Large Language Model (LLM) applications provides an alternative to cloud-based solutions. Ollama offers one solution, enabling LLMs to be downloaded and used locally. In this short article, we’ll explore how to use Ollama with LangChain and SingleStore using a Jupyter Notebook.
The notebook file used in this article is available on GitHub.
Introduction
We’ll use a Virtual Machine running Ubuntu 22.04.2 as our test environment. An alternative would be to use venv.
Create a SingleStoreDB Cloud account
A previous article showed the steps required to create a free SingleStore Cloud account. We’ll use Ollama Demo Group as our Workspace Group Name and ollama-demo as our Workspace Name. We’ll make a note of our password and host name. For this article, we’ll temporarily allow access from anywhere by configuring the firewall under Ollama Demo Group > Firewall. For production environments, firewall rules should be added to provide increased security.
Create a Database
In our SingleStore Cloud account, let’s use the SQL Editor to create a new database. Call this ollama_demo, as follows:
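A minimal statement, assuming no special options are required, would be:

```sql
CREATE DATABASE ollama_demo;
```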
Install Jupyter
From the command line, we’ll install the classic Jupyter Notebook, as follows:
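One common way to do this is with pip:

```shell
pip install notebook
```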
Install Ollama
We’ll install Ollama, as follows:
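At the time of writing, the Ollama website documents a one-line install script for Linux:

```shell
curl -fsSL https://ollama.com/install.sh | sh
```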
Environment Variable
Using the password and host information we saved earlier, we’ll create an environment variable to point to our SingleStore instance, as follows:
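Assuming the default admin user, the default port 3306, and the ollama_demo database created earlier, this could look as follows:

```shell
export SINGLESTOREDB_URL="admin:<password>@<host>:3306/ollama_demo"
```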
Replace <password> and <host> with the values for your environment.
We are now ready to work with Ollama and we’ll launch Jupyter:
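From the same shell session:

```shell
jupyter notebook
```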
Fill out the notebook
First, some packages:
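One combination that covers the imports used below (exact package names may vary between LangChain versions) is:

```shell
!pip install langchain langchain-community ollama --quiet --no-warn-script-location
```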
Adjust this command if you already have some of these installed.
Next, we’ll import some libraries:
```python
from langchain_community.vectorstores import SingleStoreDB
from langchain_community.vectorstores.utils import DistanceStrategy
from langchain_core.documents import Document
from langchain_community.embeddings import OllamaEmbeddings
```
We’ll create embeddings using all-minilm (45 MB at the time of writing):
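Assuming the Ollama service is running locally, the model can be pulled using the Ollama Python library:

```python
import ollama

# Download the embedding model (requires a running local Ollama service)
ollama.pull("all-minilm")
```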
and for our LLM we’ll use llama2 (3.8 GB at the time of writing):
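Again, assuming the Ollama service is available, we can pull it the same way:

```python
import ollama

# Download the LLM (requires a running local Ollama service)
ollama.pull("llama2")
```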
Next, we’ll use the example text from the Ollama website:
```python
documents = [
    "Llamas are members of the camelid family meaning they're pretty closely related to vicuñas and camels",
    "Llamas were first domesticated and used as pack animals 4,000 to 5,000 years ago in the Peruvian highlands",
    "Llamas can grow as much as 6 feet tall though the average llama between 5 feet 6 inches and 5 feet 9 inches tall",
    "Llamas weigh between 280 and 450 pounds and can carry 25 to 30 percent of their body weight",
    "Llamas are vegetarians and have very efficient digestive systems",
    "Llamas live to be about 20 years old, though some only live for 15 years and others live to be 30 years old"
]
```
```python
embeddings = OllamaEmbeddings(
    model = "all-minilm",
)

dimensions = len(embeddings.embed_query(documents[0]))

docs = [Document(text) for text in documents]
```
We specify all-minilm for the embeddings, determine the number of dimensions returned for the first document, and convert the documents to the format required by SingleStore.
Next, we’ll use LangChain:
```python
docsearch = SingleStoreDB.from_documents(
    docs,
    embeddings,
    table_name = "langchain_docs",
    distance_strategy = DistanceStrategy.EUCLIDEAN_DISTANCE,
    use_vector_index = True,
    vector_size = dimensions
)
```
In addition to the documents and embeddings, we provide the name of the table we want to use for storage, the distance strategy, whether to use a vector index, and the vector size using the dimensions we previously determined. These and other options are explained in further detail in the LangChain documentation.
We can now ask a question, as follows:
```python
# Example question from the Ollama embeddings documentation
prompt = "What animals are llamas related to?"

docs = docsearch.similarity_search(prompt)
data = docs[0].page_content

print(data)
```
Example output:
Next, we’ll use the LLM, as follows:
```python
import ollama

output = ollama.generate(
    model = "llama2",
    prompt = f"Using this data: {data}. Respond to this prompt: {prompt}"
)

print(output["response"])
```
Example output:
Vicuñas are the wild ancestors of domesticated alpacas and llama, and they are found in the Andean region of South America. Like llamas, vicuñas have a distinctive long-haired coat with a characteristic white stripe running down their back. They are also known for their gentle nature and ability to be trained for riding and packing.
Camels, on the other hand, are found in Africa, Asia, and Australia, and they are better known for their ability to survive in hot and arid environments. Like llamas, camels have a distinctive hump on their backs, which helps them store water and food for long periods of time without access to water.
Overall, the close relationship between llamas, vicuñas, and camels is due to their shared evolutionary history as members of the camelid family. Despite their differences in size, coat color, and adaptations to different environments, these animals share many similarities in terms of their physical characteristics and behaviors.
Summary
In this short article, we’ve seen that we can connect to SingleStore, store the text and embeddings, ask questions about the data in the database, and use the power of LLMs locally through Ollama.