RAG service

Now we'll put together everything we've learnt so far.

I have a file, doc1.txt, with the following content:

cat says meow

I have another file, doc2.txt, with the following content:

dog will bark

Let's create a simple function that converts the query into an embedding and finds the matching documents.

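TypeScript Code:

async function queryEmbedding(query: string) {
  // Convert the query text into an embedding vector
  const embedding = await createEmbeddingOpenAI(query);
  // Call the Postgres function that ranks documents by similarity
  const { data, error } = await supabase.rpc("query_documents", {
    query_embedding: embedding,
  });
  // Surface the failure instead of silently returning null data
  if (error) throw error;

  return data;
}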

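Python Code:

def query_embedding(query: str):
    # Convert the query text into an embedding vector
    embedding = create_embedding_openai(query)
    # Call the Postgres function that ranks documents by similarity
    response = supabase.rpc("query_documents", {"query_embedding": embedding}).execute()
    # Return only the matching rows, mirroring the TypeScript version
    return response.data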
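
To make the "find the matching documents" step more concrete, here is a minimal local sketch of similarity ranking. It assumes the query_documents function scores documents by cosine similarity, which this section does not confirm. The helper names cosine_similarity and rank_documents and the 3-dimensional toy vectors are made up for illustration; real embeddings have far more dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec: list[float], doc_vecs: dict[str, list[float]]) -> list[dict]:
    """Score every document against the query and sort best match first."""
    scored = [
        {"doc_name": name, "similarity": cosine_similarity(query_vec, vec)}
        for name, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda row: row["similarity"], reverse=True)


# Hypothetical toy vectors standing in for real embeddings
docs = {
    "data/doc1.txt": [1.0, 0.0, 0.0],
    "data/doc2.txt": [0.0, 1.0, 0.0],
}
ranked = rank_documents([0.0, 1.0, 0.0], docs)
print(ranked[0]["doc_name"])
```

In the actual service this scoring happens inside the database, so the application only sends the query embedding and receives the scored rows.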

Now let's put everything together for querying:

  • Read the files
  • Create embeddings for contents in those files
  • Get the query
  • Create embedding for the query
  • Find the matching documents

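TypeScript Code:

async function main() {
  // Index both documents, then search them with the query
  await createEmbeddingFromFiles(["data/doc1.txt", "data/doc2.txt"]);
  const query = "dog";
  const results = await queryEmbedding(query);
  console.log(results);
}

// Log any failure instead of leaving an unhandled promise rejection
main().catch(console.error);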

Python Code:

def main():
    # Index both documents, then search them with the query
    create_embedding_from_files(["data/doc1.txt", "data/doc2.txt"])
    response = query_embedding("dog")
    print(response)

main()

When I execute the above code, I get the following result in the console:

[
  { doc_name: 'data/doc1.txt', similarity: 0.360020222665688 },
  { doc_name: 'data/doc2.txt', similarity: 0.545840669796527 }
]

As you can see, doc2.txt, whose content is about a dog, has the higher similarity score (about 0.55 versus 0.36 for doc1.txt), so it is the better match for the query "dog".
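
Since the rows come back with a similarity score each, picking the best match is a small step. The sketch below uses the two rows from the console output above; the best_match helper and its min_similarity threshold are hypothetical additions for illustration, not part of the original code.

```python
# Result rows copied from the console output above
results = [
    {"doc_name": "data/doc1.txt", "similarity": 0.360020222665688},
    {"doc_name": "data/doc2.txt", "similarity": 0.545840669796527},
]


def best_match(rows: list[dict], min_similarity: float = 0.0):
    """Return the highest-scoring row, or None if no row reaches the threshold."""
    candidates = [row for row in rows if row["similarity"] >= min_similarity]
    return max(candidates, key=lambda row: row["similarity"], default=None)


top = best_match(results)
print(top["doc_name"])  # data/doc2.txt
```

A threshold can be useful because the query always returns some score for every document, even when none of them is really relevant; the right cutoff depends on the embedding model and the data.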