RAG service

Now we'll put together everything we've learnt so far.

I have a file, doc1.txt, with the following content:

cat says meow

And I have another file, doc2.txt, with the following content:

dog will bark

Let's create a simple function that converts the query into an embedding and finds the matching documents:

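async function queryEmbedding(query: string) {
  // Convert the query text into an embedding vector
  const embedding = await createEmbeddingOpenAI(query);
  // Ask the query_documents database function for the matching documents
  const { data, error } = await supabase.rpc("query_documents", {
    query_embedding: embedding,
  });
  // Surface a failed call instead of silently returning null data
  if (error) throw error;

  return data;
}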
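To make the matching step more concrete, here is a minimal in-memory sketch of what a similarity search does. It assumes the similarity is cosine similarity between embedding vectors, which this section does not confirm. The names cosineSimilarity, rankDocuments and StoredDoc are hypothetical, and the 3-dimensional vectors are toy values, not real OpenAI embeddings.

```typescript
// Hypothetical in-memory stand-in for the database-side search.
// Assumption: similarity means cosine similarity between two embeddings.
type StoredDoc = { doc_name: string; embedding: number[] };
type Match = { doc_name: string; similarity: number };

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embeddings must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as not similar to anything
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored document against the query vector, best match first
function rankDocuments(queryVector: number[], docs: StoredDoc[]): Match[] {
  return docs
    .map((doc) => ({
      doc_name: doc.doc_name,
      similarity: cosineSimilarity(queryVector, doc.embedding),
    }))
    .sort((x, y) => y.similarity - x.similarity);
}

// Toy vectors chosen by hand for illustration only
const storedDocs: StoredDoc[] = [
  { doc_name: "data/doc1.txt", embedding: [1, 0, 0] },
  { doc_name: "data/doc2.txt", embedding: [0, 1, 0] },
];
const ranked = rankDocuments([0.2, 0.9, 0], storedDocs);
console.log(ranked);
```

In the real service this work happens inside the database, so the application only receives the document names and their scores.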

Now we'll put everything together for querying:

1. Read the files
2. Create embeddings for the contents of those files
3. Get the query
4. Create an embedding for the query
5. Find the matching documents

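async function main() {
  // Read both files and create an embedding for each one
  await createEmbeddingFromFiles(["data/doc1.txt", "data/doc2.txt"]);
  // Embed the query and find the matching documents
  const query = "dog";
  const results = await queryEmbedding(query);
  console.log(results);
}

// Log any failure instead of leaving an unhandled promise rejection
main().catch(console.error);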

When I run the above code, I get the following result in the console:

[
  { doc_name: 'data/doc1.txt', similarity: 0.360020222665688 },
  { doc_name: 'data/doc2.txt', similarity: 0.545840669796527 }
]

As you can see, doc2.txt is the file that talks about a dog, which is why its similarity score is higher for the query "dog".
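A RAG service usually wants the best-scoring document rather than the whole list. The sketch below picks it from results of the shape printed above. The bestMatch helper and the SearchResult type are hypothetical names introduced for illustration, and the values are copied from the console output.

```typescript
// Shape of one row in the results printed above
type SearchResult = { doc_name: string; similarity: number };

// Hypothetical helper: return the row with the highest similarity,
// or undefined when there are no results at all
function bestMatch(results: SearchResult[]): SearchResult | undefined {
  return results.reduce<SearchResult | undefined>(
    (best, current) =>
      best === undefined || current.similarity > best.similarity
        ? current
        : best,
    undefined,
  );
}

// Values copied from the console output above
const sampleResults: SearchResult[] = [
  { doc_name: "data/doc1.txt", similarity: 0.360020222665688 },
  { doc_name: "data/doc2.txt", similarity: 0.545840669796527 },
];

const top = bestMatch(sampleResults);
console.log(top?.doc_name); // data/doc2.txt
```

Returning undefined for an empty list makes the caller decide what to do when nothing matches, instead of failing on a missing first element.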