From Document to Vector: Using OpenSearch to Store Embedding Data

The supporting infrastructure for large LLM jobs can be difficult and costly to set up, and storing vector data requires careful attention to resource consumption. OpenSearch offers a straightforward way to store embeddings generated by tools like Azure OpenAI or the OpenSearch Neural Search plugin, and it handles querying as well, reducing operational overhead. This talk shows how to prepare PDF files, send them to Azure's OpenAI API to generate embeddings, and store the resulting vectors in OpenSearch, all running on a low-maintenance Raspberry Pi cluster with Charmed OpenSearch.
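The pipeline the abstract describes (extract text, embed it via the Azure OpenAI REST API, store the vectors in an OpenSearch k-NN index, then query them) can be sketched roughly as below. The endpoint, deployment name, index name, and the 1536-dimension figure are illustrative assumptions, not details from the talk; a real setup would also need a PDF text-extraction step and the connection details of the Charmed OpenSearch deployment.

```python
import json
import urllib.request

# Placeholder configuration -- substitute your own resource names and keys.
AZURE_ENDPOINT = "https://my-resource.openai.azure.com"   # hypothetical
AZURE_DEPLOYMENT = "text-embedding-ada-002"               # hypothetical
AZURE_API_KEY = "replace-me"
OPENSEARCH_URL = "http://localhost:9200"
INDEX_NAME = "pdf-embeddings"                             # hypothetical


def knn_index_body(dimension: int) -> dict:
    """Index settings/mappings that let OpenSearch store k-NN vectors."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                "embedding": {"type": "knn_vector", "dimension": dimension},
            }
        },
    }


def _send_json(url: str, payload: dict, headers: dict, method: str = "POST") -> dict:
    """POST/PUT a JSON payload and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json", **headers},
        method=method,
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def create_index(dimension: int = 1536) -> dict:
    """Create the k-NN index (1536 matches text-embedding-ada-002 output)."""
    return _send_json(f"{OPENSEARCH_URL}/{INDEX_NAME}",
                      knn_index_body(dimension), {}, method="PUT")


def embed(text: str) -> list:
    """Request one embedding vector from the Azure OpenAI embeddings API."""
    url = (f"{AZURE_ENDPOINT}/openai/deployments/{AZURE_DEPLOYMENT}"
           "/embeddings?api-version=2023-05-15")
    body = _send_json(url, {"input": text}, {"api-key": AZURE_API_KEY})
    return body["data"][0]["embedding"]


def store(doc_text: str) -> dict:
    """Embed a chunk of extracted PDF text and index it alongside the text."""
    vector = embed(doc_text)
    return _send_json(f"{OPENSEARCH_URL}/{INDEX_NAME}/_doc",
                      {"text": doc_text, "embedding": vector}, {})


def search(query_text: str, k: int = 3) -> dict:
    """Embed the query and run an OpenSearch k-NN similarity search."""
    vector = embed(query_text)
    return _send_json(
        f"{OPENSEARCH_URL}/{INDEX_NAME}/_search",
        {"size": k, "query": {"knn": {"embedding": {"vector": vector, "k": k}}}},
        {},
    )
```

This keeps OpenSearch as the single store for both the raw text and its vector, so similarity queries need only the `knn` clause shown in `search`; no separate vector database is required.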

Details

Thursday, September 28, 1:15pm-1:55pm in Willow

Track: Search

Speakers

Pedro Cruz

Engineering Manager at Canonical

Alastair Flynn

Software Engineer at Canonical