
Embeddings
Embeddings are mathematical representations of words, phrases, images, or any other content that capture their meaning in a form computers can process. They translate the fuzzy, complex world of human concepts into precise numerical values; they are essentially the "vector" part of vector databases. In the resulting vector space:
- Similar concepts are positioned close together
- Different concepts are positioned far apart
A Practical Example
When you take a word like "king" and convert it to an embedding, you might get something like:
[0.2, -0.6, 0.1, 0.9, -0.2, ...]
These numbers represent various aspects of "king" - perhaps masculinity, royalty, power, and leadership.
The word "queen" might have a similar embedding:
[0.2, -0.5, 0.1, 0.9, 0.2, ...]
Notice how they're similar in many dimensions (positions in the vector), but differ slightly in others (perhaps the one representing gender).
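This closeness can be measured directly. A common choice is cosine similarity, which scores how closely two vectors point in the same direction (1.0 means identical direction). Here is a minimal sketch using the toy 5-dimensional vectors above; real embedding models produce hundreds or thousands of dimensions.

```python
import math

# Toy 5-dimensional embeddings for illustration only
# (real models output hundreds or thousands of dimensions).
king = [0.2, -0.6, 0.1, 0.9, -0.2]
queen = [0.2, -0.5, 0.1, 0.9, 0.2]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction, 0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity(king, queen))  # high similarity, roughly 0.93
```

A score this close to 1.0 is what "positioned close together" means in practice: search and recommendation systems rank results by exactly this kind of comparison.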
In modern AI systems, embeddings are the foundation for everything from search relevance to recommendations to understanding natural language.
What's the Difference?
OpenAI, Ollama, and Gemini use different embedding models, each producing different vector representations, often with different dimensionalities.

These differences matter because:
- Compatibility: Embeddings from different providers aren't directly comparable - you can't mix OpenAI and Gemini embeddings in the same vector space meaningfully
- Performance: Each may perform better for specific domains or types of queries
- Implementation: Database systems that use vectors need to be configured for the specific dimensionality of your chosen embedding model
- Ecosystem: Some providers integrate better with certain tools and frameworks
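The implementation point above is worth guarding against in code: a vector index created for one dimensionality will silently misbehave or error if vectors of another size arrive. A minimal sketch of such a guard, assuming a hypothetical index dimension of 1536 (check your chosen model's documentation for the real value):

```python
# Hypothetical value for illustration; use your embedding model's actual output size.
INDEX_DIMENSION = 1536

def validate_embedding(vector: list[float]) -> list[float]:
    """Reject vectors whose dimensionality doesn't match the index configuration."""
    if len(vector) != INDEX_DIMENSION:
        raise ValueError(
            f"Expected {INDEX_DIMENSION} dimensions, got {len(vector)}: "
            "embeddings from a different model cannot share this index."
        )
    return vector
```

Running every vector through a check like this at ingestion time turns a subtle cross-provider mix-up into an immediate, explicit error.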
When building a vector database, choose one embedding provider and use it consistently throughout your application. Switching providers later would require re-embedding all of your content.
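That migration is a full pass over your source data, not a transformation of the old vectors: embeddings from the old model cannot be converted into the new model's space. A sketch of what the pass looks like, where `embed_fn` stands in for whichever embedding call the new provider's client library exposes (a hypothetical placeholder, not a real API):

```python
def reembed_collection(documents, embed_fn):
    """Re-embed every stored document with the new provider's model.

    documents: mapping of document id -> original text. The raw text must
    still be available, since old vectors cannot be translated directly.
    embed_fn: placeholder for the new provider's embedding call.
    """
    return {doc_id: embed_fn(text) for doc_id, text in documents.items()}
```

This is also why keeping the original text (or a pointer to it) alongside each vector is standard practice: it is the only thing that makes a later provider switch possible.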
