Vector Embeddings and the Mathematics Behind Them
June 2025

Imagine stepping into a world where everything — words, images, even products — is mapped into a universe of numbers. In this world, "cat" and "dog" aren’t just words, but points lying close together in a vast multidimensional space. Welcome to the fascinating world of vector embeddings.
🤔 Why I Started Exploring Vector Embeddings
At Creatosaurus, we manage a huge library of design templates. Manually tagging each asset had become a massive bottleneck. We needed a smarter, scalable way for users to discover the right designs — without relying on manual work.
That’s when I stumbled upon semantic search — a system that understands meaning, not just keywords.
I quickly built a prototype using a few templates, and the results blew my mind. When I searched using natural phrases, the system understood my intent and returned highly relevant results — even without exact keyword matches.
This "aha!" moment sparked my curiosity:
How can a system understand meaning so well — even visually?
After diving deeper, I began learning how this seemingly magical system works. This article captures my understanding as I explored the math and ideas behind it.
🧠 What is a Vector Embedding?
A vector embedding is a mathematical representation of an object — a word, image, document, or even a user — as a point in n-dimensional space, usually a dense vector of real numbers.
Examples:
- Word: "cat" → [1.2, 0.9, -0.3, 0.0, 0.7] (5D vector)
- Sentence: "I love books" → [0.13, -0.76, 0.48, ..., 0.91] (e.g., 300D)
👉 The idea: The more similar two objects are in meaning, the closer their vectors lie in space.
⚙️ How Do Vector Embeddings Work?
🔄 From Raw Data to Vector
So how do we go from "cat" to [1.2, 0.8, -0.3, ..., 0.1]?
The process involves transformation — turning raw data (text, images, etc.) into vectors using a trained machine learning model, such as CLIP.
For example:
- A word like "cat" becomes a vector like [1.2, 0.8, -0.3, ...]
- A sentence like "The cat is on the mat" gets its own vector
- An image of a dog is converted into another vector
- A product description like "wireless gaming mouse" is also embedded as a vector
Once transformed, each item exists as a point in a shared numerical universe — ready for math!
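To make this concrete, here is a minimal sketch of that transformation using the open-source sentence-transformers library (not necessarily what Creatosaurus uses in production; the model name and output dimensions are just illustrative):

```python
# Minimal sketch: turning raw text into embedding vectors.
# Assumes `pip install sentence-transformers`; the model name is illustrative.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

texts = [
    "cat",
    "The cat is on the mat",
    "wireless gaming mouse",
]

vectors = model.encode(texts)  # one dense vector per input
print(vectors.shape)           # (3, 384): three 384-dimensional vectors
```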
🗺️ Visualizing Embeddings
Imagine a 2D graph with emojis as data points:
- X-axis → "Category Similarity"
- Y-axis → "Semantic Depth" (like living vs. non-living)
📍 Placement examples:
- 🐶 Dog, 🐱 Cat, and 🦁 Lion cluster in the upper-right (similar animals)
- 🌳 Tree is higher on the semantic axis — it's living, but not an animal
- 🚗 Car is in the bottom-left — non-living and categorically unrelated
✨ This visualization shows how embeddings capture meaning — not just surface similarity, but conceptual relationships.
🧮 The Math Behind Vector Embeddings
Once we have vectors, we can use geometry and linear algebra to compare and analyze them.
🟩 a. Dot Product (Inner Product)
Measures how strongly two vectors point in the same direction.
Larger values → more aligned vectors (for vectors of comparable magnitude).
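In code, the dot product is just the sum of element-wise products. A tiny sketch in plain Python, using the same made-up 2D vectors as the Cat/Dog example further below:

```python
# Dot product: a · b = a1*b1 + a2*b2 + ... + an*bn
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

print(dot([0.6, 0.8], [0.4, 0.2]))  # 0.24 + 0.16 = 0.4
```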
🟨 b. Vector Norm (Magnitude)
Gives the length of a vector — how "strongly" it points in space.
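It is simply the square root of a vector's dot product with itself. A quick sketch:

```python
import math

# Norm (magnitude): ||a|| = sqrt(a1^2 + a2^2 + ... + an^2)
def norm(a):
    return math.sqrt(sum(x * x for x in a))

print(norm([0.6, 0.8]))  # sqrt(0.36 + 0.64) = 1.0
```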
🟦 c. Cosine Similarity
Compares the direction of two vectors while ignoring their lengths (the dot product divided by the product of the norms):
- +1 → identical direction (high similarity)
- 0 → orthogonal (no relation)
- -1 → opposite direction (opposite meanings)
Used in search, recommendations, and semantic analysis.
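A self-contained sketch, combining the dot product and norms from above:

```python
import math

# Cosine similarity: cos(theta) = (a · b) / (||a|| * ||b||)
def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([0.6, 0.8], [0.4, 0.2]))  # ≈ 0.894
```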
🟥 d. Euclidean Distance
Measures the straight-line distance between two vectors in the embedding space.
Smaller = more similar.
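And a matching sketch for Euclidean distance:

```python
import math

# Euclidean distance: d(a, b) = sqrt((a1-b1)^2 + ... + (an-bn)^2)
def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(euclidean_distance([0.6, 0.8], [0.4, 0.2]))  # ≈ 0.632
```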
✏️ Let’s Try an Example
Let’s assign some coordinates to three familiar concepts on a 2D plot:
- 🐱 Cat (A): [0.6, 0.8]
- 🐶 Dog (B): [0.4, 0.2]
- 🌳 Tree (C): [-0.8, 2.2]
🔍 Compare: Cat vs Dog
Dot Product
A · B = (0.6 × 0.4) + (0.8 × 0.2) = 0.24 + 0.16 = 0.40
Norms
Norm of Cat (A = [0.6, 0.8]): ||A|| = √(0.6² + 0.8²) = √1.00 = 1.00
Norm of Dog (B = [0.4, 0.2]): ||B|| = √(0.4² + 0.2²) = √0.20 ≈ 0.45
Cosine Similarity
cos(θ) = 0.40 / (1.00 × 0.45) ≈ 0.89
Euclidean Distance
d(A, B) = √((0.6 − 0.4)² + (0.8 − 0.2)²) = √0.40 ≈ 0.63
✅ Close together
🔍 Cat vs Tree
Dot Product
A · C = (0.6 × −0.8) + (0.8 × 2.2) = −0.48 + 1.76 = 1.28
Norm of Tree (C = [-0.8, 2.2])
||C|| = √((−0.8)² + 2.2²) = √5.48 ≈ 2.34
Cosine Similarity
cos(θ) = 1.28 / (1.00 × 2.34) ≈ 0.55
➖ Moderate similarity
Euclidean Distance
d(A, C) = √((0.6 − (−0.8))² + (0.8 − 2.2)²) = √3.92 ≈ 1.98
➖ Farther apart
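If you want to double-check the arithmetic, here is a short NumPy sketch (assuming NumPy is installed) that reproduces the numbers above:

```python
import numpy as np

cat = np.array([0.6, 0.8])    # A
dog = np.array([0.4, 0.2])    # B
tree = np.array([-0.8, 2.2])  # C

def cosine(a, b):
    # cos(theta) = (a · b) / (||a|| * ||b||)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(cat, dog))            # ≈ 0.894 -> very similar
print(cosine(cat, tree))           # ≈ 0.547 -> moderate
print(np.linalg.norm(cat - dog))   # ≈ 0.632 -> close together
print(np.linalg.norm(cat - tree))  # ≈ 1.980 -> farther apart
```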
🧵 Closing Thoughts
Vector embeddings aren’t just numbers — they capture meaning. From knowing a cat is like a dog to understanding a product’s vibe, they power smarter systems.
As someone approaching this from a design and product lens, I’m amazed at how math, language, and visuals come together.
This post was my attempt to simplify the magic — with intuition, not theory. Hope it gives you a friendly nudge into the world of embeddings!
Happy Coding! 👨💻✨