Text embeddings represent text in numerical form (a vector, i.e. a list of numbers) that captures its semantic meaning.

Text is converted into embeddings by an embedding model. The quality of the embeddings depends on the size of the model and on how closely its training data matches the target domain.

The relatedness of two vectors can be measured with cosine similarity, or with a distance function such as cosine distance (1 minus cosine similarity) or Euclidean distance.

A small distance (equivalently, a high cosine similarity) indicates high relatedness.
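
As a rough illustration of the relationship between similarity and distance, here is a minimal Python sketch. The three-dimensional vectors are hand-made toy values, not the output of any real embedding model (real embeddings typically have hundreds or thousands of dimensions), and the function names are chosen for this example only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Cosine distance, defined as 1 minus cosine similarity."""
    return 1.0 - cosine_similarity(a, b)


# Toy vectors (assumed values, for illustration only).
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.0, 0.1, 0.9]

# Related texts should end up closer together than unrelated ones.
print("cat vs kitten:", cosine_distance(cat, kitten))
print("cat vs car:   ", cosine_distance(cat, car))
```

With these toy values, the distance between "cat" and "kitten" comes out much smaller than the distance between "cat" and "car", which is the behaviour expected of a good embedding model.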

Common uses for embeddings:

  • Search
  • Clustering
  • Recommendations
  • Anomaly detection
  • Diversity measurement
  • Classification
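
To make the first use case concrete, search can be sketched as ranking stored texts by their similarity to a query vector. The corpus, the vectors, and the `search` helper below are all hypothetical and hand-made for illustration; in practice the vectors would come from an embedding model, and a large corpus would usually rely on a vector index rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec, corpus, top_k=2):
    """Return the top_k texts whose vectors are most similar to query_vec."""
    scored = [
        (cosine_similarity(query_vec, vec), text)
        for text, vec in corpus.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [text for _, text in scored[:top_k]]


# Toy corpus: text mapped to an assumed, hand-made embedding.
corpus = {
    "how to feed a kitten": [0.8, 0.2, 0.1],
    "cat grooming tips": [0.9, 0.1, 0.0],
    "changing a car tire": [0.0, 0.1, 0.9],
}

# Assumed embedding of a cat-related query.
query = [0.85, 0.15, 0.05]
results = search(query, corpus)
print(results)
```

The other uses in the list build on the same primitive: clustering groups vectors that are close together, anomaly detection flags vectors that are far from all others, and classification assigns the label of the nearest labeled examples.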