Embedding

MaaS-embedding-3-large is the latest and most powerful embedding model. Embeddings produced by different models are not interchangeable, so there is no in-place upgrade path: to transition from MaaS-embedding-ada-002 to MaaS-embedding-3-large, you need to regenerate your embeddings with the new model.

The following models are available for purchase:

  • MaaS-embedding-3-large
  • MaaS-embedding-3-small
  • MaaS-embedding-ada-002

MaaS-embedding-3-large

  • Powerful Performance

On the common multilingual retrieval benchmark MIRACL, the average score rises from 31.4% to 54.9%, and on the English task benchmark MTEB, the average score improves from 61.0% to 64.6%. The model understands and processes text content more accurately, producing embeddings with up to 3072 dimensions, thereby providing richer semantic representations for complex natural language processing tasks.

  • Flexible Adjustment Support

Developers can flexibly shorten embeddings via the API's dimensions parameter without losing their concept-representing properties: trailing values are removed from the end of the vector, trading a small amount of accuracy for lower storage and compute cost, thus adapting to different application scenarios and resource constraints.

  • Wide Application Range

    The model can be applied to various natural language processing tasks and scenarios, such as text clustering, retrieval, and knowledge graph construction. It also supports applications like knowledge retrieval in ChatGPT and Assistants API, as well as numerous retrieval-augmented generation (RAG) development tools.
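As a concrete illustration of using the model through an API, a request might look like the sketch below. The endpoint URL, authentication header, and response shape are assumptions for illustration only; the source document does not specify them, and only the model name and the dimensions parameter come from this page.

```python
import json

# Hypothetical request payload for a MaaS-style embeddings endpoint.
# The model name is from this page; the "dimensions" value and the
# commented-out HTTP details below are illustrative assumptions.
payload = {
    "model": "MaaS-embedding-3-large",
    "input": ["The quick brown fox jumps over the lazy dog."],
    "dimensions": 1024,  # optional: shorten from the native 3072 dims
}

# The actual call would be an authenticated HTTP POST, e.g.:
# import requests
# resp = requests.post(
#     "https://api.example.com/v1/embeddings",           # placeholder URL
#     headers={"Authorization": "Bearer YOUR_API_KEY"},  # placeholder key
#     json=payload,
# )
# embedding = resp.json()["data"][0]["embedding"]

print(json.dumps(payload, indent=2))
```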

MaaS-embedding-3-small

  • Performance Enhancement

Compared to the previous-generation model, MaaS-embedding-ada-002, the average score on the multilingual retrieval benchmark MIRACL rises from 31.4% to 44.0%, and on the English task benchmark MTEB, the average score rises from 61.0% to 62.3%.

  • Support for Embedding Shortening

As with MaaS-embedding-3-large, developers can shorten embeddings via the API's dimensions parameter without losing their concept-representing properties. Trailing values are removed from the end of the vector, facilitating a balance between performance and cost across different application scenarios and resource constraints.
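Mechanically, this shortening amounts to keeping the first k components of the vector and re-normalizing to unit length so that cosine similarity still behaves. A minimal client-side sketch with a mock embedding (NumPy assumed available; a real vector would come from the API):

```python
import numpy as np

def shorten_embedding(vec, k):
    """Keep the first k components and L2-renormalize.

    This mirrors the API-side 'dimensions' parameter described above:
    trailing values are dropped from the end of the vector, then the
    result is rescaled to unit length.
    """
    truncated = np.asarray(vec, dtype=float)[:k]
    norm = np.linalg.norm(truncated)
    return truncated / norm if norm > 0 else truncated

# Mock full-length, unit-norm embedding standing in for API output.
rng = np.random.default_rng(0)
full = rng.normal(size=1536)
full /= np.linalg.norm(full)

short = shorten_embedding(full, 256)
print(short.shape, round(float(np.linalg.norm(short)), 6))  # (256,) 1.0
```

The shortened vector stores six times less data here while remaining directly usable for similarity comparisons against other vectors shortened the same way.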

MaaS-embedding-ada-002

  • Affordable Pricing

    Compared to other models, it offers a cost advantage, lowering the barrier to entry.

  • Powerful Performance

Outperforms previous embedding models in tasks such as text search, code search, and sentence similarity, and also performs strongly in text classification. It can be applied to various natural language processing tasks, including text clustering, sentiment analysis, and machine translation.

  • Comprehending Text Meaning

The model looks beyond the literal text to deeper meaning, such as recognizing synonyms or the specific sense of a word in context, so that the generated embedding vectors reflect semantic features rather than surface wording.

  • High-dimensional Data Compression

    Capable of compressing complex textual information into simpler numerical vectors, reducing data dimensionality while preserving key information.

  • Versatile Usage

    The embedding method is universal, suitable for a wide range of natural language processing tasks.

  • Built on Deep Learning

    Constructed using deep learning technology, it has been trained on vast amounts of textual data, thereby learning how to effectively represent and comprehend language.

  • Convertible to Numerical Vectors

    Transforms text into numerical vectors, facilitating computer processing through various algorithms.

  • Facilitates Subsequent Tasks

    The generated numerical text vectors can be used to train machine learning models or for various data analysis and natural language processing tasks.

  • Pre-trained Model

    It is a pre-trained model that users can utilize directly without needing to train from scratch.
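To make "facilitates subsequent tasks" concrete: once texts are converted to numerical vectors, tasks like search reduce to simple vector arithmetic. The sketch below ranks a tiny corpus by cosine similarity to a query. The texts and vector values are made up for illustration; real embeddings would come from a model such as MaaS-embedding-ada-002.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Mock 4-dimensional embeddings standing in for model output.
corpus = {
    "feline pet":   [0.9, 0.1, 0.0, 0.1],
    "house cat":    [0.8, 0.2, 0.1, 0.1],
    "stock market": [0.0, 0.1, 0.9, 0.3],
}
query = [0.85, 0.15, 0.05, 0.1]  # pretend embedding of "cat"

# Rank documents by similarity to the query -- the core operation of
# embedding-based retrieval and RAG pipelines.
ranked = sorted(corpus, key=lambda k: cosine_similarity(query, corpus[k]),
                reverse=True)
print(ranked[0])  # feline pet
```

The same ranking primitive underlies the clustering, retrieval, and RAG use cases listed above; only what you do with the similarity scores changes.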