This is colossal. It can create embeddings from pretty much any type of format: video, audio, documents. The context is still a bit small compared to what we are used to in text, but this seems major.
How does it compare with Qwen's open-weight multimodal embedding model? Anyone know? This seems lesser from what I read, with the drawback of being behind some API/model I don't have control over. Qwen gives great embeddings out of the gate while also being steerable, i.e. you can supply a prompt to focus the embedding on specific tasks with higher resolution, which in my tests has been mind-blowingly good. Not seeing the value add here.
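For anyone unfamiliar with the "steerable" part: the idea is that you prefix the input with a task instruction before embedding it, so the same model produces task-focused vectors. A minimal sketch of that pattern below; the exact template is an assumption here (it varies by model, so check the model card for the format it was trained with):

```python
# Sketch of instruction-steered embedding input construction.
# The "Instruct: ... / Query: ..." template is an assumption; models that
# support steering each document their own expected prompt format.
def steered_input(task: str, query: str) -> str:
    """Prefix a query with a task instruction before embedding it."""
    return f"Instruct: {task}\nQuery: {query}"

# The resulting string is what you would pass to model.encode(...)
text = steered_input(
    "Given a search query, retrieve relevant product descriptions",
    "waterproof hiking boots",
)
print(text)
```

The model then embeds the whole instruction-plus-query string, which biases the vector toward the stated task without any fine-tuning.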
What's the pricing, and how does it compare to zembed-1 for text-only embeddings?
Pricing is here: https://cloud.google.com/vertex-ai/generative-ai/pricing#emb...
Seems to be 20 cents per million text tokens and 0.012 cents per image.
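To put those rates in concrete terms, here's a quick back-of-the-envelope cost calculation. The rates are taken from the comment above (20 cents per million text tokens, 0.012 cents per image); the function and example volumes are just for illustration:

```python
# Rates from the pricing comment above (converted to dollars).
TEXT_RATE_PER_TOKEN = 0.20 / 1_000_000  # $0.20 per 1M text tokens
IMAGE_RATE = 0.012 / 100                # 0.012 cents = $0.00012 per image

def embedding_cost(text_tokens: int, images: int) -> float:
    """Estimated embedding cost in dollars for a given workload."""
    return text_tokens * TEXT_RATE_PER_TOKEN + images * IMAGE_RATE

# Example: embedding 10M text tokens and 1,000 images.
print(round(embedding_cost(10_000_000, 1_000), 2))  # -> 2.12
```

So a fairly large corpus (10M tokens plus a thousand images) comes out to about two dollars at these rates.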