This is very exciting. I tried it on OpenRouter, and the speed is really impressive. I compared it with gpt-oss-120B, and I would say Mercury was a few times faster. Was it 10x faster? Maybe. I didn't check with a timer, but subjectively I would say it's at least 3x faster. About the quality: if anything, I liked Mercury better, but I think at this point it's just a matter of style preference. The content was absolutely comparable.
Now, what I tried was a free sample of both models. It looks like if you want to use the models in earnest, you pay $1 per million output tokens for Mercury, and between $0.25 and $0.95 for gpt-oss-120B depending on the provider, with Cerebras getting you a mind-blowing 6645 tokens/second for the latter, at $0.65 per million tokens.
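To put those prices side by side, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above; the function names are made up for illustration, and the prices and the 6645 tokens/second figure are a snapshot that may well have changed.

```python
# Back-of-the-envelope comparison using the figures quoted in this comment.
# All numbers are assumptions taken from the text, not live pricing data.

def output_cost(tokens: int, usd_per_million: float) -> float:
    """Cost in USD of generating `tokens` output tokens."""
    return tokens / 1_000_000 * usd_per_million


def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Time in seconds to generate `tokens` at a steady throughput."""
    return tokens / tokens_per_second


if __name__ == "__main__":
    n = 1_000_000  # one million output tokens

    print(f"Mercury:                 ${output_cost(n, 1.00):.2f}")
    print(f"gpt-oss-120B (cheapest): ${output_cost(n, 0.25):.2f}")
    print(f"gpt-oss-120B (Cerebras): ${output_cost(n, 0.65):.2f}")

    # At the quoted Cerebras throughput, a million tokens takes a few minutes.
    print(f"Cerebras time for 1M tokens: {generation_seconds(n, 6645):.0f} s")
```

So on price alone, Mercury costs roughly 1.5x the Cerebras rate and 4x the cheapest gpt-oss-120B rate per output token. There is no Mercury throughput number above to plug into `generation_seconds`, so the speed side of the comparison stays subjective.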
Still, Mercury is able to trade punches with freaking OpenAI. That's, in my opinion, quite exciting.