Aspect requested in the prompt | o3-DR | Gemini-2.5-DR | Sonnet-3.7-DR |
---|---|---|---|
Provider coverage (OpenAI, Anthropic, Gemini) | ✔ – but invents extra fictitious model names. | ✔ – sticks to doc-backed names. | ✔ – sticks to doc-backed names. |
Model list & pricing table | ✔ – huge, but many figures wrong. | ✔ – broadly correct, cites tiers & caching. | ✔ – covers core prices but fewer variants. |
“How to access” in TypeScript | Partial – OpenAI/Anthropic only; Gemini SDK never actually imported. | ✔ | ✔ |
Reasoning vs non-reasoning models explained | ✔ – but relies on invented “o3/o4” taxonomy. | ✔ – explains visible vs hidden chain-of-thought. | ◑ – mentions it, but more briefly. |
Thinking tokens / chain-of-thought | Mentions them for Gemini & Claude but conflates them with OpenAI. | ✔ – flags uncertainty. | ✔ – demonstrates Claude thinking, notes it is not exposed on OpenAI. |
Idiosyncrasies / known issues from forums | Sparse. | ✔ – has dedicated section. | Minimal. |
Streaming, JSON mode, schema, multimodal | Covers all four, but with few concrete examples. | ✔ – many TS examples. | ✔ – code for each. |
Cost-calculation formula | Qualitative only. | ✔ – shows formulas & pitfalls. | ✔ – includes helper fn. |
Contradictions or “what we could not find” | Rare. | ✔ – calls out thin docs & tokeniser gaps. | Few. |
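
For context on the "Cost-calculation formula" row, a helper of the kind the reports are credited with might look roughly like the sketch below. It is not taken from any of the three reports: the `Pricing` and `Usage` shapes and the `estimateCostUsd` name are hypothetical, the rates are supplied by the caller rather than quoted from any price list, and billing hidden reasoning tokens at the output rate is an assumption that should be checked against each provider's documentation.

```typescript
// Hypothetical shapes for illustration; not taken from any provider SDK.
interface Pricing {
  inputPerMTok: number;        // USD per 1M uncached input tokens
  outputPerMTok: number;       // USD per 1M output tokens
  cachedInputPerMTok?: number; // USD per 1M cached input tokens, if discounted
}

interface Usage {
  inputTokens: number;         // uncached input tokens
  outputTokens: number;        // visible output tokens
  cachedInputTokens?: number;  // input tokens served from a prompt cache
  reasoningTokens?: number;    // hidden thinking tokens (assumed billed as output)
}

function estimateCostUsd(usage: Usage, pricing: Pricing): number {
  const perToken = (ratePerMTok: number): number => ratePerMTok / 1e6;
  // Fall back to the normal input rate when no cache discount is given.
  const cachedRate = pricing.cachedInputPerMTok ?? pricing.inputPerMTok;
  // Assumption: reasoning tokens are charged at the same rate as output tokens.
  const billedOutput = usage.outputTokens + (usage.reasoningTokens ?? 0);
  return (
    usage.inputTokens * perToken(pricing.inputPerMTok) +
    (usage.cachedInputTokens ?? 0) * perToken(cachedRate) +
    billedOutput * perToken(pricing.outputPerMTok)
  );
}
```

Keeping cached and reasoning tokens as separate fields makes the two pitfalls the table alludes to explicit: cache discounts lower the input bill, while hidden thinking tokens can raise the output bill without appearing in the response text.
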

Winner on task adherence: Gemini-2.5-DR.
Rank order by volume of detail: o3-DR (highest volume) → Gemini-2.5-DR (high but curated) → Sonnet-3.7-DR (concise).
o3-DR is ~2× longer than the others, but much of it is repetition or imagined specs; Gemini-2.5-DR’s detail is better targeted.