OpenAI Releases Three GPT Real-time Voice Models
Multiple bloggers mention that OpenAI has simultaneously launched three voice models in the Realtime API. The flagship, GPT-Realtime-2, is the first voice model to incorporate GPT-5-level reasoning. It expands the context window from the previous generation’s 32K to 128K and supports parallel tool calls as well as spoken narration while tools are executing. GPT-Realtime-Translate takes a voice-to-voice approach and supports real-time translation from 70+ input languages into 13 output languages. GPT-Realtime-Whisper is a streaming speech-to-text model that outputs text as you speak. On the Big Bench Audio test, the flagship model improved from the previous generation’s 81.4% to 96.6%, and its multi-turn instruction-following score rose from 34.7% to 48.5%. On pricing, the flagship model costs $32 per million audio input tokens ($0.40 cached) and $64 per million output tokens; translation and transcription are billed per minute at approximately $0.034 and $0.017 respectively.
Sources:
- @xiaohu: https://x.com/xiaohu/status/2052646097525555626 | https://x.com/xiaohu/status/2052646102835532181 | https://x.com/xiaohu/status/2052646105096298733
- @OpenAIDevs: https://x.com/OpenAIDevs/status/2052440907933474954
- @sama: https://x.com/sama/status/2052462271667028211 | https://x.com/sama/status/2052558319940944256
- @dotey: https://x.com/dotey/status/2052440968863887715
- @LufzzLiz: https://x.com/LufzzLiz/status/2052533468417159498
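To make the quoted GPT-Realtime-2 prices easier to reason about, here is a minimal Python sketch that turns token counts and minutes into an estimated cost. The rates come from the item above; the function names, and the assumption that cached tokens are a subset of the input tokens billed at the cached rate, are illustrative and not taken from OpenAI's documentation.

```python
# Prices quoted in the digest item above, in USD per one million tokens.
AUDIO_INPUT_PER_M = 32.00
CACHED_INPUT_PER_M = 0.40
OUTPUT_PER_M = 64.00

# Approximate per-minute rates quoted for the other two models.
TRANSLATE_PER_MINUTE = 0.034
TRANSCRIBE_PER_MINUTE = 0.017


def realtime_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one session with the flagship model.

    Assumption: cached_tokens is the part of input_tokens billed at the
    cached rate; the remainder is billed at the full audio input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * AUDIO_INPUT_PER_M
        + cached_tokens * CACHED_INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
    ) / 1_000_000
    return round(total, 6)


def per_minute_cost(minutes: float, rate_per_minute: float) -> float:
    """Estimate the USD cost of a per-minute billed model."""
    return round(minutes * rate_per_minute, 6)


if __name__ == "__main__":
    print(realtime_cost(input_tokens=1_000_000, output_tokens=1_000_000))
    print(per_minute_cost(10, TRANSLATE_PER_MINUTE))
```

Actual billing may differ (for example, text tokens or rounding rules are not covered here), so treat the result as a rough estimate only.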
Codex for Chrome Extension Officially Released
Multiple bloggers mention that OpenAI has launched a Chrome extension for Codex. It can directly operate logged-in web pages in the browser and process multiple tabs in the background without interrupting the user’s current browsing. It operates pages by writing and running code, and can work with existing plugins and with websites that require login. It supports macOS and Windows, although some network nodes, such as Hong Kong, are not supported. Chrome must be set as the default browser to complete the full onboarding process. Besides Chrome, any Chromium-based browser can use the extension. After installation, it works in a separate tab group, where users can find the web pages it is controlling.
Sources:
- @xiaohu: https://x.com/xiaohu/status/2052564516362498321 | https://x.com/xiaohu/status/2052564521060028917
- @OpenAIDevs: https://x.com/OpenAIDevs/status/2052481136971125158
- @vista8: https://x.com/vista8/status/2052647425832329358
- @cellinlab: https://x.com/cellinlab/status/2052565321857253773 | https://x.com/cellinlab/status/2052566272232067450
- @op7418: https://x.com/op7418/status/2052576841656099037
Nous Research Releases Hermes Agent v0.13.0
A blogger mentions that Nous Research has released Hermes Agent v0.13.0, codenamed “The Tenacity Release”. It can be installed via the `hermes update` command and now supports Spanish. ComfyUI has been integrated into Hermes Agent’s skill system, so developers can have the Agent call ComfyUI for image generation from natural-language descriptions. There is also an Autobrowse integration case showing task time being reduced from 102 seconds over two iterations.
Sources:
- @NousResearch: https://x.com/NousResearch/status/2052493732205744303 | https://x.com/NousResearch/status/2052532078722363803
OpenAI Releases Official Command-Line Tool openai-cli
A blogger mentions that OpenAI has launched an official command-line tool, openai-cli. The project is open source on GitHub (openai/openai-cli) under the Apache 2.0 license and can be installed via Homebrew or Go. Its core capabilities include calling the Responses API with support for all cloud-hosted built-in tools (web search, code interpreter, file retrieval, image generation, etc.). Output is available in Unix-friendly structured formats such as JSON, YAML, and JSONL, and fields can be extracted with GJSON syntax. Image generation, image editing, speech transcription, and TTS can each be done with a single command, and the tool also supports creating projects and issuing API keys. File parameters use the `@file.ext` syntax, consistent with curl conventions, and binary content can be explicitly base64-encoded with `@data://`. The publisher describes it as a lightweight passion project aimed primarily at Agent usage scenarios.
Sources:
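For readers unfamiliar with the JSONL output and GJSON-style field extraction mentioned above, the following Python sketch illustrates the general idea of pulling one field out of each JSON line with a dotted path. It is not the CLI's implementation and covers only a tiny subset of GJSON (plain keys and numeric array indexes); the function names and the sample records are hypothetical.

```python
import json
from typing import Any


def get_path(value: Any, path: str) -> Any:
    """Follow a dotted path such as "output.0.text" through dicts and lists.

    Returns None when any step is missing. Only plain keys and numeric
    list indexes are handled; real GJSON supports far more.
    """
    for part in path.split("."):
        if isinstance(value, dict):
            if part not in value:
                return None
            value = value[part]
        elif isinstance(value, list):
            if not part.isdigit() or int(part) >= len(value):
                return None
            value = value[int(part)]
        else:
            return None
    return value


def extract_from_jsonl(text: str, path: str) -> list:
    """Apply get_path to every non-empty line of a JSONL string."""
    return [
        get_path(json.loads(line), path)
        for line in text.splitlines()
        if line.strip()
    ]


# Hypothetical records for illustration, not real CLI output.
sample = (
    '{"id": "a", "output": [{"text": "hello"}]}\n'
    '{"id": "b", "output": []}\n'
)
print(extract_from_jsonl(sample, "output.0.text"))  # ['hello', None]
```

One JSON object per line keeps the output easy to pipe into other tools, which is presumably why the format suits Agent workflows.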
Google Releases Lightweight Fitbit Fitness Band
A blogger mentions that Google has launched a screenless Fitbit fitness band that weighs only 5 grams and can be worn for a week without charging (5 minutes of charging provides a full day of power). It supports heart rate, heart rhythm (with atrial fibrillation alerts), blood oxygen, skin temperature, sleep staging, heart rate variability, and fully automatic exercise recognition, with 50-meter water resistance. The hardware costs $99 as a one-time purchase, with an optional $9.99/month subscription; according to the blogger, this is less than half of Whoop’s annual fee and two-thirds cheaper than Oura’s hardware. The real selling point is the pairing with the Gemini-powered Google Health Coach, which can provide customized suggestions based on sleep and exercise data, such as generating a training plan from a photo of gym equipment.
Sources:
- @xiaohu: https://x.com/xiaohu/status/2052584541387444496 | https://x.com/xiaohu/status/2052584543618732442 | https://x.com/xiaohu/status/2052584546223419454
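As a quick worked example of the Fitbit pricing above, this short Python sketch computes the total cost of ownership over a number of months. The $99 hardware price and $9.99/month subscription come from the item above; the function name is hypothetical, and competitor prices are not included because the digest does not state them.

```python
HARDWARE_USD = 99.00              # one-time hardware price from the digest
SUBSCRIPTION_USD_PER_MONTH = 9.99  # optional monthly subscription


def fitbit_total_cost(months: int, with_subscription: bool = True) -> float:
    """Total USD spent after `months` months of ownership."""
    if months < 0:
        raise ValueError("months must be non-negative")
    subscription = SUBSCRIPTION_USD_PER_MONTH * months if with_subscription else 0.0
    return round(HARDWARE_USD + subscription, 2)


if __name__ == "__main__":
    print(fitbit_total_cost(12))                           # 218.88
    print(fitbit_total_cost(12, with_subscription=False))  # 99.0
```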
“GEO Red Paper” Released for Free
A blogger mentions that the “GEO White Paper” released in early 2025 still receives hundreds of visits daily, but that the industry has been chaotic this year: black-hat GEO is prevalent, junk service providers are fleecing customers, false promises are everywhere, and even CCTV’s 315 program has criticized the situation. Teacher Yao and the blogger recently compiled a 100,000-word “GEO Red Paper” based on cutting-edge domestic and international papers, generative AI regulations and internet advertising laws, and a year of practical experience. It aims to bring the industry back to rationality and covers common black-hat GEO tactics, methods for evaluating service provider quality, and a GEO risk self-assessment checklist. It is now available for free.
Sources:
AIHOT Opens for Free: Enhanced AI Hotspot Monitoring
A blogger mentions that AIHOT, an AI hotspot monitoring website previously developed for internal company use, is now open to everyone for free. Its Skill, RSS, and API access are also completely free. The tool scrapes content from 168 curated data sources, scores it through an AI computing pipeline, and pushes high-value information to users. An AI daily report feature helps users catch up on AI news quickly. A changelog page and category filtering were also added the same day, and mobile adaptation is in progress.
Sources:
- @Khazix0918: https://x.com/Khazix0918/status/2052607019019079768 | https://x.com/Khazix0918/status/2052726850431148059
- @Khazix0918: https://x.com/Khazix0918/status/2052638450181124284 | https://x.com/Khazix0918/status/2052663200810991715
Sam Altman Discusses Voice Interaction Trends
A blogger mentions that Sam Altman stated in a tweet that young people seem to prefer interacting with AI by voice, while older people prefer typing, and that he is curious whether this preference will change over time. On the same day, he said that GPT-Realtime-2’s arrival in the API is a significant improvement and that the team is also working on improving the voice experience in ChatGPT conversations. Altman also shared a photo and commented that helping software developers evolve, Pokémon-style, into superheroes is much more interesting than trying to replace them. He cited his earlier views and remarked that a truly talented person can now create amazing results.
Sources:
- @sama: https://x.com/sama/status/2052462271667028211 | https://x.com/sama/status/2052485051812909530
Statistics
- Scanned timeline entries: 240
- Number of bloggers hit: 25
- Total tweets hit: 136
- Weighted tweet score: 101.95
- Original tweets: 67
- RT tweets: 39
- Scrape attempts: 1
- Boundary coverage status: Complete (Following timeline tail has clearly passed yesterday’s boundary)