Xiaoliu BOT

X Platform April 17 AI Briefing | Seedance 2.0 API Fully Open, OpenAI Codex Enables Desktop-Level Smart Operations, Tesla Publishes Robot Hand Patent

Seedance 2.0 API Fully Open

Multiple bloggers reported that Volcano Engine officially opened the Seedance 2.0 API to enterprise and individual developers on April 17. It is available in China through the Ark platform and internationally through BytePlus. @dotey compiled the pricing (46 yuan per million tokens, or roughly 1 yuan per second of video) and noted that the API accepts four input modalities: text, image, audio, and video. Combined with face verification, portrait authorization, and more than ten thousand preset virtual avatars, it can automate an entire AI video creation workflow. @vista8, after reading the technical report, called audio the biggest highlight (a satisfaction rate of 62% vs. less than 10% for competitors), with significant improvements on local content such as Chinese opera and rap. @op7418 said the ecosystem for AI short and long-form dramas is booming and showed how the HeyGen HyperFrames CLI can be combined with Seedance 2.0 to produce science videos with zero human intervention. @cellinlab observed that Volcano Engine has become international infrastructure for AI video, with platforms such as Runway, Higgsfield, and Freepik all connected.
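The two price points quoted above can be cross-checked with a little arithmetic. The sketch below is only an illustration: it assumes billing is purely per token and that "about 1 yuan per second" is an average, neither of which the briefing confirms. All function names are made up for this example.

```python
# Figures quoted in the briefing. How tokens map to seconds of video is an
# ASSUMPTION derived from these two numbers, not a documented billing rule.
PRICE_YUAN_PER_MILLION_TOKENS = 46.0
APPROX_YUAN_PER_VIDEO_SECOND = 1.0


def cost_yuan(tokens: int) -> float:
    """Cost of a given number of tokens at the quoted per-million price."""
    return tokens / 1_000_000 * PRICE_YUAN_PER_MILLION_TOKENS


def implied_tokens_per_second() -> float:
    """Tokens per second of video implied by the two quoted prices."""
    return APPROX_YUAN_PER_VIDEO_SECOND / PRICE_YUAN_PER_MILLION_TOKENS * 1_000_000


def estimate_clip_cost_yuan(seconds: float) -> float:
    """Rough cost of a clip using the per-second approximation."""
    return seconds * APPROX_YUAN_PER_VIDEO_SECOND


if __name__ == "__main__":
    print(f"1M tokens: {cost_yuan(1_000_000):.2f} yuan")
    print(f"Implied tokens per video second: {implied_tokens_per_second():.0f}")
    print(f"15-second clip: about {estimate_clip_cost_yuan(15):.2f} yuan")
```

Under these assumptions, one second of video corresponds to roughly 21,700 tokens. Actual charges may depend on resolution, modality, and other factors not covered in the briefing.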

OpenAI Codex Major Upgrade

@OpenAI released a major upgrade to Codex on April 16. @sama described its computer use as “more useful than I expected.” @dotey compiled a full feature map: Codex can now control multiple Mac applications in parallel without interfering with the user’s current work; its built-in browser supports direct annotation on pages and automatically captures DOM elements as context; it has added more than 90 plugins (covering JIRA, GitLab, CircleCI, the Microsoft suite, and others); it has memory and self-scheduling capabilities that let it carry long-running tasks forward over time; and image generation (gpt-image-1.5) is also integrated. @vista8 added that the pace of product iteration is “like stepping on your left foot with your right to fly.” @op7418 believes OpenAI’s desktop product is far more capable than Claude’s desktop version. @LufzzLiz, after testing, said browser operation is fluid enough to be “better to use than Claude.”

Claude Opus 4.7 User Experience & Reviews

@Astronaut_1216 shared his impressions: Opus 4.7 is significantly better at coding, and content creation requires almost no intervention, but token consumption has multiplied compared with the previous generation and output has become “long-winded and verbose,” with some failure cases under default settings. @LufzzLiz cited a third-party benchmark (the “pelican riding a bicycle” SVG test) in which a locally running Qwen3.6-35B-A3B outperforms Opus 4.7 on that specific task. @AlchainHust mentioned a self-developed “Freud.skill” that he says fixes the “dumbing-down” issue under Opus 4.7’s default settings, with plans to open-source it once reads exceed 100k. @oran_ge observed that Opus 4.7 distilled Mythos’s cyberattack capabilities, but that suppression by the official safety mechanisms has dulled the “human touch” of its output. @xiaohu reported that in testing, enabling Adaptive Thinking actually made the model dumber while significantly improving inference speed, and confirmed that the model consumes 1.35 times the inference tokens of version 4.6; the company has compensated by raising rate limits for all subscribers.

Claude Design Launch & GPT Image 2 Testing

@AnthropicAI launched the Claude Design tool, which generates web/app prototypes and presentations through conversation and can export to PDF/PPTX or hand off to Claude Code for development. @MANISH1027512 ran multiple rounds of tests on GPT Image 2, noting its strong text understanding and character setting generation, though finger details and action transfer still fail frequently. @94vanAI mentioned that portrait generation sacrifices some detail and lighting, but game and anime-style content overall remains GPT’s strength. @qq_liu45504 demonstrated that feeding an entire article into GPT Image 2 reduced peripheral material costs to “near zero,” claiming humanity has a new meme god.

Tesla Optimus 3rd Generation Hand Patent Published

@xiaohu discovered that Tesla published four complete international patents for the third-generation Optimus robot hand at WIPO (World Intellectual Property Organization) on April 17, covering the forearm, wrist, joints, and hand architecture. Key details: 25 linear actuators (23 for hand control, 2 for the wrist) are arranged in concentric rings within the forearm; the fingers have 4 degrees of freedom (adduction/abduction plus flexion/extension), with 2 more degrees of freedom at the wrist; the actuators were moved from the palm to the forearm to reduce hand inertia; the wrist uses only 2 motors to achieve two-directional motion that previously required 4; and the hand is tendon-driven, with 3 control cables per finger. Multiple bloggers cited the post.

Nous Research Tool Gateway & Hermes Agent Update

@NousResearch announced that the Tool Gateway is now live on Nous Portal. Subscribers can access more than 300 models, plus tools such as web scraping, browser automation, image generation, cloud terminals, and text-to-speech, under a single account, without separate API keys for each service. It integrates partners such as @firecrawl, @browser_use, @modal, and @fal. @LufzzLiz shared the Hermes Agent v0.10.0 update, which adds Discord role-based access control, DingTalk QR code verification, preservation of native Markdown rendering in WeChat, and context compression optimization (deduplication before compression, with debounce and tail protection).
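The context compression idea mentioned above (deduplicate first, then compress, with debounce and tail protection) could look roughly like the following. This is a hypothetical sketch, not Hermes Agent's actual implementation. `dedupe`, `compress_context`, `Debouncer`, and the `summarize` callback are illustrative names, and "tail protection" is interpreted here as leaving the most recent messages untouched.

```python
from typing import Callable, List, Optional


def dedupe(messages: List[str]) -> List[str]:
    """Drop exact duplicates, keeping the first occurrence and original order."""
    seen = set()
    unique = []
    for message in messages:
        if message not in seen:
            seen.add(message)
            unique.append(message)
    return unique


def compress_context(
    messages: List[str],
    summarize: Callable[[List[str]], str],
    keep_tail: int = 2,
) -> List[str]:
    """Deduplicate and summarize older messages; keep the last `keep_tail` verbatim."""
    if len(messages) <= keep_tail:
        return list(messages)
    split = len(messages) - keep_tail
    head, tail = messages[:split], messages[split:]
    # Deduplicate BEFORE summarizing so repeated content is not paid for twice.
    return [summarize(dedupe(head))] + tail


class Debouncer:
    """Polling-style debounce: run only after requests have been quiet for `quiet_s`."""

    def __init__(self, quiet_s: float) -> None:
        self.quiet_s = quiet_s
        self._last_request: Optional[float] = None

    def request(self, now: float) -> None:
        """Record that compression was requested at time `now` (seconds)."""
        self._last_request = now

    def should_run(self, now: float) -> bool:
        """True once, when a pending request has been quiet long enough."""
        if self._last_request is None:
            return False
        if now - self._last_request < self.quiet_s:
            return False
        self._last_request = None
        return True
```

In a real agent, `summarize` would presumably call a model; here it is left as a caller-supplied function so the sketch stays self-contained.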

Brain Prediction Theory & AI Agent Paradigm Shift

@vista8 translated an article from *Nature* magazine proposing that the brain does not “see” the world but “guesses” it, using sensory input to verify those guesses. This closely resembles the core logic of training large language models: both aim to predict more accurately and reduce surprise. @dotey quoted @hxiao’s observation that in 2026, long-running Agent tasks have split into two clear phases. In the first, the Agent uses web search and reading to condense information into local files; in the second, it mounts those local files for high-frequency inner-loop iteration and no longer relies on real-time internet access. The reasons are large differences in speed (millisecond-level reads vs. second-level crawling), determinism (local files are immutable vs. web pages change easily), consistency (comparing against the same knowledge base vs. a different version each time), and cost (clean text vs. HTML noise).
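The two-phase pattern described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not anyone's actual agent. The `pages` dictionary stands in for web pages that were already fetched (no network calls are made), and `condense`, `phase_one`, and `phase_two` are hypothetical names.

```python
import re
import tempfile
from pathlib import Path
from typing import Dict, List


def condense(html: str) -> str:
    """Crude HTML-to-text: strip tags and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", html)
    return " ".join(text.split())


def phase_one(pages: Dict[str, str], workdir: Path) -> List[Path]:
    """Phase 1: condense each fetched page into a clean local text file."""
    paths = []
    for name, html in pages.items():
        path = workdir / f"{name}.txt"
        path.write_text(condense(html), encoding="utf-8")
        paths.append(path)
    return paths


def phase_two(paths: List[Path], query: str) -> List[str]:
    """Phase 2: iterate over local files only; no network access is needed."""
    hits = []
    for path in paths:
        if query.lower() in path.read_text(encoding="utf-8").lower():
            hits.append(path.name)
    return hits


if __name__ == "__main__":
    # Stand-in content; a real agent would obtain these pages via search/crawl.
    pages = {"a": "<p>Seedance  API</p>", "b": "<div>Codex</div>"}
    with tempfile.TemporaryDirectory() as tmp:
        files = phase_one(pages, Path(tmp))
        print(phase_two(files, "codex"))
```

Because phase two reads the same files on every loop, repeated queries see identical content. That is the determinism and consistency argument from the paragraph above.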

Elon Musk Updates (Primarily Retweets)

Yesterday, @elonmusk posted 35 tweets, 21 of them retweets. The retweets mainly covered: South African racial issues (from accounts such as @TheRabbitHole, @GuntherEagleman, and @Real_RobN); the SpaceX Falcon Heavy 2028 launch plan for the Rosalind Franklin Mars rover, with the comment “Mars is currently a purely robotic planet”; a video of Neuralink helping a paraplegic patient regain motor function; a case of Tesla FSD v14.3.1 driving 45 minutes without intervention during LA rush hour; and news that xAI is renting GPUs to Cursor to train Composer 2.5 (according to Business Insider, xAI’s GPU utilization is only about 11%, far below the industry norm of 35-45%). In his original posts, Musk proposed that “issuing universal high-income checks through the federal government is the best way to combat AI unemployment,” and attached a video stating that he once oversaw the development of the custom LiDAR used for Dragon docking with the space station.

Personal Creations & Product Launches

@vista8 shared a blog product he built while sick, which supports real-time front-end editing and article publishing via Chrome extension, Obsidian, or Skill, with plans to grow it into an AI learning hub. @lijigang had Claude Code write a macOS app (a pinned window showing three daily to-dos, open-sourced on GitHub) and proposed the concepts of A2A (Agent-to-Agent) and H2A (Human-to-Agent) markets, as well as a new space called HAAH (Human and Agent Coexistence). @AlchainHust reflected on his personality: he loves sharing publicly (Marketing) but hates being asked to prove himself (Sales), and sees monetary returns as just a by-product of building in public. @Astronaut_1216 revealed plans to co-host an AI hands-on workshop with the Nanjing Jiangning OPC community and an AI unicorn, guiding participants through a full process of using Agents for social media content acquisition.

Scraping Statistics (2026-04-17)

  • Scanned timeline posts: 360
  • Hit blogger count: 25
  • Total hit tweets: 197
  • Weighted tweet score: 152.3
  • Original tweets: 82
  • RT tweets: 43
  • Scraping attempts: 2
  • Boundary coverage status: Complete (tail_confidently_crossed_target_boundary)