X Platform May 20 AI Brief | Google I/O Updates, Meta AI Training Controversy, GitHub Extension Breach

Google I/O 2026 Unveils Multiple Products; Gemini 3.5 Flash Is Fast but Controversial

Google I/O 2026 introduced Gemini 3.5 Flash, Antigravity 2.0, Gemini Omni, Gemini Spark, and Ask YouTube. Gemini 3.5 Flash surpasses the previous flagship, Gemini 3.1 Pro, in coding and agentic capabilities and generates output extremely quickly: some bloggers measured roughly 800 to 1,480 tok/s on TPU 8i. However, multiple bloggers noted that its knowledge cutoff is January 2025 (17 months prior) and that it costs three times as much as Gemini 3 Flash. Several bloggers pointed out that Antigravity 2.0’s interface and features closely resemble Codex, and that the official video even shows a screenshot of a Codex folder. Ask YouTube drew positive reviews for letting users jump directly to the relevant moment in a video. Gemini Spark is described as a cloud agent gateway, comparable to a cloud-hosted Claude Code. Google also released its eighth-generation TPUs: a training chip, TPU 8t, and an inference chip, TPU 8i, each designed for its own workload.

Sources:

@MaxForAI: https://x.com/MaxForAI/status/2057056672913486023
@Gorden_Sun: https://x.com/Gorden_Sun/status/2057002467234345453
@op7418: https://x.com/op7418/status/2056904254175281353
@xiaohu: https://x.com/xiaohu/status/2056886402785411247

Meta All-Hands Recording Leaked: Zuckerberg Says Company Uses Employees as AI Training Subjects While Cutting 10% of Staff

A recording of Meta’s all-hands meeting was leaked. In the meeting, Zuckerberg told employees that Meta is using its own engineers as subjects for training AI. His reasoning was that the company’s employees have higher average intelligence than the external workforce, and that having internal staff build tools and solve coding tasks will improve the model’s coding capabilities faster than competitors can. Meta’s head of HR confirmed that the company will cut 10% of its staff: approximately 8,000 employees will be notified of their layoff by email at 4 AM, while more than 7,000 people will be reassigned to new AI-centric projects.

Sources:

@MaxForAI: https://x.com/MaxForAI/status/2057098481756676557
@MorePerfectUS: https://x.com/MorePerfectUS/status/2056842597117636890

GitHub Hacked: Malicious VS Code Extension Leads to Internal Repository Leak, About 3,800 Repositories Affected

GitHub detected that an employee device had been compromised by a malicious VS Code extension. GitHub has removed the malicious extension version, isolated the affected endpoint, and started incident response. Its current assessment is that the activity involved only the exfiltration of GitHub’s internal repositories, and the attacker’s claim of about 3,800 repositories is consistent with the official investigation so far. GitHub has rotated critical secrets, starting with the highest-impact credentials, and continues to analyze logs and monitor for follow-on activity.

Sources:

@MaxForAI: https://x.com/MaxForAI/status/2056979772232835533
@github: https://x.com/github/status/2056884788179726685

DeepSeek Forms Code Team, Recruiting Product Managers and R&D Engineers in Beijing

A DeepSeek researcher tweeted that DeepSeek is forming a new Code Harness team to build Code Harness (which could be called DeepSeek Code) from scratch. The team will be based in Beijing, with two positions open: Harness Product Manager and Harness R&D Engineer. This signals that DeepSeek is entering the AI coding tools field.

Sources:

@MaxForAI: https://x.com/MaxForAI/status/2057099281765691594
@victor207755822: https://x.com/victor207755822/status/2057064415300841626

Alibaba Unveils Zhenwu M890 AI Chip and 128-GPU Super Node Server

At the 2026 Alibaba Cloud Summit, Alibaba released a 128-GPU super node server based on the Zhenwu M890, the next-generation AI chip from T-Head (Pingtouge). The server uses the ICN Switch 1.0 interconnect chip, with communication latency as low as hundreds of nanoseconds. This lets 128 AI chips operate as a single computer, meeting the demands of concurrent inference and large model training in the Agentic era. The Senior Vice President of Alibaba Cloud Intelligence Group stated that Alibaba Cloud has made full-stack preparations for the Agentic era.

Sources:

@MaxForAI: https://x.com/MaxForAI/status/2056938432849699243

OpenAI Announces Partnership with Singapore Government, Investing S$300 Million to Establish First Overseas Applied AI Lab

OpenAI has committed S$300 million to develop Singapore’s AI capabilities. The centerpiece is an applied AI lab in Singapore, its first such institution outside the United States. OpenAI will create over 200 technical jobs in Singapore over the next few years and will make Singapore one of its global hubs for forward deployed engineers. It also launched a training program for forward deployed engineers and is exploring an accelerator program for AI startups.

Sources:

@MaxForAI: https://x.com/MaxForAI/status/2056976182995005877
@gabrielchua: https://x.com/gabrielchua/status/2056956065795985853

Cerebras Runs 1-Trillion-Parameter Kimi K2.6 at ~1000 tok/s, Setting New Record for Frontier Model Speed

Cerebras ran Kimi K2.6 (an ultra-large model with 1,000B, or 1 trillion, parameters) during an enterprise trial run, achieving a speed of approximately 1,000 tokens/s. According to Artificial Analysis measurements, this is the fastest frontier model performance ever recorded. Cerebras’ stock surged 68% on its first day of trading, pushing its market cap to $67 billion.

Sources:

@MaxForAI: https://x.com/MaxForAI/status/2056957427933995374
@cerebras: https://x.com/cerebras/status/2056778123329274279

Nous Research Releases CNA Method: Steering LLM Behavior Without Training Sparse Autoencoders

Nous Research released Contrastive Neuron Attribution (CNA), a method for steering LLM behavior by identifying and ablating sparse MLP circuits. Given a small set of contrasting prompt pairs, CNA isolates the top 0.1% of MLP neurons with the largest activation differences. Ablating this small circuit removes the target behavior without affecting the model’s other capabilities. The method has been validated on 8 instruction-tuned models, including Llama-3.1-70B and Qwen2.5-72B. The research found that the refusal mechanism is not inherent in pre-trained models; instead, alignment fine-tuning wires it in as a behavior gate.
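The selection-and-ablation step described above can be sketched in a few lines of NumPy. This is a minimal illustration and not Nous Research's implementation: it assumes MLP activations have already been recorded for both sides of each prompt pair, it ranks neurons by the absolute mean activation gap (one plausible reading of "largest activation differences"), and it ablates by zeroing. The function names, array shapes, and synthetic data are all hypothetical.

```python
import numpy as np

def contrastive_neuron_mask(acts_target, acts_baseline, top_fraction=0.001):
    """Boolean mask over neurons marking the top fraction by activation gap.

    acts_target / acts_baseline: arrays of shape (n_pairs, n_neurons), one row
    per prompt in a contrasting pair. Ranking by |mean difference| is an
    assumption made for this sketch.
    """
    gap = np.abs((acts_target - acts_baseline).mean(axis=0))
    k = max(1, int(round(top_fraction * gap.size)))
    # argpartition places the k largest gaps in the last k positions.
    top_idx = np.argpartition(gap, -k)[-k:]
    mask = np.zeros(gap.size, dtype=bool)
    mask[top_idx] = True
    return mask

def ablate(acts, mask):
    """Return a copy of the activations with the masked neurons zeroed."""
    out = acts.copy()
    out[..., mask] = 0.0
    return out

# Synthetic stand-in for recorded activations (hypothetical data).
rng = np.random.default_rng(0)
n_pairs, n_neurons = 16, 10_000
baseline = rng.normal(size=(n_pairs, n_neurons))
target = baseline + rng.normal(scale=0.01, size=(n_pairs, n_neurons))
planted = np.array([7, 42, 123, 999, 5000, 5001, 7777, 8888, 9000, 9999])
target[:, planted] += 5.0  # neurons that respond to the target behavior

mask = contrastive_neuron_mask(target, baseline)
ablated = ablate(target, mask)
print(sorted(np.flatnonzero(mask).tolist()))
```

With 10,000 neurons, the 0.1% threshold selects 10 of them. In a real model the mask would be applied inside the forward pass (for example through a hook on each MLP layer) rather than to a stored array.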

Sources:

@NousResearch: https://x.com/NousResearch/status/2056778746716107193

Railway Banned by Google Cloud, Causing 6-Hour Service Outage

Railway, a platform serving over 2 million developers with over 10 million monthly deployments, was banned by its upstream provider, Google Cloud, without cause or explanation, causing a service outage of approximately 6 hours. Railway had completed a $100 million Series B funding round earlier this year. Nous Research also confirmed that its Nous Portal users were affected. The incident has sparked discussion about the risk of unilateral bans by cloud service providers.

Sources:

@MaxForAI: https://x.com/MaxForAI/status/2056958635612504469
@Railway: https://x.com/Railway/status/2056883076496789854
@NousResearch: https://x.com/NousResearch/status/2056878995975630980

OpenAI Offers $2 Million in Tokens to Current YC Batch Startups in Exchange for Equity

Sam Altman announced at a Y Combinator event that OpenAI is offering $2 million worth of OpenAI tokens to each YC startup in the current batch in exchange for equity. This move is seen as a major promotional push by OpenAI within the developer ecosystem.

Sources:

@MaxForAI: https://x.com/MaxForAI/status/2056982131986051153
@bosmeny: https://x.com/bosmeny/status/2056914385814401238

HuggingFace Launches Hardware Tracking Feature, Showcasing Real Hardware Usage in the Open-Source AI Ecosystem

HuggingFace launched a hardware tracking feature to showcase the hardware actually powering open-source AI, including popular GPUs and CPUs, VRAM distribution, and inference hardware trends. This is seen as a landmark event for hardware transparency in the open-source AI ecosystem.

Sources:

@MaxForAI: https://x.com/MaxForAI/status/2057102544724529394
@julien_c: https://x.com/julien_c/status/2057084823097794772

Midjourney Founder Says Using Google TPU for Training May Have Set Research Back a Year

Midjourney founder David Holz stated that using Google TPUs to train models may have set their research progress back by about a year. If he could go back in time, he would try to use only Nvidia cards from the start. However, he denied being deceived by Google, saying he might have deceived himself, and noted that many of the most successful models were made on these chips.

Sources:

@MaxForAI: https://x.com/MaxForAI/status/2057064580178968663
@DavidSHolz: https://x.com/DavidSHolz/status/2056898979745714243

Gemini 3.5 Flash Shows Mixed Results in Voice and Task Agent Benchmarks

Voice and task agent benchmarks released by developer @kwindla show Gemini 3.5 Flash as the overall top scorer in the task agent benchmark, but all Gemini 3 models are too slow to work well in voice agent scenarios. Claude Haiku 4.5 remains the best-performing model for voice agents, with a time to first token (TTFT) under 700 milliseconds. Gemini 3.5 Flash performs well with high reasoning budgets, but its actual cost is higher than that of GPT-5.4 and Claude Sonnet 4.6. The benchmarks also show that low-reasoning settings don’t always save money, because a model that makes more mistakes needs more tokens to complete a task.
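The last point is easy to see with a little arithmetic. The sketch below is illustrative only: every number in it is hypothetical rather than taken from the benchmark, and it assumes a simple retry model in which each attempt succeeds independently, so the expected number of attempts is 1 / success rate.

```python
from dataclasses import dataclass

@dataclass
class Setting:
    name: str
    tokens_per_attempt: int        # hypothetical average tokens used per attempt
    success_rate: float            # hypothetical chance an attempt completes the task
    usd_per_million_tokens: float  # hypothetical blended token price

def expected_cost_per_success(s: Setting) -> float:
    """Expected USD spent to get one completed task under independent retries."""
    if not 0.0 < s.success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    expected_attempts = 1.0 / s.success_rate
    return expected_attempts * s.tokens_per_attempt * s.usd_per_million_tokens / 1_000_000

low = Setting("low-reasoning", tokens_per_attempt=4000, success_rate=0.40,
              usd_per_million_tokens=3.0)
high = Setting("high-reasoning", tokens_per_attempt=6000, success_rate=0.90,
               usd_per_million_tokens=3.0)

for s in (low, high):
    print(f"{s.name}: ${expected_cost_per_success(s):.4f} per completed task")
```

Under these made-up numbers, the low-reasoning setting uses fewer tokens per attempt but needs 2.5 attempts on average, so it costs about $0.03 per completed task against about $0.02 for the high-reasoning setting.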

Sources:

@MaxForAI: https://x.com/MaxForAI/status/2056975071739339012
@kwindla: https://x.com/kwindla/status/2056959360837030344

Statistics: Timeline threads scanned=480, Bloggers matched=39, Total tweets matched=283, Weighted tweet score=226, Original tweets=123, RT tweets=51, Fetch attempts=3, Boundary coverage status=tail_confidently_crossed_target_boundary