Answer-First Lead
Today’s technology and society stories: Meta’s AI chatbot was tricked into handing over Instagram accounts including the Obama White House handle [1], NZ experts are “gobsmacked” by legislation allowing AI benefit decisions [2], Anthropic warns AI systems could escape human control [3], and OpenAI admits prompt injection can’t be fully solved, offers Lockdown Mode instead [4]. Trust, automation, and accountability — the Monday quartet.
🔍 THE BOTTOM LINE
When AI systems make decisions about accounts, benefits, and safety, the question isn’t just whether they work — it’s whether we can trust them when they don’t.
Meta AI Chatbot Handed Instagram Accounts to Hackers
What happened: Hackers exploited Meta’s AI support chatbot to hijack Instagram accounts by convincing the bot to send password reset codes to email addresses they controlled [1]. Victims include security researcher Jane Wong, the Obama-era White House Instagram account (inactive since 2017), and U.S. Space Force chief master sergeant John Bentivegna.
The exploit: Attacker uses VPN to spoof victim’s location, chats with Meta AI Support Assistant, requests adding new email to target account. Bot sends verification code to attacker’s email, attacker provides code, bot displays “Reset Password” button. Complete account takeover without ever accessing victim’s legitimate email.
Meta’s response: Instagram spokesperson Andy Stone replied on X that the issue was fixed. No details on what changed, how many accounts were affected, or whether similar vulnerabilities exist in other Meta AI integrations.
Why it matters: This is automation risk in its purest form — the exact capability designed to help users (account recovery via chatbot) became the attack vector. Meta trained an AI to make trust decisions about account ownership, and that AI couldn’t distinguish between legitimate users and attackers spoofing location data.
The deeper problem: As AI agents gain more authority to act on behalf of users — resetting passwords, making purchases, accessing files — the attack surface expands exponentially. Every action an agent can take is a potential exploit path.
Related: High-Profile Meta AI Chatbot Breach Spotlights Security Risks of Automation
NZ Expert “Gobsmacked” by AI Benefit Decision Law
What happened: A new law allowing the Ministry of Social Development to use artificial intelligence to make benefit decisions passed its third reading in Parliament under urgency last Friday [2]. An AI and privacy expert told the NZ Herald she was “gobsmacked” by the rushed passage.
Government position: The government says it’s not AI, Labour did it first — suggesting this is automated decision-making that predates current AI terminology. Maxim Institute is calling for a register of AI use across government.
Why it matters: New Zealand is moving faster than most democracies on embedding AI in welfare administration, but doing so without public consultation or transparency frameworks. The “not AI, actually” framing is semantic gymnastics — if a system makes decisions about benefit eligibility using algorithmic logic, calling it something else doesn’t change the accountability questions.
The NZ context: This follows earlier controversy over Algorithmic Charter principles and government use of predictive analytics. The rush under urgency suggests the government knew this would be controversial and wanted it done before debate could coalesce.
Related: The Real-World Cost of AI — The Spinoff on AI’s environmental and economic costs
Anthropic: AI Systems Could Escape Human Control
What happened: Anthropic published a blog post urging leading AI companies to agree on a coordinated mechanism to temporarily slow or pause development of advanced AI systems [3]. The company warns that as systems become capable of recursive self-improvement, humans could lose control.
Key argument: AI systems are rapidly improving at performing software tasks independently, including coding. With sufficient compute, future systems could design and build improved versions of themselves. Anthropic says it would be “good for the world to have the option to slow or temporarily pause” further development.
Industry response: OpenAI separately stated that democratic governments — not private companies — should determine AI rules and safeguards. No single company or lab should decide the pace of innovation.
Why it matters: This is safety-focused positioning from Anthropic at a time when competitors are racing toward IPOs. But it also raises genuine questions: if recursive self-improvement becomes real, can any coordination mechanism work when the payoff for going first is this large?
The history: Similar pause calls in 2023 (Future of Life Institute letter with Musk support) produced no industry-wide action. Expect the same this time — but the debate itself shapes public perception of who’s responsible.
Related: Sakana AI Opens Lab For Recursive Self-Improvement — Sakana is betting RSI can break the compute arms race
OpenAI’s Lockdown Mode: Admitting What Can’t Be Fixed
What happened: OpenAI rolled out Lockdown Mode to ChatGPT, disabling live browsing, agent mode, deep research, image retrieval, Canvas networking, and file downloads to block data exfiltration via prompt injection [4]. Available on all plans.
What it does: Lockdown Mode doesn’t stop prompt injections from happening — malicious payloads can still influence model behaviour. It shuts down outbound pathways attackers would use to exfiltrate data. No live browsing = no network requests to external servers.
The trade-off: Significant functionality loss. Live browsing drops to cached content only. Agent mode is gone entirely. Deep research is disabled. OpenAI acknowledges it’s “not intended for everyone.”
Why it matters: This is rare honesty from a major AI lab — OpenAI is admitting it cannot solve prompt injection, only mitigate exposure. The underlying weakness is fundamental: LLMs cannot reliably separate data from instructions. Lockdown Mode is a pragmatic concession, not a solution.
The pattern: Security researchers have demonstrated hijacks against agents from Anthropic, Google, and Microsoft via GitHub Actions integrations. All three paid bug bounties but published no public advisories. Lockdown Mode is the first public acknowledgment that the problem persists at scale.
Related: AI Agents Hijacked via Prompt Injection — Bug Bounties Paid, No CVEs
🔍 THE BOTTOM LINE
Four stories, one theme: trust. Meta’s chatbot trusted the wrong person. NZ’s government wants citizens to trust automated benefit decisions without transparency. Anthropic says we should trust them to coordinate a pause while competitors race ahead. OpenAI says don’t trust ChatGPT with sensitive data unless you disable most of its features.
The honest takeaway: AI systems are being deployed faster than we’re learning to verify them. Every automation is a trust decision — and right now, we’re making those decisions on faith.
❓ Frequently Asked Questions
Q: Should I worry about my Instagram account? The vulnerability appears fixed, but the pattern matters more than this specific exploit. Any AI system with authority to change account settings is a potential attack vector. Enable two-factor authentication, use unique passwords, and be skeptical of any AI support tool that can reset credentials without multi-step verification.
Q: Is NZ’s benefit AI law different from other countries? Most democracies are debating similar questions, but NZ moving forward under urgency without public consultation is notable. The UK’s welfare digitalisation has faced legal challenges over transparency. The EU AI Act would classify welfare AI as high-risk, requiring impact assessments. NZ’s approach is faster, quieter, and less accountable.
Q: Can prompt injection ever be fully solved? Not with current LLM architectures. The fundamental issue is that data and instructions are processed the same way — a malicious payload embedded in content looks identical to legitimate instructions. Lockdown Mode reduces exposure; it doesn’t fix the root cause. True solutions would require architectural changes to how models parse input.
📰 SOURCES
- Hackers hijacked Instagram accounts by tricking Meta AI support chatbot — TechCrunch
- Automated benefit decisions: Expert gobsmacked; Govt says it’s not AI, Labour did it first — NZ Herald
- Anthropic urges pause on advanced AI, warns of loss of human control — Newswire.lk
- OpenAI adds Lockdown Mode to ChatGPT to block data theft from prompt injection attacks — The Next Web