Milla Jovovich's Vibe Coding and MemPalace: The Light and Shadow of Celebrity Open Source
On April 6, 2026, a repository appeared on GitHub. It was called MemPalace—an open-source system that gives AI agents persistent cross-session memory. Nothing unusual so far. What made it remarkable was that its architect was Hollywood actor Milla Jovovich.
Nine days later: 45,000 GitHub stars. 5,800 forks. 163 issues and 213 PRs. And just as many questions.
This post looks at MemPalace from both the technical and cultural sides.
What Is Vibe Coding
Jovovich described her development approach as “vibe coding.” Rather than writing code herself, she directed AI coding agents like Claude Code using natural language to build software. She introduced herself as the “architect” and crypto entrepreneur Ben Sigman as the “engineer.”
A GitHub account called aya-thekeeper that pushed the code was deleted shortly after launch. Jovovich explained that “Lu_code is my AI agent,” but questions about code authorship transparency lingered.
Vibe coding itself has become a trend in 2026. The issue isn’t vibe coding per se—it’s who verifies the quality of what vibe coding produces, and how.
What MemPalace Does
Every conversation with an AI vanishes when the session ends. Six months of daily AI use can generate roughly 19.5 million tokens of context, far more than any context window can hold.
Existing memory systems (Mem0, Zep, etc.) have LLMs extract facts like “the user prefers Postgres.” The original conversation is discarded. The context of why that decision was made disappears.
MemPalace takes a different approach: store every conversation verbatim and find it through semantic search.
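The core idea can be sketched in a few lines. This is a toy illustration, not MemPalace's code: the real system uses vector embeddings via ChromaDB, while the bag-of-words cosine similarity here is a dependency-free stand-in. What matters is the shape of the approach: store raw text, rank by similarity, return the original conversation untouched.

```python
import re
from collections import Counter
from math import sqrt

class VerbatimStore:
    """Keep every conversation verbatim; retrieve by similarity ranking."""

    def __init__(self):
        self.conversations = []  # raw text, never summarized or extracted

    def add(self, text):
        self.conversations.append(text)

    @staticmethod
    def _vec(text):
        # Bag-of-words term counts as a stand-in for an embedding vector.
        return Counter(re.findall(r"\w+", text.lower()))

    @classmethod
    def _cosine(cls, a, b):
        va, vb = cls._vec(a), cls._vec(b)
        dot = sum(va[w] * vb[w] for w in va)
        norm = (sqrt(sum(c * c for c in va.values()))
                * sqrt(sum(c * c for c in vb.values())))
        return dot / norm if norm else 0.0

    def search(self, query, k=5):
        # Rank all stored conversations against the query; return top-k verbatim.
        ranked = sorted(self.conversations,
                        key=lambda t: self._cosine(query, t), reverse=True)
        return ranked[:k]

store = VerbatimStore()
store.add("We chose Postgres because the team already runs it in production.")
store.add("Lunch options near the office: tacos or ramen.")
top = store.search("why did we pick Postgres", k=1)[0]
```

Because the match returns the original text, the "why" behind a decision survives alongside the fact itself, which is exactly what extraction-based systems discard.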
Inspired by the ancient Greek mnemonic technique “Method of Loci,” it structures memory using spatial metaphors:
- Wing: Top-level domain for a person or project
- Room: A specific topic within a Wing (auth-migration, graphql-switch, etc.)
- Hall: Corridors connecting rooms by memory type (facts, events, preferences)
- Tunnel: Cross-links between Rooms of the same topic across different Wings
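As a rough mental model, the hierarchy above could be expressed like this. The class and field names follow the README's metaphor, but the relations are my assumption, not MemPalace's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Room:
    topic: str                                     # e.g. "auth-migration"
    memories: list = field(default_factory=list)   # verbatim conversation chunks

@dataclass
class Wing:
    name: str                                      # person or project
    rooms: dict = field(default_factory=dict)      # topic -> Room

    def room(self, topic):
        return self.rooms.setdefault(topic, Room(topic))

@dataclass
class Tunnel:
    topic: str       # the shared topic linked across Wings
    wings: tuple     # the two Wings it connects

# A Hall groups Rooms by memory type; a plain index suffices for the sketch.
halls = {"facts": [], "events": [], "preferences": []}

backend = Wing("acme-backend")
frontend = Wing("acme-frontend")
backend.room("auth-migration").memories.append("Switched to OAuth2 on 2026-03-01.")
frontend.room("auth-migration").memories.append("Login UI updated for OAuth2 flow.")
link = Tunnel("auth-migration", (backend, frontend))
```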
The tech stack is ChromaDB (vector DB) + SQLite (knowledge graph). Fully local. Zero API cost. MIT license.
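The SQLite half of that stack plausibly boils down to a graph stored as an edge table, which the standard library handles directly. Table and column names here are illustrative, not taken from MemPalace:

```python
import sqlite3

# A knowledge graph as a (src, rel, dst) edge table in SQLite.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE edges (src TEXT, rel TEXT, dst TEXT)")
db.executemany("INSERT INTO edges VALUES (?, ?, ?)", [
    ("wing:acme", "has_room", "room:auth-migration"),
    ("room:auth-migration", "in_hall", "hall:events"),
    ("room:auth-migration", "tunnel_to", "room:auth-migration@frontend"),
])

# Traversal is a plain SQL query: everything reachable from one Room.
rows = db.execute(
    "SELECT rel, dst FROM edges WHERE src = ?", ("room:auth-migration",)
).fetchall()
```

An embedded edge table like this is what makes "fully local, zero API cost" possible: no server process, no network calls, just a file on disk.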
The Truth Behind 96.6%
MemPalace’s README puts it plainly: “The highest-scoring AI memory system ever benchmarked. And it’s free.”
LongMemEval benchmark, Raw mode: 96.6% R@5. An impressive number. But look closer and the story changes.
96.6% is ChromaDB’s default embedding search performance. MemPalace’s unique architecture—Wings, Rooms, Halls, Tunnels—played no role in this benchmark. Activating the Palace structure actually drops performance to 89.4%.
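For context, R@5 (recall at 5) only measures whether the relevant memory appears among the top five retrieved results; it says nothing about what an LLM does with those results afterward. A minimal implementation, with made-up data:

```python
def recall_at_k(results, relevant, k=5):
    """Fraction of queries whose relevant memory appears in the top-k results."""
    hits = sum(1 for res, rel in zip(results, relevant) if rel in res[:k])
    return hits / len(results)

# Three hypothetical queries; retrieval surfaces the relevant memory for two.
retrieved = [["m1", "m2", "m3"], ["m9", "m4"], ["m7"]]
gold = ["m2", "m4", "m5"]
score = recall_at_k(retrieved, gold)  # 2 of 3 queries hit
```

This gap between retrieval metrics and end-to-end answer quality is why a high R@5 can coexist with poor results once a model is in the loop.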
Initially, the team claimed a “100% perfect score.” But this was achieved by identifying failed questions, modifying the system to address those specific failures, and retesting on the same set. In academic terms, this is “teaching to the test.”
The AAAK system, marketed as “30x lossless compression,” turned out to be lossy compression. When activated, accuracy dropped to 84.2%—a 12.4 percentage point decline from raw mode.
Community criticism poured in. One independent tester reported that when actually connected to an LLM, the correct answer rate was roughly 17%.
The Team’s Response: “Right Over Impressive”
Here’s where the story gets interesting. Unlike many projects that ignore or deflect criticism, Jovovich and Sigman revised the README within 48 hours.
They changed "100%" to "96.6%". They removed the "30x lossless compression" and "+34% palace boost" claims. They added a new section to the README, "A Note from Milla & Ben":
“We’d rather be right than impressive.”
The open-source community received this response positively. The speed and sincerity of acknowledging errors actually restored some of the project’s credibility.
The Crypto Shadow
But another controversy looms. A MemPalace token appeared on pump.fun, with a 50:50 creator reward split between Jovovich and Sigman. The token exhibited a pump-and-dump pattern within 24 hours of launch.
Evidence of direct involvement is unclear. However, the emergence of an eponymous token from an open-source project co-founded by a crypto entrepreneur casts a shadow on credibility—regardless of intent.
Where the Real Value Lies
Strip away the hype and controversy, and MemPalace contains a genuinely meaningful insight.
“Verbatim storage yields higher recall than LLM extraction.”
Existing memory systems have LLMs extract important facts from conversations. Context is lost in the process. MemPalace chose the far simpler approach of storing raw text and finding it via vector search. And this simple approach scored higher on benchmarks.
One sentence from the README captures it: “Nobody tried the simple thing and measured it properly.”
Compared to competitors:
| System | Approach | LongMemEval | Cost | Local |
|---|---|---|---|---|
| MemPalace | Verbatim + semantic search | 96.6% | Free | Fully local |
| Mem0 | LLM extraction | ~85% | $19-249/mo | Cloud |
| Zep | Temporal knowledge graph | ~85% | $25+/mo | Cloud |
| Letta | LLM self-memory mgmt | - | Free (OSS) | Local possible |
MemPalace is currently the only serious option offering “fully local + free + verbatim storage” simultaneously.
What Celebrity Open Source Means
The bigger question MemPalace raises isn’t about the technology itself.
How should we evaluate open source that’s AI-coded by a non-developer?
Vibe coding lowers the barrier to entry. Someone with an idea can build software without writing code. This is democratization. At the same time, the absence of expertise in benchmark interpretation, security review, and code quality verification led to inflated claims and an immature codebase.
Code-level issues in MemPalace:
- Contradiction detection module exists but isn’t connected to the knowledge graph
- stdout bug breaks Claude Desktop MCP integration
- Segfaults on macOS ARM64
- No input validation (prompt injection risk)
- 21 modules with only 4 test files and no CI/CD at launch
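The stdout bug is worth unpacking, because it is a classic failure mode: MCP servers that communicate over stdio reserve stdout for JSON-RPC protocol messages, so any stray print or log line corrupts the stream and breaks the client. The usual fix, sketched here (this is the general pattern, not MemPalace's actual code), is to route every diagnostic to stderr:

```python
import logging
import sys

# stdout carries only JSON-RPC protocol messages; all diagnostics go to
# stderr so the client never sees non-protocol bytes on the wire.
logging.basicConfig(stream=sys.stderr, level=logging.INFO,
                    format="%(levelname)s %(message)s")
log = logging.getLogger("memory-server")

def handle_request(payload: str) -> str:
    log.info("handling request: %s", payload)  # diagnostics -> stderr
    response = '{"jsonrpc": "2.0", "result": "ok", "id": 1}'
    sys.stdout.write(response + "\n")          # protocol only -> stdout
    return response

reply = handle_request('{"jsonrpc": "2.0", "method": "ping", "id": 1}')
```

That a bug this well known shipped at launch says less about vibe coding itself than about the missing review step around it.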
45,000 stars measure interest, not quality. Celebrity effect generates attention, but attention doesn’t equal maturity.
Conclusion: Measure, Question, Acknowledge
If there’s anything to learn from MemPalace, it’s three things.
First, try the simple thing first. The discovery that verbatim storage outperforms LLM extraction is real. It demonstrates the value of measuring a simple baseline before building complex solutions.
Second, question the benchmarks. Behind the number 96.6% lies the question: “What is this actually measuring?” In MemPalace’s case, what was measured wasn’t the project’s unique architecture but the baseline performance of its dependency library.
Third, the speed of acknowledging errors determines trust. Jovovich and Sigman correcting their claims within 48 hours may matter more to the project’s future than accumulating 45K stars in 9 days.
MemPalace is still a three-week-old project. Version 4.0 promises swappable storage backends and hybrid search. The core ideas that remain after stripping away the hype—verbatim preservation, local-first, zero cost—are worth paying attention to.
Just make sure you’re looking at the code, not the star count.
References
- MemPalace GitHub Repository
- Cybernews — Milla Jovovich creates MemPalace AI memory tool
- Kotaku — Resident Evil Star’s AI-Coded Tool Accused Of Being Snake Oil
- Nicholas Rhodes — MemPalace AI Memory Review: Benchmarks
- Medium — MemPalace: What the Benchmarks Actually Mean
- danilchenko.dev — MemPalace Review
- HackerNoon — Devs Shredded MemPalace Benchmarks
- Hacker News — MemPalace Discussion
- DEV Community — 5 AI Agent Memory Systems Compared (2026)
- Decrypt — Milla Jovovich AI Tool MemPalace