Papyr

A distraction-free reader for AI research.

Role: Design, frontend, backend.
Stack: Next.js, TypeScript, Postgres, Python, arXiv API, RSS.

Why I built it

Keeping up with AI research has a discovery problem and a reading problem.

The discovery problem is that the good stuff is spread across arXiv, a handful of lab blogs (Anthropic, DeepMind, OpenAI, Google Research), and about fifty independent newsletters. The reading problem is that most of the places that surface it — Twitter/X especially — are optimised for reaction, not reading. By the time you’ve clicked through, you’ve been ambushed by three unrelated threads and a sponsored post.

I wanted something closer to a reader than a feed. One column. Long lines. No engagement loop.

What it does

Papyr pulls new papers and posts from a curated set of sources, dedupes them, and lays them out in a single scrollable reading list. Each item opens into a focused view with the abstract, metadata, and a link out to the full paper. Filters let you narrow by source or topic when you want to; the default is chronological.

The aggregation layer runs on a scheduler that polls arXiv categories and RSS endpoints, normalises the metadata into a shared schema, and writes into Postgres. The frontend reads from a simple API layer — nothing clever, just boring HTTP and an index on published_at.
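The shared schema can be sketched roughly like this. Field and function names here are my illustration of the idea, not Papyr's actual column names; the example maps one already-parsed arXiv Atom entry onto the schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Item:
    source: str             # e.g. "arxiv", "anthropic-blog"
    external_id: str        # source-native stable id (arXiv id, RSS guid)
    title: str
    authors: list[str]
    url: str
    published_at: datetime  # always UTC; this backs the published_at index

def normalise_arxiv(entry: dict) -> Item:
    """Map one arXiv Atom entry (parsed to a dict) onto the shared schema."""
    return Item(
        source="arxiv",
        external_id=entry["id"].rsplit("/", 1)[-1],  # e.g. "2401.01234v1"
        title=" ".join(entry["title"].split()),      # collapse Atom line wraps
        authors=[a["name"] for a in entry["authors"]],
        url=entry["id"],
        published_at=datetime.fromisoformat(
            entry["published"].replace("Z", "+00:00")
        ).astimezone(timezone.utc),
    )
```

One normaliser per source family keeps the Postgres writer trivial: every source ends up as the same row shape, and the frontend never knows where an item came from.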

Design decisions I keep coming back to

Boring aggregation, aggressive deduplication. Every source has its own idea of a stable identifier. Papers get cross-posted between arXiv and company blogs. I lean on DOIs where I can, fall back to title + first author hashing where I can’t, and accept some manual cleanup at the edges.

No algorithmic ranking. Chronological order, user-selected filters, no “for you.” The bet is that the people using Papyr want to read the feed, not be told what’s important.

Reading experience over feature surface. Most of the engineering time sits in typography, line length, and letter spacing. If the page doesn’t feel good to read, none of the aggregation work matters.

What’s next

The next iteration is about saved state — reading progress, a proper “read later” stack, and exports to common reader formats. I’m also looking at per-source summaries using a small local model so the feed can be skimmed faster without pushing users toward LLM-generated synthesis by default.