Case study: PSBloggen migration
Context
A small crew of PlayStation enthusiasts had been running a game blog for over ten years. The site had accumulated a solid library of reviews, editorials, and commentary — content that was genuinely worth keeping. The infrastructure around it, however, was the usual WordPress story: PHP runtime, MySQL, plugin updates, security patches, and an editor that made structured writing unnecessarily painful.
Beyond just migrating the blog, there was an unmet need: a clean way to answer “what PlayStation games are coming out this month?” Most games media sites are noisy, ad-heavy, and not particularly opinionated about what’s worth paying attention to. The goal was to build a focused discovery portal — tied to the same domain, sharing navigation and auth — that pulled live game data from IGDB and integrated naturally with the existing editorial output.
Two problems, one project.
Goals
- Migrate the existing WordPress content (posts, categories, tags, media) to a static, Markdown-based editorial site
- Keep writing simple: Markdown files on disk, a lightweight admin UI to generate them, no external CMS database
- Build a game discovery portal powered by IGDB for upcoming and recent PS4/PS5 releases
- Add user features: accounts, favourites, play-status tracking, shareable URLs
- Keep everything self-hosted on a single server, single-command deploy
The migration
Exporting from WordPress
WordPress’s built-in exporter produces a WXR (WordPress eXtended RSS) file: a standard RSS feed with WordPress-specific extensions. It contains post content, categories, tags, publish dates, author, and media attachments. The content itself is HTML, since WordPress has always been HTML-first.
The conversion pipeline
A Node.js script processed the export file in several passes:
1. Parse the XML with xml2js to get structured post data.
2. Pre-process WordPress-specific HTML before running Turndown. The main offenders:
WordPress caption shortcodes:
[caption id="..." align="aligncenter" width="640"]
<img src="..." />
Caption text
[/caption]
These needed to be converted to <figure>/<figcaption> first — otherwise Turndown would produce the caption text as an orphaned paragraph.
Embedded media (YouTube, etc.) posted as bare oEmbed URLs also needed handling: strip the embed and leave a clean link with context.
3. Convert HTML to Markdown with Turndown, with custom rules for figure/figcaption elements and WordPress line-break handling.
4. Rewrite internal links from the old /year/month/slug URL structure to the new flat /reviews/slug and /editorial/slug structure.
5. Download media from the WordPress media library and rewrite image paths to local references.
6. Write frontmatter mapping WordPress post metadata to the Astro content collection schema:
| WordPress field | Astro frontmatter |
|---|---|
| post_title | title |
| post_date | pubDate |
| Post excerpt | description |
| category | categories[] |
| tag | tags[] |
| dc:creator | author |
| Post type | postType (review / editorial) |
Reviews got additional frontmatter fields: score, sub-scores, verdict, pros, cons, platform badges. These were either in the original post as structured HTML or added manually during migration review.
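Steps 2 and 4 of the pipeline are mechanical enough to sketch without any dependencies. The following is an illustrative sketch, not the project’s actual script: the function names, the regular expressions, and the slug-to-section map (`sectionBySlug`) are all assumptions made for the example.

```javascript
// Illustrative sketch of pipeline steps 2 and 4. All names here are hypothetical.

// Step 2: turn [caption]...[/caption] shortcodes into <figure>/<figcaption>
// so the HTML-to-Markdown converter sees image and caption as one unit.
// Only handles a bare <img> followed by caption text, as in the example above.
function convertCaptionShortcodes(html) {
  const captionPattern =
    /\[caption[^\]]*\]\s*(<img[^>]*>)\s*([\s\S]*?)\s*\[\/caption\]/g;
  return html.replace(
    captionPattern,
    (_match, img, text) =>
      `<figure>${img}<figcaption>${text}</figcaption></figure>`
  );
}

// Step 4: rewrite old /year/month/slug paths to /reviews/slug or /editorial/slug.
// sectionBySlug maps each migrated slug to "reviews" or "editorial".
// Paths whose slug is not in the map are left untouched, which guards against
// rewriting links that merely look like the old structure.
function rewriteInternalLinks(markdown, sectionBySlug) {
  const oldLinkPattern = /\/\d{4}\/\d{2}\/([a-z0-9-]+)\/?/g;
  return markdown.replace(oldLinkPattern, (match, slug) => {
    const section = sectionBySlug.get(slug);
    return section ? `/${section}/${slug}` : match;
  });
}
```

Running the caption handler before the generic converter keeps the converter’s custom rules limited to plain `figure`/`figcaption` elements. The link rewriter needs the full slug map, so it has to run after every post has been parsed and classified.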
Choosing Astro for editorial content
Static generation is the obvious fit for a blog where content changes infrequently and the authors are comfortable with Markdown and Git. Adding a review or editorial post means writing a .md file, committing, and pushing; the CI pipeline then builds and deploys automatically.
Astro’s content collections provide a typed schema with Zod validation. If a post has a missing pubDate or a malformed score value, the build fails with a clear error rather than silently producing a broken page.
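To make the failure mode concrete, here is a dependency-free sketch of the kind of checks such a schema performs. It is a hand-rolled stand-in, not Zod and not the project’s schema: the field names come from the frontmatter table, while the 0 to 10 score range and the function names are assumptions.

```javascript
// Hand-rolled stand-in for schema validation (not Zod). Names are hypothetical.
function validateFrontmatter(data) {
  const errors = [];
  if (typeof data.title !== "string" || data.title.trim() === "") {
    errors.push("title: required non-empty string");
  }
  // new Date(null) is a valid date (the epoch), so reject null/undefined first.
  if (data.pubDate == null || Number.isNaN(new Date(data.pubDate).getTime())) {
    errors.push("pubDate: missing or not a valid date");
  }
  if (!["review", "editorial"].includes(data.postType)) {
    errors.push("postType: must be 'review' or 'editorial'");
  }
  // Assumed rule: only reviews carry a score, on a 0 to 10 scale.
  if (data.postType === "review") {
    if (typeof data.score !== "number" || data.score < 0 || data.score > 10) {
      errors.push("score: must be a number between 0 and 10");
    }
  }
  return errors;
}

// A build step would stop at the first invalid file and name it in the error.
function assertValidFrontmatter(file, data) {
  const errors = validateFrontmatter(data);
  if (errors.length > 0) {
    throw new Error(`${file}: ${errors.join("; ")}`);
  }
}
```

The point is the behaviour rather than the implementation: a migrated post with a score of `"9/10"` or no pubDate stops the build with the file name and the failing field, instead of rendering a page with a blank date.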
The output is static HTML: no PHP, no Node.js process, no database. nginx serves files from disk.
The portal
Why build on top of IGDB
Rather than just migrating the existing content, the new stack was an opportunity to add something the WordPress site couldn’t easily do: live game data. IGDB is the most comprehensive freely available games database, covering release dates, cover art, screenshots, videos, similar games, genres, and company information across all platforms.
Constraint: IGDB credentials are per-application and rate-limited. Calling IGDB directly from the browser would expose credentials and exhaust the quota under any meaningful traffic. An API proxy was mandatory from the start.
The API layer
An Express backend sits between both frontends and the outside world:
IGDB proxy — receives queries from the React portal, injects credentials, forwards to IGDB, returns results. Credentials never reach the client.
In-memory cache — wraps IGDB responses in a TTL cache. Listing pages (upcoming/recent games) cache for 30 minutes. Detail pages (single game, company, genre) cache for 2 hours. This keeps the portal responsive under repeated page loads while staying well within Twitch API quota limits.
Auth — JWT issued on login, stored in an httpOnly cookie. /api/auth/me lets both frontends check login state without exposing anything in localStorage. Cookie is domain-scoped so both the React portal and the Astro editorial site pick it up with a single fetch.
Content API — the React homepage needs recent reviews and editorial posts for its landing page feed. Rather than coupling the two frontends at build time, the Express backend reads Markdown frontmatter directly from the filesystem and exposes it as JSON via /api/content/recent. Decoupled, no shared build step required.
Favourites — SQLite via better-sqlite3 stores (user_id, game_id, status) rows. Status values: want / playing / completed / dropped.
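The in-memory cache is small enough to sketch in full. This is a minimal illustration under assumptions, not the project’s code: the class and helper names are hypothetical, and the injectable clock exists only to make expiry easy to test. The two TTL values are the ones given above.

```javascript
// Minimal TTL cache sketch. Names are hypothetical; TTLs are from the text above.
class TtlCache {
  // `now` is injectable so expiry can be tested without waiting.
  constructor(now = () => Date.now()) {
    this.now = now;
    this.entries = new Map();
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key, value, ttlMs) {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }
}

const LISTING_TTL_MS = 30 * 60 * 1000; // upcoming/recent listings
const DETAIL_TTL_MS = 2 * 60 * 60 * 1000; // single game, company, genre

// Return the cached value if present, otherwise call upstream once and store it.
async function cached(cache, key, ttlMs, fetchUpstream) {
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const value = await fetchUpstream();
  cache.set(key, value, ttlMs);
  return value;
}
```

A proxy route would call something like `cached(cache, "upcoming:ps5", LISTING_TTL_MS, () => queryIgdb(...))`, where `queryIgdb` is a placeholder for the credentialed upstream request. The cache key should include everything that changes the upstream query, such as platform and page.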
Dual-stack architecture
The project has two distinct frontends sharing one backend.
React SPA handles everything game-data-driven: listing pages (upcoming/recent by platform), detail pages, search, genre browser, developer pages, favourites. These views are stateful and updated by live IGDB data — a natural fit for a SPA with client-side caching (React Query) managing server state.
Astro editorial site handles content the crew writes: reviews with scores, sub-scores, verdict, pros/cons, and platform badges; editorial posts for freeform commentary. Astro pre-renders these to static HTML at build time. The only client-side JS on the editorial site is the auth state check (one fetch to /api/auth/me) so the nav can show a login/logout button.
nginx routes between the two in production:
location /reviews/ { root /srv/site; try_files $uri $uri/ /reviews/index.html; }
location /editorial/ { root /srv/site; try_files $uri $uri/ /editorial/index.html; }
location /api/ { proxy_pass http://api:3001; }
location / { root /srv/portal; try_files $uri /index.html; }
One domain, one nav, one auth state — but editorial content is purely static and the game portal is a proper SPA.
SQLite for user data
User accounts, favourites, and play statuses have minimal write volume. SQLite via better-sqlite3 (synchronous API, embedded in the Node.js process) eliminates a container from the stack and removes any connection-string configuration. WAL mode handles concurrent reads cleanly. A named Docker volume persists the database file across deploys.
Admin CMS and Markdown generation
The editorial site treats .md files as the source of truth, but that doesn’t mean the team writes raw frontmatter by hand. The Express backend includes an admin UI at /admin/reviews/new that generates the Markdown files.
Writing a review:
- Fill in title, description, scores, sub-scores, verdict, and pros/cons in a form
- Attach IGDB games: a search-as-you-type game picker queries IGDB; selecting a game auto-populates developer, publisher, platforms, and genres from the API response
- Insert screenshots: an image picker modal shows all IGDB screenshots available for the attached game(s); clicking one inserts it at the cursor position in the Markdown body as a standard image tag
- Save: the Express backend writes the completed .md file directly to site/src/content/reviews/
The output is a plain Markdown file with structured frontmatter. Nothing proprietary, nothing locked in — the files are readable and editable in any text editor.
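As a rough sketch of the save step, the functions below turn form fields into a Markdown file body and a target path. The function names and the slug-based file name are assumptions; the directory is the one named above. Values are serialised with `JSON.stringify`, which relies on YAML accepting JSON-style quoted strings, numbers, and flow lists.

```javascript
// Sketch of Markdown generation from admin form fields. Names are hypothetical.

// Build a URL- and filename-safe slug: strip diacritics, lowercase,
// collapse anything that is not a-z or 0-9 into single hyphens.
function slugify(title) {
  return title
    .normalize("NFKD")
    .replace(/[\u0300-\u036f]/g, "")
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Render frontmatter plus body as one Markdown document.
// Each value is written as JSON, which is also valid YAML for these types.
function renderReviewMarkdown(frontmatter, body) {
  const lines = Object.entries(frontmatter).map(
    ([key, value]) => `${key}: ${JSON.stringify(value)}`
  );
  return ["---", ...lines, "---", "", body.trim(), ""].join("\n");
}

// Assumed convention: one file per review, named after the title slug.
function reviewFilePath(title) {
  return `site/src/content/reviews/${slugify(title)}.md`;
}
```

Writing the result to disk is then a single file write to the path from `reviewFilePath`. Because the output is ordinary frontmatter and Markdown, the same file passes through the build-time schema validation as a hand-written post would.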
Hot deploy without a rebuild:
In production the content directory (site/src/content/) is mounted as a Docker volume, and the Astro editorial site runs as astro dev rather than a pre-built static server. Astro’s dev server watches the filesystem; when the admin saves a new review, the .md file appears on disk and the Astro process picks it up immediately. No restart, no rebuild, no deploy step — the review is live within seconds.
The content volume is decoupled from the Docker image. Image rebuilds and redeploys don’t touch the content files.
Git as the persistent source of truth:
The hot-deploy path (write to disk → Astro picks it up immediately) handles the live site, but the .md files are also committed to the Git repository. After saving via the admin, the file is pushed to the repo automatically. This means the filesystem and Git always stay in sync: the volume is the working copy, Git is the authoritative record. Rolling back a bad review, recovering after a disk failure, or spinning up a new instance all start from the same place — a git clone and a docker compose up.
IGDB artwork, resolved at request time:
Review hero images are not stored in the Markdown files. The frontmatter stores only the igdb_id(s) of attached games. When the Astro SSR layer renders a review page, it calls the IGDB API server-side, fetches available artwork for those IDs, selects the best landscape image (highest resolution with width > height), and caches the result in memory for 2 hours.
This means no artwork management — if IGDB adds a better image for a game, it appears automatically after the cache expires. It also means the Markdown files stay clean and portable: moving content to a different system doesn’t require migrating a library of image references.
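The selection rule (landscape only, then highest resolution) fits in a few lines. This sketch assumes each image object carries numeric `width` and `height` fields plus an identifier, here called `image_id`; the function name is hypothetical, and “highest resolution” is interpreted as largest pixel area.

```javascript
// Pick the largest landscape image, or null if none is landscape.
// Assumes each image looks like { image_id, width, height }.
function pickBestLandscape(images) {
  const landscape = images.filter((img) => img.width > img.height);
  if (landscape.length === 0) return null;
  return landscape.reduce((best, img) =>
    img.width * img.height > best.width * best.height ? img : best
  );
}
```

Returning `null` when nothing qualifies lets the page template fall back to a placeholder or a cover image rather than stretching a portrait asset into a hero slot.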
Tradeoffs
Chose a proxy over serverless functions. A long-running Express process is simpler to reason about than a functions-as-a-service setup. Cold starts don’t matter here, Docker handles the lifecycle, and the IGDB cache is in-process — it would be lost between serverless invocations anyway.
Chose in-memory cache over Redis. The cache only needs to survive process lifetime. Redis would add an extra container for no meaningful gain given the access patterns.
Chose Markdown-on-disk over a headless CMS. The admin UI generates .md files and writes them to the filesystem — there’s no separate CMS database, no external service, no API to stay compatible with. The files are auditable, searchable, editable in any text editor, and versionable in Git. Moving to a different publishing setup in the future means moving plaintext files, not migrating a database.
Chose separate frontends over a unified SPA. The editorial content is genuinely static — pre-rendering it at build time is better for SEO, faster for readers, and simpler for authors. Forcing everything into a SPA would be adding complexity to solve a problem that doesn’t exist.
Chose SQLite over PostgreSQL. Write volume doesn’t justify PostgreSQL’s operational overhead. SQLite with WAL mode handles the concurrency requirements cleanly.
What the migration taught us
WordPress shortcodes are the hard part. Standard HTML parsers don’t know what [caption] means. Writing targeted pre-processing for the worst offenders was faster than trying to make Turndown handle them generically. The lesson: identify the three or four patterns that appear most often in the export, write specific handlers, and let the generic converter handle everything else.
Frontmatter validation at build time catches real problems. Astro’s Zod schemas caught malformed dates and missing required fields in the migrated posts that would have silently produced broken pages. Build-time validation was worth setting up.
Static and live data coexist cleanly when the boundary is real. Editorial content changes rarely; game data changes often. That’s a genuine architectural boundary, not an artificial one. Astro for the former, React Query for the latter, Express to mediate between them — each tool doing what it’s actually good at.
The cache layer is the rate-limit protection. IGDB’s quotas are per-credential per second, not per month. Without caching, every page refresh triggers fresh upstream fetches. With a 30-minute TTL on listing pages, normal multi-author use stays well within quota regardless of how often people hit refresh.
Results
- Full WordPress content migrated: posts, categories, tags, media, internal links all converted and validated
- Game discovery portal: upcoming/recent PS4/PS5 listings, detail pages, search, genre and developer browsing
- User features: accounts, favourites with play status, shareable URLs for all views
- Sub-50ms response times for cached IGDB data
- Single-command deploy: docker compose up --build
- Editorial workflow: fill in the admin form, hit save; the .md lands on disk and is live within seconds via Astro’s filesystem watcher, no rebuild required
- IGDB artwork resolved at runtime: hero images fetched server-side by igdb_id, cached 2 hours; no artwork management, better images appear automatically
What’s next
- Price lookup: link favourited games to PSN store or ITAD for current pricing
- PS Plus indicator: flag games available on PS Plus Extra/Premium
- Play notes: short freeform note per favourited game (DB column already present, just not wired up)
- Rate limiting: protect the Twitch/IGDB quota against unexpected traffic spikes
- Sort order on listings: by rating, alphabetical, or release date
- Integration tests: catch IGDB API shape changes before they surface as frontend breakage