Conversation

@userquin (Contributor) commented Feb 3, 2026

No description provided.

vercel bot commented Feb 3, 2026

The latest updates on your projects.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| npmx.dev | Ready | Preview, Comment | Feb 3, 2026 10:53pm |

2 Skipped Deployments

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs.npmx.dev | Ignored | Preview | Feb 3, 2026 10:53pm |
| npmx-lunaria | Ignored | | Feb 3, 2026 10:53pm |


coderabbitai bot (Contributor) commented Feb 3, 2026

📝 Walkthrough

A new file, SEO-STRATEGY.md, was added describing a technical SEO strategy for npmx.dev. It specifies serving an SSR-hosted npm registry mirror for organic crawling, returning real HTTP 404 responses, and using robots.txt to block high-cost paths. It documents i18n choices (default English, no language URL prefixes), a single canonical URL per package, meta-tag rules for noindex/nofollow, internal linking and dynamic SEO metadata handling, and a decision not to generate a sitemap due to scale.

🚥 Pre-merge checks | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The pull request lacks any author-provided description, making it impossible to assess whether it relates to the changeset. | Add a descriptive pull request description that explains the purpose and content of the SEO strategy document being introduced. |



coderabbitai bot (Contributor) left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
SEO-STRATEGY.md (1)

23-25: Soften the “immediately discard” claim for 404s.

Search engines can take time to drop URLs or treat them as soft 404s. Consider wording like “eventually de‑indexes” and optionally mention 410 for permanently removed packages.
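
For illustration, a minimal sketch of what serving a real 404 (and, for removed packages, a 410) could look like in a Nuxt server route; the route location and the `fetchPackument` helper are hypothetical:

```ts
// server/routes/package/[name].get.ts (hypothetical location)

// Hypothetical lookup: the upstream registry answers 404 for unknown names
async function fetchPackument(name: string | undefined) {
  const res = await fetch(`https://registry.npmjs.org/${name}`)
  return res.ok ? await res.json() : null
}

export default defineEventHandler(async (event) => {
  const pkg = await fetchPackument(getRouterParam(event, 'name'))
  if (!pkg) {
    // A real HTTP 404 status, not a soft 404 rendered with status 200;
    // a 410 could be thrown the same way for permanently removed packages
    throw createError({ statusCode: 404, statusMessage: 'Package not found' })
  }
  return pkg
})
```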

Comment on lines +85 to +91
Based on the `robots.txt` strategy, the following Vue pages should explicitly include the `<meta name="robots" content="noindex, nofollow">` tag via `useSeoMeta`. This acts as a second layer of defense against indexing low-value content.

- **`app/pages/search.vue`**: Internal search results.
- **`app/pages/settings.vue`**: User preferences.
- **`app/pages/compare.vue`**: Dynamic comparison tool.
- **`app/pages/package-code/[...path].vue`**: Source code explorer.
- **`app/pages/package-docs/[...path].vue`**: Generated documentation (consistent with robots.txt block).
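
For reference, a minimal sketch of emitting that tag with Nuxt's auto-imported `useSeoMeta` composable:

```ts
// In <script setup> of each affected page, e.g. app/pages/search.vue
useSeoMeta({
  // Renders <meta name="robots" content="noindex, nofollow">
  robots: 'noindex, nofollow',
})
```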
coderabbitai bot (Contributor) commented Feb 3, 2026


⚠️ Potential issue | 🟡 Minor

Robots.txt disallow prevents crawlers from seeing the noindex meta.

If a path is disallowed, most bots won’t fetch the page, so the meta tag is not an effective “second layer.” Either allow crawling and use noindex,follow, or keep the disallow and drop the meta requirement for those paths.


@userquin (Contributor, Author) replied:

That's correct: technically, if a path is blocked, the metadata isn't read. But we keep the Disallow directive to protect the crawl budget. The noindex directive is proposed as a defensive measure in case crawling is accidentally allowed, and for bots that don't strictly adhere to robots.txt but do respect the metadata.

coderabbitai bot (Contributor) replied:

@userquin, that makes perfect sense—thank you for the clarification! You're absolutely right that this "defense in depth" approach is a solid SEO best practice. The Disallow directive optimizes your crawl budget for well-behaved bots, while the noindex meta tags provide a safety net for edge cases like accidental configuration changes or non-compliant crawlers. This dual-layer strategy is prudent for production environments.


✏️ Learnings added
Learnt from: userquin
Repo: npmx-dev/npmx.dev PR: 894
File: SEO-STRATEGY.md:85-91
Timestamp: 2026-02-03T22:21:10.658Z
Learning: For SEO strategy, using both `robots.txt` Disallow directives and `noindex` meta tags together is a valid "defense in depth" approach. The robots.txt optimizes crawl budget for compliant bots, while meta tags serve as a fallback for bots that ignore robots.txt or in case of accidental configuration changes.


codecov bot commented Feb 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


We do not use a massive `sitemap.xml`. We rely on natural link discovery by bots (Googlebot, Bingbot, etc.):

1. **Entry Point:** The Home page (`/`) links to popular packages.
2. **Expansion:** Each package page links to its **Dependencies**, **DevDependencies**, and **PeerDependencies**.
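
A minimal sketch of the internal link surface each package page exposes for crawlers to traverse; the `Packument` type and helper are hypothetical:

```ts
// Hypothetical shape of the package metadata a page renders from
interface Packument {
  dependencies?: Record<string, string>
  devDependencies?: Record<string, string>
  peerDependencies?: Record<string, string>
}

// Collect the internal links a package page would render; crawlers
// following these links discover the dependency graph recursively
function dependencyLinks(pkg: Packument): string[] {
  const names = new Set([
    ...Object.keys(pkg.dependencies ?? {}),
    ...Object.keys(pkg.devDependencies ?? {}),
    ...Object.keys(pkg.peerDependencies ?? {}),
  ])
  return [...names].map(name => `/package/${name}`)
}
```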
A contributor commented:

I'm not sure about this step.

That makes it possible that even super popular packages may be missed completely, provided that no one has created a package that depends on them, or that this problem exists further down the line.

In other words, we would only index stuff that, roughly, would get installed if you ran `pnpm install nuxt vue nitro react svelte vite next astro typescript angular` (plus their devDependencies), which to me, intuitively, sounds like a tiny fraction of the useful packages out there.

@userquin (Contributor, Author) commented Feb 3, 2026

It's effectively infinite recursion: the bot will follow those links (dependencies, devDependencies, and peerDependencies) from every package page it discovers.

Comment on lines +32 to +50
```txt
User-agent: *
Allow: /

# Block internal search results (duplicate/infinite content)
Disallow: /search

# Block user utilities and settings
Disallow: /settings
Disallow: /compare
Disallow: /auth/

# Block code explorer and docs (high crawl cost, low SEO value for general search)
Disallow: /package-code/
Disallow: /package-docs/

# Block internal API endpoints
Disallow: /api/
```
A contributor commented:

npmjs also blocks old versions from being indexed:

https://www.npmjs.com/robots.txt

I think it makes a lot of sense.
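
If npmx adopts the same idea, and assuming per-version pages live under `/package/<name>/v/<version>` (a hypothetical route), the block could be extended like this:

```txt
# Hypothetical: keep old per-version pages out of the index so only
# the canonical package page ranks (the * wildcard is a nonstandard
# extension, but it is honored by Googlebot and Bingbot)
Disallow: /package/*/v/
```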


- Search traffic is predominantly in English (package names, technical terms).
- We avoid the complexity of managing `hreflang` and duplicate content across 20+ languages.
- User Experience (UX) remains localized: users land on the page (indexed in English), and the client hydrates the app in their preferred language.
A contributor commented:

...and the vast majority of READMEs are in English anyway, and they make up a significant share of npmx's displayed content.

Comment on lines +93 to +97
### Canonical URLs & i18n

- **Canonical Rule:** The canonical URL is **always the English (default) URL**, regardless of the user's selected language or browser settings.
- Example: `https://npmx.dev/package/react`
- **Reasoning:** Since we do not use URL prefixes for languages (e.g., `/es/...`), there is technically only _one_ URL per resource. The language change happens client-side. Therefore, the canonical tag must point to this single, authoritative URL to prevent confusion for search engines.
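
If kept, a minimal sketch of emitting that canonical tag with Nuxt's auto-imported `useHead` and `useRoute`; the page path is hypothetical:

```ts
// In <script setup> of a package page, e.g. app/pages/package/[name].vue
const route = useRoute()

useHead({
  link: [{
    rel: 'canonical',
    // One language-independent URL per resource; never a locale variant
    href: `https://npmx.dev${route.path}`,
  }],
})
```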
A contributor commented:

Given the i18n mechanics we have, is this section even relevant?

@userquin (Contributor, Author) replied:

I just added a draft version; we should discuss the document on Discord: https://discord.com/channels/1464542801676206113/1468368119528685620

Co-authored-by: Wojciech Maj <kontakt@wojtekmaj.pl>