Site Build & Technical SEO
The technical foundation — build it right (0 → 1)
The Site Build & Technical SEO layer solves one problem: letting search engines smoothly find, crawl, understand, and index your pages. It sits after keyword research and before content production—no matter how precise your research or how good your writing, if crawlers can’t read your pages, your URLs are a mess, or the page takes three seconds and shows a blank screen, ranking is off the table. The good news: this layer is the battlefield where people who can write code have the biggest edge. The vast majority of “technical SEO problems” are, at heart, config files, HTTP headers, HTML tags, and performance optimization—stuff you already know. Below is a quick overview of this layer.
Domain & Hosting
- What it is: The domain is the site’s identity; the host/server determines how fast pages respond and how available they are.
- Why it matters: Slow responses and frequent downtime directly drag down crawl efficiency and user experience; HTTPS is a baseline ranking signal.
- How to do it:
- Pick a domain that is short, memorable, and topically relevant; a new domain gets no “domain age” bonus, but it also carries no historical baggage.
- Force HTTPS site-wide, and use 301 redirects to consolidate http:// and the with-/without-www variants into a single canonical version, so you’re not treated as multiple sites.
- Prefer hosting with a CDN and data centers close to your target users; the lower the Time To First Byte (TTFB), the better.
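The consolidation rule above can be sketched as a small function that decides where a request should be 301-redirected. This is only an illustration: `PREFERRED_HOST` and `redirect_target` are hypothetical names, and the sketch assumes the bare domain over HTTPS is the canonical version and that default ports are in use.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Assumption for illustration: https://example.com is the canonical origin.
PREFERRED_HOST = "example.com"


def redirect_target(url: str) -> Optional[str]:
    """Return the URL to 301 to, or None if `url` is already canonical."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Treat the bare host and its www. variant as the same site.
    if host not in (PREFERRED_HOST, "www." + PREFERRED_HOST):
        return None  # a different site; nothing to consolidate
    if parts.scheme == "https" and host == PREFERRED_HOST:
        return None  # already the canonical version
    # Keep path and query so deep links survive the redirect.
    return urlunsplit(("https", PREFERRED_HOST, parts.path or "/", parts.query, ""))
```

In practice this logic usually lives in the web server or CDN configuration; the function is just a way to test the rule against a list of URL variants.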
🧑💻 Developer’s view: run curl -I https://your-domain to inspect the response headers. Confirm the status code is 200, that strict-transport-security is present, and that there’s no unexpected x-robots-tag: noindex.
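If you want to automate the same three checks, a minimal sketch could look like the following. `audit_headers` is a hypothetical helper; it takes a status code and a header mapping from whatever HTTP client you use, so it makes no network calls itself.

```python
def audit_headers(status: int, headers: dict[str, str]) -> list[str]:
    """Return a list of problems found in one HTTP response."""
    # Header names are case-insensitive, so normalise them first.
    h = {name.lower(): value for name, value in headers.items()}
    problems = []
    if status != 200:
        problems.append(f"unexpected status {status}")
    if "strict-transport-security" not in h:
        problems.append("missing strict-transport-security")
    if "noindex" in h.get("x-robots-tag", "").lower():
        problems.append("x-robots-tag contains noindex")
    return problems
```

An empty list means the response passed all three checks.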
Site Architecture
- URL structure: Use semantic, lowercase, hyphen-separated paths like example.com/seo/technical-seo, and avoid parameter pileups like ?id=123&cat=7.
- Hierarchy: Keep important pages reachable within 3 clicks from the homepage whenever possible; the shallower the hierarchy, the smoother authority flows and crawling happens.
- Internal Links & Topic Cluster: Use a single “Pillar Page” to anchor a topic, then have several “Cluster Pages” link to it and to each other, signaling topical authority to search engines.
- How to do it: Keep navigation and breadcrumb structure clear, and make internal-link anchor text use descriptive keywords rather than “click here.”
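To make the URL convention concrete, here is a minimal sketch of a slug function that turns a page title into a lowercase, hyphen-separated path segment. `slugify` is a hypothetical helper, and it simply drops non-Latin characters, which may not be what you want for a multilingual site.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a title into a lowercase, hyphen-separated URL segment."""
    # Strip accents so "Café" becomes "Cafe"; other non-ASCII text is dropped.
    ascii_text = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")
```

For example, `slugify("Technical SEO Checklist (2026)")` gives `technical-seo-checklist-2026`.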
💡 Tip: Picture your site as a tree. The homepage is the root, categories are the branches, and articles are the leaves—crawlers “climb the tree” along internal links, and a broken link is a broken branch.
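The “within 3 clicks” guideline can be checked with a breadth-first search over your internal-link graph. The sketch below assumes you already have a mapping from each page to the pages it links to (for example from a crawl); `click_depths` and `too_deep` are hypothetical names.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], home: str = "/") -> dict[str, int]:
    """Breadth-first search: fewest clicks from `home` to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # the first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def too_deep(links: dict[str, list[str]], home: str = "/", max_clicks: int = 3):
    """Return (pages deeper than `max_clicks`, pages no link path reaches)."""
    depths = click_depths(links, home)
    deep = sorted(page for page, depth in depths.items() if depth > max_clicks)
    orphans = sorted(set(links) - set(depths))
    return deep, orphans
```

Pages in the second list are orphans: they exist in the crawl data, but no chain of internal links leads to them from the homepage.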
Technical SEO Checklist
This is the core of this layer. Go through each item:
| Item | What it does | Key points |
|---|---|---|
| robots.txt | Tells crawlers which paths can be crawled | Put it at the site root; don’t use it to “hide” pages, because it doesn’t prevent indexing |
| sitemap.xml | Lists the URLs you want indexed | Submit it to Google Search Console; include only indexable pages |
| canonical | Designates the “original” among duplicate content | Use <link rel="canonical" href="..."> to point to the preferred URL |
| hreflang | Maps multilingual/multi-region versions | Use <link rel="alternate" hreflang="zh-CN" href="...">; references must be bidirectional |
| Structured data (Schema) | Helps search engines understand the content type | Use JSON-LD to earn Rich Results |
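The “references must be bidirectional” requirement for hreflang is easy to get wrong by hand, so it can help to check it in code. A minimal sketch, assuming you have already extracted each page’s hreflang annotations into a dictionary (`missing_return_links` and the data shape are illustrative, not a standard API):

```python
def missing_return_links(alternates: dict[str, dict[str, str]]) -> list[tuple[str, str]]:
    """Find (source, target) pairs where target does not link back to source.

    `alternates` maps each page URL to its hreflang annotations, e.g.
    {"https://example.com/en/": {"zh-CN": "https://example.com/zh/"}}.
    """
    missing = []
    for source, annotations in alternates.items():
        for target in annotations.values():
            if target == source:
                continue  # a self-reference needs no return link
            back = alternates.get(target, {})
            if source not in back.values():
                missing.append((source, target))
    return missing
```

An empty result means every alternate page points back to the page that references it.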
- Mobile-First Indexing: Google primarily uses the mobile version of pages for indexing and ranking, so make sure the mobile version carries the same content and structured data as the desktop version.
- Core Web Vitals: LCP (Largest Contentful Paint ≤ 2.5s), INP (Interaction to Next Paint ≤ 200ms), CLS (Cumulative Layout Shift ≤ 0.1).
- Crawl Budget & Duplicate Content: Use canonical tags, parameter handling, and sensible internal linking to keep crawlers from wasting their budget on pointless duplicate URLs.
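As an illustration of parameter handling, the sketch below groups URLs that differ only by parameters you consider irrelevant. The `IGNORED_PARAMS` set is an assumption made for this example; which parameters are safe to ignore depends entirely on your site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: on this hypothetical site these parameters never change the content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "ref"}


def canonical_form(url: str) -> str:
    """Drop ignored parameters, sort the rest, and remove the fragment."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    )
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def group_duplicates(urls: list[str]) -> dict[str, list[str]]:
    """Map each canonical form to the crawled URLs that collapse into it."""
    groups: dict[str, list[str]] = {}
    for url in urls:
        groups.setdefault(canonical_form(url), []).append(url)
    return {canon: members for canon, members in groups.items() if len(members) > 1}
```

Each group with more than one member is a candidate for a canonical tag pointing at the key.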
A minimal robots.txt example (note the sitemap uses an absolute URL):
User-agent: *
Allow: /
Disallow: /admin/
Sitemap: https://example.com/sitemap.xml
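To see how the Allow and Disallow lines above interact, here is a simplified sketch of the longest-match rule described in RFC 9309 and used by Google: the most specific matching rule wins, and Allow wins a tie. `is_allowed` is a hypothetical helper that handles plain path prefixes only; wildcards (`*` and `$`) are deliberately left out.

```python
def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Evaluate the Allow/Disallow rules of one user-agent group.

    The longest matching prefix wins; on a tie, Allow wins.
    A path that matches no rule is allowed.
    """
    best_len, allowed = -1, True
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # an empty rule value matches nothing
        is_allow = directive.lower() == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len, allowed = len(prefix), is_allow
    return allowed


# The rules from the robots.txt example above.
RULES = [("Allow", "/"), ("Disallow", "/admin/")]
```

With these rules, `/admin/users` is blocked because `/admin/` is a longer match than `/`, while `/blog/post` stays crawlable.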
A skeleton of structured data (JSON-LD) for an article:
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Article",
"headline": "Introduction to Technical SEO",
"author": { "@type": "Person", "name": "Your Name" },
"datePublished": "2026-06-18"
}
</script>
🧑💻 Developer’s view: don’t hand-write these files and risk errors. Just use this site’s robots/sitemap generator and Schema generator to produce them, then validate with Google’s Rich Results Test.
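If the sitemap is produced in your own build step instead, a minimal sketch using Python’s standard library might look like this. `build_sitemap` is a hypothetical helper; it assumes you pass absolute URLs and last-modified dates already formatted as YYYY-MM-DD.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages: list[tuple[str, str]]) -> str:
    """Build sitemap XML from (absolute URL, last-modified date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" when serialising.
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    xml_body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + xml_body
```

Building the XML through a library rather than string concatenation means special characters in URLs are escaped for you.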
Choosing a Build Tool (CMS / Framework)
Different tools vary widely in how “SEO-friendly by default” they are:
- WordPress: A mature ecosystem; plugins like Yoast/Rank Math handle meta, sitemap, and Schema for you; the downside is that performance has to lean on caching and optimization plugins.
- Webflow: Visual site building, with SEO fields (title, canonical, hreflang) configurable out of the box; great for people who don’t want to touch a server, though custom logic is limited.
- Next.js: A React framework supporting SSR/SSG with the most control; but you have to “fill in” SEO yourself using next/head (Pages Router) or generateMetadata (App Router) plus dynamic sitemap routes, and any content rendered only on the client (CSR) is unfriendly to crawlers.
- Astro: Built for content sites, outputting static HTML with zero unnecessary JS by default, which naturally fits Core Web Vitals and crawling; especially handy for blogs/docs/SEO sites (this site uses Astro).
⚠️ Note: For pure client-side-rendered (CSR) single-page apps, the first-paint HTML may be an empty shell. Prefer static generation (SSG) or server-side rendering (SSR) so crawlers get the full content.
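One rough way to spot an “empty shell” is to fetch the raw HTML without executing JavaScript and count the visible words in the body. The sketch below is a heuristic only: `looks_like_empty_shell` and the `min_words` threshold of 20 are assumptions for illustration, not a rule any search engine publishes.

```python
from html.parser import HTMLParser


class _BodyText(HTMLParser):
    """Collects visible text inside <body>, skipping script-like elements."""

    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.in_body and not self.skip_depth:
            self.chunks.append(data)


def looks_like_empty_shell(html: str, min_words: int = 20) -> bool:
    """True if the server-delivered HTML has fewer than `min_words` visible words."""
    parser = _BodyText()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.chunks).split()) < min_words
```

Run it on the response body your server returns before any JavaScript executes; a page that is flagged here probably relies on client-side rendering for its main content.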
📌 This Layer Is Under Construction
The full tutorial (with step-by-step setup for each item and code examples) is being written—stay tuned. For now, use the quick checklist below to lay a solid foundation:
- Force HTTPS site-wide and 301 all domain variants to the single preferred URL
- Generate and submit robots.txt and sitemap.xml (you can use this site’s generator)
- Add canonical tags to key pages and sort out ownership for duplicate/paginated content
- Use JSON-LD to add structured data for articles, breadcrumbs, and organization info
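- Run Lighthouse and confirm LCP and CLS meet the thresholds; a standard Lighthouse page-load run does not measure INP, so check INP against field data (for example PageSpeed Insights or the Core Web Vitals report in Search Console)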
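For the last item, the threshold comparison itself is simple enough to script. A minimal sketch, using the thresholds quoted earlier in this section; `failing_metrics` is a hypothetical helper, and how you obtain the measured values (lab run, field data export) is up to you.

```python
# "Good" thresholds as quoted in this section:
# LCP in seconds, INP in milliseconds, CLS unitless.
THRESHOLDS = {"LCP": 2.5, "INP": 200, "CLS": 0.1}


def failing_metrics(measured: dict[str, float]) -> list[str]:
    """Names of metrics above their threshold or missing from `measured`."""
    return [
        name
        for name, limit in THRESHOLDS.items()
        if measured.get(name, float("inf")) > limit
    ]
```

A missing metric is reported as failing on purpose, so an incomplete measurement is never mistaken for a pass.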