An organic traffic plateau often points to problems search-engine bots can’t see past. Maybe new blog posts disappear from the index, pages load like molasses, or mobile users bounce after one swipe. Whatever the symptom, the root cause usually hides in the technical layer of your site—the part that crawlers explore long before humans show up.
This guide hands you the same playbook seasoned auditors rely on to surface those buried flaws and turn them into ranking gains. You’ll gather the right tools, crawl your architecture, interpret logs, and prioritize fixes with a clear, impact-first framework. Follow along and, by the time you reach the final checklist, you’ll know exactly how to run a clean, comprehensive technical SEO audit—no agency invoice required.
Ready to pop the hood? Let’s start by collecting a baseline—so every improvement you make is easy to prove and brag about later.
Before you start poking at robots directives or trimming redirect chains, line up your tools and record where the site stands today. This prep work does two things: it establishes a baseline you can measure every fix against, and it flags which areas deserve attention first.
Think of it like a doctor’s intake: vitals first, treatment second. In the context of technical SEO, those “vitals” are crawl stats, Core Web Vitals, and organic performance.
A crawler-friendly, technically sound site earns search engines’ trust. When bots can fetch, render, and understand every page quickly, rankings follow. Common triggers that signal it’s time for an audit include an organic traffic plateau, new pages that never reach the index, and load times that keep creeping up.
If you’re wondering how to do a technical SEO audit without enterprise resources, know that the process is the same whether you’re an indie blogger or a Fortune 500 brand—the difference is how many URLs you’ll crawl.
Below are five tool categories you’ll reach for over and over. Most have free tiers that are more than enough for a starter audit.
Category | Example Tools | Quick Usage Tip | Price Note |
---|---|---|---|
Crawlers | Screaming Frog, Sitebulb | Set the user-agent to Googlebot and crawl staging first to catch blockers early. | Free up to 500 URLs; unlimited in paid versions (~$219/yr for SF). |
All-in-One Suites | Semrush, Ahrefs | Use the “Site Audit” module to surface duplicate content and HTTPS issues in one view. | Both start around $119/mo; free trials available. |
Page-Speed Testers | PageSpeed Insights, WebPageTest | Compare “First Contentful Paint” across mobile vs. desktop profiles. | 100% free. |
Log Analyzers | GoAccess, Screaming Frog Log File Analyzer | Filter logs by status code 5xx to locate server misfires bots actually hit. | GoAccess is open source; SF LFA starts at $134/yr. |
Webmaster Consoles | Google Search Console, Bing Webmaster Tools | Submit your XML sitemap and monitor “Crawl Stats” to catch sudden spikes or drops. | Always free. |
Pick at least one from each row; overlapping reports provide a second opinion when metrics look off.
Document today’s numbers before you change anything:

- Organic KPIs: clicks, impressions, and average position from Google Search Console, plus organic sessions from your analytics.
- Crawl Stats: requests per day and average response time from Search Console’s “Crawl Stats” report.
- Core Web Vitals: LCP, INP, and CLS field data from PageSpeed Insights.
- Server Logs (optional but powerful): grab a recent `access.log`, then gzip and archive it for later pattern analysis.

Store everything in a version-controlled spreadsheet or a Git-tracked Markdown doc. Tag the commit with the date and audit name (`2025-07-24_initial_baseline`) so you can diff future crawls against the original. When the inevitable “Did that fix anything?” question comes up, you’ll have the receipts.
Crawlability answers one simple question: can search-engine bots reach and retrieve every URL you care about? If the answer is “sort of,” rankings stall no matter how perfect your content or backlinks are. While “crawlability” and “indexability” get lumped together in People-Also-Search-For snippets, they are not synonyms. Crawlability is the gateway; indexability is the bouncer deciding who gets listed. In this part of the technical SEO audit you’ll run a few fast checks that expose invisible walls, rogue directives, and server hiccups keeping Googlebot outside.
Most crawl blocks start with a single line of text. Head to `https://yourdomain.com/robots.txt` and eyeball it for red flags:

- `Disallow: /*` blocks the entire site—often left over from a staging launch.
- `Disallow: /wp-content/` can prevent Google from rendering CSS and JS, which harms mobile usability scores.
- A `Sitemap:` line still pointing at a staging file such as `/sitemap-staging.xml`.

Quick test: Search Console’s standalone robots.txt tester has been retired, so check the robots.txt report under Settings and run any suspect URL through the URL Inspection tool to confirm crawling is allowed.
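If you audit more than one site, the same eyeball check is easy to script. A minimal Python sketch, assuming a placeholder example.com domain and only the three red-flag patterns above (this is not an official tool, so extend the patterns to taste):

```python
import urllib.request

# A rough automated version of the manual robots.txt check. The SITE value
# and the red-flag patterns are illustrative assumptions; tune them to your site.
SITE = "https://example.com"

def robots_red_flags(site: str) -> list[str]:
    body = urllib.request.urlopen(f"{site}/robots.txt", timeout=10).read().decode("utf-8", "replace")
    flags = []
    for raw in body.splitlines():
        rule = raw.split("#", 1)[0].strip()   # drop comments
        lower = rule.lower()
        if lower.startswith("disallow:"):
            path = rule.split(":", 1)[1].strip()
            if path in ("/", "/*"):
                flags.append(f"Site-wide block: {rule}")
            elif "wp-content" in path:
                flags.append(f"May block CSS/JS rendering: {rule}")
        elif lower.startswith("sitemap:") and "staging" in lower:
            flags.append(f"Sitemap points at a staging file: {rule}")
    return flags

for warning in robots_red_flags(SITE):
    print("WARNING:", warning)
```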
Meta robots rules live either in the page’s `<head>` (`<meta name="robots" content="noindex,nofollow">`) or in the HTTP response headers (`X-Robots-Tag`). Use them when you need page-level precision—think thank-you pages or beta feature URLs. During the audit, export all pages with any `noindex` value using Screaming Frog’s “Directives” report and verify each one is intentional.
With gatekeepers reviewed, fire up a desktop crawler:

- Set the user-agent to `Googlebot` or `Bingbot` so the crawl sees what search engines see.
- Crawl subdomains too if assets live on `cdn.example.com` or your blog sits on a subdomain.

What to flag while it runs:

- Redirect chains (`/old` → `/2023-old` → `/new`), which waste crawl budget.
- Infinite URL parameters (calendar-style links such as `?day=next`) that spawn endless URLs.

Once finished, export the “Response Codes” and “Redirect Chains” reports—these become your to-fix punch list.
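While the crawler chews through the site, you can pre-check a shortlist of suspect URLs for chained redirects from the command line. A rough sketch using the third-party requests library; the URLS list is a placeholder:

```python
import requests  # third-party: pip install requests

# Trace each URL's redirect path and report anything longer than one hop.
URLS = ["https://example.com/old", "http://example.com/"]

for url in URLS:
    resp = requests.get(url, allow_redirects=True, timeout=10)
    hops = [r.url for r in resp.history] + [resp.url]
    if len(resp.history) > 1:
        print(f"CHAIN ({len(resp.history)} hops): " + " -> ".join(hops))
    elif resp.history:
        print(f"SINGLE HOP: {url} -> {resp.url} [{resp.history[0].status_code}]")
    else:
        print(f"NO REDIRECT: {url} [{resp.status_code}]")
```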
Below is a lightning primer on HTTP status codes you’ll see and the fastest remedies:
Code Group | Meaning | Typical Causes | Fix |
---|---|---|---|
2xx | Success | Healthy pages | No action |
3xx | Redirect | Site moves, trailing slash, http→https | Consolidate to single 301 hop |
4xx | Client error | Deleted pages, bad links | 301 to closest match or reinstate content |
5xx | Server error | Timeouts, misconfigured PHP, overloaded hosting | Check server logs, upgrade resources, patch scripts |
Mini-checklist for quick wins:

- Remove any leftover `Disallow` rules that block whole sections or assets.
- Collapse redirect chains into a single 301 hop.
- Redirect or reinstate high-value 404 pages.
- Trace recurring 5xx errors back to their server-log entries.

Completing these tasks removes friction at the very first step of discovery, which is why seasoned pros tackle crawlability before moving the technical SEO audit on to deeper issues. With the crawl gates wide open, you’re ready to confirm that the right pages actually make it into the index.
Passing the crawl test doesn’t guarantee rankings. Google can fetch a URL and still refuse to file it in its index, leaving you with the dreaded “Crawled – currently not indexed” or “Discovered – currently not indexed” messages in Search Console. The purpose of this step in a technical SEO audit is to prove that (a) every revenue-driving page is eligible for inclusion and (b) low-value or duplicate URLs stay out of the way. Think of it as quality control on the content warehouse you just opened to bots.
An XML sitemap is your curated inventory list. Treat it like one: list only live, indexable, canonical URLs, and reference it in both `/robots.txt` (`Sitemap: https://example.com/sitemap.xml`) and Search Console.

Health check workflow: crawl every URL in the sitemap and confirm it returns 200, carries no `noindex`, and canonicalizes to itself (a scripted sketch follows below).
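Here is one way that workflow could be scripted. It assumes a flat urlset sitemap at a placeholder URL and uses a crude string match for noindex, so treat its output as a review queue rather than a verdict:

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

# Every sitemap URL should answer 200 and carry no noindex.
# Assumes a flat <urlset> sitemap, not a sitemap index file.
SITEMAP_URL = "https://example.com/sitemap.xml"
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

sitemap_xml = urllib.request.urlopen(SITEMAP_URL, timeout=10).read()
locs = [el.text.strip() for el in ET.fromstring(sitemap_xml).findall(".//sm:loc", NS)]

for loc in locs:
    try:
        resp = urllib.request.urlopen(loc, timeout=10)
        status = resp.status
        noindex = "noindex" in resp.read().decode("utf-8", "replace").lower()
    except urllib.error.HTTPError as err:
        status, noindex = err.code, False
    if status != 200 or noindex:
        print(f"REVIEW: {loc}  status={status}  noindex={noindex}")
```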
Cleaning the sitemap can reduce the “Indexed, not submitted” noise and sharpen Google’s understanding of site priorities.
Duplicate content dilutes crawl budget and splits link equity. The `rel="canonical"` tag is your traffic cop, telling bots which version is the “master record.”

Quick guidelines:

- Give every indexable page a self-referencing canonical.
- Point the tag at the absolute, preferred URL (protocol, host, and trailing slash included).
- Never canonicalize to a URL that redirects, errors out, or carries `noindex`.

Example: a product that comes in several colors might live at:

- `/tees/classic-tee?color=black`
- `/tees/classic-tee?color=blue`

Set each variant’s canonical to `/tees/classic-tee` and add unique copy only where it matters (e.g., alt text, swatch labels) to keep relevance without cluttering the index.
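To confirm the variants actually point where you expect, a quick fetch-and-compare works. A sketch with placeholder example.com URLs and a deliberately naive regex (it assumes rel appears before href in the tag):

```python
import re
import urllib.request

# Check that each color variant canonicalizes to the clean product URL.
EXPECTED = "https://example.com/tees/classic-tee"
VARIANTS = [
    "https://example.com/tees/classic-tee?color=black",
    "https://example.com/tees/classic-tee?color=blue",
]
# Naive pattern: assumes rel="canonical" comes before href in the tag.
CANONICAL_RE = re.compile(r'<link[^>]+rel=["\']canonical["\'][^>]+href=["\']([^"\']+)["\']', re.I)

for variant in VARIANTS:
    html = urllib.request.urlopen(variant, timeout=10).read().decode("utf-8", "replace")
    match = CANONICAL_RE.search(html)
    canonical = match.group(1) if match else None
    verdict = "OK" if canonical == EXPECTED else "MISMATCH"
    print(f"{verdict}: {variant} -> {canonical}")
```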
Directives often overlap in everyday discussions about how to do a technical SEO audit, but their effects differ. Use the matrix below as a cheat sheet:
Directive | Location | Bot Can Crawl? | Bot Can Index? | When to Use |
---|---|---|---|---|
`noindex` | Meta/X-Robots | Yes | No | Thank-you pages, internal search results |
`nofollow` | Meta/Link attr. | Yes | Yes, but links pass no PageRank | UGC comments, paid links |
`Disallow` | robots.txt | No (blocked) | Possibly (a blocked URL can still be indexed without content if linked externally) | Infinite calendars, dev folders |
Audit steps:

- Export every URL carrying a `noindex` value. Manually verify each one: does it truly add no SEO value?
- For parameter-driven URLs (`?sort=`, `?sessionid=`), keep them crawlable but set a `noindex` directive until you can consolidate them via canonicals (Google has retired the old GSC URL Parameters tool, so canonicals now do that job).

By the end of this phase you should have:

- a sitemap that lists only live, indexable, canonical URLs;
- a verified, intentional list of `noindex` pages;
- canonical tags pointing every duplicate or parameter variant at its master version.
When these boxes are ticked, every visit from Googlebot becomes a clear invitation to store and rank your best work—setting the stage for the architectural and performance tweaks coming next.
Crawlability and indexability tell you that Googlebot can reach your pages, but site architecture determines how quickly it gets there—and whether authority flows in the right direction once it does. A clean hierarchy shortens click paths, concentrates PageRank on money pages, and makes maintenance less of a nightmare when your catalog or blog inevitably grows. This portion of the technical SEO audit focuses on mapping those paths, spotting link equity leaks, and tightening up any structural loose ends.
A “flat” architecture keeps every important page within three, maybe four clicks of the homepage. Anything deeper starts to look like a forgotten basement to crawlers and humans alike.
Visual cheat sheet (text version):

```
Homepage
├─ /category/
│   ├─ /category/sub/
│   │   └─ /product/        ← ideal: depth 3
│   └─ /article/
└─ /about/
```
If your audit reveals paths that look like `/store/dept/region/season/collection/product`, you’re burying revenue.
Humans and robots both prefer URLs that read like breadcrumb trails: short, lowercase, hyphenated slugs such as `/mens-running-shoes`. During the crawl, filter for mixed cases (`/Blog/` vs. `/blog/`) and both trailing-slash versions. Canonicalize or 301 everything to the chosen standard to avoid duplicate-content headaches.
Orphan pages sit in your XML sitemap (or nowhere) but receive zero internal links. They’re ghosts—Google may stumble upon them, but odds are slim.
Detection recipe: export the URL list from your sitemap and the internal-link data from your crawler, then use a spreadsheet or SQL `LEFT JOIN` to flag sitemap URLs missing from the crawl’s “Inlinks” table (see the sketch below).

For broken links, sort the crawler’s “Response Codes” report by 4xx and click the “Inlinks” tab to see every anchor pointing to those dead ends. Fix options: update the internal link, 301 the dead URL to the closest matching page, or reinstate the content.
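The LEFT JOIN can live in a spreadsheet, but a few lines of Python do the same job. This sketch assumes a placeholder sitemap URL and Screaming Frog’s “All Inlinks” export with its default “Destination” column; rename to match your own crawler:

```python
import csv
import urllib.request
import xml.etree.ElementTree as ET

# Flag orphan candidates: sitemap URLs that never appear as a link target
# in the crawl export. File and column names are assumptions.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
sitemap_xml = urllib.request.urlopen("https://example.com/sitemap.xml", timeout=10).read()
sitemap_urls = {el.text.strip() for el in ET.fromstring(sitemap_xml).findall(".//sm:loc", NS)}

with open("all_inlinks.csv", newline="", encoding="utf-8") as fh:
    linked_urls = {row["Destination"] for row in csv.DictReader(fh)}

for url in sorted(sitemap_urls - linked_urls):
    print("ORPHAN CANDIDATE:", url)
```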
E-commerce filters, calendar widgets, and sort options can explode your crawlable URL count overnight. Left unchecked, they burn crawl budget and create duplicate content.
Guidelines that keep things sane:
Challenge | Recommended Action |
---|---|
Paginated lists | Offer a “View All” page if feasible; otherwise, ensure each page has a unique `<title>` and a self-canonical. |
Faceted filters (`?color=black&size=xl`) | Canonical back to the non-filtered category page unless the facet generates significant, unique demand. |
Sort parameters (`?sort=price-asc`) | Block via robots.txt and add a `noindex` meta tag to any versions already discovered (the GSC URL Parameters tool has been retired). |
Pro tip: keep only revenue-generating parameters crawlable; disallow or consolidate everything else.
By tightening architecture and internal links you not only guide search bots efficiently but also craft a smoother user journey—two birds, one logical site structure. Next up, we’ll tackle performance signals that influence both rankings and bounce rates.
Crawlable, indexable pages still under-perform if they frustrate users. Google’s Page Experience signals measure that frustration in milliseconds and layout shifts, then bake the score into ranking calculations. Treat this step of your technical review as a hybrid UX and infrastructure checkup: you’re tuning servers, code, and design so that humans stay engaged and search engines notice. The tests below are quick to run and frequently reveal the biggest ROI fixes when learning how to do a technical SEO audit that actually moves the needle.
Google currently tracks three Core Web Vitals (CWV). Hit the recommended thresholds below and you’re in the “green” zone:
Metric | Good | Needs Improvement | Poor |
---|---|---|---|
Largest Contentful Paint (LCP) | ≤ 2.5 s | 2.5–4 s | > 4 s |
Interaction to Next Paint (INP)* | ≤ 200 ms | 200–500 ms | > 500 ms |
Cumulative Layout Shift (CLS) | ≤ 0.1 | 0.1–0.25 | > 0.25 |
*INP replaced First Input Delay (FID) as the stable responsiveness metric in March 2024.
How to pull the numbers: run key templates through PageSpeed Insights, or query the `https://pagespeedonline.googleapis.com/pagespeedonline/v5/runPagespeed` API endpoint and dump results into a sheet (a scripted sketch follows below).

Quick wins that often shave seconds:

- Lazy-load below-the-fold images with `loading="lazy"`.
- Compress and correctly size the hero image, the usual LCP element.

Always retest after changes—CWV improvements are only “real” when the field data turns green for 28 straight days.
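If you go the API route, a small script keeps the sheet fresh. A sketch against the v5 endpoint named above; it reads the Lighthouse lab audits (field data, where available, sits under loadingExperience), and an API key is only needed for heavier usage:

```python
import json
import urllib.parse
import urllib.request

# Pull lab metrics for a URL from the PageSpeed Insights v5 API.
PSI_ENDPOINT = "https://pagespeedonline.googleapis.com/pagespeedonline/v5/runPagespeed"

def cwv_snapshot(page_url: str, strategy: str = "mobile") -> dict:
    query = urllib.parse.urlencode({"url": page_url, "strategy": strategy})
    with urllib.request.urlopen(f"{PSI_ENDPOINT}?{query}", timeout=60) as resp:
        data = json.load(resp)
    audits = data["lighthouseResult"]["audits"]
    return {
        "LCP": audits["largest-contentful-paint"]["displayValue"],
        "CLS": audits["cumulative-layout-shift"]["displayValue"],
    }

print(cwv_snapshot("https://example.com/"))
```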
Since mobile-first indexing rolled out, Google primarily crawls your mobile view. Parity between desktop and handhelds isn’t optional.
Checklist:

- Confirm the `<meta name="viewport" content="width=device-width,initial-scale=1">` tag exists on every template.
- If you still serve a separate `m.example.com`, plan a responsive migration and consolidate URLs via 301s.
Google called HTTPS a “tiebreaker” ranking factor back in 2014, but user trust is the bigger prize. Run SSL Labs’ free test or `openssl s_client -connect example.com:443` to verify that the certificate is valid and unexpired, the full chain is served, and only modern TLS versions are offered.
Next, add security headers at the server or CDN layer:

```
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
X-Frame-Options: SAMEORIGIN
Content-Security-Policy: upgrade-insecure-requests;
```
During your crawl, filter for any resource requested over `http://`. Mixed content can silently break CSS/JS in Chrome and will always trigger a warning in Lighthouse. Update asset URLs or route them through your CDN’s automatic HTTP→HTTPS rewrite.
Images are often 50–70% of total page weight. Lighter files mean faster LCP and happier visitors.

- Convert heavy JPEG/PNG assets to a modern format such as WebP where support allows.
- Serve responsive variants with `srcset` plus `sizes` so browsers pick the smallest file for each viewport.

Implementation snippet:
```html
<img
  src="shirt-480.webp"
  srcset="shirt-320.webp 320w, shirt-480.webp 480w, shirt-800.webp 800w"
  sizes="(max-width: 600px) 90vw, 480px"
  alt="Classic black tee"
  loading="lazy"
/>
```
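Generating those width variants by hand gets old fast. A sketch using the third-party Pillow library; the shirt.jpg source file, widths, and quality setting are all placeholders:

```python
from pathlib import Path

from PIL import Image  # third-party: pip install Pillow

# Produce the WebP variants referenced in the snippet above from one master image.
MASTER = Path("shirt.jpg")
WIDTHS = (320, 480, 800)

with Image.open(MASTER) as img:
    for width in WIDTHS:
        height = round(img.height * width / img.width)  # keep aspect ratio
        variant = img.resize((width, height), Image.Resampling.LANCZOS)
        variant.save(f"shirt-{width}.webp", format="WEBP", quality=80)
        print(f"wrote shirt-{width}.webp ({width}x{height})")
```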
Don’t forget the supporting formats either: PNGs, for instance, can be losslessly crushed with `oxipng -o 4`.

Finally, rerun your PageSpeed or Lighthouse tests. A sub-2 s LCP often follows a disciplined image workflow, pushing you over the CWV finish line.
By nailing these performance levers—Core Web Vitals, mobile usability, secure delivery, and optimized media—you transform the “feel” of your site for both bots and humans. With speed bottlenecks resolved, the rest of your technical SEO audit can focus on on-page elements and log-level insights without worrying that sluggish load times are masking other victories.
The crawl, index, and performance tune-ups you’ve completed so far clear the runway; now it’s time to make sure each individual page is built to fly. On-page technical elements tell search engines what the page is, where it belongs, and how (or if) it should appear in results. Skip this layer and you can still rank—just not for the queries or SERP features you deserve. The checkpoints below slot neatly into every workflow on how to do a technical SEO audit, whether you manage ten URLs or ten million.
HTTP responses are the first conversation Googlebot has with your server. Keep it short and sweet:
Code | Use It For | Watch Out For |
---|---|---|
200 OK | Live, indexable pages | None—this is the goal |
301 Moved Permanently | Consolidations, canonical moves | Redirect chains, wrong canonical target |
302/307 Temporary | A/B tests, limited promos | Leaving them in place after the promo ends |
404 Not Found | Deleted content without replacement | High-traffic URLs still attracting links |
410 Gone | Content intentionally removed forever | Same pitfalls as 404 |
5xx Server Errors | Maintenance windows | Hosting limits, buggy plugins causing random outages |
Quick troubleshooting tree: check the server error log (e.g., `/var/log/nginx/error.log`) for spikes in memory or PHP failures, fix the culprit, then rerun `curl -I -L https://example.com/page` to confirm the fix.

Rich results—FAQ accordions, review stars, recipe cards—come from clean schema. Validation is binary: “eligible” or “missing/invalid.”
- Stick to schema types that match the template: [Article](https://rankyak.com/blog/how-to-write-seo-articles), `Product`, `FAQ`, `Breadcrumb`.
- Add the markup as JSON-LD in a `<script type="application/ld+json">` block in the `<head>`.

Pro tips:

- Re-validate after every template change; Google’s Rich Results Test returns the same binary verdict in seconds.
- Don’t blanket-inject `Organization` markup into every page—irrelevant markup violates guidelines.
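If your templates render JSON-LD from data, generating it programmatically keeps the markup consistent. A minimal sketch with placeholder product values (validate the real output before shipping):

```python
import json

# Build a minimal Product JSON-LD block; the field values are placeholders.
product_ld = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Classic Black Tee",
    "description": "Heavyweight cotton tee with a relaxed fit.",
    "offers": {
        "@type": "Offer",
        "price": "29.00",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
}

snippet = (
    '<script type="application/ld+json">\n'
    + json.dumps(product_ld, indent=2)
    + "\n</script>"
)
print(snippet)  # paste into the <head> of the product template
```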
Best-practice checklist:

- One `<h1>` per page that mirrors the primary intent keyword.
- Nested subheadings (`<h2>`, `<h3>`) to group topics—never skip levels.

Reusable title formula (adapt as needed): `<Primary Keyword> | <Unique Value> – <Brand>`

Example: Running Shoes for Flat Feet | Free 60-Day Returns – SportHub
Use your crawler’s “Missing or Duplicate H1” and “Duplicate Title” reports to flag thin or reused tags, then batch-edit in your CMS or via database script.
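Those reports can also be reproduced from a raw crawl export. A sketch that assumes Screaming Frog’s internal HTML export with its default “Address” and “Title 1” columns; adjust the file and column names for other crawlers:

```python
import csv
from collections import Counter

# Flag duplicate or missing titles in a crawl export.
with open("internal_html.csv", newline="", encoding="utf-8") as fh:
    rows = list(csv.DictReader(fh))

title_counts = Counter(
    row["Title 1"].strip().lower() for row in rows if row.get("Title 1")
)

for row in rows:
    title = row.get("Title 1", "").strip()
    if not title:
        print(f"MISSING TITLE:   {row['Address']}")
    elif title_counts[title.lower()] > 1:
        print(f"DUPLICATE TITLE: {row['Address']}  ->  {title}")
```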
If your site targets multiple languages or regions, `hreflang` is non-negotiable. Misconfiguration leads to self-cannibalization—Google may serve the wrong locale to the wrong audience.

Syntax refresher:

```html
<link rel="alternate" hreflang="en-us" href="https://example.com/us/page" />
<link rel="alternate" hreflang="en-gb" href="https://example.com/uk/page" />
<link rel="alternate" hreflang="x-default" href="https://example.com/page" />
```

Audit steps:

- Export all `hreflang` tags with your crawler’s “Hreflang” report.
- Confirm every pair is reciprocal; each alternate must reference the original back.
- Make sure every listed URL resolves with a 200, because `301`s break the mapping.

When all four on-page pillars are sturdy, search engines can parse, enrich, and geo-target your content with confidence. Up next: the server-side reality check—log file analysis and crawl budget optimization.
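Reciprocity is the step most often missed, and it is scriptable. A rough sketch with placeholder locale URLs and a naive regex for the alternate links (it assumes rel appears before href):

```python
import re
import urllib.request

# Rough reciprocity check: every page listed as an alternate should list the
# original page back. PAGES is a placeholder seed list of locale URLs.
PAGES = ["https://example.com/us/page", "https://example.com/uk/page"]
ALTERNATE_RE = re.compile(
    r'<link[^>]+rel=["\']alternate["\'][^>]+href=["\']([^"\']+)["\']', re.I)

def alternates(url: str) -> set[str]:
    html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "replace")
    return set(ALTERNATE_RE.findall(html))

alt_map = {page: alternates(page) for page in PAGES}
for page, alts in alt_map.items():
    for alt in alts:
        if alt in alt_map and page not in alt_map[alt]:
            print(f"MISSING RETURN TAG: {alt} does not reference {page}")
```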
Crawl reports are estimations; server logs are the receipts. Every request—bot or human—lands in an access.log
file, showing timestamp, user-agent, response code, and bytes served. Mining this raw data tells you exactly how search engines spend their crawl budget, which URLs they ignore, and where they get stuck in loops. A quick log audit is often the missing link when you’re figuring out how to do a technical SEO audit that scales past what desktop spiders can simulate.
First, get the goods:

- Apache: `/var/log/apache2/access.log`
- Nginx: `/var/log/nginx/access.log`

Always gzip large files—bots rack up gigs fast. Then parse with a dedicated tool:
Tool | Best For | Setup |
---|---|---|
Screaming Frog Log File Analyzer | Point-and-click insights, matches against crawl data | Import .log → set filters |
GoAccess | Real-time dashboards in terminal or HTML | goaccess access.log -o report.html |
Filter by known bot user-agents:
grep -E "Googlebot|Bingbot|DuckDuckBot" access.log > bots.log
Now you have a clean dataset of crawling activity for the last 7–30 days.
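From here, a short script can rank what bots actually request. This sketch assumes the combined log format and the bots.log file produced by the grep above; remember that user-agent strings can be spoofed, so reverse-DNS checks remain the strict way to verify Googlebot:

```python
import re
from collections import Counter

# Count bot hits per URL (and 5xx responses) from the filtered log.
# The regex assumes a "combined" log format; adjust if your server differs.
LINE_RE = re.compile(r'"(?:GET|POST|HEAD) (?P<path>\S+) HTTP/[^"]*" (?P<status>\d{3})')

hits, errors = Counter(), Counter()
with open("bots.log", encoding="utf-8", errors="replace") as fh:
    for line in fh:
        match = LINE_RE.search(line)
        if not match:
            continue
        hits[match.group("path")] += 1
        if match.group("status").startswith("5"):
            errors[match.group("path")] += 1

print("Top crawled URLs:")
for path, count in hits.most_common(20):
    print(f"{count:6}  {path}  (5xx: {errors[path]})")
```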
Patterns matter more than isolated hits. Red flags include:

- Pagination spirals (`?page=1&page=2&page=3…`)
- Far-future calendar URLs (`/events/2099-12-31`)

Load `bots.log` into your analyzer and sort by URL frequency. Anything with triple-digit hits but zero organic sessions in Google Analytics is probably a crawl trap. Common fixes start with blocking the offending pattern in `robots.txt` (`Disallow: /*?page=*`).

Always retest in Google Search Console’s URL Inspection → Live Test to verify the trap is sealed.
Log data also reveals what bots aren’t seeing enough. For massive sites, focus on URLs that drive revenue or attract backlinks. A simple two-column sheet—Bot Hits vs. Organic Sessions—makes it obvious where crawl budget is wasted and where it should be reallocated.
By aligning your server logs with business metrics, you transform raw data into an actionable plan: close crawl traps, surface money pages, and ensure every bot visit advances your ranking goals rather than burning bandwidth.
You’ve surfaced an ocean of data—redirect chains, sluggish LCPs, rogue noindex
tags. Now comes the part that separates dabbling from doing: triage. A clear prioritization framework tells you what to fix first, who will own it, and how to measure success. Without it, even the best-documented technical SEO audit devolves into a backlog nobody touches.
Plot every issue on a simple 2 × 2 grid that weighs business impact against implementation effort.
|  | Low Effort | High Effort |
|---|---|---|
| High Impact | Do First (e.g., remove `noindex` from top converters) | Plan & Schedule (e.g., full HTTPS migration) |
| Low Impact | Nice to Have (e.g., tweak alt text on old blog images) | Re-evaluate (e.g., redesign legacy sub-site no one visits) |
Label each finding, then sort the sheet by quadrant. The top-left box becomes tomorrow morning’s to-do list.
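If the sheet lives in a script rather than a spreadsheet, the same quadrant logic is a few lines. The issues list and the 1–5 scores below are illustrative assumptions:

```python
# Assign each audit finding to a quadrant so the list sorts itself.
issues = [
    {"issue": "noindex on top converters", "impact": 5, "effort": 1},
    {"issue": "full HTTPS migration", "impact": 5, "effort": 5},
    {"issue": "alt text on old blog images", "impact": 2, "effort": 1},
]

def quadrant(item: dict) -> str:
    high_impact = item["impact"] >= 4
    low_effort = item["effort"] <= 2
    if high_impact and low_effort:
        return "Do First"
    if high_impact:
        return "Plan & Schedule"
    if low_effort:
        return "Nice to Have"
    return "Re-evaluate"

for item in sorted(issues, key=lambda i: (-i["impact"], i["effort"])):
    print(f"{quadrant(item):15} {item['issue']}")
```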
Quick, visible wins build momentum and executive goodwill; long-term projects lock in durable gains.
Quick wins (hours to days):

- De-duplicate `<title>` tags on paginated series
- Add `loading="lazy"` to below-the-fold images
- Remove the stray `Disallow: /wp-content/` rule from `robots.txt`

Long-term projects (weeks to months):

- Full HTTPS or responsive-design migrations
- Flattening a deep site architecture
- Reworking faceted navigation and parameter handling
Flag dependencies (design, dev ops, legal) so nothing stalls once the sprint begins.
A one-page executive summary keeps non-SEOs engaged: list each issue, its expected business impact, the owner, and the target date.
Use a color-coded Google Sheet or slide deck so everyone sees status at a glance—green (done), yellow (in progress), red (blocked). Close each sprint with a mini-report comparing new GSC and Core Web Vitals data against the baseline you captured on day one. Stakeholders will see progress, teams stay accountable, and your audit evolves from a static document into a living optimization engine.
Running a technical SEO audit isn’t a one-time chore—it’s preventative maintenance for your entire online business. Here’s the lightning recap:

1. Gather your tools and record a performance baseline.
2. Verify crawlability: robots.txt, meta robots, and a full crawl.
3. Confirm indexability: sitemaps, canonicals, and directives.
4. Tighten site architecture and internal links.
5. Hit the Core Web Vitals, mobile, HTTPS, and image benchmarks.
6. Polish on-page technical elements: status codes, schema, headings, hreflang.
7. Mine server logs to reclaim wasted crawl budget.
8. Prioritize fixes on an impact-vs-effort matrix and report progress.
Lock these eight steps into your workflow and you’ll catch problems before they sap rankings. Need to keep publishing fresh, optimized content while you focus on fixes? Automate that side of the equation with RankYak—so your newly tuned site stays fueled with search-ready articles, day after day.
Start today and generate your first article within 5 minutes.