Are YOU the Cobbler Whose Children Have No Shoes?

This week I decided that if I am going to do website audits publicly on YouTube for various sites (to get leads), and I am, I had better make sure MY agency website is properly optimized. I ran an AI assessment of my website in January 2025 using My Web Audit’s AI audit but never really acted on the results, so I ran another one this week. Both reports use the keyword “website redesign.”

These reports are looooong and detailed, so spotting the differences by hand is too time-consuming for me. So what did I do? Of course, I let AI do the heavy lifting. Here is the prompt I used.

I have attached 2 web audit reports for the same website. The one named JAN was generated in January 2025 (the before), the May2 report was today (the after). Please compare these 2 reports and give me a summary of how the scores have changed or stayed the same under each major topic. These are audits of MY agency website. I use My Web Audit (MWA) to generate these audits so I am looking for 2 things, really. Where do the audits differ in content (how has MWA’s AI audit changed since January) and on the main topics, where/how do the results differ or are unchanged? Use the same major topic headings as in the report. Please stick to exactly what is in the report – no additional insights.

I ran this prompt through Merlin.ai using Claude 4 and then again using native Claude 4. Surprisingly, the results were somewhat different. At the end of this article I will show you how they differed. For now, here are the native Claude 4 results of the comparison.

Overall Score Changes

January 2025 vs May 2025:

User Experience: 70% → 84% (+14 points)
Conversion Optimization: 61% → 61% (no change)
Mobile Responsiveness: 85% → 85% (no change)
SEO / Online Visibility: 44% → 52% (+8 points)
Performance & Security: 61% → 80% (+19 points)
Legal: 72% → 67% (-5 points)
Content / Copy: 86% → 91% (+5 points)
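If you want to double-check the point changes yourself rather than trust the AI's arithmetic, here is a minimal Python sketch. The scores are copied from the list above; the names `scores`, `describe_change`, and `summarize` are my own, made up for illustration, and are not part of My Web Audit or Claude.

```python
# Scores copied from the comparison above: (January 2025, May 2025), in percent.
scores = {
    "User Experience": (70, 84),
    "Conversion Optimization": (61, 61),
    "Mobile Responsiveness": (85, 85),
    "SEO / Online Visibility": (44, 52),
    "Performance & Security": (61, 80),
    "Legal": (72, 67),
    "Content / Copy": (86, 91),
}


def describe_change(before: int, after: int) -> str:
    """Return the change in percentage points as a signed label."""
    delta = after - before
    if delta == 0:
        return "no change"
    return f"{delta:+d} points"


def summarize(score_table: dict) -> list:
    """Build one summary line per topic, in the same layout as the report."""
    return [
        f"{topic}: {before}% → {after}% ({describe_change(before, after)})"
        for topic, (before, after) in score_table.items()
    ]


if __name__ == "__main__":
    for line in summarize(scores):
        print(line)
```

Running it reprints the seven lines above, so any mismatch between a score pair and its stated change would stand out right away.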

User Experience

Score improved from 70% to 84%

Key differences in content:

Professional Design: Remained “Good” but May report includes more specific details about visual cohesion, color scheme, and client testimonials
Hero Section: Improved from “Satisfactory/OK” to “Good” with more detailed analysis
Use of Images: Improved from “Satisfactory/OK” to “Good” with more positive assessment
Use of Video: Changed from “Needs Attention Now” to “Satisfactory/OK” – both note absence of video but May report is less critical

Conversion Optimization

Score remained at 61%

Content remained very similar across both reports with identical ratings for most elements. The analysis and recommendations are nearly identical.

Mobile Responsiveness

Score remained at 85%

January report included a “Hero Section / Above the Fold” subsection that was removed in the May report. Otherwise, content and ratings remained consistent.

SEO / Online Visibility

Score improved from 44% to 52%

Key differences:

Backlink Analysis: Domain authority improved from 2 to 3, referring domains increased from 23 to 66
Google Business Profile: May report shows more complete information including hours of operation
Meta (FB) Pixel Tracking: Both reports note it’s missing, but May report provides more detailed recommendations

Performance & Security

Score improved significantly from 61% to 80%

Major changes:

Google Page Speed: Mobile score improved from 49/100 to 56/100
GTmetrix scores: Performance unchanged at 97%, Structure improved from 91% to 92%
CDN: January report noted CDN was “not properly configured” while May report shows CDN as “Excellent” and properly configured

Legal

Score decreased from 72% to 67%

The ratings and content remained largely identical across both reports, with no significant changes in the analysis or recommendations.

Content / Copy

Score improved from 86% to 91%

Content and ratings remained very similar, with nearly identical analysis and recommendations across both reports.

MWA AI Audit Changes

The May 2025 audit shows:

More detailed and specific analysis in several sections
Enhanced technical details in backlink analysis
More comprehensive recommendations
Better integration of specific website elements and certifications
More detailed competitive analysis with specific metrics

The audit tool appears to have evolved to provide more granular analysis and specific technical details while maintaining the same overall structure and rating system.

To me, the most shocking thing in the analysis is that I have done almost nothing to improve this website since January, aside from replacing a “DIY self audit” quiz with a downloadable checklist.

But there is clearly much work to be done on this website if I want to be considered a Web Audit/Discovery expert. “People in glass houses shouldn’t throw stones” and as a cobbler, I really want my kids to have shoes.

So I’m going to go through each recommendation in the My Web Audit report and start improving the site, running new audits along the way to see how my changes improve my scores… which will ultimately improve experience and engagement for visitors. I plan to share this effort with you through this newsletter, our Facebook group, and possibly on YouTube.

How about you? Are you certain your agency website is “up to snuff” so you present yourself in the best way possible? Not sure? Because you are a reader of my newsletter, I will be happy to run a free AI Web Audit for the first 5 people who send an email to beth@wproadmaps.com with their URL, keyword, and the address of their physical location (for the Google Business Profile analysis).

Now, as promised, I asked native Claude 4 to compare its results to Merlin’s Claude 4, and here is what was returned. (Incidentally, I did the same thing in reverse, and Merlin’s Claude 4 gave an almost identical answer.)

Which is Better – Native Claude 4 or Merlin’s Claude 4?

Similarities

Score Changes: Both analyses correctly identified the same numerical changes:

User Experience: 70% → 84% (+14 points)
Conversion Optimization: 61% → 61% (no change)
Mobile Responsiveness: 85% → 85% (no change)
SEO / Online Visibility: 44% → 52% (+8 points)
Performance & Security: 61% → 80% (+19 points)
Legal: 72% → 67% (-5 points)
Content / Copy: 86% → 91% (+5 points)

Major Topic Structure: Both used the same seven major headings from the original reports.

Key Improvements Noted: Both highlighted the significant improvements in Performance & Security (+19 points) and User Experience (+14 points).

Key Differences

Depth and Specificity

My analysis provided much more granular details, citing specific metrics like domain authority changes (2→3), referring domains (23→66), mobile speed scores (49→56), and CDN status changes
Merlin’s analysis offered broader, more interpretive summaries without diving into specific technical metrics

Technical Detail vs. Interpretation

My analysis stuck strictly to what was documented in the reports, citing specific elements like “GTmetrix scores,” “First Contentful Paint,” and “Schema markup”
Merlin’s analysis made interpretive statements like “efforts made to address recommendations” and provided more contextual explanations

Section-by-Section Breakdown

My analysis provided detailed subsection analysis (e.g., noting specific rating changes from “Satisfactory/OK” to “Good”)
Merlin’s analysis focused on overall themes and general improvement areas

MWA Tool Evolution

My analysis specifically documented how the audit tool itself had evolved (more detailed competitive analysis, enhanced technical details)
Merlin’s analysis focused more on methodology consistency and content depth changes

Factual Accuracy

My analysis noted that Legal content “remained largely identical” across reports
Merlin’s analysis attributed the Legal score drop to “absence of a cookie policy link,” though both reports actually noted the same missing cookie policy

Overall: My analysis was more forensic and data-driven, while Merlin’s was more strategic and interpretive, offering business context that mine deliberately avoided per your instructions to “stick to exactly what is in the report.”

The lesson here? While Merlin is still useful for its flexibility in combining different models and letting me specify web or social sources (for opinions), I will be using native Claude more often for tasks like this.