
AI vs. Human Content: A Case Study

Patrick Danial, Chief Technology Officer & Co-Founder



Terakeet’s content strategy leaders undertook a testing project to assess the AI content options on the market and how they compare to human-driven content creation. 

The idea was to determine where AI hits its limit, where humans excel most, and how humans can leverage the power of AI for maximum impact.

We’re sharing the methods and comprehensive findings of the study as well as the insights collected from this endeavor. The results may surprise you. 

Note: This is not a review or an endorsement of any specific AI tools or platforms.

Background

The AI world has been quickly developing numerous tools aimed at content strategy and content production. These tools are marketed as content solutions with capabilities including researching, editing, outlining, and writing content. 

But given the complexity of producing content that meets search goals and authentically connects with human audiences, questions of efficacy remain unanswered.

In this case study we examined some of the top AI content platforms and tools to determine if current content generation applications are strong enough performers in demanding digital spaces. 

Is there one solution? Can you really outsource content production to AI? Or is AI just another helpful tool on the road to authentic consumer connections?

Let’s find out.

Our research method

We first determined the most vital content tasks and, within each task, identified the most relevant metrics and criteria. We then asked the AI tools and our human content specialists to complete these tasks to see how the outcomes would stack up.

Testing details

To keep the comparison fair, we gave every tool the same prompt and data and asked each to generate multiple content pieces. The tools also received the same specifications that our human writers typically receive.

The contenders

Terakeet content specialists (our writers) vs. artificial intelligence

We tested the following AI platforms:

  • Jasper — “Enterprise-grade AI tools to help marketing teams achieve both speed and performance.”
  • Typeface — “Generative AI application for enterprise content creation, empowers all businesses to create exceptional, on-brand content at supercharged speeds.”
  • Writesonic — “AI writer that creates SEO-friendly content for blogs, Facebook ads, Google ads, and Shopify for free.”
  • Copy.ai — “Copy.ai empowers you to automatically generate, optimize, and audit content at unmatched scale.”
  • ChatGPT (GPT 3.5 and GPT 4.0) — Two iterations of OpenAI’s flagship conversational generative AI platform.

Content tasks

We tested our specialists and the selected AI tools on the following tasks:

  1. Outline creation: Developing SEO-driven outlines for creating rankable content
  2. Full draft creation: Creating SEO-friendly, audience-aligned, and engaging content based on consumer insights
  3. Unique, factual, accessible content creation: Creating one-of-a-kind content, writing factual and accurate content, and meeting reading-level requirements

These three categories represent the most important aspects of a content strategy, especially when it comes to achieving content visibility and ranking, audience engagement, and strong consumer connections.

Content criteria and metrics

Here’s a deeper look at how we scored performance across the three content task categories. Ideally, your content strategy should consistently achieve most, if not all, of these criteria. Gaps will greatly degrade the results of your content efforts.

Outline Creation

  • Heading structure and content flow
  • Interlinking and user-focused CTAs
  • Keyword coverage and usage
  • Audience orientation
  • Appropriate sourcing
  • SERP feature targeting
  • SERP alignment

Full Draft Creation

  • Information accuracy and timeliness
  • Structure and content coverage
  • Sources and interlinks
  • Keyword and SERP feature targeting
  • Audience orientation
  • Engaging content
  • Obvious AI language

Quality Content Creation

  • Content uniqueness
  • Content accuracy
  • Content reading accessibility

Findings

Here are the conclusions we were able to draw from our testing. 

Across all three categories, our content specialists completed every task and met all the criteria and metrics. See the tables below for the full results.

Outline Creation

Here are the results of our outline creation testing:

[Table image: AI content creation test results]

Conclusion: AI-generated outlines

  • Overall, all of the AI tools created barebones outlines. Some outputs included very high-level writer instructions but lacked the level of detail necessary to guide the creation of an SEO-rich piece of content.
  • AI tools were seemingly not capable of assessing existing SERPs. Outline structure was largely dictated by the data supplied by the user.
  • AI tools failed to integrate keywords naturally. It was common for keywords supplied in the prompt to be regurgitated into section headers. 
  • AI tools were not capable of recommending interlinks.
  • AI tools cited sources inconsistently. Some citations pointed to dead links, and others appeared to be hallucinated.

Full Draft Creation

Here are the results of our full draft creation testing:

[Table image: AI full article test results]

Conclusion: AI-generated drafts

  • AI tools consistently had structural issues and topic coverage gaps throughout generated full drafts.
  • Keyword usage was an issue: the supplied keywords dictated the structure of articles, which created redundancy.
  • Writesonic’s Article Writer 5.0 offered the functionality to assess articles in the SERPs.
  • The AI tools made inaccurate claims while generating an article on a timely financial wellness topic.
  • AI tools struggled to seamlessly integrate audience insights and brand voice/style preferences. Outputs often read robotically and in a way that would limit reader engagement. 
  • Content generated by AI was formulaic, had tone misalignment, used colloquialisms and dramatic language, and was easily identifiable as AI.

Quality Content Creation

Here are the results of our content quality testing:

[Table image: AI content quality test results]

Conclusion: AI falls short in quality

  • AI produced factual inaccuracies that limited credibility, were easily detectable, and added an editing burden on the user.
  • AI tools relied on absolute claims and broad generalizations, both of which made for dubious statements.
  • AI tools produced content that resembled existing web content and used many of the same phrases.
  • AI content was hard to read because it tended toward long sentences that were technically and grammatically correct but dense.
  • Factual content tended to be written at too high a reading level (collegiate and above).
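For readers who want to check reading level on their own drafts, here is a minimal sketch. It assumes the Flesch-Kincaid grade formula, which is one common readability measure and not necessarily the one used in this study. The function names and the syllable-counting heuristic are our own illustration, so treat the scores as rough estimates.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a dictionary lookup):
    count vowel groups and drop a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Estimate the U.S. school grade level needed to read `text`."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    # Standard Flesch-Kincaid grade-level formula.
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


simple = "The cat sat. The dog ran. We had fun."
dense = (
    "Notwithstanding considerable macroeconomic uncertainty, institutional "
    "investors systematically reallocated diversified portfolios toward "
    "inflation-protected instruments."
)
print(round(flesch_kincaid_grade(simple), 1))
print(round(flesch_kincaid_grade(dense), 1))
```

A score of 13 or higher roughly corresponds to college-level text. Short sentences and shorter words both pull the score down, which is why long, formally correct sentences can still fail an accessibility requirement.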

Why these results matter

The goal of content strategy is to create unique, engaging, human-centered content that actually connects with human readers. At this stage, it appears that AI-generated content simply is not the market-leading force one might expect. 

Across each of our tests, AI was unable to hit the metrics and specifications necessary for a successful content strategy. It also failed less objective measures of what we’d call “good” writing.

Some considerations:

  • There’s already a seemingly endless supply of web content available and yours has to offer unique value, contain interesting perspectives, and offer human insights to stand out. AI-led content production adds to the noise and will not reach people. Google is reportedly cracking down on this as well.
  • AI can produce content at a massive scale but can never be truly unique, as it relies on content that already exists. It can’t create the novel ideas and angles a human can.
  • Providing solutions with exceptionally clear and concise content is a baseline requirement for human connection. AI struggles with these elements. Overly complicated jargon and high reading level are two barriers here.
  • Content that contains factual errors, cites dead links, or relies on fabricated sources is a surefire way to lose your audience. At this time, AI output has all of these problems.

Additional findings: Rankability and search

We also tested AI on a number of SEO and ranking metrics plus Google’s Helpful Content Update to see how the tools fared. The results weren’t stellar.

[Table image: AI Google quality test results]

Why Google’s guidelines matter

Content should always serve a number of purposes like connecting with humans, solving problems, supporting other content, ranking in search engines, etc. Each is important, but because Google is an exceedingly important channel, violating Google’s guidelines will hinder consumer connections. 

AI tends to create material that runs against Google’s position on what content is valuable to users. If you’re using AI to generate full content, you risk failing to connect or be seen at best, and being delisted at worst.

In the long term, investing heavily in AI-generated content is not a strategy. The AI space shifts constantly and one innovation, policy change, or even legislative action could devalue large libraries of AI-generated material. 

Achieving longer-term performance, sustainable growth, or increased market share requires content that connects and aligns with mega-channels like Google.

Takeaways: AI is a tool, not a solution

AI is not the end-all solution to creating content that authentically connects with human audiences. It’s not a method for search visibility or shareability, either. But AI can be an efficient and capable tool that marketers can leverage to supercharge certain tasks and content creation in more narrow ways.

Our testing also revealed some of the key characteristics of human content and intelligence. Humans communicate in a personal tone that engages us. Humans can write for different audiences and make a topic accessible to each, even if it is complex.

Humans also have the unique ability to notice and communicate intriguing insights and observations. These outcomes are not possible with current technology.

How humans and AI can work together

The human element

  • Humans are highly capable of churning out interesting, unique content.
  • Humans are relatability machines, able to create content that resonates and connects with that certain human element.
  • Humans can deeply understand audiences and respond to unique needs. 
  • When it comes to factual accuracy and oversight, humans are much more adept than AI.
  • Human-led development of SEO strategy and incorporation into content is vital at this stage.

The AI assistant

  • AI tools can be trained to build customized outputs.
  • AI can assist in source generation for writers to use in their content.
  • AI can provide useful support in article research and topic ideation stages.
  • AI tools can help writers “speak” like experts and discover industry jargon. 
  • AI tools can inspect content for style preferences, readability, tone, grammar, and other tasks, though full editing is currently not viable.
  • AI can inspect documents for absolute statements and factual claims to aid in fact-checking.
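The last point, scanning for absolute statements, can be roughed out even without an AI tool. The sketch below is our own illustration: the word list and the function name are hypothetical, and a simple keyword match will produce false positives. It only shows the kind of first pass a fact-checker might run before reviewing the flagged sentences by hand.

```python
import re

# Hypothetical starter list of absolute terms; tune it for your own content.
ABSOLUTE_TERMS = (
    "always", "never", "all", "none", "every", "everyone",
    "no one", "guaranteed", "impossible", "only",
)

_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in ABSOLUTE_TERMS) + r")\b",
    re.IGNORECASE,
)


def flag_absolute_statements(text: str) -> list[tuple[str, list[str]]]:
    """Return (sentence, matched terms) for each sentence that contains
    at least one absolute term."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    flagged = []
    for sentence in sentences:
        hits = [match.group(0).lower() for match in _PATTERN.finditer(sentence)]
        if hits:
            flagged.append((sentence, hits))
    return flagged


draft = (
    "Index funds always beat savings accounts. "
    "Rates may change next year. "
    "No one loses money with bonds."
)
for sentence, hits in flag_absolute_statements(draft):
    print(hits, "->", sentence)
```

A human still has to judge each flagged sentence, since the match says nothing about whether the claim is actually true.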

