Angry Panda

Panda 4.0 rolled out over the past several days, and it seems to be a pretty significant update. While some Panda updates have been little more than a ‘data refresh,’ the ripples are much bigger this time around (~7.5% of English queries).

Panda is designed to keep websites with low-quality content from ranking near the top of Google’s results. Typically, Panda targets sites with thin, low-quality or duplicate content.

In theory, this is all great. Nobody wants to read scraped content on a third-party site (unless it’s hosted on Google.com, of course). Google’s longer-term goals definitely include being able to understand which pages have high-quality content and which do not.

Trying to do this algorithmically, however, is extremely difficult.

Is all thin content low quality? Is a product page with 50 words less deserving of a ranking than one with 500 words? Does great content even exist, or is it all subjective?

If you work in SEO, it’s easy to get caught up in the SEO echo chamber and cheer on Google updates, as long as you’re following the most up-to-date version of Google’s Webmaster Guidelines. “Don’t publish thin content” is common sense to an SEO in 2014. If Google kills sites with low-quality content, it should help SEOs.

The web is much, much bigger than the SEO bubble though.

It’s easy to forget that 99% of the web consists of site owners who aren’t SEO experts, and these types of penalties have a big impact on them. The average business owner does not know that their product or service pages with 75 words are a ticking time bomb. They’ve probably never even heard of Google’s Webmaster Guidelines.

MetaFilter is a good example of a site run by smart, well-meaning people who have had to cut jobs because of Panda updates. MetaFilter is not spam or a low-quality site. They’ve built a great community and have been around since 1999. Their content is not low quality by any standard.

Tons of copy doesn’t make a web page interesting or great. This is where some applications of Panda fail. Long-form content is cool and all, but there’s nothing wrong with short-form content.

As a user, I’d rather read a crisp 150-word product description than a 500-word exposé full of meaningless marketing copy. There are plenty of times when thin content is preferable to users.

Google may hope that Panda updates bring it a step closer to smarter search, one that displays a better understanding of the web, but I don’t see it. A meaningful Google update is one that is able to better comprehend the context and value of the copy on a web page. For the most part, all Panda does is mandate higher word counts and nuke a few scraper sites here and there.

While I’m all for getting true spam, duplicate content and spun content out of the SERPs, killing off traffic to legitimate sites because they don’t focus as much on developing content as Google would prefer is not a good thing for anybody.

This is part of a much larger issue with Google: they control 40% of the Internet’s traffic and can distribute it however they please. Updates to the algorithm aren’t inherently bad, but ones where Google takes more control over the greater web are worrisome.

I loved this line on HN the other day: “We are getting a Google-shaped web rather than a web-shaped Google.” Instead of Google letting people build and market their websites and then figuring out how to rank them, Google is mandating that sites be constructed and marketed in a certain way, or else they’ll never rank in the first place.

The starting point for most marketing discussions isn’t “What’s the best way we can communicate to people through our website?” It is “Well, we know we need a ton of pages on our site that are at least 500 words. Oh, and we better add in Google+ markup too.”

This stifles innovation and gives control to Google instead of website owners.

Google is the heavy-handed editor of the Internet and Panda is their red pen.