Cheating Penguin is Easy

TL;DR: Google’s latest Penguin update doesn’t seem to detect unnatural link surges or anchor text abuse from expired domain link networks.

After a Venice-like update in Australia, we had a rather uneventful week. Google started warming things up again on May 21, hitting an orange alert on Algoroo on May 22. The next day we saw clean and unambiguous signs of the pre-announced Penguin v2.0 update. What’s interesting is that the update doesn’t seem to be technically finished; take a look at this graph:

[Graph: daily ranking fluctuations on Algoroo since the Penguin 2.0 rollout]

Since the Penguin update Google has been tweaking results daily and in some cases several times per day.

What’s going on?

Penguin is just an algorithm. It’s been designed with the aim of wiping out the next layer of webspam, and Google’s Matt Cutts promised a smarter, more thorough impact with the second version of the Penguin algorithm. Google’s team is surely making adjustments in the hope of improving the accuracy of webspam detection and the way they act on it. This is normal.

[blockquote type=”blockquote_line” align=”right”]If you see a spam site that is still ranking after the latest Penguin webspam algorithm, please tell us more about it. Google[/blockquote]While some webmasters are experiencing traffic growth and ranking increases, it’s evident that Penguin has targeted a very specific type of webspam and is unable to detect and act with confidence on certain SEO tricks, including expired domain link networks, which seem to work really well at the moment. Speaking of confidence, Google’s search quality team is now openly crowdsourcing Penguin fails ((https://docs.google.com/forms/d/1rhRenrd16MDSgAOwnMVx9KQbp–0JoY9vKiJdIcMe44/viewform)). The “Penguin Spam Report” has been debated in the industry as something webmasters could abuse, but I still believe it’s a good move on Google’s part.

Post-Penguin Success

We reviewed several cases of domains suddenly surging in rankings despite having none of the following:

  1. Quality content and authorship
  2. Great user experience
  3. Organic links
  4. Social signals (including Google+ activity)

None of the above seems to truly matter in many of the reviewed search results.

What works instead is rather simple, if not old-school:

  1. Exact match anchor text
  2. Link velocity
  3. Expired domains

Here’s a link acquisition graph of a post-Penguin winning domain:

[Graphs: link velocity and referring domains for the winning domain]

Yes, links can move up in this pattern naturally, and that would be a clear sign of a website’s popularity. In this case, however, we’re looking at a near-100% exact-match anchor text link profile from an expired domain link network.
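To make the pattern concrete, here’s a minimal sketch in Python of the two checks discussed above: anchor-text concentration and monthly growth in new referring domains. It assumes a hypothetical backlink export of (anchor text, referring domain, first-seen date); the domains and anchors below are invented purely for illustration, not taken from the case reviewed here.

```python
from collections import Counter
from datetime import date

# Hypothetical backlink export: (anchor text, referring domain, first seen).
# These records are made up for illustration only.
backlinks = [
    ("buy blue widgets", "expired-blog-one.com", date(2013, 5, 24)),
    ("buy blue widgets", "expired-blog-two.org", date(2013, 5, 25)),
    ("buy blue widgets", "expired-blog-three.net", date(2013, 5, 27)),
    ("Example Brand", "news-site.com", date(2012, 11, 20)),
]

def anchor_concentration(links):
    """Share of all links carried by the single most common anchor text."""
    counts = Counter(anchor.lower() for anchor, _, _ in links)
    return counts.most_common(1)[0][1] / len(links)

def new_domains_per_month(links):
    """Count of newly seen referring domains per (year, month)."""
    first_seen = {}
    for _, domain, seen in links:
        if domain not in first_seen or seen < first_seen[domain]:
            first_seen[domain] = seen
    per_month = Counter((d.year, d.month) for d in first_seen.values())
    return dict(sorted(per_month.items()))

top_share = anchor_concentration(backlinks)
velocity = new_domains_per_month(backlinks)

print(f"Top anchor share: {top_share:.0%}")
print(f"New referring domains by month: {velocity}")

# Crude heuristic: near-total anchor repetition plus a sudden burst of new
# referring domains is the fingerprint described in this article.
if top_share > 0.8:
    print("Anchor profile looks heavily over-optimised.")
```

Against a profile like the one graphed above, you would expect a top-anchor share approaching 100% and a burst of new referring domains concentrated in a month or two, the opposite of what an organically popular site tends to look like.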

The example below appears to be a failed activist-type campaign website which eventually expired, though it no doubt collected links and social shares during its lifetime:

[Wayback Machine screenshot of the expired domain]

The domain operated for a few years until it expired and dropped in 2013, at which point it was turned into a WordPress blog and loaded with the type of content Panda should have taken care of. In my opinion Panda did a fairly good job at detecting low-quality content, so the question is: how is it that pages like this still pass link equity to the websites they link to?

It won’t be easy to prove this with an experiment, and unless we hear Google say otherwise, I’m going to say that Penguin and Panda don’t talk to each other much. This makes no sense, though: if you’re in charge of Google’s search quality, you’d surely use all available resources to make search better, and that involves allowing Panda-Penguin collaboration.

Pages which fit in the Panda-like pattern should not pass link equity. Full stop.

[styledbox type=”warning” ]The purpose of this article is to flag flaws in Google’s search quality algorithm and raise awareness about bad SEO practices. Please refrain from using the tactics described in this article on your own website.[/styledbox]

Unanswered Questions

  1. How is it that with this round of updates we’re seeing Penguin v1.0 targets get away? (Penguin v2.0 has a better “engine” and acts on a granular level, while Penguin v1.0 acted on the home page level only ((http://youtu.be/nNbWw2OUUAc?t=1m29s)).)
  2. Why is anchor text abuse still working so well?
  3. Why is Google unable to detect/act on footer/sitewide links with exact-match anchor text?
  4. Why is it so hard to detect low-quality expired domains, even though Penguin has been refined for years?
  5. Why is inorganic link growth allowed to act as a strong quality/QDF signal? Are the two processes separate?

Reader Reactions

Another interesting observation made by our readers was that sponsored/affiliate links seem to fly under the radar as well.



11 thoughts on “Cheating Penguin is Easy”

  1. Chris Ward says:

    Why is it so hard to detect low-quality expired domains, even though Penguin has been refined for years?
    Google is not as smart as we all think. Remember the webspam team training document (the webspam team are told to check interlinking, footprint and content).
    If you use the Wayback Machine to get the old content from the site and don’t interlink the network, it’s not easy to work out.
    When link building is getting so expensive, you can produce a relevant link that only links out to one or two sites for $100. What we have here is the new SEO “crack”.
    What do we know about “crack” though? Well, it works, it will get you high (rankings). However, we also know that long-term use is pretty bad. It’s addictive, you need to keep doing more, and eventually all you do is “crack” and neglect other legitimate tactics. We have seen this before, Dan.
    Long-term “crack” usage leads to users (SEO agencies) winding up on the street, jobless and doing freelance on-sites at truck stops.

  2. Peter Watson says:

    I can confirm that I have been seeing aggressive daily fluctuations in my niche for a few weeks now. It’s very unstable, even post-Penguin.
    It looks as though big brands are the winners again. What confuses me is that most of the big brands interlink with their other authority sites (e.g. seek.com.au linking to seekcommercial.com.au). 95% of the links pointing to seekcommercial.com.au are from seek.com.au, and as a result seekcommercial.com.au is totally dominating the SERPs!
    I’m also seeing exact match domains ranking very well, with a poor link profile and weak on-page factors. Nothing seems to have changed in this area either.
    Clustering is more evident now than ever, which is funny because ‘cluster’ results got worse after they ‘fixed’ the issue in my opinion.
    I think exact match anchors are still OK, but obviously they need to be on trusted sites. The problems start when we see large numbers of EMAs on spammy, low-quality sites.
    I disagree that Penguin 1 only targeted the home page. I have first-hand experience dealing with sites that were smashed on the 24th of April 2012, and it was clearly obvious that the deeper pages also suffered the same fate as the home page. I’m sure some sites only suffered home page ranking losses, but there were thousands of sites that were hit site-wide or on multiple levels.

  3. Peter Watson says:

    I had not actually seen it, but I heard lots of people talking about it.
    All I can do is gauge Penguin 1.0 on my own experiences and those of other webmasters I know, and we all experienced home page and deeper-page penalties as a result of the first Penguin.
    Some of those sites recovered after the recent Penguin.

  4. Phill says:

    Check out this study from Portent: http://www.portent.com/images/2013/03/google-declining-spam-tolerance.pdf
    I honestly don’t think it will last long. It simply can’t.
    Also, think about the constant of time in the equation. If a brand new site launches and averages 6.3 new links per day over the next 4 years, that is normal. If you take a big brand that has had 1% non-brand links for the past 10 years and then all of a sudden gets 3% non-brand, that could algorithmically set some alarms off. What I’m saying is that it is all relative.

  5. Blueflux SEO says:

    Don’t forget that Google rates a site very highly when (as long as it’s designed well) it is linked properly for a better visitor experience. Keeping subjects separated across different pages, yet connected in an appropriate and relevant manner, is a very big deal as far as quality and Google are concerned.
    With this in mind, having the site linked to others with the same idea in mind is the most likely reason for their performance.
    Martin H

  6. James Norquay says:

    My guess is there will be a further Penguin rollout to remove additional junk from the index. Sure, many expired domain networks are ranking sites; I also see more .ru link networks ranking sites (which Google said they wiped out). Google should just do manual audits on the top commercial keywords first, which are targeted heavily by affiliates. I am not too concerned; it is only a matter of time before another shake-up in the index, as too many commercial keywords are showing junk.

  7. Deyan SEO says:

    A matter of time, yes, but during that time (3, 6, 12 months) we’re talking millions of dollars in lost revenue for white-hat sites and the same in gains for cheaters.

  8. James says:

    Brilliant analogy

  9. Al says:

    Peter,
    The comment from Matt Cutts about Penguin 1.0 dealing with the home page wasn’t that the penalty only affected the home page – the penalty will affect the entire website. I believe you’ll find his comment was pointing out that the algorithm only assessed the links going into the home page to determine whether or not to actually penalise a website with the Penguin algorithm.
    Taking those comments from Matt at face value, hypothetically under Penguin 1.0 it would be possible to have a clean home page link profile but go hard building links to deep URLs and not get penalised. With the release of Penguin 2.0, Matt has now said that they are assessing links going into all pages of a site – as such, building low-quality or over-optimised links into any page of your site now puts your entire site at risk.
    Al.

  10. pmc says:

    This has very little to do with ‘expired domains’.
    You could have all the same links from websites that were never expired (e.g. paid-for links, or webmasters you simply swayed to link to you).
    The issue is links from supposedly low-quality content.