Last week I presented a method for search result hijacking. The story received a lot of coverage in the SEO community, perhaps because Rand Fishkin’s authority pages were also compromised as part of our experiment. One thing I did not elaborate on in the original article was the peculiar way Google Webmaster Tools handles document canonicalisation.
TL;DR
You can see somebody else’s links in your Google Webmaster Tools as if you were an authorised user of that site. The process involves creating an identical copy of a document, and the results become visible in about two weeks.
In the following screenshot we see what Google calls “an intermediate” link: an old link points to our old domain, which 301-redirects to our new domain. Nothing unusual about this.
There are other instances of the “intermediate link” in Google Webmaster Tools. One of them relates to the document canonicalisation process described in the paper “Large-scale Incremental Processing Using Distributed Transactions and Notifications” by Daniel Peng and Frank Dabek. This is exactly the same process I used in the result hijack article, and the most interesting thing is that it works in reverse! (I’ll get to that later.)
Here’s an example of one such case, some of you may remember this website from my hijacking experiment:
As you can see, the “intermediate link” notification suggests that my page above receives a link from Rob’s website, but the thing is, it doesn’t. So what’s going on?
Well, the page I created is a replica of the original page on htt
I am seeing the same links the owner of that site would see in their Google Webmaster Tools.
Here’s the interesting part: it works in reverse! You don’t even have to hijack the result for this to work; you can see the results with the ‘loser’ URL. If you create a duplicate of any page on the web with lower PageRank (not very difficult to do, is it?), you will be able to see its links in your own Google Webmaster Tools.
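My mental model of this behaviour, sketched in Python. This is purely my own illustration of the clustering described in the Percolator paper, not Google’s implementation; the data structure and function name are invented for the example. Duplicates are grouped into a cluster, the highest-PageRank copy is elected canonical, and link data appears to be pooled across every member of the cluster:

```python
def cluster_links(cluster):
    """cluster maps URL -> {"pagerank": float, "links": set of referring domains}."""
    # The highest-PageRank duplicate is elected canonical...
    canonical = max(cluster, key=lambda url: cluster[url]["pagerank"])
    # ...but link data is pooled across the whole cluster, so even a
    # low-PageRank "loser" copy sees the links earned by the original.
    pooled = set().union(*(doc["links"] for doc in cluster.values()))
    return canonical, pooled
```

Under this model, the ‘loser’ URL losing the canonical election is irrelevant: membership in the cluster is all it takes to see the pooled link report.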
To test this concept I copied a PDF from another site and simply got it indexed. In about ten days I saw all of its backlinks in Webmaster Tools. Here they are:
| Domain | Links |
| --- | --- |
| doc-txt.com | 38 |
| documbase.com | 19 |
| pdfqueen.com | 7 |
| cmu.edu | 6 |
| msra.cn | 6 |
| quora.com | 5 |
| 130.203.0.133 | 5 |
| google.com | 4 |
| berbatek.com | 3 |
| podpdf.com | 3 |
| blogspot.com | 3 |
| seobythesea.com | 3 |
| writingseo.com | 3 |
| newyorklastminutetravel.com | 2 |
| automotivedigitalmarketing.com | 2 |
| 123people.com | 2 |
| blumenthals.com | 2 |
| psu.edu | 1 |
| pitt.edu | 1 |
| 65.54.0.113 | 1 |
| 52opencourse.com | 1 |
| chatmeter.com | 1 |
| journalogy.net | 1 |
| vebidoo.de | 1 |
| c4ads.org | 1 |
| ryanmcd.com | 1 |
| ebookbrowse.com | 1 |
| uic.edu | 1 |
| rightnow.com | 1 |
| osti.gov | 1 |
| informationweek.com | 1 |
| christopherpotts.net | 1 |
| sdsu.edu | 1 |
| keywordspy.com | 1 |
| psugeo.org | 1 |
| tagwalk.com | 1 |
| delib.net | 1 |
Now I can take any page or document from a competitor, place it on a domain of my choice, have it indexed, and in a few weeks I’m able to see all of their backlinks within Google Webmaster Tools.
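The replication step itself is trivial. Here is a minimal sketch in Python; the URL and filename are placeholders, not the documents used in the experiment. The saved copy then goes on a domain you control and gets submitted for indexing as usual:

```python
import urllib.request

def mirror_document(url, out_path):
    """Fetch the target document and save a byte-identical local copy."""
    with urllib.request.urlopen(url) as response:
        data = response.read()
    with open(out_path, "wb") as f:
        f.write(data)  # keep the bytes identical so Google clusters
                       # the copy with the original during canonicalisation
    return len(data)
```

The copy has to stay identical; any meaningful change risks Google treating it as a distinct document rather than a duplicate of the original.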
It took me exactly 14 days to see the link data of Rob’s website.
Whoa indeed.
Now that’s a find. I wonder whether Google will be able to patch it up quickly, or whether it’s too deeply rooted in their canonicalization process for a quick fix?
I would imagine this would take some time before it’s sanitised.
I’m not at all a technical SEO, or technical anything for that matter, so I’m always amazed (and envious) when I read posts like this. Amazing stuff, and thank you for sharing it with the SEO community.
Someday I’d love to hear the story about how you made this discovery.
I just saw some weird links which weren’t mine. I didn’t stop at that and continued investigating. One thing led to another, and once I was finished playing I published my findings.
How is this any better than putting a URL into Majestic or OSE?
GWT is hardly known for the accuracy of its backlink data.
Hi Dan,
What you found is just great and many SEOs just like me really appreciate the time and efforts you put in this. Just wanted to thank you for this and ask a question.
When you created a duplicate of marketbizz.nl and submitted it to Webmaster Tools, did the guys there receive any notification/email/alert/warning?
Regards…
OK, it’s interesting, but: is it worth spending your time on this if you have Ahrefs, MajesticSEO, Open Site Explorer, LinkDiagnosis and other tools?
Top stuff again.. thanks for sharing Dan.
Simple. I found links which were not in OSE / Majestic.
Thanks Keith. No notification.
Just want to elaborate on what Dan said: OSE has a much smaller crawl than Google, so it won’t find a lot of the links that are being counted. Besides, you now know for a fact every link that Google is counting 🙂
Nice work Dan. I’ve had to explain (multiple times) to different employers that Google does not publicly display backlinks for any website, even your own; you’d have to be a verified user and look inside GWT. This hack certainly circumvents that. Going to put it to the test. Thanks!
Hi Dan,
Another epic post. Do you foresee this tactic being used for evil?
Whoah indeed! Great discovery Dan, thanks for sharing. Let’s see how long Google lets it last!
Hmm this looks a little bit evil… 😉
Nice little hack you found there Dan!
Awesome post, thank you dan
Huge!!!
At the risk of looking like comment spam, thanks for the post, this is awesome.
Interesting! Let’s say I try to peek at another website’s inbound links using my own site with lower PR than the original, and it takes about 2 weeks before I see the inbound links. If the original owner uses Copyscape or some kind of alert, wouldn’t they be able to find the copy before I get the data?
That is scary indeed… and getting backlink data from GWT for any of your competitors like this is going to create quite a bit of fuss. I wonder how Google is going to handle it. I should try this out on a demo site soon. Thanks for sharing this finding with us.
Love the kind of research you are doing, keep up the good work.
Hmm, I wonder… if Google is reporting these as links to your site (even without an actual link), does that mean they’re passing PR to your page? The question seems to be: is it a false report, or are they exposing how part of their algorithm works? Interesting.
I am going to have to try this. My question would be: if Google finds out you are doing this, will there be penalties for deliberate manipulation? I guess it wouldn’t matter if they take down the duplicate page. 🙂
I would also like to know (if possible): when you say duplicate, do you mean just the content? Because that is a cinch, and if this works well, then holy crap I have some work to do before this leak in the dam is plugged.
Very interesting, it should be tested
This is super awesome hahaha!!! Wohooo
Dan, I posted about a strange situation in an SEOMoz question: http://www.seomoz.org/q/strange-situation-started-over-with-a-new-site-wmt-showing-the-links-that-previously-pointed-to-old-site
I think what you saw is happening to my client’s site. The short version is that we scrapped a Penguin-hit site and started fresh. We removed the old site from the index and put up a new site with the same content apart from the home page. There are no redirects from the old to the new. However, WMT is now picking up all of the links from the old site and attributing them to the new one. I’m worried that the new site will be affected when Penguin rolls out again.
I am wondering if what is going on in my mind is also going on in yours. Does this mean we can increase the PageRank of our site with this technique? For example, if you copy some good-quality posts with good backlinks onto your domain, will the domain authority and PageRank of the domain also increase?
That’s my question too! And if the links to the original page are manipulative (i.e. Penguin-like), could they affect your domain?
Yes, none of those tools would ever display all the links one would have in their WMT account.
Somebody tried it on us 🙂
Heya, I tried it and it doesn’t work for me (I waited 2 weeks, though). Is it not working anymore?
Hey man, very interesting, but I have a doubt about something. I searched a lot but didn’t get better results. My question is: if we add our URL as a parameter, like ?url=http://www, is this counted as a backlink?
Example: http://seocheki.net/site-check.php?u=http://www.maccopacco.com
Does this count as a backlink for maccopacco? Please help.
But WMT does not have this info either
Is this hack still working today, or did they find some way around it???
Does this still work? I think 10 months is enough time for Google to solve this problem ^^
Hi Dan,
This is very interesting. I never thought this could work, since I’m not really aware of my competitors’ SEO performance. Anyway, this is really helpful for tracking links and making some assessments of my own site. Thanks a lot for sharing this informative article.