Our previous SEO experiments hinted that Google may be (purposely?) shuffling results out of the expected pattern. This time we investigate what happens on a larger scale, using 100 images for our test.
The Experiment
We started by creating a row of numbers in Excel (1-100) and used conditional formatting to colour all the cells. This was then exported as a PDF and chopped up in Photoshop. Each file was named after its number (e.g. 64 = 64.png) and the files were arranged in the correct sequence on the experiment page. We told Google about the page (social sharing) and waited for the results. And then…
That’s right. What you see above is what Google displays roughly 30 minutes after our test page went live.
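For anyone who wants to repeat the setup, the page layout described above can be sketched in a few lines of Python. This is only an illustration, not the script we used: the function name `build_test_page`, the page title and the bare `<img>` markup are assumptions, and the only detail taken from the experiment is the naming scheme (64 = 64.png) and the ascending order.

```python
from html import escape


def build_test_page(count=100, title="Image order test"):
    """Return HTML that lists 1.png .. count.png in ascending order.

    Hypothetical helper: the markup is a guess at a minimal test page,
    not a copy of the original experiment page.
    """
    # One <img> tag per numbered file, in strict numeric order.
    tags = [f'<img src="{n}.png" alt="{n}">' for n in range(1, count + 1)]
    body = "\n".join(tags)
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{escape(title)}</title></head>\n"
        f"<body>\n{body}\n</body></html>\n"
    )


if __name__ == "__main__":
    print(build_test_page())
```

Keeping the markup this plain means the only ordering signal on the page is the sequence of the tags themselves.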
Result Analysis
We noticed several interesting things, but the main discovery is that our images seem to dance randomly in Google Image results rather than following the gradient we expected, with image numbers incrementing from 1 to 100. Take a look at the original page and the way the images are ordered. One would expect this to be the order of indexation and, likewise, of display in search results.
Something like this perhaps:
So what’s going on? Well, there are at least three possibilities:
- The crawler sent the page to the index and the media was absorbed in a distributed way by a number of separate machines and stored in different physical locations. This does not seem particularly useful, since Google tries to keep related items close together in its index and cache to minimise fragmentation.
- File size fluctuations could somehow influence ordering, though we see very little practical reason for this given such small deviations (literally a few bytes).
- Google may be purposely adding a randomising (RND) function to mix things up a bit. This would keep results fresher and allow random discovery. It would also hinder any systematic attempts at reverse-engineering Google’s ranking algorithm.
Main highlight: Google seems to (whether purposely or not) randomise the order of otherwise neatly ordered images on a page.
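One way to put a number on "randomised" is a rank correlation between the order on the page and the order in the results. The sketch below computes Kendall's tau in plain Python. It is not part of our original analysis, the function name `kendall_tau` is our own, and it assumes each image number appears at most once in the result list.

```python
from itertools import combinations


def kendall_tau(observed):
    """Kendall rank correlation between `observed` and ascending order.

    `observed` is the list of image numbers in the order the search
    results show them. Returns 1.0 for a perfectly ascending list,
    -1.0 for a perfectly descending one, and a value near 0 when the
    result order has no relation to the page order.
    """
    n = len(observed)
    if n < 2:
        raise ValueError("need at least two results")
    concordant = discordant = 0
    # combinations() keeps list order, so `a` always appears before `b`.
    for a, b in combinations(observed, 2):
        if a < b:
            concordant += 1
        elif a > b:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

Feeding it the image numbers read off a results page, left to right and top to bottom, gives a single figure that can be compared between domains or between days.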
Secondary findings include:
- The page enters the cache within 60 seconds, assisted by social sharing.
- Two minutes later, image search displays nothing.
- 30 minutes later, images are in the results, but they refer to the time of indexation (back when they were not yet visible).
- Small variations in indexation time (seconds) do not seem to affect the order of images in Google search.
- Only 27 images were indexed, and at random. Why? Perhaps due to PageRank allowance or some other limiting factor.
- Image colour filtering did not work 2 hours after the page went up, indicating that colour processing may be a separate process.
Two highlights: Google image search really is faster since the recent updates and can index new images within minutes. However, due to an unknown bottleneck, they are unlikely to be displayed in search results for at least 15-30 minutes.
Other Observations
The following four images were classified as photos:
Not directly related to the test, but we also noticed that widescreen mode brought up a set of images slightly smaller than the rest (see top two rows).
Follow-Up Experiment
We performed the same test on a different domain and got the following 26 results:
For comparison, here are the original 27 results:
Here’s the table examining the correlation between file size and image results:
Update Journal:
5/5/2012: 30 images now display in results. Colours still not filtering.
Thanks a lot for this interesting experiment. I don’t believe in randomness (when thinking about Google). But I have no idea what could explain the results.
I will do something similar to run a second test on another website. Of course I will mention your article. Best wishes, Martin
Thanks Martin, I look forward to hearing about your results.
Very interesting. Hmmmm……..
FYI: here are my results: https://www.google.de/search?q=site:tagseoblog.de/google-bilder-test&tbm=isch
Interesting: when I search for “blue images” it shows many more blue images than in the “normal search”.
Very thorough analysis indeed. I’ll have this article translated into English 🙂
Hi Deyan,
here are my results. http://www.tagseoblog.de/google-bilder-experiment-indexierung-und-reihenfolge
Sorry for writing in German. Perhaps I will find the time to translate it.
My analysis in short:
The order of the results is random, OK. But what we can see with this experiment is that the “site:url” query is buggy. It doesn’t show exactly the indexed images. And the site query itself has no ranking factors.
Check this: “site:research.deyandarketing.com/google-image-test/ 04” – it shows 35 results (to me).
I suppose the crawl rate on my blog is higher than on yours: in my case all 100 images were crawled within 2 minutes (ten minutes after uploading). Perhaps you can check your logfiles for this -> filter for “googlebot-image”, and then for “uploads/2012/05/” -> you should see a list of all crawled images ordered by crawling time.
I have written a short text on my testing page. If I search for the exact phrase, Google shows some images (not all) in nearly the exact order (beginning with 01.png…)
In short: the site:-result-list is buggy.
Thanks for the testing-idea. And best wishes
Martin
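The logfile check Martin suggests can be sketched in Python. This is a rough sketch under assumptions: it expects the common "combined" access-log format used by Apache and Nginx, the function name `crawled_images` is hypothetical, and only the two filter strings ("googlebot-image" and "uploads/2012/05/") come from his comment.

```python
import re
from datetime import datetime

# Assumption: "combined" access-log format. Adjust the pattern if your
# server logs in a different layout.
LINE_RE = re.compile(
    r'\[(?P<time>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def crawled_images(lines, agent="googlebot-image", path_part="uploads/2012/05/"):
    """Return (timestamp, path) pairs for matching requests, oldest first."""
    hits = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # skip lines that do not fit the assumed format
        # The user agent is matched case-insensitively, the path literally.
        if agent not in m["agent"].lower() or path_part not in m["path"]:
            continue
        when = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
        hits.append((when, m["path"]))
    hits.sort()  # order by crawl time
    return hits
```

Called as `crawled_images(open("access.log"))` (the file name is a placeholder), it returns the image requests in crawl order, which can then be compared with the order shown in image search.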