Search Google for intitle:"Untitled Document" and you will find plenty of websites with missing title tags. Worry not, though: Google looks for hints of the page’s main theme elsewhere in the document and shows them in search results in place of the title. So which elements can serve as a substitute? I investigate further.
Update: We’ve changed the title of this very article to “Untitled Document”, and the following is Google’s rewrite based on our H1 and brand name:
Summary of Findings
Which elements will Google pick to replace a missing or inadequate title tag?
- Domain Name
- Page URL (note: gets confused with multiple levels)
- Domain + URL combination
- H tags
- Plain Text (tested when at the beginning of a document)
- Elements of a parent page (in case of iFrames)
- Truncated fragments (boilerplate text is dropped in favour of the variable part)
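To make the list above concrete, here is a minimal sketch that collects the same kinds of on-page candidates (title, H tags, first plain text) from a page's HTML. It is only an illustration of what could be gathered, not Google's actual method. The names `CandidateCollector`, `title_candidates`, and `PLACEHOLDER_TITLES` are hypothetical, and treating "Untitled Document" as a missing title is an assumption.

```python
from html.parser import HTMLParser

# Assumption: titles we treat as "missing" for the purpose of this sketch.
PLACEHOLDER_TITLES = {"", "untitled", "untitled document"}


class CandidateCollector(HTMLParser):
    """Collects the <title>, heading text, and first visible text of a page."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.headings = []    # (tag, text) pairs in document order
        self.first_text = ""  # first plain text outside title and headings
        self._stack = []      # open tags we care about

    def handle_starttag(self, tag, attrs):
        if tag == "title" or tag in self.HEADINGS or tag in self.SKIPPED:
            self._stack.append(tag)
            if tag in self.HEADINGS:
                self.headings.append((tag, ""))

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if not text:
            return
        current = self._stack[-1] if self._stack else None
        if current == "title":
            self.title += text
        elif current in self.SKIPPED:
            return
        elif current in self.HEADINGS:
            tag, so_far = self.headings[-1]
            self.headings[-1] = (tag, (so_far + " " + text).strip())
        elif not self.first_text:
            self.first_text = text


def title_candidates(html):
    """Return the on-page elements that could stand in for a missing title."""
    collector = CandidateCollector()
    collector.feed(html)
    collector.close()
    return {
        "title": collector.title,
        "title_missing": collector.title.strip().lower() in PLACEHOLDER_TITLES,
        "headings": collector.headings,
        "first_text": collector.first_text,
    }
```

Calling `title_candidates(page_html)` on a saved copy of a page gives a quick view of what a search engine has to work with when the title is absent. Domain and URL candidates are not covered here because they come from the address, not the markup.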
Our Research in Detail
Case 1: http://www.sarahaking.com/
Rendered Title: Sarah A. King
Inference: Likely <font color="#006699">SARAH A. KING</font>
Note: The domain name could have been used, but it doesn’t contain the punctuation visible in the rendered title.
Case 2: http://www.wesleyburt.com/drawingsamples.html
Rendered Title: Drawing Samples – Wesley Burt
Inference: {URL_filename} – {Domain Name}
Note: The home page’s title tag may also have been a source.
Case 3: http://www.pinholephotography.org/Solargraph%20instructions.htm
Rendered Title: Solargraphs – Pinhole photography
Inference: {URL_filename} – {Domain Name}
Note: Identical inference to the one above.
Case 4: http://stepheneastwood.com/tutorials/lensdistortion/strippage.htm
Rendered Title: Stephen Eastwood
Inference: {Domain Name}
Note: The several levels of directories (/1/2/3/) may be why the filename element was dropped from the title substitute.
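Cases 2 to 4 all build the substitute title from parts of the address. The sketch below pulls out those raw parts: the filename, the domain label, and the number of directory levels. The function name `url_title_parts` is hypothetical, and the domain handling is deliberately naive (it assumes the first label after an optional "www." is the name). It does not attempt the word splitting or rewording seen in the rendered titles, such as turning "drawingsamples" into "Drawing Samples".

```python
import posixpath
from urllib.parse import unquote, urlsplit


def url_title_parts(url):
    """Split a URL into the raw pieces a title substitute might draw on."""
    parts = urlsplit(url)

    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[4:]
    # Naive assumption: the first label is the site name
    # (this would be wrong for hosts with extra subdomains).
    domain_label = host.split(".")[0]

    path = unquote(parts.path)  # e.g. "%20" becomes a space
    segments = [segment for segment in path.split("/") if segment]

    filename = ""
    if segments and not path.endswith("/"):
        # Drop the extension: "strippage.htm" -> "strippage"
        filename = posixpath.splitext(segments[-1])[0]

    # Directory levels above the file (or all segments if there is no file)
    depth = len(segments) - (1 if filename else 0)

    return {
        "filename": filename,
        "domain": domain_label,
        "directory_depth": depth,
    }
```

Running it on the Case 4 URL reports a directory depth of 2, which is the kind of signal that could be compared across more examples to check whether nesting really correlates with the filename being dropped.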
Case 5: http://www.howardtangye.com/HT-90.html
Rendered Title: Untitled Document – Howard Tangye
Inference: Untitled Document + {Domain Name}
Note: How interesting! Google keeps the “Untitled Document” part and appends the domain name rather than replacing the title completely. Does it think the piece is actually called “Untitled”? Perhaps not, but the semantically opaque filename HT-90 may have led it to give up on working out what the page is about.
Case 6: http://www.benkler.org/CoasesPenguin.html
Rendered Title: Coase’s Penguin, or Linux and the Nature of – Yochai Benkler
Inference: H3 – H4
Case 7: http://www.scientificamerican.com/media/8-Wonders/01-Intro.html
Rendered Title: 8 Wonders of the – Scientific American
Inference: Parent Page Fragment (iframe) – Domain
Note: This is a strange case in which Google gets it wrong and truncates the title. What’s interesting is that Google follows the parent page in which this URL is embedded as an iFrame and takes its information from there. My guess is that the truncation happens because the piece runs across several sections, each changing slightly (e.g. Saturn Rings).
Case 8: http://www.tonyhawk.com/thth/
Rendered Title: (#THTH) Rules! – Tony Hawk
Inference: H2 Fragment – Domain
Note: This is a case where truncation is useful: Google removes the general site title and leaves only the bit of substance (e.g. Tony Hawk’s Twitter Hunt (#THTH) Rules!).
So, the domain name and H1 are the most important elements when a page that Google has crawled is missing its title.
It seems so, but there are other significant elements as per our summary of findings.