feat(aio): update metatags to control search engine crawling (#21665)

The `<meta name="robots" content="noindex">` tag is used
to indicate to search engine crawlers that they should not index
the current page. This is set dynamically by the document
viewer component to ensure that 404 and other erroring pages
are not added to the search index.

This relies on the crawling bot running the JS and waiting
to see whether this meta tag has been added.

Since we believe that the `googlebot` will do this, we also
pre-emptively add a hard-coded noindex tag specifically for
this bot, so that if anything else fails in bootstrapping the app,
the failed page will not be added to the index.
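
The tag handling described above could be sketched roughly as follows.
This is an illustration only, not the code in this commit: the
`MetaTagStore` interface, the `InMemoryMetaTags` class, the `DocStatus`
type and the `updateCrawlerTags` function are all hypothetical names,
and treating both `robots` and `googlebot` the same way on error pages
is an assumption.

```typescript
// Hypothetical minimal abstraction over the page's <meta> tags.
// In a real app this would be backed by the DOM or a framework service.
interface MetaTagStore {
  set(name: string, content: string): void;
  remove(name: string): void;
}

// In-memory stand-in so the sketch can run without a browser.
class InMemoryMetaTags implements MetaTagStore {
  private tags: Record<string, string> = {};

  set(name: string, content: string): void {
    this.tags[name] = content;
  }

  remove(name: string): void {
    delete this.tags[name];
  }

  get(name: string): string | undefined {
    return this.tags[name];
  }
}

// Assumed outcomes of loading a document in the viewer.
type DocStatus = 'ok' | 'not-found' | 'fetch-error';

// Called after each navigation (assumption: the viewer knows the status).
function updateCrawlerTags(meta: MetaTagStore, status: DocStatus): void {
  if (status === 'ok') {
    // The page rendered correctly, so allow indexing: drop the
    // hard-coded googlebot tag and any tag left by an earlier error page.
    meta.remove('googlebot');
    meta.remove('robots');
  } else {
    // 404 or other erroring page: ask all crawlers not to index it.
    meta.set('robots', 'noindex');
    meta.set('googlebot', 'noindex');
  }
}
```

If bootstrapping fails before `updateCrawlerTags` ever runs, the
hard-coded `googlebot` tag simply stays in place, which is the
fallback this commit relies on.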

Closes #21317

PR Close #21665
This commit is contained in:
Pete Bacon Darwin
2018-01-19 14:58:23 +00:00
committed by Misko Hevery
parent 0b38a039d0
commit 88045a5050
5 changed files with 77 additions and 4 deletions

@ -31,6 +31,13 @@
<meta name="apple-mobile-web-app-capable" content="yes">
<meta name="apple-mobile-web-app-status-bar-style" content="translucent">
<!--
Initially tell the Google crawler not to index this page.
If the page loads correctly, we will remove this tag (in the DocViewer).
Subsequent navigations will update the tag dynamically (i.e. soft 404).
Don't do the same for `robots` in general here, since they might not be able to handle the tag changing dynamically.
-->
<meta name="googlebot" content="noindex">
<!-- Google Analytics -->
<script>