UK and US English Dialect Considerations for Site Internationalization

by sean · 1 Comment

Search Engines and Site Localization

While there are few differences between the UK and US English dialects which might lead to miscomprehension, Noah Webster’s spelling reform does lead to interesting issues which need to be considered when designing sites for international audiences.

Update: This document was written in 2006 and no longer represents the current state of search affairs. It has been left here as a historical reference. Search engines continually refine their algorithms and that is reflected in how they currently handle regional linguistic differences.

Is it “my favorite color” or “my favourite colour”?

While it may seem like an arcane academic question, how you spell your English language content can determine your site’s visibility in search engines and how your site is perceived by your visitors.

With about two-thirds of native English speakers in the US, American spelling predominates on the web. Not surprisingly, a non-scientific survey of search expressions using both US and UK spellings yields more matches for the US variant:

Google search for “my favorite color” (2006-01-17)

Search                            Google Matches    % Total
US: my favorite color                 36,100,000       84.9
GB: my favourite colour                6,430,000       15.1
Total                                 42,530,000

Google search for “search engine optimization” (2006-01-17)

Search                            Google Matches    % Total
US: search engine optimization        33,300,000       89.9
GB: search engine optimisation         3,760,000       10.1
Total                                 37,060,000
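
For readers who want to repeat the survey with their own keywords, the percentages above are simply each variant's match count divided by the combined total. A minimal sketch in Python, assuming the counts have been copied by hand from the search results (the function name share_of_total is ours, not part of any tool):

```python
def share_of_total(counts):
    """Return each variant's share of the combined match count, in percent."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Match counts as reported in the tables above (2006-01-17).
favorite_color = {"US": 36_100_000, "GB": 6_430_000}
seo = {"US": 33_300_000, "GB": 3_760_000}

print(share_of_total(favorite_color))  # {'US': 84.9, 'GB': 15.1}
print(share_of_total(seo))             # {'US': 89.9, 'GB': 10.1}
```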

Several interesting points emerge from these search results:

  • Google does not respond with suggested alternatives (“Did you mean…”), as commonly occurs with misspellings.
  • Google does not seem to be using an equivalences dictionary — the results are clearly different.
  • The British Council may need to up their investments to stem US linguistic hegemony :-) .

Note: Some pages with US spellings will show up in queries using British spellings and vice versa. This is probably due to the use of UK or US spelling in the text of links on external sites that point to this content; e.g. a link labeled “Search Engine Optimisation Resources” may point to a document written in American English and titled “Search Engine Optimization Resources”.

Should I use US or UK English?

If your audience is predominantly limited to the US or the UK, the choice is easy: use the dialect which will most resonate with your audience; this will be the dialect they use to search the Internet and it will be what they expect to see when they browse your site.

For sites targeting an international audience, the choice is a bit more difficult. As US English will resonate more closely with two-thirds of native speakers, this might be the best choice. US English is often taught as the preferred dialect in many parts of Asia. UK English should be considered when targeting primarily Europe and/or Commonwealth countries. Keyword research and analysis will help answer this question. For starters, once you have identified your keywords, are they spelled exactly the same in US and UK English?
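
As a rough illustration of that last check, the sketch below flags keywords whose spelling differs between the two dialects. It assumes a tiny hand-made word list (US_TO_UK is purely illustrative and far from complete); a real keyword analysis would need a much fuller dictionary.

```python
# Hypothetical, deliberately tiny US -> UK spelling table for illustration only.
US_TO_UK = {
    "favorite": "favourite",
    "color": "colour",
    "optimization": "optimisation",
}
UK_TO_US = {uk: us for us, uk in US_TO_UK.items()}


def to_dialect(phrase, mapping):
    """Rewrite each word of the phrase using the given spelling table."""
    return " ".join(mapping.get(word, word) for word in phrase.lower().split())


def keyword_variants(phrase):
    """Return the (US, UK) spellings of a keyword phrase."""
    return to_dialect(phrase, UK_TO_US), to_dialect(phrase, US_TO_UK)


def differs_by_dialect(phrase):
    """True when the US and UK spellings of the phrase are not identical."""
    us, uk = keyword_variants(phrase)
    return us != uk


for keyword in ["search engine optimisation", "web analytics"]:
    print(keyword, "->", keyword_variants(keyword), differs_by_dialect(keyword))
```

Keywords that come back unchanged in both dialects need no special treatment; the others are the ones to weigh in your keyword research.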

Should I Mix and Match US and UK English?

A problem inherent in limiting your site’s content to either the US or UK dialect is that you’ll limit your search engine visibility — users searching with terms specific to one dialect or the other will be less likely to find you (your content may still appear if links on external sites pointing to your content contain the search terms in the same dialect as the search query).

One solution is to use both US and UK terms in your site’s content. You may choose to mix dialects in the same page or to write some pages in US English and others in UK English, all at the risk of appearing inconsistent to a perceptive site visitor.

Site Localization and the Google Duplicate Content Penalty

A user-oriented solution to addressing an international audience is to maintain a .com site for the US dialect and a .co.uk site for the UK dialect. While a sensible choice from a usability point of view, this approach runs afoul of Google’s recently refined duplicate content detection algorithm (changes were made in the so-called Jagger update in fall 2005).

In essence, Google attempts to identify and penalize sites which are predominately carbon copies of other sites. Should you decide to offer your content in both US and UK dialects, we suggest you avoid Google’s potential wrath by focusing your marketing efforts on one of the two sites. Tell Google to ignore the other site with a robots exclusion file (at the site level) or a robots exclusion meta tag (at the page level). While a bit draconian, excluding your “duplicate” content from Google’s reach seems to be the only solution currently available to avoid incurring Google’s duplicate content penalty.

Tip: Consider blocking all search engine crawlers from your secondary English site. By doing so now, you’ll avoid future problems should Yahoo, MSN and others adopt a duplicate content penalty similar to Google’s. Of course, your alternative English spellings won’t be found….
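
Before deploying a robots exclusion file on the secondary site, it can be worth confirming that it blocks what you expect. The sketch below uses Python's standard urllib.robotparser module to test two candidate files: one that excludes every crawler and one that excludes only Googlebot. The www.example.co.uk address and the is_blocked helper are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Candidate robots.txt for the secondary site: exclude every crawler.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Alternative: exclude only Google's crawler.
BLOCK_GOOGLE_ONLY = """\
User-agent: Googlebot
Disallow: /
"""

# Hypothetical page on the secondary (UK) site.
URL = "https://www.example.co.uk/index.html"


def is_blocked(robots_txt, user_agent, url):
    """True if the given robots.txt text forbids user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


print(is_blocked(BLOCK_ALL, "Googlebot", URL))          # True
print(is_blocked(BLOCK_GOOGLE_ONLY, "Googlebot", URL))  # True
print(is_blocked(BLOCK_GOOGLE_ONLY, "Slurp", URL))      # False
```

This only checks how a well-behaved crawler would read the file; it says nothing about whether a given search engine has already indexed the pages.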

Tag Your Content with the Language Dialect to Facilitate Proper Search Engine Indexing

You can help a search engine identify the language dialect of a page’s content by setting the lang attribute on the page’s html element, e.g.

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">

or

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-GB" lang="en-GB">

as the case may be. Refer to the W3C document “Language tags in HTML and XML” for further information.
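
To audit which dialect your existing pages declare, a short script can read the language code from the html element. This is a minimal sketch using Python's standard html.parser module; the LangFinder class and declared_language function are names we made up for the example.

```python
from html.parser import HTMLParser


class LangFinder(HTMLParser):
    """Remember the language declared on the first <html> start tag."""

    def __init__(self):
        super().__init__()
        self.lang = None

    def handle_starttag(self, tag, attrs):
        if tag == "html" and self.lang is None:
            attr_map = dict(attrs)
            # Prefer lang; fall back to xml:lang if only that is present.
            self.lang = attr_map.get("lang") or attr_map.get("xml:lang")


def declared_language(markup):
    """Return the language code declared on the html element, or None."""
    finder = LangFinder()
    finder.feed(markup)
    return finder.lang


page = (
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-GB" lang="en-GB">'
    "<head><title>My favourite colour</title></head><body></body></html>"
)
print(declared_language(page))  # en-GB
```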

While we’re on the topic, note that language codes can also be set at the HTTP header level. This is used mostly by browsers.

For the Apache server, add a line similar to

AddLanguage en-GB .html

to your web server configuration file or .htaccess file. Your server will then include

Content-Language: en-GB

in its HTTP headers. HTTP headers can be viewed in Firefox using the livehttpheaders extension. Microsoft users should consider Microsoft’s wfetch.exe tool.
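
If you would rather check the header from a script, the sketch below pulls Content-Language out of the head of an HTTP response. It works on a captured response pasted in as text (the sample is invented for illustration) and relies on Python's standard email parser, since HTTP headers share the same "Name: value" layout; content_language is our own helper name.

```python
from email import message_from_string


def content_language(raw_response_head):
    """Return the Content-Language value from raw HTTP response headers, or None."""
    # Drop the status line ("HTTP/1.1 200 OK"); the rest are "Name: value" lines.
    _, _, header_block = raw_response_head.partition("\r\n")
    headers = message_from_string(header_block)
    return headers.get("Content-Language")


# Invented sample of what a server configured as above might send.
sample = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Language: en-GB\r\n"
    "\r\n"
)
print(content_language(sample))  # en-GB
```

For a live check, urllib.request.urlopen(url).headers.get("Content-Language") reads the same header from a real response.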

What’s your experience?

What experience have you had resolving internationalization issues?

Contact Us with feedback on your experience or to let us help you with your Search Engine Optimization and Web Analytics needs.

The use of the term Merit-based™ in conjunction with Search Engine Optimization is a Trademark of Antezeta.

Originally published January 17th, 2006


1 response so far ↓

  • 1 SEO Chiangmai // Sep 10, 2009 at 8:42:49

    This is an old post but there does not seem to be much in terms of relevant, updated posts using the Google. I have several clients who target both US and UK markets and they are not sure what approach to take with spelling, especially targeting keywords and both markets simultaneously.

    I suspect that a single site could use Google Website Optimizer with geographical tracking to allow for single site indexing and adaptive responsiveness.
