How Much Do You Trust Google? - Part 3

Can Google actually get you arrested? Thanks to Google's prefetching feature, it is more than possible - it is reality. Google's prefetching can load websites directly into your cache without you even knowing, and the consequences can be serious. Not only is this unethical, it also violates your personal rights.

How Much Do You Trust Google? In Part 1 of the series, by Adam Prickett, we discussed the legal issues and secrets of Google. In Part 2, by Ethan Poole, we unmasked Google's controversial AutoLink feature. In Part 3 we'll continue our look behind the scenes at Google by examining its prefetching feature.

How Does Prefetching Work?

Prefetching starts like any other search, with the user entering a query in Google. Google then returns its results, and the first link may have a special attribute added to it - rel="prefetch". The attribute is only added on certain searches, and the decision is tightly integrated into Google's search algorithm - so exactly when it happens remains a secret.

This small bit of added code tells a browser to load the page into its cache. Then, if the user visits that page, the browser simply draws it from the cache. Currently the only popular browsers with this prefetching feature are Firefox and Mozilla, and the only popular service that uses it is Google. If other web browsers were to implement prefetching, the problem would become even larger.

A browser that supports prefetching downloads the marked page into the user's cache. If the user then clicks the first link, the page has already been downloaded, which speeds things up for the user. Although this sounds like a great feature, it has a huge downside.
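To make the mechanism concrete, here is a minimal Python sketch of the first step a prefetching browser would take: scanning a page for links marked with rel="prefetch". The sample page, the example.com addresses, and the names PrefetchLinkFinder and find_prefetch_urls are made up for illustration. The sketch checks both a and link tags, since that is an assumption on my part about where the attribute might appear; it is not Firefox's actual code.

```python
from html.parser import HTMLParser


class PrefetchLinkFinder(HTMLParser):
    """Collects the href of every tag whose rel attribute includes "prefetch"."""

    def __init__(self):
        super().__init__()
        self.prefetch_urls = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # rel can hold several space-separated tokens, e.g. rel="next prefetch".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if tag in ("a", "link") and "prefetch" in rel_tokens and href:
            self.prefetch_urls.append(href)


def find_prefetch_urls(html):
    """Return the URLs a prefetching browser would be asked to load early."""
    finder = PrefetchLinkFinder()
    finder.feed(html)
    return finder.prefetch_urls


# Hypothetical results page: only the first result carries the hint.
sample_page = """
<a href="http://example.com/first" rel="prefetch">First result</a>
<a href="http://example.com/second">Second result</a>
"""
print(find_prefetch_urls(sample_page))
```

Everything returned by find_prefetch_urls is a page the browser would fetch into the cache before the user clicks anything, which is exactly what the rest of this article is worried about.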

According to Google this is what they consider results prefetching:

"On some searches, Google automatically instructs your browser to start downloading the top search result before you click on it."

Sure, this feature sounds just peachy, but not everyone wants unintended webpages downloaded into their cache.

The Dark Side of Prefetching - Do You REALLY Want That Page in Your Cache?

There have been cases where people have been arrested for having child pornography in their cache. What happens if a user does a search for "porn" in Google and the first result is a child pornography site? Even if the user doesn't visit the page, it is downloaded into their cache. Now the police come along and find child pornography in the cache from pages the user never visited. The user gets arrested, end of story.

Advertisers are also affected by Google's prefetching. What happens if you have an ad on a webpage that is pre-cached at Google's request, but never viewed? The ad is loaded without anyone ever seeing it, which puts web-based advertisers at a huge disadvantage and messes with the web's economy. Does Google care that many people are losing money? Absolutely not. That's one thing that has been proven over and over with Google - they're in it for the money, and they don't care about others losing money.

Another disadvantage for the user is that everything on the web page is pre-cached, and this includes cookies. Most web users know the danger of cookies and the privacy they can potentially invade. Having cookies set by websites that you've never visited has no advantages. How many users want some cookie they don't know about stored on their computer? Not many.

A quick solution would be for Google not to prefetch results for searches related to pornography, but sadly that doesn't solve everything. Simply adding words such as "blow" or "suck" to a search query will instantly bring up inappropriate content. So even if you search for something harmless, like "which way do hurricanes blow," you'll end up with results you probably weren't expecting. Potentially, that unexpected result could now be in your cache.

Throwing off Browser Statistics?

Currently Firefox has a growing market share of close to 9%. Jon von Tetzchner, the chief executive of Opera, claims that these statistics are incorrect. He makes a well-thought-out point about Firefox's statistics being inflated - all due to Google.

Jon von Tetzchner claims that Firefox's market share numbers are inflated by Firefox's prefetching feature, used by Google. "Sadly the statistics are ... overcounting Firefox. ... Firefox has added a pre-loading feature that Google has made use of. This inflates the numbers on the statistics," von Tetzchner said.

With the popularity of Google, and the prominence of the Google search box built into Firefox's interface, the prefetching feature may be inflating statistics even more than we believe. The average user searches via Google multiple times each day. For each Firefox user, every query they enter has the potential to be pre-cached and counted as a page loaded by Firefox. Even if the user never visits that page, it is still counted as a hit loaded by Firefox.

Some people argue that statistics programs can tell a prefetched page apart from the browser string, but this isn't always true. AwStats, used by most webmasters, doesn't take the entire browser string into consideration, and counts a pre-cached page as a hit loaded by Firefox. Better quality statistics programs may take a more precise look at the browser string, yet most webmasters don't use these programs.
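For webmasters who want to see how much prefetching affects their own numbers, here is a rough Python sketch. It relies on the fact that Firefox, according to Mozilla's prefetching documentation, marks prefetch requests with an extra "X-moz: prefetch" request header. The sample requests and the function names are made up for illustration, and the sketch assumes your server or log format actually records that header, which many default log formats do not.

```python
def is_prefetch_request(headers):
    """Return True if the request headers carry the X-moz: prefetch marker."""
    # Header names are case-insensitive, so normalise before looking up.
    normalised = {
        name.lower(): value.strip().lower() for name, value in headers.items()
    }
    return normalised.get("x-moz") == "prefetch"


def count_real_hits(requests):
    """Count requests that were not flagged as prefetches."""
    return sum(1 for headers in requests if not is_prefetch_request(headers))


# Hypothetical requests: one prefetch and two ordinary page views.
sample_requests = [
    {"User-Agent": "Mozilla/5.0 (hypothetical Firefox)", "X-moz": "prefetch"},
    {"User-Agent": "Mozilla/5.0 (hypothetical Firefox)"},
    {"User-Agent": "Opera/8.0 (hypothetical)"},
]
print(count_real_hits(sample_requests))
```

A statistics program that only looks at the User-Agent line would count all three of these requests, including the page nobody actually looked at.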

Overall, the issue of inflated browser statistics is hard to confirm, as it isn't easy to say how large an effect Google really has on browser statistics. Plus, statistics are messed up in the first place, since most spiders identify themselves as Internet Explorer. I wouldn't be surprised if Firefox and Opera both had higher market percentages.

How Can I Turn Off Prefetching?

Before I propose my solution: Firefox users, you can easily disable prefetching yourselves and avoid the problem. Type about:config in your address bar, then scroll down to network.prefetch-next and set it to false.
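If you would rather not change the setting by hand, the same preference can be placed in a user.js file in your Firefox profile folder, which Firefox reads at startup. The short Python sketch below appends the line for you. The function name disable_prefetch is made up for illustration, and you have to supply the path to your own profile folder, which differs from system to system.

```python
from pathlib import Path

# The same preference that about:config changes, in user.js syntax.
PREF_LINE = 'user_pref("network.prefetch-next", false);'


def disable_prefetch(profile_dir):
    """Append the prefetch preference to user.js in the given profile folder."""
    user_js = Path(profile_dir) / "user.js"
    existing = user_js.read_text() if user_js.exists() else ""
    if PREF_LINE not in existing:
        # Keep whatever is already in the file and add our line at the end.
        separator = "" if existing == "" or existing.endswith("\n") else "\n"
        user_js.write_text(existing + separator + PREF_LINE + "\n")
    return user_js
```

Call disable_prefetch with the path to your profile folder, then restart Firefox so the setting is picked up. Running it a second time does nothing, because the line is already there.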

The Solution

Most of you would expect me to say stop using Firefox, as I am an Opera advocate, but that isn't my proposed solution. My solution calls for action from Google. Because of the popularity of its search and the variety of pages that get pre-cached, Google needs to end prefetching on its side to stop the many unwanted side effects. Even though it is Firefox/Mozilla that offers this feature, Google's algorithm will often load pages into the cache that are inappropriate or not desired by the user.

A practical solution would be for Google to only prefetch pages for Google Account users who are logged in. Then the people who want prefetching and are aware of the consequences will have the option to use it. Choice is what drives the web, and it is a tradition that Google needs to uphold fully when it comes to prefetching.
