Improving the quality of Google Analytics data for Jekyll sites
As explained in the Google Analytics documentation, the default GA JavaScript snippet will extract the
information about the page location and title from window.location
and document.title
. This works fine
if the page is loaded directly from its origin without modification. However, a visitor of the site may
load the page from Google’s cache or use a service such as Google Translate to display it in a different
language. In this case the window.location
no longer contains the original URL of the page, and
document.title
may have a modified value as well. On the other hand, the Google Analytics code contained
in the page will (in general) still be executed.
The end result is that this particular type of page view is still recorded, but with somewhat unexpected data, as shown in the following screenshot:
The solution for this problem is to pass the page location and title information explicitly when sending the page view event.
Of course, for this to work these values must appear as constants in the JavaScript returned by the site.
With Jekyll this is easy to achieve because the location and title are available from the
page.url
and page.title
template parameters, so that they can be substituted when the HTML of the
page is generated. This is straightforward, except that care must be taken to correctly escape
quotes that may be part of the page title:
ga('send', 'pageview', {
'page': '{{ page.url }}',
'title': '{{ page.title | replace: "'", "\\'" }}'
});
In the generated page these options will indeed appear as constants and they will be sent unmodified to Analytics, even if the page is loaded from Google’s cache or modified by Google Translate:
ga('send', 'pageview', {
'page': '/2015/01/05/jekyll-improving-ga-data-quality.html',
'title': 'Improving the quality of Google Analytics data for Jekyll sites'
});
Note that this relies on the assumption that all pages have the title
parameter set, i.e. that
the markup for the page title is always generated as follows:
<title>{{ page.title }}</title>
If that’s not the case, then you need to implement the necessary changes to make things consistent, either by modifying the JavaScript code or by changing the way the page metadata is generated.
Note that with Jekyll 2.x, a side effect of the change described here is that the home page of the site will no
longer be reported as /
, but as /index.html
. In another post I explain why this is actually
desirable (but note that this is no longer the case with Jekyll 3.x). In case you don’t like this because
it introduces a discontinuity in your analytics data,
it should be easy to modify the code shown above to take that particular situation into account.