The Digital Analytics Association is history – and no one cares.

It was a bit surprising. I had recently exchanged emails with Jim Sterne about the German branch. The DAA had also contributed a foreword to my web analytics book. It’s a bit of a shame.

For those who don’t know: The DAA was previously the WAA, the Web Analytics Association, and it created the most widely used definition of web analytics. Although that definition has long been missing from the website, most researchers who copy quotes from other papers didn’t seem to care.

But how is it possible that such an organization is shutting down, despite the importance of data? One reason could be that many have installed Google Analytics & Co. but are not actually using the data. In my last paper, which unfortunately isn’t public yet, the finding was that most users don’t even realize that embedding the GA code alone isn’t enough to work in a data-driven way. And maybe the DAA itself is partly to blame, because it didn’t manage to make its relevance clear.

I had only been a member out of nostalgia in recent years. I had used my student status to lower the membership fees a bit.

The website is already no longer accessible.

Why the average session duration in Analytics is complete nonsense


I have been working in web analytics for over 20 years, starting with server log files and today with sometimes crazy implementations of tracking systems. The possibilities keep getting better, but not everything has improved. One superstition simply cannot be killed: that Time on Site, or the “average session duration”, is a good metric, or that the reported values are correct at all. So here it is in black and white: In a standard implementation, Time on Site is not measured correctly, whether in Adobe Analytics or Google Analytics or Piwik or whatever.

Why Average Session Duration Isn’t Measured Properly

The explanation for why the times cannot be correct is quite simple. In a standard installation of Google/Adobe Analytics/[place your system here], a measurement is taken every time the user triggers an action. For example, a user arrives at a website at 1:00 p.m., and the tracking pixel is fired for the first time. The user looks around a bit and then clicks on a link at 1:01 p.m., which takes him to another page of the same website, where the tracking pixel is fired again. Now we can calculate that he has spent 1 minute on the website so far, because we have two measurement points with different timestamps. Time is measured here as the time difference between two pages.

The user stays longer on the second page, because here he finds what he was looking for. He reads a text, and at 1:05 p.m., i.e. after 4 minutes, his need for information is satisfied, so he leaves the page. He has now been on the website for a total of 5 minutes. However, Analytics only knows about the first minute and will only include this 1 minute in the statistics, because nothing is fired when the user leaves the page. As written above: We measure time as the time difference between two pages. Time difference between the 1st and 2nd page: 1 minute. Time difference between the 2nd page and the exit: not measurable, because the next page is missing. Most users are not aware of this. The time a user spends on the last page is not measured.

Can’t Analytics measure when a user leaves the page? No, it can’t, no matter which system. At least not in the standard installation. Of course, the installation can be adjusted. Otherwise: If a user comes to a website and only looks at one page, then no time is measured at all. Even if he spends 10 minutes on this one page, that time does not flow into the average session duration. Since one-page visits are not uncommon, quite a lot of data is missing.
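To make the example above concrete, here is a minimal sketch of the calculation. It is not the code of any analytics product; the function name and the date are made up for illustration, and the only assumption is that the measured duration is the difference between the first and the last recorded hit.

```python
from datetime import datetime

def measured_session_seconds(hit_times):
    """Duration as a standard setup sees it: last hit minus first hit."""
    if len(hit_times) < 2:
        return 0.0  # a single hit gives no time difference at all
    return (max(hit_times) - min(hit_times)).total_seconds()

# Hypothetical date; the clock times are the ones from the example above.
fmt = "%Y-%m-%d %H:%M"
pageviews = [
    datetime.strptime("2017-08-01 13:00", fmt),  # first page
    datetime.strptime("2017-08-01 13:01", fmt),  # second page
]
actual_exit = datetime.strptime("2017-08-01 13:05", fmt)  # never sent as a hit

measured = measured_session_seconds(pageviews)
actual = (actual_exit - pageviews[0]).total_seconds()
one_page_visit = measured_session_seconds(pageviews[:1])

print(measured, actual, one_page_visit)
```

The measured value is 60 seconds although the visit actually lasted 300 seconds, and the one-page visit contributes 0 seconds no matter how long the user stayed.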

Is that really that bad?

Does that really make that much of a difference? Yes, it does. Astonished users often reply that the reported figures at least give you a rough indication. But what can you do with an indication that is completely wrong? Of course, no one likes to admit that all previous data was wrong and so were the decisions based on it.

To illustrate the difference, here are a few data points. Until August 2017, the average session duration on my site was 1 minute (red line, here compared to the previous year). From August 2017 on, the average session duration rises to 5 minutes (blue line). The time on site has increased fivefold and also seems much more realistic, since most of the content on my site cannot be read in 1 minute. However, even these 5 minutes are not the actual average session duration, but only an approximation.

How do you get better figures?

How is it that more time is now being measured? In another article I wrote about measuring the scroll depth, where an event is fired when the user reaches 25%, 50%, 75% and 100% of the page length. With each of these events, a timestamp is also sent. If a user scrolls down, time is therefore also measured on the last page of a visit, up to the point where the last event is triggered. It’s not unlikely that users spend even more time on this page, as they may keep reading something in the bottom section after they have stopped scrolling.
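A small sketch shows why the scroll events change the picture. The numbers are invented for illustration (they are not data from my site), and the function only assumes that the measured time is the span between the first and the last hit of a visit.

```python
def measured_seconds(hit_offsets):
    """Measured time = span between first and last hit (offsets in seconds)."""
    if len(hit_offsets) < 2:
        return 0
    return max(hit_offsets) - min(hit_offsets)

# Hypothetical one-page visit: the user actually stays 240 seconds.
actual_stay = 240
pageview_only = [0]                    # standard setup: only the page view
with_scroll_events = [0, 20, 75, 140]  # page view plus 25%, 50%, 75% events

without_events = measured_seconds(pageview_only)
with_events = measured_seconds(with_scroll_events)
still_missing = actual_stay - with_events

print(without_events, with_events, still_missing)
```

Without scroll events the visit counts as 0 seconds; with them, 140 of the 240 seconds are captured. The remaining 100 seconds after the last event stay invisible, which is why the result is still only an approximation.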

Why, one might now ask, isn’t an event simply triggered every second as long as the user is on the site? Then you would have the exact duration of the session. Technically, this is actually possible, but the free version of Google Analytics, for example, allows a maximum of 10 million hits per month (hit = server call), and with Adobe every single hit is billed. So I can allow myself about 333,333 hits per day with Google Analytics, and if we assume an actual average session duration of 6 minutes (360 seconds), then I would have to stay below 1,000 sessions per day so that I don’t get cut off. And we haven’t measured anything else with that. Even the scroll depth measurement alone would trigger so many server calls on many sites that you simply can’t afford it. In that case, however, you can at least measure a random sample of users in order to get an approximation worthy of the name.
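The arithmetic behind this can be written down in a few lines. The limit of 10 million hits and the 360-second session are the figures from the text; the 30-day month and the alternative 10-second interval are assumptions added for illustration.

```python
MONTHLY_HIT_LIMIT = 10_000_000  # free Google Analytics limit named in the text
DAYS_PER_MONTH = 30             # assumption: a 30-day month
SESSION_SECONDS = 360           # assumed actual average session duration

hits_per_day = MONTHLY_HIT_LIMIT // DAYS_PER_MONTH

def max_sessions_per_day(interval_seconds):
    """Sessions per day that fit into the budget if one timing event
    is sent every `interval_seconds` and nothing else is tracked."""
    hits_per_session = SESSION_SECONDS // interval_seconds
    return hits_per_day // hits_per_session

every_second = max_sessions_per_day(1)
every_ten_seconds = max_sessions_per_day(10)

print(hits_per_day, every_second, every_ten_seconds)
```

With one event per second, the budget is used up after roughly 925 sessions per day. Sending an event only every 10 seconds raises this to about 9,259 sessions, at the price of a coarser measurement.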

Why use the average session duration at all?

This metric is often used when there are no “hard” conversions, for example when awareness is one of the marketing goals and the initial aim is simply to get users onto the site. But maybe a lot of that time is simply wasted because urgently needed information cannot be found (have you ever looked for a driver on hp.com?). In other words, maybe a shorter time is even better?

As always, it’s a question of how well the segmentation is done. For hp.com, a metric like “time to download” would be good; for a content-only page, the scroll depth paired with the time spent on the page would be a good indicator of how well users engage with the content. In addition, it would be necessary to take into account how much content is available on a page. This can be done with Custom Dimensions, for example. My favorite saying: Every minute that a user spends on my customer’s site is a minute he cannot spend on a competitor’s website.

However, the real average session duration is also interesting because the concept of holistic landing pages is gaining more and more ground. Since Google sometimes also receives signals about how long someone has been on a page (for example, when the user returns to a search results page), every search engine optimizer should also be interested in how long someone was actually on a page and which parts of the holistic content were actually read. After all, the content is written for users and not for GoogleBot… or is it?

Conclusion

In a standard installation, the average session duration or time on site provides incorrect numbers in any analytics system, and very few users are aware of this. This can be remedied by triggering additional events, for example for scroll tracking. As is so often the case: A Fool with a Tool is still a Fool. As long as you don’t look into how a tool measures something, you shouldn’t be surprised if the conclusions drawn from it are wrong. This applies to analytics as well as to Google Trends or Similar Web.

Is my content being read? Scroll depth per article as conversion


In September 2017, I wrote that the scroll depth would be a better indicator of whether a piece of content has been read than the pure session duration, which is nonsense anyway. A month later, Google released a new feature in Google Tag Manager, a trigger for the visibility of elements (the note was missing in the German version of the release notes). This compensates for some disadvantages of the scroll depth approach, especially the limitation that not every page is the same length and “75% read” does not always mean that the content was read to the end (75% was chosen because many pages have an immense footer and users therefore do not scroll down 100%). One of my pages has so many comments that they make up more than half of the content.

The optimal tracking concept or The sailing trip without a destination


How often have I heard the sentence “Let’s just track everything, we can think about what we actually need later. But of course the tracking concept can already be written!”

Let’s imagine we want to go on a trip with a sailboat and we say “I don’t know where we want to go, let’s just take everything we could need for all eventualities”. Our boat would sink before the trip has begun. We would not know whether we have to take water and canned food for a day or for several weeks, whether we need winter clothes or summer clothes, and so on. But to be on the safe side, we just buy out the entire sailing supply store; we will surely need some of it. And now we have more load than the ship can bear.

Likewise, you can’t track everything that may be needed. Or maybe you can, but that would not only be very expensive, it would also make the website virtually unusable for users. More on that later. The bad news for all those who are looking for a simple solution to a difficult question: A tracking concept requires a lot of brainpower. If you don’t invest it, you will in most cases collect useless data and burn time and money. In the same way, we have to think about what we want to take with us on the sailing trip, depending on the destination.

No tracking concept without clear goals

First of all, there is no way around defining goals, SMART goals, i.e. what by when, etc. For example, 100,000 new customers in a quarter or €500,000 in sales in a quarter. That is our destination. KPIs tell us where we are on the way to this goal. It is similar to a nautical chart, on which we determine our position with navigation instruments and adjust the route if we have strayed off course.

If I realize that I probably won’t reach my goal of 100,000 new customers, then I want to know which levers I need to pull so that I can take corrective action. Or at least I would like to understand why this is so. Maybe I have to look for another goal because my actual goal doesn’t make sense at the moment. If I see that there is a storm in front of my destination port, there may be another port, and from there we may be able to reach our actual destination later. If I don’t reach the sales target because the return rate is higher than expected, I want to understand the cause. I won’t identify it with a standard implementation of Google Analytics.

All data and the information derived from it have only one purpose: We want to understand what action we can derive from the data. If a piece of information is merely interesting but has no relevance for action, then the data has very likely been collected unnecessarily. At sea, I’m not interested in the weather forecast from two days ago. Nevertheless, such data ends up in reports: after all, you have it, it will be good for something, we will find out later. In the same way, we struggle across the sea with our overloaded boat and tell ourselves that we will need the stuff at some point, we just have to get into the right situation first.

On the impossibility of being prepared for everything

Space is limited on a boat, and all material has to find its place. This also applies to a tracking tool. For a shop, a connection to a CRM would certainly be interesting, so that the customer lifetime value etc. can be determined. Most likely, you will also want to work with custom dimensions in Google Analytics, so that data from the CRM can be used in Analytics for segmentation.

But how am I supposed to know which custom dimensions need to be defined if I don’t even know whether I will need any later, and which ones? Especially if the number of custom dimensions is also limited? Custom dimensions are a fundamental decision, similar to a change to the boat that cannot be undone, because a custom dimension can no longer be deleted.

Every event is a small program that creates load

Each piece of material has weight and changes the sailing characteristics of a boat, to the point of overloading. And of course, you can also use a tracking tool to trigger an event in the browser every second to see how long a user has been doing what on a page. But running events means running small programs in the browser, and a lot of load is not good, neither for the browser nor for the user. One of them will give up; the only question is who gives up first.

So a tracking concept can really only be written once the goals and KPIs are clear. Unfortunately, defining them is exhausting work. The good thing is that once this task has been completed, an actionable reporting dashboard can also be built. Numbers are no longer reported just because they can be reported, but because they provide added value. However, most dashboards are far from that. And so most sailboats are steered by whim, gut feeling and line of sight. The only difference is that we don’t put our lives at risk in online marketing.

Of course, you can make a stopover in a harbor later on the route and adjust the provisions, equipment and boat, because you realize that it doesn’t work that way. But then I have lost not only time, but also a lot of money. The same applies to the tracking concept. If I don’t think about it upfront, then I’ve invested a lot of time and money in an enormously complex implementation without being able to use any of it the way I actually need it.

What is the standard for tracking?

“And what if we just do what everyone does? There must be some standards.” The comparison with the sailing trip also fits here: What is the average sailing trip like? I have hardly ever seen two tracking concepts that are the same, even in the same industry. Likewise, no two sailing trips are the same, because every boat is a little different, the crew is different, etc.

Anyone who wants to avoid defining the destination just wants to set off in order to signal movement, but will notice at sea at the latest that they will not be able to muddle through. Or they hope that no one notices. At some point, however, someone will notice that no one is really interested in the numbers because they are completely irrelevant.

If you don’t know the port you want to sail to, no wind is the right one. (Seneca)