Up front: I am not a fully trained statistician or market researcher. I do work passionately with numbers and bury myself in statistics books even in my spare time, but the more you know, the more you realize what you don't know. I don't claim to have all the answers, and I am always grateful when smarter people give feedback. Still, I believe that any reasonably clear-thinking person can understand, without an introductory statistics course, when data has not been collected sensibly or when wrong conclusions are drawn from it.
Category: Data Science
After the esteemed colleagues at Sistrix put the headline “Content-Marketing: Blogs sind für SEO ungeeignet” (“Content marketing: blogs are unsuitable for SEO”) out into the world, I too had to read it twice… how could they claim such a thing? 🙂 However, the esteemed Hanns did not mean WordPress as a CMS, but blogs as a content format. For some, this was proof that WordPress is nonsense, even though that was not the intention of the post. Did anyone have numbers? No. So: data trumps opinion.
With the Facebook-Cambridge Analytica scandal in mind, it is obvious that FB holds a lot of data that is of interest to researchers. Whilst Facebook regards itself as a targeting platform and not an insights platform (that’s what FB employees told me), the FB Ad Manager allows advertisers to see how interests are connected just by playing around with that data. However, no raw data is provided.
In order to create a good survey, you should
- do a pre-study with experts
- test your survey or interview with test users and ask them for feedback
Google Optimize is part of the Google Analytics 360 Suite, and the free version allows website owners to run 5 tests simultaneously or to use it as a means of personalization. Having said that, the free version does not allow the use of Google Analytics segments, so onsite retargeting is impossible.
One of the main differences between Google Optimize and most of the other test tools out there is its use of Bayesian inference rather than frequentist methods; this will be further explained in the next section.
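To give a rough idea of what a Bayesian reading of a test result looks like, here is a minimal sketch. It is not Google Optimize's actual model, which is not described here; it assumes a simple Beta-Binomial setup with a uniform Beta(1, 1) prior, and the function name `prob_b_beats_a` as well as the visitor and conversion counts are hypothetical.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20000, seed=42):
    """Estimate P(rate_B > rate_A) by sampling from Beta posteriors.

    Assumption: uniform Beta(1, 1) prior on each conversion rate, so the
    posterior is Beta(1 + conversions, 1 + non-conversions).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    wins = 0
    for _ in range(draws):
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        if rate_b > rate_a:
            wins += 1
    return wins / draws


# Hypothetical test data: 1,000 visitors per variant
p = prob_b_beats_a(conv_a=50, n_a=1000, conv_b=65, n_b=1000)
print(f"Estimated probability that B beats A: {p:.2f}")
```

Instead of a p-value, the output is a direct statement of the kind "variant B is better than A with probability p", which is usually easier to communicate to non-statisticians.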
Web analytics is a subset of data analysis, although it also draws on data that do not come from a website alone. Often enough, other marketing data is connected, which requires additional knowledge about the increasing complexity of marketing technology. Without such expertise, the analysis and interpretation of such data is difficult if not impossible. And while the focus here has been on data mining and some basic statistics, we see more and more machine learning entering this space in order to cluster users and predict their behavior.
In addition, data questions become more complex and often can no longer be answered with analytics packages alone. As an example, a typical question could be which products are bought together in a shopping basket and how likely it is to see them together.
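The shopping basket question can be sketched with the usual association measures support, confidence and lift. The following is only an illustration: the baskets are made up, and `pair_metrics` is a hypothetical helper, not a function from any analytics package.

```python
from collections import Counter
from itertools import combinations


def pair_metrics(baskets):
    """Compute support, confidence and lift for every product pair.

    baskets: list of lists of product names (one list per order).
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within one order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    results = {}
    for (a, b), count in pair_counts.items():
        support = count / n                  # share of baskets with both
        confidence = count / item_counts[a]  # P(b in basket | a in basket)
        lift = support / ((item_counts[a] / n) * (item_counts[b] / n))
        results[(a, b)] = {
            "support": support,
            "confidence": confidence,
            "lift": lift,
        }
    return results


# Hypothetical orders, for illustration only
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "jam"],
    ["bread", "jam"],
    ["milk"],
    ["butter", "milk"],
]
metrics = pair_metrics(baskets)
for pair, m in sorted(metrics.items()):
    print(pair, {k: round(v, 2) for k, v in m.items()})
```

A lift above 1 means the two products appear together more often than would be expected if they were bought independently; a lift below 1 means less often.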
Data is the new oil (external link), and data scientist is the sexiest job of the 21st century (external link). But, in fact, the comparison with oil is not a good one, because oil is burnt and worthless after being used, whereas data can become more valuable by being used. And the “sexiness” of R or Python is probably only seen by programmers, not by the average person.
In this course, we will look at how data is collected on the web, how it is analyzed and what applications can be built based on this data.
SimilarWeb claims to provide data about the visitors of almost every site on the web. According to their website, they use panels, ISP traffic, sites that submit their traffic directly to SimilarWeb, and crawlers (it is unclear what crawlers can contribute to understanding the internet traffic of a site).
The panels are, for example, based on a browser plugin that every user can add to her or his browser. It displays some basic numbers from SimilarWeb for each site that this user visits. Having said that, it is obvious that users who install such a browser extension are not the average user, and it has been shown that this data is skewed (see also this article written by Matt Cutts about Alexa, an Amazon service showing web traffic data, not to be confused with today’s Alexa).
Google AdWords is a product for advertisers that enables them to bid on search terms where they want their ads to be displayed (in fact, AdWords is also used for Display advertising, not only for search). AdWords includes a Keyword Planner that allows advertisers to examine how often a search term has been searched for in the past. Looking at search volume development over several years enables researchers to identify patterns that may be used for predictions.
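As a minimal sketch of how such a pattern could be spotted, the following computes a simple seasonal index per calendar month (the month's average volume divided by the overall average). The numbers are invented for illustration and are not Keyword Planner output; `seasonal_index` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean


def seasonal_index(monthly_volumes):
    """Return {month: average volume of that month / overall average}.

    monthly_volumes: list of (year, month, volume) tuples.
    An index above 1 marks months with above-average search interest.
    """
    by_month = defaultdict(list)
    for _, month, volume in monthly_volumes:
        by_month[month].append(volume)
    overall = mean(volume for _, _, volume in monthly_volumes)
    return {month: mean(vols) / overall for month, vols in sorted(by_month.items())}


# Invented data: a search term that peaks every December
data = [
    (year, month, 300 if month == 12 else 100)
    for year in (2016, 2017)
    for month in range(1, 13)
]
index = seasonal_index(data)
peak_month = max(index, key=index.get)
print(f"Peak month: {peak_month}, index: {index[peak_month]:.2f}")
```

With several years of data, a month whose index is consistently well above 1 is a candidate for a recurring seasonal peak that can feed into a forecast.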