Category: Data Science

How b4p and Statista give rise to alternative “facts”

Up front: I am not a fully trained statistician or market researcher. I do work passionately with numbers and bury myself in statistics books even in my spare time, but the more you know, the more you realize what you don't know. I don't claim to have all the answers, and I am always grateful when smarter people give feedback. Still, I believe that any reasonably clear-thinking person can understand, without an introductory statistics course, when data has not been collected sensibly or when wrong conclusions are drawn from it.

Filed under: Data Science

How strong is WordPress in the search results pages?

After the esteemed colleagues at Sistrix put out the headline “Content-Marketing: Blogs sind für SEO ungeeignet” (“Content marketing: blogs are unsuitable for SEO”), I too had to read it twice… how could they claim such a thing? 🙂 However, the esteemed Hanns was not referring to WordPress as a CMS, but to blogs as a content format. For some, this was proof that WordPress is nonsense, even though that was not the intention of the post. Did anyone have numbers? No. So: data trumps opinion.

Filed under: Data Science, SEO


With the Facebook-Cambridge Analytica scandal in mind, it is obvious that FB has lots of data that are of interest to researchers. Whilst Facebook regards itself as a targeting platform and not an insights platform (that’s what FB employees told me), the FB Ad Manager allows advertisers to see how interests are connected just by playing around with that data. However, no raw data is provided.

Filed under: Data Science


In order to create a good survey, you should

  • do a pre-study with experts
  • test your survey or interview with test users and ask them for feedback
Filed under: Data Science

Google Optimize

Google Optimize is part of the Google Analytics 360 Suite. The free version allows website owners to run 5 tests simultaneously or to use it as a means of personalization. Having said that, the free version does not allow the use of Google Analytics segments, so onsite retargeting is impossible.

One of the main differences between Google Optimize and most of the other test tools out there is its use of Bayesian inference rather than frequentist methods; this will be further explained in the next section.
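To make the contrast concrete, here is a minimal sketch of how the same A/B test result could be read in both ways. It is not how Google Optimize works internally; the function names, the uniform Beta(1, 1) prior, and the Monte Carlo approach are assumptions chosen for illustration only.

```python
import math
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Bayesian view: estimate P(rate_B > rate_A) from Beta posteriors.

    Assumes a uniform Beta(1, 1) prior for both variants (illustrative choice).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Posterior = prior updated with conversions and non-conversions
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        if rate_b > rate_a:
            wins += 1
    return wins / draws


def two_sided_p_value(conv_a, n_a, conv_b, n_b):
    """Frequentist view: two-proportion z-test with a pooled standard error."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (conv_b / n_b - conv_a / n_a) / se
    # Two-sided p-value under the standard normal distribution
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Made-up numbers: 100 of 1000 visitors convert on A, 130 of 1000 on B
    print("P(B beats A):", prob_b_beats_a(100, 1000, 130, 1000))
    print("p-value:     ", two_sided_p_value(100, 1000, 130, 1000))
```

The Bayesian function answers "how likely is it that B is better than A?", whereas the p-value answers "how surprising would this difference be if A and B were actually identical?".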

Filed under: Data Science

What is Web Analytics and what does it have to do with Data Science?

Web Analytics is a subset of data analysis, although it also uses data that do not come from a website alone. Often enough, other marketing data is connected, requiring additional knowledge about the increasing complexity of marketing technology. Without such expertise, the analysis and interpretation of such data is difficult if not impossible. And while the focus here has been on data mining and some basic statistics, we see more and more machine learning entering this space in order to cluster users and predict their behavior.

In addition, data questions are becoming more complex and often can no longer be answered with the analytics packages alone. As an example, a typical question could be which products are bought together in a shopping basket and how likely it is to see them together.
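The shopping basket question can be sketched with a few lines of Python. This is a minimal illustration that assumes the transactions are already available as lists of product names; the function name `pair_metrics` and the sample baskets are hypothetical, and a real analysis would typically use a dedicated association rule library.

```python
from collections import Counter
from itertools import combinations


def pair_metrics(baskets):
    """Return support, confidence and lift for every product pair seen together."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    metrics = {}
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both products
        metrics[(a, b)] = {
            "support": support,
            # share of baskets with a that also contain b
            "confidence_a_to_b": count / item_counts[a],
            # > 1 means the pair occurs more often than chance would suggest
            "lift": support / ((item_counts[a] / n) * (item_counts[b] / n)),
        }
    return metrics


if __name__ == "__main__":
    # Made-up example baskets
    sample = [["bread", "butter"], ["bread", "butter", "jam"], ["bread"], ["jam"]]
    for pair, values in pair_metrics(sample).items():
        print(pair, values)
```

Support tells how often two products appear together at all, while lift compares that frequency with what would be expected if the two products were bought independently of each other.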

Filed under: Data Science


Data is the new oil (external link), and data scientist is the sexiest job of the 21st century (external link). But, in fact, the comparison with oil is not a good one, because oil is burnt and worthless after being used, whereas data can become more valuable by being used. And the “sexiness” of R or Python is probably only seen by programmers, not by the average person.

In this course, we will look at how data is collected on the web, how it is analyzed and what applications can be built based on this data.

Filed under: Data Science


SimilarWeb claims to provide data about the visitors of almost every site on the web. According to their website, they use panels, ISP traffic, sites that submit their traffic directly to SimilarWeb, and crawlers (it is unclear what crawlers can contribute to understanding a site's traffic).

The panels are, for example, based on a browser plugin that every user can add to her or his browser. It will display some basic numbers from SimilarWeb for each site that this user visits. Having said that, it is obvious that users who would install such a browser extension are not the average user, and it has been shown that this data is skewed (see also this article written by Matt Cutts about Alexa, an Amazon service showing web traffic data, not to be confused with today’s Alexa).

Filed under: Data Science

Google AdWords

Google AdWords is a product for advertisers that enables them to bid on search terms where they want their ads to be displayed (in fact, AdWords is also used for Display advertising, not only for search). AdWords includes a Keyword Planner that allows advertisers to examine how often a search term has been searched for in the past. Looking at search volume development over several years enables researchers to identify patterns that may be used for predictions.
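As a rough illustration of how such patterns could feed a prediction, the sketch below averages the search volume per calendar month over several years and uses that seasonal profile as a naive forecast. It assumes the monthly volumes have already been exported from the Keyword Planner into a plain list, oldest month first; the function names are hypothetical, and no AdWords API is called.

```python
def seasonal_profile(monthly_volumes, period=12):
    """Average search volume for each position in the seasonal cycle."""
    if len(monthly_volumes) < period:
        raise ValueError("need at least one full period of data")
    sums = [0.0] * period
    counts = [0] * period
    for i, volume in enumerate(monthly_volumes):
        sums[i % period] += volume
        counts[i % period] += 1
    return [s / c for s, c in zip(sums, counts)]


def forecast_next(monthly_volumes, steps, period=12):
    """Naive seasonal forecast: repeat the averaged profile for the next steps."""
    profile = seasonal_profile(monthly_volumes, period)
    start = len(monthly_volumes)  # cycle position of the first forecast month
    return [profile[(start + k) % period] for k in range(steps)]
```

This deliberately ignores any long-term trend; it only captures recurring seasonal peaks, such as a search term that is requested more often every December.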

Filed under: Data Science