Data Science & Data Analysis

Work in Progress!

This is a growing collection of information and resources on data science, data analysis and web analytics for my course at the HAW. Some parts of the lecture notes will be published here.

What is Data Science?

There is no official definition of Data Science (much as there is none for “Big Data”); we will regard data science as the combination of different disciplines such as data mining, statistics and machine learning in order to derive information from data automatically. Whilst many of the approaches used in these fields have existed for a long time, the growing number of free programming libraries, together with cheap computing time and storage space from cloud providers such as AWS, has enabled more people to work with huge amounts of data or with complex data.

Data Analytics or Data Analysis can be regarded as a subset of Data Science that focuses on the analysis of data. Because it is very similar to statistics, the term “data analysis” is sometimes regarded as old wine in new bottles. Data analysis does not require huge and complex data, often termed “big data”. Most often, data quality is a greater constraint than data quantity.

Web Analytics is a subset of data analysis; however, it also uses data that does not come from a website alone. Often enough, other marketing data is connected, which requires additional knowledge about increasingly complex marketing technology. Without such expertise, analysing and interpreting such data is difficult if not impossible. And while the focus here has been on data mining and some basic statistics, we see more and more machine learning entering this space.

What we will cover

  • What exactly is data?
  • Basic Concepts
    • Understanding the business problem
    • Preparation Phase: Acquiring and checking the data
    • Analysis Phase: Building models
    • Reflection Phase: Reviewing results and looking at alternative models
    • Dissemination Phase: Reporting results
  • Fundamentals of Statistics
  • Data Science Tools
  • Basic Data Science Approaches
    • Supervised and Unsupervised Learning
    • Classification and Clustering
    • Predictive Analytics
    • Support Vector Machines and Neural Networks
  • Data Science in Online Marketing
    • Key Terms
    • Areas of Online Marketing
    • Static and Dynamic Attribution
    • Analytics Systems
    • Application Example: Customer Clustering
  • Data Science Resources