**This is the (d)english version; pure language versions will follow soon. Work in Progress!**

This is a growing collection of data science, data analysis and web analytics information and resources for my course at the HAW. Some parts of the script will be published here. Please find the EMIL room for the summer term 2017 here.

# What is Data Science?

There is no official definition of **Data Science** (just as there is none for “Big Data”). In this course, we regard data science as the combination of disciplines such as data mining, statistics and machine learning, used to derive information from data automatically. While many of the approaches used in these fields have existed for a long time, the growing number of free programming libraries and the cheap computing time and storage space from providers such as AWS have enabled more people to work with huge or complex data sets.

**Data Analytics** or **Data Analysis** can be regarded as a subset of Data Science that focuses on the analysis of data. Because it is very similar to statistics, the term “data analysis” is sometimes regarded as old wine in new bottles. Data analysis does not require huge and complex data, often termed “big data”. Most often, data quality is more of a constraint than data quantity. There is also no official definition of “big data”, and a data set should not be called “big” data merely because it contains a lot of data. Some people even say there is no such thing as big data.

**Web Analytics** is a subset of data analysis, although it also uses data that does not come from a website alone. Often, other marketing data is connected, which requires additional knowledge about increasingly complex marketing technology. Without such expertise, analysing and interpreting this data is difficult if not impossible. And while the focus in this field has so far been on data mining and some basic statistics, more and more machine learning is entering this space.

# What we will cover

- What exactly is data?
- Basic Concepts
- Statistics Basics
- Different flavours of statistics
- Mean, Median and Mode
- Distribution, Variance and Standard Deviation
- Hypothesis Testing (Signifikanz-Tests in German)
- Correlation
- Regression

- Data Science Tools
- Data Acquisition
- Open Data
- Surveys
- Log Files
- Cookies and Pixels
- Fingerprinting
- Logins
- Web Analytics Systems: Google Analytics / Piwik
- Tag Management Systems
- Scraping/Crawling

- Basic Data Science Approaches
- Supervised and Unsupervised Learning
- Classification and Clustering
- Predictive Analytics
- Support Vector Machines and Neural Networks

- Data Science in Online Marketing
- Important Terms
- Online Marketing Basics
- Attribution
- Recommendation Systems

- Data Science Resources