R

R is a programming language and software environment used primarily for statistics. In contrast to commercial statistics packages such as SPSS, R is open source. It is pointless to argue whether Python or R is the better language for data science, since both have pros and cons. The biggest advantage of R is its ability to work with data interactively.

You can download R from the R Project site. It is highly recommended to also install RStudio, an IDE (Integrated Development Environment) for R. RStudio does not include R itself, so R must be installed first.

Whilst RStudio allows you to import data through an easy-to-use menu, you should also learn how to import data with commands typed at the R console, since this shows you what actually happens when you use the RStudio import function. When you run into problems with that function, knowing the fundamentals will help you debug the import.
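As a minimal sketch of a command-based import, the snippet below uses base R's read.csv. The file contents, the column names and the variable names (csv_path, people) are made up for illustration; the file is written to a temporary location only so that the example runs on its own.

```r
# Create a small, hypothetical CSV file so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,height_cm", "Ada,165", "Linus,180"), csv_path)

# Import the file with base R.
# header = TRUE treats the first line as column names;
# stringsAsFactors = FALSE keeps text columns as plain character vectors.
people <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)

# Inspect what was imported: column names, types and the first values.
str(people)

# Work with a column interactively.
mean(people$height_cm)  # 172.5
```

Checking the result with str() right after an import is a quick way to spot wrongly detected column types, a common source of import problems.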

A very good introduction to R is the book R for Data Science by Garrett Grolemund and Hadley Wickham, which is available for free online. One caveat: the book relies on the wonderful tidyverse packages (packages are extensions of R). They make R much easier to learn, but can make it a bit harder to work with teams that use other approaches such as the data.table package. Before you ask questions, please RTFM: functions have help pages, and many packages also ship long-form guides, which R calls vignettes.
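As a rough sketch of how to find this documentation from the R console: the snippet below lists the vignettes of the grid package, chosen only because it ships with a standard R installation. Whether any vignettes are actually listed depends on how R was built and installed, and the variable name grid_vignettes is made up for illustration.

```r
# List the vignettes that come with one package.
grid_vignettes <- vignette(package = "grid")

# The result holds a table with one row per vignette.
grid_vignettes$results[, c("Item", "Title")]

# In an interactive session you would typically also use:
#   ?read.csv            # help page for a single function
#   vignette("grid")     # open one vignette by name
#   browseVignettes()    # browse all installed vignettes in a web browser
```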