data > opinion

Tom Alby

02 Importing Data

2022-07-31


First, load the tidyverse package, which bundles several other packages:

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6     ✔ purrr   0.3.4
## ✔ tibble  3.1.8     ✔ dplyr   1.0.9
## ✔ tidyr   1.2.0     ✔ stringr 1.4.0
## ✔ readr   2.1.2     ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
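The conflict lines in the output above mean that, once the tidyverse is attached, a bare `filter()` refers to the dplyr version. A minimal sketch of how both functions remain reachable through an explicit namespace prefix; the objects `demo`, `recent`, and `smoothed` are made up for illustration:

```r
library(dplyr)

# Small invented table, used only for this illustration
demo <- tibble(year = c(1894, 1906, 1912))

# After attaching dplyr, filter() is dplyr::filter(): it keeps matching rows
recent <- filter(demo, year > 1900)

# The masked base function is still available with its namespace prefix;
# here it computes a centered moving average over three values
smoothed <- stats::filter(c(1, 2, 3, 4, 5), rep(1 / 3, 3))

print(recent)
print(smoothed)
```

Writing `dplyr::filter()` explicitly works the same way and can make scripts easier to read when both packages are in use.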

Adjust the path depending on where you saved the data:

IMDb_movies <- read_csv("data/IMDb_movies.csv")
## Warning: One or more parsing issues, see `problems()` for details
## Rows: 85855 Columns: 22
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (15): imdb_title_id, title, original_title, date_published, genre, count...
## dbl  (7): year, duration, avg_vote, votes, metascore, reviews_from_users, re...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
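The warning above points to `problems()`. The following is a minimal sketch of how such an inspection might look. It uses a small invented file, not the real IMDb data, in which one `year` value is not a number. The file contents and the names `csv_path`, `movies_dbl`, and `movies_chr` are assumptions made for illustration.

```r
library(readr)

# Hypothetical stand-in for data/IMDb_movies.csv; all values are invented
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "imdb_title_id,year,duration",
  "tt0000001,1894,45",
  "tt0000002,1906,70",
  "tt0000003,TV Movie 2019,95"
), csv_path)

# Forcing `year` to be numeric triggers a parsing issue for the last row
movies_dbl <- read_csv(
  csv_path,
  col_types = cols(
    imdb_title_id = col_character(),
    year = col_double(),
    duration = col_double()
  )
)

# problems() lists the row, the column, and the expected vs. actual value
year_problems <- problems(movies_dbl)
print(year_problems)

# Reading `year` as text keeps the raw value so it can be cleaned up later
movies_chr <- read_csv(
  csv_path,
  col_types = cols(
    imdb_title_id = col_character(),
    year = col_character(),
    duration = col_double()
  )
)
print(movies_chr)
```

With the real file, `problems(IMDb_movies)` would be the corresponding call. Passing `col_types` explicitly also silences the column specification message, as the output above suggests.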

Display the first rows of the dataset:

head(IMDb_movies)
## # A tibble: 6 × 22
##   imdb_title…¹ title origi…²  year date_…³ genre durat…⁴ country langu…⁵ direc…⁶
##   <chr>        <chr> <chr>   <dbl> <chr>   <chr>   <dbl> <chr>   <chr>   <chr>  
## 1 tt0000009    Miss… Miss J…  1894 1894-1… Roma…      45 USA     None    Alexan…
## 2 tt0000574    The … The St…  1906 1906-1… Biog…      70 Austra… None    Charle…
## 3 tt0001892    Den … Den so…  1911 1911-0… Drama      53 German… <NA>    Urban …
## 4 tt0002101    Cleo… Cleopa…  1912 1912-1… Dram…     100 USA     English Charle…
## 5 tt0002130    L'In… L'Infe…  1911 1911-0… Adve…      68 Italy   Italian France…
## 6 tt0002199    From… From t…  1912 1913    Biog…      60 USA     English Sidney…
## # … with 12 more variables: writer <chr>, production_company <chr>,
## #   actors <chr>, description <chr>, avg_vote <dbl>, votes <dbl>, budget <chr>,
## #   usa_gross_income <chr>, worlwide_gross_income <chr>, metascore <dbl>,
## #   reviews_from_users <dbl>, reviews_from_critics <dbl>, and abbreviated
## #   variable names ¹​imdb_title_id, ²​original_title, ³​date_published, ⁴​duration,
## #   ⁵​language, ⁶​director
## # ℹ Use `colnames()` to see all variable names
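Because the printed tibble abbreviates and hides some columns, it can help to list the column names separately, as the last line of the output suggests. A minimal sketch on a small stand-in table; `movies_demo` is a hypothetical object that reuses a few values visible in the output above:

```r
library(dplyr)

# Tiny stand-in for IMDb_movies with a few of the columns shown above
movies_demo <- tibble(
  imdb_title_id = c("tt0000009", "tt0002101"),
  year = c(1894, 1912),
  duration = c(45, 100),
  country = c("USA", "USA"),
  language = c("None", "English")
)

# All column names as a character vector
print(colnames(movies_demo))

# One line per column, with its type and first values
glimpse(movies_demo)
```

Applied to the real data, `colnames(IMDb_movies)` and `glimpse(IMDb_movies)` would show all 22 columns without truncating their names.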