Python & R & Julia: Learning Scripts with COVID-19 Data, ver. 10

I. Aggregating COVID-19 data

 

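Whether I want a simple aggregation or a graph, I can do most things with R's Tidyverse. In Python and Julia, though, I have not memorized the scripts, so I have to look everything up each time, which is tedious.

So I am going to aggregate the latest COVID-19 data over and over, typing the scripts out by hand, until I have memorized the script for each language.

I will write scripts for the following items.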

 

1. World

 

1_1. Worldwide trend in confirmed cases and deaths

1_2. Top 10 countries by confirmed cases

1_3. Top 10 countries by deaths

 

2. Selected countries

 

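2_1. Latest confirmed cases in the selected countries

2_2. Latest deaths in the selected countries

2_3. Confirmed cases relative to the number of tests in the selected countries

2_4. Confirmed cases relative to population in the selected countries

2_5. Deaths relative to population in the selected countries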

 

II. Python

 

1. World

 

Script

 

# import libraries 
import pandas as pd
from matplotlib import pyplot as plt
from covid19dh import covid19
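# load the data
# (assumption: some newer covid19dh releases return a (data, sources) tuple;
#  if so, use `x, src = covid19(verbose=False)` instead)
x = covid19(verbose=False)
x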
# show column names; you can also use list(x.columns)
for col in x.columns: 
    print(col)
# group the data by day, and sum the two numeric columns we plot for each day
world = x.groupby('date')[['confirmed', 'deaths']].sum()
world
# create a multiple line plot for confirmed and deaths
world.reset_index().plot('date',['confirmed','deaths'],
                         kind = 'line',
                         title = "Confirmed and Deaths in the World 20200718")
# Set the x and y-axis label
plt.xlabel('Day')
plt.ylabel('Number')
plt.show()

# confirmed top 10 countries
confirmed10 = x.query('date == "2020-07-18"')[['iso_alpha_3', 'confirmed']].sort_values(by='confirmed', ascending=False).head(10)
confirmed10
# create a bar plot for confirmed
confirmed10.plot(x = 'iso_alpha_3', 
                     y = 'confirmed',
                     kind = 'bar',
                     title = 'Confirmed Top 10 Countries 20200718',
                     rot=0
                     )

# deaths top 10 countries
deaths10 = x.query('date == "2020-07-18"')[['iso_alpha_3', 'deaths']].sort_values(by='deaths', ascending=False).head(10)
deaths10
# create a bar plot for deaths
deaths10.plot.bar(x = 'iso_alpha_3',
                  y = 'deaths',
                  title = 'Deaths Top 10 Countries 20200718',
                  rot=0)
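
The script above hard-codes the date "2020-07-18". As a minimal sketch, assuming the same column names (`date`, `iso_alpha_3`, `confirmed`, `deaths`), the latest date can be taken from the data itself, and `nlargest` can stand in for `sort_values(...).head(...)`. The frame `toy` and its numbers are made up for illustration; they are not real statistics.

```python
import pandas as pd

# Made-up numbers for illustration only; the real frame comes from covid19().
toy = pd.DataFrame({
    "date": pd.to_datetime(["2020-07-17", "2020-07-17", "2020-07-17",
                            "2020-07-18", "2020-07-18", "2020-07-18"]),
    "iso_alpha_3": ["AAA", "BBB", "CCC", "AAA", "BBB", "CCC"],
    "confirmed": [100, 250, 40, 120, 300, 45],
    "deaths": [5, 10, 1, 6, 12, 1],
})

# Daily world totals: select the numeric columns before summing.
world = toy.groupby("date")[["confirmed", "deaths"]].sum()

# Use the most recent date in the data instead of a hard-coded string.
latest = toy["date"].max()
latest_rows = toy[toy["date"] == latest]

# nlargest(n, column) is shorthand for sort_values(..., ascending=False).head(n).
top2 = latest_rows.nlargest(2, "confirmed")[["iso_alpha_3", "confirmed"]]

print(world)
print(top2)
```

With real data the most recent date may not yet be reported for every country, so checking the row count of `latest_rows` before ranking is probably wise.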

 

1_1. Worldwide trend in confirmed cases and deaths

On the graph, a tick of 1.4 with the 1e7 multiplier means 14,000,000.
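
The "e" notation on the axis is the same one Python uses for floats, so the conversion can be checked directly. A small sketch using only the standard library:

```python
# "1.4e7" means 1.4 times 10 to the 7th power.
value = float("1.4e7")

# Format the same number with thousands separators and in scientific notation.
as_plain = f"{value:,.0f}"
as_sci = f"{value:.1e}"

print(as_plain)
print(as_sci)
```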

 

1_2. Top 10 countries by confirmed cases

1_3. Top 10 countries by deaths

 

2. Eight selected countries

 

Script

 

# infected in the selected countries
selected8_infected = x.query('date == "2020-07-18" and iso_alpha_3 in ["AUS","KOR","JPN","NPL","PER","TWN","USA","VNM"]')[['iso_alpha_3', 'confirmed']].sort_values(by='confirmed', ascending=False).head(10)
selected8_infected
selected8_infected.plot.bar(x = 'iso_alpha_3',
                            y = 'confirmed',
                            title = 'Confirmed in 8 Countries 20200718',
                            rot=0,
                            logy=True)  
# deaths in the selected countries
selected8_deaths = x.query('date == "2020-07-18" and iso_alpha_3 in ["AUS","KOR","JPN","NPL","PER","TWN","USA","VNM"]')[['iso_alpha_3', 'deaths']].sort_values(by='deaths', ascending=False).head(10)
selected8_deaths
selected8_deaths.plot.bar(x = 'iso_alpha_3',
                          y = 'deaths',
                          title = 'Deaths in 8 Countries 20200718',
                          rot=0,
                          logy=True)

 

2_1. Latest confirmed cases in the selected countries
The y-axis uses a log scale.
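
To see why `logy=True` helps when the countries differ by several orders of magnitude, here is a small sketch with hypothetical counts (the country names and numbers are invented for illustration):

```python
import math

# Hypothetical counts, chosen only to span several orders of magnitude.
counts = {"Country A": 400, "Country B": 25_000, "Country C": 3_700_000}

# On a linear axis, bar heights are proportional to the raw counts,
# so the smallest bar is a tiny fraction of the tallest one.
largest = max(counts.values())
linear_share = {name: n / largest for name, n in counts.items()}

# On a log axis, the bar tops are positioned by log10(count),
# which compresses the range so that every bar stays visible.
log_height = {name: math.log10(n) for name, n in counts.items()}

for name in counts:
    print(name, f"{linear_share[name]:.5f}", f"{log_height[name]:.2f}")
```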

 

2_2. Latest deaths in the selected countries
The y-axis uses a log scale.

Not complete
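
Items 2_4 and 2_5 are not yet written for Python. As a rough sketch only, the per-population figures could be computed in pandas with the same formula the R section uses (`confirmed/population*100000`). It assumes the frame has `confirmed`, `deaths`, and `population` columns; the frame `toy`, the country codes, and all numbers below are invented for illustration.

```python
import pandas as pd

# Made-up numbers for illustration only; they are not real statistics.
toy = pd.DataFrame({
    "iso_alpha_3": ["AAA", "BBB", "CCC"],
    "confirmed": [500, 20_000, 1_200],
    "deaths": [10, 400, 6],
    "population": [1_000_000, 50_000_000, 300_000],
})

# Same formula as in the R section: count / population * 100000.
per100k = toy.assign(
    confirmed_by100000=toy["confirmed"] / toy["population"] * 100_000,
    deaths_by100000=toy["deaths"] / toy["population"] * 100_000,
).sort_values(by="confirmed_by100000", ascending=False)

print(per100k[["iso_alpha_3", "confirmed_by100000", "deaths_by100000"]])
```

The resulting frame can be passed to `plot.bar` in the same way as `selected8_infected` above.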

 

III. R

 

1. World

 

Script

 

# install.packages("COVID19") 
# load the package
library(COVID19)
library(tidyverse)
# Worldwide data by country
x <- covid19()
# column names
colnames(x)

## confirmed in the World
world_byday <- x %>%
  select(date, confirmed, deaths) %>% 
  group_by(date) %>% 
  summarise(
    confirmed_byday = sum(confirmed),
    deaths_byday = sum(deaths)
    )
# convert the data from wide to long
world_byday_long <- world_byday %>% 
  pivot_longer(
    confirmed_byday:deaths_byday,
    names_to = "var1",
    values_to = "number"
  )
# create a multiple line chart
ggplot(world_byday_long, aes(x=date, y=number, colour=var1)) +
  geom_line() +
  labs(title = "Confirmed and Deaths in the World 20200720")

## confirmed top 10 countries
confirmed10 <- x %>% 
  select(id, date, administrative_area_level_1, confirmed) %>% 
  filter(date == "2020-07-20") %>% 
  arrange(desc(confirmed)) %>%
  head(10)
# create a bar chart
p1 <- ggplot(confirmed10, aes(x=reorder(administrative_area_level_1, confirmed), confirmed,
                        fill=administrative_area_level_1))
p1 + geom_bar(stat="identity") +
  guides(fill="none") +
  xlab("Country") +
  ylab("Confirmed") +
  ggtitle("COVID-19 TOTAL CONFIRMED 20200720") +
  coord_flip()

## deaths top 10 countries
deaths10 <- x %>% 
  select(id, date, administrative_area_level_1, deaths) %>% 
  filter(date == "2020-07-20") %>% 
  arrange(desc(deaths)) %>%
  head(10)
# create a bar chart
p2 <- ggplot(deaths10, aes(x=reorder(administrative_area_level_1, deaths), deaths,
                           fill=administrative_area_level_1))
p2 + geom_bar(stat="identity") +
  guides(fill="none") +
  xlab("Country") +
  ylab("Deaths") +
  ggtitle("COVID-19 TOTAL Deaths 20200720") +
  coord_flip()

## Confirmed in the 8 countries
countries8_confirmed <- x %>%
  select(id, date, iso_alpha_3, administrative_area_level_1, confirmed) %>% 
  filter(date == "2020-07-20" & iso_alpha_3 %in% c("DEU","KOR","JPN","NPL","PER","TWN","USA","VNM"))
# create a bar chart
p3 <- ggplot(countries8_confirmed, aes(x=reorder(administrative_area_level_1, confirmed), confirmed, fill=administrative_area_level_1))
p3 + geom_bar(stat="identity") +
  guides(fill="none") +
  xlab("Country") +
  ylab("Confirmed") +
  ggtitle("Confirmed 20200720") +
  coord_flip()
## Deaths in the 8 countries 
countries8_deaths <- x %>%
  select(id, date, iso_alpha_3, administrative_area_level_1, deaths) %>% 
  filter(date == "2020-07-20" & iso_alpha_3 %in% c("DEU","KOR","JPN","NPL","PER","TWN","USA","VNM"))
# create a bar chart
p4 <- ggplot(countries8_deaths, aes(x=reorder(administrative_area_level_1, deaths), deaths, fill=administrative_area_level_1))
p4 + geom_bar(stat="identity") +
  guides(fill="none") +
  xlab("Country") +
  ylab("Deaths") +
  ggtitle("Deaths 20200720") +
  coord_flip()

## confirmed/population*100000
confirmed8_bypop <- x %>% 
  select(id, date, iso_alpha_3, administrative_area_level_1, confirmed, population) %>% 
  filter(date == "2020-07-20" & iso_alpha_3 %in% c("DEU","KOR","JPN","NPL","PER","TWN","USA","VNM")) %>% 
  ungroup() %>% 
  transmute(country = administrative_area_level_1, confirmed_by100000 = confirmed/population*100000) %>% 
  arrange(desc(confirmed_by100000))
# create a bar plot 
p5 <- ggplot(confirmed8_bypop, mapping = aes(x=reorder(country, confirmed_by100000), confirmed_by100000,
                                          fill = country))
p5 + geom_bar(stat="identity") +
  guides(fill="none") +
  xlab("Country") +
  ylab("Confirmed per 100,000") +
  ggtitle("Confirmed/Population(100,000) 20200720") +
  coord_flip()
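
For comparison with the wide-to-long step in the R script, pandas has a counterpart to `pivot_longer` in `melt`. A minimal sketch, reusing the R column names (`confirmed_byday`, `deaths_byday`, `var1`, `number`) on a made-up two-day frame:

```python
import pandas as pd

# Made-up daily totals for illustration only.
world_byday = pd.DataFrame({
    "date": pd.to_datetime(["2020-07-19", "2020-07-20"]),
    "confirmed_byday": [1000, 1100],
    "deaths_byday": [50, 55],
})

# melt() reshapes wide to long, like tidyr's pivot_longer().
world_byday_long = world_byday.melt(
    id_vars="date",
    value_vars=["confirmed_byday", "deaths_byday"],
    var_name="var1",      # corresponds to names_to in pivot_longer
    value_name="number",  # corresponds to values_to in pivot_longer
)

print(world_byday_long)
```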

 

1_1. Worldwide trend in confirmed cases and deaths

 

1_2. Top 10 countries by confirmed cases

 

1_3. Top 10 countries by deaths

 

2. Eight selected countries

 

Script

 

## Confirmed in the 8 countries
countries8_confirmed <- x %>%
  select(id, date, iso_alpha_3, administrative_area_level_1, confirmed) %>% 
  filter(date == "2020-07-20" & iso_alpha_3 %in% c("AUT","KOR","JPN","NPL","PER","TWN","USA","VNM"))
# create a bar chart
p3 <- ggplot(countries8_confirmed, aes(x=reorder(administrative_area_level_1, confirmed), confirmed, fill=administrative_area_level_1))
p3 + geom_bar(stat="identity") +
  guides(fill="none") +
  xlab("Country") +
  ylab("Confirmed") +
  ggtitle("Confirmed 20200720") +
  coord_flip()
## Deaths in the 8 countries 
countries8_deaths <- x %>%
  select(id, date, iso_alpha_3, administrative_area_level_1, deaths) %>% 
  filter(date == "2020-07-20" & iso_alpha_3 %in% c("AUT","KOR","JPN","NPL","PER","TWN","USA","VNM"))
# create a bar chart
p4 <- ggplot(countries8_deaths, aes(x=reorder(administrative_area_level_1, deaths), deaths, fill=administrative_area_level_1))
p4 + geom_bar(stat="identity") +
  guides(fill="none") +
  xlab("Country") +
  ylab("Deaths") +
  ggtitle("Deaths 20200720") +
  coord_flip()

 

2_1. Latest confirmed cases in the selected countries

 

2_2. Latest deaths in the selected countries

 

2_4. Confirmed cases relative to population in the selected countries

I swapped Austria and Germany.

To be continued.

About shibatau

I was born and raised in Kyoto. I studied Western philosophy at university and specialized in analytic philosophy, especially Ludwig Wittgenstein, in graduate school. I'm interested in new technology, especially machine learning, and have been learning R for two years; I began learning Python last summer. Listening to Paramore, Sia, Amazarashi and Miyuki Nakajima. Favorite movies I've recently seen: "FREEHELD". Favorite actors and actresses: Anthony Hopkins, Denzel Washington, Ellen Page, Meryl Streep, Mia Wasikowska and Robert DeNiro. Favorite books: Fyodor Mikhailovich Dostoyevsky, "The Karamazov Brothers", Shinran, "Lamentations of Divergences". Favorite phrase: Salvation by Faith. Twitter: @shibatau
