The spread of COVID-19 across countries visualization with R

栏目: IT技术 · 发布时间: 5年前

内容简介:COVID-19 or Coronavirus pandemic has a huge and unpredictable effect on our lives. I wanted to see the speed and spreading of the virus across countries. And the following is what and how I’ve seen:The animated visualization focuses on the chronology of vi

COVID-19 or Coronavirus pandemic has a huge and unpredictable effect on our lives. I wanted to see the speed and spreading of the virus across countries. And the following is what and how I’ve seen:

The spread of COVID-19 across countries visualization with R

The animated visualization focuses on the chronology of virus distribution that started in China and spread globally. For strengthening a visual effect I placed countries (top 90 of all) in two semidiagonals, based on the date when each country reached the peak daily cases of the disease (dark red grid).

For a more detailed analysis, I’ve created two stationary charts. The first is the same as the animated one but countries are ordered from bottom to top.

The spread of COVID-19 across countries visualization with R

The second centered on a day of maximum amount cases and shows how long and intensive were previous and next stages. It gives an opportunity to compare the effectiveness of different countries.

The spread of COVID-19 across countries visualization with R All values of new cases for each country were normalized via min/max normalization and ranged from 0 to 1. You can use the following R code with comments to play with the public dataset:

click to expand R code
library(tidyverse)
library(reshape2)
library(purrrlyr)

# download dataset
df <- read_csv(url('https://covid.ourworldindata.org/data/ecdc/full_data.csv'))

# normalization function
fun_normalize <- function(x) {
        return ((x - min(x)) / (max(x) - min(x)))
}

# preprocess data
df_prep <- df %>%
        filter(location != 'World') %>%
        
        group_by(location) %>%
        # remove earlier dates
        filter(date > as.Date('2020-01-15', format = '%Y-%m-%d')) %>%
        # remove coutries with less than 1000 total cases
        filter(max(total_cases) > 1000) %>%
        # replace negative values with the mean 
        mutate(new_cases = ifelse(new_cases < 0,
                                  round((lag(new_cases, default = 0) + lead(new_cases, default = 0)) / 2),
                                  new_cases)) %>%
        ungroup() %>%
        select(location, date, new_cases) %>%
        # prepare data for normalization
        dcast(., date ~ location, value.var = 'new_cases') %>%
        # replace NAs with 0
        dmap_at(c(2:ncol(.)), function(x) ifelse(is.na(x), 0, x)) %>%
        # normalization
        dmap_at(c(2:ncol(.)), function(x) fun_normalize(x)) %>%
        melt(., id.vars = c('date'), variable.name = 'country') %>%
        mutate(value = round(value, 6))

        
# define countries order for plots
country_ord_1 <- df_prep %>%
        group_by(country) %>%
        filter(value == 1) %>%
        ungroup() %>%
        arrange(date, country) %>%
        distinct(country) %>%
        mutate(is_odd = ifelse((row_number() - 1) %% 2 == 0, TRUE, FALSE))

country_ord_anim <- bind_rows(country_ord_1 %>%
                                      filter(is_odd == TRUE) %>%
                                      arrange(desc(row_number())),
                              country_ord_1 %>%
                                      filter(is_odd == FALSE))
        
# data for animated plot
df_plot_anim <- df_prep %>%
        mutate(country = factor(country, levels = c(as.character(country_ord_anim$country)))) %>%
        group_by(country) %>%
        mutate(first_date = min(date[value >= 0.03])) %>%
        mutate(cust_label = ifelse(date >= first_date, as.character(country), '')) %>%
        ungroup()


# color palette
cols <- c('#e7f0fa','#c9e2f6', '#95cbee', '#0099dc', '#4ab04a', '#ffd73e', '#eec73a', '#e29421', '#e29421', '#f05336', '#ce472e')


# Animated Heatmap plot
p <- ggplot(df_plot_anim, aes(y = country, x = date, fill = value)) +
        theme_minimal() +
        geom_tile(color = 'white', width = .9, height = .9) +
        scale_fill_gradientn(colours = cols, limits = c(0, 1),
                             breaks = c(0, 1),
                             labels = c('0', 'max'),
                             guide = guide_colourbar(ticks = T, nbin = 50, barheight = .5, label = T, barwidth = 10)) +
        
        geom_text(aes(x = first_date, label = cust_label), size = 3, color = '#797D7F') +
        scale_y_discrete(position = 'right') +
        coord_equal() +
        
        theme(legend.position = 'bottom',
              legend.direction = 'horizontal',
              plot.title = element_text(size = 20, face = 'bold', vjust = 2, hjust = 0.5),
              axis.text.x = element_text(size = 8, hjust = .5, vjust = .5, face = 'plain'),
              axis.text.y = element_blank(),
              axis.title.y = element_blank(),
              panel.grid.major = element_blank(),
              panel.grid.minor = element_blank()
              ) +
        ggtitle('The spread of COVID-19 across countries: new daily cases normalized to location maximum')


# animated chart
library(gganimate)
library(gifski)

anim <- p + 
        transition_components(date) +
        ggtitle('The spread of COVID-19 across countries: new daily cases normalized to location maximum',
                subtitle = 'Date {frame_time}') +
        shadow_mark()

animate(anim,
        nframes = as.numeric(difftime(max(df_plot_anim$date), min(df_plot_anim$date), units = 'days')) + 1,
        duration = 12,
        fps = 12,
        width = 1000,
        height = 840,
        start_pause = 5,
        end_pause = 25,
        renderer = gifski_renderer())
anim_save('covid-19.gif')



# Heatmap plot 1
df_plot_1 <- df_prep %>%
        mutate(country = factor(country, levels = c(as.character(country_ord_1$country)))) %>%
        group_by(country) %>%
        mutate(first_date = min(date[value >= 0.03])) %>%
        ungroup()

ggplot(df_plot_1, aes(y = country, x = date, fill = value)) +
        theme_minimal() +
        geom_tile(color = 'white', width = .9, height = .9) +
        scale_fill_gradientn(colours = cols, limits = c(0, 1),
                             breaks = c(0, 1),
                             labels = c('0', 'max'),
                             guide = guide_colourbar(ticks = T, nbin = 50, barheight = .5, label = T, barwidth = 10)) +
        
        geom_text(aes(x = first_date, label = country), size = 3, color = '#797D7F') +
        scale_y_discrete(position = 'right') +
        coord_equal() +
        
        theme(legend.position = 'bottom',
              legend.direction = 'horizontal',
              plot.title = element_text(size = 20, face = 'bold', vjust = 2, hjust = 0.5),
              axis.text.x = element_text(size = 8, hjust = .5, vjust = .5, face = 'plain'),
              axis.text.y = element_text(size = 6, hjust = .5, vjust = .5, face = 'plain'),
              panel.grid.major = element_blank(),
              panel.grid.minor = element_blank()
        ) +
        ggtitle('The spread of COVID-19 across countries: new daily cases normalized to location maximum')


# Heatmap plot 2
df_plot_2 <- df_prep %>%
        group_by(country) %>%
        filter(date >= min(date[value > 0])) %>%
        arrange(date, .by_group = TRUE) %>%
        mutate(centr_day = min(row_number()[value == 1]),
               n_day = row_number() - centr_day) %>%
        ungroup()

country_ord_2 <- df_plot_2 %>%
        group_by(country) %>%
        filter(date >= min(date[value == 1])) %>%
        summarise(value = sum(value)) %>%
        ungroup() %>%
        arrange(value, country) %>%
        distinct(country)

df_plot_2 <- df_plot_2 %>%
        mutate(country = factor(country, levels = c(as.character(country_ord_2$country)))) %>%
        group_by(country) %>%
        mutate(first_date = min(n_day[value >= 0.01])) %>%
        ungroup()



# Heatmap plot 2
ggplot(df_plot_2, aes(y = country, x = n_day, fill = value)) +
        theme_minimal() +
        geom_tile(color = 'white', width = .9, height = .9) +
        scale_fill_gradientn(colours = cols, limits = c(0, 1),
                             breaks = c(0, 1),
                             labels = c('0', 'max'),
                             guide = guide_colourbar(ticks = T, nbin = 50, barheight = .5, label = T, barwidth = 10)) +
        
        geom_text(aes(x = first_date, label = country), size = 3, color = '#797D7F') +
        coord_equal() +
        
        theme(legend.position = 'bottom',
              legend.direction = 'horizontal',
              plot.title = element_text(size = 20, face = 'bold', vjust = 2, hjust = 0.5),
              axis.text.x = element_text(size = 8, hjust = .5, vjust = .5, face = 'plain'),
              #axis.text.y = element_text(size = 6, hjust = .5, vjust = .5, face = 'plain'),
              axis.text.y = element_blank(),
              axis.title.y = element_blank(),
              panel.grid.major = element_blank(),
              panel.grid.minor = element_blank()
        ) +
        ggtitle('Comparison of different countries effectiveness against COVID-19 
                (new daily cases normalized to location maximum and data centered on a day with maximum new cases)')

以上就是本文的全部内容,希望本文的内容对大家的学习或者工作能带来一定的帮助,也希望大家多多支持 码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

C#本质论

C#本质论

米凯利斯 / 周靖 / 人民邮电出版社 / 2010-9 / 99.00元

《C#本质论(第3版)》是一部好评如潮的语言参考书,作者用一种非常合理的方式来组织《C#本质论(第3版)》的内容,由浅人深地介绍了C#语言的各个方面。全书共包括21章及6个附录,每章开头的“思维导图”指明了本章要讨论的主题,以及各个主题之间的层次关系。书中所包含的丰富的示例代码和精要的语言比较,都有助于读者理解C#语言。《C#本质论(第3版)》首先介绍了C#语言的基础知识,随后深人讲解了泛型、迭代......一起来看看 《C#本质论》 这本书的介绍吧!

HTML 压缩/解压工具
HTML 压缩/解压工具

在线压缩/解压 HTML 代码

HTML 编码/解码
HTML 编码/解码

HTML 编码/解码

MD5 加密
MD5 加密

MD5 加密工具