GGally and pairs Correlation Plots: The Most Complete Guide (Part 1)

Category: Programming Tools · Published: 5 years ago


Author: 李誉辉

Graduate student at Sichuan University

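Introduction

Relationships among several variables are often visualized with a pairs plot (a matrix of pairwise plots). Base R ships with the pairs() function, which can draw such a plot but is relatively complicated to use, so we first introduce the GGally package, which is built on ggplot2. The pairs() function is covered after that.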
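As a point of reference before moving on to GGally, here is a minimal sketch of a base R pairs() call. The tips data from the reshape package (the dataset used in the rest of this article) and the colors are assumptions chosen for illustration only.

```r
# Load the tips data set that ships with the reshape package
data(tips, package = "reshape")

# Keep only the numeric columns, since pairs() draws scatter plots
num_cols <- names(tips)[sapply(tips, is.numeric)]

# One scatter plot per pair of numeric columns, colored by sex
pairs(
  tips[, num_cols],
  col  = ifelse(tips$sex == "Female", "tomato", "steelblue"),
  pch  = 19,
  main = "Base R pairs()"
)
```

Anything beyond this, such as mixing discrete variables into the matrix, has to be written by hand through panel functions, which is the complexity the introduction refers to.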

1. ggmatrix()

ggmatrix() arranges several ggplot2 plot objects in a matrix layout.

1.1 First column of the matrix

library(ggplot2)
data(tips, package = "reshape")  # the tips data set ships with the reshape package

head(tips)

# Plot 1: density of total_bill, filled by sex
g1 <- ggplot(tips, aes(x = total_bill, fill = sex)) +
  geom_density(show.legend = FALSE)

# Plot 2: stacked histogram of total_bill, faceted into rows by time
g2 <- ggplot(tips, aes(x = total_bill, fill = sex)) +
  geom_histogram(position = position_stack(), show.legend = FALSE) +
  facet_grid(rows = vars(time))

# Plot 3: scatter plot of tip against total_bill, colored by sex
g3 <- ggplot(tips, aes(x = total_bill, y = tip, color = sex)) +
  geom_point(show.legend = FALSE)


1.2 Second column of the matrix

library(ggplot2)

# Plot 4: box plot of total_bill by time, filled by sex
g4 <- ggplot(tips, aes(x = time, y = total_bill, fill = sex)) +
  geom_boxplot(show.legend = FALSE)

# Plot 5: stacked bar chart of counts by time
g5 <- ggplot(tips, aes(x = time, fill = sex)) +
  geom_bar(position = position_stack(), show.legend = FALSE)

# Plot 6: flipped histogram of tip, faceted into columns by time
g6 <- ggplot(tips, aes(x = tip, fill = sex)) +
  geom_histogram(position = position_stack(), show.legend = FALSE) +
  coord_flip() +
  facet_grid(cols = vars(time))

1.3 Third column of the matrix

library(ggplot2)
library(dplyr)

# Plot 7: correlation between total_bill and tip, overall and by sex, shown as text
text_1 <- round(cor(tips$total_bill, tips$tip), 3)
tips_female <- tips %>% filter(sex == "Female")
tips_male <- tips %>% filter(sex == "Male")
text_2 <- round(cor(tips_female$total_bill, tips_female$tip), 3)
text_3 <- round(cor(tips_male$total_bill, tips_male$tip), 3)
mytext <- c(text_1, text_2, text_3)
mytext <- paste0(c("Cor", "Female", "Male"), ":", mytext)
mytext <- data.frame(text = mytext,
                     x = 5,
                     y = c(6, 4, 2),
                     stringsAsFactors = FALSE)

# The per-sex rows are colored; the overall row is drawn in black
g7 <- ggplot(data = mytext[-1, ], aes(x = x, y = y, label = text, color = text)) +
  geom_text(show.legend = FALSE) +
  geom_text(data = mytext[1, ], aes(x = x, y = y, label = text),
            color = "black")

rm(text_1, tips_female, tips_male, text_2, text_3, mytext)

# Plot 8: flipped box plot of tip by time
g8 <- ggplot(tips, aes(x = time, y = tip, fill = sex)) +
  geom_boxplot(show.legend = FALSE) +
  coord_flip()

# Plot 9: density of tip, filled by sex
g9 <- ggplot(tips, aes(x = tip, fill = sex)) +
  geom_density(show.legend = FALSE)

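1.4 Combining the plots with customLayout

library(customLayout)

# Create the layout: a 3 x 3 grid whose cells are numbered column by column
mylay <- lay_new(
  mat = matrix(1:9, ncol = 3))

plot_list <- list(g1, g2, g3, g4, g5, g6, g7, g8, g9)

# Draw the list of ggplot2 plots into the layout mylay
lay_grid(plot_list, mylay)

# plot_list is kept because the next section reuses it
rm(g1, g2, g3, g4, g5, g6, g7, g8, g9, mylay)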


1.5 Combining the plots with ggmatrix

library(GGally)

gg_m <- ggmatrix(
  plots = plot_list,       # list of plot objects
  nrow = 3, ncol = 3,      # number of rows and columns
  xAxisLabels = c("Total Bill", "Time of Day", "Tip"),
  yAxisLabels = c("Total Bill", "Time of Day", "Tip"),
  byrow = FALSE,           # fill the matrix column by column
  title = "Plots combined with ggmatrix"
)

# Add a theme
gg_m + theme_bw()

# Extract a panel; only one panel can be extracted at a time
gg_m[1, 2]

rm(plot_list, gg_m)
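Besides extracting a panel, a ggmatrix also accepts assignment to a single cell. The sketch below is a small, self-contained illustration with two made-up panels (p_scatter and p_density are names introduced here, not objects from the article).

```r
library(ggplot2)
library(GGally)
data(tips, package = "reshape")

# Two illustrative panels
p_scatter <- ggplot(tips, aes(total_bill, tip)) + geom_point()
p_density <- ggplot(tips, aes(total_bill)) + geom_density()

# A 1 x 2 plot matrix
pm <- ggmatrix(
  list(p_scatter, p_density),
  nrow = 1, ncol = 2,
  xAxisLabels = c("Scatter", "Density")
)

# Extracting a cell returns an ordinary ggplot object
first_panel <- pm[1, 1]

# Assigning to a cell replaces that panel in the matrix
pm[1, 2] <- p_density + theme_minimal()
pm
```

Because the extracted panel is a regular ggplot object, it can be modified with the usual ggplot2 layers and then written back into the matrix.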


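2. ggpairs()

GGally extends ggplot2 with several functions that reduce the complexity of combining geoms with transformed data.

These functions include a pairwise plot matrix, a scatter plot matrix, a parallel coordinates plot, a survival plot, and several functions for plotting networks.

2.1 Syntax and key parameters

Syntax: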

ggpairs(data, mapping = NULL, columns = 1:ncol(data), title = NULL,
  upper = list(continuous = "cor", combo = "box_no_facet",
    discrete = "facetbar", na = "na"),
  lower = list(continuous = "points", combo = "facethist",
    discrete = "facetbar", na = "na"),
  diag = list(continuous = "densityDiag", discrete = "barDiag", na = "naDiag"),
  params = NULL, ...,
  xlab = NULL, ylab = NULL, axisLabels = c("show", "internal", "none"),
  columnLabels = colnames(data[columns]), labeller = "label_value",
  switch = NULL, showStrips = NULL, legend = NULL,
  cardinality_threshold = 15, progress = NULL,
  legends = stop("deprecated"))

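Key parameters:

  • mapping: an aes() mapping that is added on top of the x and y mappings of every panel; this is the global mapping.

  • columns: the columns to plot, given either as column indices or as column names.

  • columnLabels: the labels shown for the selected columns.

  • title, xlab, ylab: the title and the axis titles of the plot matrix.

  • lower, upper: the plot types for the lower and upper triangles, each passed as a list.

  • diag: the plot types for the diagonal, passed as a list.

  • axisLabels: where the variable names are displayed. The default "show" displays them on the outside,

    "internal" displays them inside the diagonal panels, and "none" hides them.

  • labeller: the labeller used for the facet strip labels (default "label_value").

  • switch: the position of the strip labels, as in ggplot2::facet_grid(). By default they appear at the top and on the right;

    switch = "x" puts them at the bottom and on the right, switch = "y" at the top and on the left,

    and switch = "both" at the bottom and on the left.

  • showStrips: a logical value that decides whether the panels show their strips. The default NULL shows only the top and right strips;

    TRUE shows every strip and FALSE hides them all.

  • legend: the default NULL shows no legend. The legend position can be adjusted with theme(legend.position = "bottom").

    There are three ways to specify the legend:

    • A numeric vector of length 2: the row and column of the panel whose legend is used for the whole matrix. For example, c(2, 3) uses the legend of the panel in row 2, column 3.

    • A single number: the position of the panel in display order whose legend is used.

      For example, legend = 3 refers to the panel in row 1, column 3 (assuming the matrix is filled by row and has at least three columns).

    • A legend extracted beforehand from a ggplot2 object with grab_legend(), passed to legend.

  • cardinality_threshold: the maximum number of levels allowed in a factor column, 15 by default. NULL disables the check.

  • progress: whether to show a progress bar. With the default NULL, a progress bar is shown when there are more than 15 plots.

    It has no effect on the resulting plot and can usually be ignored.

    TRUE shows the progress bar and FALSE hides it;

    a progress bar created with ggmatrix_progress() can also be supplied.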
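A few of these parameters can be combined in one call. The following is a minimal sketch, assuming the tips data from the reshape package; the particular choices (legend taken from the first diagonal panel, strip labels switched to the bottom and left) are illustrative only.

```r
library(ggplot2)
library(GGally)
data(tips, package = "reshape")

pm <- ggpairs(
  tips,
  mapping  = aes(color = sex),                  # global mapping
  columns  = c("total_bill", "time", "tip"),    # columns selected by name
  legend   = c(1, 1),      # reuse the legend of the panel in row 1, column 1
  switch   = "both",       # strip labels at the bottom and on the left
  progress = FALSE         # no progress bar
)

# Move the legend below the plot matrix
pm + theme(legend.position = "bottom")
```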

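Plot types:

Each of lower, upper, and diag is a list whose elements control the plot type. There are five possible elements: continuous, combo, discrete, na, and mapping.

  • continuous: the plot used when both x and y are continuous.

    • For lower and upper, one of "points", "smooth", "smooth_loess", "density", "cor", "blank".

    • For diag, one of "densityDiag", "barDiag", "blankDiag".

  • combo: the plot used when one variable is continuous and the other is discrete.

    It applies only to lower and upper, not to diag.

    A discrete variable can only be counted and cannot be mapped to a numeric axis, so the axes of these panels may be flipped.

    • One of "box", "box_no_facet", "dot", "dot_no_facet", "facethist", "facetdensity", "denstrip", "blank".

  • discrete: the plot used when both variables are discrete.

    • For upper and lower, one of "facetbar", "ratio", "blank".

    • For diag, one of "barDiag", "blankDiag".

  • na: the plot used when the data of a variable are all NA.

    • For lower and upper, one of "na", "blank".

    • For diag, one of "naDiag", "blankDiag".

  • mapping: an aes() mapping. If given, it is added on top of the x and y mappings of that section.

  • Default:

    lower = list(continuous = "points", combo = "facethist", discrete = "facetbar")

  • Default:

    upper = list(continuous = "cor", combo = "box_no_facet", discrete = "facetbar")

  • Default:

    diag = list(continuous = "densityDiag", discrete = "barDiag")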
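Each plot-type string corresponds to a GGally function named ggally_ followed by the string, and these functions can also be called directly on a single pair of variables. The sketch below draws four of the types listed above on their own, assuming the tips data from the reshape package; it is meant as a quick way to preview a type, not as a description of how ggpairs() dispatches internally.

```r
library(ggplot2)
library(GGally)
data(tips, package = "reshape")

# continuous x continuous: "points" and "cor"
p_points <- ggally_points(tips, aes(x = total_bill, y = tip))
p_cor    <- ggally_cor(tips, aes(x = total_bill, y = tip))

# continuous x discrete: "box_no_facet"
p_box <- ggally_box_no_facet(tips, aes(x = total_bill, y = time))

# diagonal, continuous: "densityDiag"
p_density <- ggally_densityDiag(tips, aes(x = total_bill))

# Each result is an ordinary ggplot object and can be printed on its own
p_points
```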

2.2 columns and columnLabels

library(GGally)
library(ggplot2)

# Select the columns by name
ggpairs(tips, mapping = aes(color = sex),
        columns = c("total_bill", "time", "tip"),
        columnLabels = c("Total_Bill (continuous)", "Time (discrete)", "Tip (continuous)"),
        title = "Columns selected by name")

# Select the same columns by index
ggpairs(tips, mapping = aes(color = sex),
        columns = c(1, 6, 2),
        columnLabels = c("Total_Bill (continuous)", "Time (discrete)", "Tip (continuous)"),
        title = "Columns selected by index")
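The two calls above select the same columns. A short base R check, assuming the tips data from the reshape package, shows how the names map to the indices and which columns are continuous or discrete:

```r
data(tips, package = "reshape")

cols_by_name <- c("total_bill", "time", "tip")

# Position of each name among the columns of tips
cols_by_index <- match(cols_by_name, names(tips))
cols_by_index

# Column classes: numeric columns are treated as continuous, factors as discrete
sapply(tips[cols_by_name], class)
```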


2.3 lower, upper, diag

2.3.1 Customizing lower

With only one discrete variable, the discrete element of lower has no effect.

ggpairs(tips, mapping = aes(color = day),
        columns = c("total_bill", "time", "tip"),
        columnLabels = c("Total_Bill (continuous)", "Time (discrete)", "Tip (continuous)"),
        lower = list(
          continuous = "cor",
          combo = "dot_no_facet"  # no pair of discrete variables, so discrete is not needed
        ),
        upper = list(
          continuous = "blank",
          combo = "blank"
        ),
        diag = list(
          continuous = "blankDiag",
          discrete = "blankDiag"
        ),
        title = "Custom lower\n(lower$continuous = \"cor\", lower$combo = \"dot_no_facet\")"
)


With two discrete variables and a single continuous one, the continuous element of lower has no effect.

ggpairs(tips, mapping = aes(color = day),
        columns = c("total_bill", "time", "sex"),
        columnLabels = c("Total_Bill (continuous)", "Time (discrete)", "Sex (discrete)"),
        lower = list(
          combo = "dot_no_facet",
          discrete = "blank"
        ),
        upper = list(
          combo = "blank",
          discrete = "blank"
        ),
        diag = list(
          continuous = "blankDiag",
          discrete = "blankDiag"
        ),
        title = "Custom lower\n(lower$combo = \"dot_no_facet\", lower$discrete = \"blank\")"
)
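The rule behind the two examples above (which list element governs which panel) can be sketched with a small helper. panel_kind() is a hypothetical function written for this illustration, and it assumes that numeric columns count as continuous and everything else as discrete, which is a simplification of what GGally checks.

```r
data(tips, package = "reshape")

# Hypothetical helper: which list element applies to the panel of columns a and b
panel_kind <- function(data, a, b) {
  n_numeric <- sum(is.numeric(data[[a]]), is.numeric(data[[b]]))
  c("discrete", "combo", "continuous")[n_numeric + 1]
}

# All pairs among the columns of the second example
cols <- c("total_bill", "time", "sex")
pair_names <- t(combn(cols, 2))

kinds <- mapply(
  function(a, b) panel_kind(tips, a, b),
  pair_names[, 1], pair_names[, 2],
  USE.NAMES = FALSE
)

data.frame(x = pair_names[, 1], y = pair_names[, 2], kind = kinds)
```

With these three columns no pair is of kind "continuous", which is why setting lower$continuous makes no difference in that example.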


2.3.2 Customizing upper

With only one discrete variable, the discrete element of upper has no effect.

ggpairs(tips, mapping = aes(color = day),
        columns = c("total_bill", "time", "tip"),
        columnLabels = c("Total_Bill (continuous)", "Time (discrete)", "Tip (continuous)"),
        upper = list(
          continuous = "density",
          combo = "dot_no_facet"  # no pair of discrete variables, so discrete is not needed
        ),
        lower = list(
          continuous = "blank",
          combo = "blank"
        ),
        diag = list(
          continuous = "blankDiag",
          discrete = "blankDiag"
        ),
        title = "Custom upper\n(upper$continuous = \"density\", upper$combo = \"dot_no_facet\")"
)


With two discrete variables and a single continuous one, the continuous element of upper has no effect.

ggpairs(tips, mapping = aes(color = day),
        columns = c("total_bill", "time", "sex"),
        columnLabels = c("Total_Bill (continuous)", "Time (discrete)", "Sex (discrete)"),
        upper = list(
          combo = "dot_no_facet",
          discrete = "ratio"
        ),
        lower = list(
          combo = "blank",
          discrete = "blank"
        ),
        diag = list(
          continuous = "blankDiag",
          discrete = "blankDiag"
        ),
        title = "Custom upper\n(upper$combo = \"dot_no_facet\", upper$discrete = \"ratio\")"
)


2.3.3 Customizing diag

diag has no combo element.

ggpairs(tips, mapping = aes(color = day),
        columns = c("total_bill", "time", "tip"),
        columnLabels = c("Total_Bill (continuous)", "Time (discrete)", "Tip (continuous)"),
        diag = list(
          continuous = "barDiag",
          discrete = "blankDiag"
        ),
        lower = list(
          continuous = "blank",
          combo = "blank"
        ),
        upper = list(
          continuous = "blank",
          combo = "blank"
        ),
        title = "Custom diag\n(diag$continuous = \"barDiag\", diag$discrete = \"blankDiag\")"
)


ggpairs(tips, mapping = aes(color = day),
        columns = c("total_bill", "time", "sex"),
        columnLabels = c("Total_Bill (continuous)", "Time (discrete)", "Sex (discrete)"),
        diag = list(
          continuous = "barDiag",
          discrete = "barDiag"
        ),
        lower = list(
          discrete = "blank",
          combo = "blank"
        ),
        upper = list(
          discrete = "blank",
          combo = "blank"
        ),
        title = "Custom diag\n(diag$continuous = \"barDiag\", diag$discrete = \"barDiag\")"
)


2.3.4 The mapping parameter

library(ggplot2)
library(GGally)
data(tips, package = "reshape")

# No mapping at all
ggpairs(tips,
        columns = c("total_bill", "time", "tip"),
        columnLabels = c("Total_Bill (continuous)", "Time (discrete)", "Tip (continuous)"),
        title = "No mapping"
)


# Local mapping: only the lower triangle is colored by time
ggpairs(tips,
        columns = c("total_bill", "time", "tip"),
        columnLabels = c("Total_Bill (continuous)", "Time (discrete)", "Tip (continuous)"),
        lower = list(mapping = aes(color = time)),
        title = "Custom lower (lower$mapping = aes(color = time))"
)


# A different local mapping for each section
ggpairs(tips,
        columns = c("total_bill", "tip", "size"),
        columnLabels = c("Total_Bill (continuous)", "Tip (continuous)", "Size (continuous)"),
        lower = list(
          continuous = "cor",
          mapping = aes(color = sex)
        ),
        upper = list(
          continuous = "cor",
          mapping = aes(color = smoker)
        ),
        diag = list(
          continuous = "barDiag",
          mapping = aes(color = time)
        ),
        title = "Custom lower, upper, diag\n(lower colored by sex, upper by smoker, diagonal by time)"
)
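One way to confirm which section received which mapping is to pull a panel out of the matrix and inspect it. This is a minimal sketch on a smaller two-column matrix, assuming that the extracted panel exposes its aesthetics through the usual ggplot2 mapping field (ggplot2 stores the color aesthetic under the name "colour").

```r
library(ggplot2)
library(GGally)
data(tips, package = "reshape")

# Only the lower triangle gets a color mapping
pm <- ggpairs(
  tips,
  columns  = c("total_bill", "tip"),
  lower    = list(mapping = aes(color = sex)),
  progress = FALSE
)

# Row 2, column 1 is the single lower-triangle panel of a 2 x 2 matrix
lower_panel <- pm[2, 1]

# Aesthetics of that panel: x and y from ggpairs() plus the local mapping
names(lower_panel$mapping)
```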


2.3.5 Specifying lower, upper, and diag together

Two continuous variables and one discrete variable.

ggpairs(tips, mapping = aes(color = day),
        columns = c("total_bill", "tip", "time"),
        columnLabels = c("Total_Bill (continuous)", "Tip (continuous)", "Time (discrete)"),
        lower = list(
          continuous = "cor",
          combo = "dot_no_facet"  # no pair of discrete variables, so discrete is not needed
        ),
        upper = list(
          continuous = "density",
          combo = "dot_no_facet"  # no pair of discrete variables, so discrete is not needed
        ),
        diag = list(
          continuous = "barDiag",
          discrete = "blankDiag"
        ),
        title = "Custom lower, upper, diag (two continuous variables, one discrete variable)"
)


One continuous variable and two discrete variables.

ggpairs(tips, mapping = aes(color = day),
        columns = c("total_bill", "time", "sex"),
        columnLabels = c("Total_Bill (continuous)", "Time (discrete)", "Sex (discrete)"),
        lower = list(
          combo = "dot_no_facet",
          discrete = "blank"
        ),
        upper = list(
          combo = "dot_no_facet",
          discrete = "ratio"
        ),
        diag = list(
          continuous = "barDiag",
          discrete = "barDiag"
        ),
        title = "Custom lower, upper, diag (one continuous variable, two discrete variables)"
)
