R: gganimate animation shows white gaps with many stacked geom_col bars

Posted by qkf9rpyu on 2022-12-06 in Other

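I am creating an animation of a stacked bar chart (using geom_col) with 100 columns. When I render the animation, a lot of white appears in columns that should be filled.
See the gif below:

The gif is based on roughly 100k rows of data, so I cannot post all of it here. In particular, I cannot reproduce the problem in a simpler example: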

library('tidyverse')
library('gganimate')

data.frame(time = rep(1:50, 200)) %>%
  arrange(time) %>%
  mutate(type = rep(c(rep('A', 100), rep('B', 100)), 50), 
         class = rep((1:100), 100), 
         value = runif(10000, 0, 1)) %>%
  ggplot(aes(x = class, y = value, fill = type)) +
  geom_col() +
  transition_time(time)

This works fine (ignore the structure in the data above; the point is that there is no white):

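I tried adding ease_aes(), enter_fade(), and exit_fade(), but none of them helped. Does anyone have an idea what is causing this?
--- Update ---
Following the comments, I filtered the data to see what is actually happening. Reduced to just two countries and a few years of data, the problem appears to be that blocks of data move between percentiles, when what I want is for them to simply grow and shrink within each percentile. You can see it in the gif below: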

The data that produces this is:

df <- structure(list(country = c("US", "DE", "US", "US", "US", "DE", 
"US", "DE", "US", "DE", "US", "DE", "US", "DE", "DE", "US", "DE", 
"DE", "DE", "US", "DE", "US", "US", "US", "DE", "US", "DE", "US", 
"US", "DE", "DE", "US", "DE", "US", "DE", "DE", "US", "DE", "US", 
"DE", "US", "US", "DE", "US", "US", "DE", "US", "DE", "US", "DE", 
"US", "DE", "DE", "US", "DE", "DE", "US", "US", "DE", "US", "US", 
"DE", "US", "US", "DE", "US", "DE", "US", "DE", "US", "DE", "US", 
"DE", "US", "DE", "DE", "US", "DE", "US", "US", "US", "DE", "US", 
"DE", "US", "US", "DE", "US", "DE", "US", "DE", "US", "DE", "US", 
"DE", "US", "US", "DE", "US", "US", "DE", "US", "US", "DE", "US", 
"DE", "US", "DE", "US", "DE"), glob.perc = c(0, 1, 1, 2, 3, 3, 
4, 4, 5, 5, 6, 6, 7, 7, 7, 8, 8, 9, 9, 0, 1, 1, 2, 3, 3, 4, 4, 
5, 6, 6, 6, 7, 7, 8, 8, 8, 9, 9, 0, 1, 1, 2, 2, 3, 4, 4, 5, 5, 
6, 6, 7, 7, 7, 8, 8, 9, 9, 0, 1, 1, 2, 2, 3, 4, 4, 5, 5, 6, 6, 
7, 7, 8, 8, 9, 9, 9, 0, 1, 1, 2, 3, 3, 4, 4, 5, 6, 6, 7, 7, 8, 
8, 9, 9, 0, 1, 1, 2, 2, 3, 4, 4, 5, 6, 6, 7, 7, 8, 8, 9, 9), 
    avg.income.country = c(437288.3, 95483.3754884956, 140784.030084749, 
    140733.5, 92860.7570361667, 27041.1685330627, 82474.4007614941, 
    22845.1776491941, 75584.1480877374, 20954.7760014288, 70400.3370710519, 
    19852.2326809271, 54038.6152996391, 15598.3057384556, 15170.9872445152, 
    62785.1002246113, 18201.6743099168, 39606.7790727414, 39051.1193095399, 
    450574.9, 89747.1381942579, 143040.424101143, 144413.3, 95281.4131057479, 
    26564.8030858664, 84645.1806598295, 22453.3134663253, 99495.4, 
    58448.7245539485, 16815.8081430027, 15925.4607078112, 67342.4870614877, 
    18775.7716260376, 52078.6261482834, 14908.4732454128, 14586.6597398625, 
    60740.8587598986, 17551.4029073371, 449672.7, 85860.9513060095, 
    138573.062299181, 107999.713224424, 26551.7207203881, 118606.7, 
    81673.5478130351, 22256.5124499113, 74664.7815210055, 20289.8692320157, 
    69424.4509484861, 19130.6427260963, 53441.6796042233, 15011.8413898757, 
    14554.8379632521, 62031.6543795656, 17372.7239256402, 17038.0153770701, 
    59253.6721580242, 478696.8, 87965.3040019279, 141489.41469306, 
    110750.734809188, 28139.4736007857, 121395.4, 84564.2106500617, 
    23136.9326230234, 77452.4071740221, 20809.5254887263, 72187.8010950261, 
    19423.2184457137, 67965.6133547784, 18489.4603327709, 64700.6833849069, 
    17811.5804850837, 50612.3590346861, 14165.4003733601, 13829.472811758, 
    542123.2, 89948.9091254987, 158338.248242006, 156908.9, 104475.681782063, 
    29031.666816329, 92305.5514014955, 23750.4970524401, 107775.8, 
    78090.1791649968, 21282.8059573008, 73283.2631907787, 19808.7465702618, 
    69304.0213872794, 18813.7418777938, 65958.7178466761, 18090.1791160505, 
    559720.3, 92129.3365959901, 159846.146463587, 123870.105638014, 
    30030.7222753586, 135301.9, 94785.176213572, 24358.2621716462, 
    110644.4, 80286.8697338142, 21690.4391200441, 75280.156096728, 
    20090.0002975319, 71136.641950609, 19006.2143886443, 67594.6662796918, 
    18216.0069568407), region = c("Americas", "Europe", "Americas", 
    "Americas", "Americas", "Europe", "Americas", "Europe", "Americas", 
    "Europe", "Americas", "Europe", "Americas", "Europe", "Europe", 
    "Americas", "Europe", "Europe", "Europe", "Americas", "Europe", 
    "Americas", "Americas", "Americas", "Europe", "Americas", 
    "Europe", "Americas", "Americas", "Europe", "Europe", "Americas", 
    "Europe", "Americas", "Europe", "Europe", "Americas", "Europe", 
    "Americas", "Europe", "Americas", "Americas", "Europe", "Americas", 
    "Americas", "Europe", "Americas", "Europe", "Americas", "Europe", 
    "Americas", "Europe", "Europe", "Americas", "Europe", "Europe", 
    "Americas", "Americas", "Europe", "Americas", "Americas", 
    "Europe", "Americas", "Americas", "Europe", "Americas", "Europe", 
    "Americas", "Europe", "Americas", "Europe", "Americas", "Europe", 
    "Americas", "Europe", "Europe", "Americas", "Europe", "Americas", 
    "Americas", "Americas", "Europe", "Americas", "Europe", "Americas", 
    "Americas", "Europe", "Americas", "Europe", "Americas", "Europe", 
    "Americas", "Europe", "Americas", "Europe", "Americas", "Americas", 
    "Europe", "Americas", "Americas", "Europe", "Americas", "Americas", 
    "Europe", "Americas", "Europe", "Americas", "Europe", "Americas", 
    "Europe"), year = c(1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 
    1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 
    1980L, 1980L, 1980L, 1980L, 1981L, 1981L, 1981L, 1981L, 1981L, 
    1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 
    1981L, 1981L, 1981L, 1981L, 1981L, 1982L, 1982L, 1982L, 1982L, 
    1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 
    1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1983L, 1983L, 1983L, 
    1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 
    1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1984L, 1984L, 
    1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 
    1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1985L, 1985L, 1985L, 
    1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 
    1985L, 1985L, 1985L, 1985L, 1985L)), row.names = c(NA, -110L
), class = c("tbl_df", "tbl", "data.frame"))

The code for the animation is:

df %>%  
  ggplot(aes(x = glob.perc, y = avg.income.country/1000, fill = region)) + 
  geom_col(position = 'stack') +
  theme_minimal() +
  labs(subtitle = "Year: {frame_time}", 
       x = element_blank(), 
       y = element_blank(), 
       fill = 'Region') +
  transition_time(year)

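My sense is that this is not a missing-data problem: the visualization for each individual year is complete, with no gaps. I think it is a matter of how geom_col() transitions.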


Answer #1 (1dkrff03)

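gganimate has trouble with your dataset because some year/country/glob.perc combinations have multiple observations and some have none. It (wrongly) assumes that some of the values you are tracking move between glob.perc categories from year to year. One way to fix this is to make sure every year/country/glob.perc combination has exactly one value. Here, I take the mean of avg.income.country where there are multiple observations and use 0 where a combination is missing. There are probably smarter ways to do this, perhaps imputing values based on neighboring values or a regression model.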

df %>%  
  group_by(year, region, country, glob.perc) %>%
  summarize(avg.income.country = mean(avg.income.country), n = n()) %>%
  ungroup() %>%
  complete(year, nesting(country, region), glob.perc, fill = list(avg.income.country = 0)) %>%
  
  ggplot(aes(x = glob.perc, y = avg.income.country/1000, fill = region)) + 
  geom_col(position = 'stack', color = "black", alpha = 0.7) +
  theme_minimal() +
  labs(subtitle = "Year: {frame_time}", 
       x = element_blank(), 
       y = element_blank(), 
       fill = 'Region') +
  transition_time(year)
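
To see what the summarize() and complete() steps do before anything is animated, here is a minimal sketch on a made-up five-row table. The `toy` data, its values, and the name `toy_regular` are illustrative assumptions, not part of the original data. It has one duplicated cell (DE, 1980, glob.perc 7) and several missing ones.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: DE/1980/glob.perc 7 appears twice, other cells are missing
toy <- tibble(
  year = c(1980L, 1980L, 1980L, 1981L, 1981L),
  country = c("DE", "DE", "US", "DE", "US"),
  region = c("Europe", "Europe", "Americas", "Europe", "Americas"),
  glob.perc = c(7, 7, 7, 8, 7),
  avg.income.country = c(16, 14, 54, 15, 67)
)

toy_regular <- toy %>%
  group_by(year, region, country, glob.perc) %>%
  # collapse duplicated cells to their mean
  summarize(avg.income.country = mean(avg.income.country), .groups = "drop") %>%
  # add a row with value 0 for every missing year/country/glob.perc cell
  complete(year, nesting(country, region), glob.perc,
           fill = list(avg.income.country = 0))

# every cell should now occur exactly once
cells <- count(toy_regular, year, country, glob.perc)
stopifnot(all(cells$n == 1))
```

Running the same count() check on the real data after the pipeline above is a cheap way to confirm the data is regular before rendering.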

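Below is the number of observations in the original data. Note that some combinations are doubled and some are missing. This is ambiguous for gganimate, because it is unclear whether the unit you are tracking has disappeared (it has) or has moved to another glob.perc category (which is what gganimate assumes).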

df %>%  
  count(year, region, country, glob.perc) %>%
  ggplot(aes(year, glob.perc, fill = n)) +
  geom_tile() +
  facet_wrap(~country)
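
The same information can also be listed as a table instead of a heat map. The sketch below uses a made-up `toy` table (an assumption for illustration, as are the names `counts` and `problem_cells`) and lists every year/country/glob.perc cell whose observation count is not exactly 1.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one duplicated cell and several missing ones
toy <- tibble(
  year = c(1980L, 1980L, 1980L, 1981L, 1981L),
  country = c("DE", "DE", "US", "DE", "US"),
  glob.perc = c(7, 7, 7, 8, 7)
)

counts <- toy %>%
  count(year, country, glob.perc) %>%
  # make absent cells explicit with a count of 0
  complete(year, country, glob.perc, fill = list(n = 0L))

# cells that are duplicated (n > 1) or missing (n == 0)
problem_cells <- filter(counts, n != 1)
problem_cells
```

Applied to the real data, an empty `problem_cells` would mean every bar has exactly one counterpart in every frame.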

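Plotting the data as lines shows that something looks off in the underlying data. You may want to take another look there and check that your glob.perc code works the way you intend. If the categories mean what they sound like, I would expect the lines not to cross.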

df %>%  
  filter(country == "DE", glob.perc >= 2) %>%
  ggplot(aes(year, avg.income.country, color = as.character(glob.perc), group = glob.perc)) +
  geom_line() +
  facet_wrap(~country)
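
A numeric counterpart to eyeballing the lines is sketched below. It assumes, and this is my reading rather than something stated in the question, that income should be ordered consistently across glob.perc within each year, so it flags year/country groups where avg.income.country is not monotone in glob.perc. The `toy` values and the name `monotone_check` are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data: 1980 breaks the ordering, 1981 does not
toy <- tibble(
  year = c(1980, 1980, 1980, 1980, 1981, 1981, 1981),
  country = "DE",
  glob.perc = c(6, 7, 8, 9, 6, 7, 8),
  avg.income.country = c(20, 15, 18, 39, 17, 16, 15)
)

monotone_check <- toy %>%
  group_by(year, country) %>%
  arrange(glob.perc, .by_group = TRUE) %>%
  summarize(
    # TRUE if income only falls, or only rises, as glob.perc increases
    monotone = all(diff(avg.income.country) <= 0) ||
      all(diff(avg.income.country) >= 0),
    .groups = "drop"
  )

monotone_check
```

Groups with `monotone == FALSE` are the ones worth inspecting in the glob.perc assignment.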


Answer #2 (abithluo)

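What you are showing is a single state at each time step. But value changes considerably from one step to the next (it is drawn independently with runif() at every step), which causes the flickering. I believe what you want to see is something like the result of the code below.
To get there, I grouped the data with group_by and added a column, sum, that holds the cumulative value.
filter is used only to limit the rendering time.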

library('tidyverse')
library('gganimate')

ddf_anim <- data.frame(time = rep(1:50, 200)) %>%
  arrange(time) %>%
  mutate(type = rep(c(rep('A', 100), rep('B', 100)), 50), 
         class = rep((1:100), 100), 
         value = runif(10000, 0, 1)) %>%
  filter(time <10) %>% 
  group_by(class, type) %>% 
  mutate(sum = cumsum(value)) %>% 
  ggplot(aes(x = class, y = sum, fill = type)) +
  geom_col() +
  transition_time(time)

ddf_anim
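
To make the effect of the cumulative sum concrete, here is a small sketch without any animation. The grid size, the seed, and the column name `cum_value` (standing in for `sum` above) are assumptions for illustration. It checks that within each class/type pair the accumulated value never decreases over time, so each bar segment can only grow from frame to frame.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only for reproducibility of the sketch

# Hypothetical small grid: 3 classes x 2 types x 4 time steps
toy <- expand.grid(class = 1:3, type = c("A", "B"), time = 1:4) %>%
  mutate(value = runif(n())) %>%
  group_by(class, type) %>%
  arrange(time, .by_group = TRUE) %>%
  mutate(cum_value = cumsum(value)) %>%
  ungroup()

# per class/type pair: does the accumulated value ever shrink?
growth_check <- toy %>%
  group_by(class, type) %>%
  summarize(never_shrinks = all(diff(cum_value) >= 0), .groups = "drop")

growth_check
```

Because runif() returns non-negative values, `never_shrinks` is TRUE for every pair, which is why the accumulated bars change smoothly instead of jumping between unrelated heights.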
