ggplot2 - R - ggplot: Selection of which labels to appear in bar stack and their position


We have struggled with the following for many, many hours before posting this question.

We have a huge number of data sets similar to the following:

income            inhabitants     percent
below  15000      below 5000       4.664723
15000 - 3.000     below 5000      15.743440
30000 - 40000     below 5000      13.994169
40000 - 50000     below 5000      12.609329
50000 - 60000     below 5000      11.333819
60000 - 70000     below 5000      11.370262
70000 - 100000    below 5000      14.795918
above  100000     below 5000       5.211370
not know          below 5000      10.276968
below  15000      5000-20000       4.225146
15000 - 3.000     5000-20000      13.157895
30000 - 40000     5000-20000      12.733918
40000 - 50000     5000-20000      11.739766
60000 - 70000     5000-20000      11.315789
70000 - 100000    5000-20000      18.728070
above  100000     5000-20000       7.880117
not know          5000-20000       9.356725
below  15000      20000-110000     4.013588
15000 - 3.000     20000-110000    11.147458
30000 - 40000     20000-110000    11.927529
40000 - 50000     20000-110000    11.751384
50000 - 60000     20000-110000     9.738299
60000 - 70000     20000-110000    10.367388
70000 - 100000    20000-110000    17.929039
above  100000     above 110000    13.198289
not know          above 110000     9.927026
below  15000      above 110000     4.662941
15000 - 3.000     above 110000    10.286413
30000 - 40000     above 110000    11.054838
40000 - 50000     above 110000    10.513447
50000 - 60000     above 110000     9.081383
60000 - 70000     above 110000     8.539993
70000 - 100000    above 110000    18.389801
above  100000     above 110000    18.040517
not know          above 110000     9.430667

We want to make stacked bars of the data, showing the distribution between the areas.

This is how we did it:

# 'special' holds the name of the fill column as a string
g = ggplot(data = frame, aes(x = inhabitants, ymax = 100, y = percent, fill = eval(parse(text = special))))
g = g + geom_bar(stat = "identity")
g = g + theme_minimal()
g = g + xlab("") + ylab("")
g = g + theme(axis.text.y = element_blank(), axis.ticks.y = element_blank(), axis.ticks.x = element_blank())
g = g + scale_fill_discrete("", guide = guide_legend(reverse = TRUE))
g

Nice, we are getting what we want. Now we want to add more information: what percentage does each section represent?

With the following code we get close:

g = g + geom_text(aes(label = paste(round(percent, digits = 1), "%"), y = percent),
                  size = 2, hjust = 0.4, vjust = 1.4, position = "stack")

We are getting this: http://s28.postimg.org/lv3zg2cnh/bars2.png

We want to place the numbers in the middle of the sections. However, this turns out to be difficult (for us) to do!

We have tried code like the following, with no luck.

# pos: cumulative percent within each bar minus half the segment height
data = transform(frame, pos = round(ave(percent, inhabitants, FUN = cumsum) - percent / 2))
g = ggplot(data, aes(x = inhabitants, ymax = 100, y = percent, fill = eval(parse(text = special))))
g = g + geom_bar(stat = "identity")
g = g + theme_minimal()
g = g + xlab("") + ylab("")
g = g + theme(axis.text.y = element_blank(), axis.ticks.y = element_blank(), axis.ticks.x = element_blank())
g = g + scale_fill_discrete("", guide = guide_legend(reverse = TRUE))
g = g + geom_text(aes(label = paste(round(percent, digits = 1), "%"), y = pos),
                  size = 3, hjust = 0.4, vjust = 0, position = "stack")
g

We have also checked other solutions, without luck, probably due to our inexperience with R. After many, many hours we were close to giving up and being satisfied with our first solution, if not for the fact that when handling data sets with more sections it turns into a mess: http://s13.postimg.org/5jxavvohz/bars3.png

Our primary question is:

1) How can we prevent labels for values of less than 2 percent from appearing?

(Our secondary question is: how can the values be positioned in the middle of the sections?)

To avoid labeling stacks when the percent is less than some value, you can assign NA to the positioning variable in those cases.

For example, do this via ifelse and transform after creating the pos variable via cumsum as you did in your question. I'm using 5 as the cut-off in this example, as no percent in your example data is less than 2.

data = transform(data, pos2 = ifelse(percent < 5, NA, pos))

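Now use pos2 as the y aesthetic in geom_text, and you will not have text labels when percent is less than 5. Remove position = "stack" from geom_text so that the labels are centered: pos2 already holds the stacked midpoint of each section, so stacking it again would push the labels too high.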
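To make the arithmetic behind pos and pos2 concrete, here is a minimal base R sketch. The data frame toy and its values are made up for illustration (they are not the question's data), and round() is left out so the midpoints are easy to check by hand. Note that whether these midpoints line up with the drawn sections depends on the order in which your ggplot2 version stacks the bars, so treat this as a sketch of the calculation only.

```r
# Hypothetical toy data: two bars (A, B), three sections each
toy <- data.frame(
  inhabitants = rep(c("A", "B"), each = 3),
  income      = rep(c("low", "mid", "high"), times = 2),
  percent     = c(50, 3, 47, 20, 30, 50)
)

# Midpoint of each section: running total within each bar
# minus half of the section's own height
toy <- transform(toy, pos = ave(percent, inhabitants, FUN = cumsum) - percent / 2)

# Set the position to NA where the section is below the cut-off (5 here),
# so that no label is drawn for it
toy <- transform(toy, pos2 = ifelse(percent < 5, NA, pos))

print(toy)
```

For bar A the running totals are 50, 53 and 100, so the midpoints are 25, 51.5 and 76.5; the 3 percent section gets NA in pos2 and would be left unlabeled.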

Here is how things look with your example dataset (using fill = income because I wasn't sure what fill = eval(parse(text = special)) was doing).

ggplot(data, aes(x = inhabitants, y = percent, fill = income)) +
    geom_bar(stat = "identity") +
    theme_minimal() +
    xlab("") + ylab("") +
    theme(axis.text.y = element_blank(),
          axis.ticks.y = element_blank(),
          axis.ticks.x = element_blank()) +
    scale_fill_discrete("", guide = guide_legend(reverse = TRUE)) +
    geom_text(aes(label = paste(round(percent, digits = 1), "%"), y = pos2), size = 3)


As @epi10 pointed out, an alternative is to use a blank label every time percent is less than the cut-off. This means using the original position variable and using ifelse inside of geom_text. The line would look like:

geom_text(aes(label = ifelse(percent < 5, "", paste(round(percent, digits = 1),"%")), y = pos), size = 3)  
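The blank-label idea can be checked outside of a plot. Below is a small sketch; label_if_large is a hypothetical helper name, not part of ggplot2, and the toy data frame is made up. The optional plotting part assumes ggplot2 2.2.0 or later, where position_stack(vjust = 0.5) can center the labels in each section without a hand-computed position column; it is skipped if ggplot2 is not installed.

```r
# Hypothetical helper: percentage label, or an empty string below the cut-off
label_if_large <- function(percent, cutoff = 5) {
  ifelse(percent < cutoff, "", paste(round(percent, digits = 1), "%"))
}

labels <- label_if_large(c(4.664723, 15.743440, 1.2))
print(labels)

# Optional: use the helper in a plot (assumes ggplot2 >= 2.2.0)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Made-up data, two bars with three sections each
  toy <- data.frame(
    inhabitants = rep(c("A", "B"), each = 3),
    income      = rep(c("low", "mid", "high"), times = 2),
    percent     = c(50, 3, 47, 20, 30, 50)
  )
  p <- ggplot(toy, aes(x = inhabitants, y = percent, fill = income)) +
    geom_col() +
    # position_stack(vjust = 0.5) places each label at the middle of its section
    geom_text(aes(label = label_if_large(percent)),
              position = position_stack(vjust = 0.5), size = 3)
}
```

Because the label text and the stacking are both derived from the same rows, this variant does not depend on the order in which a cumulative position column was computed.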
