Data Visualization – ggplot() function

This section walks through how to visualize data using the ggplot() function from the ggplot2 package. Each example covers one specific plotting task.

Topics covered in this section are:

  ➤ Data Frame

  ➤ Tidyverse library

  ➤ The ggplot() function

  ➤ A Basic Plot

  ➤ Essential Elements

  ➤ Labels

  ➤ Themes

  ➤ Aesthetics

  ➤ Legends

  ➤ Bar Chart

  ➤ Line Chart

  ➤ Hybrid Chart

  ➤ Grouped Chart

  ➤ Area Chart

  ➤ Secondary Axis

↪ Data Frame

The data frame used throughout this section holds the daily closing prices of the stock of a hypothetical corporation, Andromeda. The code below loads the data and keeps ten trading sessions for the charts that follow.

      stock <- read.csv('https://raw.githubusercontent.com/csxplore/data/main/andromeda.csv', header=T)
      stock <- stock[,c("Date..", "Close.Price..", "Total.Traded.Quantity..", "No..of.Trades..")]
      coln <- c("Date", "Close", "Volume", "No.of.trades")
      colnames(stock) <- coln
      stock$Date <- format(as.Date(stock$Date, '%d-%b-%Y'), format = '%d-%b-%y')
      stock10 <- stock[2:11,]
      stock10

      ---Output---
              Date    Close    Volume No.of.trades
      2  01-Jun-22 1,394.85 6,045,948      152,770
      3  02-Jun-22 1,385.10 5,737,510      160,429
      4  03-Jun-22 1,380.30 3,478,622      174,541
      5  06-Jun-22 1,378.45 3,086,633      126,359
      6  07-Jun-22 1,362.60 5,126,983      133,385
      7  08-Jun-22 1,367.40 2,993,294      108,646
      8  09-Jun-22 1,377.70 3,655,908      149,498
      9  10-Jun-22 1,351.10 4,577,715      159,998
      10 13-Jun-22 1,326.60 5,446,286      269,494
      11 14-Jun-22 1,312.00 5,935,178      247,100

↪ Tidyverse library

The ggplot2 package is part of the tidyverse collection of packages. Load the tidyverse library using the following code.

      library(tidyverse)

If the above code fails to load the library with an error message “there is no package called ‘tidyverse’”, install the tidyverse package and run the library() function once again.

      install.packages("tidyverse")
      library(tidyverse)

↪ The ggplot() function

The ggplot() function, provided by the ggplot2 package, is one of the most versatile and elegant plotting functions in R. It implements the grammar of graphics, which consists of the following elements.

Essential Elements

  • Data: The data set that contains the variables to be plotted on the graph.
  • Aesthetics: Mappings from variables in the data to visual properties, such as position on the x-axis and y-axis, color, shape, and size.
  • Layers or Geometries: The actual visual marks drawn on the graph, such as scatter points, lines, and bars.

Optional Elements

  • Labels: Labels modify axis, legend, and plot labels.
  • Themes: These are used for changing the appearance of non-data elements.
  • Coordinates: The coordinate system, such as cartesian, semi-log, or polar, maps the position of objects onto the plane of the plot.
  • Facets: These are small plots, each showing a different subset of the whole data set.
  • Statistics: Statistical summaries of the data, such as the mean and variance, that are computed and displayed.
  • Scales: Scales control the mapping from data to aesthetic attributes. Every aesthetic in a plot needs a scale, and ggplot2 adds a default scale when none is specified.

The grammar makes it easier to update a plot iteratively.
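
As a minimal sketch of how the optional elements stack on top of the essential ones, the example below builds one plot layer by layer. The data frame `demo` and its columns are invented for illustration and are not part of the Andromeda data.

```r
library(ggplot2)

# Hypothetical data, invented for this sketch
demo <- data.frame(
  day   = rep(1:5, times = 2),
  price = c(10, 12, 11, 13, 14, 20, 19, 21, 22, 24),
  corp  = rep(c("A", "B"), each = 5)
)

p <- ggplot(data = demo, aes(x = day, y = price)) +    # data and aesthetics
  geom_point() +                                        # layer (geometry)
  geom_smooth(method = "lm", se = FALSE) +              # statistic: fitted line
  facet_wrap(~ corp) +                                  # facets: one panel per corp
  scale_y_continuous(breaks = seq(5, 25, by = 5)) +     # scale: y-axis breaks
  coord_cartesian(ylim = c(5, 25)) +                    # coordinates: visible range
  labs(x = "Day", y = "Price", title = "Demo") +        # labels
  theme_minimal()                                       # theme
p
```

Each `+` adds one element, so any line can be removed or swapped without touching the others.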

    ↪ Basic Plot

    The code below draws a basic scatter plot of stock10, with the 10 trading sessions on the x-axis and the closing price on the y-axis.

          ggplot(data = stock10) +
            geom_point(mapping = aes(x=seq(1:10), y=Close))
    
    
    Basic Plot

    ↪ Basic Plot: Issue with y-axis intervals

    Observe in the previous chart that the y-axis intervals are not evenly spaced. This is because the 'Close' price column in the data frame is a character type, so ggplot2 treats each price as a separate category rather than as a number. This can be confirmed with the code below.

          str(stock10)
    
          ---Output---
          'data.frame': 10 obs. of 4 variables:
           $ Date        : chr "31-May-22" "01-Jun-22" "02-Jun-22" "03-Jun-22" ...
           $ Close       : chr "1,388.95" "1,394.85" "1,385.10" "1,380.30" ...
           $ Volume      : chr "6,742,694" "6,045,948" "5,737,510" "3,478,622" ...
           $ No.of.trades: chr "178,695" "152,770" "160,429" "174,541" ...

    To resolve this issue, convert the 'Close' price into a numeric value. Note that the commas must be removed before converting the 'Close' price into a numeric value.

          y <- as.numeric(gsub(",","",stock10$Close))
          ggplot(data = stock10) +
            geom_point(mapping = aes(x=seq(1:10), y))
    
    
    Issue with y-axis intervals
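
The conversion step can be checked in isolation. The base R sketch below uses three closing prices copied from the output earlier in this section; the variable names are chosen for this example only. It shows that as.numeric() alone cannot parse a value that still contains a thousands separator.

```r
close_chr <- c("1,394.85", "1,385.10", "1,312.00")

# as.numeric() cannot parse the comma and returns NA (with a warning)
direct <- suppressWarnings(as.numeric(close_chr))
direct

# Remove the commas first, then convert
close_num <- as.numeric(gsub(",", "", close_chr, fixed = TRUE))
close_num
is.numeric(close_num)
```

Once the column is numeric, ggplot2 uses a continuous y-axis with evenly spaced breaks.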

    ↪ Basic Plot: Setting y-axis range

    The ylim(low, high) sets the range of y-axis values.

          ggplot(data = stock10) +
            geom_point(mapping = aes(x=seq(1:10), y)) +
            ylim(1200,1700)
    
    
    Setting y-axis range

    ↪ Essential Elements

    With ggplot2, plotting begins with the function ggplot(). The ggplot() function creates a coordinate system.

    Data

    The first argument of ggplot() is the data set to use in the graph, so ggplot(data = stock10) creates an empty graph.

    Layers

    Layers, or geometries, define the type of graphic.

  • geom_point() draws a scatter plot.
  • geom_line() draws a line plot.
  • geom_bar(stat = "identity") draws a bar chart with bar heights taken from the data.
  • geom_col() draws the same bar chart and is shorthand for geom_bar(stat = "identity").
  • geom_rect() draws rectangles.
  • geom_text() adds text to a plot.

    Aesthetics

    Aesthetics describe how variables in the data are mapped to visual properties of geom functions. Aesthetic mappings can be set in ggplot() and in individual geom functions. The syntax is aes(x, y, ...), where x and y arguments map to values in x and y coordinates. Other commonly used mappings are color, size, and shape.
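
The difference between mapping an aesthetic and setting a fixed value can be sketched as follows. The data frame `demo` is hypothetical and invented for this example. A mapping inside aes() ties a visual property to a column and produces a legend, while a value given outside aes() applies one constant to the whole layer.

```r
library(ggplot2)

# Hypothetical data, invented for this sketch
demo <- data.frame(
  session = 1:6,
  close   = c(101, 103, 102, 105, 104, 107),
  level   = c("Low", "Low", "Medium", "Medium", "High", "High")
)

# Mappings given in ggplot() are inherited by every layer
p_global <- ggplot(demo, aes(x = session, y = close)) +
  geom_line() +
  geom_point(aes(color = level))   # extra mapping used only by the points

# A constant outside aes() sets one fixed value and adds no legend
p_fixed <- ggplot(demo, aes(x = session, y = close)) +
  geom_point(color = "red", size = 3)
```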

    ↪ Labels

    Well-defined labels are critical for making plots readable to a wider audience. The labs() function sets the axis, legend, and plot labels. The labs() function can also be used for setting the title and subtitle to explain the main finding.

          ggplot(data = stock10) +
            geom_point(mapping = aes(x=seq(1:10), y)) +
            ylim(1200,1700) +
            labs(
              x = "Date",
              y = "Close Price",
              title = "Stock Data"
            )
    
    
    Labels

    ↪ Themes

    Themes help make the plot aesthetically pleasing or match an existing style guide. Themes give control over things like fonts, axis text, and backgrounds.

    The element_ functions specify how non-data components of the plot are displayed.

  • element_text(): sets the font size, color, and face of text elements
  • element_rect(): borders and backgrounds
  • element_line(): lines
  • element_blank(): draws nothing, and assigns no space

    The code below uses element_text() to rotate the x-axis text by 90 degrees. The ylim(low, high) function sets the range of y-axis values.

          x <- stock10$Date 
          y <- as.numeric(gsub(",","",stock10$Close))
     
          base_s_plot <- ggplot(data = stock10) +
            geom_point(mapping = aes(x, y)) +
            ylim(1200,1700) +
            labs(
              x = "Date",
              y = "Close Price",
              title = "Stock Data"
            )
     
          base_s_plot +
            theme(axis.text.x = element_text(angle=90))
    
    
    Themes

    ↪ Theme: Set the axis text to the middle of the tick line

    The vjust = 0.5 and hjust = 1 arguments to element_text() below align the rotated axis text with the middle of the tick line.

          base_s_plot +
            theme(axis.text.x = element_text(angle=90, vjust = 0.5, hjust = 1))
    
    
    Set the axis text to the middle of the tick line

    ↪ Theme: Set the axis text

    The color = "red" and face = "bold" arguments to element_text() in axis.text.x below set the x-axis text to bold and red.

          base_s_plot +
            theme(
              axis.text.x = element_text(angle=90, vjust = 0.5, hjust = 1,
                            color = "red", face = "bold")
              )
    
    
    Set the axis text

    ↪ Theme: Set the plot title

    The plot.title = element_text() setting below styles the plot title.

          base_s_plot +
            theme(
              axis.text.x = element_text(angle=90, vjust = 0.5, hjust = 1, color = "black"),
              plot.title = element_text(hjust = 0.5, color = "red", face = "bold")
            )
    
    
    Set the plot title

    ↪ Theme: Set the plot background

    The plot.background = element_rect() setting below sets the plot background.

          base_s_plot +
            theme(
              axis.text.x = element_text(angle=90, vjust = 0.5, hjust = 1, color = "black"),
              plot.title = element_text(hjust = 0.5, color = "black", face = "bold"),
              plot.background = element_rect(fill = "grey90")
            )
    
    
    Set the plot background

    ↪ Theme: Set the panel grid

    The panel.grid = element_line() setting below changes the panel grid. The linetype is an integer (0:8) or a name (blank, solid, dashed, dotted, dotdash, longdash, twodash).

          base_s_plot +
            theme(
              axis.text.x = element_text(angle=90, vjust = 0.5, hjust = 1, color = "black"),
              plot.title = element_text(hjust = 0.5, color = "black", face = "bold"),
              plot.background = element_rect(fill = "grey90"),
              panel.grid = element_line(linetype = 8, color = "red")
            )
    
    
    Set the panel grid

    ↪ Theme: Set the panel border and color

    The panel.border = element_rect() setting below sets the panel border and border color. The plot.background = element_rect() setting adds a border around the whole chart area.

          base_s_plot +
            theme(
              axis.text.x = element_text(angle=90, vjust = 0.5, hjust = 1, color = "black"),
              plot.title = element_text(hjust = 0.5, color = "black", face = "bold"),
              panel.grid = element_line(linetype = 0),
              panel.border = element_rect(color = "red", linewidth = 1, fill = NA),
              plot.background = element_rect(color = "red", linewidth = 2, fill = NA)
            )
    
    
    Set the panel border and color
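
Because the same theme() settings recur across examples, they can be stored once and reused. The sketch below is one possible way to do that; the name `chart_theme` and the data frame `demo` are invented for this example.

```r
library(ggplot2)

# Collect the shared settings once; `chart_theme` is a name chosen for this sketch
chart_theme <- theme(
  axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1, color = "black"),
  plot.title  = element_text(hjust = 0.5, color = "black", face = "bold"),
  panel.grid  = element_blank()
)

# Hypothetical data, invented for this sketch
demo <- data.frame(day = c("Mon", "Tue", "Wed"), close = c(10, 12, 11))

p <- ggplot(demo, aes(day, close)) +
  geom_point() +
  labs(title = "Demo") +
  chart_theme                        # reuse the stored theme

# A later theme() call updates only the properties it names
p2 <- p + theme(plot.title = element_text(color = "red"))
```

In `p2` the title turns red while keeping the centering and bold face from `chart_theme`.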

    ↪ Aesthetics: Color

    The color aesthetic automatically assigns a marker color based on the mapped variable. A corresponding color argument to the labs() helper function sets the legend title.

          base_s_plot2 <- base_s_plot +
            theme(
              axis.text.x = element_text(angle=90, vjust = 0.5, hjust = 1, color = "black"),
              plot.title = element_text(hjust = 0.5, color = "black", face = "bold"),
              panel.grid = element_line(linetype = 0),
              panel.border = element_rect(color = "black", linewidth = 0, fill = NA),
              plot.background = element_rect(color = "black", linewidth = 2, fill = NA)
            )
     
          base_s_plot2 +
            aes(color = y) +
            labs(color = "Closing")
    
    
    Color

    ↪ Aesthetics: Color

    To visualize the number of trades alongside the closing price for each day, the number of trades can be categorized as Low, Medium, or High. The following code creates a new 'Trade' variable with the values Low, Medium, and High based on the number of trades.

          stock10$No.of.trades <- as.numeric(gsub(",","",stock10$No.of.trades))
          stock10$Trade[stock10$No.of.trades <= 140000] <- "Low"
          stock10$Trade[stock10$No.of.trades > 140000 & stock10$No.of.trades < 170000] <- "Medium"
          stock10$Trade[stock10$No.of.trades >= 170000] <- "High"
          stock10
    
          ---Output---
                  Date    Close    Volume No.of.trades  Trade
          2  01-Jun-22 1,394.85 6,045,948       152770 Medium
          3  02-Jun-22 1,385.10 5,737,510       160429 Medium
          4  03-Jun-22 1,380.30 3,478,622       174541   High
          5  06-Jun-22 1,378.45 3,086,633       126359    Low
          6  07-Jun-22 1,362.60 5,126,983       133385    Low
          7  08-Jun-22 1,367.40 2,993,294       108646    Low
          8  09-Jun-22 1,377.70 3,655,908       149498 Medium
          9  10-Jun-22 1,351.10 4,577,715       159998 Medium
          10 13-Jun-22 1,326.60 5,446,286       269494   High
          11 14-Jun-22 1,312.00 5,935,178       247100   High
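
The same categorization can be written as a single expression and stored as a factor. The sketch below is an alternative under the same thresholds, not the method used in this section; the vector `trades` repeats the trade counts from the output above, and the other names are chosen for this example. A character variable is shown in alphabetical order in a legend (High, Low, Medium), whereas a factor keeps the order of its levels.

```r
# Trade counts copied from the output above
trades <- c(152770, 160429, 174541, 126359, 133385,
            108646, 149498, 159998, 269494, 247100)

# Same thresholds as the text: <= 140000 Low, >= 170000 High, otherwise Medium
trade_chr <- ifelse(trades <= 140000, "Low",
             ifelse(trades >= 170000, "High", "Medium"))

# Explicit levels keep the legend in Low, Medium, High order
trade_level <- factor(trade_chr, levels = c("Low", "Medium", "High"))

table(trade_level)
```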

    The aes(color = z1) mapping below sets the marker color according to the Low, Medium, and High categories.

          z1 <- stock10$Trade
          base_s_plot2 +
            aes(color = z1) +
            labs(color = "Closing")
    
    
    Color

    ↪ Aesthetics: Shape

    Further, visualization can be extended to see Volume relative to the Closing Price. The following code creates a new 'Activity' variable with the values Low, Medium, and High based on the trade volume.

          stock10$Volume <- as.numeric(gsub(",","",stock10$Volume))
          stock10$Activity[stock10$Volume <= 4000000] <- "Low"
          stock10$Activity[stock10$Volume > 4000000 & stock10$Volume < 6000000] <- "Medium"
          stock10$Activity[stock10$Volume >= 6000000] <- "High"
          stock10
    
          ---Output---
                  Date    Close  Volume No.of.trades  Trade Activity
          2  01-Jun-22 1,394.85 6045948       152770 Medium     High
          3  02-Jun-22 1,385.10 5737510       160429 Medium   Medium
          4  03-Jun-22 1,380.30 3478622       174541   High      Low
          5  06-Jun-22 1,378.45 3086633       126359    Low      Low
          6  07-Jun-22 1,362.60 5126983       133385    Low   Medium
          7  08-Jun-22 1,367.40 2993294       108646    Low      Low
          8  09-Jun-22 1,377.70 3655908       149498 Medium      Low
          9  10-Jun-22 1,351.10 4577715       159998 Medium   Medium
          10 13-Jun-22 1,326.60 5446286       269494   High   Medium
          11 14-Jun-22 1,312.00 5935178       247100   High   Medium

    The aes(color = z1, shape = z2) mapping below adds marker shapes that indicate Low, Medium, and High activity.

          z2 <- stock10$Activity
          base_s_plot2 +
            aes(color = z1, shape = z2) +
            labs(color = "Closing",shape = "Volume")
    
    
    Shape

    ↪ Legends

    Legends are controlled through theme settings and can be placed almost anywhere. Use the legend.position option and specify "top", "right", "bottom", or "left".

          base_s_plot2 +
            aes(color = z1, shape = z2) +
            labs(color = "Closing",shape = "Volume") +
            theme(
              legend.position = "bottom",
              legend.box = "vertical"
            )
    
    
    Legends

    ↪ Legends: Placing the legend inside the plot area

    To put the legend inside the plot area, set legend.position to a numeric vector of length 2 holding the x and y coordinates, each between 0 and 1.

    The legend.justification setting chooses the corner of the legend that the position refers to.

          base_s_plot2 +
            aes(color = z1, shape = z2) +
            labs(color = "Closing",shape = "Volume") +
            theme(
              legend.position = c(.98, .98),
              legend.justification = c("right", "top"),
              legend.box.just = "right",
              legend.margin = margin(6, 6, 6, 6)
            )
    
    
    Placing the legend inside the plot area
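
From ggplot2 3.5.0 onward, a numeric legend.position is deprecated in favor of legend.position = "inside" combined with legend.position.inside. The sketch below picks the syntax based on the installed version; the data frame `demo` is hypothetical and invented for this example.

```r
library(ggplot2)

# Hypothetical data, invented for this sketch
demo <- data.frame(
  session = 1:4,
  close   = c(10, 12, 11, 13),
  level   = c("Low", "High", "Low", "High")
)

p <- ggplot(demo, aes(session, close, color = level)) +
  geom_point()

if (packageVersion("ggplot2") >= "3.5.0") {
  # Newer syntax: request an inside legend, then give its coordinates
  p_inside <- p + theme(
    legend.position        = "inside",
    legend.position.inside = c(0.98, 0.98),
    legend.justification   = c("right", "top")
  )
} else {
  # Older syntax, as used in this section
  p_inside <- p + theme(
    legend.position      = c(0.98, 0.98),
    legend.justification = c("right", "top")
  )
}
```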

    ↪ Legends: Placing the legend side by side

    The legend.box = "horizontal" setting places the legends side by side.

          base_s_plot2 +
            aes(color = z1, shape = z2) +
            labs(color = "Closing",shape = "Volume") +
            theme(
              legend.position = c(.98, .98),
              legend.justification = c("right", "top"),
              legend.box.just = "right",
              legend.margin = margin(6, 6, 6, 6),
              legend.box = "horizontal"
            )
    
    
    Placing the legend side by side

    ↪ Legends: Remove one part of the legend

    A single legend can be suppressed using the guides() helper function. The guides(shape = "none") call below removes the shape legend and keeps the color legend.

          base_s_plot2 +
            aes(color = z1, shape = z2) +
            labs(color = "Closing",shape = "Volume") +
            theme(
              legend.position = c(.98, .98),
              legend.justification = c("right", "top"),
              legend.box.just = "right",
              legend.margin = margin(6, 6, 6, 6),
              legend.box = "horizontal"
            ) +
            guides(shape="none")
    
    
    Remove one part of the legend

    ↪ Themes: Remove the legend altogether

    The theme property legend.position = "none" can also be used to remove the legends altogether.

          base_s_plot2 +
            aes(color = z1, shape = z2) +
            labs(color = "Closing",shape = "Volume") +
            theme(
              legend.position = "none"
            )
    
    
    Remove the legend altogether

    ↪ Bar Chart

    The geom_bar() and geom_col() functions create bar charts. The geom_col() uses stat_identity() which leaves the data as is and thus makes the heights of the bars equal to values in the data.

    The geom_bar() uses stat_count() which counts the number of values at each x position and makes the height of the bar proportional to the number of values in each group. The geom_col() is equivalent to geom_bar(stat = "identity").

    The code below renders a bar chart of the 'Closing Price' of the Andromeda stock.

          base_bar_plot <- ggplot(data=stock10, aes(x, y, fill = "Closing Price" )) +
            ylim(0000,1700) +
            theme(
              axis.text.x = element_text(angle=45, vjust = 0.5, hjust = 0.5),
              plot.title = element_text(hjust = 0.5, color = "black", face = "bold"),
              panel.grid = element_line(linetype = 0),
              panel.border = element_rect(color = "black", linewidth = 0, fill = NA),
              plot.background = element_rect(color = "black", linewidth = 2, fill = NA),
              legend.position = "none"
            ) +
            labs(
              x = "Date",
              y = "Closing Price",
              title = "Andromeda"
            ) 
     
          base_bar_plot +
            geom_col() 
    
    
    Bar Chart
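
The difference between geom_bar() and geom_col() described above can be checked on a small example. The data frames `trades` and `totals` are hypothetical and invented for this sketch; layer_data() returns the values ggplot2 computed for a layer.

```r
library(ggplot2)

# Hypothetical data, invented for this sketch
trades <- data.frame(level = c("Low", "Low", "Low", "High", "High"))
totals <- data.frame(level = c("Low", "High"), n = c(3, 2))

# geom_bar() counts the rows at each x position
p_bar <- ggplot(trades, aes(x = level)) + geom_bar()

# geom_col() plots the supplied y values unchanged
p_col <- ggplot(totals, aes(x = level, y = n)) + geom_col()

# Both layers hold the same two bar heights, 2 and 3
sort(layer_data(p_bar)$count)
sort(layer_data(p_col)$y)
```

Use geom_bar() on raw rows that still need counting and geom_col() on values that are already summarized.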

    ↪ Bar Chart: Choosing the color of the columns

    The fill argument to the geom_col() function changes the color of the columns.

          base_bar_plot +
            geom_col(width = 0.5, fill = "lightblue")
    
    
    Choosing the color of the columns

    ↪ Bar Chart: Interchange x and y axes

    The coord_flip() function interchanges the x and y axes: the horizontal axis becomes the vertical axis, and the vertical axis becomes the horizontal axis.

          base_bar_plot +
            geom_col(width = 0.5, fill = "lightblue") +
            coord_flip() 
    
    
    Interchange x and y axes

    ↪ Line Chart

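    The dates on the x-axis are character values, so by default each date forms its own group with only one observation and geom_line() has nothing to connect. The group aesthetic controls which rows of the data get grouped together for geom_ functions such as geom_line(). Setting group = 1 puts all rows in a single group so that one line is drawn through them.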

          base_line_plot <- 
          ggplot(data=stock10, aes(x, y, group = 1)) +
            ylim(1000,1700) +
            theme(
              axis.text.x = element_text(angle=45, vjust = 0.5, hjust = 0.5),
              plot.title = element_text(hjust = 0.5, color = "black", face = "bold"),
              panel.grid = element_line(linetype = 0),
              panel.border = element_rect(color = "black", linewidth = 0, fill = NA),
              plot.background = element_rect(color = "black", linewidth = 2, fill = NA),
              legend.position = "none"
            ) +
            labs(
              x = "Date",
              y = "Closing Price",
              title = "Andromeda"
            ) 
     
          base_line_plot +
             geom_line()
    
    
    Line Chart
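
The effect of group = 1 can be inspected directly. In the sketch below, the data frame `demo` is hypothetical and invented for this example; layer_data() exposes the group that ggplot2 assigned to each row.

```r
library(ggplot2)

# Hypothetical data with a character (discrete) x variable
demo <- data.frame(day = c("Mon", "Tue", "Wed"), close = c(10, 12, 11))

# Without group, each day is its own group, so no line segment can be drawn
p_default <- ggplot(demo, aes(day, close)) + geom_line()

# group = 1 places all rows in a single group and connects them
p_grouped <- ggplot(demo, aes(day, close, group = 1)) + geom_line()

n_default <- length(unique(layer_data(p_default)$group))  # one group per day
n_grouped <- length(unique(layer_data(p_grouped)$group))  # a single group
c(n_default, n_grouped)
```

Building `p_default` prints a message that each group consists of only one observation, which is the usual hint that a group aesthetic is missing.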

    ↪ Line Chart: Choosing the line color

    The linewidth argument to the geom_line() function changes the line thickness. Similarly, the color argument changes the line color.

          base_line_plot +
            geom_line(color = "lightblue", linewidth = 2) 
    
    
    Choosing the line color

    ↪ Line Chart: Line Type

    The linetype = "dashed" argument to the geom_line() function renders a dashed line. Other line types are selected by changing the value of the linetype argument to "dotted", "dotdash", "longdash", "twodash", "solid", or "blank".

          base_line_plot +
            geom_line(color = "lightblue", linewidth = 2, linetype = "dashed") 
    
    
    Line Type

    ↪ Line Chart: Stair Step

    The geom_step() creates a stair step plot, highlighting when changes occur.

          base_line_plot +
            geom_step(color = "lightblue", linewidth = 1) 
    
    
    Stair Step

    ↪ Line Chart: Smoothing

    The geom_smooth() function adds a trend line. For small data sets such as this one, the trend line is created using the LOESS smoother by default. LOESS is a non-parametric smoother that fits locally weighted regressions to neighboring points.

          base_line_plot +
            geom_smooth(color = "lightblue", linewidth = 1) 
    
    
    Smoothing

    ↪ Line Chart: Smoothing

    The geom_smooth() function draws a 95% confidence interval by default, but a different confidence level can be specified with the level argument. The code below changes the confidence level to 99%.

          base_line_plot +
            geom_smooth(color = "lightblue", linewidth = 1, level = 0.99) 
    
    
    Smoothing

    ↪ Line Chart: Smoothing

    The argument se = FALSE removes the confidence interval band.

          base_line_plot +
            geom_smooth(color = "lightblue", linewidth = 1, se = FALSE) 
    
    
    Smoothing

    ↪ Line Chart: Smoothing

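    The amount of smoothing can be controlled by passing the span argument to the geom_smooth() function. Smaller values follow the data more closely, and larger values give a smoother line.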

          base_line_plot +
            geom_smooth(color = "lightblue", linewidth = 1, span = 0.6) 
    
    
    Smoothing
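
The LOESS fit behind geom_smooth() comes from the loess() function in base R, so the effect of span can be examined without drawing a plot. The sketch below uses deterministic data invented for this example and compares how closely two spans follow it.

```r
# Deterministic example data, invented for this sketch
x <- 1:50
y <- sin(x / 5)

# A smaller span uses fewer neighboring points for each local fit
fit_tight <- loess(y ~ x, span = 0.3)
fit_loose <- loess(y ~ x, span = 0.9)

# Residual sum of squares: the tighter fit stays closer to the data
rss_tight <- sum(residuals(fit_tight)^2)
rss_loose <- sum(residuals(fit_loose)^2)
rss_tight < rss_loose
```

A small span is not automatically better: on noisy data it also follows the noise, so the value is usually chosen by looking at the resulting curve.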

    ↪ Hybrid Chart

    A scatter plot can be combined with a line plot to form a hybrid chart, also known as a connected scatter plot.

          base_line_plot +
            geom_line() +
            geom_point()
    
    
    Hybrid Chart

    ↪ Grouped Charts

    A portfolio of stock data is used for demonstrating grouped charts.

          p <- read.csv('https://raw.githubusercontent.com/csxplore/data/main/portfolio10.csv', header=T)
          p
    
          ---Output---
                  Date   Close Holding        Corp   Value
          1  01-Jun-22 1394.85      20   Andromeda 27897.0
          2  02-Jun-22 1385.10      20   Andromeda 27702.0
          3  03-Jun-22 1380.30      20   Andromeda 27606.0
          4  06-Jun-22 1378.45      20   Andromeda 27569.0
          5  07-Jun-22 1362.60      20   Andromeda 27252.0
          6  08-Jun-22 1367.40      20   Andromeda 27348.0
          7  09-Jun-22 1377.70      20   Andromeda 27554.0
          8  10-Jun-22 1351.10      20   Andromeda 27022.0
          9  13-Jun-22 1326.60      20   Andromeda 26532.0
          10 14-Jun-22 1312.00      20   Andromeda 26240.0
          11 01-Jun-22  468.30      50 Canis Major 23415.0
          12 02-Jun-22  469.85      50 Canis Major 23492.5
          13 03-Jun-22  464.50      50 Canis Major 23225.0
          14 06-Jun-22  463.70      50 Canis Major 23185.0
          15 07-Jun-22  463.40      50 Canis Major 23170.0
          16 08-Jun-22  471.30      50 Canis Major 23565.0
          17 09-Jun-22  466.95      50 Canis Major 23347.5
          18 10-Jun-22  461.85      50 Canis Major 23092.5
          19 13-Jun-22  445.85      50 Canis Major 22292.5
          20 14-Jun-22  448.10      50 Canis Major 22405.0

    ↪ Grouped Bar Chart

    A grouped bar or column plot displays a numeric value for a set of entities split into groups and subgroups. The position="dodge" argument to geom_col() places the columns of each group side by side.

          base_2bar_plot <- ggplot(data=p, aes(Date, Value, fill = Corp )) +
            theme(
              axis.text.x = element_text(angle=45, vjust = 0.5, hjust = 0.5),
              plot.title = element_text(hjust = 0.5, color = "black", face = "bold"),
              panel.grid = element_line(linetype = 0),
              panel.border = element_rect(color = "black", linewidth = 0, fill = NA),
              plot.background = element_rect(color = "black", linewidth = 2, fill = NA),
              legend.position = c("bottom"),
              legend.box = "horizontal"
            ) +
            labs(
              x = "Date",
              y = "Value $",
              title = "Portfolio",
              fill = "Stock"
            ) 
     
          base_2bar_plot + 
             geom_col(width = 0.75, position="dodge") +
             ylim(0000,30000) 
    
    
    Grouped Chart

    ↪ Stacked Bar Chart

    A stacked bar plot is very similar to the grouped bar plot, but the subgroups are displayed on top of each other. The position="stack" argument to geom_col() results in columns stacked.

          base_2bar_plot + 
            geom_col(width = 0.75, position="stack") +
            ylim(0000,60000) 
    
    
    Stacked Bar Chart

    ↪ Percentage Stacked Bar Chart

    The position="fill" argument to geom_col() results in a percent stacked bar plot. The scale_y_continuous(labels = scales::percent) call converts the y-axis labels into percentages.

          base_2bar_plot + 
            geom_col(width = 0.75, position="fill") +
            scale_y_continuous(labels = scales::percent) +
            labs(
              y = "Value in percent"
            )
    
    
    Percentage Stacked Bar Chart
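
What position = "fill" displays is each stock's share of the total for its date. The base R sketch below computes those shares by hand for two dates copied from the portfolio output earlier in this section; the data frame name `pf` and the `Share` column are chosen for this example.

```r
# Values for two dates copied from the portfolio output above
pf <- data.frame(
  Date  = c("01-Jun-22", "01-Jun-22", "02-Jun-22", "02-Jun-22"),
  Corp  = c("Andromeda", "Canis Major", "Andromeda", "Canis Major"),
  Value = c(27897.0, 23415.0, 27702.0, 23492.5)
)

# Share of each stock within its date: value divided by the date total
pf$Share <- pf$Value / ave(pf$Value, pf$Date, FUN = sum)

round(pf$Share, 3)
```

The shares for each date add up to 1, which is why every bar in the percent stacked chart reaches 100%.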

    ↪ Grouped Bar Chart: Choosing the column colors

    The fill colors of the columns can be chosen with the scale_fill_manual() function.

          base_2bar_plot +
            geom_col(width = 0.75, position="dodge") +
            ylim(0000,30000) +
            scale_fill_manual(values = c("grey","lightblue"))
    
    
    Choosing the column colors

    ↪ Add labels to a Dodged Bar Chart

    The geom_text() function adds labels to the bar plot. Note: In the graph below, the labels show the value divided by 10,000 and rounded to one decimal place.

          d <- round(p$Value / 10000, digits =  1)
          base_2bar_plot +
            geom_col(width = 0.75, position="dodge") +
            ylim(0000,30000) +
            scale_fill_manual(values = c("grey","lightblue")) +
            geom_text(aes(vjust = 5,  label = d ), 
                      position = position_dodge(0.9),
                      size = 2,
                      color = "red"
            )
    
    
    Add labels to a Dodged Bar Chart
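
Labels sit centered over dodged columns when geom_text() is dodged by the same width as the columns. The sketch below is a variation on the example above, not the original code: it shares one width between both layers and places the labels just above the columns. The data frame `demo` and the name `bar_width` are invented for this example.

```r
library(ggplot2)

# Hypothetical data, invented for this sketch
demo <- data.frame(
  day   = rep(c("Mon", "Tue"), each = 2),
  corp  = rep(c("A", "B"), times = 2),
  value = c(27.9, 23.4, 27.7, 23.5)
)

bar_width <- 0.75   # shared by the columns and the labels

p <- ggplot(demo, aes(day, value, fill = corp)) +
  geom_col(width = bar_width, position = position_dodge(width = bar_width)) +
  geom_text(aes(label = value),
            position = position_dodge(width = bar_width),
            vjust = -0.5,          # place each label just above its column
            size = 3) +
  ylim(0, 35)

# Horizontal positions of the columns and of the labels
col_x  <- sort(as.numeric(layer_data(p, 1)$x))
text_x <- sort(as.numeric(layer_data(p, 2)$x))
```

Comparing `col_x` and `text_x` shows whether the labels and columns share the same horizontal positions.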

    ↪ Add labels to a Stacked Bar Chart

    The position = position_stack(0.9) argument to the geom_text() function places a label within each segment of a stacked bar plot.

          d <- round(p$Value / 10000, digits =  1)
          base_2bar_plot +
            geom_col(width = 0.75, position="stack") +
            ylim(0000,60000) +
            scale_fill_manual(values = c("grey","lightblue")) +
            geom_text(aes(vjust = 5,  label = d ), 
                      position = position_stack(0.9),
                      size = 2,
                      color = "red"
            )
    
    
    Add labels to a Stacked Bar Chart

    ↪ Line graphs with multiple lines

    By default, the group aesthetic is set to the interaction of all discrete variables in the plot. The group = Corp argument to the aes() function groups the data by the 'Corp' column of the data frame, which holds the name of the corporation, so that one line is drawn per stock.

          base_line2_plot <- ggplot(data=p, aes(Date, Value, color = Corp, group = Corp )) +
            theme(
              axis.text.x = element_text(angle=45, vjust = 0.5, hjust = 0.5),
              plot.title = element_text(hjust = 0.5, color = "black", face = "bold"),
              panel.grid = element_line(linetype = 0),
              panel.border = element_rect(color = "black", linewidth = 0, fill = NA),
              plot.background = element_rect(color = "black", linewidth = 2, fill = NA),
              legend.position = c("bottom"),
              legend.box = "horizontal"
            ) +
            labs(
              x = "Date",
              y = "Value",
              title = "Portfolio",
              color = "Stock"            # Sets legend title to "Stock"
            ) 
     
          base_line2_plot +
            geom_line() +
            ylim(20000,30000) 
    
    
    Line graphs with multiple lines

    ↪ Stacked Line graphs with multiple lines

    The position = "stack" argument to the geom_line() function draws stacked lines for the two stocks, which are grouped by the group = Corp argument to aes().

          base_line2_plot +
            geom_line(position = "stack") +
            ylim(0000,60000) 
    
    
    Stacked Line graphs with multiple lines

    ↪ Area Chart

    To change the stacked line chart into a chart with filled areas between the lines, map the fill aesthetic with fill = Corp and add the geom_area() function with the argument position = "stack".

    Arguments color = Corp and fill = Corp would cause legends to print twice. The guides() function, guides(color = "none"), helps to suppress the legend corresponding to the line.

          ggplot(data=p, aes(Date, Value, color = Corp, group = Corp, fill = Corp)) +
            geom_line(position = "stack") +
            geom_area(position = "stack", stat = "identity", alpha = 0.5) +
            ylim(0000,60000) +
            theme(
              axis.text.x = element_text(angle=45, vjust = 0.5, hjust = 0.5),
              plot.title = element_text(hjust = 0.5, color = "black", face = "bold"),
              panel.grid = element_line(linetype = 0),
              panel.border = element_rect(color = "black", linewidth = 0, fill = NA),
              plot.background = element_rect(color = "black", linewidth = 2, fill = NA),
              legend.position = c("bottom"),
              legend.box = "horizontal"
            ) +
            labs(
              x = "Date",
              y = "Value $",
              title = "Portfolio",
              fill = "Stock"
            ) +
            guides(color = "none")
    
    
    Area Chart

    ↪ Secondary Axis

    Stock index data is used for demonstrating charts with a secondary axis.

          stockindex <- read.csv('https://raw.githubusercontent.com/csxplore/data/main/stockindex.csv', header=T)
          stockindex$Date <- format(as.Date(stockindex$Date, '%d-%b-%Y'), format = '%d-%b-%y')
          stockindex10 <- stockindex[2:11,]
          stockindex10
    
          ---Output---
                  Date     Open     High      Low    Close Adj.Close Volume
          2  01-Jun-22 16594.40 16649.20 16438.85 16522.75  16522.75 249600
          3  02-Jun-22 16481.65 16646.40 16443.05 16628.00  16628.00 236000
          4  03-Jun-22 16761.65 16793.85 16567.90 16584.30  16584.30 245500
          5  06-Jun-22 16530.70 16610.95 16444.55 16569.55  16569.55 233600
          6  07-Jun-22 16469.60 16487.25 16347.10 16416.35  16416.35 233800
          7  08-Jun-22 16474.95 16514.30 16293.35 16356.25  16356.25 243500
          8  09-Jun-22 16263.85 16492.80 16243.85 16478.10  16478.10 205000
          9  10-Jun-22 16283.95 16324.70 16172.60 16201.80  16201.80 189700
          10 13-Jun-22 15877.55 15886.15 15684.00 15774.40  15774.40 225500
          11 14-Jun-22 15674.25 15858.00 15659.45 15732.10  15732.10 225400

    ↪ Secondary Axis

    The scale_x_continuous() and scale_y_continuous() functions display a secondary axis through their sec.axis argument. The secondary axis is positioned opposite the primary axis and is a one-to-one transformation of it. The sec_axis() function creates the specification for a secondary axis, taking the transformation formula and the axis name.

          base2nd_axis_plot <- 
          ggplot() +
            geom_col(width = 0.75, position="dodge",
                     data=p, aes(Date, Value, fill = Corp )) +
            geom_line(data=stockindex10, aes(Date, Close), color = "red", group = 1) +
            scale_fill_manual(values = c("grey","lightblue")) +
            theme(
              axis.text.x = element_text(angle=45, vjust = 0.5, hjust = 0.5),
              plot.title = element_text(hjust = 0.5, color = "blue", face = "bold"),
              panel.grid = element_line(linetype = 0),
              panel.border = element_rect(color = "black", linewidth = 0, fill = NA),
              plot.background = element_rect(color = "black", linewidth = 2, fill = NA),
              legend.position = c("bottom"),
              legend.box = "horizontal",
              axis.title.y.right = element_text(color = "red")
            ) +
            labs(
              x = "Date",
              y = "Value",
              title = "Portfolio"
            ) 
     
          base2nd_axis_plot +
            scale_y_continuous(sec.axis = sec_axis(~.*1, name = "Index"),
                               limits = c(0000,30000)) 
    
    
    Secondary Axis
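
The transformation formula matters when the secondary axis uses different units. The sketch below uses hypothetical temperature readings, invented for this example, to show a secondary axis that relabels the primary Celsius axis in Fahrenheit.

```r
library(ggplot2)

# Hypothetical temperature readings, invented for this sketch
temps <- data.frame(hour = 1:5, celsius = c(10, 12, 15, 14, 11))

p <- ggplot(temps, aes(hour, celsius)) +
  geom_line() +
  scale_y_continuous(
    name     = "Celsius",
    # The formula converts each primary-axis value to the secondary scale
    sec.axis = sec_axis(~ . * 9 / 5 + 32, name = "Fahrenheit")
  )
```

The secondary axis only relabels positions on the primary axis and does not rescale any plotted data. A series measured on a different scale has to be transformed onto the primary scale before plotting, with the inverse of the formula given to sec_axis().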

    ↪ Duplicate Axis

    The dup_axis() provides a shorthand approach for creating a secondary axis duplicating or mirroring the primary axis.

          base2nd_axis_plot +
            scale_y_continuous(sec.axis = dup_axis(),
                               limits = c(0000,30000)) 
    
    
    Duplicate Axis

    ↪ Summary

  • The ggplot() function, from the ggplot2 package, is one of the most versatile and elegant functions in R for drawing charts.
  • The ggplot2 package implements the grammar of graphics, which consists of Data, Aesthetics, Layers or Geometries, Labels, Themes, and many other optional elements.
  • Data, Aesthetics, and Layers or Geometries are the mandatory elements of a plot.
  • Layers or Geometries define the type of graphic, and their functions start with the geom_ prefix. For example, geom_line() draws a line graph.
  • Aesthetics describe how variables in the data are mapped to visual properties of geom functions. Commonly used mappings are color, size, and shape.
  • Themes help make the plot aesthetically pleasing. For example, a theme sets the appearance of the plot title, axis text, plot background, grid lines, and so on.
  • Legends can be positioned and styled through theme elements.
  • The syntax of the ggplot() function is

          ggplot(data = .., aes(x, y, ...)) +
            geom_FUN() +                             # FUN is line, point, col, bar, etc.
            theme() +
            labs()
    
    
    OR
          ggplot(data = .., aes(x, y)) +
            geom_FUN(aes(...)) +                     # FUN is line, point, col, bar, etc.
            theme() +
            labs()
    
    
    OR
          ggplot(data = ..) +
            geom_FUN(aes(x, y,...)) +               # FUN is line, point, col, bar, etc.
            theme() +
            labs()