Interesting Ways to Graph Stock Data in R

Compared to other programming languages, R offers some of the most powerful tools for visualizing data. After all, a picture is worth a thousand words, right? In this post, we will walk through several types of plots that examine stock data in unique ways. Let’s start with the box-whisker plot.

Box-Whisker Plot

Interpretation:

The box-whisker plot gives us a better statistical representation of our S&P 500 data than a basic line graph can. It lets us show the seasonality of returns by year or by month, and we can even devise trading strategies based on the periods in the graph (monthly or yearly). The bold middle line in each box marks the median of the monthly returns for that year. The lines extending from the boxes are called “whiskers,” and here they reach the minimum and maximum values for each year. The red dots mark the average for each year. The edges of each box, which run perpendicular to the whiskers, mark the lower and upper quartiles. At the lower quartile, 25% of the data is below the line and 75% is above it. Conversely, at the upper quartile, 75% of the data is below the line and 25% is above it.
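As a rough illustration of how a plot like this can be built, here is a minimal sketch in base R. The returns are simulated, not real S&P 500 data, and the names `returns`, `bp`, and `yearly_mean` are made up for this example. Setting `range = 0` is an assumption chosen so that the whiskers reach the minimum and maximum, as described above.

```r
# Simulated monthly returns (NOT real S&P 500 data): five years of 12 months
set.seed(42)
returns <- data.frame(
  year = factor(rep(2015:2019, each = 12)),
  ret  = rnorm(60, mean = 0.007, sd = 0.04)
)

# range = 0 extends the whiskers to the minimum and maximum of each year
bp <- boxplot(ret ~ year, data = returns, range = 0,
              xlab = "Year", ylab = "Monthly return",
              main = "Monthly returns by year (simulated)")

# Overlay each year's mean as a red dot
yearly_mean <- tapply(returns$ret, returns$year, mean)
points(seq_along(yearly_mean), yearly_mean, col = "red", pch = 19)

# bp$stats holds one column per year:
# lower whisker, lower hinge, median, upper hinge, upper whisker
print(round(bp$stats, 4))
```

Note that `boxplot()` draws the box at Tukey's hinges, which are close to, but not always identical to, the quartiles returned by `quantile()`.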

Calendar Heat Map

Below are a few iterations of calendar heat maps from two separate packages:

iClick:

ggplot2:

ggplot2:

Interpretation:

The calendar heat map lets us visualize daily data through color. Examining the legend for each graph, we can see that higher returns map to the colors in the upper range of the legend. This allows us to scan the calendar quickly and spot the highest and lowest returns by day, week, or month.
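One way to approximate a calendar heat map with ggplot2 is to draw one tile per trading day, with the week of the year on one axis and the weekday on the other. The sketch below assumes ggplot2 is installed and uses simulated daily returns. The data frame `cal`, its columns, and the color scale are illustrative choices, not the settings behind the figures above.

```r
library(ggplot2)

# Simulated daily returns for the weekdays of one year (NOT real market data)
set.seed(1)
dates <- seq(as.Date("2019-01-01"), as.Date("2019-12-31"), by = "day")
dates <- dates[!format(dates, "%u") %in% c("6", "7")]  # drop Saturday and Sunday

cal <- data.frame(
  date    = dates,
  ret     = rnorm(length(dates), mean = 0, sd = 0.01),
  week    = as.integer(format(dates, "%U")),  # week of year, 0 to 53
  weekday = factor(format(dates, "%u"),       # 1 = Monday ... 5 = Friday
                   levels = as.character(5:1),
                   labels = c("Fri", "Thu", "Wed", "Tue", "Mon"))
)

# One tile per day; the fill color encodes the size and sign of the return
p <- ggplot(cal, aes(x = week, y = weekday, fill = ret)) +
  geom_tile(colour = "white") +
  scale_fill_gradient2(low = "red", mid = "white", high = "darkgreen",
                       midpoint = 0, name = "Daily return") +
  labs(x = "Week of year", y = NULL,
       title = "Calendar heat map (simulated daily returns)") +
  theme_minimal()
print(p)
```

A diverging scale centered on zero is used here so that gains and losses are easy to tell apart at a glance. A sequential scale would work as well if only the magnitude matters.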

Correlogram

Interpretation:

The correlogram is a variation of the correlation matrix or heat map that makes the relationships between multiple variables easier to see. In this case, stronger relationships are denoted by larger circles. The Pearson correlation coefficient ranges from -1 to 1. A perfect positive correlation (1) means the assets move directly in tandem, a perfect negative correlation (-1) means they move in exactly opposite directions, and a value near 0 means there is no linear relationship. The values are represented by color in the legend. For negative relationships, we would look for larger circles with an intense green hue.
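To make the idea concrete, here is a minimal base R sketch that computes a Pearson correlation matrix and draws it as circles sized by the strength of each relationship. The four assets (`A` to `D`) are hypothetical and their returns are simulated. The colors are arbitrary picks for this example and do not match the legend of the figure above.

```r
# Simulated daily returns for four hypothetical assets (NOT real data)
set.seed(7)
n <- 250
market <- rnorm(n, 0, 0.01)
rets <- cbind(
  A = market + rnorm(n, 0, 0.005),   # moves with the market
  B = market + rnorm(n, 0, 0.010),   # moves with the market, but noisier
  C = -market + rnorm(n, 0, 0.005),  # moves against the market
  D = rnorm(n, 0, 0.01)              # unrelated to the market
)

cm <- cor(rets)  # Pearson correlation is the default method
k <- ncol(cm)

# One circle per pair of assets; the radius scales with |correlation|
grid <- expand.grid(row = seq_len(k), col = seq_len(k))
r <- cm[cbind(grid$row, grid$col)]

plot(NA, xlim = c(0.5, k + 0.5), ylim = c(0.5, k + 0.5), asp = 1,
     xaxt = "n", yaxt = "n", xlab = "", ylab = "",
     main = "Correlogram (simulated returns)")
symbols(x = grid$col, y = k + 1 - grid$row, circles = abs(r),
        inches = 0.25, add = TRUE, fg = "grey40",
        bg = ifelse(r >= 0, "steelblue", "firebrick"))
axis(3, at = seq_len(k), labels = colnames(cm))
axis(2, at = k:1, labels = rownames(cm), las = 1)

print(round(cm, 2))
```

In this simulated setup, `A` and `C` should show a large circle in the negative color, while `D` should show small circles against every other asset.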

Violin Chart

Interpretation:

The violin chart allows us to better understand the distribution of our data for each month. Here, we are trying to convey seasonality within the market. A median and quartile representation is also added to each month's figure. The varying width of each violin, which shows where returns are concentrated, is an advantage the box-whisker plot did not give us. For example, we can examine the month of December and note that many returns cluster around 1% to 2%. We can also deduce that volatility in December is relatively low given the minimum and maximum values, or in other words the length of the “violin.” Conversely, we can see that October had a high amount of volatility.
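A chart along these lines can be sketched with ggplot2 by combining `geom_violin()` with a narrow `geom_boxplot()` that supplies the median and quartiles. The example below assumes ggplot2 is installed and uses simulated monthly returns, so it will not reproduce the December and October patterns described above. The names `sim` and `spread` are made up for this illustration.

```r
library(ggplot2)

# Simulated monthly returns over 30 years (NOT real market data)
set.seed(123)
years <- 30
sim <- data.frame(
  month = factor(rep(month.abb, times = years), levels = month.abb),
  ret   = rnorm(12 * years, mean = 0.007, sd = 0.04)
)

# The violin width shows where returns cluster; the thin box plot inside
# adds the median and the lower and upper quartiles for each month
p <- ggplot(sim, aes(x = month, y = ret)) +
  geom_violin(fill = "lightblue") +
  geom_boxplot(width = 0.1, outlier.shape = NA) +
  labs(x = "Month", y = "Monthly return",
       title = "Monthly return distributions (simulated)")
print(p)

# The length of each violin corresponds to the range of returns per month
spread <- tapply(sim$ret, sim$month, function(x) diff(range(x)))
print(round(spread, 3))
```

Comparing the values in `spread` is a quick numerical check on which months look most and least volatile in the chart.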

About the author

programmingforfinance

Hi, I'm Frank. I have a passion for coding and extend it primarily within the realm of Finance.
