# Scatter Plot and Bubble Plot

A Scatter Plot (or Scatter Diagram) provides a graphical representation of the relationship between two continuous variables. It helps determine the direction of the relationship and estimate its strength.

A Bubble Plot (or Bubble Chart) is a type of chart that displays three dimensions of data. Two dimensions are expressed on the two axes, while the third dimension is represented by the size of the bubble. A Bubble Plot can also be understood as a Scatter Plot with a third dimension shown through bubble size.

An application-oriented question on the topic along with responses can be seen below. The best answer was provided by Mohamed Asif on 18th November 2019.

Applause for all the respondents - Mohamed Asif, Bhavin Mehta

## Question

Q 210. A scatter plot is useful to check the correlation between two variables. A bubble plot (available in Excel too) is sometimes used to show an additional third variable. Let us explore the Bubble Plot.

1. What are the variants for bubble plots?
2. What are some of the limitations of bubble plot?
3. How can a Bubble plot be misused or misinterpreted?

• 1

Both the Scatter Plot and the Bubble Plot examine the relationship between two variables (X Variable and Y Variable).

However, in a Bubble Chart, the area of each bubble represents the value of a third variable.

Bubble size can be mapped to either the area or the width of the bubble, depending on the input specification.
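The difference between the two sizing conventions can be sketched in Python. This is only an illustration, not the scaling rule that Minitab or Excel actually uses: the helper names `areas_from_values` and `areas_from_widths`, the `max_area` parameter, and the sample values are all assumptions made for the example. The helpers return areas because matplotlib's `scatter` interprets its `s` argument as marker area in points squared.

```python
def areas_from_values(values, max_area=900.0):
    """Scale so that bubble AREA is proportional to the value."""
    largest = max(values)
    return [max_area * v / largest for v in values]


def areas_from_widths(values, max_area=900.0):
    """Scale so that bubble WIDTH (diameter) is proportional to the value.

    Area grows with the square of the width, so the value ratio is squared.
    """
    largest = max(values)
    return [max_area * (v / largest) ** 2 for v in values]


def plot_bubbles(x, y, areas):
    """Draw a bubble plot; needs matplotlib, imported only when called."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(x, y, s=areas, alpha=0.5)  # s is marker area in points^2
    ax.set_xlabel("X Variable")
    ax.set_ylabel("Y Variable")
    return fig


# Hypothetical third-variable values (not from the plots in this post)
third = [10, 20, 40]
by_area = areas_from_values(third)    # [225.0, 450.0, 900.0]
by_width = areas_from_widths(third)   # [56.25, 225.0, 900.0]
```

With width-based sizing the smallest bubble shrinks to a sixteenth of the largest area instead of a quarter, so the same data can look far more spread out depending on which option is chosen.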

A Bubble Chart is built upon a Scatter Plot as its base.

The Scatter Plot and Bubble Plot below reference the same data points.

Scatter Plot 1 - Examining relationship between Y Variable and X Variable

Bubble Plot 1 - Examining relationship between Y Variable and X Variable, with bubble size representing a third variable

Variants: Based on the groups, we could have a simple bubble plot or one with groups.

Bubble Plot 2 - With Groups - 3 Categories: A, B, C
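A grouped bubble plot like this could be drawn roughly as follows. This is a minimal sketch assuming matplotlib is available. The record layout `(x, y, size, category)`, the helper names, and the four sample records are hypothetical and are not the 50 data points behind Bubble Plot 2.

```python
from collections import defaultdict


def group_points(records):
    """Split (x, y, size, category) records into per-category lists."""
    groups = defaultdict(lambda: {"x": [], "y": [], "size": []})
    for x, y, size, category in records:
        groups[category]["x"].append(x)
        groups[category]["y"].append(y)
        groups[category]["size"].append(size)
    return dict(groups)


def plot_grouped_bubbles(records):
    """One scatter call per category, so each group gets its own colour
    and its own legend entry."""
    import matplotlib.pyplot as plt  # imported only when drawing

    fig, ax = plt.subplots()
    for category, pts in sorted(group_points(records).items()):
        ax.scatter(pts["x"], pts["y"], s=pts["size"], alpha=0.5, label=category)
    ax.legend(title="Category")
    return fig


# Hypothetical records: (x, y, bubble area, category)
records = [(1, 2, 100, "A"), (2, 3, 300, "B"), (3, 1, 200, "A"), (4, 4, 150, "C")]
groups = group_points(records)
```

Calling `plot_grouped_bubbles(records)` returns a figure with one colour per category. The `alpha=0.5` transparency is a design choice that keeps partly overlapped bubbles visible.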

Limitations and Misinterpretations:

• The area or size of each bubble increases or decreases proportionally within the plot and does not depend on the largest value/size of the bubble, so there is a high chance of misjudging the actual value from the bubble size. However, Minitab has an option to edit the bubble size (Minitab can calculate the size, or we can use the actual size given in the specified variable).
• It is more complex to read and understand than a scatter plot.
• It becomes chaotic and confusing when the bubble plot has many data points (Bubble Plot 2 above uses 50 data points in 3 categories). It is not ideal for large data sets.
• Smaller bubbles are hard to identify (they may be covered or hidden), especially when they are close to or overlapped by a bigger bubble, so information is lost.
• Using jitter can help reveal overlapping points.
• However, jitter could confuse the reader because it is generated by a random function (a point does not land in the same place each time the plot is generated).
• It can be difficult to determine the exact location of data points when the bubbles are clustered.
• Without a clear legend, the reader can misinterpret the data points and the relationship.
• Negative size: any negative or null value of the third variable would not be visible, since a shape cannot have a negative area.
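The jitter concern in the list above can be reduced by fixing the random seed, so that the same offsets are produced every time the plot is regenerated. The sketch below uses only Python's standard `random` module. The `jitter` helper, its `amount` and `seed` parameters, and the sample values are assumptions for illustration, not the jitter implementation of Minitab or any other charting tool.

```python
import random


def jitter(values, amount=0.1, seed=None):
    """Add uniform noise in [-amount, +amount] to each value.

    Passing a fixed seed makes the offsets reproducible across runs.
    """
    rng = random.Random(seed)
    return [v + rng.uniform(-amount, amount) for v in values]


x = [1.0, 1.0, 1.0, 2.0]                  # three points overlap at x = 1.0
first = jitter(x, amount=0.1, seed=42)
second = jitter(x, amount=0.1, seed=42)   # same seed, identical offsets
```

Here `first == second` holds because both calls start from the same seed. Leaving `seed=None` gives different positions on each run, which is the behaviour that can confuse readers.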

Data is valuable only if we know how to visualize and give context.

It would be better to select the Chart based on the message that we want to share with the audience rather than just going with the chart type.

• 0

1. Two variables, with the third one shown as size.

2. Difficult to understand; cannot display the data well when there are many points; actual values are difficult or impossible to ascertain. Many bubbles concentrated at one point can overlap each other completely.

3. People normally interpret a bubble's size based on its radius and not its area. For example, if the area is twice as large, the radius is not twice as large (it grows only by a factor of √2, about 1.41), and this gives a false impression of size.
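The arithmetic behind point 3 can be checked with a few lines of Python. This is just a worked example for a circular bubble. The function name `radius_from_area` and the areas 100 and 200 are arbitrary choices for illustration.

```python
import math


def radius_from_area(area):
    """Radius of a circle with the given area (area = pi * r^2)."""
    return math.sqrt(area / math.pi)


r1 = radius_from_area(100.0)
r2 = radius_from_area(200.0)   # double the area
ratio = r2 / r1                # about 1.414 (sqrt(2)), not 2
```

A reader who judges by radius would see the second bubble as roughly 1.4 times the first, even though the underlying value is 2 times larger.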

• 0

Excellent response by Mohamed Asif, and he is the winner. Also commendable is the response by Bhavin. I am sure that if Bhavin had been able to spend more time, he would have built upon his answer.

