Plotting a similarity matrix using Matplotlib
Original post from here
Following on from the previous post on plotting similar neighborhoods of San Francisco and Austin, in this post I will briefly show how to plot the similarity scores in the form of a matrix.
The plot will look something like this:
San Francisco neighborhood similarity matrix.
- `ColorSchemes`: `default`
- `ColorSchemes`: `Greens`
- `ColorSchemes`: `YlGnBu`
- `ColorSchemes`: `RdYlGn`
How to plot the similarity matrix.
- Here is the snippet of code you will need to plot this matrix.
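````python
import matplotlib.pyplot as plt

# hood_menu_data and hood_cosine_matrix are assumed to be defined
# as in the previous post. Collect the neighborhood names as tick labels.
labels = []
for hood in hood_menu_data:
    labels.append(hood["properties"]["NAME"])

fig, ax = plt.subplots(figsize=(20, 20))
cax = ax.matshow(hood_cosine_matrix, interpolation='nearest')
ax.grid(True)
plt.title('San Francisco Similarity matrix')
# One tick per neighborhood (33 in this data set); rotate the x labels so they fit.
plt.xticks(range(len(labels)), labels, rotation=90)
plt.yticks(range(len(labels)), labels)
fig.colorbar(cax, ticks=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 1])
plt.show()
````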
- The main components to note:
- `matplotlib`: Plotting is done via `matplotlib`.
- `matshow`: This function takes the similarity matrix as input. Note that this can also be a correlation matrix between `n` variables.
- `grid`: Enable the grid using `ax.grid(True)`.
- `labels.append`: Builds the array of neighborhood names that is later passed to the `xticks` and `yticks` functions to label the matrix.
- `plt.xticks`: The labels array is passed to the `xticks` function along with the tick positions, one per element.
- `rotation=90`: The x tick labels are rotated by 90 degrees so that they are plotted vertically.
- `colorbar`: This is the color bar on the right of the matrix. It shows the gradient of colors used in the plot. In the default plot above, red corresponds to 1.0 and dark blue corresponds to 0 (newer Matplotlib versions default to `viridis`, so the default colors may differ).
- `ticks`: You can pass an array of values, which will be the values marked on the colorbar.
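The snippet above depends on data from the previous post. To try the same plotting steps without that data, here is a minimal sketch on made-up input. The neighborhood names, the random feature vectors, and the variable names `features` and `similarity` are assumptions for illustration only, and the cosine similarity is computed here with plain NumPy, which may differ from how the original matrix was built.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in data: 5 made-up neighborhoods, each described by
# 8 random non-negative features (for example, counts of menu terms).
rng = np.random.default_rng(0)
labels = ["Hood A", "Hood B", "Hood C", "Hood D", "Hood E"]
features = rng.random((len(labels), 8))

# Cosine similarity: scale each row to unit length, then take dot products.
unit = features / np.linalg.norm(features, axis=1, keepdims=True)
similarity = unit @ unit.T

fig, ax = plt.subplots(figsize=(6, 6))
cax = ax.matshow(similarity, interpolation="nearest", vmin=0.0, vmax=1.0)
ax.grid(True)
ax.set_title("Toy similarity matrix")

# One tick per label; x labels are rotated so they are drawn vertically.
ax.set_xticks(range(len(labels)))
ax.set_xticklabels(labels, rotation=90)
ax.set_yticks(range(len(labels)))
ax.set_yticklabels(labels)

fig.colorbar(cax)
```

To view the result, call `fig.savefig("toy_similarity_matrix.png")`, or drop the `matplotlib.use("Agg")` line and call `plt.show()`.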
Observations:
- The matrix is symmetric, and every entry on the diagonal is 1.0, since each neighborhood is perfectly similar to itself.
- `Seacliff` has an almost entirely blue line, which signifies that its similarity with all other neighborhoods is very low. Astute readers will note that the similarity values are not normalized across neighborhoods.
- `Chinatown`: One can easily see that Chinatown is most similar to the neighborhoods whose boxes are reddish and orange. Concretely, `Chinatown` is most similar to `Inner Richmond`, `Outer Richmond`, `Outer Sunset`, etc.
- There is a very high similarity (red and orange area) among `Outer Mission`, `Outer Richmond` and `Outer Sunset`.
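Observations like these can also be read off the matrix programmatically instead of by eye. The following is a rough sketch in plain Python. The helper names (`is_symmetric`, `most_similar`, `mean_similarity`) and the 4x4 matrix values are made up for illustration and are not the real San Francisco scores.

```python
def is_symmetric(matrix, tol=1e-9):
    """Check that matrix[i][j] equals matrix[j][i] for every pair."""
    n = len(matrix)
    return all(abs(matrix[i][j] - matrix[j][i]) <= tol
               for i in range(n) for j in range(i + 1, n))


def most_similar(matrix, labels, name, k=3):
    """Return the k (label, score) pairs most similar to `name`, excluding itself."""
    i = labels.index(name)
    others = [(labels[j], float(matrix[i][j]))
              for j in range(len(labels)) if j != i]
    return sorted(others, key=lambda pair: pair[1], reverse=True)[:k]


def mean_similarity(matrix, i):
    """Average similarity of row i to all other rows (diagonal excluded)."""
    row = [float(matrix[i][j]) for j in range(len(matrix)) if j != i]
    return sum(row) / len(row)


# Hypothetical toy values, NOT the scores from the plot.
labels = ["Chinatown", "Inner Richmond", "Outer Sunset", "Seacliff"]
matrix = [
    [1.00, 0.82, 0.78, 0.10],
    [0.82, 1.00, 0.85, 0.12],
    [0.78, 0.85, 1.00, 0.08],
    [0.10, 0.12, 0.08, 1.00],
]

print(is_symmetric(matrix))                           # True
print(most_similar(matrix, labels, "Chinatown", k=2))
# [('Inner Richmond', 0.82), ('Outer Sunset', 0.78)]

# The row with the lowest average similarity is the "mostly blue" line.
lowest = min(range(len(labels)), key=lambda i: mean_similarity(matrix, i))
print(labels[lowest])                                 # Seacliff
```

Because the helpers only index with `matrix[i][j]`, they should work the same way on a nested list or a 2-D NumPy array.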
ColorMaps:
- Styles: `Sequential`, `Qualitative`, etc.
- You can also play around with color maps to give the matrix a different color scheme.
- Choose from the list of color maps [here](http://matplotlib.org/examples/color/colormaps_reference.html).
- The change in code is minimal:
````python
from matplotlib import cm

# Note: cm.get_cmap was removed in Matplotlib 3.9; there, use plt.get_cmap('Greens') instead.
cmap = cm.get_cmap('Greens')
# cmap = cm.get_cmap('YlGnBu')
# cmap = cm.get_cmap('RdYlGn')
cax = ax.matshow(hood_cosine_matrix, interpolation='nearest', cmap=cmap)
````
- Code:
- `cm`: Import the color map module.
- `get_cmap`: Pick from a pre-defined list of color map schemes.
- `cmap`: Pass the color map variable as the `cmap` argument to the `matshow` function.
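Since `cm.get_cmap` is not available in recent Matplotlib releases (it was removed in 3.9), a version-independent alternative is to pass the color map name to `matshow` as a string. The sketch below compares the three schemes side by side. The 3x3 `toy_matrix` is a made-up stand-in for the real similarity matrix.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical 3x3 similarity matrix, only for illustration.
toy_matrix = np.array([
    [1.0, 0.6, 0.2],
    [0.6, 1.0, 0.4],
    [0.2, 0.4, 1.0],
])

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, name in zip(axes, ["Greens", "YlGnBu", "RdYlGn"]):
    # A color map name given as a string is looked up by Matplotlib itself,
    # so no separate `cm` import is needed.
    im = ax.matshow(toy_matrix, interpolation="nearest",
                    cmap=name, vmin=0.0, vmax=1.0)
    ax.set_title(name)
    fig.colorbar(im, ax=ax)
```

Fixing `vmin` and `vmax` keeps the color scale identical across the three panels, which makes the schemes easier to compare.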
Related
- There are other ways to plot a similarity matrix. Here are a few more examples:
- Diagonal correlation matrix: [Link](http://stanford.edu/~mwaskom/software/seaborn/examples/many_pairwise_correlations.html)
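Because a similarity matrix is symmetric, the upper triangle repeats the lower one. A similar "triangle only" look can be sketched with Matplotlib and NumPy alone by masking the upper triangle. This is an approximation in the spirit of the linked example, not a reproduction of it, and `toy_sim` and `toy_labels` are made-up illustration data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical symmetric similarity matrix, only for illustration.
toy_labels = ["A", "B", "C", "D"]
toy_sim = np.array([
    [1.0, 0.8, 0.3, 0.1],
    [0.8, 1.0, 0.5, 0.2],
    [0.3, 0.5, 1.0, 0.6],
    [0.1, 0.2, 0.6, 1.0],
])

# True above the diagonal (k=1 keeps the diagonal itself visible).
upper = np.triu(np.ones_like(toy_sim, dtype=bool), k=1)
# Masked cells are skipped when the image is drawn.
lower_only = np.ma.masked_where(upper, toy_sim)

fig, ax = plt.subplots(figsize=(5, 5))
im = ax.matshow(lower_only, interpolation="nearest",
                cmap="Greens", vmin=0.0, vmax=1.0)
ax.set_xticks(range(len(toy_labels)))
ax.set_xticklabels(toy_labels, rotation=90)
ax.set_yticks(range(len(toy_labels)))
ax.set_yticklabels(toy_labels)
fig.colorbar(im)
```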