Characters Network
General Facts
First, let's take a look at the characters of Rick & Morty and the information we collected about each of them:
Character Frame
Name: Evil Morty
Species: Human
Status: Alive
Last Known Location: Citadel of Ricks
We did so for 313 characters using both the Rick & Morty API and Rickipedia, a fandom wiki dedicated to exploring the amazing world of Rick & Morty. Overall, the character statistics look like this:
313
147
115
51
42
30
11
From those numbers, we can already see some interesting points. For
instance, we decided to address the fact that 36 % of them
are dead. So we are going to look deeper at the characters' status in
the Network and try to find who is responsible for all those deaths!
We also noticed that 13 % of the characters are Ricks and
that 10 % are Mortys, which led us to take a deeper look at
character diversity and species.
The Network
About the Network
From the information on the 313 characters we just talked about, we
created a "Characters Network".
- We used both Rickipedia and the Rick & Morty API to come up with
the numbers above. We took the well-formatted information about
characters from the API and cross-referenced it with the contents of
the Rickipedia pages. To create the network, we parsed each
character's wiki page to find hyperlinks to other characters. When a
link was found, we created an edge (a link/connection) from the
given character to the other character.
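The link-parsing step described above can be sketched as follows. This is a minimal illustration and not the code used for the project: the `build_edges` helper, the `[[Target|label]]` link syntax and the three sample pages are all assumptions made for the example.

```python
import re

def build_edges(pages, link_pattern=r"\[\[([^\]|#]+)"):
    """Return directed (source, target) edges between known characters.

    pages: dict mapping a character name to the raw text of its wiki page.
    link_pattern: assumed wiki-link syntax, [[Target]] or [[Target|label]].
    """
    names = set(pages)
    edges = set()
    for source, text in pages.items():
        for match in re.findall(link_pattern, text):
            target = match.strip()
            # Keep only links that point to another character in our list.
            if target in names and target != source:
                edges.add((source, target))
    return sorted(edges)

# Made-up page snippets, only to show the mechanics.
pages = {
    "Rick Sanchez": "Grandfather of [[Morty Smith|Morty]], friend of [[Birdperson]].",
    "Morty Smith": "Grandson of [[Rick Sanchez]]. Lives on [[Earth]].",
    "Birdperson": "Old friend of [[Rick Sanchez]].",
}
edges = build_edges(pages)
for source, target in edges:
    print(source, "->", target)
```

Links to pages that are not characters (here `Earth`) are dropped, so only character-to-character edges remain.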
General View
Below is a simple visualisation of the network that we generated. If you haven't watched the show, you should know that it contains a lot of duplicate / alternative versions of the two main characters, Rick and Morty.
Above you can see the complete character network. We highlighted
the Ricks in
blue to match Rick's pointy, mad
scientist hair and the Mortys in
yellow to match the bright yellow
t-shirt that he wears throughout the show. We also scaled the node
size depending on their degree (how many links/connections they
have), which is why Rick & Morty have such big nodes. The three
black nodes that are also fairly big represent the remaining
members of the Smith family - the father Jerry, the mother Beth
and Morty's older sister Summer.
Now - Let's dive a little bit deeper into the analysis of our
network, by looking at how the characters are connected within the
network.
Degree Analysis
In the figure below, we plotted the degree distribution, i.e. the number of nodes with a given number of links. On the horizontal x-axis is the degree (number of connections from a given node to other nodes), and on the vertical y-axis is the number of nodes with that degree. This plot provides insight into how the connections between characters are distributed in the network, i.e. how many characters have few connections to other characters and how many have a lot of them.
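As a rough sketch of how such a distribution can be computed, the snippet below counts degrees from an edge list, treating edges as undirected. The `degree_distribution` helper and the toy edges are assumptions for illustration, not the project's data.

```python
from collections import Counter

def degree_distribution(edges):
    """Return (degree per node, number of nodes per degree) for an undirected edge list."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    degrees = {node: len(nbrs) for node, nbrs in neighbors.items()}
    return degrees, Counter(degrees.values())

# Toy edge list (illustrative only).
toy_edges = [
    ("Rick Sanchez", "Morty Smith"),
    ("Rick Sanchez", "Summer Smith"),
    ("Rick Sanchez", "Birdperson"),
    ("Morty Smith", "Summer Smith"),
]
degrees, distribution = degree_distribution(toy_edges)
for degree in sorted(distribution):
    print(f"degree {degree}: {distribution[degree]} node(s)")
```

Note that characters without any link never appear in the edge list, so they would have to be added separately with degree 0.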
When we think about it, the plot above shows nothing very surprising. It just confirms something that was expected: Rick & Morty, as well as their family members, have a lot of connections compared to the remaining characters. What is more interesting is that lots of other characters only know a few others, which makes sense considering the structure of the series. As each episode focuses on an entirely different random adventure, it's logical that only the main characters living those adventures are connected to many other characters, whereas most of the remaining characters will never get to meet each other.
Centrality
Centrality measures assign an importance score to each node based on its position in the network; the simplest one, degree centrality, uses only the number of connections. So by taking a look at centrality, we can get a glimpse of which characters are the most popular, or which character you should know if you want to get to know another one. So let's look at which of our characters are the most central.
Betweenness Centrality
Betweenness centrality measures how often a node lies on the shortest paths between other pairs of nodes. A character with high betweenness is therefore the one to know if you want to get from A to B in the network as fast as possible. For example, if you want to get from Birdperson to Abradolf Lincler, what is the shortest path through the other characters to get to him? The betweenness centrality measure takes the shortest paths between all pairs of characters into consideration, and the characters can then be ranked by their score.
As expected, Rick and Morty are the most central, followed by the remaining members of the Smith family. Jerry is probably third because he joins Rick and Morty on more adventures than the other two family members. So if I'm a new character in Rick & Morty and I want to meet another one, the optimal way is to ask Rick first, then Morty, then Jerry (you might not want to do that, or the conversation might get very boring), etc.
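For readers who want to see what the betweenness score counts, here is a minimal standard-library sketch of Brandes' algorithm for an unweighted, undirected graph. It is an illustration under our own assumptions (the `betweenness` helper and the toy graph are made up), not necessarily how the ranking above was produced.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness for an undirected, unweighted graph (Brandes' algorithm)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # nodes in order of non-decreasing distance from s
        preds = {v: [] for v in adj}     # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):        # accumulate dependencies, farthest nodes first
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: value / 2 for v, value in score.items()}

# Toy graph (illustrative only): every route to Summer goes through Morty.
toy_graph = {
    "Rick Sanchez": ["Birdperson", "Abradolf Lincler", "Morty Smith"],
    "Morty Smith": ["Rick Sanchez", "Summer Smith"],
    "Summer Smith": ["Morty Smith"],
    "Birdperson": ["Rick Sanchez"],
    "Abradolf Lincler": ["Rick Sanchez"],
}
bc = betweenness(toy_graph)
print(sorted(bc.items(), key=lambda item: -item[1]))
```

In this toy graph Rick sits on the shortest path of five character pairs and Morty on three, so Rick ranks first.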
Eigenvector Centrality
Eigenvector centrality measures how well nodes (characters) are connected to other well-connected nodes: a link to an important character counts for more than a link to a peripheral one, and in that sense it is also a ranking. Characters with high eigenvector centrality are probably the popular ones, i.e. they appear often alongside other high-degree characters.
Again, Morty and Rick appear with the highest centrality, whereas Jerry is in fifth place. Perhaps, as explained above for betweenness, this is because he is on many adventures in different locations, where a lot of characters with low degrees appear. So that makes Jerry popular...
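The idea behind eigenvector centrality can be sketched with a simple power iteration: every node repeatedly collects the scores of its neighbors until the values settle. The helper below is a hedged illustration on a made-up graph; adding each node's own score in the update is our choice to keep the iteration from oscillating, and it does not change the resulting ranking.

```python
def eigenvector_centrality(adj, max_iter=200, tol=1e-10):
    """Power iteration on an undirected graph given as {node: [neighbors]}."""
    score = {node: 1.0 for node in adj}
    for _ in range(max_iter):
        # Each node keeps its own score and adds the scores of its neighbors.
        new = {node: score[node] + sum(score[nbr] for nbr in adj[node]) for node in adj}
        norm = sum(value * value for value in new.values()) ** 0.5
        new = {node: value / norm for node, value in new.items()}
        converged = max(abs(new[node] - score[node]) for node in adj) < tol
        score = new
        if converged:
            break
    return score

# Toy graph (illustrative only).
toy_graph = {
    "Rick Sanchez": ["Morty Smith", "Summer Smith", "Birdperson"],
    "Morty Smith": ["Rick Sanchez", "Summer Smith"],
    "Summer Smith": ["Rick Sanchez", "Morty Smith"],
    "Birdperson": ["Rick Sanchez"],
}
scores = eigenvector_centrality(toy_graph)
for name in sorted(scores, key=scores.get, reverse=True):
    print(f"{name}: {scores[name]:.3f}")
```

Birdperson has only one link, but because that link goes to the best-connected node his score is not negligible; that is the difference from simply counting connections.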
Everything as expected
Well, so far we haven't found anything surprising. But is that a
bad sign? No, it's actually a good sign, because it shows that the
data we collected is consistent with the show, which is a very
good thing so far!
Rick wins the betweenness centrality
contest and Morty has the highest eigenvector centrality. These
results match our expectations very well, and could most likely be
explained by the fact that Rick knows almost each and every
character across all locations and episodes, therefore he has the
highest betweenness. Morty on the other hand, is a central member
of both the adventures and the Smith family, which makes him more
connected to other nodes with high degree than Rick.
Status View
Another fact about Rick & Morty is that it involves LOTS and LOTS of death. Luckily, our data sources included this information about the characters, so we decided to look into it to find the "most dangerous character" of the show. We started by plotting the following network, depicting alive characters as green nodes, dead characters as red nodes, and nodes with "unknown" status as black:
Hmm, so out of 313 characters, 114 are dead... Can we find who is responsible for all those dead bodies, and who the most dangerous characters of the show could be?
Most Dangerous character
We ran a little analysis about this on our Network. We looked at every node's neighbors (the nodes that are just 1 link/connection away) and analyzed their status (alive or dead) to get a sense of who has been in contact with the largest number of dead characters.
And again, it's no big surprise that Rick & Morty are the most dangerous characters. But the otherwise innocent father Jerry Smith takes third place in our ranking of the most dangerous characters in the whole series, beating his own wife & daughter. Since these characters have a lot of edges (links/connections) and appear more often, their chance of being next to dead characters increases.
But when we look beyond the regular Smith family, we stumble upon the infamous Evil Morty, who probably has more kills than we know of at this point in the series. Not surprising again, but Beth & Summer had better work on their danger game during season 4, otherwise Evil Morty might pass them!
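The neighbor check behind this ranking can be sketched in a few lines. The `dead_neighbor_ranking` helper, the toy graph and the statuses below are invented for illustration and are not the collected data.

```python
def dead_neighbor_ranking(adj, status):
    """Rank nodes by how many of their direct neighbors have status 'Dead'."""
    counts = {
        node: sum(1 for nbr in nbrs if status.get(nbr) == "Dead")
        for node, nbrs in adj.items()
    }
    # Highest count first; ties broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Toy graph and toy statuses (illustrative only).
toy_graph = {
    "Rick Sanchez": ["Morty Smith", "Character A", "Character B"],
    "Morty Smith": ["Rick Sanchez", "Character B"],
    "Character A": ["Rick Sanchez"],
    "Character B": ["Rick Sanchez", "Morty Smith"],
}
toy_status = {
    "Rick Sanchez": "Alive",
    "Morty Smith": "Alive",
    "Character A": "Dead",
    "Character B": "Dead",
}
ranking = dead_neighbor_ranking(toy_graph, toy_status)
for name, dead_count in ranking:
    print(f"{name}: {dead_count} dead neighbor(s)")
```

Because the raw count grows with the number of neighbors, dividing by each node's degree would be one way to compare characters independently of how connected they are.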
Species View
Rick & Morty is a very imaginative, exploratory and diverse show in terms of characters. The variety of species is especially interesting, so we decided to also plot a network with information about the species of each node.
As shown by the figure, there are actually 10 known species at the moment. We wonder how many more will be added in season 4 during the new adventures of Rick & Morty in distant dimensions. While humans are still the most represented, we really enjoy the presence of all the others!
Communities
This section analyzes the communities within the characters
network by comparing four different algorithms for community
detection. Plots are used to show the structure of communities and
which of the main characters are central to a specific community.
For instance, if Jerry Smith is the central character in a
given community, all nodes within that community will be colored
dark green to match Jerry's awesome
shirt. Likewise for Rick, where nodes are colored
turquoise to match his pointy hair.
If Rick and Morty are both in the same community, nodes will be
colored in a flashy green to recall
the portal gun.
For each plot, node sizes are dependent on the size of the
communities, such that bigger communities have bigger nodes.
But first, a little note on what modularity means, since we use this number to determine how good a given algorithm is at finding communities in the network. Modularity: a number of at most 1 (it can be negative for very poor divisions) that measures how good a division of a network into communities is. Higher is better, and a value around 0 means the division is no better than a random one.
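To make the definition concrete, here is a small sketch that computes modularity for an undirected edge list and a given division, using the standard formula: for each community, the fraction of edges inside it minus the fraction expected by chance from its total degree. The `modularity` helper and the two-triangle example are assumptions for illustration.

```python
def modularity(edges, communities):
    """Modularity of a partition of an undirected graph.

    edges: list of (a, b) pairs, each undirected edge listed once.
    communities: list of sets of nodes covering every node exactly once.
    """
    m = len(edges)
    community_of = {node: i for i, group in enumerate(communities) for node in group}
    internal = [0] * len(communities)     # edges with both ends inside the community
    degree_sum = [0] * len(communities)   # total degree of the community's nodes
    for a, b in edges:
        degree_sum[community_of[a]] += 1
        degree_sum[community_of[b]] += 1
        if community_of[a] == community_of[b]:
            internal[community_of[a]] += 1
    return sum(
        internal[i] / m - (degree_sum[i] / (2 * m)) ** 2
        for i in range(len(communities))
    )

# Two triangles joined by a single edge (toy example).
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"),
             ("c", "d")]
q_split = modularity(toy_edges, [{"a", "b", "c"}, {"d", "e", "f"}])
q_single = modularity(toy_edges, [{"a", "b", "c", "d", "e", "f"}])
print(round(q_split, 3), round(q_single, 3))
```

Splitting the toy graph into its two triangles scores clearly higher than lumping everything into one community, which scores 0.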
Greedy Communities
The greedy modularity maximization divides characters into communities by trying to optimize the modularity value.
Modularity : 0.48
As shown by the graph, this results in a few big communities followed by progressively smaller ones. The big communities are due to the high degrees (many links/connections) of Rick and Morty. As its name suggests, the algorithm greedily merges communities whenever doing so increases modularity the most, so it is no surprise that it reaches the highest modularity of the four algorithms.
Fluid Communities
Modularity : 0.41
The fluid communities algorithm requires defining the number of communities in advance, since its first step is to place that many communities on randomly chosen nodes of the network. The algorithm then runs through the nodes in random order and assigns each one to a community based on the communities its neighbors belong to. This randomization results in more distinct communities than some of the other algorithms. It is worth noting that changing the predefined number of communities has a large impact on the modularity. The number 15 was chosen, since two of the other algorithms divided the characters into exactly 15 distinct communities automatically.
Label Propagation Communities
Modularity : 0.24
The label propagation algorithm performs worst of all the algorithms, since it has the lowest modularity value. All members of the Smith family are clustered together into one large community that dominates the graph, also visually. This large community exists because of the labelling approach used by the algorithm, where each node repeatedly adopts the label that most of its neighbors carry. This approach favors communities built around high-degree (many links/connections) nodes, thus generating this dominant community containing all members of the family.
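The labelling approach can be sketched as follows. This is a simplified, standard-library version written for illustration (the `label_propagation` helper, its tie-breaking rule and the toy graph are our assumptions); library implementations differ in details such as update order.

```python
import random
from collections import Counter

def label_propagation(adj, seed=0, max_sweeps=100):
    """Asynchronous label propagation on an undirected graph {node: [neighbors]}."""
    rng = random.Random(seed)
    label = {node: node for node in adj}   # every node starts with its own label
    nodes = list(adj)
    for _ in range(max_sweeps):
        rng.shuffle(nodes)
        changed = False
        for node in nodes:
            if not adj[node]:
                continue
            counts = Counter(label[nbr] for nbr in adj[node])
            top = max(counts.values())
            best = sorted(lab for lab, count in counts.items() if count == top)
            # Keep the current label on ties, otherwise pick one of the most common.
            if label[node] not in best:
                label[node] = rng.choice(best)
                changed = True
        if not changed:
            break
    groups = {}
    for node, lab in label.items():
        groups.setdefault(lab, set()).add(node)
    return list(groups.values())

# Two separate triangles (toy example).
toy_graph = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"],
    "d": ["e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
communities = label_propagation(toy_graph)
print(sorted(sorted(group) for group in communities))
```

On a graph with a few very well-connected nodes, their labels reach many neighbors early on, which is consistent with the single dominant community observed here.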
Louvain Communities
Modularity : 0.45
What is interesting about the Louvain algorithm is that the approach it uses to divide communities is similar to greedy modularity maximization. The major difference is the initial grouping of nodes locally. Perhaps this local grouping is the reason why the number of communities is lower than with the greedy modularity maximization algorithm.
Community Wrap-up
The four different community algorithms divide the characters in
slightly different ways. For example, Rick and Morty are not
always grouped into the same community. But perhaps more
interestingly, Summer Smith is the only member of the Smith family
to reside in her own community for most of the algorithms. This
could indicate that she is detached from Rick & Morty or detached
from the Smith family. Not necessarily jumping to conclusions
here, but her not always nice attitude towards Rick & Morty could
perhaps explain some of this detachment.
Even though the algorithms provide different results, there are
also a number of interesting similarities. For instance, the
characters Terry and Purple Morty are central within a community
for all four algorithms, making them characters worth noting,
since they apparently have some central position in the show. All
algorithms divide the characters into 15 or fewer communities. This
is quite interesting, since the show contains far more
distinct locations, and we expected a larger number of communities
that would be closer to the 100+ locations.
Conclusion
It was a fun ride, wasn't it? Even though we have not been that
surprised by the results. Why, you ask? We think it's because
of the show's structure, which doesn't lend itself to a lot of surprises,
since episodes only have a few characters in common and they often
introduce lots of new characters. Main findings include:
Rick & Morty being central to the show and absolutely deadly - leading by far
in the list of most dangerous characters. And don't forget the
deadliness of the show - a staggering
36% of characters are dead. And perhaps we should pay more
attention to Jerry now that we know he can be quite deadly.
Summer seems to do her own thing, as we saw in the community analysis,
where she was part of her own community in most cases.
Finally, we got a sense of the variety of species in the show;
10 different known species exist, including Alien, Human,
Parasite and the wonderful Poopybutthole species.
That wraps it up for the network analysis and if you've enjoyed the
ride so far, don't hesitate to head over to the
Natural Language Processing page, for much
more schwiftiness!