Spurred by our DH reading group at Northeastern, as well as my general tendency to jump into things before really knowing what I’m doing, I decided a few weeks ago to download Gephi and see what sort of rudimentary networks I could create.
I’d been cataloging the service record of each of my Preble’s Boys officers, setting up the chart so that I could see concurrent service. I started out just looking to see whether any of the Boys had actually served on the same ship as Edward Preble, but as I created the chart (the link here is to a more fleshed-out chart with more comprehensive data), some other patterns began to emerge.
So I thought, let’s plug this into Gephi and see what happens! Fumbling through the Gephi readme, I set up a very basic network in which the officers were the nodes and the ships were the edges (that is, two officers are linked whenever they served on the same ship at the same time).
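For anyone who wants to try the same thing, here is a minimal sketch of how a service chart could be turned into an edge list. It is not the process I actually followed. The officer names, ship names, and years are placeholders rather than real records, and `concurrent_edges` and `to_gephi_csv` are hypothetical helper names. The only Gephi-specific assumption is that its spreadsheet import reads `Source`, `Target`, `Type`, and `Label` columns.

```python
import csv
import io
from itertools import combinations

# Placeholder records: (officer, ship, first_year, last_year).
# These are NOT real service data; they only illustrate the shape of the chart.
SERVICE = [
    ("Officer A", "Ship X", 1800, 1802),
    ("Officer B", "Ship X", 1801, 1803),
    ("Officer C", "Ship X", 1804, 1805),
    ("Officer C", "Ship Y", 1801, 1802),
    ("Officer A", "Ship Y", 1802, 1803),
]


def concurrent_edges(records):
    """Return sorted (officer, officer, ship) edges for overlapping service on one ship."""
    edges = set()
    for (o1, s1, a1, b1), (o2, s2, a2, b2) in combinations(records, 2):
        same_ship = s1 == s2
        overlap = a1 <= b2 and a2 <= b1  # the two year ranges intersect
        if same_ship and overlap and o1 != o2:
            source, target = sorted((o1, o2))  # undirected, so fix the order
            edges.add((source, target, s1))
    return sorted(edges)


def to_gephi_csv(edges):
    """Render the edges as CSV text with one row per officer pair and ship."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Source", "Target", "Type", "Label"])
    for source, target, ship in edges:
        writer.writerow([source, target, "Undirected", ship])
    return buf.getvalue()


EDGES = concurrent_edges(SERVICE)
print(to_gephi_csv(EDGES))
```

Note that Officer C never links to the others through Ship X, because the years do not overlap. That is exactly the kind of distinction the chart was built to capture.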
I knew what was coming before I rendered the graph as a network visualization, but I was still a little surprised when I saw it: a network that I knew, from all my research so far, to be completely false.
[gview file=”http://abbymullen.org/smallnetwork.pdf”]
(I apologize for the crazy way the graph sort of goes off the page. I tried every setting I could find to get it not to do that. Some mysteries of Gephi remain hidden to me.)
My initial reaction was to scrap the whole thing and start my thinking about networks all over. But on further examination, I realized that this graph still had something to teach me.
First, I learned the importance of good data. This graph shows Stephen Decatur as having only two links to anyone, which is false. It also makes Edward Preble look like an almost tangential figure, which is equally false. The person with the most links is David Porter, who is an important figure, but not that important. So why does the graph look like this?
Simply put, this is a bad data set. It starts to get at my question (How do these people link together?) but draws on only a very small subset of their interactions with each other. I don’t even have complete service records for some of these men, so it’s possible that there are connections missing from my chart. In addition, these men interacted on several levels beyond concurrent service on a single ship (concurrent service in the same squadron, shoreside interaction, correspondence, indirect influences…the list goes on). So the data set is quite incomplete.
What this bad data set teaches me is that the meaningful network of these men is going to be quite complex. It’s likely to need to be organized on several different interaction levels, as well as interactions over time and even perhaps spatially (do men feel others’ influence more when they’re at sea than when they are landbound? I don’t know).
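To make the idea of several interaction levels a bit more concrete, here is a rough sketch of one way the data might be organized: each link carries a layer (the kind of interaction) and a year, so the same people can be counted differently depending on which layer and which cutoff date you choose. This is only one possible design, not something I have built. The layer names, officers, and years are all placeholders, and `degree` is a hypothetical helper.

```python
from collections import namedtuple

# One link between two officers, tagged with the kind of interaction and a year.
Edge = namedtuple("Edge", "source target layer year")

# Placeholder links, NOT real data.
EDGES_BY_LAYER = [
    Edge("Officer A", "Officer B", "same_ship", 1801),
    Edge("Officer A", "Officer C", "same_squadron", 1803),
    Edge("Officer B", "Officer C", "correspondence", 1805),
    Edge("Officer A", "Officer C", "correspondence", 1806),
]


def degree(edges, layer=None, until=None):
    """Count distinct neighbors per officer.

    If `layer` is given, only links of that kind are counted.
    If `until` is given, only links dated in or before that year are counted.
    """
    neighbors = {}
    for edge in edges:
        if layer is not None and edge.layer != layer:
            continue
        if until is not None and edge.year > until:
            continue
        neighbors.setdefault(edge.source, set()).add(edge.target)
        neighbors.setdefault(edge.target, set()).add(edge.source)
    return {name: len(others) for name, others in neighbors.items()}


print(degree(EDGES_BY_LAYER))                      # every layer, every year
print(degree(EDGES_BY_LAYER, layer="same_ship"))   # shipboard service only
print(degree(EDGES_BY_LAYER, layer="correspondence", until=1805))
```

With only the shipboard layer, Officer C disappears from the picture entirely, which is roughly what seems to have happened to some of the men in my graph.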
Second, I saw new connections, forged through unintended groupings. Since this is a bad graph, it’s tempting to say that all the links it made between people are bogus. However, I realized that there is at least one interesting phenomenon going on that I hadn’t thought of before, but that perhaps is borne out by the documentary evidence.
This phenomenon, which may actually be a real breakthrough in my analysis, is the appearance of two groups. If you draw a connection between Stephen Decatur and Edward Preble (in your mind), then you see the loose formation of a group around them. The graph already shows a clique: the group with David Porter and William Bainbridge. What’s the connection between these two groups?
Interestingly, the two groups roughly fall into (1) those who were aboard the USS Philadelphia when it grounded in Tripoli Harbor, and (2) those who volunteered for the mission led by Stephen Decatur to destroy the Philadelphia. There are some outliers, officers who were not involved in that series of events in any way (Lewis Warrington, for instance), and one interesting anomaly, Charles Stewart, who was not aboard the Philadelphia but is nonetheless well ensconced in that group of officers. It will be interesting to see what happens to those men once there’s more data.
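Groupings like these could also be checked without relying on the eye alone. The sketch below finds connected components, meaning sets of people who can all reach one another through some chain of links. This is a much cruder notion of a group than the ones I describe above, and it is not how Gephi lays out a graph. The links are placeholders, and `components` is a hypothetical helper.

```python
def components(edges):
    """Group nodes into connected components using a simple graph traversal."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)  # visit unvisited neighbors
        seen |= group
        groups.append(sorted(group))
    return groups


# Placeholder links, NOT real data.
LINKS = [
    ("Officer A", "Officer B"),
    ("Officer B", "Officer C"),
    ("Officer D", "Officer E"),
]

GROUPS = components(LINKS)
# Adding a single hypothetical link between the two groups merges them.
MERGED = components(LINKS + [("Officer C", "Officer D")])

print(GROUPS)
print(MERGED)
```

The second call shows how much a single link matters: one missing or added connection can change which groups appear at all, which is another reason to be careful with an incomplete data set.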
Without having done any other research yet into this grouping, I have an inkling that this way of looking at Preble’s Boys may show more about their careers after 1803 than their link to Edward Preble.
So what’s the major lesson for me? When I next take on Gephi, I’ll be armed with a lot more data, but even if the results are surprising, I’ll be keeping my eyes open for possibilities that I didn’t see coming down the pike.
I’d welcome any other insights on my first foray into network analysis.