Lessons from a Google Fusion Table Graph

Armed with new and improved service record data, last night I set out to create a new network graph in Gephi, to see whether better data alone would help to mitigate some of last time’s problems.

To be frank, Gephi beat me. My graph is so small, and my screen is so small, and the zoom function in the graph window is so bad (at least, I couldn’t figure it out) that I couldn’t really see my graph in order to draw any conclusions. All my data imported correctly, though, so I knew there was hope.

I turned instead to Google Fusion Tables, an experimental data visualization app from Google. Unfortunately, it appears that the data tables work completely differently from Gephi’s, so I did have to do some reformatting. (This isn’t a huge problem for a graph with only 34 edges, but it would be a pain for something bigger.)

Google Fusion Tables

For this small network graph, Google Fusion Tables worked out very well. The graph itself is clear and easily readable, and it’s a relatively simple proposition to remove nodes and see what happens. Fewer options for manipulating the data mean that the graph renders quickly. It’s also nice to be able to hover over a node and see the attached edges highlighted.

Google Fusion Tables does do a few annoying things, which it may be possible to disable. It would be nice to be able to pin a node to a specific location for ease of reading (as is, you can pull a node to a general area, but it won’t stay exactly where you put it). Also, it would be nice if the graph would hold still! Some element of the graph always seems to be moving.

More options for edges would be helpful too. I’d like to be able to see the edge labels that are in my chart, and I’d also like to be able to click on an edge for more data, just as hovering over a node highlights its edges.

Observations about the Graph

I’ve been reading up on social network graphs in this book. But the network I’ve created is not actually a social network: there are not any connections that can accurately be predicted by network theory. Why? Because this set of data is about concurrent service, which is not something that the men themselves controlled (for the most part).

So how then is this helpful?

First, even though new connections cannot be posited, the graph does show the high degree of connectedness between some of the officers. Thomas Macdonough, for instance, has multiple connections to many people. The same is true for David Porter. Macdonough and Porter have some shared connections, but they also have some unique connections. Hopefully, seeing the connected officers through the eyes of Porter and Macdonough may yield information about them.
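
The shared and unique connections described above can be read straight off an edge list. What follows is only a minimal sketch: the officer labels and edges are hypothetical placeholders, not rows from the actual service chart, and the helper name is made up for illustration.

```python
from collections import defaultdict

# Hypothetical edge list; the single-letter labels stand in for officers
# and are NOT taken from the real service record chart.
EDGES = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("E", "C"), ("E", "D"), ("E", "F"),
]

def neighbours(edges):
    """Map each officer to the set of officers they are linked to."""
    adjacency = defaultdict(set)
    for source, target in edges:
        adjacency[source].add(target)
        adjacency[target].add(source)
    return adjacency

ADJACENCY = neighbours(EDGES)

# Degree: how many distinct officers each one is connected to.
DEGREE = {officer: len(linked) for officer, linked in ADJACENCY.items()}

# Compare two well-connected officers.
SHARED = ADJACENCY["A"] & ADJACENCY["E"]   # linked to both
ONLY_A = ADJACENCY["A"] - ADJACENCY["E"]   # linked to A but not E
ONLY_E = ADJACENCY["E"] - ADJACENCY["A"]   # linked to E but not A

print("shared:", sorted(SHARED))
print("only A:", sorted(ONLY_A))
print("only E:", sorted(ONLY_E))
```

Swapping in two real officers for "A" and "E" would give the shared and unique connections for that pair.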

Second, once I do get the social networks mapped out, it will be interesting to compare the two graphs, seeing whether the connections established shipboard continue into correspondence. It seems unlikely that officers would write directly to each other without previous personal connections, and concurrent service seems the most likely place for that connection to have occurred.

The comparison will be a little tricky, because it will involve networks that evolve over time. In order to create these networks, I’ll probably have to move back over to Gephi. Maybe by then I’ll have figured it out a little more.
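
One simple way to think about a network that evolves over time is as a series of snapshots, one per year. The sketch below assumes each connection carries a first and last year; the labels and dates are hypothetical placeholders, and this is not a description of how Gephi itself handles time.

```python
# Hypothetical dated edges: (officer_1, officer_2, first_year, last_year).
# Labels and years are placeholders, not real service data.
DATED_EDGES = [
    ("Officer A", "Officer B", 1800, 1802),
    ("Officer B", "Officer C", 1802, 1804),
    ("Officer A", "Officer C", 1805, 1806),
]

def snapshot(dated_edges, year):
    """Return the connections that were active during the given year."""
    return [
        (first, second)
        for first, second, start, end in dated_edges
        if start <= year <= end
    ]

# One small network per year of interest.
SNAPSHOTS = {year: snapshot(DATED_EDGES, year) for year in (1801, 1802, 1805)}

for year, edges in SNAPSHOTS.items():
    print(year, edges)
```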

Mapping only the connections of the Preble’s Boys to each other gives an incomplete picture of their records. So the next step will be to add in connections to other officers, especially highly prominent officers such as Thomas Truxtun and John Rodgers. After that, I will map out squadron service to establish more of these formal connections.

The question I’ve been asked before about this sort of network is: how is this more helpful than a spreadsheet? The value here is how easy it is to remove nodes and see the resulting changes in the network. You can’t really do that with a spreadsheet. As the networks get more complex, and they have to change over time, visualization is going to be much more helpful than a spreadsheet.
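
To make the "remove a node and see what changes" idea concrete, here is a rough sketch that removes one node and lists the groups left behind. The edges and labels are hypothetical, and the function names are invented for this example; a visualization tool does the same thing interactively.

```python
# Hypothetical edge list; labels are placeholders, not real officers.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "E")]

def build(edges):
    """Build an adjacency mapping of node -> set of linked nodes."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)
    return adjacency

def remove_node(adjacency, node):
    """Return a copy of the network without the node or its edges."""
    return {
        other: linked - {node}
        for other, linked in adjacency.items()
        if other != node
    }

def components(adjacency):
    """Return the connected groups as sorted lists of node labels."""
    seen = set()
    groups = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups

NETWORK = build(EDGES)
BEFORE = components(NETWORK)
AFTER = components(remove_node(NETWORK, "B"))

print("before:", BEFORE)
print("after removing B:", AFTER)
```

In this toy network everything hangs on "B", so removing it splits one group into three. That is the kind of change that is hard to spot in a spreadsheet.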

As always, I’d welcome any insights on my network thoughts!

The Lessons of a Bad Network Graph

Spurred by our DH reading group at Northeastern, as well as my general tendency to jump into things before really knowing what I’m doing, I decided a few weeks ago to download Gephi and see what sort of rudimentary networks I could create.

I’d been cataloging the service record of each of my Preble’s Boys officers, setting up the chart so that I could see concurrent service. I started out just looking to see whether any of the Boys had actually served on the same ship as Edward Preble, but as I created the chart (the link here is to a more fleshed-out chart with more comprehensive data), some other patterns began to emerge.

So I thought, let’s plug this into Gephi and see what happens! I fumbled through the Gephi readme to set up a very basic network in which the officers were the nodes and the ships were the edges.
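
For anyone curious how a service record chart turns into a network like this, here is a minimal sketch. It assumes each record has an officer, a ship, and a first and last year aboard; the names and years are hypothetical placeholders rather than rows from my chart, and the function is invented for illustration.

```python
from itertools import combinations

# Hypothetical service records: (officer, ship, first_year, last_year).
# Names and years are placeholders, not data from the actual chart.
RECORDS = [
    ("Officer A", "Ship X", 1800, 1802),
    ("Officer B", "Ship X", 1801, 1803),
    ("Officer C", "Ship X", 1804, 1805),
    ("Officer C", "Ship Y", 1801, 1802),
    ("Officer A", "Ship Y", 1802, 1803),
]

def concurrent_service_edges(records):
    """Return (officer_1, officer_2, ship) for each pair who overlapped."""
    edges = []
    for (o1, s1, a1, b1), (o2, s2, a2, b2) in combinations(records, 2):
        same_ship = s1 == s2
        overlap = a1 <= b2 and a2 <= b1  # inclusive year ranges intersect
        if same_ship and o1 != o2 and overlap:
            edges.append((o1, o2, s1))
    return edges

EDGES = concurrent_service_edges(RECORDS)

# Print one edge per row: two officers (the nodes) and the ship (the edge label).
for source, target, ship in EDGES:
    print(f"{source},{target},{ship}")
```

Officers A and C both served on Ship X in this example but never at the same time, so no edge is drawn between them for that ship.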

I knew what was coming before I rendered the graph as a network visualization, but I was still a little surprised when I saw it. What I saw was a network that I knew from all my research heretofore to be completely false.

[Figure: Gephi network graph of concurrent service among the Preble’s Boys officers (PDF)]

(I apologize for the crazy way the graph sort of goes off the page. I tried every setting I could find to get it not to do that. Some mysteries of Gephi remain hidden to me.)

My initial reaction was to scrap the whole thing and start my thinking about networks all over. But on further examination, I realized that this graph still had something to teach me.

First, I learned the importance of good data. This graph shows Stephen Decatur as having only two links to anyone, which is false. It also makes Edward Preble look like an almost tangential figure, which is equally false. The person with the most links is David Porter, who is an important figure but not that important. So why does the graph look like this?

Simply put, this is a bad data set. It starts to get at my question (How do these people link together?) through only a very small subset of their interactions with each other. I don’t even have complete service records for some of these men, so it’s possible that there are connections missing from my chart. In addition, these men had several levels of interaction beyond just concurrent service (squadron concurrent service, shoreside interaction, correspondence, indirect influences…the list goes on). So the data set is quite incomplete.

What this bad data set teaches me is that the meaningful network of these men is going to be quite complex. It will probably need to be organized by several different levels of interaction, as well as over time and perhaps even spatially (do men feel others’ influence more when they’re at sea than when they are landbound? I don’t know).

Second, I saw new connections, forged through unintended groupings. Since this is a bad graph, it’s tempting to say that all the links it made between people are bogus. However, I realized that there is at least one interesting phenomenon going on that I hadn’t thought of before, but that perhaps is borne out by the documentary evidence.

This phenomenon, which may actually be a real breakthrough in my analysis, is the appearance of two groups. If you draw a connection between Stephen Decatur and Edward Preble (in your mind), then you see the loose formation of a group around them. The graph already shows a clique: the group with David Porter and William Bainbridge. What’s the connection between these two groups?

Interestingly, the two groups roughly fall into (1) those who were aboard the USS Philadelphia when it grounded in Tripoli Harbor, and (2) those who volunteered for the mission led by Stephen Decatur to destroy the Philadelphia. There are some outliers, officers who were not involved in that series of events in any way (Lewis Warrington, for instance), and one interesting anomaly, Charles Stewart, who was not aboard the Philadelphia, though he is well ensconced in that group of officers. It will be interesting to see what happens to those men once there’s more data.

Without having done any other research yet into this grouping, I have an inkling that this way of looking at Preble’s Boys may show more about their careers after 1803 than their link to Edward Preble.

So what’s the major lesson for me? When I next take on Gephi, I’ll be armed with a lot more data, but even if the results are surprising, I’ll be keeping my eyes open for possibilities that I didn’t see coming down the pike.

I’d welcome any other insights on my first foray into network analysis.