Lessons from a Google Fusion Table Graph

Armed with new and improved service record data, last night I set out to create a new network graph in Gephi, to see whether new data alone would help to mitigate some of last time’s problems.

To be frank, Gephi beat me. My graph is so small, and my screen is so small, and the zoom function in the graph window is so bad (at least, I couldn’t figure it out) that I couldn’t really see my graph in order to draw any conclusions. All my data imported correctly, though, so I knew there was hope.

I turned instead to Google Fusion Tables, an experimental data visualization app from Google. Unfortunately, it appears that the data tables work completely differently from Gephi’s, so I did have to do some reformatting. (This isn’t a huge problem for a graph with only 34 edges, but it would be a pain for something bigger.)
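For anyone facing the same reformatting on a bigger graph, here is a rough Python sketch of the kind of conversion involved. It assumes a Gephi-style pair of tables (a node table with Id and Label columns, and an edge table whose Source and Target columns hold node Ids) and assumes the goal is a plain two-column table of names. The column names, the function name edges_with_names, and the sample rows are illustrative placeholders, not the actual service record data.

```python
import csv
import io


def edges_with_names(nodes_csv: str, edges_csv: str) -> list[tuple[str, str]]:
    """Replace the node Ids in an edge table with the matching node labels."""
    # Assumed layout: the node table has "Id" and "Label" columns.
    labels = {
        row["Id"]: row["Label"]
        for row in csv.DictReader(io.StringIO(nodes_csv))
    }
    # Assumed layout: the edge table has "Source" and "Target" columns of node Ids.
    return [
        (labels[row["Source"]], labels[row["Target"]])
        for row in csv.DictReader(io.StringIO(edges_csv))
    ]


# Placeholder data, not real service records.
nodes = "Id,Label\n1,Officer A\n2,Officer B\n3,Officer C\n"
edges = "Source,Target\n1,2\n2,3\n"

for left, right in edges_with_names(nodes, edges):
    print(f"{left},{right}")
```

With 34 edges this is hardly worth scripting, but the same few lines would work unchanged on a much larger table.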

Google Fusion Tables

For this small network graph, Google Fusion Tables seems to have worked out very well. The graph itself is clear and easily readable, and it’s a relatively simple proposition to remove nodes and see what happens. Fewer options for manipulating the data mean that the graph renders quickly. It’s also nice to be able to hover over a node and see the attached edges highlighted.

Google Fusion Tables does do a few annoying things, which I may be able to disable. It would be nice if nodes could be moved to a specific location for ease of reading (as is, you can pull a node to a general area, but it won’t stay exactly where you put it). Also, it would be nice if the graph would hold still! Some element of the graph always seems to be moving.

More options for edges would be helpful too. I’d like to be able to see the edge labels that are in my chart, and I’d also like to be able to click on an edge for more data, just as you can see the highlighted edges when you hover over a node.

Observations about the Graph

I’ve been doing some reading up about social network graphs in this book. But this network I’ve created is not actually a social network: there are not any connections that can accurately be predicted by network theory. Why? Because this set of data is about concurrent service, not something that the men themselves control (for the most part).
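To make "concurrent service" concrete, here is a minimal Python sketch of how such edges could be derived from service records. It assumes each record is a tuple of officer, ship, first year aboard, and last year aboard; that layout, the function name concurrent_service_edges, and every name and date in the sample are hypothetical placeholders rather than real records.

```python
from itertools import combinations

# Hypothetical records: (officer, ship, first year aboard, last year aboard).
records = [
    ("Officer A", "Ship X", 1800, 1802),
    ("Officer B", "Ship X", 1801, 1803),
    ("Officer C", "Ship X", 1804, 1805),
    ("Officer C", "Ship Y", 1801, 1802),
]


def concurrent_service_edges(records):
    """Return (officer, officer, ship) edges for overlapping service on one ship."""
    edges = []
    for first, second in combinations(records, 2):
        a, ship_a, start_a, end_a = first
        b, ship_b, start_b, end_b = second
        same_ship = ship_a == ship_b
        # Two spans overlap when each one starts before the other ends.
        overlap = start_a <= end_b and start_b <= end_a
        if a != b and same_ship and overlap:
            edges.append((a, b, ship_a))
    return edges


print(concurrent_service_edges(records))
```

An edge here only says that two men were aboard the same ship at the same time, which is why it reflects assignment rather than choice.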

So how then is this helpful?

First, even though new connections cannot be posited, the graph does show the high degree of connectedness between some of the officers. Thomas Macdonough, for instance, has multiple connections to many people. The same is true for David Porter. Macdonough and Porter have some shared connections, but they also have some unique connections. Seeing the connected officers through the eyes of Porter and Macdonough may yield information about them.
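The shared-versus-unique comparison is easy to check outside the graph as well. The sketch below is one possible way to do it in Python, assuming the network is stored as a simple list of undirected edges; the function names and the placeholder officers are illustrative stand-ins, not the real network.

```python
from collections import defaultdict

# Placeholder edge list; the names are stand-ins, not real officers.
edges = [
    ("Officer A", "Officer C"),
    ("Officer A", "Officer D"),
    ("Officer B", "Officer D"),
    ("Officer B", "Officer E"),
]


def neighbors(edges):
    """Build an undirected adjacency map from a list of edges."""
    adjacency = defaultdict(set)
    for left, right in edges:
        adjacency[left].add(right)
        adjacency[right].add(left)
    return adjacency


def compare(adjacency, first, second):
    """Split the connections of two nodes into shared and unique sets."""
    a = adjacency[first] - {second}
    b = adjacency[second] - {first}
    return {"shared": a & b, "only_first": a - b, "only_second": b - a}


result = compare(neighbors(edges), "Officer A", "Officer B")
print(result)
```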

Second, once I do get the social networks mapped out, it will be interesting to compare the two graphs, seeing whether the connections established shipboard continue into correspondence. It seems unlikely that officers would write directly to each other without previous personal connections, and concurrent service seems the most likely place for that connection to have occurred.

The comparison will be a little tricky, because it will involve networks that evolve over time. In order to create these networks, I’ll probably have to move back over to Gephi. Maybe by then I’ll have figured it out a little more.

Mapping only the connections among Preble’s Boys gives an incomplete picture of their service records. So the next step will be to add in connections to other officers, especially highly prominent officers such as Thomas Truxtun and John Rodgers. After that, I’ll map out squadron service to fill in more of these formal connections.

The question I’ve been asked before about this sort of network is: how is this more helpful than a spreadsheet? The value here is how easy it is to remove nodes and see the resulting changes in the network. You can’t really do that with a spreadsheet. As the networks get more complex and have to change over time, visualization is going to be much more helpful than a spreadsheet.
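The same "remove a node and see what changes" experiment can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the network is a plain list of undirected edges, the helper names remove_node and components are made up for the example, and the three officers are placeholders.

```python
def remove_node(edges, node):
    """Return the edge list with every edge touching `node` dropped."""
    return [(a, b) for a, b in edges if node not in (a, b)]


def components(nodes, edges):
    """Group nodes into connected components with a simple graph search."""
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, groups = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(adjacency[current] - group)
        seen |= group
        groups.append(group)
    return groups


# Placeholder network: Officer B is the only link between A and C.
edges = [("Officer A", "Officer B"), ("Officer B", "Officer C")]
nodes = {"Officer A", "Officer B", "Officer C"}

before = components(nodes, edges)
after = components(nodes - {"Officer B"}, remove_node(edges, "Officer B"))
print(len(before), len(after))
```

Counting the connected groups before and after shows at a glance whether the removed officer was holding the network together, which is the question the graph answers visually.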

As always, I’d welcome any insights on my network thoughts!