When you have data collected that consists of some form of connected elements, then visualization may be a fruitful method for exploring that data. If you are familiar with Python, or programming in general, creating network graphs that can be visualized in Gephi is a rather trivial task. The snippet below shows how to traverse an arbitrary number of connected elements and how to generate a graph that be loaded into Gephi:
The main element in the code is the line graph = generate_graph(r, graph) as it adds the recursion to the algorithm, meaning that the function calls itself. The call also provides the graph currently being constructed, and the node(s) to be added. In this example all nodes already exists as objects appended as a list of “replies” to the node. The node element is an element consisting of a dict object with nested list as the “replies” element. See below for the most simple example of a connected graph as dict/list elements.
The graph object needs to be created initially, and the data needs to be gathered. The data can be appended to the function in the cursive loop, but this would however affect both complexity and speed. I would advocate to first collect the data, then work with it. The code below illustrates how to initially setup create the graph object and later how to write the complete graph to file.
The following example is a graph generated to visualize reply-threads in Twitter data.
Data for the graph above was captured using Phirehose, the graphs were generated using Python and the pygexf library, and the network was visualized using Gephi. Read more about social web mining in ‘Mining the Social web’.