“Precrime” adopted by Chicago PD

Predictive analytics is not only used by financial investors to win big on the stock market; it is now also used to predict who is likely to be involved in future crime. Philip K. Dick introduced the idea of “precrime” in 1956 in the classic sci-fi short story “The Minority Report”, where three precognitive mutants were able to foresee not only future crimes, but also the time, place and culprit. The idea was later made popular by Steven Spielberg’s 2002 film “Minority Report” starring Tom Cruise, and most recently by the Fox TV series premiering in September 2015.


The Chicago PD approach is, however, not based on mutant superpowers, but rather on facts and statistics. Their model is based on the criminal records of known local offenders. It is specific to the crimes committed and does not consider socioeconomic factors, age, race, or gender. Patterns in crime are enough to tell who is most likely to be involved in a violent crime within the next 18 months, either as an offender or as a victim.

“The model can’t necessarily tell if you’re going to be a victim or an offender. It just knows that you’re going to be a party. If you compare that prediction to one for another criminal, your chances are over five hundred times greater than they would be for a general member of the population or another criminal would be.”

The police strategy is then to locate these individuals and initiate a dialogue with them. While the fictional model of precrime would have future felons convicted and jailed before the fact, the Chicago model takes a “guardian” position in which repeat offenders are offered a discussion on where their chosen path is taking them.

I found the article and the approach really interesting, as it is a striking example of how technology, big data, and predictive analytics are changing a well-established and deeply culturally rooted practice at its core. Read the full story at Backchannel.

Apps for locating callers during 112 calls

One of the biggest problems for 112 operators is locating the caller. It is common for people to be unable to give a good description of where they are. Could you, for example, state the address of where you are right now? There are many situations where this can be hard, e.g. when you are visiting a friend, walking in a big city, on the road in your car, hiking in nature, a tourist in town, or when the caller is a child.

The last time I was confronted with the problem I was standing a little over half a kilometer from home, far out in the countryside, with a seriously injured motorcyclist and on roads without names. The result:

“uhh, I’m sort of standing between A and B, right by the road to C, and about 500 meters southwest of D … ok, still hard … I can see a mailbox a bit further away, I’ll have to run over and check whether there is an address on it”.

The problem is so common that several countries have now (finally) introduced apps that inform the emergency operator of the caller’s position. The app has to be installed on the phone, which will initially cause both frustration and a low install base, but over time it will save lives. The advantage of this type of application is that the position is based on GPS data, which makes it very precise, unlike the triangulation between cell towers that SOS Alarm uses today. Another advantage is that installing it is entirely voluntary. If authorities wish, it can also be used to distribute Viktigt Meddelande till Allmänheten (“Important Message to the Public”), or other important information, e.g. how to perform CPR correctly. Unfortunately the application is not available for use in Sweden, but it feels like an inevitable future. Read more about the projects in New Zealand and Finland in the links below.

Generating network graphs with Python and pygexf

When you have collected data that consists of some form of connected elements, visualization may be a fruitful method for exploring it. If you are familiar with Python, or programming in general, creating network graphs that can be visualized in Gephi is a rather trivial task. The snippet below shows how to traverse an arbitrary number of connected elements and generate a graph that can be loaded into Gephi:

def generate_graph(node, graph):

    # We start by adding the current node to the graph
    graph.addNode(id=str(node["id"]), label=str(node["label"]))

    # Loop over each child node (the replies to a tweet)
    for reply in node["replies"]:
        # ... and go one level deeper to add the child and its own replies
        graph = generate_graph(reply, graph)

        # ... then connect the child node that now exists with the current parent node
        graph.addEdge(id=str(reply["id"]), source=str(reply["id"]), target=str(node["id"]))

    # When all child nodes have been appended, or if none exist, return the graph
    return graph

The key element in the code is the line graph = generate_graph(reply, graph), as it adds the recursion to the algorithm, meaning that the function calls itself. The call passes along both the graph currently being constructed and the node to be added. In this example all nodes already exist as objects, appended as a list of “replies” to their parent node. Each node is a dict with a nested list under the “replies” key. See below for the simplest example of a connected graph as dict/list elements.

{"id": 1, "label": "first tweet", "replies": [{"id": 2, "label": "a reply", "replies": []}]}
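To see the recursion at work without pygexf installed, here is a minimal sketch that walks the same kind of nested structure and collects the edges in a plain list instead of a graph object. The function name collect_edges and the sample thread are made up for illustration; only the "id" and "replies" keys are taken from generate_graph above.

```python
def collect_edges(node, edges=None):
    """Walk a nested reply thread and return (reply_id, parent_id) pairs."""
    if edges is None:
        edges = []
    for reply in node["replies"]:
        # Go one level deeper first, as generate_graph does ...
        collect_edges(reply, edges)
        # ... then record the edge from the reply to its parent
        edges.append((str(reply["id"]), str(node["id"])))
    return edges


# Hypothetical thread: tweets 2 and 3 reply to tweet 1, tweet 4 replies to tweet 2
thread = {
    "id": 1,
    "replies": [
        {"id": 2, "replies": [{"id": 4, "replies": []}]},
        {"id": 3, "replies": []},
    ],
}

print(collect_edges(thread))
```

Each pair corresponds to one addEdge call in the pygexf version, with the reply as source and its parent as target.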

The graph object needs to be created initially, and the data needs to be gathered. The data could be fetched inside the recursive function itself, but this would affect both complexity and speed. I would advocate first collecting the data, then working with it. The code below illustrates how to set up the graph object and, finally, how to write the complete graph to file.

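from gexf import Gexf  # provided by the pygexf package

def main(argv):
    gexf = Gexf("Name of Graph collection", "Info on graph collection")
    graph = gexf.addGraph("directed", "static", "Information about graph")

    # function to get data
    thread = get_single_thread()

    # generate graph
    graph = generate_graph(thread, graph)

    # we end by writing the graph to file ("thread.gexf" is a placeholder name;
    # binary mode because pygexf writes encoded XML)
    with open("thread.gexf", "wb") as output_file:
        gexf.write(output_file)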
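As for collecting the data first and working with it afterwards: a minimal sketch of how a flat list of captured tweets could be nested into the dict/list structure that generate_graph expects. The function build_thread and the field names "text" and "in_reply_to" are assumptions for illustration, not the format Phirehose or Twitter actually delivers.

```python
def build_thread(tweets, root_id):
    """Nest a flat list of tweets into the dict/list structure used above."""
    # One node per tweet, each starting with an empty list of replies
    nodes = {
        t["id"]: {"id": t["id"], "label": t["text"], "replies": []}
        for t in tweets
    }
    # Attach every tweet to the tweet it replies to, if that tweet was captured
    for t in tweets:
        parent = nodes.get(t["in_reply_to"])
        if parent is not None:
            parent["replies"].append(nodes[t["id"]])
    return nodes[root_id]


# Hypothetical captured data
tweets = [
    {"id": 1, "text": "original", "in_reply_to": None},
    {"id": 2, "text": "first reply", "in_reply_to": 1},
    {"id": 3, "text": "reply to reply", "in_reply_to": 2},
]
thread = build_thread(tweets, root_id=1)
```

Building the whole structure in one pass keeps the recursive function free of any data fetching, which is the separation argued for above.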

The following example is a graph generated to visualize reply-threads in Twitter data.

Replies on Twitter

Data for the graph above was captured using Phirehose, the graphs were generated using Python and the pygexf library, and the network was visualized using Gephi. Read more about social web mining in ‘Mining the Social Web’.

What is the main function of Twitter in crisis?

During a large-scale forest fire in Sweden this summer I was able to capture a few interesting datasets. One of those consists of a month’s worth of Twitter data. While doing a “test run” on the data to generate visualizations I found two really interesting patterns. Judging by these two graphs, I would say that Twitter users mainly look for interesting or important information and forward it to their followers. There are hardly any conversations (or replies), and that really surprised me.

[Figures: reply network (left) and retweet network (right)]

The left image shows the small number of reply networks, with 315 reply tweets in total. The right image shows a much larger network of retweets, with 9952 retweets. Clearly there is very little conversation going on on Twitter, at least under the “official” hashtag of the forest fire. Now the question for you! Why is this important, and how does it affect research on social media use in crisis in general?
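To put the two counts side by side, here is a quick calculation. It uses only the two numbers quoted above and ignores original tweets that were neither replies nor retweets, so the share is relative to replies and retweets combined, not to the whole dataset.

```python
replies = 315     # reply tweets reported above
retweets = 9952   # retweets reported above

total = replies + retweets
reply_share = replies / total
retweets_per_reply = retweets / replies

print(f"replies: {reply_share:.1%} of replies and retweets combined")  # 3.1%
print(f"retweets per reply: {retweets_per_reply:.1f}")                 # 31.6
```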

More is to come regarding both this dataset and the event. Data was captured using Phirehose, the graphs were generated using Python and the pygexf library, and the network was visualized using Gephi.

Printing whiteboards

Aspects of digitizing response work interest me a great deal, and the innovation and adoption of services, methods and devices in recent years have been truly interesting to study. What we have seen is that communication, documentation, and many more aspects of work are getting digitized, thus allowing new kinds of content to be generated and made part of response activities. One environment in which to study such change is the “situation room”. These work-intensive, whiteboard-clad, and collaborative environments have seen huge changes in the last couple of years. What struck me today is that I found a predecessor to today’s “smartboard” that had completely passed me by. Printing whiteboards allow digital copies to be made, stored, shared and, of course, printed. We commonly see whiteboards being photographed for backup, for sharing, and to make their content permanent, as part of emergency response work and other related disciplines. I have, however, not seen a whiteboard capable of doing those tasks itself. (And maybe it is not all too hard to imagine why I haven’t seen these devices in the wild.) Here is a video for you to consider. Would these features relate to situation room work as we know it today?