SOCMINT | Situational Awareness via Twitter
IndraStra Open Journal Systems
IndraStra Global

By Jamie Bartlett and Carl Miller

Image Attribute: Twitter Art by FavsCo / Creative Commons

Twitter is by far the platform of greatest interest for event detection. Of all the uses of event detection technology, building situational awareness of rapidly developing and chaotic events – especially emergencies – is perhaps the most clearly applicable to counter-terrorism. Emerging events are often reported on Twitter as they occur (and often spike shortly thereafter as ‘Twitcidents’).

Social media users (especially Twitter users) can play a number of different roles in exchanging information that can be used to detect events. They can generate information about events first-hand. They can request information about events. They can ‘broker’ information by responding to information requests, checking information and adding information from other sources. And they can propagate information that already exists within the social media stream.

Multimedia content embedded on social media platforms can add useful information – audio, pictures and video – which can help to characterize events. One crucial area of development has been to combine different types of social media information across different platforms. One study used YouTube, Flickr and Facebook, including pictures, user-provided annotations and automatically generated information to detect events and identify their type and scale.

Due to the user-generated nature of social media, there is a pervasive concern with the quality and credibility of the information being exchanged. Given the immediacy and easy propagation of information on Twitter, plausible misinformation has the potential to spread very quickly, causing a statistically significant change in the text stream. Confirming the validity of a positive system response is therefore a crucial step before any action is taken on the basis of that output. A vital requirement of event detection technology is the ability to verify the credibility of information announcing or describing an event. Some promising work has been done to statistically identify first-hand tweets that report a previously unseen story; however, it is unclear how such a system would perform with the relatively small amounts of data available in an emergency scenario.
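To make the idea of a 'statistically significant change in the text stream' concrete, the following is a minimal sketch of one common approach: flagging time bins whose tweet count sits far above a trailing average. It is an illustration only, not the method used by any system discussed in the paper. The function name `detect_bursts`, the window length, the threshold and the sample counts are all assumptions made for the example.

```python
from statistics import mean, stdev

def detect_bursts(counts, window=6, threshold=3.0):
    """Return indices of bins whose count exceeds the trailing-window
    mean by more than `threshold` standard deviations.

    `counts` is a list of tweet counts per time bin (hypothetical data).
    """
    bursts = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        # Floor sigma at 1.0 so a perfectly flat history cannot cause
        # a division by zero or an inflated score.
        sigma = max(stdev(history), 1.0)
        if (counts[i] - mu) / sigma > threshold:
            bursts.append(i)
    return bursts

# Hypothetical counts of tweets matching a keyword, one value per time bin.
counts = [12, 15, 11, 14, 13, 12, 14, 90, 13, 12]
print(detect_bursts(counts))
```

A detector of this kind only reports that the volume changed. It cannot tell a genuine event from a fast-spreading rumour, which is why the verification step described above still has to follow any positive response.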

Generally speaking, untrue stories tend to be short-lived because some Twitter users act as information brokers, actively checking and debunking information that they have found to be false or unreliable. One study, for instance, found that false rumours are questioned more on Twitter by other users than true reportage. Approaches using topic-agnostic features from the tweet stream itself have shown an accuracy of about 85 per cent in detecting newsworthy events.

One 2010 paper, ‘Twitter under crisis’, asked whether it was possible to distinguish ‘confirmed truth’ tweets from ‘false rumour’ tweets in the immediate aftermath of the Chilean earthquake. The research found that Twitter did tend toward weeding out falsehoods: 95 per cent of ‘confirmed truth’ tweets were ‘affirmed’ by users, while only 0.3 per cent were ‘denied’. By contrast, around 50 per cent of false rumour tweets were ‘denied’ by users. Nevertheless, the research may have suffered from a number of flaws. It is known, for example, that the mainstream media still drives traffic – and that tweets including URL links tend to be the most re-tweeted, suggesting that many users may have simply been following mainstream media sources. Moreover, in emergency response, there tend to be more URL shares (approximately 40 per cent compared to an average of 25 per cent) and fewer ‘conversation threads’.
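The affirm/deny comparison reported above can be expressed as a simple proportion over labelled responses. The sketch below assumes that each response to a story has already been labelled by hand as "affirm", "deny" or "question". The label names, the function names and the sample data are hypothetical and are not taken from the study.

```python
from collections import Counter

def response_shares(labels):
    """Map each label to its share of all labelled responses."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}

def denial_share(labels):
    """Fraction of responses that deny the story (0.0 if there are none)."""
    return response_shares(labels).get("deny", 0.0)

# Hypothetical hand-labelled responses to a single story.
labels = ["affirm"] * 3 + ["deny"] * 5 + ["question"] * 2
print(denial_share(labels))
```

On the figures quoted above, a high denial share would be more typical of a false rumour than of a confirmed report. Any cut-off would need to be calibrated on real data, however, and the caveats about mainstream media links apply equally here.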

One factor that is especially important for situational awareness is the ability to identify the geo-spatial characteristics of an event. Many of the techniques described above to infer the location data of social media content are also used in the field of event detection.

However, the reliability of event detection and situational awareness techniques may be context-specific or even event-specific. They appear especially useful in emergency response, where a large number of people have a motive to produce accurate information. By contrast, one recent (unpublished) thesis analysed the extent to which useful real-time information about English Defence League protests could be gleaned from Twitter.

In the build-up to three demonstrations for which data were collected in 2011, most tweets were negative, and very few were geolocated to the event venue. A very large number of tweets were re-tweets (49 per cent compared to 24 per cent during a control period), and on further analysis, a significant proportion of the re-tweets were negative, inaccurate rumours. Moreover, a very large proportion of tweets (50 per cent) came from a very small number (5 per cent) of – usually negative – commentators. The recent crowd-sourced effort on Reddit to positively identify the suspects in the Boston terror attacks was also less successful – although it is not clear whether and how information gained through the exercise was of use or value to the police.
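The two measures reported above, the share of re-tweets and the share of tweets coming from the most active accounts, are straightforward to compute once a collection of tweets is in hand. The following is a minimal sketch and not the thesis author's method. The function names are hypothetical, and treating any text that starts with "RT @" as a re-tweet is a simplifying assumption.

```python
import math
from collections import Counter

def top_author_share(authors, top_fraction=0.05):
    """Share of all tweets written by the most active `top_fraction` of authors.

    `authors` holds one author name per tweet.
    """
    if not authors:
        return 0.0
    per_author = Counter(authors)
    # Always include at least one author, even in very small samples.
    n_top = max(1, math.ceil(len(per_author) * top_fraction))
    top_total = sum(n for _, n in per_author.most_common(n_top))
    return top_total / len(authors)

def retweet_share(texts):
    """Fraction of tweets that look like re-tweets (assumed 'RT @' prefix)."""
    if not texts:
        return 0.0
    return sum(1 for t in texts if t.startswith("RT @")) / len(texts)

# Hypothetical sample: one prolific account and one occasional account.
print(top_author_share(["a", "a", "a", "b"], top_fraction=0.5))
print(retweet_share(["RT @x: march at noon", "is this confirmed?"]))
```

Comparing these values against a control period, as the thesis did, gives a rough indication of whether a stream is dominated by repetition from a handful of voices rather than by independent first-hand reports.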

This is only one of a number of difficulties relating to the validity and reliability of data sets. There are now, for example, systematic, highly organised operations to create fake reviews, although other researchers are using natural language processing to distinguish fake reviews from real ones – including checking IP addresses to determine frequency. Of course, at the very large scale, data can be heavily skewed by automated information bots. Facebook recently revealed that seven per cent of its overall users are fakes and duplicates.

The validity of large-scale data sets relies in part not on every single data point being accurate, but on the expectation that, when aggregated and combined, such data sets can produce valid and robust results – or at least results more robust than any single, even expert, observation. This is the principle that ‘the wisdom of the crowds’ produces more accurate descriptions than any single observer when certain conditions are met: diversity of opinion, independence, decentralization, and aggregation. Social media, as a social network, does not always meet these conditions. One recent study of 140,000 Facebook profiles looked at the first three months of use and found that new members were closely monitoring and adapting to how their friends behaved, suggesting that social learning and social comparison are important influences on behaviour. The 2011 London riots were widely discussed – and perhaps partly organised – via social media networks. It does not appear that Twitter was able to ‘dispel’ misinformation quickly. Indeed, rumours spread rapidly, and although some disagreement was found, it was confined to different, sealed networks.
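The importance of the independence condition can be illustrated with a small simulation. The sketch below is a toy model under stated assumptions, not an analysis of any real data set: each person's estimate is the true value plus Gaussian noise, and an `influence` parameter (hypothetical) controls how much of that noise is shared across the whole network rather than drawn independently.

```python
import random
from statistics import mean

def crowd_error(n_people, influence, rng, truth=100.0, noise=20.0):
    """Absolute error of the crowd's mean estimate in one simulated trial.

    influence=0.0 gives fully independent guesses; influence=1.0 means
    everyone repeats the same shared signal.
    """
    shared = rng.gauss(0.0, noise)  # error common to the whole network
    estimates = [
        truth + influence * shared + (1.0 - influence) * rng.gauss(0.0, noise)
        for _ in range(n_people)
    ]
    return abs(mean(estimates) - truth)

def average_error(n_people, influence, trials=200, seed=0):
    """Mean absolute error of the crowd estimate over many trials."""
    rng = random.Random(seed)
    return mean(crowd_error(n_people, influence, rng) for _ in range(trials))

print("independent crowd:", average_error(200, influence=0.0))
print("influenced crowd: ", average_error(200, influence=0.5))
```

In this model, averaging cancels out independent errors but leaves the shared component untouched, so adding more people to an influenced crowd does little to improve its accuracy. This is one way of reading the observation above that rumours persisted within sealed networks.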

This article is an excerpt from a technical paper titled "THE STATE OF THE ART: A LITERATURE REVIEW OF SOCIAL MEDIA INTELLIGENCE CAPABILITIES FOR COUNTER-TERRORISM", published by Demos under a Creative Commons license.