Social data, also called social media data, is the freely available information published by social media users. Although it is usually considered the domain of marketers, many other professionals rely on it. Companies, for example, use it to track customer satisfaction and to define target customer segments. Even law enforcement agencies use this data to combat some of the most difficult and dangerous crimes.
Most of this data is published by individual users on social media apps and websites: blog posts, point-of-interest (POI) check-ins, comments, likes, shares, and clicks. More sophisticated collection methods also capture information such as how long a user spends viewing an ad or a blog post.
Additional data includes passive information like time spent on an app or website, geographical location, device, and primary language used.
Common attributes of this data include biographical information, app usage, interests, and sentiment.
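As a rough illustration of how the active and passive attributes described above might sit together in one record, here is a minimal Python sketch. The class name `SocialEvent`, its field names, and the set of action labels are assumptions made for this example only; real social data feeds define their own schemas.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Hypothetical labels for actions a user performs deliberately.
ACTIVE_ACTIONS = {"post", "check_in", "comment", "like", "share", "click"}


@dataclass(frozen=True)
class SocialEvent:
    """One illustrative social data record (not a real vendor schema)."""
    user_id: str
    action: str                            # e.g. "like", "share", "view"
    timestamp: datetime
    text: Optional[str] = None             # body of a post or comment, if any
    dwell_seconds: Optional[float] = None  # passive: time spent viewing
    location: Optional[str] = None         # passive: geographical location
    device: Optional[str] = None           # passive: device type
    language: Optional[str] = None         # passive: primary language used


def is_active(event: SocialEvent) -> bool:
    """Return True if the event is something the user explicitly did."""
    return event.action in ACTIVE_ACTIONS
```

Keeping the passive fields optional reflects the fact that not every collection method captures them.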
Companies are the primary users of this data, mainly for marketing and for tracking social media performance.
Other groups that use this data, however, include academics, governments, and intergovernmental organizations. Interpol, for example, works to counter terrorists who recruit new members and plan attacks through social media.
There are several concerns with social media datasets.
First, bots abound on social media, making data cleansing difficult.
Second, NLP (natural language processing) algorithms must be continually tested, because social media users tend toward the extremes: their posts are either excessively positive or excessively negative. Even manual review of social media text cannot always accurately determine users’ true sentiments.
Finally, this data must be updated constantly, first because it is volatile by nature and second because marketers must reach leads quickly in order to be effective. For example, a new like for a musician on social media should lead almost immediately to an ad for a concert in the same genre taking place in the next few days.
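Two of the concerns above, bot activity and stale records, can be sketched as simple filters. The Python below is only an illustration under stated assumptions: events are plain dictionaries with hypothetical `user_id` and `timestamp` keys, and an unusually high event count is used as a crude stand-in for bot detection, which in practice relies on far richer signals.

```python
from collections import Counter
from datetime import datetime, timedelta


def drop_stale(events, now, max_age):
    """Keep only events no older than max_age relative to now."""
    return [e for e in events if now - e["timestamp"] <= max_age]


def flag_suspected_bots(events, max_events_per_user):
    """Return user IDs whose event count exceeds a chosen threshold.

    This is a naive heuristic for illustration, not a real bot detector.
    """
    counts = Counter(e["user_id"] for e in events)
    return {user for user, n in counts.items() if n > max_events_per_user}


def clean(events, now, max_age, max_events_per_user):
    """Drop stale events, then drop events from suspected bot accounts."""
    fresh = drop_stale(events, now, max_age)
    bots = flag_suspected_bots(fresh, max_events_per_user)
    return [e for e in fresh if e["user_id"] not in bots]
```

The freshness window and the per-user threshold are left as parameters because suitable values depend on the platform and on how quickly the data is acted upon.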
Western intelligence agencies have expressed concern about massive databases of social media data on military and government personnel. “[Corporate] records do not offer any indication that Zhenhua is controlled by the government, but the company positions itself among a constellation of data and security firms in the government’s close orbit.”
Washington Post: Chinese firm harvests social media posts, data of prominent Americans and military
GroupLens MovieLens provides movie recommendations based on user ratings. It also provides an advanced search engine, including community-based and even custom tags, to make finding new movies easier.
Flixster Database provides information on movies around the world. This includes plot, filmmaker, crew, current availability, and, of course, reviews.
Flixster also tracks news about movies and the entertainment industry in general.
Criticker Database is one of the largest online collections of movies, TV programs, and games. It covers short and full-length movies, TV programs and mini-series, and games.
The database also provides lists of new releases, a filmmaker list (of those with at least three films on the site), and user collections. These collections may be public, private, or hidden, but they are all user-created.