How to cut through 'dirty data' for social media insight

OMG?! It's Twitter!! Analyzing Text from Social Media
Wading through social media comments can be time-consuming and irritating

Many companies find social media difficult to manage. Most high-ranking business people think Twitter is a goldmine and the best place to find information about their Next Big Thing™.

They hire data collection services to provide them with thousands of tweets, hoping to see a trend that tells them what their clients are looking for. The goal? Identify potential new products, or a new market.

Instead, they're confronted and horrified by what they find: thousands upon thousands of tweets like "laaaame", "luv it #selfieolympics" and "#productx #cool #ninja #batman #teamfollowback #hashtag".

I've seen dirty data like this countless times, and it's all but impossible to sort manually. You'll comb through it for hours, get frustrated, quit, then come back and spend another couple of hours before finding a handful of relevant tweets. Users aren't writing for your benefit: they're tweeting for their own enjoyment.

The value of text analytics

The best way to sift through the rubbish is text analytics. Don't waste time sorting; let a machine do it for you.

In a matter of seconds, the text analytics engine will give you a nice list of important topics that appear in the tweets. It'll look similar to #trends, but will actually be useful. Once you have these topics, you can run searches for them and dig deeper.
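To make the idea concrete, here is a minimal sketch of topic extraction. It is not how any commercial engine works; it simply counts two-word phrases across tweets after stripping hashtags, mentions and links, which is a crude stand-in for real text analytics. The function names, the stop-word list and the sample tweets are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real engines use far larger ones).
STOP_WORDS = {
    "a", "an", "and", "at", "for", "i", "in", "is", "it",
    "my", "of", "on", "the", "this", "to", "was", "with",
}


def tokenize(tweet):
    """Lowercase a tweet and return its words, dropping URLs, @mentions and #hashtags."""
    cleaned = re.sub(r"https?://\S+|[@#]\w+", " ", tweet.lower())
    return re.findall(r"[a-z']+", cleaned)


def top_topics(tweets, n=5):
    """Return the n most common two-word phrases, counting each phrase once per tweet."""
    counts = Counter()
    for tweet in tweets:
        words = tokenize(tweet)
        phrases = {
            f"{first} {second}"
            for first, second in zip(words, words[1:])
            if first not in STOP_WORDS and second not in STOP_WORDS
        }
        counts.update(phrases)
    return counts.most_common(n)


# Hypothetical sample tweets, invented for this example.
sample_tweets = [
    "Bad service at the front desk again #productx",
    "luv it #selfieolympics",
    "Such bad service, waited an hour @productx",
    "laaaame",
    "bad service and a defective remote #fail",
]

topics = top_topics(sample_tweets, n=3)
print(topics)
```

On this sample, "bad service" rises to the top because it appears in three separate tweets, while the hashtag noise never enters the count.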

For example, "bad service" or "defective" might come up as topics when you run through the data about your company. You know they've appeared often enough that they're worth looking into. At that point, you can run manual searches to see if there are patterns of "bad service" or "defective" products.

It turns out that all of the "bad service" complaints came from the same hotel – someone's getting fired. Every "defective" product was made with the new set of cheap screws – back to the original product design.
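The kind of drill-down described above can be sketched in a few lines. This assumes each comment has already been matched to some piece of metadata, such as the hotel it refers to; raw tweets rarely carry that, so the `hotel` field, the `breakdown` function and the records below are all hypothetical.

```python
from collections import Counter


def breakdown(records, topic, field):
    """Count the records whose text mentions `topic`, grouped by a metadata field."""
    topic = topic.lower()
    return Counter(
        record[field] for record in records if topic in record["text"].lower()
    )


# Hypothetical records, invented for this example.
records = [
    {"text": "Bad service at check-in", "hotel": "Hotel A"},
    {"text": "bad service, never again", "hotel": "Hotel A"},
    {"text": "Lovely stay, great pool", "hotel": "Hotel B"},
    {"text": "Really BAD SERVICE tonight", "hotel": "Hotel A"},
    {"text": "luv it", "hotel": "Hotel C"},
]

by_hotel = breakdown(records, "bad service", "hotel")
print(by_hotel.most_common())
```

If one value dominates the count, as Hotel A does here, that is the pattern worth investigating by hand.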

To put it simply, text analytics is a great way to spare yourself a severe headache when analyzing your social data.

  • Rami Nuseir is Semantria's Marketing Director, and a regular contributor to the Lexalytics corporate blog (Lexablog). Both companies specialize in text analytics and sentiment analysis technology.