The Impact of Machine Learning and Artificial Intelligence on Unstructured Data and The Importance of Human Analysis

23 Feb 2023
   —  by Giles Brown

Over the last six months, AI has been at the forefront of public conversation. This is partly thanks to ChatGPT and similar tools dominating news coverage, which often paints a picture of a dystopian society where computers take over the decision-making and creative processes that are usually the realm of human beings.

We are not quite there yet, and humans still play the starring role. But all this attention has highlighted the fact that AI and Machine Learning (ML) can deliver substantial benefits. For example, the technology can be (and is being) applied to managing vast amounts of data, much of it unstructured, to help us synthesize information and make better, more informed decisions. This will significantly impact data-heavy industries such as banking and finance, law, healthcare, retail, and manufacturing.

Unstructured data is information that is not arranged according to a pre-set data model or schema and therefore does not fit neatly into the rows and columns of a traditional relational database. Many business documents are unstructured, as are email messages, videos, photos, webpages, and audio files. Text and multimedia are two common types of unstructured content.

Unstructured data stores contain a wealth of information that can be used to guide business decisions. However, unstructured data has historically been complicated to analyze. With the help of AI and machine learning, new software tools are emerging that can search vast quantities of it to uncover valuable and actionable business intelligence, helping real people make more informed decisions.

Unstructured data can be created by people or generated by machines.

Here are some examples of the human-generated variety:

  • Email: The body of an email message is unstructured and cannot be readily parsed by traditional analytics tools. That said, email metadata (sender, recipient, date, subject) gives it some structure, which is why email is sometimes considered semi-structured data.
  • Text files: This category includes word processing documents, spreadsheets, presentations, email, and log files.
  • Social media and websites: This category includes data from social networks like Twitter, LinkedIn, and Facebook, and from sites like Instagram, YouTube, and other photo- and video-sharing platforms.
  • Mobile and communications data: For this category, look no further than text messages, phone recordings, collaboration software, chat, and instant messaging.
  • Media: This data includes digital photos, audio, and video files.

Here are some examples of unstructured data generated by machines:

  • Scientific data: This includes oil and gas surveys, space exploration, seismic imagery, and atmospheric data.
  • Digital surveillance: This category features data like reconnaissance photos and videos.
  • Satellite imagery: This data includes weather data, landforms, and military movements.

At Social360, we apply AI and ML tools to help sift through the vast amounts of information we harvest for our clients using our proprietary search tools. In the same way that a law firm might apply AI and ML to search through reams of case law for the few pieces of information that could influence a case or secure a better deal, we apply similar techniques to assist our skilled human analysts. The goal is to deliver relevant, actionable data to communications teams and senior executives so they better understand what's being said about their company.

We are beginning to utilize AI and ML techniques that enable computers to learn from data and perform tasks that usually require human intelligence, such as understanding natural language and summarizing text. AI and ML can help process and analyze unstructured data in various ways:

Classification: the technology can assign labels or categories to unstructured data, using either predefined rules or patterns learned from labeled examples. For example, AI and ML can classify emails as spam or not spam, images as cats or dogs, or sentiment as positive or negative.
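As a minimal sketch of the rule-based end of classification, the Python below labels a piece of text as positive, negative, or neutral by counting keyword hits. The keyword lists and the function name classify_sentiment are hypothetical and chosen only for illustration. A production classifier would more likely learn its criteria from labeled examples than rely on a fixed word list.

```python
import re

# Hypothetical keyword lists, chosen only for illustration.
POSITIVE_WORDS = {"great", "love", "excellent", "helpful"}
NEGATIVE_WORDS = {"terrible", "hate", "broken", "refund"}


def classify_sentiment(text: str) -> str:
    """Label text as positive, negative, or neutral by counting keyword hits."""
    words = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(word in POSITIVE_WORDS for word in words)
    negative_hits = sum(word in NEGATIVE_WORDS for word in words)
    if positive_hits > negative_hits:
        return "positive"
    if negative_hits > positive_hits:
        return "negative"
    return "neutral"  # no hits, or an equal number on each side
```

A fixed word list like this misses sarcasm, negation, and context, which is one reason learned models and human review both matter.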

Clustering: it can group unstructured data based on similarities or patterns without using predefined labels or categories. For example, it can cluster customers based on their preferences, behaviors, or demographics or cluster documents based on their topics, keywords, or authors.
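To illustrate grouping without predefined labels, here is a small Python sketch that clusters short documents by word overlap (Jaccard similarity). The function names and the 0.3 threshold are assumptions made for this example. It uses a simple greedy pass that compares each document with the first member of each existing cluster, which is far cruder than the clustering algorithms used in practice.

```python
import re


def tokens(text: str) -> set[str]:
    """Return the set of lowercase words in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of words the two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_documents(documents: list[str], threshold: float = 0.3) -> list[list[str]]:
    """Greedily group documents whose word overlap meets the threshold."""
    clusters: list[list[str]] = []
    for document in documents:
        for cluster in clusters:
            # Compare against the first document placed in the cluster.
            if jaccard(tokens(document), tokens(cluster[0])) >= threshold:
                cluster.append(document)
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            clusters.append([document])
    return clusters
```

Given two headlines about bank interest rates and two about a phone camera review, this sketch would place them in two separate groups without ever being told what the topics are.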

Extraction: AI and ML can extract specific information or entities from unstructured data, such as names, dates, locations, and prices. For example, they can extract product reviews from social media posts or key facts from news articles.

Generation: the technology can generate new unstructured data, such as text, images, audio, or video, based on existing data. For example, it can create captions for images, text summaries, language translations, or music for lyrics.
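The simplest form of extraction can be sketched with pattern matching. The Python below pulls ISO-style dates and simple currency amounts out of free text. The patterns and the function name extract_entities are illustrative assumptions. Extracting names or locations generally needs a trained model rather than fixed patterns like these.

```python
import re

# Illustrative patterns only: ISO dates (YYYY-MM-DD) and simple prices.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
PRICE_PATTERN = re.compile(r"[$£€]\d+(?:\.\d{2})?")


def extract_entities(text: str) -> dict[str, list[str]]:
    """Return the dates and prices found in the text, in order of appearance."""
    return {
        "dates": DATE_PATTERN.findall(text),
        "prices": PRICE_PATTERN.findall(text),
    }
```

Fixed patterns like these are brittle (they would miss "23 Feb 2023" or "1,000 dollars"), which is where learned extraction models earn their keep.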

However, AI and ML are not perfect: they can make errors or produce biased or inaccurate results. This is where human editing comes into play. Human editing is the process of reviewing, correcting, or improving the output of AI and ML models, for example by checking spelling, grammar, logic, coherence, and relevance. Human editing helps ensure the quality, reliability, and validity of the output, and it provides feedback and guidance for improving the models themselves.

The growth in the volume of social media posts has been exponential, making it unmanageable for humans to analyze and identify reputationally impactful content with the speed necessary to make the data actionable. At Social360, we used over a decade’s worth of human-analyzed data to train our models to help with this task.

Every piece of content we identify is individually scored on the probability of reputational impact on an organization. This means that by the time our analysts review the day’s social coverage for one of our clients, a large part of the analysis has already been completed. It’s down to our analysts to take the baton from the machine and analyze the content that has the most potential impact. In this way, by using AI and ML, a task that might have taken days could take less than an hour – think about how that increases the value of the process for clients.
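The hand-off from machine to analyst described above can be sketched as a simple triage step. This is not Social360's actual system: the ScoredPost type, the build_review_queue function, and the 0.5 threshold are all hypothetical, and the probabilities are assumed to come from a separately trained model that is not shown.

```python
from dataclasses import dataclass


@dataclass
class ScoredPost:
    text: str
    impact_probability: float  # hypothetical model output between 0 and 1


def build_review_queue(posts: list[ScoredPost], threshold: float = 0.5) -> list[ScoredPost]:
    """Keep posts at or above the threshold, highest probability first."""
    flagged = [post for post in posts if post.impact_probability >= threshold]
    return sorted(flagged, key=lambda post: post.impact_probability, reverse=True)
```

The analyst then starts at the top of the queue, so human attention goes first to the content most likely to matter.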

We embrace the use of technology to improve our work. ML and AI are powerful techniques that can help process and analyze vast amounts of data, but the final mile will always (or at least for the foreseeable future) rely on human skills to deliver value for clients.

Social360 is an advanced social media monitoring company. For more information, please get in touch with Alex Baker or Giles Brown or visit our website –
