Data mining is commonly perceived as a process that deals with data extraction, but actually, it’s far more complex than that.

It covers how interested parties examine large stores of information to generate new insights. More broadly, it revolves around analyzing patterns in existing data and extrapolating new knowledge from them, using that data as effectively as possible.

There are various techniques to explore, refined by specialists who have dedicated years to processing large volumes of information and drawing conclusions from it.

But what exactly are these techniques?

Data mining techniques

  • Pattern tracking

This basic technique enables the recognition of patterns in data sets.

By spotting these patterns you can more easily identify variations that recur at regular intervals, as well as other deviations that need to be brought to the surface.

For example, you might notice a sales spike just before the holidays, or how warmer weather has a tendency to attract visitors. This type of information has multiple uses and applications.
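To make the holiday spike concrete, here is a minimal Python sketch that averages sales by calendar month across several years and picks out the busiest month. The sales figures and the `average_by_month` helper are invented for illustration; real data would come from your own records.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (year, month, units_sold) records; the numbers are invented.
sales = [
    (2017, 10, 120), (2017, 11, 135), (2017, 12, 310),
    (2018, 10, 125), (2018, 11, 140), (2018, 12, 330),
]

def average_by_month(records):
    """Return {month: mean units sold in that month across all years}."""
    by_month = defaultdict(list)
    for _year, month, units in records:
        by_month[month].append(units)
    return {month: mean(values) for month, values in by_month.items()}

monthly_average = average_by_month(sales)
# The month with the highest average is the recurring peak.
peak_month = max(monthly_average, key=monthly_average.get)
print(peak_month, monthly_average[peak_month])
```

With this sample data, December stands out year after year, which is exactly the kind of recurring pattern worth tracking.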

  • Association

Association is related to pattern tracking, but focuses specifically on linked variables. With this technique, you look for events that strongly correlate with other events.

A good example of this is the analysis of consumer buying habits. You might notice that when customers purchase a certain item, they’re highly likely to buy another related item. This is the sort of information used to generate the ‘people also bought’ sections you see in online stores.
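One simple way to put a number on "highly likely to buy another related item" is the share of baskets containing one item that also contain the other, often called confidence. The sketch below assumes a handful of made-up shopping baskets and a hypothetical `confidence` helper; it is not how any particular store builds its recommendations.

```python
# Hypothetical shopping baskets; the items are invented for illustration.
baskets = [
    {"phone", "case"},
    {"phone", "case", "charger"},
    {"phone", "charger"},
    {"case"},
    {"phone", "case"},
]

def confidence(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    both = sum(1 for b in with_antecedent if consequent in b)
    return both / len(with_antecedent)

# Of the 4 baskets with a phone, 3 also contain a case.
print(confidence(baskets, "phone", "case"))  # 0.75
```

A high value suggests the second item is a reasonable candidate for a ‘people also bought’ suggestion next to the first.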

  • Regression

Regression is used in planning and modeling to identify the likely value of a given variable in the presence of other factors. For example, the pricing team of a company might price products based on factors like consumer demand, availability, and competition.

In its simplest form, this method determines the relationship between two variables in a data set.
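For two variables, that relationship can be estimated with a straight-line fit (ordinary least squares). The following is a minimal sketch in plain Python; the prices, the sales figures, and the `fit_line` helper are assumptions made up for this example.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for paired values."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx  # assumes the x values are not all identical
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data: price charged and units sold at that price.
prices = [10, 12, 14, 16]
units_sold = [100, 90, 80, 70]

slope, intercept = fit_line(prices, units_sold)
print(slope, intercept)
```

Here the slope comes out negative: in this invented data set, each unit of price increase goes with five fewer sales.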

  • Prediction

One of the most valuable data mining techniques of all, prediction is at the heart of future projections.

A company’s ability to accurately predict future outcomes is a huge success factor. Recognizing historical trends is a fantastic way to foresee future market conditions. For example, if you review the credit history and past purchases of consumers, you’ll be better placed to assess whether they’ll be a credit risk in the future.

  • Outlier detection

Perhaps one of the simpler methods on the list, outlier detection starts with recognizing the overall pattern in a data set, which gives you a better understanding of it.

You can then identify the aberrations that don’t fit that pattern. For example, your target audience might be predominantly male, but then you notice a sudden, huge spike in female buyers. By investigating this spike you can figure out what caused it, enabling you to replicate it at a later date and improve your understanding of your audience.
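A common, simple way to flag a spike like this is to measure how many standard deviations each value sits from the average (a z-score) and flag anything beyond a chosen cut-off. The sketch below uses invented daily counts, and the threshold of 2.0 is an assumption you would tune for your own data.

```python
from statistics import mean, pstdev

def find_outliers(values, threshold=2.0):
    """Return the indexes of values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # every value is identical, so nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# Hypothetical number of female buyers per day; day 7 is the sudden spike.
female_buyers_per_day = [20, 22, 19, 21, 20, 23, 21, 95, 22, 20]

print(find_outliers(female_buyers_per_day))  # [7]
```

The flagged index tells you which day to investigate; working out why it happened is still up to you.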

  • Classification

A more complicated data mining technique, classification involves sorting data into discernible categories based on various attributes.

From this point, you can draw further conclusions which serve a specific purpose.

A great example of this is when companies evaluate the financial backgrounds and purchasing histories of customers. From this information they can classify consumers based on credit risk, establishing categories like “low,” “medium,” and “high.” This is useful because it lets you learn more about your clientele.
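As a rough illustration of the credit risk example, the sketch below labels a new customer by looking at the three most similar customers whose risk category is already known (a k-nearest-neighbours vote). The two attributes, the labelled examples, and the `classify` helper are all hypothetical; real credit scoring relies on far more data and care.

```python
from collections import Counter
from math import dist

# Hypothetical examples: (missed payments last year, debt-to-income ratio) and a known label.
labelled = [
    ((0, 0.10), "low"), ((0, 0.15), "low"), ((1, 0.20), "low"),
    ((2, 0.35), "medium"), ((3, 0.40), "medium"), ((2, 0.45), "medium"),
    ((5, 0.70), "high"), ((6, 0.80), "high"), ((7, 0.75), "high"),
]

def classify(customer, examples, k=3):
    """Label `customer` by majority vote among the k closest labelled examples."""
    nearest = sorted(examples, key=lambda ex: dist(customer, ex[0]))[:k]
    votes = Counter(label for _features, label in nearest)
    return votes.most_common(1)[0][0]

print(classify((0, 0.12), labelled))  # low
print(classify((6, 0.72), labelled))  # high
```

Note that the attributes are left unscaled here for brevity, so the missed-payments count dominates the distance; in practice you would normalize them first.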

  • Clustering

This is very much like classification, except the categories aren’t defined in advance. Instead, you look at how chunks of data group together by similarity.

For example, companies often cluster the demographics within their audience into segments. These are based on different variables like disposable income, or how often a customer shops with you.
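To show the idea in miniature, here is a sketch of k-means clustering on a single variable. The income figures, the starting centroids, and the `k_means_1d` helper are assumptions for illustration; a real segmentation would use several variables and usually an established library.

```python
def k_means_1d(values, centroids, iterations=10):
    """Group numbers around the nearest centroid, then move each centroid to its group's mean."""
    centroids = list(centroids)
    groups = [[] for _ in centroids]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # Keep the old centroid if a group ends up empty.
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    return centroids, groups

# Hypothetical monthly disposable income of customers, in hundreds.
incomes = [18, 20, 22, 21, 60, 65, 58, 62]

# Start from the lowest and highest values as initial guesses.
centroids, groups = k_means_1d(incomes, centroids=[min(incomes), max(incomes)])
print(centroids)
print(groups)
```

Nobody told the code where the boundary between the two segments lies; the groups emerge from the data, which is the key difference from classification.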

Facebook’s #10yearchallenge – an elaborate setup?

Earlier this year, the #10yearchallenge went viral on Facebook, asking users to post photographs of themselves from ten years prior.

It was advertised as an interactive opportunity to share the effects of aging with others while giving users an opportunity to see how much they’ve changed. More than 5.2 million people partook, but many began to question whether Facebook had a secret agenda.

Was it a ploy to extract facial recognition data from users?

If so, Facebook devised a very clever data mining tactic designed to train its facial recognition algorithm. This would help it calibrate age-related characteristics, particularly how people’s appearances develop as they progress through life. It would’ve captured a huge audience under a significantly more attractive premise than asking people to submit photographs for market research.

Some would argue this is a bit far-fetched, but at the same time, the dots connect.

Professor Amy Webb was quoted as saying it was a ‘perfect storm for machine learning’, a credible opinion when you consider her AI expertise.

Though Facebook adamantly said otherwise, claiming it had nothing to gain from the challenge, speculation continues to run rampant. Regardless of your thoughts on the matter, there’s no denying the potential for Facebook to gain from gathering a comprehensive collection of faces, a data goldmine!

Do VPNs collect data and sell it on?

This question was brought to the forefront when free VPN provider Hola! VPN converted its users into a botnet without consent.

It’s an alarming proposition for certain, and one which could deter users from engaging with VPNs altogether.

Hola! VPN is available as a plug-in for Google Chrome, and before the outrage ensued it was praised for its ease of use and free service.

But nothing is ever free in this world, my friends, as many users unfortunately found out!

The issue came to light when multiple DoS attacks were targeted at the controversial 8chan forum. The attacks appeared to be coming from the Hola! network, which shed light on the Hola! business model.

Ultimately, the temptation to profit from a botnet of nearly 9 million IP addresses was too much for Hola! VPN to resist. Selling access to data was perceived as invasive, and could potentially put users in harm’s way.

Hola! VPN doesn’t provide its own bandwidth or servers, which is a critical reason it could operate as a botnet in the first place. Most VPNs have thousands of servers spread around the world and can route users through different locations so they appear to be in another country.

In this case, Hola! simply redirects its users through one another, operating a peer-to-peer VPN which routes connections through user devices. In effect, this is something like a telephone exchange. Hola! VPN makes money by selling the idle bandwidth of free users. Those who don’t want to contribute bandwidth have to pay for the service.

Though the concept was made clear in the terms and conditions, many users were oblivious to the fact their bandwidth was being sold. The prospect of strangers using their internet connections outraged consumers, especially those who protested that their connections could be used for illegal purposes.

Though it’s important to note most reputable VPNs are trustworthy, this case certainly raised some questions in the minds of many.