01 Supervised learning
Supervised learning trains an AI model on labelled examples: inputs paired with the correct output. A spam filter built this way is trained on thousands of emails that humans have labelled 'spam' or 'not spam'; the model learns to classify new emails from the patterns in those labelled examples.
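To make the spam filter idea concrete, the following is a minimal sketch of a supervised classifier, assuming a tiny made-up set of labelled emails and a simple word-counting method (naive Bayes with add-one smoothing). The data and the names `LABELLED_EMAILS`, `train`, and `classify` are hypothetical and chosen for illustration only; a production filter would use far more data and a more capable model.

```python
import math
from collections import Counter

# Hypothetical labelled examples: each input (email text) is paired with
# the correct output (its label), as a human reviewer would assign it.
LABELLED_EMAILS = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("limited offer win cash now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("please review the quarterly report", "not spam"),
    ("lunch on monday with the team", "not spam"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {}          # label -> Counter of words seen with that label
    label_counts = Counter()  # label -> number of examples with that label
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the highest naive Bayes log-score."""
    word_counts, label_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_examples in label_counts.items():
        counts = word_counts[label]
        total_words = sum(counts.values())
        # Start from the log prior: how common this label is overall.
        score = math.log(n_examples / total_examples)
        for word in text.split():
            # Add-one smoothing so an unseen word does not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(LABELLED_EMAILS)
print(classify("claim your free prize", model))        # spam
print(classify("agenda for the team meeting", model))  # not spam
```

The model never receives a rule such as "the word 'free' means spam"; it infers that association from the labels, which is why the quality and quantity of labelled examples matter so much.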
Supervised learning is the dominant approach for AI tasks where you know what the correct answer looks like and can provide examples: fraud detection (labelled as fraudulent or legitimate), credit scoring (labelled with actual default outcomes), image classification (labelled with what the image contains), and sentiment analysis (labelled with positive, negative, or neutral).
The key requirement is labelled data: someone has to do the work of labelling examples correctly. This is often the most labour-intensive and expensive part of building a supervised learning system.
02 Unsupervised learning
Unsupervised learning trains an AI model on unlabelled data and finds patterns without being told what patterns to look for. Rather than learning from correct answers, the system discovers structure in the data on its own.
Clustering is a classic unsupervised learning task: given customer purchase data, the system identifies groups of customers with similar purchasing patterns without being told in advance how many groups exist or what characterises each group. Topic modelling is another example: given a set of documents, the system identifies the major topics in the collection without being given a topic taxonomy.
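As an illustration of clustering, here is a minimal sketch using k-means, one common clustering algorithm, on made-up customer data. Two assumptions are worth flagging: the customer figures are hypothetical, and k-means requires the number of groups (`k`) to be chosen up front, unlike methods that estimate the number of groups themselves. The initialisation is deliberately naive (the first `k` points); real implementations typically use smarter seeding such as k-means++.

```python
import math

# Hypothetical customers: (orders per year, average basket value).
# There are no labels: nobody has said which group each customer belongs to.
CUSTOMERS = [
    (2, 20), (3, 25), (2, 30),
    (25, 22), (28, 18), (30, 25),
    (4, 200), (5, 220), (3, 210),
]


def nearest(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def k_means(points, k, max_iterations=100):
    """Group points into k clusters; returns (assignments, centroids)."""
    # Naive initialisation (an assumption of this sketch): first k points.
    centroids = [tuple(float(v) for v in p) for p in points[:k]]
    assignments = None
    for _ in range(max_iterations):
        new_assignments = [nearest(p, centroids) for p in points]
        if new_assignments == assignments:
            break  # no point changed group, so the algorithm has converged
        assignments = new_assignments
        for i in range(k):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:  # keep the old centroid if a group ends up empty
                centroids[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assignments, centroids


assignments, centroids = k_means(CUSTOMERS, k=3)
for customer, group in zip(CUSTOMERS, assignments):
    print(customer, "-> group", group)
```

On this data the algorithm separates occasional low-spend, frequent low-spend, and occasional high-spend customers. The group numbers themselves carry no meaning; interpreting what each group represents remains a human task.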
Unsupervised learning is appropriate when you want to discover structure in data that you do not yet understand, or when labelling data for supervised learning is impractical. It is also how many modern AI systems build initial representations of data before fine-tuning for specific tasks.
03 Does the distinction matter for executives?
For most business AI decisions, the supervised/unsupervised distinction is a level of detail that the technical team needs to understand rather than the executive team. However, it is relevant in a few specific situations.
When evaluating a proposed AI solution, knowing whether it is supervised, and therefore requires labelled data, is important for assessing the data preparation investment. If a vendor proposes a supervised learning solution for a problem where you do not have labelled historical data, the labelling work is a significant project that needs to be scoped and funded.
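A back-of-the-envelope estimate can help frame that scoping conversation. The sketch below assumes labelling cost is driven by four inputs: the number of examples, the time per label, how many reviewers label each example, and an hourly rate. The function name `labelling_cost` and every figure used are hypothetical placeholders, not benchmarks; substitute your own numbers.

```python
def labelling_cost(n_examples, seconds_per_label, labels_per_example, hourly_rate):
    """Rough labour cost of labelling a dataset (excludes tooling and QA overhead)."""
    total_hours = n_examples * labels_per_example * seconds_per_label / 3600
    return total_hours * hourly_rate


# All inputs below are illustrative assumptions, not industry figures.
cost = labelling_cost(
    n_examples=50_000,      # historical records to label
    seconds_per_label=30,   # time for one reviewer to label one record
    labels_per_example=2,   # two reviewers per record for quality control
    hourly_rate=25.0,       # cost per reviewer hour, in your currency
)
print(f"Estimated labelling cost: {cost:,.0f}")
```

Even a rough estimate like this makes it easier to see how the cost scales with dataset size and review depth before a vendor proposal is accepted.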
When exploring AI for discovery use cases (finding patterns in customer behaviour, identifying unusual transactions, understanding unstructured feedback), unsupervised approaches may be more appropriate than supervised, and this is worth confirming with the technical team.
Key Takeaways
1. Supervised learning trains on labelled input-output pairs; it requires labelled data but produces models for specific prediction tasks.
2. Unsupervised learning finds patterns in unlabelled data; it is appropriate for discovery tasks where the correct output is not known in advance.
3. For executives, the key practical question is whether a proposed solution is supervised and therefore needs labelled training data, and what creating that data will cost.
4. Labelling data for supervised learning is often the most labour-intensive and costly part of AI system development and is frequently underestimated.
5. Modern large language models primarily use a combination of unsupervised pre-training and supervised fine-tuning; understanding this helps contextualise how they work.