Clustering Workflow in ML

December 10, 2019

To cluster your data, you’ll follow these steps:

1. Prepare data.
2. Create similarity metric.
3. Run clustering algorithm.
4. Interpret results and adjust your clustering.

This page briefly introduces the steps. We’ll go into depth in subsequent sections.

Prepare Data

As with any ML problem, you must normalize, scale, and transform feature data. When clustering, however, you must also ensure that the prepared data lets you accurately calculate the similarity between examples. The next sections discuss this consideration.

Review: For a review of data transformation, see Introduction to Transforming Data from the Data Preparation and Feature Engineering for Machine Learning course.
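To make the scaling concern concrete, here is a minimal sketch of z-score normalization, which is one common way to keep a large-valued feature from dominating a similarity calculation. The function name `z_score_columns` and the age/income data are hypothetical and chosen only for illustration; the right transformation for your data may differ.

```python
from statistics import mean, pstdev

def z_score_columns(rows):
    """Scale each feature column to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Guard against constant columns, whose standard deviation is 0.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]

# Hypothetical examples: (age in years, income in dollars).
raw = [[25, 40_000], [35, 60_000], [45, 80_000]]
scaled = z_score_columns(raw)
```

After scaling, age and income are on comparable scales, so neither feature dominates a distance computed over both.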

Create Similarity Metric

Before a clustering algorithm can group data, it needs to know how similar pairs of examples are. You quantify the similarity between examples by creating a similarity metric. Creating a similarity metric requires you to carefully understand your data and how to derive similarity from your features.
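As a rough illustration, the sketch below builds a similarity score from Euclidean distance over numeric features. This is only one possible choice, assumed here for simplicity; the names `euclidean_distance` and `similarity` and the mapping 1 / (1 + distance) are illustrative assumptions, not a metric prescribed by the course.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two numeric feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def similarity(a, b):
    """Map distance into (0, 1]: identical examples score 1.0."""
    return 1.0 / (1.0 + euclidean_distance(a, b))

# Hypothetical examples with two already-scaled features each.
near = similarity([0.0, 0.0], [0.1, 0.1])
far = similarity([0.0, 0.0], [3.0, 4.0])
```

Here `near` is larger than `far`, which matches the intuition that closer examples are more similar.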

Run Clustering Algorithm

A clustering algorithm uses the similarity metric to cluster data. This course focuses on k-means.
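The following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python, assuming Euclidean distance and random initialization from the data points. The function name `k_means`, its parameters, and the sample points are hypothetical; a production workflow would more likely use a tested library implementation.

```python
import math
import random

def k_means(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # Converged: no point changed cluster.
        assignments = new_assignments
        # Update step: move each centroid to the mean of its points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # Keep the old centroid if a cluster is empty.
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids

# Hypothetical data: two well-separated groups of three points.
points = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]]
labels, centers = k_means(points, k=2)
```

On this toy data the first three points end up in one cluster and the last three in the other. Which group receives label 0 depends on the random initialization.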

Interpret Results and Adjust

Checking the quality of your clustering output is iterative and exploratory because clustering lacks a ground “truth” against which to verify the output. Instead, you verify the result against your expectations at the cluster level and the example level. Improving the result requires iteratively experimenting with the previous steps to see how they affect the clustering.
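One simple cluster-level check, sketched below under the assumption of Euclidean distance, is to compare each cluster's size and its average distance from points to the centroid. The function name `cluster_summary` and the sample output are hypothetical; a cluster that is unusually small or unusually spread out is a candidate for closer inspection, not proof of a problem.

```python
import math
from collections import defaultdict

def cluster_summary(points, labels, centroids):
    """Return {cluster: (size, mean distance to centroid)}."""
    distances = defaultdict(list)
    for point, label in zip(points, labels):
        distances[label].append(math.dist(point, centroids[label]))
    return {
        label: (len(ds), sum(ds) / len(ds))
        for label, ds in sorted(distances.items())
    }

# Hypothetical clustering output for illustration.
points = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0]]
labels = [0, 0, 1]
centroids = [[0.0, 1.0], [10.0, 10.0]]
summary = cluster_summary(points, labels, centroids)
```

For example-level checks, you would still inspect individual examples within a cluster to confirm that they look similar to one another.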

This article has been published from the source link without modifications to the text. Only the headline has been changed.
