Fixing FutureWarning Messages in scikit-learn

February 26, 2019

Upcoming changes to the scikit-learn library for machine learning are reported through the use of FutureWarning messages when the code is run.

Warning messages can be confusing to beginners as it looks like there is a problem with the code or that they have done something wrong. Warning messages are also not good for operational code as they can obscure errors and program output.

There are many ways to handle a warning message, including ignoring the message, suppressing warnings, and fixing the code.

In this tutorial, you will discover FutureWarning messages in the scikit-learn API and how to handle them in your own machine learning projects.

After completing this tutorial, you will know:

FutureWarning messages are designed to inform you about upcoming changes to default values for arguments in the scikit-learn API.
FutureWarning messages can be ignored or suppressed as they do not halt the execution of your program.
Examples of FutureWarning messages and how to interpret the message and change your code to address the upcoming change.

Let’s get started.

Tutorial Overview

This tutorial is divided into four parts; they are:

Problem of FutureWarnings
How to Suppress FutureWarnings
How to Fix FutureWarnings
FutureWarning Recommendations

Problem of FutureWarnings

The scikit-learn library is an open-source library that offers tools for data preparation and machine learning algorithms.

It is a widely used and constantly updated library.

Like many actively maintained software libraries, the APIs often change over time. This may be because better practices are discovered or preferred usage patterns change.

Most functions available in the scikit-learn API have one or more arguments that let you customize the behavior of the function. Many arguments have sensible defaults so that you don’t have to specify a value for the arguments. This is particularly helpful when you are starting out with machine learning or with scikit-learn and you don’t know what impact each of the arguments has.

Changes to the scikit-learn API often take the form of new default values for function arguments. Changes of this type are usually not made immediately; instead, they are planned.

For example, if your code was written for a prior version of the scikit-learn library and relies on a default value for a function argument and a subsequent version of the API plans to change this default value, then the API will alert you to the upcoming change.

This alert comes in the form of a warning message each time your code is run. Specifically, a “FutureWarning” is reported on standard error (e.g. on the command line).

This is a useful feature of the API and the project, designed for your benefit. It allows you to prepare your code for the release in which the change takes effect, either to retain the old behavior (specify a value for the argument) or to adopt the new behavior (no change to your code).
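As a rough illustration of the mechanism, the sketch below shows how a library might announce a planned change to a default. The function fit_model and its solver argument are hypothetical stand-ins and are not scikit-learn code; only warnings.warn and FutureWarning come from the Python standard library.

```python
import warnings


def fit_model(solver=None):
    """Hypothetical library function whose default solver is scheduled to change."""
    if solver is None:
        # the caller relied on the default, so announce the planned change
        warnings.warn(
            "Default solver will be changed to 'lbfgs' in a future release. "
            "Specify a solver to silence this warning.",
            FutureWarning,
            stacklevel=2,
        )
        solver = "liblinear"  # keep the old default for now
    return solver


# relying on the default: a FutureWarning is printed to standard error,
# but execution continues and the old default is still used
print(fit_model())

# passing the argument explicitly: no warning is reported
print(fit_model(solver="lbfgs"))
```

The warning is only triggered when the caller leaves the argument unset, which is why specifying a value, old or new, is enough to silence it.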

A Python script that reports warnings when it runs can be frustrating.

For a beginner, it may feel like the code is not working correctly, that perhaps you have done something wrong.
For a professional, it is a sign of a program that requires updating.

In either case, warning messages may obscure real error messages or output from the program.

How to Suppress FutureWarnings

Warning messages are not error messages.

As such, a warning message reported by your program, such as a FutureWarning, will not halt the execution of your program. The warning message will be reported and the program will carry on executing.

You can, therefore, ignore the warning each time your code is executed, if you wish.

It is also possible to programmatically ignore the warning messages. This can be done by suppressing warning messages when your program is run.

This can be achieved by explicitly configuring the Python warning system to ignore warning messages of a specific type, such as ignore all FutureWarnings, or more generally, to ignore all warnings.

This can be achieved by adding the following block around your code that you know will generate warnings:

# run block of code and catch warnings
import warnings
with warnings.catch_warnings():
    # ignore all caught warnings
    warnings.filterwarnings("ignore")
    # execute code that will generate warnings
    ...

Or, if you have a very simple flat script (no functions or blocks), you can suppress all FutureWarnings by adding two lines to the top of your file:

# import warnings filter
from warnings import simplefilter
# ignore all future warnings
simplefilter(action='ignore', category=FutureWarning)
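A narrower option is to ignore only the FutureWarning category inside a block, so that other warnings still get through. The sketch below is a minimal illustration of that idea; the function noisy is a hypothetical stand-in for a library call that emits two kinds of warning, and the warnings are recorded rather than printed so the effect of the filter can be inspected.

```python
import warnings


def noisy():
    """Hypothetical function standing in for a library call that warns."""
    warnings.warn("default will change", FutureWarning)
    warnings.warn("something else looks off", UserWarning)
    return 42


with warnings.catch_warnings(record=True) as caught:
    # start from a clean slate so every warning would normally be recorded
    warnings.simplefilter("always")
    # ignore only FutureWarning; other categories still get through
    warnings.filterwarnings("ignore", category=FutureWarning)
    result = noisy()

# collect the categories of the warnings that were not suppressed
remaining = [w.category.__name__ for w in caught]
print(result, remaining)
```

Because the filter is installed inside catch_warnings(), the previous warning configuration is restored when the block exits.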

To learn more about suppressing warnings in Python, see:

Python Warning control API

How to Fix FutureWarnings

Alternately, you can change your code to address the reported change to the scikit-learn API.

Typically, the warning message itself will instruct you on the nature of the change and how to change your code to address the warning.

Nevertheless, let’s look at a few recent examples of FutureWarnings that you may encounter and be struggling with.

The examples in this section were developed with scikit-learn version 0.20.2. You can check your scikit-learn version by running the following code:

# check scikit-learn version
import sklearn
print('sklearn: %s' % sklearn.__version__)

You will see output like the following:

sklearn: 0.20.2

As new versions of scikit-learn are released over time, the nature of the warning messages reported will change and new defaults will be adopted.

As such, although the examples below are specific to a version of scikit-learn, the approach to diagnosing and addressing each API change is general, and the examples provide a good template for handling future changes.
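When it is not obvious which call triggers a warning, one diagnostic option is to temporarily turn FutureWarning into an exception so that the traceback points at the offending line. The sketch below assumes a hypothetical function legacy_call that stands in for your own code; the filter calls are from the standard warnings module.

```python
import warnings


def legacy_call():
    """Hypothetical stand-in for code that relies on a changing default."""
    warnings.warn("Default solver will be changed", FutureWarning)
    return "done"


with warnings.catch_warnings():
    # treat FutureWarning as an exception inside this block only
    warnings.simplefilter("error", category=FutureWarning)
    try:
        legacy_call()
        outcome = "no warning"
    except FutureWarning as exc:
        # the traceback of exc points at the call that needs updating
        outcome = "raised: %s" % exc

print(outcome)
```

The same effect is available for a whole script without editing it by running python -W error::FutureWarning followed by the script name.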

FutureWarning for LogisticRegression

The LogisticRegression algorithm has two recent changes to the default argument values that result in FutureWarning messages.

The first has to do with the solver for finding coefficients and the second has to do with how the model should be used to make multi-class classifications. Let’s look at each with code examples.

Changes to the Solver

The example below will generate a FutureWarning about the solver argument used by LogisticRegression.

# example of LogisticRegression that generates a FutureWarning
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
# prepare dataset
X, y = make_blobs(n_samples=100, centers=2, n_features=2)
# create and configure model
model = LogisticRegression()
# fit model
model.fit(X, y)

Running the example results in the following warning message:

FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.

This warning concerns the 'solver' argument, which used to default to 'liblinear' and will default to 'lbfgs' in a future version. To silence the warning, specify the 'solver' argument explicitly.

To maintain the old behavior, you can specify the argument as follows:

# create and configure model
model = LogisticRegression(solver='liblinear')

To support the new behavior (recommended), you can specify the argument as follows:

# create and configure model
model = LogisticRegression(solver='lbfgs')

Changes to the Multi-Class

The example below will generate a FutureWarning about the 'multi_class' argument used by LogisticRegression.

# example of LogisticRegression that generates a FutureWarning
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
# prepare dataset
X, y = make_blobs(n_samples=100, centers=3, n_features=2)
# create and configure model
model = LogisticRegression(solver='lbfgs')
# fit model
model.fit(X, y)

Running the example results in the following warning message:

FutureWarning: Default multi_class will be changed to 'auto' in 0.22. Specify the multi_class option to silence this warning.

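This warning only matters when logistic regression is used for multi-class classification, rather than for the binary classification problems for which the method was designed.

The default of the 'multi_class' argument is changing from 'ovr' (one-vs-rest) to 'auto'. According to the scikit-learn documentation, 'auto' selects 'ovr' when the data is binary or the solver is 'liblinear', and 'multinomial' otherwise.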

To maintain the old behavior, you can specify the argument as follows:

# create and configure model
model = LogisticRegression(solver='lbfgs', multi_class='ovr')

To support the new behavior (recommended), you can specify the argument as follows:

# create and configure model
model = LogisticRegression(solver='lbfgs', multi_class='auto')
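To make the meaning of the 'auto' setting concrete, here is a small sketch of the selection rule as the scikit-learn documentation describes it: one-vs-rest for binary data or the 'liblinear' solver, multinomial otherwise. The helper resolve_multi_class is hypothetical and written only for illustration; it is not the library's implementation.

```python
def resolve_multi_class(multi_class, solver, n_classes):
    """Hypothetical helper mirroring the documented meaning of multi_class='auto'."""
    if multi_class != "auto":
        # an explicit choice is used as given
        return multi_class
    if solver == "liblinear" or n_classes <= 2:
        # binary data, or a solver without multinomial support, uses one-vs-rest
        return "ovr"
    return "multinomial"


# three classes with lbfgs: 'auto' selects the multinomial formulation
print(resolve_multi_class("auto", "lbfgs", 3))
# three classes with liblinear: 'auto' falls back to one-vs-rest
print(resolve_multi_class("auto", "liblinear", 3))
# an explicit value is left unchanged
print(resolve_multi_class("ovr", "lbfgs", 3))
```

In practice this means that adopting 'auto' with the 'lbfgs' solver on a problem with three or more classes changes the model that is fit, not just the warning output.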

FutureWarning for SVM

The support vector machine implementation, specifically the SVC and SVR classes, has had a recent change to the 'gamma' argument that results in a warning message.

The example below will generate a FutureWarning about the 'gamma' argument used by SVC; the same applies to SVR.

# example of SVC that generates a FutureWarning
from sklearn.datasets import make_blobs
from sklearn.svm import SVC
# prepare dataset
X, y = make_blobs(n_samples=100, centers=2, n_features=2)
# create and configure model
model = SVC()
# fit model
model.fit(X, y)

Running this example will generate the following warning message:

FutureWarning: The default value of gamma will change from 'auto' to 'scale' in version 0.22 to account better for unscaled features. Set gamma explicitly to 'auto' or 'scale' to avoid this warning.

This warning message reports that the default for the 'gamma' argument is changing from the current value of 'auto' to a new default value of 'scale'.

The gamma argument only impacts SVM models that use the RBF, Polynomial, or Sigmoid kernel.

The parameter controls the value of the 'gamma' coefficient used in the algorithm; if you do not specify a value, a heuristic is used to choose one. The warning is about a change to that heuristic.

To maintain the old behavior, you can specify the argument as follows:

# create and configure model
model = SVC(gamma='auto')

To support the new behavior (recommended), you can specify the argument as follows:

# create and configure model
model = SVC(gamma='scale')
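The difference between the two heuristics can be sketched with plain arithmetic. The example below assumes the formulas given in the scikit-learn documentation, where 'auto' uses 1 / n_features and 'scale' uses 1 / (n_features * X.var()). The exact definition of 'scale' has differed between releases, so treat the numbers as illustrative and check the documentation for your installed version. The dataset is made up for the example.

```python
from statistics import pvariance

# tiny illustrative dataset: 4 samples, 2 features on very different scales
X = [
    [1.0, 100.0],
    [2.0, 200.0],
    [3.0, 300.0],
    [4.0, 400.0],
]

n_features = len(X[0])
# population variance over every value in X, as NumPy's X.var() would compute
values = [v for row in X for v in row]
x_var = pvariance(values)

# 'auto' depends only on the number of features
gamma_auto = 1.0 / n_features
# 'scale' (assumed formula) also shrinks as the spread of the data grows
gamma_scale = 1.0 / (n_features * x_var)

print("auto: ", gamma_auto)
print("scale:", gamma_scale)
```

With unscaled features such as these, the 'scale' value is far smaller than the 'auto' value, which is the situation the warning message refers to when it mentions unscaled features.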

FutureWarning for Decision Tree Ensemble Algorithms

The decision-tree based ensemble algorithms will change the default number of sub-models, or trees, used in the ensemble, which is controlled by the 'n_estimators' argument.

This affects the random forest and extra trees models for classification and regression, specifically the classes: RandomForestClassifier, RandomForestRegressor, ExtraTreesClassifier, ExtraTreesRegressor, and RandomTreesEmbedding.

The example below will generate a FutureWarning about the 'n_estimators' argument used by RandomForestClassifier; the same applies to RandomForestRegressor and the extra trees classes.

# example of RandomForestClassifier that generates a FutureWarning
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier
# prepare dataset
X, y = make_blobs(n_samples=100, centers=2, n_features=2)
# create and configure model
model = RandomForestClassifier()
# fit model
model.fit(X, y)

Running this example will generate the following warning message:

FutureWarning: The default value of n_estimators will change from 10 in version 0.20 to 100 in 0.22.

This warning message reports that the default number of submodels is increasing from 10 to 100, likely because computers are getting faster and 10 is very small; even 100 is small.

To maintain the old behavior, you can specify the argument as follows:

# create and configure model
model = RandomForestClassifier(n_estimators=10)

To support the new behavior (recommended), you can specify the argument as follows:

# create and configure model
model = RandomForestClassifier(n_estimators=100)

More Future Warnings?

Are you struggling with a FutureWarning that is not covered?

Let me know in the comments below and I will do my best to help.

FutureWarning Recommendations

Generally, I do not recommend ignoring or suppressing warning messages.

Ignoring warning messages means that the messages may obscure real errors or program output, and that future API changes may negatively impact your program unless you have considered them.

Suppressing warnings might be a quick fix for R&D work, but should not be used in a production system. Worse than simply ignoring the messages, suppressing the warnings may also suppress messages from other APIs.

Instead, I recommend that you fix the warning messages in your software.

How should you change your code?

In general, I recommend almost always adopting the new behavior of the API, e.g. the new default, unless you explicitly rely on the prior behavior of the function.

For long-lived operational or production code, it might be a good idea to explicitly specify all function arguments and not use defaults, as they might be subject to change in the future.
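One way to see which settings a model is actually using, and therefore which ones you may want to pin, is the get_params() method available on scikit-learn estimators. The sketch below is illustrative only; the choice of RandomForestClassifier and the values passed to it are assumptions made for the example, and the full list of parameters printed will depend on your installed version.

```python
from sklearn.ensemble import RandomForestClassifier

# spell out the arguments you depend on instead of inheriting defaults
model = RandomForestClassifier(n_estimators=100, random_state=1)

# get_params() reports the effective configuration, including any
# arguments that were left at their default values
params = model.get_params()
for name in sorted(params):
    print(name, "=", params[name])
```

Recording this output alongside the library version gives you a baseline to compare against after an upgrade.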

I also recommend that you keep your scikit-learn library up to date, and keep track of the changes to the API in each new release.

The easiest way to do this is to review the release notes for each release, available here:

scikit-learn Release History

Summary

In this tutorial, you discovered FutureWarning messages in the scikit-learn API and how to handle them in your own machine learning projects.

Specifically, you learned:

FutureWarning messages are designed to inform you about upcoming changes to default values for arguments in the scikit-learn API.
FutureWarning messages can be ignored or suppressed as they do not halt the execution of your program.
Examples of FutureWarning messages and how to interpret the message and change your code to address the upcoming change.
