Mining Interpretable Rules from Classification Models

August 16, 2020

As data scientists, we come across classification problems all the time. Ensemble learning techniques like bagging and boosting typically give us high classification performance, but the resulting models are complex and hard to interpret. To verify that a model is behaving sensibly, and to better understand the logic behind its predictions, it becomes necessary to extract interpretable rules from classification models.

This problem of extracting interpretable rules from ML models is not new and has been widely studied in machine learning, and quite a few solutions are available. In this article, we will look at one such solution that I have found very helpful.


Interpretable Rules from Classification Models using Skope-Rules


In this article, we will learn about a Python machine learning library called Skope-Rules that can extract interpretable rules from classification models. Skope-Rules is built on top of the scikit-learn library and aims at learning logical, interpretable rules from complex machine learning models such as a random forest classifier. The library is distributed under the 3-Clause BSD license.

The Skope-Rules module aims at learning decision rules for “scoping” a target class, i.e. detecting instances of the target class with high precision. Let’s understand how it works.

There are three main steps involved-

  1. Training a Bagging Estimator
  2. Filtering Important Rules
  3. Removing Duplicate Rules

Let’s understand more about each of these steps.


1. Training a Bagging Estimator

The first step involves training a bagging estimator, an ensemble of decision-tree-based classifiers. The number of estimators, maximum depth, and other related parameters can be customized and passed as arguments, just like in scikit-learn modules. Each node of these trees represents a decision rule, so we end up with quite a few candidate rules in this step.
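To make this concrete, here is a minimal sketch of the idea behind step 1: fit a handful of shallow decision trees on bootstrap samples (bagging by hand, rather than Skope-Rules' internal implementation) and print the rules of one tree. Every root-to-leaf path is one candidate rule. The parameter values are illustrative, not Skope-Rules' defaults.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
X, y = data.data, data.target == 0  # one-vs-rest target: "scope" class 0

rng = np.random.RandomState(0)
trees = []
for _ in range(5):  # a handful of bagged trees; Skope-Rules uses many more
    idx = rng.randint(0, len(X), len(X))  # bootstrap sample with replacement
    tree = DecisionTreeClassifier(max_depth=3).fit(X[idx], y[idx])
    trees.append(tree)

# Each root-to-leaf path of each tree is one candidate decision rule
print(export_text(trees[0], feature_names=list(data.feature_names)))
```

Keeping the trees shallow (small `max_depth`) is what keeps the extracted rules short and readable.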

You may be interested in reading my article on:

Bagging, Boosting and Stacking in Machine Learning


2. Filtering Important Rules

The second step involves filtering the important rules. After training the estimator, all decision rules are extracted and then evaluated against user-defined parameters, i.e., the minimum required precision and the minimum required recall. Rules that meet both thresholds are kept for further analysis, and the rest are rejected. This step selects the high-performing subset of all generated rules.
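The filtering logic itself is simple to sketch. The rule strings and scores below are made up for demonstration; the thresholds match the ones passed to the classifier later in this article.

```python
# Candidate rules as (rule, precision, recall) tuples -- illustrative values only
candidates = [
    ("petal_width <= 0.8", 0.98, 0.95),
    ("sepal_length > 5.4", 0.40, 0.70),
    ("sepal_width <= 2.9", 0.15, 0.05),  # too weak on both counts
]

precision_min, recall_min = 0.3, 0.1  # user-defined minimum thresholds

# Keep only rules clearing both the precision and the recall threshold
selected = [
    (rule, prec, rec)
    for rule, prec, rec in candidates
    if prec >= precision_min and rec >= recall_min
]
print(selected)  # the third rule is filtered out
```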


3. Removing Duplicate Rules

The final step removes duplicates from the selected set of important rules, which helps maintain the diversity of the resulting rule set. Similarity between rules is decided based on shared terms and the corresponding comparison operators (> or <=). After this step, we end up with a set of high-performing rules with significant variety, which Skope-Rules returns as its output.


Applying Skope-Rules in your Python project

Skope-Rules is very easy to use in Python since it is built on top of the well-known scikit-learn library. The syntax is very similar to that of general classifiers, and it also gives you the flexibility to set the important argument values. Let’s take a look at some sample code.

Install the latest version of the library using pip:

pip install skope-rules

Defining the Skope-Rules classifier:

from skrules import SkopeRules

# feature_names must be defined before it is passed to the classifier
feature_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']

clf = SkopeRules(max_depth_duplication=2,
                 n_estimators=30,
                 precision_min=0.3,
                 recall_min=0.1,
                 feature_names=feature_names)

Loading the Iris dataset and finding the rules for each target class:

(All output rules can be found in the clf.rules_ attribute.)

from sklearn.datasets import load_iris

dataset = load_iris()
feature_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']

for idx, species in enumerate(dataset.target_names):
    X, y = dataset.data, dataset.target
    clf.fit(X, y == idx)
    rules = clf.rules_[0:3]
    print("Rules for iris", species)
    for rule in rules:
        print(rule)
    print()
    print(20*'=')
    print()

The classifier can also be used as a predictor via the “score_top_rules” method:

from sklearn.datasets import load_boston
from sklearn.metrics import precision_recall_curve
from matplotlib import pyplot as plt
from skrules import SkopeRules

dataset = load_boston()  # note: load_boston was removed in scikit-learn 1.2
clf = SkopeRules(max_depth_duplication=None,
                 n_estimators=30,
                 precision_min=0.2,
                 recall_min=0.01,
                 feature_names=dataset.feature_names)

X, y = dataset.data, dataset.target > 25
X_train, y_train = X[:len(y)//2], y[:len(y)//2]
X_test, y_test = X[len(y)//2:], y[len(y)//2:]
clf.fit(X_train, y_train)
y_score = clf.score_top_rules(X_test) # Get a risk score for each test example
precision, recall, _ = precision_recall_curve(y_test, y_score)
plt.plot(recall, precision)
plt.xlabel('Recall')
plt.ylabel('Precision')
plt.title('Precision Recall curve')
plt.show()

Conclusion

This article sheds light on the need for model interpretability and then goes into the details of a Python library (Skope-Rules) designed to extract interpretable rules from classification models. The library is easy to use and very helpful for understanding complex models.

You might be interested in reading my article on:

How to deal with Imbalanced data in classification?

Thank you for reading the article; I hope it was helpful. Don’t forget to share your feedback in the comments below.


References and Further Reading

  1. https://github.com/scikit-learn-contrib/skope-rules
  2. https://skope-rules.readthedocs.io/en/latest/skope_rules.html
  3. http://2018.ds3-datascience-polytechnique.fr/wp-content/uploads/2018/06/DS3-309.pdf
  4. Doshi-Velez et al. Accountability of AI Under the Law: The Role of Explanation, 2017
  5. Friedman and Popescu. Predictive learning via rule ensembles, Technical Report, 2005
  6. Cohen and Singer. A simple, fast, and effective rule learner, National Conference on AI, 1999
  7. Weiss and Indurkhya. Lightweight rule induction, ICML, 2000
  8. Dembczynski, Kotlowski, and Slowinski. Maximum Likelihood Rule Ensembles, ICML, 2008