Optimal and Pure Leaf Classification Trees for Machine Learning (ML) Decision-Making

A method to improve the performance and accuracy of ML-based decision trees.

Decision trees are popular machine learning (ML) methods for classification and regression problems, with numerous real-world applications. Various industries use decision trees to help decide on strategies, investments, and operations. In healthcare, they help physicians make diagnoses based on laboratory results, symptoms, and other data.

The Need

Mixed-integer optimization (MIO) enables researchers to solve decision tree problems more effectively than traditional sub-optimal greedy approaches. However, these newer MIO formulations are still limited: they require many decision variables and can produce undesirable or invalid tree structures, leading to significant inefficiency and high computational resource requirements.

The Technology

This technology is a Modified Optimal Classification Tree (M-OCT) formulation with enhanced leaf-branch interaction constraints that guarantee valid tree structures when generating optimal trees. It incorporates binary encoding to reduce the total number of variables, along with binary splits and complexity constraints. Thus far, the inventors have demonstrated proof of concept on 16 standard data sets, where their M-OCT ran up to ten times faster than existing methods.
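To give a rough sense of why binary encoding can shrink a formulation, the sketch below compares the number of 0/1 variables needed to select one of n options under a one-hot encoding (one variable per option) and under a binary encoding (one variable per bit of the option's index). This is a generic illustration only. The summary does not say which quantities the M-OCT formulation encodes, and the function names here are hypothetical, not part of the invention.

```python
def one_hot_vars(num_choices: int) -> int:
    """0/1 variables needed when each option gets its own indicator."""
    return num_choices


def binary_vars(num_choices: int) -> int:
    """0/1 variables needed when the option index is written in base 2."""
    if num_choices < 1:
        raise ValueError("num_choices must be at least 1")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 2; use at least one bit.
    return max(1, (num_choices - 1).bit_length())


def encode(index: int, num_choices: int) -> list[int]:
    """Return the bit pattern (most significant bit first) for an option index."""
    if not 0 <= index < num_choices:
        raise ValueError("index out of range")
    width = binary_vars(num_choices)
    return [(index >> b) & 1 for b in reversed(range(width))]


def decode(bits: list[int]) -> int:
    """Recover the option index from its bit pattern."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


if __name__ == "__main__":
    # Assumed example sizes, chosen only for illustration.
    for n in (4, 16, 100, 1000):
        print(f"{n} options: one-hot={one_hot_vars(n)}, binary={binary_vars(n)}")
```

The saving grows with the number of options, since the binary count scales logarithmically. In an actual MIO model the trade-off also depends on the extra constraints needed to rule out unused bit patterns, which this sketch does not model.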

Commercial Applications

This invention can be deployed in any system that supports decision-making processes, such as medical diagnostic prediction and financial decision-making.

Benefits/Advantages

Compared to existing MIO methods, this invention is up to 10X faster while also improving model accuracy and generalization.
