When data scientists develop a pipeline, they often turn to automated machine learning to make the task easier. However, inconsistent syntax across tools and limited support for advanced features such as topology search or higher-order operators make development tedious. To address this, IBM Research, USA has published a research paper on LALE, a library of high-level Python interfaces that simplifies automated machine learning.
The research aims to overcome the following shortcomings of existing Auto-ML libraries:
- There is inconsistency in pipeline specification syntax across the manual and automated spectrum.
- Users need to learn a different syntax and rewrite their code when switching between Auto-ML tools.
- Previous tools do not optimize the topology of the pipeline.
- Combining different hyperparameters can produce invalid configurations.
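The last shortcoming can be made concrete with a small sketch. It uses one well-known scikit-learn constraint, namely that the "lbfgs" solver of LogisticRegression does not support the "l1" penalty, and checks it by hand in plain Python. The names SEARCH_SPACE, is_valid, and enumerate_configs are hypothetical and chosen for illustration only; this is not LALE's API or its way of expressing constraints.

```python
# Hypothetical illustration, not LALE's API: a naive search space built as a
# cross product contains a combination that scikit-learn would reject.
from itertools import product

SEARCH_SPACE = {
    "solver": ["lbfgs", "liblinear"],
    "penalty": ["l1", "l2"],
}


def is_valid(config: dict) -> bool:
    """Hand-written check for one known constraint: lbfgs does not support l1."""
    return not (config["solver"] == "lbfgs" and config["penalty"] == "l1")


def enumerate_configs(space: dict) -> list:
    """Build every combination of the listed hyperparameter values."""
    keys = list(space)
    return [dict(zip(keys, values)) for values in product(*(space[k] for k in keys))]


all_configs = enumerate_configs(SEARCH_SPACE)
valid_configs = [c for c in all_configs if is_valid(c)]
invalid_configs = [c for c in all_configs if not is_valid(c)]
```

A tool that samples each hyperparameter independently would eventually try the invalid combination and fail at training time, which is why a declarative description of such constraints is useful.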
Characteristics of LALE:
- LALE helps select algorithms and tune hyperparameters of pipelines, and is compatible with scikit-learn.
- LALE provides a highly consistent interface to existing tools such as Hyperopt, SMAC, and GridSearchCV for automation.
- LALE uses JSON schema for checking correctness.
- LALE has an expanding library of estimators and transformers for interoperability.
- LALE uses Python subclassing to implement lifecycle states.
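The idea of lifecycle states expressed through subclassing can be sketched in a few lines of plain Python. The sketch assumes three states called planned, trainable, and trained, where each later state subclasses the earlier one and adds the methods that only make sense at that stage. The class names PlannedOp, TrainableOp, and TrainedOp and the toy majority-label "training" are hypothetical and are not taken from LALE's source code.

```python
# Hypothetical illustration of lifecycle states via subclassing; these are
# not LALE's actual classes.


class PlannedOp:
    """An operator whose hyperparameters have not been fixed yet."""

    def __init__(self, name):
        self.name = name

    def configure(self, **hyperparams):
        # Fixing hyperparameters moves the operator to the trainable state.
        return TrainableOp(self.name, hyperparams)


class TrainableOp(PlannedOp):
    """An operator with fixed hyperparameters; it gains fit()."""

    def __init__(self, name, hyperparams):
        super().__init__(name)
        self.hyperparams = dict(hyperparams)

    def fit(self, X, y):
        # Toy "training": remember the most frequent label.
        labels = list(y)
        majority = max(set(labels), key=labels.count)
        return TrainedOp(self.name, self.hyperparams, majority)


class TrainedOp(TrainableOp):
    """An operator with learned state; it gains predict()."""

    def __init__(self, name, hyperparams, majority):
        super().__init__(name, hyperparams)
        self._majority = majority

    def predict(self, X):
        return [self._majority for _ in X]


planned = PlannedOp("majority_classifier")
trainable = planned.configure(strategy="most_frequent")
trained = trainable.fit([[0], [1], [2]], [0, 1, 1])
predictions = trained.predict([[5], [6]])
```

Because predict() exists only on the trained class, calling it on a planned or trainable operator fails immediately with an AttributeError instead of producing a confusing error later in the pipeline.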
Users can install LALE like any other Python package and work with it in off-the-shelf Python tools such as Jupyter notebooks.
Source: https://arxiv.org/pdf/2007.01977.pdf
GitHub: https://github.com/ibm/lale
This article has been published from the source link without modifications to the text. Only the headline has been changed.