AutoML: A method to automate machine learning processes?
By Aditya Abeysinghe
Automated Machine Learning (AutoML) is a set of processes that removes time-consuming work from building AI (Artificial Intelligence) models by automating several steps of the model-building process. AutoML tools can automate steps such as data preprocessing, feature selection, and choosing which types of algorithms to use. Apart from speeding up model building, AutoML allows users with limited knowledge or expertise in certain phases of the process to build models.
Where is AutoML used in building AI models?
There are several ways to build AI models. The approach depends on the type of model, its intended use, and the algorithms involved. Data for building models can come from several sources. For example, many models are developed using open-source or publicly available datasets. However, for some research, data must be gathered specifically to train and test AI models, for example where existing data cannot be released to the public or where no data has been gathered yet.
Data preprocessing is necessary for most data used to build models. During data preprocessing, several kinds of values are removed: values that are not suitable for building AI models, records with missing values, values that would introduce bias, and variables that are strongly related to other variables. Most AutoML processes still require some data preprocessing to be performed manually.
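As a rough illustration of two of the preprocessing steps described above, the sketch below drops records with missing values and then drops any variable that is strongly correlated with one already kept. It uses only the Python standard library. The function names, the 0.95 threshold, and the toy data are assumptions made for this example and do not come from any particular AutoML tool.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def drop_missing(rows):
    """Keep only the rows in which every value is present (not None)."""
    return [row for row in rows if all(v is not None for v in row.values())]


def drop_correlated(rows, threshold=0.95):
    """Drop a column if its |correlation| with an earlier kept column exceeds the threshold."""
    kept = []
    for name in rows[0]:
        column = [row[name] for row in rows]
        if all(abs(pearson(column, [r[k] for r in rows])) <= threshold for k in kept):
            kept.append(name)
    return [{k: row[k] for k in kept} for row in rows]


# Hypothetical toy data: height_in repeats height_cm in another unit,
# and one record has a missing value.
raw = [
    {"height_cm": 170.0, "height_in": 66.9, "age": 30},
    {"height_cm": 180.0, "height_in": 70.9, "age": 45},
    {"height_cm": None, "height_in": 65.0, "age": 41},
    {"height_cm": 160.0, "height_in": 63.0, "age": 52},
    {"height_cm": 175.0, "height_in": 68.9, "age": 19},
]

clean = drop_correlated(drop_missing(raw))
print(clean)
```

With this toy data, the record with the missing height is removed and the redundant height_in column is dropped, leaving four records with the columns height_cm and age.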
After preprocessing, feature selection picks the features that matter most for building an AI model. Feature selection reduces the dimensionality of the data, which can also reduce the variance of the resulting model, and keeps the features that are most strongly related to the output being predicted. Feature transformation may also be required at this stage, where all features are converted to one data type. If categorical variables are present in a dataset, they may be converted to numeric values. Feature transformations of this kind, and the removal of features that have low correlation with the output, can be automated using AutoML.
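The sketch below shows one simple way these two ideas could be automated: categorical values are converted to integer codes, and only the features whose correlation with the target reaches a minimum value are kept. It is a minimal standard-library example, not the method of any specific AutoML product. The function names, the 0.3 cut-off, and the housing-style data are all assumptions for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def encode_categories(values):
    """Map each distinct category to an integer code, in order of first appearance."""
    codes = {}
    return [codes.setdefault(v, len(codes)) for v in values]


def select_features(columns, target, min_abs_corr=0.3):
    """Encode categorical columns, then keep columns with |correlation to target| >= min_abs_corr."""
    selected = {}
    for name, values in columns.items():
        if isinstance(values[0], str):
            values = encode_categories(values)
        if abs(pearson(values, target)) >= min_abs_corr:
            selected[name] = values
    return selected


# Hypothetical toy data: door_number carries almost no information about price.
columns = {
    "size_m2": [50, 80, 120, 65, 150],
    "district": ["north", "south", "south", "north", "south"],
    "door_number": [4, 9, 5, 6, 5],
}
price = [100, 170, 260, 140, 310]

selected = select_features(columns, price)
print(selected)
```

Here size_m2 and the encoded district column are kept, while door_number is discarded. Integer codes imply an ordering between categories, so one-hot encoding is often preferred when a variable has more than two categories.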
The next stage is model selection and training. Different algorithms can be used to train AI models, and the choice depends on the learning methodology. For example, regression and classification algorithms are used for supervised learning. AutoML can identify which algorithms to use for training, which parameters to optimize during training, and how to fine-tune the parameters or the inputs and outputs of multi-layer models to improve accuracy. Because the repeated training cycles needed to improve accuracy and reduce output error are time-consuming and laborious, AutoML allows models to be built with less human input and effort.
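The core loop behind automated model selection can be sketched as follows: evaluate several candidate models and parameter settings with cross-validation, then keep the one with the best score. This minimal standard-library example compares a majority-class baseline with a k-nearest-neighbours classifier at several values of k, using leave-one-out cross-validation. The candidate set, the function names, and the data are assumptions for illustration. Real AutoML tools search much larger spaces with more elaborate strategies.

```python
from collections import Counter


def knn_predict(train_x, train_y, x, k):
    """Classify x by majority vote among its k nearest training points (one numeric feature)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def majority_predict(train_x, train_y, x):
    """Baseline: always predict the most common training label."""
    return Counter(train_y).most_common(1)[0][0]


def loo_accuracy(predict, xs, ys):
    """Leave-one-out cross-validation accuracy of a predict(train_x, train_y, x) function."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += int(predict(train_x, train_y, xs[i]) == ys[i])
    return hits / len(xs)


def search(xs, ys, k_values=(1, 3, 5)):
    """Score every candidate and return the name of the best one plus all scores."""
    candidates = {"majority": majority_predict}
    for k in k_values:
        # Bind k as a default argument so each candidate keeps its own value.
        candidates[f"knn(k={k})"] = lambda tx, ty, x, k=k: knn_predict(tx, ty, x, k)
    scores = {name: loo_accuracy(fn, xs, ys) for name, fn in candidates.items()}
    best = max(scores, key=scores.get)  # on a tie, the earlier candidate wins
    return best, scores


# Hypothetical toy data: two groups, with one noisy label at x = 2.0.
xs = [1.0, 1.6, 2.0, 2.3, 3.1, 6.0, 6.4, 7.1, 7.5, 8.2]
ys = ["a", "a", "b", "a", "a", "b", "b", "b", "b", "b"]

best, scores = search(xs, ys)
print(best, scores)
```

On this data the search selects knn(k=3): a single neighbour is misled by the noisy label, while the baseline ignores the feature entirely. Returning the full score table alongside the winner lets a user see why a particular candidate was chosen.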
Challenges of using AutoML
One of the challenges of using AutoML is the lack of transparency in its inner processes. The processes AutoML uses to select features, select models, and tune parameters are often not visible to users, who see only the finished AI model after submitting their data for processing. As a result, users have less trust in the accuracy of a model and less insight into why certain algorithms were chosen to build it. This leads many users of AI models to avoid AutoML for large-scale AI applications and for applications that involve critical tasks.
Another challenge is that AutoML is rarely used for AI applications that involve complex data or complex learning techniques. For example, AutoML is often not suitable for deep learning or reinforcement learning because of the many learning cycles and layers involved. AutoML is also often not used for complex image or voice processing because of the complexity of the algorithms required.
Image Courtesy: https://datachannel.co/