Feature engineering is a machine learning technique for creating features from the raw data. It refers to transforming the training data and augmenting it with additional, more useful features to make ML models more effective.
For example, take a dataset with two variables, call time (call_time) and calling rate (rate). It is often useful to create a new feature by multiplying these two columns, which we can name total_callingPrice.
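import pandas as pd

df = pd.read_csv("dataset.csv")
X = df.drop(columns=["churn"])  # keep the target out of the feature matrix
y = df["churn"]                 # target variable
# New feature: total price of the calls = call duration * rate
X["total_callingPrice"] = X["call_time"] * X["rate"]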
Let’s look at another example, shown below, where we add a feature, Cost_per_squareFt, derived from two given variables, Area and Amount, to improve the model’s accuracy…
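As a rough sketch of that idea, the snippet below derives Cost_per_squareFt by dividing Amount by Area. The DataFrame name `houses` and the three rows of values are made up purely for illustration; only the column names come from the example above.

```python
import pandas as pd

# Hypothetical listings; the numbers are invented for illustration only
houses = pd.DataFrame({
    "Area": [1000, 1500, 2000],        # size in square feet
    "Amount": [50000, 90000, 110000],  # price of the property
})

# New feature: price paid per square foot
houses["Cost_per_squareFt"] = houses["Amount"] / houses["Area"]

print(houses)
```

A ratio like this puts large and small properties on a comparable scale, which is something a model may struggle to work out from the two raw columns alone.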
Feature engineering involves four processes: feature creation, transformation, feature extraction, and exploratory data analysis (EDA).
Feature engineering is essential because, in the transformation phase, we can reduce the number of independent variables in the dataset, which helps keep a model trained on it from overfitting.
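One common way to reduce the number of independent variables is principal component analysis (PCA). The article does not name a specific method, so treat the following as one possible sketch rather than the prescribed approach. The data is synthetic, and the choice of keeping `k = 2` components is an assumption that suits this toy data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 100 rows and 6 columns driven by only 2 underlying factors
latent = rng.normal(size=(100, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.01 * rng.normal(size=(100, 6))

# Centre each column, then decompose with SVD (the core computation in PCA)
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Share of the total variance captured by each principal component
explained = S**2 / np.sum(S**2)

k = 2  # number of components to keep (assumed; chosen for this toy data)
X_reduced = X_centered @ Vt[:k].T  # project onto the top-k components

print(X.shape, "->", X_reduced.shape)
print("share of variance kept:", explained[:k].sum())
```

Six correlated columns collapse into two with very little information lost, so the model has fewer inputs to fit noise to.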
Representation learning, also called feature learning, is a set of techniques that enable a model to automatically discover, from raw data, the representations needed for feature detection.
As a result, with representation learning applied, we don’t need to create new features for the model manually.
That’s why deep learning models rely on it.
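To make the idea concrete, here is a minimal sketch of a linear autoencoder trained with plain gradient descent in NumPy. It is a toy stand-in for the deep, nonlinear networks used in practice: the data is synthetic, and the code size `k`, the learning rate, and the number of steps are all assumptions chosen for this example. The point is only that the features in `codes` are learned from the raw columns and not written by hand.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical raw data: 200 samples, 5 columns generated from 2 hidden factors
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 5)) + 0.05 * rng.normal(size=(200, 5))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardise each column

n, d = X.shape
k = 2  # size of the learned representation (assumed for this toy data)

# Encoder and decoder weights start as small random numbers
W_enc = 0.1 * rng.normal(size=(d, k))
W_dec = 0.1 * rng.normal(size=(k, d))


def reconstruction_loss(W_enc, W_dec):
    """Mean squared error between X and its reconstruction."""
    return np.mean((X @ W_enc @ W_dec - X) ** 2)


initial_loss = reconstruction_loss(W_enc, W_dec)

learning_rate = 0.05
for step in range(2000):
    codes = X @ W_enc             # learned features, shape (n, k)
    residual = codes @ W_dec - X  # reconstruction error, shape (n, d)
    # Gradients of the mean squared error with respect to each weight matrix
    grad_dec = 2.0 * codes.T @ residual / (n * d)
    grad_enc = 2.0 * X.T @ (residual @ W_dec.T) / (n * d)
    W_dec -= learning_rate * grad_dec
    W_enc -= learning_rate * grad_enc

final_loss = reconstruction_loss(W_enc, W_dec)
codes = X @ W_enc  # the representation the model discovered on its own

print("loss before training:", initial_loss)
print("loss after training: ", final_loss)
print("learned feature matrix shape:", codes.shape)
```

Nothing in the loop says which combinations of columns are useful. The encoder weights are adjusted until the two-column `codes` matrix is enough to rebuild the five raw columns, and that compressed matrix is the learned representation.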
Feature engineering and feature extraction are key but time-consuming parts of the ML workflow.
Have a look at the pie chart below: it shows that data scientists spend most of their time organizing the data they are given.
That time can be saved using representation learning.
The following is an excerpt from Deep Learning and Feature Engineering:
The feature engineering approach was the dominant approach till recently when DL techniques started demonstrating recognition performance better than the carefully crafted feature detectors.
With deep learning, one can start with raw data, since the features will be automatically created by the neural network as it learns.
Hence, in his book Deep Learning with Python, François Chollet wrote:
Deep Learning removes the need for feature engineering.