
[MLOps Basics]: Feature Store, an overview

By admin, Oct 8, 2021

Features are crucial in helping machine learning models process and understand data, both in training and in production. Yet feature extraction and storage are among the most important but often overlooked aspects of machine learning solutions. When building a single machine learning model, feature extraction may seem straightforward, but it quickly becomes complicated as teams scale.

In a large organization with dozens of data science teams building machine learning models, each team would otherwise have to process its own datasets and extract the corresponding features, which is computationally expensive and nearly impossible to scale. A Feature Store avoids this duplicated work and reduces computation time.

Another key challenge for high-performing machine learning teams is building mechanisms for reusable features. In this context, the Feature Store is becoming a standard component of modern machine learning solutions. A Feature Store serves as a repository of features that can be used for training and evaluating machine learning models. In this article, we discuss Feast, an open-source feature store for machine learning. It abstracts many of the fundamental building blocks of feature extraction, transformation and discovery that are common to machine learning applications.
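
To make the idea concrete, here is a minimal sketch of how features are typically declared in a Feast feature repository. The entity, feature view and file path below are illustrative only, and the exact class and argument names (for example Field vs. Feature, or timestamp_field vs. event_timestamp_column) vary between Feast versions.

```python
# feature_repo/driver_features.py
# Minimal, illustrative Feast feature definition (names are hypothetical;
# exact API details depend on the Feast version).
from datetime import timedelta

from feast import Entity, FeatureView, Field, FileSource
from feast.types import Float32, Int64

# The entity is the business object the features describe.
driver = Entity(name="driver", join_keys=["driver_id"])

# The source points at where the raw feature values live.
driver_stats_source = FileSource(
    path="data/driver_stats.parquet",   # hypothetical path
    timestamp_field="event_timestamp",
)

# A feature view groups related features and ties them to the entity.
driver_stats_fv = FeatureView(
    name="driver_stats",
    entities=[driver],
    ttl=timedelta(days=1),
    schema=[
        Field(name="conv_rate", dtype=Float32),
        Field(name="avg_daily_trips", dtype=Int64),
    ],
    source=driver_stats_source,
)
```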

Importance of Feast as a Feature Store

Models need consistent access to data

Machine Learning (ML) systems built on traditional data infrastructure are often coupled to databases, object stores, streams, and files. A consequence of this coupling is that any change in the data infrastructure may break dependent ML systems. Feast decouples your models from your data infrastructure by providing a single data access layer that abstracts feature storage from feature retrieval. It also provides a consistent means of referencing feature data, which ensures that models remain portable when moving from training to serving.
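
Roughly, this is what that data access layer looks like from the model's side: the model only deals in feature references, not storage details. The sketch assumes the hypothetical driver_stats feature view from above and a configured Feast repository.

```python
from feast import FeatureStore

# The model only knows feature references ("feature_view:feature"),
# not where or how the underlying values are stored.
store = FeatureStore(repo_path=".")

features = store.get_online_features(
    features=[
        "driver_stats:conv_rate",
        "driver_stats:avg_daily_trips",
    ],
    entity_rows=[{"driver_id": 1001}],   # hypothetical entity key
).to_dict()

print(features)
```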

Deploying new features into production is difficult

Members of ML teams may have different objectives. Data scientists, for example, aim to deploy features into production as soon as possible, while engineers want to ensure that production systems remain stable. These differing objectives can create organizational friction that slows time-to-market for new features. Feast addresses this friction by providing both a centralized registry to which data scientists can publish features and a battle-hardened serving layer. Together, they enable non-engineering teams to ship features into production with minimal oversight.
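
Publishing to the registry is typically done by applying the feature definitions, either with the feast apply CLI command or programmatically. A rough sketch, reusing the hypothetical objects from the earlier definition:

```python
from feast import FeatureStore

# Registers (or updates) the entity and feature view in Feast's central
# registry and prepares the configured online/offline stores.
# Roughly equivalent to running `feast apply` in the feature repository.
store = FeatureStore(repo_path=".")
store.apply([driver, driver_stats_fv])   # objects from the earlier sketch
```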

Models need point-in-time correct data

ML models in production require a view of the data that is consistent with the one they were trained on; otherwise their accuracy can be compromised. Despite this need, many data science projects suffer from inconsistencies introduced when future feature values are leaked to models during training. Feast solves this data-leakage problem by providing point-in-time correct feature retrieval when exporting feature datasets for model training.
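
In practice, point-in-time correctness comes from joining features against the timestamp of each training example, so that only values known at that moment are returned. A minimal sketch, again assuming the hypothetical driver_stats feature view:

```python
from datetime import datetime

import pandas as pd
from feast import FeatureStore

store = FeatureStore(repo_path=".")

# Each row is a training example: an entity key plus the time of the label.
entity_df = pd.DataFrame(
    {
        "driver_id": [1001, 1002],
        "event_timestamp": [
            datetime(2021, 9, 1, 10, 0),
            datetime(2021, 9, 2, 12, 30),
        ],
    }
)

# Feast joins each row only with feature values known at that timestamp,
# preventing future values from leaking into the training set.
training_df = store.get_historical_features(
    entity_df=entity_df,
    features=["driver_stats:conv_rate", "driver_stats:avg_daily_trips"],
).to_df()
```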

Features aren’t reused across projects

Different teams within an organization are often unable to reuse features across projects. Feast addresses this problem by introducing feature reuse through a centralized registry. This registry enables multiple teams working on different projects not only to contribute features, but also to reuse these same features.
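
One way such reuse shows up in Feast is a feature service, which bundles features from existing feature views under a single name that any team or model can retrieve. The names below are hypothetical and continue the earlier sketch.

```python
from feast import FeatureService

# Bundles existing features under one name so other projects can reuse
# them without knowing the underlying feature views.
driver_ranking_service = FeatureService(
    name="driver_ranking_v1",
    features=[driver_stats_fv],   # feature view from the earlier sketch
)
```

In recent Feast versions, a consumer can then pass the service itself to get_online_features or get_historical_features instead of listing individual feature references.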

Advantages of Feast Feature Store

Bridges the gap between teams

Feast enables data scientists to track and share features, including through a version-controlled repository of feature definitions. It bridges the gap between data scientists and data & ML engineers.

Model Training-Serving Consistency

Feast ensures feature consistency between model training and serving. This addresses the common mismatch between the development and production versions of machine learning models.
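
The sketch below shows the idea under the same assumptions as the earlier examples: a single list of feature references drives both training-time and serving-time retrieval, so the two code paths cannot drift apart.

```python
# One shared definition of the model's inputs...
MODEL_FEATURES = [
    "driver_stats:conv_rate",
    "driver_stats:avg_daily_trips",
]

# ...used to build the training set (offline, point-in-time correct)
training_df = store.get_historical_features(
    entity_df=entity_df,
    features=MODEL_FEATURES,
).to_df()

# ...and to fetch the same features at prediction time (online)
online_features = store.get_online_features(
    features=MODEL_FEATURES,
    entity_rows=[{"driver_id": 1001}],
).to_dict()
```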

Feature Discovery

Feast enables the exploration and discovery of features. This allows for a deeper understanding of features and their specifications, more feature reuse between teams and projects, and faster experimentation.
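
As a rough illustration, the registry can be queried programmatically (the Feast CLI offers similar listing commands), so teams can see which features already exist before building new ones:

```python
from feast import FeatureStore

store = FeatureStore(repo_path=".")

# Browse what is already registered before creating new features.
for fv in store.list_feature_views():
    print(fv.name, [f.name for f in fv.features])
```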

Implementation-agnostic

The output from the Feature Store is implementation-agnostic. No matter which algorithm or framework we use, the application or model receives data in a consistent format.

Time Saving

A Feature Store saves time that would otherwise be spent recomputing features, leaving more time for building new models.

The figure below shows how a Feature Store and its components help reduce computation time.

That was an overview of Feast as a Feature Store.
In our next article, we will look at how to set up and integrate Feast with MLflow.

Stay tuned.

Check out the earlier articles in this series to learn how to install MLflow and implement MLOps using MLflow.

Author

Data engineering team
GainInsights

info@gain-insights.com

Explore No-code, automated machine learning for analytics teams here.
