Structured Data

The Structured Data chapter focuses on tabular datasets & time series.

Our team of experts researches machine learning techniques for the kind of data you usually find in Excel sheets, CSV files or relational databases. We translate the newest insights from the literature into practical guidelines, and provide ML6 with the knowledge and tools to solve a variety of problems:

Domains

Regression & Forecasting

Predicting the future is hard. We study tools that allow us to make predictions such as next week’s energy consumption or sales volumes for August of next year. We do this by building on our set of external data sources and by benchmarking and applying the latest model improvements.

Classification & Clustering

Labeling data records adds tremendous value: it helps make sense of large quantities of data and automate actions. Coming up with types of labels (clustering) and assigning known labels to new records (classification) are both used in many different environments. Applications include understanding types of machine failures, identifying reasons for disengagement in a sales process, and clustering users on an e-commerce platform.

Anomaly Detection

Machines break down, and systems and processes get abused. Once in a while, events happen that are not supposed to happen. Detecting outliers is a real challenge, and we love figuring out how to reliably separate “abnormal” events from “normal” behaviour. Explainability and causality are key here: why are certain items anomalous? These insights are critical for improving your systems.

Operational Research & Optimization

A process may work well and still leave room for improvement. In this context we research how to solve difficult optimization problems, e.g. production planning, job scheduling, vehicle routing and box packing.

Demos

Multivariate Time Series Similarity

This demo is the result of research into multivariate time series similarity. The goal was to develop a fast and scalable system that finds, without supervision, the 5 datapoints most similar to a given reference.

View full demo

Quick tips

5 Tips to start working with imbalanced datasets

Read more

Tabular Data: Serving decision forests with TensorFlow Serving

Read more

Time series labeling tools

Read more
Discover more tips

Case studies

Internships

Interactive optimization solver
Financial forecasting competition
Research similarity metrics and contribute to TF
Active learning cockpit for managers with (unsupervised) anomaly detection
Let's get started
Get in touch with us
Contact us