A Data Science Perspective on Time Series and Requirements for Machine Learning

Discussion created by con-bnovic Employee on May 7, 2018
Latest reply on May 8, 2018 by Paurav Joshi

In the classroom, machine learning can appear quite straightforward. You first identify model inputs and outputs. Machine learning software expects to receive a matrix of data where rows are cases (batches, parts, events, …) and columns are predictors (inputs) or outputs. The matrix is read into the software, predictors and outputs are identified, algorithm settings are chosen and, voilà, a model is created. Easy peasy, right? Well, not so fast! What happens if one or more of your inputs is not a static measurement but a time series of measurements for every row/case? In baseball that’s called a ‘curve ball’, and handling it can be tricky business! I presented a talk on this topic at a Lunch and Learn in San Leandro earlier this year, where I described the role of feature definition in handling situations like this. If you didn’t get a chance to attend, here is the link!
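
To make the idea of feature definition a bit more concrete, here is a minimal sketch in Python (NumPy only) of one common way to do it: collapse each case's time series into a handful of static summary numbers so that every case becomes a single row again. This is not taken from the talk. The batch names, the temperature traces, the helper `series_features`, and the particular choice of features (mean, standard deviation, min, max, linear slope) are all assumptions made for illustration.

```python
import numpy as np

# Illustrative feature set; a real project would choose features
# based on process knowledge.
FEATURE_NAMES = ["mean", "std", "min", "max", "slope"]

def series_features(values):
    """Collapse one case's time series into a few static summary features."""
    values = np.asarray(values, dtype=float)
    t = np.arange(len(values))
    # Slope of a least-squares straight line through the series
    # (assumes evenly spaced samples; 0.0 for a single-point series).
    slope = np.polyfit(t, values, 1)[0] if len(values) > 1 else 0.0
    return [values.mean(), values.std(), values.min(), values.max(), slope]

# Hypothetical data: three batches, each with its own temperature trace.
# Note that the traces do not need to have the same length.
batches = {
    "batch_A": [20.0, 21.0, 22.0, 23.0],
    "batch_B": [30.0, 30.0, 30.0],
    "batch_C": [25.0, 24.0, 23.0, 22.0, 21.0],
}

# One row per case, one column per feature: the matrix shape that
# machine learning software expects.
X = np.array([series_features(trace) for trace in batches.values()])
print(X.shape)  # (3, 5)
```

Each row of `X` can now sit alongside any static predictors for the same batch. Which summary features are worth computing depends on the process being modeled, so treat this list as a starting point rather than a recommendation.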