Adding Machine Learning and Predictive Analytics to Your ERP Using Bezlio
Updated June 30, 2017
I’ve always been passionate about bringing new technology to companies to help them be more competitive. One of the most exciting emerging technologies in the IT/IS crowd right now is machine learning, which allows for smarter data analysis. Companies spend a significant amount of money generating data, but comparatively little analyzing it. Because of the technology Bezlio is built on, we let companies take emerging technology as it becomes available and leverage it regardless of the backend system.
Overview
With any hyped-up topic there is always some misunderstanding (I’m still explaining the difference between cloud applications and cloud-hosted applications), so I’d like to clear up what machine learning is and how it can be used in manufacturing and other business functions. I will use real-world examples from Epicor ERP and Infor VISUAL ERP, along with Bezlio, to apply machine learning and predictive analytics. Because of Bezlio’s plugin architecture, the source data can come from anything, and Bezlio includes a plugin that utilizes the Accord.NET Framework, a very popular machine learning library.
Why Statistics?
To get to the golden promised land of machine learning, we first need to walk the dark path of statistics. Seriously though, most machine learning concepts were derived from established statistical techniques, and even simple statistics applied to ERP data can yield better insights. I have always liked statistics because it takes the gut out of gut decisions based on incomplete data.
Our Source Data
In this next part, I will use an Excel spreadsheet as the source data, along with the Bezlio custom designer to demonstrate the output. The source data is labor ticket information showing how many hours were worked on an operation and the quantity completed. A time study done when the method was created states that users should be able to produce 283 pieces per hour. That standard is crucial because our schedule uses it to determine how much time is needed. Realistically, are we achieving that goal?
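Each row coming back from the spreadsheet needs to carry (or let us derive) a units-per-hour figure for that labor ticket. Here is a minimal sketch of the kind of row the code later in this post expects; only the QuantityPerHours field name comes from the actual bezl code, while the other field names and the values are purely illustrative assumptions:

// Illustrative only: the rough shape of one labor ticket row.
// QuantityPerHours is the field the onDataChange code reads later on;
// the other field names and all of the values here are assumptions.
var exampleRow = {
  Operation: 'OP-10',        // hypothetical operation identifier
  HoursWorked: 4.0,          // hypothetical hours reported on the ticket
  QuantityComplete: 1100,    // hypothetical quantity completed
  QuantityPerHours: 275      // quantity completed divided by hours worked
};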
Building the Machine Learning Bezl
Our first step on our bezl is to add a data subscription. This uses the ExcelPlugin to read data from an Excel file, but in real life you would most likely be pulling this from a database.
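If you wanted to set up the same subscription from code instead of through the designer, it would follow the same dataService.add pattern used later in this post. Treat this as a sketch only: the ExcelPlugin method name (GetWorksheet), the file path, and the worksheet name are assumptions, so substitute whatever your plugin and environment actually expose.

// Hypothetical sketch of subscribing to rows from an Excel file.
// 'datasub0' is the subscription name the onDataChange code checks for,
// and 'brdb' is the same remote connection used for the stats call below.
// The method name, FileName, and Worksheet values are assumptions.
bezl.dataService.add('datasub0', 'brdb', 'ExcelPlugin', 'GetWorksheet',
  { "FileName": "C:\\Data\\LaborTickets.xlsx", "Worksheet": "Sheet1" }, 0);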
With that complete, I’m going to add a variable called var0 to save the data we get back from the data subscription, and edit the onDataChange function so that, when data is returned, we automatically ship it off to our machine learning library for a statistical analysis. You could do this in Excel, but again, the goal is to get here automatically with ERP data.
Here is my onDataChange function:
bezl.data = dataResp;
if (bezl.data.datasub0) {
  // Save the returned rows and clear the subscription so it only fires once
  bezl.vars['var0'] = bezl.data.datasub0;
  bezl.dataService.remove('datasub0');

  // Collect the units-per-hour value from every labor ticket row
  var input = [];
  bezl.vars['var0'].forEach(v => { input.push(v.QuantityPerHours); });

  // Pass the values to the Accord.NET plugin for a normal distribution analysis
  bezl.dataService.add('stats','brdb','AccordFramework','NormalDistribution',
    { "Input": JSON.stringify(input) }, 0);
}
In my onDataChange I am checking for datasub0 to return. When it does, I’m stuffing the results into var0, extracting the QuantityPerHours value (the total quantity each job produced divided by the number of hours it took), and passing that into the Accord.NET Framework as a normal distribution. At this point there is no machine learning, because a normal distribution is only statistics, but it sets the foundation for what we will do in future lessons. Note that the new data subscription we added is called stats.
The markup is going to be really generic: I just dragged the raw data view from the markup snippets onto the canvas twice and made the top one display bezl.data.stats and the bottom one bezl.vars['var0'].
<pre>{{bezl.data.stats | json }}</pre>
<pre>{{bezl.vars['var0'] | json }}</pre>
You’d probably put a search box on there and dump the results into a table, but this is just a quick way to get started. The results below show where we are at this point:
Making Sense of the Data
Now, when most people use any kind of statistics, they are using the mean (average). You take the sum of all the numbers, divide by the total number of records, and what you’re left with should be right there in the middle. The mean is the reciprocating saw of statistics: useful in a surprising number of scenarios, but lacking in finesse. If the numbers are widely dispersed, the mean can be meaningless (rimshot). That is where standard deviation steps in to help you out.
(This is as good a time as any to explain that I know enough about statistics to be dangerous, so professionals, please don’t flame me too badly.)
Standard deviation is a tool that lets us go from "it takes x time to produce y items" to the more accurate "it takes us between a and b time to produce c items." It shows you the variation in the numbers and how spread out they are. In our example, the standard deviation of the numbers is 51.

Let’s say that we wanted to be 80% confident that we are meeting or exceeding our production estimates (in that we’re producing at least as many units as we had estimated). There are standard tables, such as z-tables and t-tables, that tell us how many standard deviations above and below the mean correspond to a given confidence level; the exact values depend on the size of the sample. For our sample of around 80 records, at 80% confidence the t-value is 1.29. Our mean was 227 (already lower than the estimate of 283 units per hour), so to say we are 80% confident about how many units we can produce in a given amount of time, we multiply our standard deviation by 1.29 to get +/- 65.79, or a range of 161.21 to 292.79 units per hour.
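To make the arithmetic concrete, here is a minimal sketch of the same calculation in plain JavaScript. The input array is the same list of QuantityPerHours values we built in onDataChange, and the 1.29 t-value comes from a t-table for roughly 80 records at 80% confidence, exactly as described above:

// Minimal sketch of the confidence band calculation described above.
// 'input' is the array of QuantityPerHours values built in onDataChange.
function confidenceBand(input, tValue) {
  var mean = input.reduce(function (a, b) { return a + b; }, 0) / input.length;

  // Sample standard deviation (divide by n - 1)
  var sumSq = input.reduce(function (a, b) { return a + Math.pow(b - mean, 2); }, 0);
  var stdDev = Math.sqrt(sumSq / (input.length - 1));

  var margin = tValue * stdDev; // e.g. 1.29 * 51 = 65.79 for our sample
  return { mean: mean, stdDev: stdDev, low: mean - margin, high: mean + margin };
}

// With our sample (mean 227, standard deviation 51) this comes out to
// roughly { low: 161.21, high: 292.79 } at the 80% confidence level.
var band = confidenceBand(input, 1.29);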
By applying simple statistics to our production data, we can see that we might have a problem here. The goal should not be to run this on a case-by-case basis, but to run it against all operations and use exception management to show us where we need to focus our attention, because standards that are off could mean our schedule is off and we could even be selling products at a loss.
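As a rough illustration of that exception-management idea, a bezl could compute the confidence band for each operation and only surface the ones where we cannot be confident of hitting the standard. This is only a sketch under assumptions: the operations array, the Standard and Rates fields, and the grouping by operation are all hypothetical, and it reuses the confidenceBand function sketched above.

// Hypothetical sketch: flag operations whose estimated standard is not
// supported by the actual production data at the chosen confidence level.
// 'operations' is assumed to be an array of { OperationID, Standard, Rates },
// where Rates is the list of observed units-per-hour values for that operation.
var exceptions = operations.filter(function (op) {
  var band = confidenceBand(op.Rates, 1.29);
  // Flag the operation if the rate we can count on at ~80% confidence
  // (the low end of the band) is below the estimated standard.
  return band.low < op.Standard;
});

// 'exceptions' now holds only the operations worth a scheduler's attention.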
The problem is that if I told a manager we could produce between 161 and 292 units per hour, they would look at me like I’m absolutely crazy. In our sample data, the number of units produced per hour ranges from 122 all the way up to 389. A scatterplot of the hours worked and units produced visualizes the problem:
So, in future entries we will stick with this same sample but use different machine learning and predictive analytics techniques to get us some better answers.
Predictive Analytics and Machine Learning for Any ERP
During the 2017 Bezlio Mainspring Developer Conference, Brian Ellis spoke about predictive analytics and machine learning as they relate to using Bezlio with any ERP or CRM.