A machine learning configuration is a set of features that control the behavior of the forecasting engine during a volume run. You must define a machine learning configuration before you specify Machine Learning as the method for a category property set. You can then apply the category property set to any category or hierarchy of categories for use when running a volume forecast.
Machine Learning is a type of predictive analysis that creates a computer program, or model, by uncovering patterns in data. For example, if you want to predict an estimated selling price of your house, Machine Learning would look at the price of sold houses and their characteristics like location, number of rooms, living area, land area, and so on. From this data, Machine Learning would build a model by finding historical data patterns between the selling price of a house and its characteristics. It would use that model to predict the selling price of a given house.
Machine Learning uses this approach to predict volumes for different drivers, such as sales and items sold, based on characteristics like the day-of-week average.
There are three computational steps in the machine learning method:
- Feature calculation collects the values of predictive characteristics such as a category’s recent day-of-week average for the purposes of the next step, training. Feature calculation runs when you first import data that includes a category with a machine learning category property set defined. Afterward, feature calculation is automatically updated whenever volume data is imported to that category.
- A feature is simply a relevant element of the input data (for instance day of week), or a derivative of one or more of these inputs (for instance average volume on a given day of week), or even of other features (for instance a ratio of two averages).
- The Machine Learning forecast considers many features, including some that can be configured by the user. The training process will determine which features are the best predictors of volume.
- Training compares the feature calculation to the existing volume data to devise the best model for mapping business conditions to volume predictions.
- In the training process, a machine learning algorithm builds a (possibly complex) function that accurately maps feature values to recorded volumes.
Note: The training step demands enormous computational resources and must be coordinated with product support.
- Run Volume applies the generated model to the selected business unit and timeframe to predict business volume.
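As a rough illustration of the feature calculation step, the following Python sketch computes a recent day-of-week average for one category. It is not the forecasting engine's implementation: the function name, the date-to-volume dictionary, and the sample data are assumptions made for this example only.

```python
from datetime import date, timedelta
from statistics import mean


def recent_day_of_week_average(history, target_day, weeks=4):
    """Average the volumes recorded on the same weekday as target_day
    over the preceding `weeks` weeks.

    history is a hypothetical dict mapping date -> volume.
    Returns None when no matching days are available.
    """
    samples = []
    for k in range(1, weeks + 1):
        day = target_day - timedelta(weeks=k)
        if day in history:
            samples.append(history[day])
    return mean(samples) if samples else None


# Illustrative data: eight weeks of history starting on a Monday,
# with 100 units on most days and 150 units on Saturdays.
start = date(2024, 1, 1)
history = {}
for offset in range(56):
    day = start + timedelta(days=offset)
    history[day] = 150 if day.weekday() == 5 else 100

saturday_feature = recent_day_of_week_average(history, date(2024, 3, 2))
monday_feature = recent_day_of_week_average(history, date(2024, 2, 26))
```

In this sketch the Saturday feature reflects only the four preceding Saturdays, so it stays separate from the weekday pattern. A trained model would receive values like these as inputs alongside other features.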
Warning: When you create or edit a machine learning entity, the system must be retrained. Contact Dimensions Support for advice.
- Go to Administration > Application Setup > Forecaster Setup > Machine Learning.
Note: If the table contains many machine learning configurations, you can find the one you are looking for more easily by clicking Filter and typing a keyword in the field at the top of the Name column or the Description column.
- Do one of the following:
- Click Create and enter a Name (and optionally a description).
- Select an existing configuration and click Edit.
Choose one of two purposes for the edit:
- To modify the features of the selected configuration everywhere it is assigned: For this purpose, select Save changes everywhere that the named entity is used.
- To use an existing configuration as a template to define a new configuration: For this purpose, select Save as a new named entity and give the entity a new name and (optionally) a description.
- Select an existing configuration and click Delete.
Note: You cannot delete a machine learning configuration if it is selected in a category property set.
- Click Create
- On the Recent Average - Day of Week tab, select one or more values. Current Assigned displays a list of current selections. Higher values produce more stable but less responsive results. The default setting (4 weeks) works well for most implementations. By adjusting the Recent Average - Day of Week setting, you can reduce the impact of anomalies associated with specific values and capture longer-term or shorter-term trends.
In a weather-sensitive business, if unusual weather has caused a change in volume over the past month, a period of 4 weeks may provide an inaccurate forecast. Yet normal seasonal variation indicates that an average of 16 weeks would include too much of the previous season's weather, introducing inaccuracy. In this case, set the Recent Average - Day of Week to 8 or 12 weeks for best results.
- On the Recent Average tab, select one or more values. Current Assigned displays a list of current selections. Higher values produce more stable but less responsive results. It is recommended that you choose at least two of these features to capture both recent trends and longer-term trends. The default settings of 60 and 90 days work well for most implementations. By adjusting the Recent Average setting, you can reduce the impact of anomalies associated with specific values.
A store with seasonal trends may benefit from a 30-day Recent Average that can quickly capture a surge in volume. However, sporadic weather events may cause this 30-day average to fluctuate too quickly, so it can be tempered by also including a more stable 90-day Recent Average. Together, these (recommended) settings capture both short-term and medium-term seasonal trends.
- Under Organization:
- Specify which levels of the business structure you want to use as features for predicting volume. If volume trends differ between entities at a level, select that level. For instance, if different districts have different patterns, select "District". The default selection of all levels is recommended because, in most cases, Machine Learning will determine whether these features are needed.
- Map two levels in your business structure to the levels here designated as District and Region. For District, choose the lowest level that contains multiple sites. For Region, choose the lowest level that contains multiple instances of the level you mapped to District.
- Under Special Event, specify the level of the business structure where you want to apply the volume multiplier. A lower level provides more granularity, while a higher level lessens the effects of outliers. If special events tend to affect different categories in the same store differently, use Category. Otherwise, Site may introduce less noise into the calculation of special event effects and therefore provide more accurate forecasts.
- Under Other Configuration define the following features.
- Training Period specifies the number of weeks in the past that the method uses to build the model. A minimum of two years of data plus the forecast horizon is required, and in general a full 3 years of data is recommended. A longer period generally provides more accuracy at the expense of computational time, but there are diminishing returns after several years. If business conditions or activities changed radically at some point, including earlier data may skew the model. If so, consider starting the training period after the change.
- Pooling Strategies specifies the business structure level where the engine pools volume data and features to create the machine learning model. Individual predictions are still made at the category level, but the model utilizes trends in historical data throughout the pool. The default setting, By Driver, generally provides best results.
- Generic Department to Exclude allows you to specify one or more department types which you want to exclude from the machine learning model. Excluding departments that have known and significant problems with data cleanliness including missing, negative or fractional values, or data that is duplicated in other categories, will improve model accuracy considerably. Matching departments are excluded regardless of where they reside in the business structure being forecasted. Current Assigned displays a list of generic departments excluded.
- Category to Exclude allows you to specify one or more categories which you want to exclude from the volume run. The categories must be selected as specific locations in the business structure; the exclusion is not inherited. Current Assigned displays a list of categories excluded.
- Click Save.
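The trade-off between stable and responsive Recent Average values described in the steps above can be seen in a small Python sketch. It is only an illustration: the function name and the volume series are invented for this example, and the forecasting engine may calculate its averages differently.

```python
def recent_average(volumes, window_days):
    """Mean of the most recent `window_days` daily volumes.

    volumes is a hypothetical list of daily totals, oldest first.
    If fewer days exist than the window, all available days are used.
    """
    if not volumes:
        raise ValueError("at least one daily volume is required")
    window = volumes[-window_days:]
    return sum(window) / len(window)


# Illustrative series: 90 days at 100 units, then a 30-day surge at 160 units.
volumes = [100] * 90 + [160] * 30

short_term = recent_average(volumes, 30)  # uses only the surge period
long_term = recent_average(volumes, 90)   # blends the surge with earlier days
```

Here the 30-day average has moved all the way to the surge level, while the 90-day average has moved only part of the way. Selecting both values gives the model one feature that responds quickly and one that is less affected by short-lived swings.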