New Model

Machine learning models and various pre-processing techniques are provided in this section. Price data of different assets and technical indicators are fed to models to enable them to predict buy, sell, and no-action labels or forecast numeric values for upcoming candles. To train a model, three sets of parameters must be defined:

Train Data
Model Parameters
Feature Processor

Train Data

In this tab, the parameters of the training data are defined.

Field	Description
Name	The name given to the model.
Market Data	The market data that contains the training data.
Train Pair(s)	The pairs used for training the model.
Timeframe	The timeframe in which the model is trained.
Numeric Feature(s)	The features used as inputs for the model; only numeric features are allowed.
Target	The reference for the training phase; the model learns to map the inputs (numeric features) to the target.
Start Date	The start date of the training data; must be within the time range of the market data.
End Date	The end date of the training data; must be within the time range of the market data.

Model Parameters

The structure of the model used for the learning task is selected here. Users can choose from various machine learning models to develop a strategy. Each model comes with a default set of parameters, but users can modify them in the Advanced section by adjusting model parameters. To learn more about different machine learning models, read here.

Feature Processor

The feature processor modifies the input data to achieve maximum performance in the training process. These modifications include adding memory to the system, sampling the inputs over time, normalization, and scaling of the input features.

Memory and sampling

Instead of passing only the feature values of the last candle, the model can be fed the feature values of the last few candles. Memory defines the number of most recent candles given to the model. Although a larger memory gives the model more information, it does not necessarily improve the results. As the memory grows, the model might need more learning capability (larger N Estimators and N Neighbors) to make use of the extra information.
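As a rough illustration of how Memory changes the model input, the sketch below builds one window from the most recent `memory` candles and flattens it into a single input row. The function names `make_windows` and `flatten` and the sample values are hypothetical, and this is not the platform's implementation.

```python
def make_windows(rows, memory):
    """Return one window of `memory` consecutive candles per eligible candle.

    `rows` holds one feature vector per candle, oldest first. The first
    `memory - 1` candles do not have enough history and produce no window.
    """
    return [rows[end - memory:end] for end in range(memory, len(rows) + 1)]


def flatten(window):
    """Concatenate the feature vectors of a window into one model input."""
    return [value for candle in window for value in candle]


# Hypothetical data: two features for four consecutive candles.
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
windows = make_windows(rows, memory=2)
inputs = [flatten(w) for w in windows]
```

With a memory of 2, each input row carries twice as many values as a single candle, which is why a larger memory may call for a more capable model.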

Sampling is the process of selecting a subset of data instances for training. Sampling is applied when the training data is too large for the model or when training takes too long because of the size of the training data. Sample defines the sampling rate. When Sample is set to x, only one out of every x candles is used for training, and the rest are ignored. When data is sampled, Stride defines the number of candles by which the window is shifted after a training sample of size Memory is made.
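The sketch below shows one possible reading of Sample and Stride, using integers as stand-ins for candles. The helper names `subsample` and `strided_windows` are hypothetical, and how the platform combines the two settings is an assumption here, not something the description above confirms.

```python
def subsample(candles, sample):
    """Keep one candle out of every `sample` candles (the first of each group)."""
    return candles[::sample]


def strided_windows(candles, memory, stride):
    """Build windows of `memory` candles, shifting the start by `stride` each time."""
    last_start = len(candles) - memory
    return [candles[s:s + memory] for s in range(0, last_start + 1, stride)]


candles = list(range(10))  # stand-in for ten consecutive candles
kept = subsample(candles, sample=3)
windows = strided_windows(candles, memory=4, stride=3)
```

In this reading, a larger Sample or Stride reduces the number of training instances roughly in proportion, which is what shortens training time.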

Price-Like Normalization

Any indicator or data series expressed in price units (for example, $) is meaningful only relative to the price itself. For this reason, the QuantiX platform provides price-like normalization. When it is activated, users can select between the Auto and Manual modes. In the Auto mode, the columns are selected automatically; in the Manual mode, the user selects the columns from a list.

Two types of normalization are available under Normalization Type. The normalization can be done with respect to a single candle (selected by Number of Candle) or with respect to attributes of the window. Here, window means all the candles passed to the model for label prediction; its size changes when Memory changes. When Number of Candle is set to -1, each window is normalized with respect to its last candle. This parameter can vary between -1 and -Memory.

Normalization Reference defines the feature used for normalization. This could be any price-like feature. Note that the value of the feature used as the Normalization Reference will become zero for all candles after normalization.

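The following formula is used for price-like normalization:

$$\hat{x}_t = \frac{x_t}{r_t} - 1$$

In this formula, $t$ denotes the time step, $x_t$ is the value of the feature, $r_t$ is the value of the Normalization Reference used for that candle, and $\hat{x}_t$ is the normalized value. In other words, for each candle, we divide the value of the feature by the Normalization Reference and then subtract one from the result.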
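The divide-then-subtract-one rule can be sketched as follows. Both function names and the price values are hypothetical. The first function applies the rule candle by candle, while the second assumes that the single-candle mode takes the reference from one candle of the window (the last one by default); the platform's exact behavior may differ.

```python
def price_like_normalize(feature, reference):
    """Candle by candle: feature[t] / reference[t] - 1."""
    return [f / r - 1.0 for f, r in zip(feature, reference)]


def normalize_window_by_candle(feature_window, reference_window, candle=-1):
    """Normalize a whole window by the reference value at one candle.

    `candle` is a negative index into the window, so -1 is the last candle.
    """
    ref = reference_window[candle]
    return [f / ref - 1.0 for f in feature_window]


# Hypothetical prices for three candles.
close = [100.0, 102.0, 105.0]
high = [101.0, 104.0, 105.0]

high_vs_close = price_like_normalize(high, close)
close_vs_itself = price_like_normalize(close, close)
close_vs_last = normalize_window_by_candle(close, close, candle=-1)
```

Normalizing the reference feature by itself yields zero at every candle, and normalizing a window by its last candle yields zero at that last candle.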

Scaler

Different numeric features have different scales, and their value distributions also differ from one another. Scalers are used to bring all features to the same range or the same distribution. For example, the StandardScaler shifts and rescales each numeric feature to a mean of zero and a standard deviation of one. The MinMaxScaler maps each feature to the range from zero to one, and the MaxAbsScaler divides each feature by its maximum absolute value, so scaled features range from -1 to 1 (from zero to one when all values are non-negative). The RobustScaler removes the median and scales the data according to the quantile range (defaults to IQR: Interquartile Range). The IQR is the range between the 1st quartile (25th percentile) and the 3rd quartile (75th percentile). The PowerTransformer modifies the data to become more Gaussian-like. The QuantileTransformer transforms the features to follow a uniform or a normal distribution; therefore, for a given feature, this transformation tends to spread out the most frequent values. It also reduces the impact of outliers. Assuming it behaves like the scikit-learn class of the same name, the Normalizer rescales each sample (one row of feature values) so that its L2 norm becomes one.
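To make the arithmetic behind some of these scalers concrete, the sketch below reimplements the usual textbook definitions with the Python standard library. The function names are hypothetical, the quartile interpolation method is an assumption, and the code is not the platform's implementation.

```python
import math
import statistics


def standard_scale(column):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return [(v - mean) / std for v in column]


def min_max_scale(column):
    """Map the smallest value to 0 and the largest to 1."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]


def max_abs_scale(column):
    """Divide by the largest absolute value."""
    peak = max(abs(v) for v in column)
    return [v / peak for v in column]


def robust_scale(column):
    """Subtract the median and divide by the interquartile range."""
    q1, median, q3 = statistics.quantiles(column, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in column]


def l2_normalize(values):
    """Rescale a vector so that its L2 norm becomes one."""
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values]


column = [-2.0, 0.0, 2.0, 4.0]  # hypothetical values of one feature
standardized = standard_scale(column)
min_maxed = min_max_scale(column)
max_absed = max_abs_scale(column)
robust = robust_scale(column)
unit = l2_normalize([3.0, 4.0])
```

The sketches assume non-constant input; a constant column would make the denominators zero and needs separate handling.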

Last modified: 29 April 2025