Feature Analysis
Feature Analysis is a tool that lets users statistically evaluate the performance of features in terms of precision and recall. The tool works with both numeric and categorical targets and features. Precision and recall are common performance metrics in statistics, defined as follows:
Precision represents the proportion of true positive predictions (correctly identified instances) out of all positive predictions (instances identified as positive, regardless of correctness). It measures the accuracy of the positive predictions made by a classifier or a feature, reflecting its ability to avoid false positives. A high precision score indicates that the classifier or feature is effective at correctly identifying relevant instances within the positive class, minimizing the risk of misclassification. In essence, precision assesses the capability of a feature/model to provide reliable positive predictions.
Recall represents the proportion of true positive predictions (correctly identified instances) out of all actual positive instances in the dataset. It measures the ability of a classifier or feature to capture or retrieve all relevant instances of the positive class. A high recall score indicates that the classifier or feature is proficient at identifying a large portion of the actual positive instances, minimizing the risk of false negatives.
To use Feature Analysis, the following fields must be filled:
Market Data: The market data that contains the feature(s) and the target
Pair(s): The pairs whose data is used in the analysis
Timeframe: The timeframe for which the analysis is being done
Target Name: The name of the target involved in the analysis
Start Date: The beginning of the analysis period
End Date: The end of the analysis period
Feature(s): Feature(s) involved in the analysis
Signal Analysis Reports
Using very short periods might lead to invalid results, because statistical analysis by its nature needs a sufficient amount of data.
Note that the maximum number of features involved in the analysis is two when one of the features is numeric.
Reports for Numeric Features
To calculate precision and recall, both the inputs and the outputs must be categorical. This means that not only the target but also the features must be categorical. To generate reports for a numeric feature, the signal analysis tool therefore breaks the range of the feature into ten equal sub-ranges and reports the precision and recall for each of them. For numeric targets, where precision and recall cannot be calculated directly, the system uses Mean Target Values instead: the mean value of the target is reported for each range of the feature(s).
When only one feature is used, the report is presented as a stem plot. When two features are used, the report is presented as a heatmap. For numeric targets, there is only one heatmap, which shows the mean value of the target for each sector of the heatmap. For categorical targets, a separate heatmap is presented for each label in the target, showing the precision and recall scores for the different combinations of the feature ranges.
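The binning approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's actual implementation: the function names are hypothetical, the ten ranges are assumed to be equal-width, and for a given target label the precision of a range is assumed to be the share of samples in that range that carry the label, while recall is the share of all samples with that label that fall in the range.

```python
from collections import Counter

def equal_width_bin(value, lo, hi, n_bins=10):
    """Return the index (0..n_bins-1) of the equal-width bin containing value."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # the maximum value falls in the last bin

def bin_precision_recall(feature, target, label, n_bins=10):
    """Assumed definitions: precision = hits in bin / samples in bin,
    recall = hits in bin / all samples carrying the label."""
    lo, hi = min(feature), max(feature)
    bins = [equal_width_bin(v, lo, hi, n_bins) for v in feature]
    in_bin = Counter(bins)
    hits = Counter(b for b, t in zip(bins, target) if t == label)
    total_label = sum(hits.values())
    report = {}
    for b in range(n_bins):
        precision = hits[b] / in_bin[b] if in_bin[b] else None  # empty bin
        recall = hits[b] / total_label if total_label else None
        report[b] = (precision, recall)
    return report

def bin_mean_target(feature, target, n_bins=10):
    """Mean of a numeric target per non-empty feature bin."""
    lo, hi = min(feature), max(feature)
    sums, counts = Counter(), Counter()
    for v, t in zip(feature, target):
        b = equal_width_bin(v, lo, hi, n_bins)
        sums[b] += t
        counts[b] += 1
    return {b: sums[b] / counts[b] for b in sorted(counts)}
```

With a single feature, each entry of these dictionaries would correspond to one stem in the stem plot; with two features, the same calculation applied to pairs of bins would fill the sectors of the heatmap.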
Reports for Categorical Features
When categorical features are used and the target is numerical, a heatmap that shows the different combinations of intervals for the features is reported. In this heatmap, the mean target value for each sector is available. When categorical features are used and the target is also categorical, two interactive charts are reported. The first one allows you to navigate through different combinations of the features and read the precision or recall scores. The second chart shows the different combinations of labels in the features included in the analysis. Together, these two charts let you explore various combinations of labels and find the combinations that give the best performance.
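As a rough illustration of what the per-combination scores might look like, the sketch below computes precision and recall for every observed combination of labels across several categorical features. It assumes the same definitions as before (precision is the share of samples with a given combination that carry the target label; recall is the share of all samples with that label that show the combination). The function name and the sample labels are hypothetical and do not come from the tool.

```python
from collections import Counter

def combo_precision_recall(feature_columns, target, label):
    """Precision and recall of each observed label combination for one target label.

    feature_columns: list of equally long lists, one per categorical feature.
    Returns {combination tuple: (precision, recall)}.
    """
    combos = list(zip(*feature_columns))  # one tuple of labels per sample
    seen = Counter(combos)
    hits = Counter(c for c, t in zip(combos, target) if t == label)
    total_label = sum(hits.values())
    return {
        combo: (
            hits[combo] / seen[combo],
            hits[combo] / total_label if total_label else None,
        )
        for combo in seen
    }

# Hypothetical example data: two categorical features and a categorical target.
trend = ["up", "up", "down", "down"]
volatility = ["high", "low", "high", "high"]
signal = ["buy", "sell", "buy", "sell"]
scores = combo_precision_recall([trend, volatility], signal, "buy")
```

Sorting the resulting dictionary by precision or by recall is one simple way to look for the label combinations that perform best.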