GDA’s Novel Automated Chart Pattern Recognition Infrastructure (ACPRI) Framework Using Deep Learning
The growth of financial Big Data and the upsurge of high-frequency trading (HFT) pose new challenges for dealers, financial data analysts and stakeholders. Extracting useful evidence from raw financial data and executing a trade within the available time window is becoming increasingly difficult. Trading information is growing so quickly that an investor cannot keep track of all the daily trades and the assets being traded, which is essential for making a major investment move. The most complex task is to find the right investment under the market’s natural time constraints and firm deadlines, and then make a decision on the spot.
In today’s fast-paced, competitive and data-rich commercial world, analyzing financial big data is a wearisome process. The data is therefore visualized as financial charts and graphs to ease the analysis. The charts capture the patterns generated when traders buy and sell, representing the oscillation of prices, which saves time for traders who must make quick decisions.
Chart pattern recognition plays a vital role in helping traders find good opportunities and make financial decisions. Learning to identify the patterns in the charts can help traders anticipate the price fluctuations of cryptocurrencies. The patterns are formed by price fluctuations that mark the highs and lows of the past weeks or months for cryptocurrencies like Ethereum and Bitcoin. Traders try to decode these patterns for decision-making. These patterns are technical in nature and usually indicate a high likelihood that an important move is due.
With the boom of Artificial Intelligence (AI), many human capabilities are being augmented or automated, for example through bots for surgery, consulting practitioners, trading decisions, identifying the best next move in a business and a lot more. Many such bots are available for automated crypto trading, such as Pionex, Cryptohopper, Bitsgap, Coinrule and Trality¹. Moreover, recent advances in Machine Learning (ML) are vigorously reshaping the financial trading market. In particular, by using Deep Learning (DL) and Reinforcement Learning (RL) algorithms, it is possible to train a machine to make financial trading decisions for investors. The decisions made by machines, i.e. bots, can be faster and, in many cases, more accurate than those made by humans. Furthermore, bots can operate 24/7, are not affected by emotions like fear and greed, and can process gigabytes of data within milliseconds².
There are already many use cases of ML in financial trading like AI-based Asset Price Prediction, RL for Modeling Other Trading Agents, Deep Learning for Financial Technical Analysis, CNNs: Plugging-in Image Recognition into Financial Charts, AI for Stock Ranking, Portfolio Diversification with Machine Learning and Evaluating Market Sentiment with Natural Language Processing (NLP)³.
Thus, this paper aims to create a novel Automated Chart Pattern Recognition Infrastructure (ACPRI) framework for crypto market analysis by using Deep Learning techniques such as Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks. It is designed to predict cryptocurrency trends by identifying the complex patterns in the charts and to provide business strategies that help traders maximize profit. The major objectives of creating the ACPRI framework are as follows:
- Replace the traditional methods currently used for analyzing charts and predicting the future of cryptocurrencies.
- Replicate the trader’s visual outlook in the training data for bots, which leads to automation.
- Give more data insights and extract fine-grained information by connecting the data points, yielding more useful information with high probability.
- Provide an unbiased framework for the analysis of different chart patterns.
- Correlate charts and patterns to obtain useful information.
- Process multiple datasets (visual as well as numerical) together to give a cohesive representation of current and past market analysis.
- Analyze years of data, which highly productive results require and which is not possible manually or with traditional pattern recognition techniques. At the moment, traders can consider at most one month’s data for analysis, which can lead to errors.
The major contributions of the proposed approach are:
- A novel CNN architecture for processing 2000 typical chart patterns.
- A novel CNN architecture for processing volume and liquidity clusters charts.
- A novel CNN architecture for processing proprietary charts.
- Pre-processing of the charts and manual labeling for training the model.
ML in finance is not a new domain. Finance has been a trending application area for ML over the last 40 years⁴, and many studies apply ML to financial decisions. Even today, researchers are focusing on finding novel ML approaches for finance applications and use cases for crypto market analysis. Overall interest in the domain has increased since the introduction of the first cryptocurrency in 2008 by “Satoshi Nakamoto”. More advanced ML techniques such as Deep Learning (DL) and Transfer Learning are also being explored for the analysis of crypto markets.
Crypto is not similar to forex or equities, because crypto markets rely on retail traders, who are strongly affected by emotional investing. Crypto markets also follow a tribal phenomenon, i.e. group participation in trading or crowdfunding. Furthermore, because the crypto space remains unregulated, large market investors are able to exploit small traders. Thus, technical chart pattern analysis works differently than in other markets like forex and equities. To date, some of the tools designed and released for technical chart pattern analysis have misled investors⁵. Hence, there is a need for automated technical chart pattern analysis for algorithmic trading that can help traders make correct decisions to maximize profit.
The key benefits of using DL for technical chart pattern analysis are³ ⁴:
- DL networks use cascades of multiple layers that enable the model to discover complex, nonlinear hidden patterns through abstraction of the training data. Technical chart pattern analysis (TCPA) requires investigating the long-term trends in raw data by identifying hidden patterns, emerging trends and market signals, and finding the correlations in the data.
- DL networks can train themselves on knowledge representations that have multiple levels of abstraction. Thus, they are useful for finding unstructured knowledge within raw data, such as financial time series data, and the appropriate reasoning.
- DL models work well at detecting patterns in images, such as lines and edges. Thus, they can detect trends in the crypto market from technical chart patterns.
- DL models perform well on large datasets, whether financial time series data or chart pattern images. Thus, combining supervised ML for price prediction with DL architectures like Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks can enable traders to predict the long-term trends in the crypto market.
- DL improves financial risk management by providing better prediction of volatility, market crashes and non-linear events, as it is good at discovering non-linear structures in charts.
- DL models act as black boxes that operate in a latent space. This allows market data (quantitative) to be combined with market reports, blogs and news (qualitative) to create a model that incorporates technical as well as non-technical data.
B. Basic Architecture of a CNN
A CNN is a DL model that is mostly used for image classification and segmentation. Fig. 1 shows the general CNN architecture for an image classification task. The CNN architecture consists of three types of layers: the convolutional layer, the pooling layer and the fully connected layer. These layers are stacked to form the CNN architecture. Other important components of the CNN architecture are the dropout layer and the activation function.
The number of output nodes in the next layer can be determined by using eq. 1:
N = (W - F + 2P) / S + 1    (1)
where:
- N = Size of output nodes
- Size of the image is W x W x D
- F = Size of the kernel/filter
- P = Amount of padding
- S = Stride
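As a worked illustration of the symbols defined above, the following minimal sketch assumes that eq. 1 is the standard relation N = (W - F + 2P) / S + 1 for a square input. The function name `conv_output_size` and the example sizes are hypothetical and are not taken from the ACPRI implementation.

```python
def conv_output_size(w: int, f: int, p: int = 0, s: int = 1) -> int:
    """Spatial output size N for a W x W input, F x F filter, padding P, stride S.

    Assumes the standard relation N = (W - F + 2P) / S + 1.
    """
    if s <= 0:
        raise ValueError("stride must be positive")
    if f > w + 2 * p:
        raise ValueError("filter is larger than the padded input")
    # Floor division: positions where the filter would overhang are discarded.
    return (w - f + 2 * p) // s + 1


print(conv_output_size(32, 5))       # 28: a 5 x 5 filter, no padding, stride 1
print(conv_output_size(32, 3, p=1))  # 32: padding of 1 preserves the size
print(conv_output_size(28, 2, s=2))  # 14: a 2 x 2 window with stride 2 halves it
```

The depth D of the input does not appear in the result, because each filter spans the full depth; the depth of the output equals the number of filters applied.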
a. Convolutional layer
The convolutional layer is the first layer in the architecture, where the input is fed. This layer is responsible for extracting the various features from the input images. To extract the features, a mathematical operation called convolution is performed between the input image and a filter, i.e. kernel, of a particular size n × n. As the filter slides over the input image, the dot product is computed between the filter and the part of the input image currently under it, as shown in fig. 2. The output is known as a feature map, which gives more specific information about the image, such as lines, edges and corners.
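The sliding dot product described above can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions (a single-channel image stored as a list of lists, no padding, no kernel flip, as is usual in DL libraries); the function `conv2d` and the toy image and kernel are hypothetical.

```python
def conv2d(image, kernel, stride=1):
    """Slide an n x n kernel over a 2-D image and return the feature map."""
    n = len(kernel)
    rows, cols = len(image), len(image[0])
    feature_map = []
    for i in range(0, rows - n + 1, stride):
        out_row = []
        for j in range(0, cols - n + 1, stride):
            # Dot product between the kernel and the image patch under it.
            acc = 0
            for a in range(n):
                for b in range(n):
                    acc += image[i + a][j + b] * kernel[a][b]
            out_row.append(acc)
        feature_map.append(out_row)
    return feature_map


# Toy 4 x 4 image: dark on the left half, bright on the right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# 2 x 2 kernel that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

print(conv2d(image, vertical_edge))  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map is non-zero only in the middle column, where the filter straddles the edge, which is the sense in which a feature map highlights lines and edges.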
b. Pooling layer
The convolutional layer is followed by the pooling layer. The pooling layer is responsible for reducing the size of the feature map obtained from the convolutional layer, as shown in figure 3, which decreases the amount of computation required in the subsequent layers. The pooling layer thus reduces the connections between the layers, and it operates on each feature map independently.
There are various types of pooling operations as given below:
- Max pooling: the largest element in the selected image section is taken, as shown in fig. 3.
- Min pooling: the smallest element in the selected image section is taken.
- Average pooling: the average of all the elements in the selected image section is computed.
- Sum pooling: the sum of all the elements in the selected image section is computed.
The number of output nodes in the next layer after pooling can be determined by using eq. 1.
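The four pooling operations listed above differ only in how each window is reduced to a single value. The sketch below assumes a 2 x 2 window with stride 2 and no padding; the function `pool2d` and the sample feature map are hypothetical.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Reduce each size x size window of a 2-D feature map to one value."""
    reducers = {
        "max": max,
        "min": min,
        "sum": sum,
        "avg": lambda window: sum(window) / len(window),
    }
    reduce_window = reducers[mode]
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(0, rows - size + 1, stride):
        out_row = []
        for j in range(0, cols - size + 1, stride):
            # Collect the elements of the window anchored at (i, j).
            window = [
                feature_map[i + a][j + b]
                for a in range(size)
                for b in range(size)
            ]
            out_row.append(reduce_window(window))
        pooled.append(out_row)
    return pooled


fmap = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 4, 3, 8],
]

print(pool2d(fmap, mode="max"))  # [[6, 4], [7, 9]]
print(pool2d(fmap, mode="min"))  # [[1, 1], [1, 0]]
print(pool2d(fmap, mode="avg"))  # [[3.75, 2.25], [3.5, 5.0]]
print(pool2d(fmap, mode="sum"))  # [[15, 9], [14, 20]]
```

With W = 4, F = 2, P = 0 and S = 2, the standard output-size relation gives (4 - 2) / 2 + 1 = 2, which matches the 2 x 2 results.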
c. Fully connected layer
In a fully connected layer, every neuron, i.e. node, in the preceding layer is connected to each neuron in the next layer. Its output is therefore computed as a matrix multiplication followed by the addition of a bias. These layers are positioned before the output layer and take the flattened output of the previous layers as their input. There may be more than one fully connected layer. These layers are responsible for performing the classification task.
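The "matrix multiplication followed by bias" computation can be sketched as follows. The weights, biases and the pooled feature map are made-up numbers chosen for illustration; the helper names `flatten` and `dense` are hypothetical.

```python
def flatten(feature_map):
    """Turn a 2-D feature map into the 1-D vector a dense layer expects."""
    return [value for row in feature_map for value in row]


def dense(inputs, weights, biases):
    """Fully connected layer: weights[k][j] links input j to output neuron k."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


features = flatten([[6, 4], [7, 9]])  # [6, 4, 7, 9]
weights = [
    [0.5, -1.0, 0.0, 0.25],  # output neuron 0
    [-0.5, 0.0, 1.0, 0.0],   # output neuron 1
]
biases = [1.0, -2.0]

print(dense(features, weights, biases))  # [2.25, 2.0]
```

Each output neuron sees every input value, which is why the number of weights in such a layer grows with the product of its input and output sizes.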
d. Dropout layer
When neurons are fully connected to the preceding layer, the model is prone to overfitting the training data. The model then performs well on training data but poorly on test data, i.e. new data.
The role of the dropout layer is to mitigate the overfitting problem in a CNN. It “drops” a few neurons from the network at random during the training process, which temporarily reduces the effective size of the model. The dropout rate must be specified; for example, if the dropout rate is 0.2, then about 20 % of the nodes are randomly dropped from the CNN.
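A minimal sketch of the dropout idea is given below. It assumes the common "inverted dropout" variant, in which the surviving activations are rescaled during training so that nothing needs to change at test time; the text above does not say which variant is intended, and the function `dropout` is hypothetical.

```python
import random


def dropout(activations, rate, training=True, rng=random):
    """Randomly zero a fraction `rate` of activations during training."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # At test time every neuron is kept.
        return list(activations)
    keep = 1.0 - rate
    # Each neuron is dropped independently with probability `rate`.
    # Survivors are divided by `keep` so the expected activation is unchanged.
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the run is repeatable
out = dropout([1.0] * 10_000, rate=0.2, rng=rng)
dropped = sum(1 for value in out if value == 0.0)
print(f"dropped fraction: {dropped / len(out):.3f}")  # close to 0.2
```

Because each neuron is dropped independently, the dropped fraction is close to the rate on average rather than exactly equal to it in every training step.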
e. Activation function
One of the most important components of a CNN is the activation function, which introduces non-linearity so that the network can approximate continuous and complex relations between its variables. It decides which information should be passed on to the next layers and which should not reach the end of the network.
There are many activation functions, such as Sigmoid, Hyperbolic Tangent (tanh), Rectified Linear Unit (ReLU), Leaky ReLU and Softmax. Each function has a specific use. For example, sigmoid and softmax functions are used for binary classification, while softmax is usually preferred for multi-class classification.
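The activation functions named above have short closed-form definitions, sketched here with the standard library only. The default slope of 0.01 for Leaky ReLU is a common choice, not a value taken from the source.

```python
import math


def sigmoid(x):
    """Squashes x into (0, 1); usable as a binary-class probability."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Squashes x into (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """Passes positive values through and zeroes negative ones."""
    return max(0.0, x)


def leaky_relu(x, slope=0.01):
    """Like ReLU, but keeps a small gradient for negative inputs."""
    return x if x > 0 else slope * x


def softmax(scores):
    """Turns a list of scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


print(sigmoid(0.0))              # 0.5
print(relu(-3.0), relu(2.5))     # 0.0 2.5
print(softmax([1.0, 2.0, 3.0]))  # three probabilities, largest for the last score

# For two classes, softmax over [z, 0] gives the same probability as sigmoid(z).
z = 1.2
print(abs(softmax([z, 0.0])[0] - sigmoid(z)) < 1e-12)  # True
```

The last check shows why either function can serve for binary classification, while softmax generalizes naturally to more than two classes.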
C. Long Short Term Memory (LSTM)
Recurrent neural networks (RNNs) struggle to learn long-term dependencies because of the vanishing gradient problem. LSTM networks were designed to overcome this problem. Unlike feedforward neural networks, an LSTM has feedback connections. LSTMs can process an entire sequence of time series data rather than treating each data point in the sequence independently. They also retain useful information about previous data points, which helps them to process new data. This enables LSTMs to perform well on text, speech and time-series data.
LSTMs consist of a series of ‘gates’ that regulate the incoming information in a sequence and control information storage and the output of the network. A typical LSTM architecture contains three gates: an input gate, an output gate and a forget gate. These gates act as filters, and each is an individual neural network⁸.
- Input Gate (i): It determines how much new information is written to the Internal Cell State.
- Forget Gate (f): It determines how much of the previous data is forgotten.
- Output Gate (o): It determines what output, i.e. the next Hidden State, is generated from the current Internal Cell State.
At a given time step, the LSTM output depends on three things:
- Cell state: The current long-term memory of the network.
- Previous hidden state: The output at the previous time step.
- Input data at the current time step.
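How the three gates combine the cell state, the previous hidden state and the current input can be sketched for a single-unit LSTM with scalar values. The sketch assumes the standard LSTM formulation (including a tanh candidate value, labeled `g` here, which the text above does not name); the parameter values and the toy input sequence are made up for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell.

    params maps each of "i", "f", "o", "g" to a tuple
    (input weight, recurrent weight, bias).
    """
    def pre_activation(name):
        w, u, b = params[name]
        return w * x + u * h_prev + b

    i = sigmoid(pre_activation("i"))    # input gate: how much to write
    f = sigmoid(pre_activation("f"))    # forget gate: how much to keep
    o = sigmoid(pre_activation("o"))    # output gate: how much to expose
    g = math.tanh(pre_activation("g"))  # candidate value to write
    c = f * c_prev + i * g              # new cell state (long-term memory)
    h = o * math.tanh(c)                # new hidden state (the output)
    return h, c


# Hypothetical parameters; a trained network would learn these.
params = {
    "i": (0.5, 0.1, 0.0),
    "f": (0.3, 0.2, 1.0),
    "o": (0.4, 0.1, 0.0),
    "g": (0.9, 0.3, 0.0),
}

h, c = 0.0, 0.0
for x in [0.1, 0.4, -0.2, 0.3]:  # e.g. a short series of normalized returns
    h, c = lstm_step(x, h, c, params)
    print(f"x={x:+.1f}  h={h:+.4f}  c={c:+.4f}")
```

The cell state `c` is carried from step to step and changed only through the forget and input gates, which is what lets the network retain information across many time steps.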
III. Proposed Approach/Methodology
The paper aims to propose a novel method for automated chart pattern detection for crypto market analysis.
The paper proposes three DL architectures for three different charts.
1. Typical Chart Patterns
Typical chart patterns are of different types, as depicted in fig. 5. The paper proposes a DL architecture for the analysis of these patterns. Analyzing these patterns helps to find correlations in the data and market levels, which in turn helps traders to make the best decisions.
2. Volume and Liquidity Clusters Chart (VLCC)
Uniquely designed libraries of liquidity and volume clusters on spots and futures from major centralized exchanges are used to identify and represent the market microstructure strength, liquidity depth, market relativity and momentum. Figure 6 depicts the crypto native liquidity/volume clusters chart.
The paper proposes the second DL architecture for VLCC analysis, which will help to identify the major liquidity concentration areas, spot stop hunting, liquidity traps and SFP areas over a shorter (5m, 15m, 30m) or a longer (4h, 1d, 1w) timeframe.
3. Proposed Proprietary Pattern
The proposed proprietary pattern, depicted in figure 7, is highly customized and tailored for the Bitcoin (BTC) and Ethereum futures markets. It represents and identifies market microstructure correlations, strength or weakness, trend development and the shapeshifting phenomenon; it identifies a trading range and the trend distribution; and it determines the location of price within the broad spectrum of uptrends, downtrends and sideways markets. The ATPA module is predominantly built using customized Fibonacci tools (arcs, fans, retracements, extensions and time zones) and customized BTCUSD-P and BTCUSD-F harmonic chart patterns based on Fibonacci numbers and ratios, together with other relevant principles such as the Wyckoff method and Elliott Wave theory.
The paper proposes the third DL architecture for the analysis of such proprietary pattern charts.
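For readers unfamiliar with the Fibonacci retracements mentioned among the building blocks of the proprietary pattern, the sketch below computes generic retracement levels for a single upswing. It uses the ratios commonly quoted in technical analysis and is not the customized ACPRI logic, which the source does not disclose; the function name and the price values are hypothetical.

```python
# Ratios commonly used for retracement levels in technical analysis.
FIB_RATIOS = (0.236, 0.382, 0.5, 0.618, 0.786)


def retracement_levels(swing_low, swing_high, ratios=FIB_RATIOS):
    """Price levels at which an upswing has retraced a given ratio of its range."""
    if swing_high <= swing_low:
        raise ValueError("swing_high must be above swing_low")
    span = swing_high - swing_low
    # A retracement of ratio r sits r * span below the swing high.
    return {r: swing_high - r * span for r in ratios}


# Illustrative numbers only, not market data.
for ratio, price in retracement_levels(30_000.0, 40_000.0).items():
    print(f"{ratio:.3f} -> {price:,.2f}")
```

In a chart image these levels appear as horizontal lines between the swing low and the swing high.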
¹ Barbara Thompson, https://www.guru99.com/best-crypto-trading-bot.html, last accessed on 24/11/2021.
² Janny Kul, https://towardsdatascience.com/crypto-trading-bots-a-helpful-guide-for-beginners-60decb40e434, last accessed on 24/11/2021.
³ Kirill Goltsman, “AI in Financial Trading: White Paper”, A Data Science Foundation White Paper, 2018.
⁴ Ozbayoglu, A. M., Gudelek, M. U., & Sezer, O. B. (2020). Deep Learning for Financial Applications: A Survey. arXiv e-prints, arXiv-2002.
⁵ GDA Fund, https://gdafund.medium.com/acpr-a-breakthrough-chart-pattern-recognition-tool-7c5b99cb4ef8, last accessed on 27/11/2021.
⁶ https://towardsdatascience.com/convolutional-neural-networks-explained-9cc5188c4939, last accessed on 29/11/2021.
⁷ Rian Dolphin, https://towardsdatascience.com/lstm-networks-a-detailed-explanation-8fae6aefc7f9, last accessed on 30/11/2021.
⁸ Sherstinsky, A. (2020). Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Physica D: Nonlinear Phenomena, 404, 132306.