By - kmdrfx
Nice, I wish this sub was more of this. Success is really hard.
6 months of hard work! A full-time job, that sh*t. And thanks.
If it looks too good to be true, it almost always is. Try your algo in a realtime market and validate the results. I hope you are not overfitting, but you probably are.
Works realtime on the live binance API. Same results. It's not too good, but positive. I would be concerned if the test data inference yelled 20% profits at me, yes. It does not. Hitting the targets 100% would yield around 5% daily; the model hits >90% and can get ~1.5 - 2.5% daily. What I wonder about is why it isn't more, with over 90% accuracy.
For completeness: not all days are positive, on a day like today it's a loss, but on average throughout a month it's positive with said dailies.
If you can get that daily, do it while it lasts.
Woo, doubling your money once a monthish. That's pretty amazing
The more trials you run over time, the more false positives you will get in your data. If you are not adjusting your p-values for the number of trials, then you should expect inflated Sharpes.
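To illustrate the point about adjusting for the number of trials, here is a minimal sketch of a Bonferroni correction; all numbers are hypothetical, not taken from the thread.

```python
# Hypothetical illustration: adjusting the significance threshold for the
# number of backtest trials (Bonferroni correction).
alpha = 0.05            # desired family-wise error rate
n_trials = 200          # number of strategy variants backtested
adjusted_alpha = alpha / n_trials

p_value = 0.01          # p-value of the best-looking strategy
looks_significant_naively = p_value < alpha       # ignores multiple testing
survives_adjustment = p_value < adjusted_alpha    # accounts for it
```

With 200 trials the threshold drops from 0.05 to 0.00025, so a p-value of 0.01 that looks significant in isolation no longer clears the bar.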
p values. Plenty of unexpected events.
You'll need to run it for at least as long as the time period the test data covers, and through different market regimes, to declare victory. If it works, congrats.
Teach me master
...even though the targets/labels would create a neat performance as in being profitable, the result barely is. The targets themselves would be profitable over a longer period, with some losses here and there, but the inference of the model is not. I don't really get how this is possible with over 90% accuracy.
I know this is very general, but maybe there is something general I am missing.
With regression accuracy means nothing. You have to look at the loss to see how well it's performing.
Loss is fine, going down for train and test datasets.
Interestingly, the loss plotted along the price with an ema on it, moves very much along the price movement.
If your W:L is 90% but result is still not great - you must be having a terrible risk to reward ratio and one loss can easily eat profits of many trades.
WLR is more like 2:1, but yes, with previous models I had more of the problem that heavy losses shrink the profit.
How'd it do with today's selloff?
Lost 3-4% per symbol in the big drop, but still fine overall. Not all days are positive. But on average it's doing well.
I said it before and will repeat it over and over in this sub's slang if necessary. Learn the basics of trading and statistics. So you want to learn the dynamics of your market data. One of the most important indicators for dynamics is volume: whether the price rises on 10 coins or 10,000 coins, there is a difference. MACD shows you past, not future, dynamics (lol); it's even a bad proxy for future estimates. It is more or less useless. Predicting prices directly is omegalul; google why it's better to predict ratios, first-order differences of prices, normalized and scaled data, etc. This post should be in a trading sub, not algotrading. There is literally zero real intellectual algotrading here. It's just: pick a random indicator (MACD, RSI, Bollinger), define an arbitrary threshold, try it in a backtest, and overfit as often and as much as you can on the same "test set". Use evaluation metrics you don't even understand to make an "algo". Yeah, it is the same as going to the casino; with the rise of talk about martingale… I feel like it is already a casino. With >90% of posts about crypto you know it's gone to shit. With this sub unmodded there is never a real discussion about the science behind anything around algotrading. No, instead people want to hear casino stories.
TLDR: this sub is turning into shit. Feel free to ban me and spare me another click.
You mad bro? 🤣
Seriously though, I did my research, and it is predicting differences, not prices, duh. All normalized and scaled, taking volume into account. I did 3 months of trading upfront to get into the basics. Maybe read and ask intellectual questions if you want a discussion. Lolz, take care.
Seriously though, good that you did your own due diligence. I agree that MACD is not the price but the difference of MAs. Ok, I have a simple question: regardless of model (nn, decision tree, lstm, etc.), do you really think labeling/targeting/forecasting a highly lagging indicator will earn you money? You traded 3 months upfront, I guess that's a lot. :) But that's just my 2 cents, take it or leave it.
Edit: I have a thought process in mind: what if prices have high noise and a low signal-to-noise ratio? Would the difference of 2 MAs give me noises which cancel each other out, or add up, or something in between? If they add up, what is the value of my signal then? How do I know if my accuracy was induced by noise, so I was lucky, or did I really find some signal there?
Well, I am still new to this game, I don't claim to know shit about the markets or trading at all. I am just doing what I can in the time and with the resources I have. Still learning.
That said... I am able to make profits manually with MACD and RSI. So why not automate that? While automating, I found it difficult with plain if/else logic, so I started with models and TensorFlow. To some success, making at least minor gains.
On your thought process: if I got you right, I am asking myself the same questions. Did it catch a real signal there, and is it sustainable? The only way I see to find out is to put it to the test live. Works. More or less. Could be better, and probably will decay over the next month or so.
So I am taking all the learnings I got from all the great input here, reading up, applying it all, and going again. Fun!
I have other ideas for taking more realtime input than EMAs, like looking at order books (already did, nothing working yet), most recent trades, social media sentiment (too much work currently), etc.
Still have to do some catching up with the new info I got here, but... as long as the model with lookahead on the targets can make up for the lag of the indicators, I am taking it until I got something better.
I am grateful for any tip and will do my research and hopefully learn more. If you have any idea on what non-lagging data one could use, I am happy to listen.
are you sure you haven't included your test data in your training data?
If the test data were included in training, the accuracy would be even higher, like 100%, and it would not work live, but it does. I have it test-covered from all sides.
Show us out of sample test results
That is the test data range, the 10% slice of a 90/10 training/test split.
Edit: if I train further than that, it overfits obviously. I have plenty of models that cannot generalize. So I can reproduce overfitting and this is not.
the question is how did you split your data. did you use a random samples from your data set as a test set or did you choose a separate window of data?
It's a completely separate window of data.
one more thing you could try is to test your prediction on multiple windows by shifting your testing window forward and see during which period your algo works and when not.
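That shifted-window idea can be sketched as a simple walk-forward index generator; the window sizes below are placeholders, not the poster's actual setup.

```python
# Walk-forward evaluation sketch: slide a fixed-size test window forward
# through history and record performance per window.
def walk_forward_windows(n_samples, train_size, test_size, step):
    """Yield (train_indices, test_indices) pairs, stepping forward in time."""
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += step

windows = list(walk_forward_windows(n_samples=1000, train_size=600,
                                    test_size=100, step=100))
# Evaluate the model on each (train, test) pair to see in which
# periods the algo works and in which it does not.
```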
Yes, thanks, I'll be trying to optimize the system that way, thinking about a meta model that can tell in which areas it performs well, sorting the symbols by projected performance and giving the top ten higher priority and more moneys.
What do you mean by separate? Of course all targets will be outside of the data used for prediction... I have a feeling you misunderstood what a test set means. You need to "embargo" time periods so there can be no overlap at all between training and test samples. If your training set is day 0-100, then the first test set should start at 100 days plus whatever your longest lookback feature is. So if you have some feature that looks back 5 days, don't test with anything except day 105 and later.
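The embargo described above can be sketched like this; `lookback` stands for the longest lookback of any feature, and the day counts are illustrative.

```python
# Embargoed train/test split: leave a gap of `lookback` bars after the
# training window so no feature window overlaps both sets.
def embargoed_split(n_samples, train_end, lookback):
    train_idx = range(0, train_end)
    test_idx = range(train_end + lookback, n_samples)  # skip the overlap zone
    return train_idx, test_idx

# Training on days 0-99 with a feature looking back 5 days:
train_idx, test_idx = embargoed_split(n_samples=200, train_end=100, lookback=5)
# Testing starts at day 105, so no feature window straddles both sets.
```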
The recent 10%, yes. I have models that do not perform on test data at all, but these do, so it is well separated. I banged my head against the wall a lot this year and have the data pipeline completely test-covered now.
Are you taking into account the sampling bias? If you have tested multiple models on test set and choose the one that performs best, you might be overfitting.
Been there, done that (Exact same thing lol).
Hope you make it 🙌
Thanks! Had to read that twice, but I think I got it. I run two models in parallel on the live API and try to make sure each runs long enough before I give it more money and swap out the previous version. Only running live for barely two months now, so not much experience yet, but getting there.
It's called p-hacking and is a very common trap if you want to read about it
Yeah, that is possibly true, but he is running it live, so less chance of p-hacking.
Hold up. Are you saying that you tested multiple models using this 90/10 split and this is the one of your top performing models?
If that's the case, you've got massive multiple testing bias. You can run an experiment once using the data. If you select a model based on test set performance, you've used it more than once.
The models cover everything from not generalizing at all up to 96%; this one is at 91%. I get that choosing the best might mean choosing the most overfitted one. As I wrote in another reply, I try to work against that by running multiple models live (currently only two) and comparing them there.
Which brings me to why I originally posted this... Depending on initial weights, and even though the accuracy can be quite high, it performs well in some areas and not in others; one run does better on one symbol, the next on another. Fine, stochastic... But overall it seems to even out its performance somehow, even though it hits the targets quite well. That's still a bit of a mystery.
It’s unreasonable to expect a model to fit more than one symbol. Heck, am totally green with envy that you could fit to one symbol even. You have some code on github by any chance?
Have you tried 80/10/10 training/validation/test
Was using that setup months ago but left it for 90/10, as running validation alongside training takes time, and I have so far run thousands of trainings on hundreds of models; my resources are scarce with just one RTX 3080.
Try to implement a buy and sell signal and deploy model to paper trading.
Great video, lolz for that, but I went past overfitting months ago, thank you very much.
How else can I prove that it is not overfitted, other than separating training data from test data? Many readers here seem to be convinced it's overfitted, when it's definitely separated data that the model does _not_ see during training AND the model does _not_ see the lookahead, which is only used to generate the targets.
You should have training, validation, and test data. Something like 10% test, 10% validation, and 80% training. The validation set is used during training to verify you don't overfit the model, and to stop if you do. Test data is used to check the predictions afterwards, since it's never been used in any aspect of training.
Beyond that 🤷♂️ you can always just let it run on live data for a while and have it do pseudo trades to see if you're not crazy.
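A minimal sketch of that chronological 80/10/10 split; the early-stopping call shown in the comment assumes a TensorFlow/Keras model, which is not confirmed by the thread.

```python
# Chronological 80/10/10 split: validation and test come strictly after
# the training period, never sampled at random.
def chrono_split(n, train_frac=0.8, val_frac=0.1):
    train_end = int(n * train_frac)
    val_end = int(n * (train_frac + val_frac))
    return range(0, train_end), range(train_end, val_end), range(val_end, n)

train, val, test = chrono_split(1000)

# With Keras, the validation slice would drive early stopping, e.g.:
# model.fit(x[train.start:train.stop], y[train.start:train.stop],
#           validation_data=(x[val.start:val.stop], y[val.start:val.stop]),
#           callbacks=[tf.keras.callbacks.EarlyStopping(
#               monitor="val_loss", patience=10, restore_best_weights=True)])
```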
It's doing well live already, with 1.5 - 2.5% daily; what I wonder about is why it isn't more, with 90% accuracy. And why the error moves with price movements, but slightly lagging... Thanks for the sane reply.
I'm sure you've already got this covered, but for other people reading along since I haven't seen it mentioned in the thread yet...
Make sure you have very strong risk controls and stop-losses in place if you're implementing a mean-reversion strategy like this. You'll get blown up when a trend hits if you don't have solid risk management in place. You can make 1-2% per day for a month and then lose it all in a day as you keep getting buy signals as it goes lower and lower and you think "this has to be the bottom." But it can always go lower.
Yup this shit doesn't work
Which shit does? I mean, what general direction would you go instead of mean-reversion?
I feel like the only thing that works for trading is taking into account all variables. I.e., some sort of ML that takes news, fundamentals, hype and technicals all into account.
How would one calculate hype? Number/frequency of tweets or other social media mentions for coins/symbols? What are fundamentals for crypto? Sorry, yeah, doing crypto. Don't have stonk broker access currently, the fees are crazy and tax is shit in my country.
Yes you would have to use social media. Not sure where to get that data for free.
Probably because the distribution of live data is different from any distribution your model has seen. Maybe you can make it "smarter" by adding some domain adaptation techniques during its training so that it learns to recognise the data distribution as well.
Also, do you have a "re-training" strategy ?
Good luck !
Interesting, thanks. I do not yet really have a re-training strategy, other than "get new data and train the model again with the same initial weights". Any resources on that?
Do you have any hints on resources for "adaption techniques"?
How much time do you wait before training again ?
Concerning the resources about "domain adaptation", you could just search those terms on Google Scholar and do the same on YouTube.
I only started live trading two months ago and so far have only exchanged the models against completely different ones, so no re-training yet. Experimenting with different models live.
It's not overfitted, I train and test on different datasets.
How many times did you reoptimize for your test set? That’s where secondary overfitting occurs. You train on 80% of data, test on 10%. Then once you get the best you can on the test set you run the validation set once to get a better representation of real world performance. Any more playing with the validation set and you’ll fit to that specific data.
I did not re-optimize for the test set, at least model-wise. I have parameters for my broker, which communicates with the live API, like signal thresholds and drop risk (aka stop loss). Those parameters I optimize over a time window of roughly the last 3 months. Works live. As stated in other replies, just not as well as I'd expect for that high an accuracy.
I just now really got what you're saying. Will make sure to review my process for that. Thanks!
If your model isn’t overfit then you are probably incorporating look-ahead somewhere
I sure do, one timestep of lookahead; otherwise I would not need the model, if the labels were profitable standalone without lookahead.
Very interesting, thanks.
"The best solution to avoid the look-ahead bias is a thorough assessment of the validity of developed models and strategies."
My results are not exceptional, so I am not using information that would not otherwise be available at that point in time, other than using the lookahead on the MACD to give the model a hint where a change in direction might happen, which it can apparently grasp quite well, at least accuracy-wise.
The model does NOT get the lookahead as Input, obviously.
I would examine your inputs more closely, look-ahead is often baked into historical data in ways that are not immediately obvious
It is one of the most common mistakes made by beginners, especially programmers who dive into trading without a finance background
given the results you are getting it seems the most likely explanation
Thank you a lot. What is an example of "baked in"? My model definitely does not see any future data or inputs it would not get live; it works on the live API, just not as performantly as I expect it to.
lookahead bias usually silently enters data during a normalization or standardization process
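A toy example of that silent leak: if scaling statistics are computed over the full series, the test period's mean and standard deviation bleed into training-time features. The numbers are synthetic.

```python
import numpy as np

prices = np.arange(100, dtype=float)        # toy price series
train, test = prices[:80], prices[80:]

# Wrong: statistics computed over the full series include the future.
leaky_scaled = (prices - prices.mean()) / prices.std()

# Right: fit on the training window only, then apply to both splits.
mu, sigma = train.mean(), train.std()
train_scaled = (train - mu) / sigma
test_scaled = (test - mu) / sigma
```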
I built a normalization layer and a layer to index scale the data, at that point, the lookahead is already cut off.
To all the downvoters: how do you generate labels which are profitable, to train a supervised model, without lookahead? Would that not be a working algorithm with no need for a model?
How else can I prove that it is not overfitted, other than separating training data from test data? Many readers here seem to be convinced it's overfitted, when it's definitely separated data that the model does not see during training AND the model does _not_ see the lookahead, which is only used to generate the targets.
I gotta give it to OP, you're taking a beating in the comments and staying reasonably cool. Props to you man.
Pff... Beating. Ignorance is bliss. Thanks mate!
I agree with kaitje, you might be overfitting your model. Even if you ain't, your sample size according to the screenshot is 96 trades, which is way too small. Also, I'm just guessing here, but I believe your backtesting covers less than 1 or 2 years of data, so your strategy might just be working in a selected environment rather than across the bullish, bearish, and ranging markets which will probably appear in the near future. Last but not least, depending solely on the % of accurate trades is generally a bad idea, because if your strategy's win percentage drops you will be going straight to losses. Many times the best strategies are those with a low win/lose ratio but a very big profit factor, as a bad streak won't really hurt, yet if the market goes in your favor with some luck, results can be exponential. Just some words of advice, hope it helps.
Thanks a lot, good advice. I've been looking for high win/loss ratio (WLR) for stability, but you are right, in hindsight the models with lower WLR but high profit get through rough patches way better.
The example I posted is with data that goes back slightly over a year, correct. It works for longer-listed symbols like LINK or ADA as well, though. My approach here is divide and conquer: I train two models for the trend directions up/down and use the one matching the current trend.
Will definitely revisit the low WLR/high profit models!
Looks awesome! What model did you use? LSTM?
Did you follow any book/resource to reach this point?
Most importantly, why is this a picture of your screen as opposed to a screenshot?
Not using reddit on my work machine... And too lazy to screenshot and send it to my phone to upload. Simple as that.
Why are you downvoted lol
For being honest
I dunno 🤷 the internetz...
Not using reddit but you're using the work machine to write a personal algo? That's hilarious.
Even better... It's my personal work machine 🤣
There's no doubt you're overfitting. There are statistical tests that tell you if you are. I recommend you do that first.
Might be, but it's working on live data, so I am fine so far. What statistical tests would you recommend?
First, keep track of the correlation and its p-value between live returns and validation returns over the same time period. Also get those values between the training, test, and validation returns. If they are all below .6 with less than a .95 p, move on to the next idea.
Second, use a probabilistic sharpe ratio.
Third, this is just [one article](https://www.nature.com/articles/nmeth.3968) on the subject.
Fourth, I'd recommend reading endlessly about performance decay. In my opinion, it's more important than finding the fit on an algo in first place.
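The probabilistic Sharpe ratio mentioned in the second point can be sketched as follows, following Bailey and López de Prado's formulation; the example numbers are made up, not the poster's.

```python
import math

def probabilistic_sharpe(sr_hat, sr_star, n, skew=0.0, kurt=3.0):
    """Probability that the true Sharpe ratio exceeds the benchmark sr_star,
    given the observed per-period Sharpe sr_hat over n observations."""
    num = (sr_hat - sr_star) * math.sqrt(n - 1)
    den = math.sqrt(1.0 - skew * sr_hat + (kurt - 1.0) / 4.0 * sr_hat ** 2)
    z = num / den
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF

# Example: an annualized Sharpe of 1.2 observed over 250 daily returns,
# tested against a benchmark Sharpe of 0.
psr = probabilistic_sharpe(sr_hat=1.2 / math.sqrt(252), sr_star=0.0, n=250)
```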
👍 awesome, thanks a ton, will do and update at some point.
No worries. Good luck.
Also, it doesn't matter if it's working now on live data, but know what the probability is of continuance. You might have an algo that's highly prone to long-tail risk.
I'm interested in how you trained the model, did you choose a specific data period/set or leave it to learn unassisted?
It's 256 timesteps of 5m data, with 1m data concatenated along axis -1: OHLC + trades/volume from binance. It's assisted/supervised; I create two labels on the input data frame upfront, based on MACD, outliers removed, with 1 timestep of lookahead.
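For readers following along, the described input window might look roughly like this; how the 1m bars are aligned to the 5m grid is my assumption, not something stated by the poster.

```python
import numpy as np

timesteps = 256
n_fields = 5                       # O, H, L, C, trades/volume
ohlcv_5m = np.zeros((timesteps, n_fields))
# Five 1m bars per 5m step, flattened into the feature axis (assumed layout).
feat_1m = np.zeros((timesteps, 5 * n_fields))

# Concatenation along axis -1 as described in the comment above.
window = np.concatenate([ohlcv_5m, feat_1m], axis=-1)
```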
Don't know whether UNI historically avoids the zero-notice pump-n-dumps that BTC/ETH/DOGE are prone to, since the triggers for those are outside the data set and often random?
You could try some kind of velocity-of-change factor, or maybe even look at a Heikin-Ashi average as a decision factor. Binance isn't loading TradingView for me right now, so I'm guessing. In Webull I can't see UNI, but if I look at BTC this afternoon as HA candles, that tracks decently.
As far as profitability, how often are you trading vs fees you're generating?
Thank you! For a reasonable and sane answer.
70/26/23 is the ratio of won/lost/dropped trades in this particular example and time window (over ~9 days). The model is trained on UNI/DOT/FIL/KSM/SOL, all behaving similarly in inference.
Trading fees are accounted for with 0.075% per buy/sell, using the VIP1 fees with 25% discount on BNB burn at binance.
As for the pumps and dumps, I try to make up for that by filtering outlier movements when creating the targets/labels. Any other way to get them filtered better?
Will look into the velocity of change factor, good point 👍
The look ahead might be an issue on the test dataset
It looks very similar to the training data inference. Same issue there. Visualizing the sum(abs(y_true - y_pred)) error on the chart and applying a ~200 EMA to it, the error moves somewhat with the price; when there are larger price movements up or down, the error goes up as well, and the larger the price movement, the larger the error.
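The diagnostic described above, an EMA over the absolute prediction error plotted alongside price, could be computed like this; the ~200 span mirrors the EMA mentioned, and the data here is synthetic.

```python
import numpy as np

def ema(x, span=200):
    """Exponential moving average with the usual 2/(span+1) smoothing."""
    alpha = 2.0 / (span + 1)
    out = np.empty(len(x), dtype=float)
    out[0] = x[0]
    for i in range(1, len(x)):
        out[i] = alpha * x[i] + (1 - alpha) * out[i - 1]
    return out

y_true = np.linspace(0.0, 1.0, 500)                # stand-in targets
y_pred = y_true + 0.05 * np.sin(np.arange(500))    # stand-in predictions
error = np.abs(y_true - y_pred)
smoothed_error = ema(error)   # overlay this series on the price chart
```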
I tried this; it didn't last a month. Machine learning with VWAP, RSI, MACD and candle patterns, 98% accuracy with a 200 profit ratio. My data was 1m bars over 2 months; gotta test it with longer time frames.
So you are saying you have a better strat than all the hedge funds in the world, with only TensorFlow and MACD.
My gut tells me something is off.
Like others stated. It’s not about the win rate.
A 50 50 w:l ratio would make money with good risk management.
To be honest, most ML models are good enough to make a profit, but you need to implement a proper betting strategy: adding, cutting and taking profit.
The prediction part isn't hard anymore. Even without TensorFlow, scikit-learn has models that are good enough for classifying certain buying conditions or price predictions on a 'short' timeframe.
The problem has always been the betting odds and how to manage the money.
Well, I don't know about "most"; it seems to me it's still not an out-of-the-box experience. But I agree that there is a lot more to it than a working model. Even as a very experienced developer, it's a full-time job. It's not nearly just train, deploy, profit.
LOL I wonder what will happen with out of distribution sample...
I want to see a picture of your yacht when you make your first billion
Be very careful with anything that gives more than 70% accuracy on validation or testing. It's too good to be true, and I have been there multiple times, only to realize later that it was just overfitting/high bias towards history.
When it comes to trading strategies, I always recommend you to compare it to a passive benchmark (say S&P 500) rather than just looking at accuracy of predictions
Care to share your sources for learning? I'm a computer programmer but am new to algo trading
Phew... Doing that for half a year now, did some basic tensorflow tutorials in the beginning, just YouTube stuff, the tensorflow docs and examples obviously. Deeplizard has some good stuff to get started like https://youtu.be/dXB-KQYkzNU.
I got a kraken and binance account, got tradingview pro and started trading to get a grasp/feel on doing it manually first. A lot of staring at charts and reading about technical indicators.
The rest was my experience as developer and playing with the data and models.
Cool thanks for the brief summary
Do you have any other youtube recommendations?
What are MACD based labels/targets ?
The larger the difference between MACD and signal line, the stronger the trade signal. Below the signal line, buy; above, sell. Quite simple, the same way you would treat it manually. But for the targets I look one timestep ahead and let the model anticipate the low/high, without it seeing any lookahead.
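A hedged reconstruction of that labeling scheme: MACD minus signal line as the signal strength, shifted one step ahead for the targets. The standard 12/26/9 parameters and the exact shift are my assumptions, not confirmed by the poster.

```python
import numpy as np

def ema(x, span):
    alpha = 2.0 / (span + 1)
    out = np.empty(len(x), dtype=float)
    out[0] = x[0]
    for i in range(1, len(x)):
        out[i] = alpha * x[i] + (1 - alpha) * out[i - 1]
    return out

def macd_targets(close):
    """Features at time t; targets are the same strength one step ahead."""
    macd = ema(close, 12) - ema(close, 26)
    signal = ema(macd, 9)
    strength = macd - signal          # below zero: buy, above: sell
    features = strength[:-1]
    targets = strength[1:]            # one-timestep lookahead
    return features, targets

close = 100.0 + np.cumsum(np.sin(np.arange(300)))   # synthetic price path
features, targets = macd_targets(close)
```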
You wrote your model in python and tested it on which platform?
Model built and trained in Python with data from binance, running a bot with node and tfjs, built my own back testing lab and running live on binance.
Binance data aggregated here btw: https://github.com/binance/binance-public-data
>Model built and trained in Python with data from binance, running a bot with node and tfjs, built my own back testing lab and running live on binance.
>Binance data aggregated here btw: https://github.com/binance/binance-public-data
Solid job mate. Would you say node and tfjs are required, or could you go directly from Python to binance?
I'm looking to start deploying some algos in crypto at the beginning of 2022.
Thanks. You can go directly from Python, no problem. I make heavy use of the threading and async features of node, and it's also a personal preference, since I've been working with node for ten years now and get everything test-covered easily. Websockets... not much experience handling those in Python in a performant way. If you know Python better, go with that, I'd say. Good luck 🤞
Yea I know a bit of C++ but feels like it would be a mess. How's performance increase in JS vs python for what you're using it for?
I am not much of a Python pro; I can work with it. I don't know much about async or parallelism in Python, so the performance increase in node is enormous for me, using threads for parallelism and async (well, non-blocking IO) throughout the system.
Have been working with C++ a lot before node came along; would not want to do that in C++.
Interesting. Thanks mate.
EMH says this should not work for positional strategies…
What is this platform?
Built my own "platform" to analyze my results and run backtesting. I have a partner doing this, so I did not build everything alone.
Which graphing library is being used here?
Using this one https://github.com/tradingview/lightweight-charts
How much historical training data, and how much compute power? Asking for a friend.
It's Intel i7, 32gb RAM with an rtx 3080 16gb.
All historical data for DOT, KSM, SOL, UNI, FIL - 90/10 for train/test separation.
Curious if you account for transaction cost. Liquidity varies throughout the day. How often are you trading and what happens if you assume a 10, 25, 50 bps transaction cost per trade? How fast does your alpha decay?
I account for transaction cost. I do not yet dynamically adjust to liquidity. How many trades depends on the symbol, the model, and the current market situation. With the example I posted, it's around 15 trades per day on average.
I don't have numbers for decay yet, learned a lot already since posting this and have to adjust and gather some more data first. What is "10, 25, 50 bps transaction cost"?
For binance it's a fixed 0.075% per transaction.
Got it. Might be worth calculating average gains per trade.
If my understanding is correct, you’re saying binance has a fixed 7.5 bps cost per trade but that’s only a portion of your transaction cost. I think you also need to consider the bid ask spread. If you’re backtesting and assuming you always get filled at the mid, it might be a bit too generous. What I mean by 10, 25, 50 bps is to assume that each time you trade, you incur an additional 10 bps, 25 bps, or 50 bps in transaction cost due to bid ask. If your alpha still holds up, then you can be much more confident in your backtested results.
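That stress test can be sketched in a few lines; the per-trade returns below are made-up numbers, not the poster's results.

```python
def net_total_return(per_trade_returns, extra_cost_bps):
    """Total return after charging an extra bid-ask cost on every trade."""
    cost = extra_cost_bps / 10_000.0     # basis points -> fraction
    return sum(r - cost for r in per_trade_returns)

trades = [0.004, -0.002, 0.006, 0.003, -0.001]   # illustrative
results = {bps: net_total_return(trades, bps) for bps in (0, 10, 25, 50)}
# If the total stays positive even at 25-50 bps, the edge is more credible.
```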
Ah, now I understand, basis points of value traded. I'm not really accounting for spread/slippage in backtesting, other than using double the transaction costs. Will consider refining that, thanks 👍
Great that you share your results and thanks a lot for the discussion around it. It is very educational. I have a question about what is the software you show on the screenshot ?
It's a custom react application with https://github.com/tradingview/lightweight-charts
Why are you using a classification measure (accuracy) for (what should be) a general regression problem? What about, for example, out-of-sample r^2?
I am using regression metrics in 0.1 and 0.01 resolution. I have models where I treat other targets as a classification problem, works as well. What is out of sample r²?
In regression, r^2 is the fraction of outcome variance that is explained by your covariates. out-of-sample r^2 specifically is referring to r^2 for data you didn't train on. If you're predicting log price, then r^2 would make sense as a performance measure, for instance.
> I have models where I treat other targets as a classification problem
I never understand why classification is used so much on r/algotrading. What are you trying to predict that is binary?
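The out-of-sample r^2 described above compares the model's squared error against a naive mean predictor on data the model never trained on; a minimal sketch with illustrative numbers:

```python
import numpy as np

def oos_r2(y_true, y_pred, train_mean):
    """1 - SSE(model) / SSE(predicting the training mean), out of sample."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - train_mean) ** 2)
    return 1.0 - ss_res / ss_tot

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])
r2 = oos_r2(y_true, y_pred, train_mean=2.5)   # close to 1: explains variance
```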
Thanks for the explanation. Classification is not prediction per se when you're trying to classify certain market conditions at the current time without any lookahead. An unsupervised model will try to cluster the data into classes as well. From my understanding, and from what I saw working in my models, classification is a valid first pass as additional input for another model, and it's used widely this way in other areas.
If you hit 90%+ using ML on markets, it is because you have data leakage and/or are overfitting the model. It is most obvious on the left side of the chart, with that sawtooth pattern around 11am.
IMO anything with a lagged window is highly problematic and especially if using closing price for the bar.
Data leakage? What do you see in the sawtooth pattern? What would be non-lagging? What price to use instead of close and why? Curious.
If it were overfitting that badly, wouldn't I see hardcore accurate profits? I have models that really overfit, and there I see profits in the training range but nothing whatsoever in the test range. Here it's similar for the training and test data ranges. If it were badly overfit on training data, the training range should at least show me higher accuracy and therefore higher profits. I agree that there might be some overfitting happening, but that's not the main problem anymore for this model.
lol, 90% is a sure sign of overfitting. Your model has memorized the test data already.
There are way too many noobs on this subreddit.
>I agree that there might be some overfitting happening, but that's not the main problem anymore for this model.
You need to prove that statistically. Try training on randomly generated data, if it still gets above 50% accuracy, your model is flawed.
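That sanity check can be run with any model; here is a self-contained sketch using a 1-nearest-neighbour stand-in, since the point is the pipeline, not the model.

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(size=(400, 8))
y_train = rng.integers(0, 2, size=400)
X_test = rng.normal(size=(200, 8))
y_test = rng.integers(0, 2, size=200)

def predict(x):
    """1-NN prediction, standing in for whatever model you actually use."""
    dists = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argmin(dists)]

preds = np.array([predict(x) for x in X_test])
accuracy = (preds == y_test).mean()
# On truly random features and labels this hovers around 0.5; accuracy
# well above chance would indicate leakage somewhere in the pipeline.
```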
Will do, sounds like a good way to double check model complexity for the problem, thanks.
Using randomly generated data the accuracy does _not_ go above 0.5 or 50% in any metric. Guess my model is fine then.
Interesting. Well good luck on your trading.
This might help u: https://stock-shark.com/developers
Interesting! I built my own custom neural network from classical blocks (like KDE). My model works differently though: it takes into account a lot of fundamental data (earnings, debt, etc.) as well as looking at price trends over a period of time. I'm using it to identify outliers (when a stock is over/undervalued). The results were decent at 85%. That's what I expected based on how I designed it, and live results were hovering around that figure. I recently tuned it after what happened this past week and now it's at 96%. I'm very skeptical to say the least, haha, but I've tested it time and again. I am going to deploy it and see if it really performs that well.
Yea , you are overfitting.
But even if not, 90% accuracy isn't helpful if your gains vs losses are unbalanced enough -- happens pretty often.
I don't understand, is this like an experimental research model, or do you just think slapping some layers together in TensorFlow and running it on basic indicators is gonna be profitable? It is very wishful thinking, to be honest, if you aren't doing some real niche stuff with TF.
I am building my own custom layers... really niche stuff. Sure, it is experimental, but what does it matter if it works?