Logic of some of the robustness tests

11 replies

5 years ago #237813

So I’ve been at this a couple of years now and have gone deep enough into the details of algo trading that I’m questioning some of the logic I took for granted when I first began. I’m sure each person has their own view on my challenges below, and I’m really interested in the logic behind the thinking. Feel free to share your thoughts on why you use what you do and how it has helped you with your strategy identification.

1. I completely agree with robustness testing for: variable slippage, variable spread, randomizing the starting bar, randomizing the historical data (probability 20, max change of ATR 5-10%), randomizing the minimum distance from price, randomizing trade order (Exact and Resampling), and randomly skipping trades. All of these make total sense to me within certain ranges and seem to be enough to keep me from over-optimizing and curve fitting.
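To make the "randomize historical data" settings concrete, here is a minimal Python sketch of one possible reading of them: each bar has a 20% chance of being nudged by at most 10% of ATR. This is an illustration only, not SQ's actual implementation; the function name `randomize_closes`, its parameters, and the sample prices are hypothetical, and reading "probability 20" as 20% is an assumption.

```python
import random

def randomize_closes(closes, atr, probability=0.20, max_atr_fraction=0.10, seed=None):
    """Return a copy of closes where some bars are shifted by a small random amount.

    Hypothetical sketch: each close is changed with the given probability,
    and the shift never exceeds max_atr_fraction * atr in either direction.
    """
    rng = random.Random(seed)
    randomized = []
    for close in closes:
        if rng.random() < probability:
            close += rng.uniform(-max_atr_fraction, max_atr_fraction) * atr
        randomized.append(close)
    return randomized

# Made-up EURUSD-like closes and a made-up ATR of 50 pips.
closes = [1.1000, 1.1010, 1.1020, 1.1015, 1.1030]
print(randomize_closes(closes, atr=0.0050, seed=1))
```

A strategy that only works on the exact original prices, and falls apart on many such slightly shifted series, is the kind of curve fit this test is meant to catch.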

Here are the ones that don’t make sense to me. If you have sound logic for why you think otherwise, please share it with me and help me see where I may be wrong.

1. Test against other currency pairs – Why would we expect a strategy that was built on one pair to be effective on another, even if they are “highly” correlated for periods of time? Every currency pair (especially the majors) has its own “personality” due to the Central Banks and other key players that control it. If there is an independent Central Bank behind it, then it makes no sense to expect a “related” currency to perform the same way (e.g. EURUSD and GBPUSD). They are managed independently of each other. So when SQ builds a strategy that is effective on one pair, in my view, I should expect it to be valid only for that pair and for no other.

2. Test against other time frames – When SQ builds a strategy that meets our requirements (PF, DD, Net Profit, etc.) on one time frame, why would we expect it to perform even remotely close on a shorter or longer timeframe – especially if indicators are involved? The indicator values are driven by the parameters, the number of bars, etc. If the same indicator is placed on a chart whose timeframe is half, 1/4, or 4x that of another, the values it outputs are, in many cases, completely different. Thus, the rules for entering/exiting a trade based on the values of those indicators are no longer applicable in the different timeframe. So why in the world would we expect the strategy to be even close in performance to the original timeframe?

3. Randomize strategy parameters – As with #2 above, with indicator-driven strategies, the strategy parameters are what drive the calculations. So adjusting them higher/lower will certainly influence the output values and thus the rules of the entry/exit. This test seems reasonable to me only if I lower the performance expectation of the strategy in proportion to the amount of change I’m introducing/allowing into the parameters. If my standard for PF is 1.5 and I introduce 20% change into the parameters, it makes no sense to me to expect the strategy to retain a PF of 1.5. The strategy was built to match certain conditions that are reflected in the values of the indicators. If those conditions change, then I shouldn’t expect the rules to be met, and thus shouldn’t expect the same level of performance.
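For readers who want to see what a 20% parameter change looks like in practice, here is a minimal Python sketch. It is not how SQ implements the test; the function `randomize_parameters` and the parameter names `sma_period` and `stop_coefficient` are hypothetical and chosen only for illustration.

```python
import random

def randomize_parameters(params, max_change=0.20, seed=None):
    """Return a copy of params with each value moved by up to +/- max_change.

    Hypothetical sketch: integer parameters (such as indicator periods) are
    rounded back to whole numbers and kept at 1 or above.
    """
    rng = random.Random(seed)
    randomized = {}
    for name, value in params.items():
        new_value = value * (1 + rng.uniform(-max_change, max_change))
        if isinstance(value, int):
            new_value = max(1, round(new_value))
        randomized[name] = new_value
    return randomized

# Made-up strategy parameters: an indicator period and a stop-loss coefficient.
original = {"sma_period": 5, "stop_coefficient": 0.73}
print(randomize_parameters(original, max_change=0.20, seed=7))
```

Under these assumptions a period of 5 can only become 4, 5, or 6, which shows how coarse a 20% change is for small integer parameters compared with a continuous coefficient.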

I would welcome the views of other more experienced users of SQ on these topics. There certainly may be things that I’m not considering and where I need to be enlightened. Thoughts?

5 years ago #237818

I’m probably not a more experienced user, but if I may, I’d like to add my two cents, because this is a topic I’ve read and thought about, and I believe it’s key in this business. I read somewhere that you must crush your strategy as much as you can or the market will do it for you. For me that’s the reason to include the Multimarket and Multitimeframe tests as part of robustness testing. I don’t care too much about the reasons behind it; I prefer to discard a bunch of good strategies rather than accept some overfitted ones as good.

Regarding Randomize strategy parameters: for me that’s maybe the most important test. Between our backtests and live execution there’s a mist of variables, all of them playing against us: i) the SQ backtest doesn’t exactly fit reality, and it doesn’t even match the MT4 backtest exactly (you can check it); ii) broker execution is not precise; iii) the historical data we use is not the same as the broker’s, at least in my case, because my broker doesn’t provide its historical data.

And you skipped a test: the Walk Forward Matrix. I apply it as a final test to confirm the strategy.

5 years ago #237822

The short answer to all your points is that the only reason to do these tests is to figure out how sensitive the strategy is to different changes in the market. If the current settings fail totally, the strategy was probably curve fitted to the data it was built on, and finding it was just a lucky shot, or it was over-optimized by luck. But if a change of market or randomizing the parameters has only a moderate negative impact on performance, that tells you it is pretty good as is, and it can then most probably be improved and also adapted to another market through optimization. Normally, strategies that can be adapted through optimization to other markets where they failed in their original form are robust and can be traded live as well, and for that we now have the excellent tools Opt.profile and SPP.

5 years ago #237827

Thanks for sharing your insights, Enric and Mabi. Both of you mention overfitting and over-optimization. As I shared in the first post, the robustness tests I listed in point #1 make total sense to me: every one of them deals with something realistic in the market to avoid overfitting. I can assure you that nothing that has passed my set of robustness tests was arrived at “by luck.” Those strategies weather the possibilities of change that should be expected, all of which would address the points you both brought up.

So the question back to you is, if the things you are concerned about have already been addressed by specific robustness tests, then why would you require it to pass other tests that have absolutely no relationship to the “realistic” obstacles your strategy will encounter in the market? What are the specific issues that those tests address (which haven’t already been addressed in previous tests)?

5 years ago #237829

@afhampton. You’re questioning the necessity of Multimarket and Multitimeframe as valid Robustness tests. I’ve seen opinions of educated traders arguing on both sides. I don’t think anybody can give an absolute answer.

I prefer to include these tests in my workflow. See it like this: if I’m wrong, I’ll have discarded good strategies, which is a pity. On the other hand, if you exclude these tests and you’re wrong, you’ll be including overfitted strategies. The first option gives me more peace of mind.

Finally, @maby included two additional tests available on SQX. Of course I’ll include them in my workflow too. I’ll try to crush my strategies during the building process as much as I can, so that they don’t crush my account.

5 years ago #237830

No one says you need to run them. In futures I only ran tests on correlated instruments; otherwise the strategy would fail without being optimized on the uncorrelated instruments. In Forex most pairs are correlated, so it is like extra data. If you build it on EurUsd and then test it on GbpUsd, you have another 15 years of OOS. But you are right, it is not evident that it helps much, looking at the performance of live traded strategies.

5 years ago #237832

Great thread, I’ll put my 2 cents as well.

Though I can understand people who choose to challenge their strategies in any way possible, personally I agree with you on points 1 & 2. I do not see multi-market and multi-timeframe checks as valid robustness tests. I do them to see if I can trade a different market with my strategy as well, but nothing more. I would definitely not discard a strategy that passed all my other tests because of them.

You definitely should crush and challenge your strategy as much as possible, but if I am a good rapper, I will not go and perform in an opera hall. It would probably end in disaster and booing, right? I’ll stick to the rap clubs.

On point 3, “Randomize strategy parameters”, I make sense of it this way:

– For me it’s basically the same as “Randomize Historical Data”, upside down. I’ll give an example:

For simplicity, let’s say you have a strategy that goes long when price crosses and closes below the SMA of the last 5 bars.

Let’s assume that the closing prices of the last 5 bars at a certain point in time are: 5, 10, 15, 20, 25. That makes the SMA at that point 15.

Now during your randomize historical data test, we perturb those closing prices slightly, to, for example, “6, 11, 14, 22, 27”, to see if the slight change in historical data will impact your strategy or not, right? So this time the SMA at that point is 16 (as opposed to 15 with the true data), and you want to see if that SMA of 16 will still make your EA go long/short, how much you win/lose this time compared to the original situation, etc., right? And for you that’s a valid test to challenge the strategy in different, but relevant, conditions.

Now, let’s say you do a randomize strategy parameters test with the same data, so this time your EA takes into account SMA(6) instead of the original strategy’s SMA(5), and let’s say that the historical closing price of the bar before the 5 bars mentioned is 21. Now you have the following closing prices of the last 6 bars: 21, 5, 10, 15, 20, 25. The SMA of which is? 16. So essentially you got to the same mutated result as you got with slightly different historical data, just “upside down”, diversifying your testing.

In other words, we try to change the historical data from the EA’s point of view (obviously without very wide changes), rather than from the data’s point of view, and see how the strategy handles it. It’s an additional form of challenge for the strategy, one that you can expect it to survive. As the changes we make are more radical compared to the historical data changes, this is usually also a more demanding test.
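The arithmetic in the example above can be checked with a few lines of Python. The helper `sma` and the variable names are only illustrative; the prices are the ones from the post, with the earlier bar (21) kept unchanged in the perturbed series.

```python
def sma(closes, period):
    """Simple moving average of the last `period` closing prices."""
    window = closes[-period:]
    return sum(window) / period

# Six most recent closes: the bar before the example window (21), then the 5 bars.
original = [21, 5, 10, 15, 20, 25]
# Same bars after a slight "randomize historical data" change to the last 5.
perturbed = [21, 6, 11, 14, 22, 27]

print(sma(original, 5))   # SMA(5) on true data: 15.0
print(sma(perturbed, 5))  # SMA(5) on perturbed data: 16.0
print(sma(original, 6))   # SMA(6) on true data: 16.0
```

Both changes move the indicator from 15 to 16, which is the sense in which the two tests look at the same disturbance from opposite sides.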

5 years ago #237833

I see your point and yet disagree.

Following your example: if you’re a good singer, you could probably perform rap or opera. If you’re a rapper you may not be able to make a living from opera, but I’m sure you’d do well 🙂

You’re right about randomizing parameters. However, imagine a strategy with StopLoss = 0.73 * SMA(5). By changing the SMA period I’ll get the same result as by changing the historical data. OK, I have no problem with a redundant test. But a change in the parameter 0.73 is something that could happen in real trading.

5 years ago #237834

Enric wrote:

I see your point and yet disagree. Following your example: if you’re a good singer, you could probably perform rap or opera. If you’re a rapper you may not be able to make a living from opera, but I’m sure you’d do well 🙂 You’re right about randomizing parameters. However, imagine a strategy with StopLoss = 0.73 * SMA(5). By changing the SMA period I’ll get the same result as by changing the historical data. OK, I have no problem with a redundant test. But a change in the parameter 0.73 is something that could happen in real trading.

It’s perfectly fine that you disagree, and I wish you luck. To me it’s just a form of perfectionism that is nice if it works, but not necessary. (I mean, not everybody is Freddie Mercury, right?)

I also agree with the second part of your message; that’s all the more reason to use that test. My example was just a basis for the need for that test, but it doesn’t really help me if I end up just doing the same as changing the historical data. The whole point is presenting a different situation to our EA, whether it’s a period change, slight changes of SL/TP if they’re based on ATR/pips, or coefficient changes as you suggested. As long as the change isn’t too radical, it all makes for a good and relevant challenge to overcome.

Ilya

5 years ago #237845

Ilya wrote:

…. My example was just a basis for the need for that test, … Ilya

This is exactly the kind of engagement I was hoping for with this question and kudos to Ilya for presenting an excellent logical reason for the randomizing parameter values test – supported by a legit example. Very well explained and I might reconsider my view toward that particular test.

So the next question I have is… “how much change is reasonable for randomizing parameter values”? There has to be a sensible approach. I completely agree with thoroughly testing strategies, but there have to be sensible boundaries. I’ve seen a few posts talk about “crushing” strategies. I can crush any strategy someone develops by making the conditions and filters unreasonably strict. But that makes no sense. The objective is to find strategies that can weather the normal conditions, routine manipulations, and occasional extremes in the market. That makes them robust. But just like with other tests, the range has to be justified. I can tell you why my slippage test is set to 5-10 pips: because there is precedent for that in the market. I can tell you why my spread test is set to 3-12 pips: because there is precedent in the market.

So how much change is reasonable for the randomize parameter values test and why?

5 years ago #237900

Many strategies I trade were built on another instrument or time frame but tested well, so I used them. That saved me a lot of months of generation and allowed me to quickly build diversified portfolios.

5 years ago #237905

Thanks Mabi. You bring up a very good point. How does the correlation of those strategies look when you take a portfolio and run it through Quant Analyzer?
