Robustness Tests, do they really work?
10 replies
rsantinikk
8 months ago #288514
The following procedure can be used to check whether “Robustness Tests” really work.
- Build 1000 strategies on EURUSD H1, using 2018-2023 as “In Sample” data and holding back 2024 as “Unseen Data”.
- Once you have 1000 strategies with a nice equity curve, a good profit factor and an acceptable fitness, test them on the “Unseen” period.
Let’s say that 30% of the initial strategies, i.e. three hundred, are profitable when tested only on 2024. This means that if I had picked a strategy at random from the thousand, I would have had a 30% probability of ending 2024 in profit.
- Run a “Robustness Test”, the “Walk Forward Matrix” for example, on the 1000 initial strategies over the “In Sample” period, 2018-2023.
Let’s say that 10% of these strategies, one hundred, pass the test.
- Now test these one hundred strategies on the “Unseen” period, 2024.
If the “Robustness Test” works, it should select the better strategies, so the percentage of profitable ones among the hundred should be greater than the 30% obtained without any “Robustness Test”.
I have run this procedure many times on different assets and time periods, changing parameters and using every type of “Robustness Test”, but the result was always the same: the “Robustness Test” was not able to select the strategies that were more likely to be profitable in “Unseen” periods.
I wrote this post to compare notes with other traders and to find out whether anyone has evidence that “Robustness Tests” are really useful for something, and in what way.
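The comparison described above can be sketched in a few lines of Python. This is only an illustration, not part of StrategyQuant: the function names are made up, and the choice of a one-sided hypergeometric test (how likely is it that a random pick of 100 strategies out of the 1000 would contain at least as many winners?) is an assumption about how one might judge whether a difference is more than luck.

```python
from dataclasses import dataclass
from math import comb


@dataclass
class FilterCheck:
    baseline_rate: float  # share of ALL strategies profitable on unseen data
    filtered_rate: float  # share of strategies that passed the test AND were profitable
    lift: float           # filtered_rate / baseline_rate (1.0 = no better than random)
    p_value: float        # chance that a random pick does at least this well


def hypergeom_tail(total: int, winners: int, picked: int, picked_winners: int) -> float:
    """P(X >= picked_winners) when `picked` strategies are drawn at random,
    without replacement, from `total` strategies of which `winners` are profitable."""
    upper = min(picked, winners)
    favourable = sum(
        comb(winners, i) * comb(total - winners, picked - i)
        for i in range(picked_winners, upper + 1)
    )
    return favourable / comb(total, picked)


def evaluate_filter(total: int, winners: int, picked: int, picked_winners: int) -> FilterCheck:
    """Compare the robustness-test survivors against the unfiltered population."""
    baseline = winners / total
    filtered = picked_winners / picked
    return FilterCheck(
        baseline_rate=baseline,
        filtered_rate=filtered,
        lift=filtered / baseline,
        p_value=hypergeom_tail(total, winners, picked, picked_winners),
    )


if __name__ == "__main__":
    # Numbers from the post: 1000 strategies, 300 profitable in 2024, 100 pass the test.
    # The winner counts among the 100 survivors are hypothetical.
    for survivors_in_profit in (30, 38, 45):
        r = evaluate_filter(1000, 300, 100, survivors_in_profit)
        print(f"{survivors_in_profit} winners: lift={r.lift:.2f}, p={r.p_value:.4f}")
```

A lift close to 1.0 with a large p-value matches the result reported above: the test did not select better strategies than a random pick would have.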

Theo Gottwald
8 months ago #288528
If you are not satisfied with the SQX tools, try the tools by Thomas Nickel. He writes:
The initial phase of SQX was really problematic; there were bugs everywhere. Things have improved considerably since then, but it was a long road.
I program in Java, and I also used to work in C and C++. Assembler was a while ago; it is hardly used nowadays.
Do you already know my three programs? The Monitortool installs and monitors SQX-generated EAs, the Toolbox generates workflows, and the Metrikanalyser contains an AI based on Weka.
These auxiliary tools help to analyze and install SQX robots. They are available as freeware.
I have also thought about selling the software or offering training to show others how to create profitable strategies.
There are many people who try this, but most of the time it does not work. Profitable strategies are hard to find; almost nobody has any that run really well.
The problem is often that many strategies are simply ‘curve-fitted’, that is, they only work on historical data and break down in practice.
I have been through this myself and found that such systems do not work.
You have to look at the entire workflow and simulate everything. That is why I developed the Toolbox, and since then it has mostly shown me that the workflows do not work.
So far I have found only one workflow that is reasonably good: the GBPJPY H1 workflow from the SQX forum.
I am still too new to comment on this. Thomas, however, has been using SQX for more than 10 years.
PS: The download link is in the PDF file.
**Mit besten Grüßen | With best regards | Cordiali saluti**
**Theo Gottwald**
*Leading Expert in SPR & Visual Automation*
**What's New in the Q4 Update?**
1️⃣ Open AI Vision: Empower your SPR with cutting-edge visual recognition. See the unseen.
2️⃣ Open AI TTS: Seamlessly transform text into natural, lifelike speech. Hear the difference.
3️⃣ GPT-4 with 128k Token Context Window by OpenAI: Dive into unparalleled depth in AI conversations. Think deeper.
4️⃣ DALLE-3 by OpenAI: Command revolutionary AI to generate stunning images. Imagine more.
5️⃣ ElevenLabs Text to Speech: Experience text-to-speech so lifelike, it speaks to you.
6️⃣ Stable Diffusion Local & Online via Automatic1111: Unlock creativity with both local and cloud-based image generation. Create anywhere.
7️⃣ GPT4All & LM Studio: Harness powerful offline AI capabilities. Your studio, smarter.
8️⃣ DeepL Translator with SPR Integration: Achieve real-time, accurate translations. Speak the world's language.
9️⃣ ChatGPT by OpenAI: Engage with your SPR through natural, dynamic conversations. Connect genuinely.
WHISPER by OpenAI: Convert voice to text with effortless precision. Listen, transcribe, act.
Mistral AI: Scale your AI integration with seamless efficiency. Elevate your AI journey.
Claude 3 by Anthropic: Experience next-level AI understanding and interactivity. Discover AI with a human touch.
**Address:**
Herrenstr. 11, 76706 Dettenheim, Germany
**Phone:**
Office: +49 (7247) 9851112
Mobile: +49 160 6688 222
Fax: +49 (7247) 9851113
**Email:**
[[email protected]](mailto:[email protected])
**Websites:**
[IT-Berater](http://www.it-berater.org) | [Smart Package](http://www.smart-package.com)

Theo Gottwald
8 months ago #288531
The initial phase of SQX was really problematic; there were bugs everywhere. In the meantime, things have improved considerably, but it’s been a long road.
I program in Java, but I also used to work in C and C++. Assembler was a while ago; it’s hardly used nowadays.
Do you already know my three programs? The monitor tool installs and monitors SQX-generated EAs, the toolbox generates workflows, and the metrics analyzer contains an AI based on Weka.
These auxiliary tools help to analyze and install SQX robots. They are available as freeware.
I have also thought about selling the software or offering training to show others how to create profitable strategies.
There are many people who try this, but most of the time it doesn’t work. Profitable strategies are hard to find; almost nobody has any that work really well.
The problem is often that many strategies are simply ‘curve-fitted’, that is, they only work on historical data and break down in practice.
I’ve been through this myself and realized that such systems don’t work.
You have to look at the entire workflow and simulate everything. That’s why I developed the toolbox, and since then it has usually shown me that the workflows don’t work.
So far I have only found one workflow that is reasonably good: the GBPJPY H1 workflow from the SQX forum.
rsantinikk
8 months ago #288550
Ok, thanks for the reply. I will study the documents you sent carefully. Let’s stay in touch!

Drunksingha
8 months ago #288553
rsantinikk
8 months ago #288558
Hi Drunksingha, thank you for your post.
I couldn’t find the topic I raised at your link. Is the link correct?
In my experience, Robustness tests such as Monte Carlo, Walk Forward Matrix, etc. are not able to select strategies that are more likely to be profitable in the future.
I’m asking traders whether they have any evidence that these tests work.

Theo Gottwald
8 months ago #288564
rsantinikk
8 months ago #288618
StrategyQuant does its job: it finds strategies that were profitable in the past, but most of them will not work on unseen data. To be profitable, we have to select the strategies that will work in the future. I haven’t yet found a tool that can do this. In my experience, classic Robustness tests don’t work.
I have started to read the papers by Thomas Nickel that you posted in this forum. He has developed a new tool to select strategies generated by StrategyQuant. The idea is pretty good, but I need time to test his method.
I’m also developing a method of my own; if it works I will let you know.
If anybody is interested in this topic, please share your ideas!

Theo Gottwald
8 months ago #288619
Jason
6 months ago #289003
I think you have to look at robustness tests as a tool you apply to strategies that have already passed a basic suite of backtests. The tool shows you characteristics of your system that you can then use when you evaluate it live.
For me, I divide my data into 3 chunks, and each chunk needs to provide a statistically meaningful set of trades to evaluate.
Typically that means you need more than 10 years of historical data. Let’s assume we have 2013 to 2017 as chunk 1, 2017 to 2021 as chunk 2 and 2021 to 2025 as chunk 3. You train your system on one half of chunk 2 and you evaluate it on the other half, so you are working exclusively with data from 2017 to 2021. The strategy has to pass a range of basic filters. What we’re assuming is that a strategy with a consistent profit factor (PF) of 1.75 will most likely have a PF close to 1.75 over this whole range of data.
This small bit of testing tells us that if we have a sufficiently large set of acceptable strategies, then somewhere in that set we hopefully have a smaller subset of strategies that are robust enough to trade. How do we find those? Well, from the perspective of chunk 2, chunks 1 and 3 are ‘unseen’ data. So let’s test our newfound strategies on chunk 1 and see what we get.
Intuitively, a robust strategy should trade well on chunk 1. I have found that anywhere from 10% to 50% of my strategies perform acceptably on chunk 1. So now we have at least the beginning of something.
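A minimal sketch of this three-chunk layout, assuming the data runs from the start of 2013 to the start of 2025 and that chunk boundaries are simply equal slices by calendar days. The helper name `split_range` is hypothetical; each interval includes its start date and excludes its end date.

```python
from datetime import date, timedelta


def split_range(start: date, end: date, parts: int) -> list[tuple[date, date]]:
    """Cut [start, end) into `parts` consecutive intervals of (almost) equal length."""
    total_days = (end - start).days
    bounds = [start + timedelta(days=total_days * i // parts) for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


# Three chunks over the whole history.
chunk1, chunk2, chunk3 = split_range(date(2013, 1, 1), date(2025, 1, 1), 3)

# Chunk 2 is the only data used for building: half for training, half for evaluation.
train, validate = split_range(chunk2[0], chunk2[1], 2)

# Order in which the data is used, as described in the post.
stages = [
    ("build / train", train),
    ("basic filters", validate),
    ("first unseen test", chunk1),
    ("final unseen test", chunk3),  # touched only after every other test has passed
]

if __name__ == "__main__":
    for name, (begin, finish) in stages:
        print(f"{name:18s} {begin} to {finish}")
```

Keeping chunk 3 out of every earlier step is the point of the design: once a strategy has been tuned after looking at it, that chunk is no longer unseen.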
Next, test the passing strategies on a suite of different timeframes and securities. Be careful to test your strategy stringently, but not so strictly that it has no chance to pass. This requires a fair amount of discretion and knowledge of the market. For instance, don’t choose the VIX as your second security for validating a strategy on ES. Exercise the same caution with timeframes. If you’re evaluating a 1-hour strategy, don’t validate it on the 1m timeframe. There is a LOT more noise on the 1m, and strategies that trade down there need different characteristics. Personally, I don’t think it’s possible for retail traders to trade on that short a timeframe. I wouldn’t even bother mining strategies below the 15m timeframe.
If you get a strategy that passes these tests, then have a look at the different Monte Carlo tests available. Use them to understand how your strategy should perform when faced with different market realities. Check how it performs with a range of slippages or with skipped trades. Check the different equity curves that the parameters test shows.
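To make the idea of "a range of slippages, or skipped trades" concrete, here is a small self-contained sketch. It is not the StrategyQuant implementation: the function names, the per-trade P&L list and all parameter values are hypothetical, and slippage is modelled crudely as a fixed cost subtracted from every trade.

```python
import random


def max_drawdown(pnls: list[float]) -> float:
    """Largest peak-to-valley drop of the cumulative equity curve."""
    equity = peak = worst = 0.0
    for pnl in pnls:
        equity += pnl
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def monte_carlo(trades: list[float], runs: int = 1000, skip_prob: float = 0.10,
                slippage: float = 0.0, seed: int = 1) -> list[tuple[float, float]]:
    """Re-run the trade list `runs` times, randomly skipping trades and
    charging a fixed slippage per trade. Returns (net profit, max drawdown) pairs."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(runs):
        kept = [pnl - slippage for pnl in trades if rng.random() >= skip_prob]
        outcomes.append((sum(kept), max_drawdown(kept)))
    return outcomes


def percentile(values: list[float], q: float) -> float:
    """Simple nearest-rank style percentile, q between 0.0 and 1.0."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


if __name__ == "__main__":
    # Hypothetical per-trade results in account currency.
    trades = [120.0, -60.0, 80.0, -60.0, 200.0, -60.0, 45.0, 150.0, -60.0, 90.0]
    for slip in (0.0, 5.0, 10.0):
        runs = monte_carlo(trades, skip_prob=0.10, slippage=slip)
        profits = [p for p, _ in runs]
        drawdowns = [d for _, d in runs]
        print(f"slippage {slip:4.1f}: 5th pct profit {percentile(profits, 0.05):7.1f}, "
              f"95th pct drawdown {percentile(drawdowns, 0.95):6.1f}")
```

Reading the pessimistic percentiles rather than the average gives a feel for what the strategy might look like live on a bad run.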
IF you get it to pass all of those, THEN you can evaluate your strategy on the last chunk of data. IF your strategy performs well on the remaining chunk of data, then maybe you have something. There are still some tests to perform, but it’s possible you’re now working with a strategy that is robust enough to trade live.
I have found that a lot of people, including myself, are a bit unrealistic about how a strategy should perform on live data. We want to make money immediately, but if we read our backtest carefully we realize that’s not really what happens. A good strategy might have a stagnation period of 180 days or more (that’s a long time to evaluate if you’re in that ‘hole’). I’ve found that some perfectly acceptable strategies trade more or less flat and have a couple of good wins every quarter or so. That means you might see 3 or 4 losses in a row while the trading still falls well within a reasonable confidence band of what your strategy should do.
So here’s what I think works well. Evaluate your backtest and your robustness tests honestly. Don’t pass a strategy that you couldn’t personally trade. For me that means win rate of 50%, ret/dd > 8, PF > 1.5, risk < 2% of account. Have a look at what the backtest is telling you. If it’s saying you could have 6 losses in a row, assume those losses could be maximum losses. Acknowledge that 6 losses in a row is a statistic, not a real maximum. You could have 7, or 8. What should you do if you have 7 in your evaluation period? Should you pull the plug? Have a plan. The market is unpredictable.
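The personal thresholds above can be written down as an explicit filter so that they are applied the same way every time. The sketch below is an assumption-laden illustration: the names are hypothetical, "win rate of 50%" is read as "at least 50%", ret/dd is taken as net profit divided by maximum drawdown, and the input is a plain list of per-trade P&L values.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    win_rate: float
    profit_factor: float
    ret_dd: float
    max_consec_losses: int


def summarize(trades: list[float]) -> Stats:
    """Compute the basic backtest statistics used by the filter."""
    if not trades:
        raise ValueError("no trades to evaluate")
    gross_profit = sum(t for t in trades if t > 0)
    gross_loss = -sum(t for t in trades if t < 0)
    equity = peak = max_dd = 0.0
    streak = worst_streak = 0
    for t in trades:
        equity += t
        peak = max(peak, equity)
        max_dd = max(max_dd, peak - equity)
        streak = streak + 1 if t < 0 else 0  # count consecutive losing trades
        worst_streak = max(worst_streak, streak)
    return Stats(
        win_rate=sum(1 for t in trades if t > 0) / len(trades),
        profit_factor=gross_profit / gross_loss if gross_loss else float("inf"),
        ret_dd=equity / max_dd if max_dd else float("inf"),
        max_consec_losses=worst_streak,
    )


def acceptable(stats: Stats, risk_per_trade: float) -> bool:
    """Thresholds quoted in the post: win rate 50%, ret/dd > 8, PF > 1.5, risk < 2%."""
    return (
        stats.win_rate >= 0.50
        and stats.ret_dd > 8
        and stats.profit_factor > 1.5
        and risk_per_trade < 0.02
    )


if __name__ == "__main__":
    trades = [100.0, -50.0, 100.0, 100.0, -50.0, 300.0, 200.0, 100.0, 100.0]  # hypothetical
    stats = summarize(trades)
    print(stats, acceptable(stats, risk_per_trade=0.01))
```

The backtest's longest losing streak (`max_consec_losses`) is reported but deliberately not used as a hard limit, since the post treats it as a statistic that live trading can exceed.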
Try to evaluate different parts of your equity curve with knowledge of what the market was like back then. How did your strategy do during the COVID crash? Maybe you could keep one set of statistics for ‘normal’ market trading and a second set for exceptional market conditions. What kind of market are you in now? Is it highly volatile? Highly volatile markets will hit your stop loss a lot more often, and you need to know whether that’s acceptable or whether your strategy is breaking down.
Realize that 6 losses in a row * 2% might be a lot of money for you. Can you stomach that? If not, you need to adjust your systems to get your risk into a band you’re comfortable with. That might mean you can’t trade some securities. I’ve found that a lot of retail traders scrape together $5000 and head off to trade ES, not realizing that ES might need $2000 to $5000 of stop loss to move around in for 1 trade, even on a shorter timeframe. A lot of algos benefit from survivorship bias, and to take advantage of this you want a wider stop loss than the math suggests. Recognize the limitations of your equity and don’t try to trade securities you have no business trading. For me, I would not consider trading minis without at least $100k USD, because of personal experience. If you don’t have that, consider the micros.
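The arithmetic behind these numbers is easy to check. The sketch below assumes each loss costs a fixed fraction of the current account balance (so losses compound slightly) and uses hypothetical helper names; the account sizes and stop-loss amounts are the ones mentioned in the post.

```python
def streak_drawdown(risk_per_trade: float, losses: int) -> float:
    """Fraction of the account lost after `losses` consecutive full-risk losses,
    with each loss sized as a fraction of the remaining balance."""
    return 1.0 - (1.0 - risk_per_trade) ** losses


def min_account_for_stop(stop_loss_usd: float, risk_per_trade: float) -> float:
    """Smallest account for which one trade with this stop stays within the risk limit."""
    return stop_loss_usd / risk_per_trade


if __name__ == "__main__":
    # Six, seven and eight losses in a row at 2% risk per trade.
    for n in (6, 7, 8):
        print(f"{n} losses at 2%: {streak_drawdown(0.02, n):.1%} of the account")

    # A $2000 stop on a $5000 account risks 40% of the account on one trade.
    print(f"risk per trade: {2000 / 5000:.0%}")

    # Account size at which a $2000 stop is 2% of the balance.
    print(f"account needed: ${min_account_for_stop(2000, 0.02):,.0f}")
```

Under a 2% rule, a $2000 stop works out to an account of about $100,000, which is in line with the figure given above for trading the minis.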
I have been building strategies on the side and practicing discretionary trading for 4 years now. I support my family with my discretionary trading. I’ve only been using SQX for a few months, but I’ve learned a lot. In the past I would switch off some of the strategies I was evaluating because they lost more money than I wanted. Occasionally I would go back and find that those same strategies had powered out of the drawdown and made a fair amount of money later. If I had honestly evaluated and understood what the backtest was telling me, I might have allowed them to trade longer, and then I would have been in profit.
The last bit of evaluation, the part with live data and real money, is always the most difficult. It’s important to build a picture, or a map, of what your strategy should do within a given period of time. Don’t be afraid to pull profit early, but let the system trade and tell you where it would have gotten out. I’ve found that, as a discretionary trader, I’m much better at intuitively evaluating the market and pulling profit early, but I have to look at the equity curve the system would have generated, not the equity curve I got.
I do this as part of the evaluation period so that I have a little more confidence that I can preserve my equity if I’m evaluating a losing strategy. Once I have confidence in the system, I let it go, because discretionary trading is time-consuming and the point of running algos is, for me, to free up more time. So if you’re having trouble finding robust algos, maybe it’s not the algo testing you need to work on but your interpretation of the backtest data. I heard a good algo trader on a podcast say that trading algos is not an emotion-free experience; the only real difference between trading algos and discretionary trading is that you have more confidence in your backtested data. I think what he really meant was that you have more data at your disposal. But that means we have to use it effectively when we’re evaluating live systems.
I hope this helps.
niclearns
1 week ago #290792
At the end of the day, robustness only gives you a better chance of having a successful strategy.
Things to think about:
1. What do I want to achieve, and which tests fit that goal?
Want to find stable parameters? Maybe sequential optimisation, SPP or even walk-forward.
Want to find the worst drawdown? Maybe Monte Carlo.
Want to find patterns and anomalies in your backtest data? Maybe use what-if simulations.
Want to see how your backtest performs on unknown data? Maybe backtest on additional markets.
**The point is: everything is a maybe in trading. It is our job to put the odds in our favour and monitor the performance.
2. Once you have your backtest data and robustness tests, ask yourself this: does your backtest seem normal?
- Are there only buy or only sell trades, i.e. 0% trade symmetry?
- What are the biggest losses and wins?
- Am I expecting a lot of trades or very few?
**No one mentions this, but from one perspective your own human filter is a robustness test in itself.
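The questions in point 2 can be turned into a quick automatic check before the human look. This is only a sketch under stated assumptions: trades are given as (direction, P&L) pairs, the symmetry measure used here (smaller side count divided by larger side count) is a stand-in and not necessarily how SQX computes its symmetry figure, and the expected trade-count range is whatever you consider normal for the strategy.

```python
def sanity_report(trades: list[tuple[str, float]], min_trades: int, max_trades: int) -> dict:
    """Summarize direction balance, extreme trades and trade count, with warnings."""
    if not trades:
        raise ValueError("no trades to check")
    longs = sum(1 for side, _ in trades if side == "long")
    shorts = len(trades) - longs
    # 0.0 = only one direction traded, 1.0 = equal number of longs and shorts.
    symmetry = min(longs, shorts) / max(longs, shorts)
    pnls = [pnl for _, pnl in trades]

    warnings = []
    if symmetry == 0.0:
        warnings.append("only one trade direction")
    if not min_trades <= len(trades) <= max_trades:
        warnings.append("trade count outside expected range")

    return {
        "trades": len(trades),
        "symmetry": symmetry,
        "biggest_win": max(pnls),
        "biggest_loss": min(pnls),
        "warnings": warnings,
    }


if __name__ == "__main__":
    # Hypothetical backtest with long trades only.
    backtest = [("long", 100.0), ("long", -50.0), ("long", 30.0)]
    print(sanity_report(backtest, min_trades=100, max_trades=2000))
```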
hope that helps. 🙂