
Portfolio money management and multi-armed bandits

3 replies

mikeyc


7 years ago #115126

During my regular reads on machine learning and optimisation I came across a problem known as the multi-armed bandit problem. Imagine a row of slot machines (think Vegas), each with a different payout probability and distribution.

 

In this problem, we don't know in advance which machines are the best ones to play to maximise our returns.
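To make the setup concrete, here's a minimal epsilon-greedy sketch (Python rather than MQ4, and the three payout probabilities are made-up numbers that the player isn't allowed to see):

```python
import random

# three slot machines with hidden win probabilities (made-up numbers)
true_payout_prob = [0.20, 0.45, 0.35]
pulls = [0, 0, 0]       # how many times each machine has been played
wins = [0.0, 0.0, 0.0]  # total reward collected from each machine
epsilon = 0.1           # fraction of plays spent exploring at random

for t in range(10_000):
    if random.random() < epsilon or 0 in pulls:
        arm = random.randrange(3)  # explore: try a random machine
    else:
        # exploit: play the machine with the best observed win rate
        arm = max(range(3), key=lambda i: wins[i] / pulls[i])
    reward = 1.0 if random.random() < true_payout_prob[arm] else 0.0
    pulls[arm] += 1
    wins[arm] += reward

print(pulls)  # almost all pulls end up on machine 1, the best payer
```

The player starts out knowing nothing, converges on the best machine, and still occasionally re-checks the others in case the payouts change.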

 

Some discussion here:

 

https://dataorigami.net/blogs/napkin-folding/79031811-multi-armed-bandits

https://msdn.microsoft.com/magazine/mt703439 (for C# coders like myself)

http://iosband.github.io/2015/07/19/Efficient-experimentation-and-multi-armed-bandits.html

 

 

 

Seems to me this is quite similar to choosing which SQ strategies to use in a portfolio. Which ones should we put our money into as they produce profits and losses over time?

 

I'm thinking of making a shared library for MQ4 that handles the money management side of the portfolio.
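As a very rough idea of what the core of such a library could do, here's an illustrative Python sketch (the strategy names and the softmax weighting are my own assumptions, not anything SQ or MQ4 provides):

```python
import math

strategies = ["SQ_Breakout", "SQ_MeanRev", "SQ_Trend"]  # hypothetical names
closed_trades = {s: [] for s in strategies}             # per-strategy results in R multiples

def record_trade(strategy, r_multiple):
    """Feed each closed trade back to the allocator as a 'payout'."""
    closed_trades[strategy].append(r_multiple)

def allocation_weights(temperature=0.5):
    """Softmax over average trade result: better strategies get more of the
    risk budget, but every strategy keeps a non-zero share so it can still
    prove itself (the exploration part of the bandit)."""
    avg = {s: (sum(t) / len(t) if t else 0.0) for s, t in closed_trades.items()}
    expo = {s: math.exp(a / temperature) for s, a in avg.items()}
    total = sum(expo.values())
    return {s: e / total for s, e in expo.items()}

record_trade("SQ_Breakout", 1.5)
record_trade("SQ_MeanRev", -1.0)
print(allocation_weights())  # the breakout strategy now gets the largest slice
```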

 

Any thoughts?

 


Patrick


7 years ago #136963

Increase trade size for a strategy that is earning and decrease trade size for a losing strategy.

 

One option is to risk a fixed amount ($100 on a $10k account = 1%) and, if the strategy is in profit (total profit/loss > 0), increase the risk for the next trade to $100 + x% of the total profit. If the strategy is losing (total profit/loss < 0), then decrease the risk to $100 - x% of the total loss.

So the MQ4 code has to check the trade history, count up the total profit/loss, and then change the risk of the strategy.

 

It just needs to be programmed.
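If I've understood the rule, the sizing maths would look something like this (Python for illustration, numbers as in your example; the minimum-risk floor is my own addition so the risk can never go to zero or negative):

```python
def next_trade_risk(closed_pl, base_risk=100.0, x_pct=0.10, min_risk=10.0):
    """base_risk = 1% of a $10k account; x_pct is the share of the accumulated
    profit/loss added to or subtracted from the next trade's risk."""
    total_pl = sum(closed_pl)                    # total profit/loss so far
    if total_pl > 0:
        return base_risk + x_pct * total_pl      # winning: risk more
    if total_pl < 0:
        # losing: risk less, but never below the floor (my assumption)
        return max(base_risk - x_pct * abs(total_pl), min_risk)
    return base_risk

# strategy is up $600 overall -> risk $160 on the next trade
print(next_trade_risk([250.0, -150.0, 500.0]))
```

The MQ4 version would build the closed profit/loss list from the order history (e.g. filtered by magic number) rather than from a Python list.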


Threshold


7 years ago #136970

This is the same thing as trying to “equity trade” a single strategy. To my knowledge and research, “trading” a strategy’s equity curve has largely been debunked.

If a strategy in the portfolio turns into a loser, with quantitative proof that it is performing outside of its historical accuracy, why not just cut it off (perhaps keep watching it on demo to see if it returns to 'normal')? If it's performing within its historical accuracy and you turn down its risk, that's exactly when it has a chance to break out of a drawdown, but now that its risk has been turned down, it's trapped in the DD twice as long. This is pretty damaging over the long term: higher risk into DDs and lower risk to break out of them. It all falls within equity-curve trading. Skipping trades (quantitatively) based on the historical equity curve also falls within equity-curve trading. Perry Kaufman did massive research into all this and concluded it was useless. However, I'm sure some banks and hedge funds do use it, thanks to teams of genius quants who spend thousands of hours testing and verifying it, but -

To me there isn't a gray area, and I try to keep things simple, which is the most robust measure I can take: a strategy is either within its historical testing and it's fine (even in a DD), or it's not and it's done.
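Just to illustrate that binary rule (a rough sketch, assuming the yardstick is the backtest's maximum drawdown and using an arbitrary 1.2x tolerance; any other historical metric could be checked the same way):

```python
def still_within_history(live_equity, backtest_max_dd, tolerance=1.2):
    """True while the live drawdown stays inside what the backtest showed;
    False means the strategy is 'done' and gets cut (or parked on demo)."""
    peak, live_max_dd = live_equity[0], 0.0
    for eq in live_equity:
        peak = max(peak, eq)
        live_max_dd = max(live_max_dd, peak - eq)  # worst live drawdown so far
    return live_max_dd <= tolerance * backtest_max_dd

# live drawdown of $450 vs. a backtest max drawdown of $300 -> cut it
print(still_within_history([10_000, 10_200, 9_750], backtest_max_dd=300.0))
```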

For running 25+ simultaneous EAs at low 0.25%-0.5% risk, I think the "slot machine" algo would be a nice way to turn off the losers completely, or to manage their risk size.
If Marc eventually does the SQ Trader, I think that type of stuff would be ideal for it.


mikeyc


7 years ago #136980

A really good visual and interactive demonstration is here:

 

https://e76d6ebf22ef8d7e079810f3d1f82ba1e5f145d5.googledrive.com/host/0B2GQktu-wcTiWDB2R2t2a2tMUG8/

 

As you can see, the algorithm explores the "bandits" and learns which ones give the best payouts. You can drag the mouse in each bandit's payout-probability box to change its payout while the simulation is running, and the algorithm learns to adjust to the new payouts.

 

I can see a similar application for automatically dropping poorly performing strategies from a portfolio and bringing new ones in during live trading. The key with this approach is that it maximises reward whilst minimising "regret" for choosing bad outcomes, and it does this very quickly.
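For anyone curious what that might look like in code, here's a tiny Thompson-sampling sketch (one common bandit algorithm; the strategy names are invented and a live version would be fed actual closed-trade results):

```python
import random

# one Beta(alpha, beta) belief about each strategy's win rate,
# starting from Beta(1, 1), i.e. "no opinion yet"
record = {"StrategyA": [1, 1], "StrategyB": [1, 1], "StrategyC": [1, 1]}

def pick_strategy():
    """Sample a plausible win rate for each strategy and trade the highest
    draw; consistently poor strategies get picked less and less often,
    which is what keeps the 'regret' low."""
    draws = {s: random.betavariate(a, b) for s, (a, b) in record.items()}
    return max(draws, key=draws.get)

def record_result(strategy, won):
    a, b = record[strategy]
    record[strategy] = [a + 1, b] if won else [a, b + 1]
```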
