Thousands of simulations to learn to sail an AC75? We only needed one.
Written by Dr. Oliver Watkins
In recent days, my LinkedIn feed has been full of justifiably delighted Quantum Black engineers talking about having trained thousands of bots to learn to sail the Emirates Team New Zealand (ETNZ) boat for the America’s Cup. However, here at Canopy Simulations we think that if you need thousands of bots, this is less a sign of success and more a sign that you may just have chosen the wrong method. The following article attempts to explain why collocation is a better choice.
This is an article about the 36th America’s Cup, so it would be churlish not to congratulate the victors, Emirates Team New Zealand, and the engineers with whom they worked. They developed a brilliant class rule, and a superb boat to go with it. However, the aim of this article is to suggest that the opposition will have been doing some things better. Even the most successful F1 teams never stop keeping close tabs on the opposition to spot the latest opportunities to improve, and any America’s Cup team that does stop will soon find itself being overtaken.
As a company specializing in performance simulation, we believe that our method of dynamic optimization by collocation is the best method of offline simulation. It has become the dominant technology in motorsport, and for such a powerful technology we think this should be just the start. As we announced in July 2019, we have been working with INEOS Team UK (ITUK) for the 36th America’s Cup. Obviously, we’re not going to tell you anything about what we did with ITUK. Instead, we’re going to discuss the general advantages and disadvantages of Dynamic Optimization by Collocation with reference to the well-known trade-offs and engineering challenges of the AC75. We can also draw parallels with what we do in motorsport.
It’s a very small world; Emirates Team New Zealand’s technical director, Dan Bernasconi, was one of my predecessors as Head of Simulation Development at McLaren Racing. ETNZ have worked closely with Quantum Black (a McKinsey subsidiary) for whom my wife used to work. One of the engineers from my team at McLaren also worked in-house with ETNZ. In such a tangled web, I will have to attempt to tread carefully!
Machine Learning vs. Optimization
Effectively, the competition here is between Dynamic Optimization by Collocation and Machine Learning Control by Reinforcement Learning. Machine learning methods get out of the blocks with a big head start here; they are cool, fashionable and zeitgeisty. In summary these two methods work as follows.
Machine Learning Control by Reinforcement Learning
First, come up with a suitable form for a controller; this could range from simple feedback control to a controller with some memory and sight of the future. We then enter a loop, broadly consisting of:
- Choose a large number of controllers at random.
- Let them attempt to control the boat.
- Choose the ones which do best and use some intelligent method of iteration to create a new generation of controllers.
We keep going round and round this loop until the “winners” are making a really good job of the control problem. The course steered by these winning controllers then comprises the results of the simulation run.
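The loop above can be sketched in a few lines of Python. This is a minimal illustration only, not any team’s actual method: the toy heading-hold “boat”, the two-gain feedback controller and the names `rollout_cost` and `evolve_controllers` are all assumptions made for the example, and the update rule is a simple cross-entropy-style search standing in for whatever “intelligent method of iteration” a real system would use.

```python
import random


def rollout_cost(gains, steps=200, dt=0.05):
    """Simulate a toy heading-hold task; return the accumulated cost (lower is better)."""
    kp, kd = gains
    error, rate = 1.0, 0.0                      # start 1 rad off course, not turning
    cost = 0.0
    for _ in range(steps):
        # Hypothetical controller form: saturated proportional-derivative feedback.
        rudder = max(-1.0, min(1.0, -kp * error - kd * rate))
        rate += dt * (rudder - 0.5 * rate)      # toy yaw dynamics with damping
        error += dt * rate
        cost += dt * (error ** 2 + 0.01 * rudder ** 2)
    return cost


def evolve_controllers(generations=15, population=40, elite=8, seed=0):
    """Cross-entropy-style search over the two controller gains."""
    rng = random.Random(seed)
    mean, spread = [0.0, 0.0], [2.0, 2.0]
    best, best_cost, history = None, float("inf"), []
    for _ in range(generations):
        # 1. Choose a large number of controllers at random.
        candidates = [[rng.gauss(m, s) for m, s in zip(mean, spread)]
                      for _ in range(population)]
        # 2. Let them attempt to control the boat, and rank them.
        ranked = sorted(candidates, key=rollout_cost)
        # 3. Build the next generation around the ones which did best.
        winners = ranked[:elite]
        mean = [sum(w[i] for w in winners) / elite for i in range(2)]
        spread = [max(0.05, (sum((w[i] - mean[i]) ** 2 for w in winners) / elite) ** 0.5)
                  for i in range(2)]
        cost = rollout_cost(ranked[0])
        if cost < best_cost:
            best, best_cost = ranked[0], cost
        history.append(best_cost)               # best cost seen so far
    return best, history


if __name__ == "__main__":
    gains, history = evolve_controllers()
    print("best gains:", gains)
    print("best cost per generation:", [round(c, 3) for c in history])
```

Even this small run evaluates 600 candidate controllers, each with a full simulation, and nothing in the loop says how far the final winner is from the best achievable result.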
Dynamic Optimization by Collocation
Here we recast the simulation problem as a large non-linear optimization problem. We then use an optimization solver to find the set of controls for the boat which gets it from A to B as fast as possible. The accompanying set of states comprises the results of this simulation.
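To make the recasting concrete, here is a minimal sketch of trapezoidal collocation on a deliberately tiny problem. It is not our solver and not a boat model. We assume a unit mass moved from rest to rest, and we swap the minimum-time objective for a minimum-effort one (the integral of force squared) so that the optimization becomes an equality-constrained quadratic programme whose optimality conditions are a single linear system. A real sailing problem is non-linear and needs a proper non-linear programming solver. All function and variable names are hypothetical.

```python
POS, VEL, FRC = 0, 1, 2       # offsets of position, velocity and force within a node


def idx(node, kind):
    """Index of one unknown in the stacked vector of all node values."""
    return 3 * node + kind


def solve_linear(matrix, rhs):
    """Solve a dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            if factor:
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def collocate_rest_to_rest(n_intervals=20, duration=1.0, distance=1.0):
    """Trapezoidal collocation: states and controls at every node are all unknowns."""
    n, h = n_intervals, duration / n_intervals
    n_var = 3 * (n + 1)
    rows, targets = [], []

    def constrain(terms, value):
        row = [0.0] * n_var
        for index, coeff in terms:
            row[index] = coeff
        rows.append(row)
        targets.append(value)

    # Boundary conditions: start at rest at 0, finish at rest at `distance`.
    constrain([(idx(0, POS), 1.0)], 0.0)
    constrain([(idx(0, VEL), 1.0)], 0.0)
    constrain([(idx(n, POS), 1.0)], distance)
    constrain([(idx(n, VEL), 1.0)], 0.0)
    # Defect constraints: the dynamics are imposed between neighbouring nodes.
    for k in range(n):
        constrain([(idx(k + 1, POS), 1.0), (idx(k, POS), -1.0),
                   (idx(k, VEL), -h / 2), (idx(k + 1, VEL), -h / 2)], 0.0)
        constrain([(idx(k + 1, VEL), 1.0), (idx(k, VEL), -1.0),
                   (idx(k, FRC), -h / 2), (idx(k + 1, FRC), -h / 2)], 0.0)

    weights = [h / 2 if k in (0, n) else h for k in range(n + 1)]  # trapezoid rule
    size = n_var + len(rows)
    kkt = [[0.0] * size for _ in range(size)]
    for k in range(n + 1):
        kkt[idx(k, FRC)][idx(k, FRC)] = 2.0 * weights[k]  # Hessian of sum(w * u**2)
    for j, row in enumerate(rows):
        for i, coeff in enumerate(row):
            kkt[i][n_var + j] = coeff        # transpose of the constraint Jacobian
            kkt[n_var + j][i] = coeff        # constraint Jacobian
    solution = solve_linear(kkt, [0.0] * n_var + targets)

    force = [solution[idx(k, FRC)] for k in range(n + 1)]
    return {
        "position": [solution[idx(k, POS)] for k in range(n + 1)],
        "velocity": [solution[idx(k, VEL)] for k in range(n + 1)],
        "force": force,
        "cost": sum(w * u * u for w, u in zip(weights, force)),
    }
```

The continuous-time answer to this toy problem is known (force 6 - 12t, cost 12), and a single solve with 20 intervals should land close to it. The point is structural: one solve returns the whole trajectory of states and controls at once, with no training loop.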
We make no secret of the fact that we think the second of these is the superior approach, so why is this the case?
Advantages of Collocation
One Run, Optimal Results, Mathematically Guaranteed
Transforming a simulation problem to an optimization problem is not straightforward (although we’ve done all the hard work), but having done so, the results come with a set of very useful mathematical properties. Solving an optimization problem means satisfying some mathematical conditions (the KKT conditions, if you must know), and having satisfied these you effectively have a guarantee of optimality. No matter how long you run your machine learning algorithm, it will never outperform the results of a single collocation simulation, and you will never know how close to optimal you have got.
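For readers who do want to know, the KKT conditions can be written out for a generic problem. The notation here is ours for illustration and is not taken from any particular solver: z stacks the states and controls at every collocation point, f is the objective (course time), g collects the equality constraints (the discretized dynamics and boundary conditions), h collects the inequality limits, and λ and μ are the corresponding multipliers.

```latex
\begin{aligned}
&\min_{z}\; f(z) \quad \text{subject to} \quad g(z) = 0,\; h(z) \le 0, \\
&\mathcal{L}(z,\lambda,\mu) = f(z) + \lambda^{\top} g(z) + \mu^{\top} h(z), \\
&\nabla_z \mathcal{L}(z^*,\lambda^*,\mu^*) = 0, \qquad g(z^*) = 0, \qquad h(z^*) \le 0, \\
&\mu^* \ge 0, \qquad \mu_i^*\, h_i(z^*) = 0 \ \text{for every } i.
\end{aligned}
```

These are first-order conditions: a solver that converges reports a point satisfying them to within its tolerances, together with the multipliers λ* and μ*.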
This has some further implications. The McKinsey video describes the process whereby their engineers would use the results of offline simulation to advise their sailors in “Sailor-in-the-Loop” simulators (and on the water). There is quite a difference between saying to your sailors:
- You’re getting closer to our offline results, which seem to be a bit better than what you’re doing at the moment.
- You’re closing the gap between your current results and the best that can possibly be achieved.
The Top Down View of Everything
The trade-offs required to “fly” an AC boat are complex, not only in terms of finding the best foil arm cant angles, foil flap angles, rudder angles, rig tension and sail trim to go as fast as possible in a straight line, but also in terms of the trades presented in executing manoeuvres. When entering a tack, should one turn away from the wind to build speed, but have to turn through a bigger angle, or should one sail close to the wind to minimize turn angle? (The answer is that it depends on your boat and the wind strength.)
The wonderful thing about a collocation solver is that it has a view of every single trade-off between controls and trajectory at every point around the course all at the same time. Should we turn further away from the wind than the simulation does to carry more speed? In a word, no. You shouldn’t because you have a mathematical guarantee of optimality that says you can’t do better.
The first America’s Cup collocation simulation we ever ran was of a single mark-rounding. Getting this result out was one of the unquestioned highs of my professional career; not least because the solver which generated it was identical to the one being used by our motorsport clients to simulate F1 cars.
Having a solver which is totally agnostic to the model it is presented with has some pretty big implications, all of which centre around whether simulated performance is down to the boat being simulated, or the combination of the boat and controller.
- Consider the situation where we have two components – foil flaps, for instance – we wish to compare in simulation. If we use our machine learned control and “Flap A” outperforms “Flap B”, is this because it is faster, or because the boat/controller combination works better? We may recognize this issue, and so retrain our controllers for each foil flap and now “Flap B” seems better. This, however, just shuffles the problem along a bit – perhaps the training process just worked better for “Flap B”? Alternatively, we could just simulate the two flaps with a collocation solver and know, for certain, the ultimate performance of each option.
- Competitor analysis. When every competitor lifts their boat into the water for the first time, the photographers of every other team will get photos from every possible angle and the engineers will pore over the key design concepts. CFD models of these concepts can then be swiftly created and the team running a collocation simulation will know, with a single simulation, the ultimate performance for the competitor boat (something said competitor may not even know themselves). By contrast, any learning-based approach will have to completely re-learn how to sail the boat and may never find out its innermost secrets.
- Fine-grained parameter sweeps are a key ingredient of simulation. Every simulation engineer at an F1 team will have performed innumerable sweeps of rear ride height in 1mm steps, to find the best setup for the car at a particular track. Similarly, an AC team may wish to (for instance) perform a fine sweep of the foil anhedral angle. A single controller is unlikely to be appropriate for the entire range of foil designs, but is it really worth retraining the controller if the course completion time changes by only 0.01s? A collocation-based simulation, on the other hand, will simply resolve the course completion time and the changes in control style required along the way.
A casual observer turning on the America’s Cup for the first time will no doubt wonder what on earth all the big lads on the hand bikes are doing. These are “grinders”: athletes, often drawn from the top end of elite sport, delivering all the hydraulic power with which the control surfaces of the boat are moved. At 50kts, moving the foil arms against the flow of water can require vast amounts of power; a naïve controller can easily start signing cheques the boat hydraulics and grinders cannot cash.
Meanwhile, even if the grinders have the oomph to get through one manoeuvre with such high hydraulic demands, the races are 25 minutes long; asking a human to put out more than a few hundred watts of power for this long just isn’t feasible.
This power and energy limitation looks a lot like the total energy and battery heating limitations of Formula E (in which almost every team uses our collocation simulations), where we can deal with these kinds of constraints by putting a limit on overall energy consumption, and on instantaneous power. This means not only that the results of simulations respect these limits, but that they also give us a sensitivity of overall performance to these limits. In FE that might be a number to feed into race strategy, in the AC that might be a number to stick on the wall of the gym as motivation!
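A toy calculation shows how a sensitivity to an energy limit falls out of the optimization. The model here is entirely hypothetical and chosen only because it has a closed-form answer: a course split into legs of length d, sailed at constant speed v, with the energy spent on a leg assumed to be k·d·v² for some made-up coefficient k. We minimize total time subject to a total energy budget and leave out the instantaneous power limit for brevity. The function name and the numbers are ours.

```python
def fastest_speeds(lengths, drag, energy_budget):
    """Closed-form optimum of: minimize sum(d / v) subject to sum(k * d * v**2) = E.

    Stationarity of the Lagrangian gives v**3 = 1 / (2 * multiplier * k) on each leg;
    substituting into the energy constraint fixes the multiplier.
    """
    s = sum(d * k ** (1.0 / 3.0) for d, k in zip(lengths, drag))
    multiplier = 0.5 * (s / energy_budget) ** 1.5   # time saved per extra unit of energy
    speeds = [(2.0 * multiplier * k) ** (-1.0 / 3.0) for k in drag]
    total_time = sum(d / v for d, v in zip(lengths, speeds))
    return speeds, total_time, multiplier


if __name__ == "__main__":
    legs, coeffs, budget = [1000.0, 1500.0], [2.0, 3.0], 5.0e5   # illustrative numbers
    speeds, time_a, lam = fastest_speeds(legs, coeffs, budget)
    _, time_b, _ = fastest_speeds(legs, coeffs, budget * 1.001)
    finite_difference = (time_b - time_a) / (0.001 * budget)
    print("speeds per leg:", speeds)
    print("multiplier:", lam, "finite-difference dT/dE:", finite_difference)
```

In this toy model the multiplier on the energy constraint equals minus the derivative of the optimal course time with respect to the budget, so the number for the gym wall comes out of the same solve as the trajectory.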
This can have some pretty big implications. If your manoeuvres involve lavish movements of the sails and dragging the boards through the water against the flow it is easy to hit power demands far beyond what is available. The way the boat is controlled will change fundamentally with the power available to control it. Furthermore, recognizing the limits on control authority will change the way you design the boat in the first place.
This is a second visit to the benefits of casting the control problem as an optimization. As well as a mathematical guarantee of optimality, the optimizer provides a variety of other mathematical by-products, including a set of numbers called the “Lagrange Multipliers”. With a bit of cunning we can turn these into local parameter sensitivities. These deserve an article of their own (which we wrote back in 2018).
Effectively, what these sensitivities enable us to do is say “if I change any parameter on my boat (or car), how does that change performance, and exactly where on the course do the performance changes originate?”. This is big news: if a change in foil lift means we can go a bit faster mid-tack, and then carry this advantage through the exit, we can localize all the benefit to the mid-tack point at which it accrues. This gives engineers a rich view of the performance landscape of their candidate designs and an understanding of how different performance concepts work far beyond the headline course completion times.
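One standard way to see where such local sensitivities come from is the envelope theorem, sketched here in our own illustrative notation rather than as a description of any specific implementation. Suppose the objective and the constraints are sums over collocation nodes k, p is a design parameter, and only equality constraints are shown (active limits can be treated in the same way, assuming the set of active limits does not change):

```latex
\begin{aligned}
J^*(p) &= \min_{z}\; \sum_{k} f_k(z, p) \quad \text{subject to} \quad g_k(z, p) = 0 \ \text{for every node } k, \\
\mathcal{L}(z, \lambda, p) &= \sum_{k} \left[ f_k(z, p) + \lambda_k^{\top} g_k(z, p) \right], \\
\frac{\mathrm{d} J^*}{\mathrm{d} p} &= \left. \frac{\partial \mathcal{L}}{\partial p} \right|_{z^*, \lambda^*}
 = \sum_{k} \underbrace{\left[ \frac{\partial f_k}{\partial p} + \lambda_k^{*\top} \frac{\partial g_k}{\partial p} \right]}_{\text{contribution from node } k}.
\end{aligned}
```

The total sensitivity is a sum of per-node terms, which is what lets a gain be attributed to the point on the course where it accrues.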
Speed and Hardware
One decent computer and between 2 and 20 minutes and we’re all done. No need for server farms or supercomputers.
So What Are The Downsides?
The preceding sections can hardly be claimed to be a balanced discussion of the advantages and disadvantages of collocation. We’re evangelical about it, so evangelism is what you get. However, we’re not blind to the downsides. A casual observer might very reasonably ask “if this is so blooming marvellous, how come I’ve never heard of it? All the while machine learning is all over the media”. Part of the reason is that if you run a collocation simulation, you get some results, which have been learned, by a machine… so the media are only going to report it one way.
Machine Learning is a wonderful technology for systems where we can’t write down a nice description of the dynamics. Chess, Go and classic 80s video games all fit into this category, and machine learning methods have been achieving some breath-taking results. However, for systems with known, deterministic physics we should be able to do better! If we could change the world in one way through our proselytizing of collocation, it would be that when we know the physics of a system we use performance simulation methods appropriate to the task, rather than reaching for the black box of machine learning.
Another reason collocation is not so well known, we believe, is that collocation is hard. We talk casually about recasting the problem as an optimization, but this belies some huge difficulties. A minority of F1 teams have truly got on top of this extremely thorny problem, and this difficulty most likely accounts for why it is not more widespread.
However, set against these downsides:
- We’ve already done much of the hard work. Our solver can and does hook up to any models we choose.
- The results are so valuable that it’s well worth it.
So there we have it; now we have made a fair and balanced discussion of the advantages and disadvantages. Only kidding…
If you would like to find out how to exploit our unique technology in your industry, do get in touch via firstname.lastname@example.org.