Dual Control and Active Learning

Find out about the innovative Dual Control for Exploitation and Exploration framework, which improves the adaptability of High-Levels of Automation systems in uncertain environments through active exploration and learning.

Driverless cars, unmanned aerial vehicles, healthcare robots looking after elderly and disabled people at home, and fully automated factories and warehouses are typical examples of High-Levels of Automation (HLA) systems enabled by Artificial Intelligence (AI) and other recent technologies.

When operating in an unknown or changing environment, an ideal feature of an HLA system is the ability to optimise its performance automatically, irrespective of variations or uncertainty in the environment. This is referred to as auto-optimisation control. That is, the HLA system is able to automatically acquire and maintain its optimal operating condition in terms of a defined metric or reward function (e.g. productivity, energy loss or efficiency) in an unknown environment, and to automatically track the optimal operating condition when the environment changes.

A flow diagram showing the key functional components of an HLA system and how they are connected to form a feedback loop for learning and decision making in uncertain environments.

To achieve this capability, it is important that the HLA system is able to learn about the environment efficiently and to understand the impact of environmental changes on its performance and behaviour. With the support of an EPSRC Established Career Fellowship, the concept of the Goal-Oriented Control System (GOCS) was proposed to realise the paradigm shift in control design demanded by HLA. The main recent progress in GOCS is the development of a Dual Control for Exploitation and Exploration (DCEE) framework, which actively explores the environment and so promotes active learning. This is based on the observation that a control action has a dual effect for a system operating in an unknown environment. That is, a control action not only alters the behaviour and state of the dynamic system, but also changes the system's interaction with the environment, and so generates informative data that changes its understanding of the surrounding world. It has been proven that DCEE achieves the optimal trade-off between exploitation (achieving a control task based on the current belief) and exploration (actively probing the environment).
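The dual effect described above can be illustrated with a toy one-dimensional search problem. This is only a minimal sketch under stated assumptions, not the formulation used in the references: the Gaussian belief, the position-dependent sensor model and all function and parameter names (`sensor_noise_variance`, `vantage_point`, `dual_cost`, `choose_action`) are hypothetical. The cost of a candidate move is the sum of an exploitation term (squared distance to the currently believed target location) and an exploration term (the belief variance predicted after measuring from the new position).

```python
def predicted_posterior_variance(prior_var, noise_var):
    """Variance of a scalar Gaussian belief after one direct noisy measurement."""
    return prior_var * noise_var / (prior_var + noise_var)


def sensor_noise_variance(position, vantage_point=0.0, base=0.1, growth=2.0):
    """Hypothetical sensor model: readings are least noisy at the vantage point."""
    return base + growth * (position - vantage_point) ** 2


def dual_cost(next_position, belief_mean, belief_var):
    """Return (total, exploitation, exploration) cost of moving to next_position."""
    # Exploitation: get close to where the target is currently believed to be.
    exploitation = (next_position - belief_mean) ** 2
    # Exploration: remaining uncertainty predicted after measuring from there.
    noise_var = sensor_noise_variance(next_position)
    exploration = predicted_posterior_variance(belief_var, noise_var)
    return exploitation + exploration, exploitation, exploration


def choose_action(position, belief_mean, belief_var, actions):
    """Pick the candidate move with the smallest combined cost."""
    return min(actions, key=lambda u: dual_cost(position + u, belief_mean, belief_var)[0])


if __name__ == "__main__":
    moves = [-1.0, 0.0, 1.0]
    # Very uncertain belief: the chosen move heads for better measurements.
    print(choose_action(2.0, 3.0, 100.0, moves))
    # Confident belief: the chosen move heads for the believed target.
    print(choose_action(2.0, 3.0, 0.01, moves))
```

In this sketch, the same cost function produces probing behaviour when the belief variance is large and goal-seeking behaviour when it is small, without a separately tuned exploration schedule. The specific numbers are arbitrary and chosen only to make the two regimes visible.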

DCEE has been applied to a number of engineering applications with promising performance, such as autonomous emergency braking on unknown surface conditions, wave energy generation under changing ocean conditions, and UAV tracking in an unknown environment. An initial analysis of the stability and convergence of DCEE has been provided. It is also argued that DCEE is better suited than reinforcement learning to engineering systems operating in uncertain environments, such as HLA systems.

One interesting recent discovery is that the concept of DCEE is closely related to the Free Energy Principle and Active Inference, which are used in neuroscience to explain and understand intelligent behaviour in humans and animals. The Free Energy Principle was proposed by Karl Friston as an attempt to unify existing brain theories, and it offers a first-principles account of sentient behaviour. Surprisingly, although Active Inference and DCEE were developed from completely different backgrounds and with completely different motivations, they arrive at the same place.

References

W-H Chen, C Rhodes and C Liu (2021). Dual control for exploration and exploitation in autonomous search. Automatica. Vol.133, No.11, 109851. https://doi.org/10.48550/arXiv.2012.06276

W-H Chen (2022). Perspective View of Autonomous Control in Unknown Environment: Dual Control for Exploitation and Exploration vs Reinforcement Learning. Neurocomputing. Vol.497, pp.50-63. https://doi.org/10.1016/j.neucom.2022.04.131

C Rhodes, C Liu and W-H Chen (2022). Autonomous source term estimation in unknown environments: From a dual control concept to UAV deployment. IEEE Robotics and Automation Letters. Vol.7, No.2, pp.2274-2281. https://doi.org/10.1109/LRA.2022.3143890

Z Li, W-H Chen, J Yang and C Liu (2024). Cooperative Active Learning based Dual Control for Exploration and Exploitation in Autonomous Search. IEEE Transactions on Neural Networks and Learning Systems. https://doi.org/10.1109/TNNLS.2024.3349467