Underwater Shaking Table Time Delay Compensation and Control Optimization Based on Reinforcement Learning

汤继川; 李宁; 李忠献

doi:10.6052/j.issn.1000-4750.2022.02.0145

Abstract

When conducting underwater earthquake-simulation shaking table tests, the hydrodynamic effects acting on the multi-directional motion of the table are complex and difficult to compensate, so it is important to properly account for the table's actual loading capacity and accuracy before testing. This paper investigates how water depth, excitation frequency, and direction of motion affect the control performance of the water-shaking table system. A transfer function model of the water-shaking table system is first identified from measured data, and a data-driven hybrid control method is then proposed that combines model-based feedforward compensation with reinforcement learning (RL). Following the deep deterministic policy gradient (DDPG) algorithm, the Actor-Critic networks are trained offline on the error data of the displacement commands and are then used to correct the model-based compensation commands in real time. Fifty underwater shaking table test cases covering different water depths, excitation frequencies, and directions of table motion were carried out to validate the method and to evaluate its performance against feedforward compensation. The results show that control performance degrades as water depth and excitation frequency increase, and that water depth has a greater influence on the vertical motion of the table. Compared with the model-based method, under the most unfavorable condition for the facility (vertical motion at a water depth of 2 m), the proposed control method improves the performance indicators J₁ and J₂ by 6.54% and 7.52%, respectively. The proposed method provides excellent time-delay compensation when the nonlinear dynamics of the water-shaking table system are taken into account, and it is a broadband compensation method.
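As a rough illustration of the hybrid structure described in the abstract, the sketch below adds a correction term to a model-based feedforward command. It is not the authors' implementation. The pure-delay plant, the linear-extrapolation compensator, the delay values, the RMS error measure, and the hand-written `stand_in_policy` (which only stands in for a DDPG-trained Actor network) are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_delay(command, delay_steps):
    """Hypothetical plant: a pure transport delay of `delay_steps` samples."""
    response = np.zeros_like(command)
    response[delay_steps:] = command[:len(command) - delay_steps]
    return response

def feedforward_compensate(reference, delay_steps):
    """Model-based compensation (assumed form): linearly extrapolate the
    reference ahead by the modelled delay."""
    slope = np.gradient(reference)  # finite-difference slope per sample
    return reference + delay_steps * slope

def hybrid_command(reference, model_delay_steps, correction_policy):
    """Feedforward command plus a correction supplied by a policy."""
    base = feedforward_compensate(reference, model_delay_steps)
    return base + correction_policy(reference, base)

def rms_error(reference, response):
    """Root-mean-square tracking error (not the paper's J1 or J2)."""
    return float(np.sqrt(np.mean((reference - response) ** 2)))

def zero_policy(reference, base):
    """No correction: reduces the hybrid command to pure feedforward."""
    return np.zeros_like(reference)

def stand_in_policy(reference, base):
    """Hand-written stand-in for a trained Actor network (assumption):
    extrapolates by the two samples of delay that the model misses."""
    return 2 * np.gradient(reference)

# Illustrative values only, not taken from the paper.
dt = 0.001                                     # sample period, s
t = np.arange(0.0, 2.0, dt)
reference = 0.01 * np.sin(2 * np.pi * 5 * t)   # 5 Hz displacement command, m
true_delay = 12                                # actual plant delay, samples
model_delay = 10                               # identified (imperfect) delay, samples

err_none = rms_error(reference, simulate_delay(reference, true_delay))
err_ff = rms_error(
    reference,
    simulate_delay(hybrid_command(reference, model_delay, zero_policy), true_delay),
)
err_hybrid = rms_error(
    reference,
    simulate_delay(hybrid_command(reference, model_delay, stand_in_policy), true_delay),
)

print(f"uncompensated RMS error: {err_none:.6f} m")
print(f"feedforward RMS error:   {err_ff:.6f} m")
print(f"hybrid RMS error:        {err_hybrid:.6f} m")
```

In this toy setting the correction only has to make up for the mismatch between the modelled and actual delay. In the paper's setting that role is played by the network trained offline on displacement-command error data.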
