It has been a while since I last wrote up proper blog notes, so here is a series of posts to make up for it.
These are papers I collected along this thread after finishing PDE-Net 2.0.
Collected papers (in no particular order)
- Learning to Discretize: Solving 1D Scalar Conservation Laws via Deep Reinforcement Learning
- Data-driven recovery of hidden physics in reduced order modeling of fluid flows
- https://aip.scitation.org/doi/abs/10.1063/5.0002051
- Journal: Physics of Fluids, early 2020
- DeepMoD: Deep learning for model discovery in noisy data
- https://www.sciencedirect.com/science/article/pii/S0021999120307592
- Journal: Journal of Computational Physics, November 2020
- Stability selection enables robust learning of partial differential equations from limited noisy data
- https://arxiv.org/abs/1907.07810
- arXiv category: Mathematics → Numerical Analysis, July 2019
- Derivatives Pricing via Machine Learning
- Extracting Interpretable Physical Parameters from Spatiotemporal Systems Using Unsupervised Learning
- https://journals.aps.org/prx/abstract/10.1103/PhysRevX.10.031056
- Journal: Physical Review X, September 2020
- DLGA-PDE: Discovery of PDEs with incomplete candidate library via combination of deep learning and genetic algorithm
- https://www.sciencedirect.com/science/article/pii/S0021999120303582
- Journal: Journal of Computational Physics, October 2020
- Feature engineering and symbolic regression methods for detecting hidden physics from sparse sensor observation data
- https://aip.scitation.org/doi/abs/10.1063/1.5136351
- Journal: Physics of Fluids, January 2020
- Data-driven Discovery of Partial Differential Equations for Multiple-Physics Electromagnetic Problem
- https://arxiv.org/abs/1910.13531
- arXiv category: Physics → Computational Physics, October 2019
- TIME: A Transparent, Interpretable, Model-Adaptive and Explainable Neural Network for Dynamic Physical Processes
- Sparse Symplectically Integrated Neural Networks
- DeepM&Mnet: Inferring the electroconvection multiphysics fields based on operator approximation by neural networks
- https://arxiv.org/abs/2009.12935
- arXiv category: Physics → Computational Physics, September 2020
- Integration of Neural Network-Based Symbolic Regression in Deep Learning for Scientific Discovery
- https://ieeexplore.ieee.org/abstract/document/9180100
- Journal: IEEE Transactions on Neural Networks and Learning Systems, August 2020
1. Learning to Discretize: Solving 1D Scalar Conservation Laws via Deep Reinforcement Learning
Paper info
A paper from Bin Dong's group, posted in October 2020.
Summary table
Having written the summary, I find the paper's overall approach fairly standard; perhaps its one highlight is introducing the notion of a meta-learner when abstracting the numerical method into an RL problem?
| Item | Details |
| --- | --- |
| Target | |
| Motivation/Idea | |
| Method | |
| Pros and Cons | Cons: |
Selected details
Basic model setup
First, what does "an equation with a conservation law" mean? Simply that the equation can be written in conservation form:

$$u_t + f(u)_x = 0 \tag{1}$$

This form is called a conservation law (e.g., Burgers' equation, where $f(u)=u^2/2$). Since $x$ and $t$ both range over bounded intervals, the observed data are obtained by partitioning space and time into a grid, usually uniform; denote the step sizes by $\Delta x, \Delta t$ and the grid sizes by $J, N$; see equation $(2.2)$ in the paper.
A few concepts to introduce: the true solution $u(x,t)$ takes the value $u(x_j,t_n)$ on the grid, and its numerical approximation is written $\mathcal{U}_j^n$; the $f$ in the middle of equation $(1)$ is called the flux, i.e., the flow in the conservation law, and the true flux is written $f_j^n=f(u(x_j,t_n))$.
Finally, there is a spatial half-grid notation, $\displaystyle x_{j\pm\frac{1}{2}}=x_j\pm\frac{\Delta x}{2}$: these are essentially interpolation points halfway between neighboring grid nodes (the cell interfaces).
The WENO numerical scheme
Judging by the name, Weighted Essentially Non-Oscillatory Schemes, the idea is presumably: interpolate on several stencils and weight each by its importance (interpolation accuracy), so the fitted result oscillates less.
The WENO computation procedure is as follows:
As the last two lines above show, WENO behaves like a multi-stencil weighted-average interpolation; the main questions are how to compute the weights of the different stencil interpolations and, at the end, how to compute the upwind direction. The latter is meant to keep the numerical solution from oscillating, cf. the second-order upwind scheme.
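To make the "weighted multi-stencil average" concrete, here is a minimal sketch of the classical fifth-order WENO-JS reconstruction of the interface value $\hat{f}_{j+1/2}$ (standard Jiang–Shu coefficients; my illustration, not the paper's code):

```python
import numpy as np

def weno5_reconstruct(f, eps=1e-6):
    """Fifth-order WENO-JS approximation of f at x_{j+1/2}.

    f: five consecutive cell values (f_{j-2}, f_{j-1}, f_j, f_{j+1}, f_{j+2}),
    assuming the wind blows left to right (positive upwind direction).
    """
    fm2, fm1, f0, fp1, fp2 = f

    # Three candidate third-order approximations, one per stencil.
    q = np.array([
        (2 * fm2 - 7 * fm1 + 11 * f0) / 6,
        (-fm1 + 5 * f0 + 2 * fp1) / 6,
        (2 * f0 + 5 * fp1 - fp2) / 6,
    ])

    # Smoothness indicators: beta_k is large when stencil k crosses a shock.
    beta = np.array([
        13 / 12 * (fm2 - 2 * fm1 + f0) ** 2 + 1 / 4 * (fm2 - 4 * fm1 + 3 * f0) ** 2,
        13 / 12 * (fm1 - 2 * f0 + fp1) ** 2 + 1 / 4 * (fm1 - fp1) ** 2,
        13 / 12 * (f0 - 2 * fp1 + fp2) ** 2 + 1 / 4 * (3 * f0 - 4 * fp1 + fp2) ** 2,
    ])

    # Nonlinear weights: start from the linear weights d = (1/10, 6/10, 3/10)
    # and damp any stencil that looks non-smooth.
    alpha = np.array([0.1, 0.6, 0.3]) / (eps + beta) ** 2
    w = alpha / alpha.sum()
    return w @ q
```

It is exactly these hand-derived weights (and the upwind choice) that the RL formulation below replaces with a learned policy.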
Mapping WENO onto an MDP
Not much to say here: write the WENO computation as an algorithm, then map it onto an MDP's state $S$, action $A$, transition dynamics $P$, and reward $r$.


And that is basically it. A brief rundown of the ingredients:
- $s$ is the state. There is a state function mapping the values on the index set, $\mathcal{U}_j^\lambda,\ \lambda\in \Lambda$, to $\hat{f}_j^n$. For instance, with the three stencil interpolations shown above, $s$ maps to 3 vectors, each consisting of all the points used by one interpolation (the $f_j^\lambda$) together with that interpolation's weight. Concretely, the implementation is an MLP with 6 layers of 64 neurons each, and the Twin Delayed Deep Deterministic (TD3) policy gradient algorithm is used to train the RL policy (see the sketch after this list).
- $A$ is the action: choosing among the candidate interpolations mentioned in the last line above, produced by the state function $s$.
- $P$ is the transition mechanism, i.e., the time-stepping rule; e.g., with forward Euler the conservative update is $\mathcal{U}_j^{n+1} = \mathcal{U}_j^n - \frac{\Delta t}{\Delta x}\big(\hat{f}_{j+1/2}^n - \hat{f}_{j-1/2}^n\big)$.
- $r$ is the reward: the negative of the infinity norm of the interpolation error.
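As a rough picture of the state-to-action pipeline, here is a minimal sketch of a policy network with the stated architecture (6 hidden layers of 64 units); the input/output sizes are my assumptions, and the TD3 machinery (twin critics, delayed actor updates) is omitted:

```python
import torch
import torch.nn as nn

class WENOPolicy(nn.Module):
    """Sketch of the RL-WENO actor: maps the local stencil values
    (the state s) to normalized weights over the candidate stencil
    interpolations (the action A)."""

    def __init__(self, stencil_size=7, n_stencils=3, width=64, depth=6):
        super().__init__()
        layers, in_dim = [], stencil_size
        for _ in range(depth):
            layers += [nn.Linear(in_dim, width), nn.ReLU()]
            in_dim = width
        layers.append(nn.Linear(width, n_stencils))
        self.net = nn.Sequential(*layers)

    def forward(self, local_values):
        # The action says how much to trust each candidate interpolation;
        # softmax keeps the weights positive and summing to one.
        return torch.softmax(self.net(local_values), dim=-1)
```

In training, this actor would be optimized with TD3 against critics that score the resulting discretization error.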
Where does the meta-learner come in?
The idea is nice: $A$ becomes a meta-learner. Only one justification holds up for me: the action $A$ here is output by the state function $s$, i.e., it is judged from the current state, rather than being derived, as in classical numerical methods, directly from the numerical mechanism without the help of another network (such as the MLP above).
The other reasons the paper gives are less convincing:
- Learning the policy $P$ within the RL framework makes the algorithm meta-learning like [1, 5, 10, 20, 29].
- The learned policy network is carefully designed to determine a good local discrete approximation based on the current state of the solution, which essentially makes the proposed method a meta-learning approach.
- We attribute the good generalization ability of RL-WENO to our careful action design, which essentially makes RL-WENO a meta-learner under the WENO framework and thus have strong out-of-distribution generalization.
2. DeepMoD: Deep learning for model discovery in noisy data
A question suddenly occurred to me: so far I have seen few papers that pay real attention to the PDE's boundary conditions! This paper does not consider them either.
Paper info
- https://www.sciencedirect.com/science/article/pii/S0021999120307592
- Seemingly written by researchers at the University of Paris
- Journal: Journal of Computational Physics, November 2020
Quick verdict: I am not impressed.
The model: DeepMoD and its details
The paper proposes a model, DeepMoD, short for deep-learning-based model discovery algorithm, whose goal is to learn the underlying PDE from data. The assumed PDE form is actually rather restrictive, fixed as:

$$\partial_t u = \Theta(u, u_x, u_{xx}, \dots)\,\xi$$

where $\Theta$ is a library of candidate terms built from $u$ and its spatial derivatives and $\xi$ is a sparse coefficient vector.
Two big puzzles that several readings did not resolve; why does the paper steer around them❔ My guess as to why only this form is considered: the neural network $f_i$ acts as the function dictionary, taking $(\mathbf{x}, t)$ as input, so automatically differentiating the output with respect to the inputs yields the partial derivatives. But then why does the paper's equation $(1)$ contain no time derivatives of $u$ on the right-hand side; is this deliberately glossed over? Moreover, the experiments indicate that data need not cover the whole grid and can be sampled at random, in which case the partial derivatives cannot all be formed from the data either; how is the network guaranteed to yield, by differentiation with respect to its inputs, the basis functions of the dictionary? See the Library in the figure below.

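Whatever the answer to those puzzles, the mechanics themselves are standard: once a network $u = \text{net}(x, t)$ fits the data, automatic differentiation supplies every derivative in the library at arbitrary scattered sample points, no grid needed. A minimal sketch (my illustration; `net`, the term choices, and the shapes are assumptions):

```python
import torch

def build_library(net, x, t):
    """Evaluate u = net(x, t) on scattered points and assemble a small
    candidate library [1, u, u_x, u_xx, u*u_x] via autograd.
    x, t: 1-D tensors of sample coordinates."""
    x = x.clone().requires_grad_(True)
    t = t.clone().requires_grad_(True)
    u = net(torch.stack([x, t], dim=-1)).squeeze(-1)

    def d(y, v):
        # create_graph=True keeps the library differentiable for training
        return torch.autograd.grad(y.sum(), v, create_graph=True)[0]

    u_t, u_x = d(u, t), d(u, x)
    u_xx = d(u_x, x)
    Theta = torch.stack([torch.ones_like(u), u, u_x, u_xx, u * u_x], dim=-1)
    return u, u_t, Theta
```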
Concretely, the method uses a function dictionary and sparse regression, with regularization added to the regression. A densely-connected feed-forward neural network serves as the function approximator from which the dictionary is built. Three losses are considered: an MSE loss on $u(x,t)$, supervising only the end value of each trajectory; a regression loss on the fit of $\Theta\xi$, though the subscripts in the paper's formula are unclear, so I cannot tell exactly where the supervision is applied; and finally an $L_1$ regularizer on the dictionary coefficients $\xi$.
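Putting the three terms together, a minimal sketch of the combined objective as I read it (names are mine, not the paper's code; `u_t` and `Theta` would come from autodiff of the network as above):

```python
import torch

def deepmod_loss(u_pred, u_obs, u_t, Theta, xi, lam=1e-5):
    """Data fit + PDE-residual regression + L1 sparsity on xi."""
    mse = torch.mean((u_pred - u_obs) ** 2)        # fit the observations
    residual = u_t - (Theta @ xi).squeeze(-1)      # how well Theta @ xi explains u_t
    regression = torch.mean(residual ** 2)
    l1 = lam * torch.sum(torch.abs(xi))            # push most coefficients to zero
    return mse + regression + l1
```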
The network training involves 2 tricks:
- Some post-processing of the results: after the network is trained, the dictionary coefficients turn out not to be that sparse ($L_1$ cannot force exact zeros), so a further nondimensionalization is applied by standardizing all quantities, including $\partial_t u,\Theta,\xi$ in the paper's equation $(2)$; see equations $(3,4)$ in the paper for the precise meaning.
- After training, the network is trained once more, this time without the $L_1$ penalty and with the regression restricted to the previously selected terms; the paper merely states that this yields an unbiased estimate of the coefficients❔ (A sketch of this threshold-and-refit step follows this list.)
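A minimal sketch of that second stage as I understand it (the threshold value and helper names are my assumptions):

```python
import torch

def threshold_and_refit(u_t, Theta, xi, tol=0.1):
    """Keep only the library terms whose coefficient magnitude exceeds tol,
    then re-estimate them by plain least squares, i.e. without the L1
    penalty that shrank them toward zero."""
    keep = torch.abs(xi).flatten() > tol
    Theta_kept = Theta[:, keep]
    # Unpenalized least-squares refit on the selected support.
    xi_refit = torch.linalg.lstsq(Theta_kept, u_t.unsqueeze(-1)).solution
    return keep, xi_refit
```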
For the method to work, the function dictionary must be rich enough; that said, the experiments suggest that even a very rich dictionary does not cause overfitting, because the coefficients are regularized. That regularization works together with a threshold on the coefficients, and the threshold setting does not appear to be general (see the Discussion section). The coefficients of the dictionary terms amount to a form of model selection.
Pros and Cons
The first two advantages the paper claims for itself are odd:
One is strong robustness to noise, which is purely an experimental conclusion.
The second is that no training set is required. This should mean that not much training data is needed (small samples suffice), not that data is unnecessary altogether: the paper runs synthetic experiments on 5 equations, and one conclusion is that simulating several of them needs only $\mathcal{O}(10^2)$ samples.
The paper's own wording: This construction makes it extremely robust to noise, applicable to small data sets, and, contrary to other deep learning methods, does not require a training set.
One of the experimental results:
find that it requires as few as $\mathcal{O}(10^2)$ samples and works at noise levels up to 75%
The third: DeepMoD imposes no restriction on the data dimensionality, whereas some earlier models were limited to 1D data.
The fourth: the function dictionary provides a kind of model selection, though a rather watered-down one.
The paper says DeepMoD is robust and needs little data because it uses a regression-based approach for the model discovery task and a neural network to infer system parameters... which explains nothing, and that really is the paper's wording. I strongly dislike claims like this.
Now the cons (when I cannot make sense of something, I blame the paper):
- Is the assumed PDE form actually restrictive? Is the paper deliberately dodging the question?
- How do we know that differentiating the network with respect to its inputs recovers the basis functions in the dictionary? The partial derivatives may not all be obtainable.
- Why does the second training pass give an unbiased estimate? Not explained.
- The threshold setting appears to be problem-specific rather than general.
- The claimed robustness and small-sample behavior rest entirely on the loss design; is it again only validated experimentally? I am not convinced, and all the data are simulated.
- The paper mentions boundary conditions but still does not handle them.
10. TIME: A Transparent, Interpretable, Model-Adaptive and Explainable Neural Network for Dynamic Physical Processes
References
In APA 7th format
[1] Wang, Y., Shen, Z., Long, Z., & Dong, B. (2019). Learning to Discretize: Solving 1D Scalar Conservation Laws via Deep Reinforcement Learning. arXiv e-prints, arXiv:1905.11079. https://ui.adsabs.harvard.edu/abs/2019arXiv190511079W
[2] Pawar, S., Ahmed, S. E., San, O., & Rasheed, A. (2020). Data-driven recovery of hidden physics in reduced order modeling of fluid flows. Physics of Fluids, 32(3), 036602. https://doi.org/10.1063/5.0002051
[3] Both, G.-J., Choudhury, S., Sens, P., & Kusters, R. (2020). DeepMoD: Deep learning for model discovery in noisy data. Journal of Computational Physics, 109985. https://doi.org/10.1016/j.jcp.2020.109985
[4] Maddu, S., Cheeseman, B. L., Sbalzarini, I. F., & Müller, C. L. (2019). Stability selection enables robust learning of partial differential equations from limited noisy data. arXiv e-prints, arXiv:1907.07810. https://ui.adsabs.harvard.edu/abs/2019arXiv190707810M
[5] Ye, T., & Zhang, L. (2019). Derivatives Pricing via Machine Learning. Journal of Mathematical Finance, 9, 561-589. https://doi.org/10.4236/jmf.2019.93029
[6] Lu, P. Y., Kim, S., & Soljačić, M. (2020). Extracting Interpretable Physical Parameters from Spatiotemporal Systems Using Unsupervised Learning. Physical Review X, 10(3), 031056. https://doi.org/10.1103/PhysRevX.10.031056
[7] Xu, H., Chang, H., & Zhang, D. (2020). DLGA-PDE: Discovery of PDEs with incomplete candidate library via combination of deep learning and genetic algorithm. Journal of Computational Physics, 418, 109584. https://doi.org/10.1016/j.jcp.2020.109584
[8] Vaddireddy, H., Rasheed, A., Staples, A. E., & San, O. (2020). Feature engineering and symbolic regression methods for detecting hidden physics from sparse sensor observation data. Physics of Fluids, 32(1), 015113. https://doi.org/10.1063/1.5136351
[9] Xiong, B., Fu, H., Xu, F., & Jin, Y. (2019). Data-driven Discovery of Partial Differential Equations for Multiple-Physics Electromagnetic Problem. arXiv e-prints, arXiv:1910.13531. https://ui.adsabs.harvard.edu/abs/2019arXiv191013531X
[10] Singh, G., Gupta, S., Lease, M., & Dawson, C. N. (2020). TIME: A Transparent, Interpretable, Model-Adaptive and Explainable Neural Network for Dynamic Physical Processes. arXiv e-prints, arXiv:2003.02426. https://ui.adsabs.harvard.edu/abs/2020arXiv200302426S
[11] DiPietro, D. M., Xiong, S., & Zhu, B. (2020). Sparse Symplectically Integrated Neural Networks. arXiv e-prints, arXiv:2006.12972. https://ui.adsabs.harvard.edu/abs/2020arXiv200612972D
[12] Cai, S., Wang, Z., Lu, L., Zaki, T. A., & Karniadakis, G. E. (2020). DeepM&Mnet: Inferring the electroconvection multiphysics fields based on operator approximation by neural networks. arXiv e-prints, arXiv:2009.12935. https://ui.adsabs.harvard.edu/abs/2020arXiv200912935C
[13] Kim, S., Lu, P. Y., Mukherjee, S., Gilbert, M., Jing, L., Čeperić, V., & Soljačić, M. (2020). Integration of Neural Network-Based Symbolic Regression in Deep Learning for Scientific Discovery. IEEE Transactions on Neural Networks and Learning Systems, 1-12. https://doi.org/10.1109/TNNLS.2020.3017010