Learnable inductive biases in Neural Networks




Learnable inductive biases in Neural Networks, Mark van der Wilk
A growing research group focusing on Gaussian process inference backed by theory, aiming to make reliable decision-making systems and to automate the learning of inductive biases in neural networks. When should neurons be connected?
Hyperparameter selection and architecture design
Every time we train a NN we need to decide on hyperparameters: How many layers? How many units in a layer? What layer structure? Convolutional? Skip connections? Data augmentation parameters? As architectures get more complex (e.g. multi-task): Which layers to share? What kind of task-specific layers? How much capacity to assign to each task?
The main tool is trial and error (cross-validation). Invariances: every prediction problem needs an inductive bias. For a predictor f(x), with x an image and y a label, the inductive bias determines how the model behaves on unseen inputs.
Architecture determines the inductive bias through equivariances. Convolutions are a common solution (translation equivariance); see the quick check below.
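This translation equivariance can be checked numerically: shifting the input and then convolving gives the same output as convolving and then shifting. A minimal sketch, not from the talk, using circular 1-D convolution via the FFT so the property holds exactly:

```python
import numpy as np

def circular_conv(x, w):
    """Circular 1-D convolution of signal x with filter w, computed via the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(w, n=len(x))))

rng = np.random.default_rng(0)
x = rng.normal(size=32)    # input signal
w = rng.normal(size=5)     # convolutional filter
shift = 7

# Translation equivariance: conv(shift(x)) == shift(conv(x)).
lhs = circular_conv(np.roll(x, shift), w)
rhs = np.roll(circular_conv(x, w), shift)
assert np.allclose(lhs, rhs)
```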
Can we automatically adjust invariance properties in layers?
Summary of the goal: given a dataset, adapt the inductive bias to it. Key requirements: find a parameterisation for different inductive biases, and find a learning objective that works for inductive biases; we want to optimise it through backprop so that it is easy to use. We will look at: invariances and equivariances parameterised through transformations on the input (data augmentation) and transformations on the filters (convolutions); how Bayes helps with finding a learning objective; single-layer and deep models.

Invariance, data augmentation and training loss

Data augmentation

Take a dataset

$$y=\{(x_n, y_n)\}_{n=1}^N$$

and create a larger dataset by applying transformations $t_i$ to the inputs while keeping the labels:

$$y'=\{(t_i(x_n), y_n)\}$$

Making models invariant: training on the augmented data encourages the learned weights $w^*$ to give a function satisfying

$$f(x)=f(t_i(x)).$$

The transformations are drawn from a distribution $p(t|\theta)$, where $\theta$ controls the amount of the different transformations to apply, e.g. the range of rotations or shifts (a toy sketch of such a distribution follows below).

But what if we do not know the transformations for which $f(x)=f(t_i(x))$ should hold? We would like to learn $\theta$ jointly with the weights,

$$w^*,\theta^*=\min_{w,\,\theta}\; L(w,\theta),$$

with the transformation distribution $p(t|\theta)$ as part of the model. The question is which objective $L$ makes this work.
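To make the parameterised augmentation distribution $p(t|\theta)$ concrete, here is a toy sketch; the rotation-based choice of $p(t|\theta)$, the 8x8 images, and all helper names are illustrative assumptions of mine, not from the talk. $\theta$ is the maximum rotation angle:

```python
import numpy as np
from scipy.ndimage import rotate

rng = np.random.default_rng(0)

def sample_transformation(theta, rng):
    """Sample t ~ p(t | theta): a rotation by an angle drawn uniformly
    from [-theta, theta] degrees, so theta controls how much augmentation is applied."""
    angle = rng.uniform(-theta, theta)
    return lambda img: rotate(img, angle, reshape=False, mode="nearest")

# Original dataset y = {(x_n, y_n)}: toy 8x8 "images" with binary labels.
X = rng.normal(size=(100, 8, 8))
Y = rng.integers(0, 2, size=100)

# Enlarged dataset y' = {(t_i(x_n), y_n)}.
theta = 30.0     # maximum rotation angle (the augmentation parameter)
n_aug = 4        # transformations sampled per example
X_aug, Y_aug = [], []
for x, y in zip(X, Y):
    for _ in range(n_aug):
        t = sample_transformation(theta, rng)
        X_aug.append(t(x))
        Y_aug.append(y)
X_aug, Y_aug = np.stack(X_aug), np.array(Y_aug)
```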

Bayesian model selection

Consider a model that is linear in its features,

$$f_{w,\theta}(x)=\phi_\theta(x)^T w = \sum_{i=1}^K \phi_\theta^{(i)}(x)\, w_i,$$

where the inductive bias enters through the features $\phi_\theta$. The training loss $L_{train}$ cannot be used to select $\theta$: a model without the invariance constraint fits the training data at least as well as a constrained one, so minimising $L_{train}$ over $\theta$ simply removes the augmentation. Bayesian model selection avoids this by considering the posterior over both $f$ and $\theta$:

$$p(f,\theta|y)=\frac{p(y|f)\,p(f|\theta)\,p(\theta)}{p(y)}=\frac{p(y|f)\,p(f|\theta)}{p(y|\theta)}\,\frac{p(y|\theta)\, p(\theta)}{p(y)},$$

where the marginal likelihood

$$p(y|\theta)=\int p(y|f)\,p(f|\theta)\,\mathrm{d}f$$

measures how well the inductive bias $\theta$ explains the data, averaged over all functions the model allows. For a NN this integral is intractable, so it is replaced by a variational lower bound, the ELBO.
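Before moving to the NN case: for the feature-linear model above, with a Gaussian prior on $w$ and Gaussian noise, the marginal likelihood $p(y|\theta)$ is available in closed form, so different feature maps (inductive biases) can be compared directly, and its gradient with respect to $\theta$ can be taken by backprop. A minimal numpy sketch under those assumptions; the particular feature maps and noise levels are illustrative choices of mine:

```python
import numpy as np

def log_marginal_likelihood(Phi, y, sigma_w=1.0, sigma_n=0.1):
    """log p(y | theta) for f = Phi @ w with prior w ~ N(0, sigma_w^2 I) and
    Gaussian noise of std sigma_n; equivalently y ~ N(0, sigma_w^2 Phi Phi^T + sigma_n^2 I)."""
    N = len(y)
    K = sigma_w**2 * Phi @ Phi.T + sigma_n**2 * np.eye(N)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))
            - 0.5 * N * np.log(2.0 * np.pi))

# Compare two inductive biases (feature maps) on the same data.
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 40)
y = np.sin(x) + 0.1 * rng.normal(size=x.size)

Phi_poly = np.vander(x, 4)   # cubic polynomial features
Phi_trig = np.stack([np.sin(x), np.cos(x), np.sin(2 * x), np.cos(2 * x)], axis=1)

print(log_marginal_likelihood(Phi_poly, y))
print(log_marginal_likelihood(Phi_trig, y))   # typically higher here: the sinusoidal bias explains the data better
```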

Learning data augmentation

Formulate learning data augmentation as Bayesian hyperparameter learning. Make the predictor invariant by averaging over transformations of the input,

$$f_{w,\theta}(x)=\mathbb{E}_{p(t|\theta)}\!\left[\phi_\theta(t(x))^T w\right],$$

with a Gaussian likelihood $p(y_n|w,\theta)=\mathcal{N}\!\left(y_n;\, f_{w,\theta}(x_n),\, \sigma^2\right)$. The marginal likelihood is then lower-bounded by the ELBO,

$$\log p(y|\theta) \ge \mathcal{L}(q(\mathbf{w}),\theta) = \sum_{n} \mathbb{E}_{q(\mathbf{w})}\!\left[\log p\!\left(y_n \,\middle|\, \mathbb{E}_{p(t|\theta)}\!\left[\phi_\theta(t(x_n))^T \mathbf{w}\right]\right)\right] - \mathrm{KL}\!\left[q(\mathbf{w})\,\|\,p(\mathbf{w})\right],$$

which can be maximised by backprop with respect to both the variational posterior $q(\mathbf{w})$ and the augmentation parameters $\theta$ of $p(t|\theta)$. One model and one objective thus solve two problems in one: fitting the weights and selecting the inductive bias.
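A minimal sketch of the Monte Carlo estimate of the invariant predictor $f_{w,\theta}(x)=\mathbb{E}_{p(t|\theta)}[\phi(t(x))^T w]$; the cyclic-shift transformation, the random-feature $\phi$, and all names are illustrative assumptions of mine. In practice this estimate sits inside the ELBO, and $\theta$ can be updated by backprop when the sampled transformations are differentiable in $\theta$ (e.g. reparameterised):

```python
import numpy as np

rng = np.random.default_rng(0)

D, K = 16, 32
P = rng.normal(size=(D, K))   # random projection used by the feature map
w = rng.normal(size=K)        # a sample of the weights (q(w) in the ELBO)
x = rng.normal(size=D)        # one input

def phi(x):
    """Feature map phi_theta: fixed random ReLU features (illustrative only)."""
    return np.maximum(0.0, x @ P)

def invariant_predictor(x, w, theta, n_samples=16):
    """Monte Carlo estimate of E_{p(t|theta)}[phi(t(x))^T w], where
    p(t|theta) is a cyclic shift of the input by up to theta positions."""
    shifts = rng.integers(-theta, theta + 1, size=n_samples)
    feats = np.stack([phi(np.roll(x, s)) for s in shifts])   # (n_samples, K)
    return feats.mean(axis=0) @ w

print(invariant_predictor(x, w, theta=0))   # no augmentation: just phi(x)^T w
print(invariant_predictor(x, w, theta=4))   # averaged over sampled shifts
```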

Learning equivariance

Previously we added invariance by transforming the input image,

$$f_{\mathbf{w},\theta}(\mathbf{x})=\mathbb{E}_{p(t|\theta)}\!\left[\phi_\theta(t(\mathbf{x}))^T \mathbf{w}\right].$$

The same averaging can instead be applied to the filters of a layer (as in the summary above: transformations on the input give data augmentation, transformations on the filters give convolution-like structure), which builds the invariance or equivariance into individual layers of a deep network.
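For a linear map the two views coincide exactly, since $w\cdot(Tx)=(T^T w)\cdot x$; a quick numerical check, purely illustrative and not from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16
x = rng.normal(size=D)        # input
w = rng.normal(size=D)        # linear filter
T = rng.normal(size=(D, D))   # a linear transformation of the input

# Averaging over transformed inputs equals averaging over transformed filters,
# because w . (T x) = (T^T w) . x for any linear T.
lhs = w @ (T @ x)             # transform the input, then apply the filter
rhs = (T.T @ w) @ x           # transform the filter instead
assert np.allclose(lhs, rhs)
```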

Single architecture for different domains. Related work: Learning Invariances using the Marginal Likelihood (learning invariance by backprop, but for Gaussian processes only); data augmentation in BNNs and the cold posterior effect (whether a principled approach to data augmentation influences the cold posterior effect).

Summary: given a dataset, adapt the inductive bias to it. Key requirements: a parameterisation and a learning objective, with invariances learned by backprop.

Outlook: better than trial-and-error design of NNs. Bayesian methods are helping to automate the selection of invariances, making it as easy as backprop. This can make NNs more accurate, easier to use and more energy efficient. Get to the smarter neuron! Meta-learning? More Bayes? Causality?
