爱可可 AI Paper Recommendations (October 30)

Posted by dgx666, 2 months ago (05-20), in Article Center


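LG - Machine Learning; CV - Computer Vision; CL - Computation and Language; AS - Audio and Speech; RO - Robotics

(* marks papers that deserve particular attention)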


1. [LG] *Algorithms for Causal Reasoning in Probability Trees

T Genewein, T McGrath, G Déletang, V Mikulik, M Martic, S Legg, P A. Ortega

[DeepMind]

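Algorithms for causal reasoning in probability trees. The paper presents concrete algorithms for causal reasoning in discrete probability trees that cover the entire causal hierarchy (association, intervention, and counterfactuals) and operate on arbitrary propositional and causal events. The work extends the domain of causal reasoning to a very general class of discrete stochastic processes.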

Probability trees are one of the simplest models of causal generative processes. They possess clean semantics and -- unlike causal Bayesian networks -- they can represent context-specific causal dependencies, which are necessary for e.g. causal induction. Yet, they have received little attention from the AI and ML community. Here we present concrete algorithms for causal reasoning in discrete probability trees that cover the entire causal hierarchy (association, intervention, and counterfactuals), and operate on arbitrary propositional and causal events. Our work expands the domain of causal reasoning to a very general class of discrete stochastic processes.

https://weibo.com/1402400261/JrwuViYrR
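The difference between association and intervention in a probability tree can be illustrated with a small sketch. This is not the paper's algorithm: the `Node` class, the two-variable tree, and its probabilities are hypothetical, and `intervene` only handles the simple case where at least one sibling survives at every node.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    assign: dict                                   # variables set on entering this node
    children: list = field(default_factory=list)   # list of (probability, Node)


def leaves(node, prob=1.0, state=None):
    """Yield (probability, full assignment) for every root-to-leaf path."""
    state = {**(state or {}), **node.assign}
    if not node.children:
        yield prob, state
        return
    for p, child in node.children:
        yield from leaves(child, prob * p, state)


def probability(tree, event):
    """P(event): total mass of the paths whose final assignment satisfies it."""
    return sum(p for p, s in leaves(tree) if event(s))


def conditional(tree, query, given):
    """Association: P(query | given), restricting to paths where `given` holds."""
    return probability(tree, lambda s: query(s) and given(s)) / probability(tree, given)


def intervene(node, var, value):
    """Intervention do(var = value): drop transitions that set var to another
    value and renormalise the remaining siblings (assumes one always survives)."""
    kept = [(p, c) for p, c in node.children if c.assign.get(var, value) == value]
    total = sum(p for p, _ in kept)
    children = [(p / total, intervene(c, var, value)) for p, c in kept]
    return Node(dict(node.assign), children)


def y_children(p_y1):
    """Hypothetical helper: the two outcomes of Y with P(Y = 1) = p_y1."""
    return [(1 - p_y1, Node({"Y": 0})), (p_y1, Node({"Y": 1}))]


# Hypothetical process: X is drawn first, then Y depends on X.
tree = Node({}, [
    (0.5, Node({"X": 0}, y_children(0.2))),
    (0.5, Node({"X": 1}, y_children(0.9))),
])

x1 = lambda s: s["X"] == 1
y1 = lambda s: s["Y"] == 1

p_assoc = conditional(tree, x1, y1)               # observing Y = 1
p_do = probability(intervene(tree, "Y", 1), x1)   # forcing Y = 1
print(round(p_assoc, 3), round(p_do, 3))
```

In this toy tree, observing Y = 1 raises the probability of X = 1, while forcing Y = 1 leaves it at its prior, because Y is set after X on every path.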


2. [AS] *Attention is All You Need in Speech Separation

C Subakan, M Ravanelli, S Cornell, M Bronzi, J Zhong

[Mila-Quebec AI Institute & Universita Politecnica delle Marche & University of Rochester]

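SepFormer, a Transformer-based architecture for speech separation. The paper proposes SepFormer (Separation Transformer), a new RNN-free neural network for speech separation that uses a masking network composed of Transformers; the masking network learns short- and long-term dependencies through a multi-scale approach. Compared with the latest RNN-based systems, SepFormer is significantly faster and less memory-demanding.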

Recurrent Neural Networks (RNNs) have long been the dominant architecture in sequence-to-sequence learning. RNNs, however, are inherently sequential models that do not allow parallelization of their computations. Transformers are emerging as a natural alternative to standard RNNs, replacing recurrent computations with a multi-head attention mechanism. In this paper, we propose the 'SepFormer', a novel RNN-free Transformer-based neural network for speech separation. The SepFormer learns short and long-term dependencies with a multi-scale approach that employs transformers. The proposed model matches or overtakes the state-of-the-art (SOTA) performance on the standard WSJ0-2/3mix datasets. It indeed achieves an SI-SNRi of 20.2 dB on WSJ0-2mix matching the SOTA, and an SI-SNRi of 17.6 dB on WSJ0-3mix, a SOTA result. The SepFormer inherits the parallelization advantages of Transformers and achieves a competitive performance even when downsampling the encoded representation by a factor of 8. It is thus significantly faster and it is less memory-demanding than the latest RNN-based systems.

https://weibo.com/1402400261/JrwygqcX8
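The SI-SNRi figures quoted above are improvements in scale-invariant signal-to-noise ratio. A minimal sketch of that metric follows, using its usual definition on zero-mean signals; the function names and toy signals are illustrative, not taken from the paper's code, and the reference is assumed to be non-silent.

```python
import math


def si_snr(estimate, reference):
    """Scale-invariant SNR in dB between two equal-length signals."""
    n = len(reference)
    mean_e = sum(estimate) / n
    mean_r = sum(reference) / n
    est = [e - mean_e for e in estimate]      # remove DC offset
    ref = [r - mean_r for r in reference]

    # Project the estimate onto the reference to get the target component.
    scale = sum(e * r for e, r in zip(est, ref)) / sum(r * r for r in ref)
    target = [scale * r for r in ref]
    noise = [e - t for e, t in zip(est, target)]

    target_energy = sum(t * t for t in target)
    noise_energy = sum(v * v for v in noise)
    if noise_energy == 0:
        return math.inf                        # perfect reconstruction
    return 10 * math.log10(target_energy / noise_energy)


def si_snr_improvement(estimate, mixture, reference):
    """SI-SNRi: gain of the separated estimate over the unprocessed mixture."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)


reference = [1.0, 0.0, -1.0, 0.0]
estimate = [1.0, 1.0, -1.0, -1.0]   # target plus equally strong orthogonal noise
print(si_snr(estimate, reference))
```

Rescaling the estimate does not change the score, which is what makes the metric suitable for comparing separation systems whose outputs differ in gain.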


3. [CL] *Pre-trained Summarization Distillation

S Shleifer, A M. Rush

[Hugging Face]

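Distillation methods for pre-trained Transformer summarization models. Removing carefully chosen decoder layers from a Seq2Seq Transformer and then continuing to fine-tune quickly yields high-quality student models; in some cases, more complex training techniques that use the same initialization strategy produce additional quality improvements.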

Recent state-of-the-art approaches to summarization utilize large pre-trained Transformer models. Distilling these models to smaller student models has become critically important for practical use; however there are many different distillation methods proposed by the NLP literature. Recent work on distilling BERT for classification and regression tasks shows strong performance using direct knowledge distillation. Alternatively, machine translation practitioners distill using pseudo-labeling, where a small model is trained on the translations of a larger model. A third, simpler approach is to 'shrink and fine-tune' (SFT), which avoids any explicit distillation by copying parameters to a smaller student model and then fine-tuning. We compare these three approaches for distillation of Pegasus and BART, the current and former state of the art, pre-trained summarization models, and find that SFT outperforms knowledge distillation and pseudo-labeling on the CNN/DailyMail dataset, but under-performs pseudo-labeling on the more abstractive XSUM dataset. PyTorch code and checkpoints of different sizes are available through Hugging Face transformers.

https://weibo.com/1402400261/JrwCpsESr
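The "shrink" step of shrink and fine-tune amounts to copying a subset of the teacher's layers into a smaller student before fine-tuning. The sketch below uses plain dictionaries as stand-ins for decoder layers and picks evenly spaced layers; that selection rule and the helper names are assumptions for illustration, not the layer choices reported by the authors.

```python
import copy


def pick_layers(n_teacher, n_student):
    """Evenly spaced teacher layer indices, always keeping the first and last."""
    if not 1 <= n_student <= n_teacher:
        raise ValueError("student must have between 1 and n_teacher layers")
    if n_student == 1:
        return [0]
    step = n_student - 1
    # Integer arithmetic with rounding to the nearest index.
    return [(i * (n_teacher - 1) + step // 2) // step for i in range(n_student)]


def shrink(teacher_layers, n_student):
    """Build the student by deep-copying the selected teacher layers."""
    indices = pick_layers(len(teacher_layers), n_student)
    return [copy.deepcopy(teacher_layers[i]) for i in indices]


# Hypothetical 12-layer decoder; each dict stands in for one layer's parameters.
teacher = [{"name": f"decoder.{i}", "weight": [float(i)] * 4} for i in range(12)]
student = shrink(teacher, 3)
print([layer["name"] for layer in student])
```

The student produced this way would then be fine-tuned on the summarization data as usual; the deep copy keeps later updates to the student from touching the teacher's parameters.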


4. [LG] Training Generative Adversarial Networks by Solving Ordinary Differential Equations

C Qin, Y Wu, J T Springenberg, A Brock, J Donahue, T P. Lillicrap, P Kohli

[DeepMind]

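Training GANs by solving ordinary differential equations. The work connects an important part of current machine learning research (generative models) with a long-established field (the numerical integration of dynamical systems). It hypothesizes that the instability of GAN training arises from the integration error in discretizing the continuous dynamics, and verifies experimentally that well-known ODE solvers (such as Runge-Kutta) make GAN training more stable when combined with a regularizer that controls the integration error.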

The instability of Generative Adversarial Network (GAN) training has frequently been attributed to gradient descent. Consequently, recent methods have aimed to tailor the models and training procedures to stabilise the discrete updates. In contrast, we study the continuous-time dynamics induced by GAN training. Both theory and toy experiments suggest that these dynamics are in fact surprisingly stable. From this perspective, we hypothesise that instabilities in training GANs arise from the integration error in discretising the continuous dynamics. We experimentally verify that well-known ODE solvers (such as Runge-Kutta) can stabilise training - when combined with a regulariser that controls the integration error. Our approach represents a radical departure from previous methods which typically use adaptive optimisation and stabilisation techniques that constrain the functional space (e.g. Spectral Normalisation). Evaluation on CIFAR-10 and ImageNet shows that our method outperforms several strong baselines, demonstrating its efficacy.

https://weibo.com/1402400261/JrwOFcuKc
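The claim that instability comes from integration error rather than from the dynamics themselves can be seen on a toy problem. The sketch below is an assumption-laden illustration, not the paper's method: it uses the bilinear game min_x max_y x*y in place of a GAN, omits the regulariser, and compares explicit Euler steps (equivalent to simultaneous gradient updates) with classical fourth-order Runge-Kutta.

```python
import math


def vector_field(state):
    """Continuous-time dynamics of min_x max_y x*y: x descends, y ascends."""
    x, y = state
    return (-y, x)


def euler_step(state, h):
    """One explicit Euler step, i.e. a simultaneous gradient update."""
    dx, dy = vector_field(state)
    return (state[0] + h * dx, state[1] + h * dy)


def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, c):
        return (state[0] + c * k[0], state[1] + c * k[1])

    k1 = vector_field(state)
    k2 = vector_field(shifted(k1, h / 2))
    k3 = vector_field(shifted(k2, h / 2))
    k4 = vector_field(shifted(k3, h))
    return (
        state[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
        state[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]),
    )


def final_distance(step, h=0.1, n_steps=1000, start=(1.0, 0.0)):
    """Distance from the equilibrium (0, 0) after integrating with `step`."""
    state = start
    for _ in range(n_steps):
        state = step(state, h)
    return math.hypot(*state)


print("Euler:", final_distance(euler_step))
print("RK4:  ", final_distance(rk4_step))
```

The exact flow of this game is a rotation that keeps the distance to the equilibrium constant. Each Euler step multiplies that distance by sqrt(1 + h^2), so the iterates spiral outward, while the Runge-Kutta iterates stay very close to the unit circle over the same horizon.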


5. [LG] Label-Aware Neural Tangent Kernel: Toward Better Generalization and Local Elasticity

S Chen, H He, W J. Su

[University of Pennsylvania]

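Label-aware neural tangent kernel. The paper introduces the notion of label-awareness to explain and reduce the performance gap between models trained with the NTK and real-world neural networks. Motivated by a Hoeffding decomposition of a generic label-aware kernel, it proposes two label-aware versions of the NTK; theoretical analysis and comprehensive experiments show that models trained with the proposed kernels better simulate the behavior of neural networks in terms of generalization ability and local elasticity.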

As a popular approach to modeling the dynamics of training overparametrized neural networks (NNs), the neural tangent kernels (NTK) are known to fall behind real-world NNs in generalization ability. This performance gap is in part due to the label agnostic nature of the NTK, which renders the resulting kernel not as locally elastic as NNs (He and Su, 2019). In this paper, we introduce a novel approach from the perspective of label-awareness to reduce this gap for the NTK. Specifically, we propose two label-aware kernels that are each a superimposition of a label-agnostic part and a hierarchy of label-aware parts with increasing complexity of label dependence, using the Hoeffding decomposition. Through both theoretical and empirical evidence, we show that the models trained with the proposed kernels better simulate NNs in terms of generalization ability and local elasticity.

https://weibo.com/1402400261/JrwSkocoX
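To make the "label agnostic" point concrete, the sketch below computes an empirical NTK for a tiny one-hidden-layer ReLU network with scalar input. The network, its width, and the function names are assumptions for illustration, and the paper's label-aware kernels are not implemented here. The kernel is an inner product of parameter gradients, so it is a function of the two inputs only and never sees a label.

```python
import random


def init_params(width, seed=0):
    """Random hidden weights w and output weights a for a scalar-input network."""
    rng = random.Random(seed)
    w = [rng.gauss(0, 1) for _ in range(width)]
    a = [rng.gauss(0, 1) for _ in range(width)]
    return w, a


def grad_f(x, w, a):
    """Gradient of f(x) = sum_j a_j * relu(w_j * x) / sqrt(m) w.r.t. (a, w)."""
    scale = len(w) ** -0.5
    d_a = [max(w_j * x, 0.0) * scale for w_j in w]
    d_w = [a_j * x * scale if w_j * x > 0 else 0.0 for w_j, a_j in zip(w, a)]
    return d_a + d_w


def empirical_ntk(x1, x2, w, a):
    """NTK entry: inner product of the parameter gradients at x1 and x2."""
    g1, g2 = grad_f(x1, w, a), grad_f(x2, w, a)
    return sum(u * v for u, v in zip(g1, g2))


w, a = init_params(width=64)
print(empirical_ntk(1.0, 2.0, w, a), empirical_ntk(1.0, -1.0, w, a))
```

For this particular network, inputs of opposite sign activate disjoint sets of hidden units, so their kernel value is exactly zero regardless of how the two points are labelled.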


[LG] Enforcing Interpretability and its Statistical Impacts: Trade-offs between Accuracy and Interpretability

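Enforcing interpretability and its statistical impacts: a study of the trade-off between accuracy (risk) and interpretability using a simplified learning model.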

G K Dziugaite, S Ben-David, D M. Roy

[Element AI & University of Waterloo & University of Toronto]

https://weibo.com/1402400261/JrwGVDuwv


[LG] The geometry of integration in text classification RNNs

The geometry of integration in RNNs for text classification.

K Aitken, V V. Ramasesh, A Garg, Y Cao, D Sussillo, N Maheswaranathan

[University of Washington & Google]

https://weibo.com/1402400261/JrwY6vdA0


[LG] Generalized eigen, singular value, and partial least squares decompositions: The GSVD package

Generalized eigenvalue, singular value, and partial least squares decompositions: the GSVD package.

D Beaton

[Baycrest Health Sciences]

https://weibo.com/1402400261/Jrx0xxcB7

Copyright notice: this article was published by 第六芝士网; please credit the source when reposting.

Article link: http://www.dgx666.com/post/1725.html
