EFTA02586434


From: Misha Gromov <
Sent: Wednesday, October 11, 2017 7:47 PM
To: Jeffrey E.
Subject: Re: Fwd: Like Bach's comments:)

On Wed, 11 Oct 2017 20:01:46 +0200, Jeffrey E. wrote:

Forwarded message
From: Joscha Bach <
Date: Wed, Oct 11, 2017 at 7:55 PM
Subject: Re:
To: Jeffrey Epstein <[email protected] <mailto:[email protected]>>

After skimming their paper, the idea seemed unexciting to me at first: basically, if we have enough feature dimensions we can almost always find a linear separation. This is also related to how Support Vector Machines work: they project the data into an extremely high-dimensional space, find a separating hyperplane with linear regression, and then project that plane back into the original space as the separator. A similar idea is behind Echo State networks, which use a randomly wired recurrent neural network and then only train the output layer with a single linear regression.

The authors take an existing trained neural network, and whenever it makes a mistake, they train a linear classifier on the network state and data, i.e. they try to find out when the network goes wrong. Instead of improving the network (which is also likely to make it worse in other cases), they add an additional layer to it. For engineering, this makes a lot of sense, because large neural networks are cheap to use and deploy but expensive to train.

On a more philosophical level, it is tempting to ask if that might be a general learning principle for brains: when you don't perform well, add more control structure on top. It probably makes sense whenever you are confident that training the existing structure won't improve it that much, but unless training the weights in an existing network, it also adds quite a few milliseconds to the processing time. There is probably an optimal tradeoff for this.

The other thing is that the new layer is a linear classifier only (at least in this paper), and it is creating a local override on the s
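The corrector idea summarized in the message (leave the trained network frozen, and fit a linear classifier on its state to detect when it goes wrong) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the paper's actual method is not given in the message, `base_model`, `state`, and `true_label` are hypothetical stand-ins, the "network state" is just the input plus a bias feature, and a plain perceptron is swapped in as the linear classifier.

```python
# Toy sketch: a frozen base model plus a linear "corrector" trained on its errors.
# All names and data are hypothetical; the perceptron is a stand-in linear classifier.

def base_model(x):
    """Frozen stand-in for the trained network: predicts 1 when x[0] > 0."""
    return 1 if x[0] > 0 else 0

def state(x):
    """Stand-in for the network state: the input plus a constant bias feature."""
    return (x[0], x[1], 1.0)

def true_label(x):
    """Ground truth matches base_model except where x[1] >= 2, where it is flipped."""
    y = 1 if x[0] > 0 else 0
    return 1 - y if x[1] >= 2.0 else y

# Deterministic grid of inputs in [-3, 3] x [-3, 3] with step 0.5.
data = [(i / 2, j / 2) for i in range(-6, 7) for j in range(-6, 7)]

# Corrector targets: +1 where the base model is wrong, -1 where it is right.
errors = [1 if base_model(x) != true_label(x) else -1 for x in data]

def dot(w, s):
    return sum(wi * si for wi, si in zip(w, s))

# Perceptron training on (state, error flag). The error region is linearly
# separable in this toy state space, so the loop stops once an epoch is clean.
w = [0.0, 0.0, 0.0]
for _ in range(5000):
    mistakes = 0
    for x, e in zip(data, errors):
        s = state(x)
        if e * dot(w, s) <= 0:
            w = [wi + e * si for wi, si in zip(w, s)]
            mistakes += 1
    if mistakes == 0:
        break

def corrected_model(x):
    """Added layer: flip the base prediction wherever the corrector fires."""
    p = base_model(x)
    return 1 - p if dot(w, state(x)) > 0 else p

def accuracy(model):
    return sum(model(x) == true_label(x) for x in data) / len(data)

base_acc = accuracy(base_model)
corrected_acc = accuracy(corrected_model)
print(base_acc, corrected_acc)
```

In this toy setup the base model is wrong on 39 of the 169 grid points, and the corrector removes those errors without touching the base model's weights. Accuracy is measured on the same points used to fit the corrector, so the sketch says nothing about generalization.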


This document is part of the DOJ Epstein Files Transparency Act production (Public Law 119-38) — a corpus of 1,416,848 documents (2,915,593 pages) including prosecution files, FBI investigation records, court filings, and defense materials.


