Comments on natural language processing blog: "Structured prediction is *not* RL"

Simon Lacoste-Julien (2017-08-06):
Hi Hal, an interesting blog post!

You might be interested in our recent paper, which adapts the LOLS approach to training RNNs:
"SEARNN: Training RNNs with Global-Local Losses"
Rémi Leblond, Jean-Baptiste Alayrac, Anton Osokin, Simon Lacoste-Julien
arXiv:1706.04499 (https://arxiv.org/abs/1706.04499)

It will be presented at the ICML 2017 Workshop on Deep Structured Prediction (https://deepstruct.github.io/ICML17/ac/), in case you are around.

Chris Brew (2017-04-07):
There is a lot of psychological evidence about what happens during sentence processing, including evidence that there is some kind of representation of non-determinism; for example, Marslen-Wilson's Cohort Model (https://en.wikipedia.org/wiki/Cohort_model). But models like this are largely silent on the details of the representations used, which computational models cannot be.

hal (2017-04-03):
@Chris: thanks! I don't know enough about how the brain works to say something interesting, but this is cool to think about. In eye-tracking, people do look back, for instance when they get garden-pathed, which isn't necessarily maintaining multiple hypotheses, but it is maintaining some sort of uncertainty. (Like Jinho Choi's selective branching.)

@Dipendra: agreed, good point. CPI also assumes you can reset (which is one reason we chose it), which is a totally fine assumption in SP. Even if you do ten passes over the data, though, you're still only trying a very, very small subset of the possible trajectories. But this definitely goes to the question of: would I rather expand more now, or do this sentence again in a few hours?

Anonymous (2017-04-03):
I wonder whether the argument that "RL does not build out the whole search tree" is true in practice. Most people perform epochs over the dataset and therefore reset the world to a start state. The agent can then execute a trajectory different from what it tried earlier, and keep doing so to eventually explore the entire search space. In fact, papers such as Trust Region Policy Optimization (Figure 1, https://arxiv.org/pdf/1502.05477.pdf) perform multiple rollouts from the same state (to be fair, they do mention that it works in simulation). Of course this won't hold in the real world: you cannot remake a glass jar that a robot has accidentally broken while exploring, and you cannot reset to a start state that easily.

Chris Brew (2017-04-03):
One of the advantages of RL is that it does NOT build out the whole search tree, so it does not require detailed symbolic representations of multiple alternatives at the same time. Neither the brain's neural hardware nor the various fancy-dan deep learning networks comfortably accommodate the representation of detailed symbolic alternatives, as far as we know. So it is well worth pursuing RL and similar methods, in which the representation of prior context and current uncertainty is less necessarily cut-and-dried than in (say) CRFs.
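[Editorial sketch] The exchange above, about whether resetting to a start state each epoch lets an RL agent eventually cover the search tree, can be made concrete with a minimal toy simulation. Everything here (the horizon, the binary action space, the random policy) is a hypothetical illustration, not anything from the post or the TRPO paper:

```python
import random

# Hypothetical toy setup: an episode is a length-10 sequence of binary
# actions, so the full search tree has 2**10 = 1024 distinct trajectories.
HORIZON, N_ACTIONS = 10, 2

def rollout(rng):
    """One episode from the (reset) start state: a random trajectory."""
    return tuple(rng.randrange(N_ACTIONS) for _ in range(HORIZON))

rng = random.Random(0)
seen = set()
for epoch in range(10):      # "ten passes over the data"
    seen.add(rollout(rng))   # reset to the start state, try a trajectory

print(f"explored {len(seen)} of {N_ACTIONS**HORIZON} trajectories")
```

Running this shows both points at once: resets do let the agent sample fresh trajectories each epoch (the commenter's observation), yet after ten passes it has still visited at most 10 of 1024 paths, the "very small subset" Hal mentions, and the gap only widens as the horizon grows.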