Learning Semantics Workshop
TechTalks from event: Learning Semantics Workshop
We are still uploading slides and videos for this event. Please excuse any discrepancy.
Learning Natural Language from its Perceptual ContextMachine learning has become the best approach to building systems that comprehend human language. However, current systems require a great deal of laboriously constructed human-annotated training data. Ideally, a computer would be able to acquire language like a child by being exposed to linguistic input in the context of a relevant but ambiguous perceptual environment. As a step in this direction, we have developed systems that learn to sportscast simulated robot soccer games and to follow navigation instructions in virtual environments by simply observing sample human linguistic behavior. This work builds on our earlier work on supervised learning of semantic parsers that map natural language into a formal meaning representation. In order to apply such methods to learning from observation, we have developed methods that estimate the meaning of sentences from just their ambiguous perceptual context.
Learning Dependency-Based Compositional SemanticsThe semantics of natural language has a highly-structured logical aspect. For example, the meaning of the question â€œWhat is the third tallest mountain in a state not bordering California?â€ involves superlatives, quantification, and negation. In this talk, we develop a new representation of semantics called Dependency-Based Compositional Semantics (DCS) which can represent these complex phenomena in natural language. At the same time, we show that we can treat the DCS structure as a latent variable and learn it automatically from question/answer pairs. This allows us to build a compositional question-answering system that obtains state-of-the-art accuracies despite using less supervision than previous methods. I will conclude the talk with extensions to handle contextual effects in language.
How to Recognize EverythingOur survival depends on recognizing everything around us: how we can act on objects, and how they can act on us. Likewise, intelligent machines must interpret each object within a task context. For example, an automated vehicle needs to correctly respond if suddenly faced with a large boulder, a wandering moose, or a child on a tricycle. Such robust ability requires a broad view of recognition, with many new challenges. Computer vision researchers are accustomed to building algorithms that search through image collections for a target object or category. But how do we make computers that can deal with the world as it comes? How can we build systems that can recognize any animal or vehicle, rather than just a few select basic categories? What can be said about novel objects? How do we approach the problem of learning about many related categories? We have recently begun grappling with these questions, exploring shared representations that facilitate visual learning and prediction for new object categories. In this talk, I will discuss our recent efforts and future challenges to enable broader and more flexible recognition systems.
From Machine Learning to Machine ReasoningA plausible definition of â€œreasoningâ€ could be â€œalgebraically manipulating previously acquired knowledge in order to answer a new questionâ€. This definition covers first-order logical inference or probabilistic inference. It also includes much simpler manipulations commonly used to build large learning systems. For instance, we can build an optical character recognition system by first training a character segmenter, an isolated character recognizer, and a language model, using appropriate labeled training sets. Adequately concatenating these modules and fine tuning the resulting system can be viewed as an algebraic operation in a space of models. The resulting model answers a new question, that is, converting the image of a text page into a computer readable text. This observation suggests a conceptual continuity between algebraically rich inference systems, such as logical or probabilistic inference, and simple manipulations, such as the mere concatenation of trainable learning systems. Therefore, instead of trying to bridge the gap between machine learning systems and sophisticated â€œall-purposeâ€ inference mechanisms, we can instead algebraically enrich the set of manipulations applicable to training systems, and build reasoning capabilities from the ground up.
Towards More Human-like Machine Learning of Word MeaningsHow can we build machines that learn the meanings of words more like the way that human children do? I will talk about several challenges and how we are beginning to address them using sophisticated probabilistic models. Children can learn words from minimal data, often just one or a few positive examples (one-shot learning). Children learn to learn: they acquire powerful inductive biases for new word meanings in the course of learning their first words. Children can learn words for abstract concepts or types of concepts that have no little or no direct perceptual correlate. Children's language can be highly context-sensitive, with parameters of word meaning that must be computed anew for each context rather than simply stored. Children learn function words: words whose meanings are expressed purely in how they compose with the meanings of other words. Children learn whole systems of words together, in mutually constraining ways, such as color terms, number words, or spatial prepositions. Children learn word meanings that not only describe the world but can be used for reasoning, including causal and counterfactual reasoning. Bayesian learning defined over appropriately structured representations â€” hierarchical probabilistic models, generative process models, and compositional probabilistic languages â€” provides a basis for beginning to address these challenges.
Learning Semantics of MovementIn this presentation, we consider how to computationally model the interrelated processes of understanding natural language and perceiving and producing movement in multimodal real world contexts. Movement is the specific focus of this presentation for several reasons. For instance, it is a fundamental part of human activities that ground our understanding of the world. We are developing methods and technologies to automatically associate human movements detected by motion capture and in video sequences with their linguistic descriptions. When the association between human movement and their linguistic descriptions has been learned using pattern recognition and statistical machine learning methods, the system is also used to produce animations based on written instructions and for labeling motion capture and video sequences. We consider three different aspects: using video and motion tracking data, applying multi-task learning methods, and framing the problem within cognitive linguistics research.