Example Internships: Palo Alto Research Center (PARC)

Here are some example internships from last summer. Not all would necessarily be relevant to our student population. This list, however, will give you a feel for the types of internships PARC offers. (Thanks to Tracy King for providing this information.)

  • Anaphora resolution (mid-level PhD student): The goal was to design an anaphora/coreference resolution system that runs on functional (dependency) structures as opposed to strings or trees. The idea was to define features on a marked-up version of the f-structures and use our stochastic disambiguation model to choose the most probable set of coreferences. The intern worked closely with Annie Zaenen and me. This involved a lot of data preparation, designing features, and then running various configurations to figure out what worked best.

  • Matching/Contradiction/Entailment (upper-level PhD student): As part of our question-answering project, we needed code to take the knowledge representation (KR) of a passage and of a question and determine whether the passage contradicted the question (answer = no), entailed it (answer = yes), or neither (answer = unknown). This involved looking at the KRs, figuring out an algorithm for contradiction/entailment, and then implementing it. There was also a small subproject to do "fuzzy" matching by expanding certain parts of the structure using WordNet and/or Cyc to align things like "girl" with "child"; this fed into the contradiction/entailment matching. Note that this was an exceptionally hard project in that we had no clue how to do the main part. We would not have put an undergrad or master's student on this unless the project had been much better defined.

  • Content analysis black box (undergrad symbolic systems student): The project was to take the output of our syntactic parser, semantics, and knowledge representation and make the system a black box so that non-specialists could use it. The main part of this was defining and implementing an XML format for each level of representation and then working with us to define a set of commands that people could run to get the output in whichever format they wanted. This project included not just programming but also talking to people who might use the output to see what types of things they needed, and giving a couple of talks to let people in the lab know what was available.

  • Perl hacker (master's symbolic systems student): This was not a single project; rather, we needed someone to write a set of Perl scripts for us for various tasks. Some were simple scripts, for example to extract all the examples from the rules files and turn them into regression test format, or to extract all the verbs with a particular subcat frame from a lexicon to help in debugging. Others were much more complicated, like extracting information from VerbNet and putting it into a format we could use for various lexical resources. Note that the student who worked on this was originally a summer intern brought in specifically to help with resource extraction; we then had him do some work during the year on scripts as we needed them.

  • Knowledge Representation (two PhD students and one postdoc): As part of the question-answering project, we had three people working on mapping from semantics to KR. One worked on deverbal nouns ("writer"), one on presuppositions with clausal complements ("fail" vs. "manage"), and one on multiword expression nouns ("tractor trailer"). These all involved figuring out what data to focus on, what type of representation should appear in the KR, and then implementing the rules to create these representations within the bigger system that was already in place.

  • Lexicon development (undergrad symbolic systems student): This project was to fill out the verb lexicon that we use with better and more numerous subcat frames. As a first step, it involved a fair amount of scripting to take examples in the comments and turn them into subcat frames. Then, for verbs that had no frames, the student used several techniques to fill them in: calling Webster's, using derivational morphology to base the frame on that of another verb, and as a last resort just guessing trans/intrans.

  • Regression testing (undergrad symbolic systems student): The project was to design a regression testing system for the parser output. It allowed the grammar writer to specify a possibly stripped-down gold standard parse, compared the actual parse to it, and reported back on errors. It also reported ambiguity and timing statistics. This involved a lot of coding and working with the grammar writers to figure out what would be most useful. Note that we now have a different system in place, but its basic idea is based on what was done in this project, and for many years this was the best tool we had.
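
To give a feel for the regression testing project, here is a minimal sketch in Python of comparing a parse against a stripped-down gold standard. It is not the PARC tool. It assumes a parse can be represented as a nested dictionary of features (a stand-in for an f-structure), that features missing from the gold standard are simply ignored, and that a test case passes if any one of the parses matches. The names compare_to_gold and run_case, and the parse callable passed in, are hypothetical.

```python
import time
from typing import Callable


def compare_to_gold(gold: dict, actual: dict, path: str = "") -> list[str]:
    """Check every feature listed in the (possibly partial) gold parse.

    Features that appear only in the actual parse are ignored, which is
    what lets the grammar writer give a stripped-down gold standard.
    """
    errors = []
    for feature, expected in gold.items():
        where = f"{path}/{feature}" if path else feature
        if feature not in actual:
            errors.append(f"{where}: missing (expected {expected!r})")
        elif isinstance(expected, dict):
            found = actual[feature]
            if isinstance(found, dict):
                # Recurse into the embedded structure.
                errors.extend(compare_to_gold(expected, found, where))
            else:
                errors.append(f"{where}: expected a substructure, found {found!r}")
        elif actual[feature] != expected:
            errors.append(f"{where}: expected {expected!r}, found {actual[feature]!r}")
    return errors


def run_case(sentence: str, gold: dict,
             parse: Callable[[str], list[dict]]) -> dict:
    """Parse one sentence and report errors, ambiguity, and timing.

    `parse` is a hypothetical parser interface returning all analyses.
    Assumption: the errors reported are those of the closest parse.
    """
    start = time.perf_counter()
    parses = parse(sentence)
    elapsed = time.perf_counter() - start
    error_lists = [compare_to_gold(gold, p) for p in parses]
    best = min(error_lists, key=len, default=["no parse"])
    return {
        "sentence": sentence,
        "ambiguity": len(parses),  # number of analyses returned
        "seconds": elapsed,
        "errors": best,
    }


# Made-up example data: the gold standard only mentions two features.
gold = {"PRED": "sleep", "SUBJ": {"PRED": "dog"}}
actual = {"PRED": "sleep", "TENSE": "past",
          "SUBJ": {"PRED": "cat", "NUM": "sg"}}
errors = compare_to_gold(gold, actual)
```

In this made-up example, TENSE and NUM are not in the gold standard, so only the mismatch on the subject's PRED is reported. Ignoring unlisted features is one possible design choice; it keeps the gold files short and lets them survive unrelated grammar changes.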

-- WilliamLewis - 07 Nov 2005

Topic revision: r1 - 2005-11-07 - 21:23:39 - WilliamLewis
 
