Information extraction from text, Week 4



The solutions should be ready for inspection by Thursday 7.3.2002 (midnight).

Remember: whenever you are unsure what to do, you can ask Reeta or send a message to our newsgroup.


  1. In this exercise, we study the AutoSlog-TS algorithm. Assume that our text collection contains the following documents:

    
    text 1; relevant
    
    s: A group of terrorists
    v: attacked
    do: a post
    pp: in Nuevo Progreso.
    
    
    text 2; relevant
    
    s: The National offices
    v: were attacked
    time: today.
    s: Unidentified individuals
    v: detonated
    do: a bomb.
    s: The bomb
    v: destroyed
    do: a car.
    
    
    text 3; not relevant
    
    s: The Armed Forces units
    v: killed
    do: one rebel.
    s: They
    v: destroyed
    do: an underground hideout.
    
    
    text 4; relevant
    
    s: Unidentified individuals
    v: attacked
    do: a high tension tower.
    s: They
    v: destroyed
    do: it.
    
    
    text 5; not relevant
    
    s: The coca growers
    v: protest
    do: the destruction of their fields.
    s: The strike
    v: is supported
    pp: by the Shining Path.
    
    

    Explain the process of AutoSlog-TS using these documents and give the ranking for the extraction patterns that are generated.

    Abbreviations: s = subject, v = verb, do = direct object, pp = prepositional phrase

    More information: Riloff, E. "Automatically Generating Extraction Patterns from Untagged Text". Proceedings of the Thirteenth National Conference on Artificial Intelligence (AAAI-96), 1996, pp. 1044-1049.
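
    As a starting point for the ranking step, here is a minimal sketch in Python. It assumes the ranking function is relevance rate times log2 of frequency, where relevance rate = rel-freq / total-freq, and it assumes that the frequency inside the logarithm is the count in relevant texts; check both assumptions against the paper before relying on them. The function name rank_patterns, the input format, and the placeholder pattern names are hypothetical and are not taken from the paper or from the documents above.

```python
import math
from collections import Counter


def rank_patterns(activations):
    """Rank extraction patterns by relevance_rate * log2(rel_freq).

    activations: iterable of (pattern, in_relevant_text) pairs,
    one pair for every time a pattern fires in the corpus.
    Returns rows (pattern, rel_freq, total_freq, relevance_rate, score),
    best score first.
    """
    total_freq = Counter()
    rel_freq = Counter()
    for pattern, in_relevant_text in activations:
        total_freq[pattern] += 1
        if in_relevant_text:
            rel_freq[pattern] += 1

    rows = []
    for pattern, total in total_freq.items():
        rel = rel_freq[pattern]
        relevance_rate = rel / total
        # Assumption: a pattern that never fires in a relevant text
        # gets score 0, because log2(0) is undefined.
        score = relevance_rate * math.log2(rel) if rel > 0 else 0.0
        rows.append((pattern, rel, total, relevance_rate, score))

    # Sort by descending score; ties are broken alphabetically
    # (an arbitrary choice made only to keep the output stable).
    rows.sort(key=lambda row: (-row[4], row[0]))
    return rows


if __name__ == "__main__":
    # Hypothetical toy counts, unrelated to the five texts above.
    toy = (
        [("pattern_A", True)] * 4
        + [("pattern_B", True)] * 2
        + [("pattern_B", False)] * 2
        + [("pattern_C", False)]
        + [("pattern_D", True)]
    )
    for row in rank_patterns(toy):
        print(row)
```

    Under these assumptions, a pattern that fires exactly once gets score 0 even if that single occurrence is in a relevant text, because log2(1) = 0. This matters for a collection as small as the one above, so state in your solution how you treat such patterns.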


  2. Compare the methods for learning information extraction rules (AutoSlog, Crystal, AutoSlog-TS, Multi-level bootstrapping, ExDisco).



Helena Ahonen-Myka
Last modified: Wed Feb 27 19:03:55 EET 2002