Antoine Doucet and Helena Ahonen-Myka, An efficient any language approach for the integration of phrases in document retrieval, in International Journal of Language Resources and Evaluation, special issue on "Multiword expressions: hard going or plain sailing?", Springer, 44 (1-2): p.159-180, 2010.
Antoine Doucet and Helena Ahonen-Myka, Statistical Methods for the Evaluation of Indexing Phrases, to appear in Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (KDIR 2010), Valencia, Spain, 9 pages, October 2010.
Gaël Dias, Rumen Moraliyski, João Paulo Cordeiro, Antoine Doucet and Helena Ahonen-Myka, Automatic Discovery of Word Semantic Relations using Paraphrase Alignment and Distributional Lexical Semantics Analysis, in Journal of Natural Language Engineering. Special Issue on Distributional Lexical Semantics, Cambridge Journals, volume 16, issue 4, pp. 439-467, 2010.
Seppo Nyrkkö, Lauri Carlson, Matti Keijola, Helena Ahonen-Myka, Jyrki Niemi, Jussi Piitulainen, Sirke Viitanen, Martti Meri, Lauri Seitsonen, Petri Mannonen, Jani Juvonen: Ontology-based Knowledge in Interactive Maintenance Guide. 40th Annual Hawaii International Conference on System Sciences (HICSS'07).
Antoine Doucet and Helena Ahonen-Myka, Fast extraction of discontiguous sequences in text: a new approach based on maximal frequent sequences, in Proceedings of IS-LTC 2006, Information Society - Language Technologies Conference, Ljubljana, Slovenia, October 9-14, 2006, p. 186-191.
Antoine Doucet and Helena Ahonen-Myka. Probability and Expected Document Frequency of Discontinued Word Sequences, an efficient method for their exact computation. TAL journal, special issue on "Scaling of Natural Language Processing: Complexity, Algorithms and Architectures", 46 (2): 25 pages, 2006.
Antoine Doucet and Helena Ahonen-Myka. A Method to Calculate Probability and Expected Document Frequency of Discontinued Word Sequences. In proceedings of ACM SIGIR 2005, ELECTRA Workshop on Methodologies and Evaluation of Lexical Cohesion Techniques in Real-world Applications (Beyond Bag of Words), Salvador, Brazil, August 15-19, 2005, p. 33-40.
Helena Ahonen-Myka. Mining all maximal frequent word sequences in a set of sentences. Proceedings of the 14th International Conference on Information and Knowledge Management (CIKM 2005), poster, p. 255-256, ACM Press 2005.
H. Ahonen-Myka and A. Doucet. Data mining meets collocations discovery. In Inquiries into Words, Constraints, and Contexts. Festschrift for Kimmo Koskenniemi on his 60th Birthday. A. Arppe, L. Carlson, K. Lindén, J. Piitulainen, M. Suominen, M. Vainio, H. Westerlund and A. Yli-Jyrä, eds. p. 194-203. CSLI Studies in Computational Linguistics ONLINE. Copestake, Ann (Series Editor). ISSN 1557-5772. CSLI Publications, Stanford, California. Available on-line at: http://cslipublications.stanford.edu/site/SCLO.html
Juha Makkonen, Helena Ahonen-Myka, and Marko Salmenkivi. Simple Semantics in Topic Detection and Tracking. Information Retrieval 7,3-4(2004), p. 347-368.
Antoine Doucet, Helena Ahonen-Myka, Non-Contiguous Word Sequences for Information Retrieval. In Proceedings of the 42nd annual meeting of the Association for Computational Linguistics (ACL-2004), Workshop on Multiword Expressions: Integrating Processing, Barcelona, Spain, July 21-26, 2004, pp. 88-95.
Juha Makkonen, Helena Ahonen-Myka, Utilizing Temporal Expressions in Topic Detection and Tracking. In Proceedings of 7th European Conference on Research and Advanced Technology for Digital Libraries (ECDL03), August 2003, Trondheim, Norway, pp. 393-404.
Juha Makkonen, Helena Ahonen-Myka, Marko Salmenkivi, Topic Detection and Tracking with Spatio-temporal Evidence. In Proceedings of 25th European Conference on Information Retrieval Research (ECIR 2003), April 2003, Pisa, Italy, pp. 251-265.
Helena Ahonen-Myka. Discovery of frequent word sequences in text. The ESF Exploratory Workshop on Pattern Detection and Discovery in Data Mining, Imperial College, London, 16-19 September, 2002.
Antoine Doucet and Helena Ahonen-Myka, Naive clustering of a large XML document collection. In Proceedings of the First Annual Workshop of the Initiative for the Evaluation of XML retrieval (INEX), Schloss Dagstuhl, Germany, December 9-11, 2002, ERCIM Workshop Proceedings, March 2003, pp. 81-88.
Juha Makkonen, Helena Ahonen-Myka, Marko Salmenkivi, Applying Semantic Classes in Event Detection and Tracking. In Proceedings of International Conference on Natural Language Processing (ICON 2002), December 2002, Mumbai, India, pp. 175-183.
Marko Salmenkivi, Juha Makkonen, Helena Ahonen-Myka, Topic Detection and Tracking based on Extracting Words with Meaning of the Same Type. In Proceedings of 10th Finnish Artificial Intelligence Conference (STeP 2002), December 2002, Oulu, Finland, pp. 19-30.
Helena Ahonen-Myka, Barbara Heikkinen, Oskari Heinonen, and Mika Klemettinen. Printing Structured Text without Stylesheets. In XML Scandinavia 2000, May 2-4, Gothenburg, Sweden, 2000.
Helena Ahonen-Myka. Finding All Frequent Maximal Sequences in Text. Proceedings of the 16th International Conference on Machine Learning ICML-99 Workshop on Machine Learning in Text Data Analysis, eds. D. Mladenic and M. Grobelnik, p. 11-17, J. Stefan Institute, Ljubljana 1999.
Helena Ahonen-Myka, Oskari Heinonen, Mika Klemettinen, and A. Inkeri Verkamo. Finding Co-occurring Text Phrases by Combining Sequence and Frequent Set Discovery. Proceedings of 16th International Joint Conference on Artificial Intelligence IJCAI-99 Workshop on Text Mining: Foundations, Techniques and Applications, ed. R. Feldman, p. 1-9.
Helena Ahonen. Knowledge Discovery in Documents by Extracting Frequent Word Sequences. An invited article for the special issue of Library Trends on knowledge discovery in bibliographical databases, eds. J. Qin and M.J. Norton, 48(1), Summer 1999, 160-181.
Helena Ahonen. Features of Knowledge Discovery Systems. InterCHANGE, The Newsletter of the International SGML Users' Group. April 1998, Vol. 4, Issue 2, p. 15-16.
Helena Ahonen, Oskari Heinonen, Mika Klemettinen, and A. Inkeri Verkamo: Applying Data Mining Techniques for Descriptive Phrase Extraction in Digital Document Collections. Proceedings of the IEEE Forum on Research and Technology Advances in Digital Libraries (IEEE ADL '98), April 22-24, 1998, Santa Barbara, CA, USA, p. 2-11, IEEE Computer Society, 1998.
Helena Ahonen, Barbara Heikkinen, Oskari Heinonen, Jani Jaakkola, Pekka Kilpeläinen, and Greger Lindén: Design and Implementation of a Document Assembly Workbench. Proceedings of the 7th International Conference on Electronic Publishing, EP '98, April 1-3, St. Malo, France, p. 476-486, Lecture Notes in Computer Science 1375, Springer Verlag, 1998.
Helena Ahonen, Heikki Mannila, and Erja Nikunen. Generating grammars for SGML tagged texts lacking DTD. Mathematical and Computer Modelling, 26(1), 1-13, 1997.
Helena Ahonen, Barbara Heikkinen, Oskari Heinonen, and Pekka Kilpeläinen: Assembling Documents from Digital Libraries. 8th International Conference and Workshop on Database and Expert Systems Applications (DEXA '97), Toulouse, France, September, 1997. Lecture Notes in Computer Science, Springer Verlag, 1997.
Helena Ahonen, Oskari Heinonen, Mika Klemettinen, and A. Inkeri Verkamo: Mining in the phrasal frontier. Principles of Knowledge Discovery in Databases Conference, Trondheim, Norway, June 1997. Lecture Notes in Computer Science, Springer Verlag, 1997.
Helena Ahonen, Oskari Heinonen, Mika Klemettinen, and A. Inkeri Verkamo: Applying data mining techniques in text analysis. Report C-1997-23, Department of Computer Science, University of Helsinki, 1997.
Helena Ahonen, Barbara Heikkinen, Oskari Heinonen, and Mika Klemettinen: Improving the accessibility of SGML documents - A content-analytical approach. Proceedings of SGML Europe '97 Conference, 13-15 May, Barcelona, Spain, pages 321-327, Graphic Communications Association, 1997.
H. Ahonen: Generating grammars for structured documents using grammatical inference methods. PhD thesis, Department of Computer Science, University of Helsinki, Series of Publications A, Report A-1996-4, 1996.
H. Ahonen: Automatic generation of SGML content models. Electronic Publishing - Origination, Dissemination and Design, 8(2&3), 195-206, Wiley Publishers, 1995.
H. Ahonen: Disambiguation of SGML content models. Proceedings of the Workshop on Principles of Document Processing '96, 23 September, Palo Alto, USA, 1996. Lecture Notes in Computer Science 1293, Springer-Verlag, 1997.
H. Ahonen, H. Mannila & E. Nikunen: Generating grammars for SGML tagged texts lacking DTD. In M. Murata & H. Gallaire (eds.), Proc. Workshop on Principles of Document Processing (PODP) '94. Darmstadt, 1994. Appeared also in Mathematical and Computer Modelling (see above).
H. Ahonen, H. Mannila & E. Nikunen: Forming grammars for structured documents: An application of grammatical inference. In R. Carrasco & J. Oncina (eds.), Proc. Second International Colloquium on Grammatical Inference and Applications (ICGI), Lecture Notes in Computer Science 862 (pp. 153-167). Springer-Verlag, 1994.