English recipe flow graph corpus

Yamakata, Yoko, Mori, Shinsuke and Carroll, John (2020) English recipe flow graph corpus. 12th Language Resources and Evaluation Conference, Marseille, France, 11th - 16th May 2020. Published in: Proceedings of the 12th Language Resources and Evaluation Conference. 5187-5194. European Language Resources Association (ELRA)

PDF - Published Version
Available under License Creative Commons Attribution-Non-Commercial.


We present an annotated corpus of English cooking recipe procedures, and describe and evaluate computational methods for learning these annotations. The corpus consists of 300 recipes written by members of the public, which we have annotated with domain-specific linguistic and semantic structure. Each recipe is annotated with (1) 'recipe named entities' (r-NEs) specific to the recipe domain, and (2) a flow graph representing in detail the sequencing of steps, and interactions between cooking tools, food ingredients and the products of intermediate steps. For these two kinds of annotations, inter-annotator agreement ranges from 82.3 to 90.5 F1, indicating that our annotation scheme is appropriate and consistent. We experiment with producing these annotations automatically. For r-NE tagging we train a deep neural network NER tool; to compute flow graphs we train a dependency-style parsing procedure which we apply to the entire sequence of r-NEs in a recipe. In evaluations, our systems achieve 71.1 to 87.5 F1, demonstrating that our annotation scheme is learnable.
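
To make the two annotation layers and the F1 figures more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that an r-NE can be modelled as a tagged token span, that a flow graph can be modelled as a set of labelled directed edges between r-NEs, and that agreement is scored by exact match. The tag names (ACTION, FOOD, TOOL), the edge labels (target, tool), and the names Entity, FlowGraph and f1 are hypothetical placeholders and are not taken from the corpus.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """One r-NE as a token span [start, end) with a tag (placeholder tag names)."""
    start: int
    end: int
    tag: str


@dataclass
class FlowGraph:
    """Labelled directed edges between r-NEs, stored as (head, dependent, label)."""
    edges: set = field(default_factory=set)

    def add(self, head, dependent, label):
        self.edges.add((head, dependent, label))


def f1(gold, predicted):
    """Exact-match F1 between two sets of items (entities or edges)."""
    correct = len(gold & predicted)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy step, assumed for illustration: "Chop the onion with a knife"
chop = Entity(0, 1, "ACTION")
onion = Entity(2, 3, "FOOD")
knife = Entity(5, 6, "TOOL")

annotator_a = FlowGraph()
annotator_a.add(chop, onion, "target")
annotator_a.add(chop, knife, "tool")

annotator_b = FlowGraph()
annotator_b.add(chop, onion, "target")
annotator_b.add(chop, knife, "target")  # same edge, different label

# One of two edges matches exactly, so precision and recall are both 0.5.
agreement = f1(annotator_a.edges, annotator_b.edges)
print(f"edge F1 = {agreement:.2f}")
```

The same f1 function could be applied to sets of Entity objects to score r-NE tagging, or to system output against a gold annotation instead of one annotator against another. Whether the reported scores use exact matching of this kind is an assumption here.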

Item Type: Conference Proceedings
Keywords: cooking recipe corpus, English recipes, recipe named entity, recipe flow graph, procedural text annotation
Schools and Departments: School of Engineering and Informatics > Informatics
SWORD Depositor: Mx Elements Account
Depositing User: Mx Elements Account
Date Deposited: 08 Jun 2020 07:40
Last Modified: 08 Jun 2020 07:40
URI: http://sro.sussex.ac.uk/id/eprint/91740
