We present a novel parsing algorithm for all context-free languages. The algorithm features a clean mathematical formulation: parsing is expressed as a series of standard operations on \emph{regular} languages and relations. Parsing complexity w.r.t. input length matches the state of the art: it is worst-case cubic, quadratic for unambiguous grammars, and linear for LR-regular grammars. What distinguishes our approach is that parsing can be implemented using only immutable, acyclic data structures. We also propose a parsing optimization technique called context-free memoization. It allows handling an overwhelming majority of input symbols using a simple stack and a lookup table, similarly to the operation of a deterministic LR(1) parser. This allows our proof-of-concept implementation to outperform the best current implementations of common generalized parsing algorithms (Earley, GLR, and GLL). Tested on a large Java source corpus, parsing is 3–5 times faster, while recognition—35 times faster.
Fri 19 JunDisplayed time zone: Pacific Time (US & Canada) change
06:20 - 07:40 | Parsing, Debugging, and Code SearchPLDI Research Papers at PLDI Research Papers live stream Chair(s): Dan Barowy Williams College | ||
06:20 20mTalk | Faster General Parsing through Context-Free Memoization PLDI Research Papers Grzegorz Herman Jagiellonian University, Poland | ||
06:40 20mTalk | Zippy LL(1) Parsing with Derivatives PLDI Research Papers | ||
07:00 20mTalk | Debug Information Validation for Optimized Code PLDI Research Papers Yuanbo Li Georgia Institute of Technology, USA, Shuo Ding Georgia Institute of Technology, USA, Qirun Zhang Georgia Institute of Technology, USA, Davide Italiano Apple, USA | ||
07:20 20mTalk | Semantic Code Search via Equational Reasoning PLDI Research Papers Varot Premtoon Massachusetts Institute of Technology, USA, James Koppel Massachusetts Institute of Technology, USA, Armando Solar-Lezama Massachusetts Institute of Technology, USA |