Protocol Validation using Minimally Supervised Semantic Interpretation of Text



The networks that comprise the Internet are fundamental to our society, facilitating access to medical and financial services, supporting critical infrastructure such as the power grid, and enabling emergent services such as those provided by autonomous cars and IoT (Internet of Things) devices. Network behavior is dictated by a set of instructions, or protocols, developed and tested over time. Such protocols must operate correctly and comply with requirements that are usually described in textual form, in one or more specification documents. If they do not operate properly, the performance and security of a network could be compromised. The goal of this project is to increase assurance in network protocols, specifically in their compliance with specified rules, their interoperability, and their functionality. The project will accomplish this via a novel scheme that performs protocol testing through automated extraction of protocol requirements from their textual specifications. This would mark a significant advance toward automated mechanisms that assure that network protocols behave as expected, making networks more reliable and secure.

This multidisciplinary project combines expertise from natural language processing and computer networks to create methodologies, frameworks, a knowledge base, and tools for protocol validation, targeting (1) compliance checking, (2) bug finding, and (3) interoperability testing. The general approach is to apply machine learning, semantic parsing, and information extraction techniques to structured text (RFCs, Internet-Drafts) and unstructured text (blogs, forums, and bug reports). The extracted information is organized into a knowledge base about the protocols, containing formal information such as message formats, protocol state machines, and constraints, as well as semi-formal information such as temporal properties, tuning conditions and parameters, changes from one version to another, and known bugs. The knowledge base is then used to validate protocol implementations through protocol fuzzing, program analysis, software model checking, and measurement methods, in order to check whether implementations comply with their specifications, to detect semantic bugs that depend on intrinsic protocol properties, and to check for interoperability issues between different versions or protocol stacks. This work is guided by protocols from three representative domains: TCP variants, the SDN ecosystem, and IoT smart home environments.
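As a rough illustration of how a message format stored in such a knowledge base could drive fuzzing, the sketch below models a format as an ordered list of fixed-width fields and generates random in-range values for each. It is not the project's implementation: the names `FieldSpec`, `MessageFormat`, `TOY_HEADER`, `fuzz_message`, and `pack` are hypothetical, the toy header does not correspond to any real protocol, and the format is written by hand here rather than extracted from text.

```python
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FieldSpec:
    """One header field: a name and a width in bits (hypothetical schema)."""
    name: str
    bits: int


@dataclass
class MessageFormat:
    """A message format as it might be stored in a knowledge base."""
    protocol: str
    fields: List[FieldSpec]


# Toy, made-up header used only for illustration: 8 + 8 + 16 = 32 bits.
TOY_HEADER = MessageFormat(
    protocol="toy",
    fields=[FieldSpec("type", 8), FieldSpec("flags", 8), FieldSpec("length", 16)],
)


def fuzz_message(fmt: MessageFormat, rng: random.Random) -> Dict[str, int]:
    """Pick a random value for every field, bounded by the field's width."""
    return {f.name: rng.randrange(1 << f.bits) for f in fmt.fields}


def pack(fmt: MessageFormat, values: Dict[str, int]) -> bytes:
    """Concatenate field values in order, big-endian.

    Assumes the total width is a multiple of 8 bits; otherwise the result
    is right-aligned in the final byte string.
    """
    total_bits = sum(f.bits for f in fmt.fields)
    acc = 0
    for f in fmt.fields:
        mask = (1 << f.bits) - 1
        acc = (acc << f.bits) | (values[f.name] & mask)
    return acc.to_bytes((total_bits + 7) // 8, "big")


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for _ in range(3):
        msg = fuzz_message(TOY_HEADER, rng)
        print(msg, pack(TOY_HEADER, msg).hex())
```

Because the generator only needs field names and widths, a format recovered from a specification could in principle be substituted for the hand-written one; constraints and state-machine information would need a richer representation than this sketch provides.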



    • Leveraging Textual Specifications for Grammar-based Fuzzing of Network Protocols. Samuel Jero, Maria L. Pacheco, Dan Goldwasser, and Cristina Nita-Rotaru. Thirty-First Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-19), Honolulu, Hawaii, USA, 2019.



    Current Members

    • Ben Weintraub


This project is funded by NSF grant NeTS:1815219, PI Cristina Nita-Rotaru. This is a collaboration with Dan Goldwasser, Purdue University.