April 3 Organizational meeting
Organizational meeting: Sign up to give a talk!
April 10 Resource Limits for Haskell
Speaker: Edward Yang
Abstract: We describe the first iteration of a resource limits system for Haskell, taking advantage of the key observation that resource limits share semantics and implementation strategy with profiling. We pay special attention to the problem of limiting resident memory usage: we describe a simple implementation technique for carrying out incremental heap censuses and describe a novel information-flow control solution for handling forcible resource reclamation. This system is implemented as a set of patches to GHC.
April 17 Searching on Encrypted Data without Revealing the Search Predicate
Speaker: Ananth Raghunathan
Abstract: Starting with the work of Boneh et al. (EUROCRYPT '04) the past decade has seen extensive work on public-key encryption schemes that allow searching on encrypted data. Such schemes have applications in payment gateways that route transactions differently based on the type of transaction. They also have a similar application in systems that filter and forward encrypted emails. Typically, in such a scheme, the secret key holder can construct special tokens that allow anyone to test whether a ciphertext has a particular underlying plaintext (or more generally, whether the plaintext satisfies a particular predicate).
Due to the public-key nature of the encryption scheme, previous efforts ruled out the possibility of protecting the predicate used to construct a token as an adversary may try to learn the predicate by encrypting arbitrary plaintexts of his choice. However, a security notion that captures predicate privacy is easily motivated by considering the payment gateway scenario where the predicate itself might contain sensitive information.
In this work, we show that it is possible to define a security notion protecting predicate privacy and we construct the first public-key searchable encryption schemes that satisfy this notion.
Joint work with: Dan Boneh and Gil Segev
April 24 No speaker
May 1 Layout Randomization and Nondeterminism
Speaker: Jeremy Planul
Abstract: In security, layout randomization is a popular, effective attack mitigation technique. Recent work has aimed to explain it rigorously, focusing on deterministic systems. In this paper, we study layout randomization in the presence of nondeterministic choice. We develop a semantic approach based on denotational models and simulation relations. This approach abstracts from language details, and helps manage the delicate interaction between probabilities and nondeterminism.
May 8 Privacy-Preserving Data Mining
Speaker: Valeria Nikolaenko
Abstract: Data mining known as knowledge discovery attempts to extract meaningful information from large volumes of data. In the information age data mining is a rapidly developing field that connects Artificial Intelligence, Statistics and Databases. In current implementations, data mining algorithms must see the data in the clear, that gives rise to confidentiality and privacy concerns.
In this talk I'll show a generic design of a system for building any data-mining algorithm without the data in the clear, I'll focus on a concrete case of building a privacy preserving algorithm for ridge regression. Ridge regression takes as input a large number of data points and finds the best-fit linear curve through these points. The system outputs the best-fit curve in the clear, but exposes no other information about the input data.
May 15 Testing Noninterference, Quickly
Speaker: Catalin Hritscu (U Penn)
Abstract: Information-flow control mechanisms are difficult to design and labor intensive to prove correct. To reduce the time wasted on doomed proofs for broken definitions, we advocate modern random testing techniques for finding counterexamples during the design process. We show how to use QuickCheck, a property-based random-testing tool, to guide the design of a simple information-flow abstract machine. We find that both sophisticated strategies for generating well-distributed random programs and readily falsifiable formulations of noninterference properties are critically important. We propose several approaches and evaluate their effectiveness on a collection of injected bugs of varying subtlety. We also present an effective technique for shrinking large counterexamples to minimal, easily comprehensible ones. Taken together, our best methods enable us to quickly and automatically generate simple counterexamples for all these bugs.
Bio: Catalin is Research Associate at the University of Pennsylvania, where he is working on the DARPA-funded CRASH/SAFE project under the guidance of Prof. Benjamin C. His research is primarily focused on developing rigorous formal techniques for solving security problems. Catalin received a PhD in Computer Science from Saarland University in 2012. While in Saarbrücken, he held fellowships from Microsoft Research Cambridge and the International Max Planck Research School for Computer Science.
May 22 Native Client
Speaker: Manolis Papadakis
Abstract: In this talk we review the techniques that browser vendors have employed to achieve near-native-speed code execution on the browser. We focus on Google Chrome's approach, Native Client. We describe its underlying technique, Software Fault Isolation, how it was originally formulated, and how it was adapted to modern CPU architectures. We talk about recent advances on the original Native Client architecture, and how it has been received among browser vendors.
May 29 Understanding the Challenges with Medical Data Segmentation for Privacy
Speakers: Ellick Chan and Eric Lam
Abstract: Electronic Health Records (EHRs) are perceived by some government agencies and law makers as a path to significant improvement in healthcare, and patient privacy is an important consideration in the adoption of EHRs. Medical record segmentation is one of the techniques to provide privacy and protect against discrimination for certain medical conditions such as STDs, substance abuse and mental health, by sequestering or redacting certain medical codes from a patient's record.
We present an initial study that describes an approach for segmenting sensitive medical codes to protect patient privacy and to comply with privacy laws. Firstly, we describe segmentation strategies for sensitive codes, and explore the link between these concepts using sources of medical knowledge. Secondly, we mine medical knowledge sources for correlations between concepts. Thirdly, we describe an approach that an attacker may use to infer redacted codes based off second order knowledge. More specifically, the attacker could use the presence of multiple related concepts to strengthen the attack. Finally, we evaluate defensive approaches against techniques that an adversary may use to infer the segmented condition.