Stanford Security Lunch
Summer 2020

June 24, 2020 The Price is (Not) Right: Comparing Privacy in Free and Paid Apps

Speaker:  Catherine Han (UC Berkeley)

Abstract:  It is commonly assumed that "free" mobile apps come at the cost of consumer privacy and that paying for apps could offer consumers protection from behavioral advertising and long-term tracking. This work empirically evaluates the validity of this assumption by comparing the privacy practices of free apps and their paid premium versions, while also gauging consumer expectations surrounding free and paid apps. We use both static and dynamic analysis to examine 5,877 pairs of free Android apps and their paid counterparts for differences in data collection practices and privacy policies between pairs. To understand user expectations for paid apps, we conducted a 998-participant online survey and found that consumers expect paid apps to have better security and privacy behaviors. However, there is no clear evidence that paying for an app will actually guarantee protection from extensive data collection in practice. Among pairs whose free version had at least one third-party library, 45% of the paid versions reused all of the same third-party libraries as their free versions; among pairs whose free version held at least one dangerous permission, 74% of the paid versions had all of the dangerous permissions held by the free app. Likewise, our dynamic analysis revealed that 32% of the paid apps exhibit all of the same data collection and transmission behaviors as their free counterparts. Finally, we found that 40% of apps did not have a privacy policy link in the Google Play Store and that only 3.7% of the pairs that did have one reflected differences between the free and paid versions.

Paper:  PETS 2020

July 22, 2020 Proof-Carrying Data from Accumulation Schemes

Speaker:  Pratyush Mishra (UC Berkeley)

Abstract:  Recursive proof composition has been shown to lead to powerful primitives such as incrementally-verifiable computation (IVC) and proof-carrying data (PCD). All existing approaches to recursive composition require SNARKs with sublinear verification. In recent work, Bowe, Grigg, and Hopwood (ePrint 2019/1021) outlined a novel approach to recursive composition, and applied it to a particular SNARK construction which does not have sublinear verification. In this talk I will present a formalization of this approach called an 'accumulation scheme', and show how to construct PCD from SNARKs with accumulation schemes, even if the SNARK verifier is not sublinear. I will also present some accumulation schemes for SNARKs, which yield PCD schemes with novel properties via this construction.

August 19, 2020 Routing for Anonymous Communication

Speaker:  Kyle Hogan (MIT)

Abstract:  Anonymous communication requires indistinguishable traffic patterns: in many schemes, communication proceeds in synchronous rounds, where every user must both send and receive a message in each round. Implicitly, this assumes minimum bandwidth and maximum latencies on the underlying network. Real-world internet conditions, such as congestion, outages, and suboptimal routing, hurt performance and reliability. When these conditions delay messages beyond the end of a round, synchronicity assumptions do not hold and user anonymity suffers. We propose an overlay routing protocol for widely distributed anonymity systems that allows the discovery and use of faster paths across the internet. Nodes in such systems can send traffic via other nodes, rather than using default internet routes. This allows routing around congestion or bad paths, ensuring that messages arrive on time. We use empirical evidence from Tor, an anonymous communication system used by millions daily, to argue that such a technique would benefit general anonymous communication systems.
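
The idea of sending traffic via another node instead of the default internet route can be illustrated with a minimal sketch. This is not the protocol presented in the talk: the function name `best_path`, the node names, the latency values, and the round deadline are all hypothetical, and only single-relay detours are considered.

```python
from typing import Dict, List, Tuple

# Hypothetical measured one-way latencies in milliseconds, keyed by (from, to).
Latency = Dict[Tuple[str, str], float]


def best_path(src: str, dst: str, nodes: List[str],
              latency_ms: Latency) -> Tuple[List[str], float]:
    """Return the faster of the direct route and any one-relay overlay route."""
    inf = float("inf")
    # Start from the default (direct) internet route.
    best = ([src, dst], latency_ms.get((src, dst), inf))
    for relay in nodes:
        if relay in (src, dst):
            continue
        # Latency of the detour src -> relay -> dst.
        total = (latency_ms.get((src, relay), inf)
                 + latency_ms.get((relay, dst), inf))
        if total < best[1]:
            best = ([src, relay, dst], total)
    return best


# Illustrative values only: the direct A -> C route is congested.
latency = {("A", "C"): 180.0, ("A", "B"): 40.0, ("B", "C"): 50.0}
path, total_ms = best_path("A", "C", ["A", "B", "C"], latency)

ROUND_DEADLINE_MS = 100.0  # assumed round length for this example
arrives_in_round = total_ms <= ROUND_DEADLINE_MS
```

With these made-up numbers the direct route would miss the assumed round deadline, while the detour through node B arrives within it.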

September 09, 2020 nPrint: A Standard Data Representation for Network Traffic Analysis

Speaker:  Jordan Holland (Princeton)

Abstract:  Conventional detection and classification ("fingerprinting") problems involving network traffic commonly rely on either rule-based expert systems or machine learning models that are trained with manually engineered features derived from network traffic. Automated approaches in this area are typically tailored for specific problems. This paper presents nPrint, a standard, packet-based representation of network traffic that can be used as an input to train a variety of machine learning models without extensive feature engineering. We demonstrate that nPrint offers a suitable traffic representation for machine learning algorithms across three common network traffic classification problems: device fingerprinting, operating system fingerprinting, and application identification. We show that models trained with nPrint are at least as accurate as widely used tools, but in contrast do not rely on brittle, manually updated rules and features. Finally, we release nPrint as a publicly available software tool to encourage further use, testing, and extensions to the existing network traffic representation.


September 16, 2020 Using SMT Solvers to Automate Chosen Ciphertext Attacks

Speaker:  Max Zinkus (JHU)

Abstract:  In this work we investigate the problem of automating the development of adaptive chosen ciphertext attacks on systems that contain vulnerable format oracles. Unlike previous attempts, which simply automate the execution of known attacks, we consider a more challenging problem: to programmatically derive a novel attack strategy, given only a machine-readable description of the plaintext verification function and the malleability characteristics of the encryption scheme. We present a new set of algorithms that use SAT and SMT solvers to reason deeply over the design of the system, producing an automated attack strategy that can entirely decrypt protected messages. Developing our algorithms required us to adapt techniques from a diverse range of research fields, as well as to explore and develop new ones. We implement our algorithms using existing theory solvers. The result is a practical tool called Delphinium that succeeds against real-world and contrived format oracles. To our knowledge, this is the first work to automatically derive such complex chosen ciphertext attacks.

Paper:  ePrint 2019/958