<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Prossimo</title>
    <link>https://deploy-preview-138--prossimo.netlify.app/</link>
    <description>ISRG&#39;s Prossimo is moving critical software to memory safe code.</description>
    <language>en</language>
    <lastBuildDate>Thu, 09 Apr 2026 00:00:00 +0000</lastBuildDate>
    <generator>Hugo v0.148.2</generator>
    <atom:link href="https://deploy-preview-138--prossimo.netlify.app/index.xml" rel="self" type="application/rss+xml" />
      <item>
        <title>Q1 2026 Rustls Performance Update</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/26q1-rustls-performance/</link>
        <pubDate>Thu, 09 Apr 2026 00:00:00 +0000</pubDate>
        <description><![CDATA[<h2 id="overview">Overview</h2>
<p>Offering top-tier performance is a primary goal for the <a href="https://github.com/rustls/rustls">Rustls</a> project. As such, the project has developed benchmarks representing some of the most performance-critical functions and monitors them closely.</p>
<p>The Rustls project <a href="https://rustls.dev/perf/">periodically publishes test results</a> that compare Rustls performance to two other popular TLS libraries, OpenSSL and BoringSSL.</p>
<p>The previously published test results are from <a href="https://rustls.dev/perf/2025-07-31-report/">July of 2025</a>. The Rustls project plans to publish performance reports more frequently going forward.</p>
<p>Here's how library versions have changed since the previous results:</p>
<ul>
<li>Rustls: 0.23.31 (aws-lc-rs 1.13.1) -&gt; 0.23.37 (aws-lc-rs 1.16.0)</li>
<li>OpenSSL: 3.5.1 -&gt; 3.6.1</li>
<li>BoringSSL: July 2024 -&gt; March 2026 snapshot</li>
</ul>
<p>The testing discussed here was done in March of 2026.</p>
<h2 id="results">Results</h2>
<p><img src="/images/blog/blog-2026-04-09-bulk-throughput.png" alt="Bulk throughput speed comparison across BoringSSL, OpenSSL, and Rustls"></p>
<p><img src="/images/blog/blog-2026-04-09-full-handshake.png" alt="Full handshake speed comparison across BoringSSL, OpenSSL, and Rustls"></p>
<p><img src="/images/blog/blog-2026-04-09-resumed-handshake.png" alt="Resumed handshake speed comparison across BoringSSL, OpenSSL, and Rustls"></p>
<h2 id="analysis">Analysis</h2>
<div class="trophy-table-wrap">
<table class="trophy-table">
  <thead>
    <tr>
      <th scope="col">Library</th>
      <th scope="col">First Place</th>
      <th scope="col">Second Place</th>
      <th scope="col">Third Place</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th scope="row">Rustls</th>
      <td>14</td>
      <td>2</td>
      <td>0</td>
    </tr>
    <tr>
      <th scope="row">BoringSSL</th>
      <td>2</td>
      <td>10</td>
      <td>4</td>
    </tr>
    <tr>
      <th scope="row">OpenSSL</th>
      <td>0</td>
      <td>4</td>
      <td>12</td>
    </tr>
  </tbody>
</table>
</div>
<p>Since the last tests in July of 2025, neither BoringSSL nor OpenSSL has significantly improved or regressed in any test. Rustls got a bit faster in a few tests, likely because of improvements to the underlying cryptography via updates to <a href="https://github.com/aws/aws-lc-rs">aws-lc-rs</a>.</p>
<p>The results on the whole are roughly what we'd expect. OpenSSL is <a href="https://www.feistyduck.com/newsletter/issue_132_openssl_performance_still_under_scrutiny">known</a> to have serious performance issues, BoringSSL avoids most of those, and Rustls takes performance a step further.</p>
<h2 id="looking-forward">Looking Forward</h2>
<p>Rustls 0.24 will be released this year with a large number of changes focused on building a strong foundation for a 1.0 release. In particular, Rustls is making some changes to its APIs that will serve users better for the long term.</p>
<p>One performance-related change we are pursuing is &quot;split mode&quot;, in which a connection can be split, after the TLS handshake, into sender and receiver objects. The sender can send TLS-protected data, while the receiver can receive it. Historically this has been challenging because in TLS, a receiver may occasionally need to write (e.g. to send an alert). To address this challenge, the split objects have an internal relationship that ensures that (for example) if the receiver object needs to send a message, it can do so transparently. This is a very infrequent occurrence, so it doesn't cause contention in normal use.</p>
<p>The intention is that those objects can be used on separate threads, allowing the total throughput of a connection to be roughly doubled. The measurements above are per-core, and we're not aware of other TLS libraries that allow one connection to be used from two threads like this, so the send and receive measurements for Rustls could effectively be added together. We'll confirm that is practically possible in a future performance report.</p>
<p>All of this assumes an application that can be structured to benefit from full-duplex use of a single connection. Luckily, <code>tokio</code> already has this pattern as a first-class concept. We hope the Rustls split-mode feature will contribute to cementing Rust as a great choice for mega-fast and safer network services.</p>
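<p>To make the internal-relationship idea concrete, here is a hypothetical sketch in Rust. It is not the actual Rustls API (the split-mode design is still in progress); the names <code>Sender</code>, <code>Receiver</code>, and <code>split</code> are invented, and real TLS record protection is replaced with a byte pass-through:</p>

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical illustration of the split-mode pattern, not the real
// Rustls API: the two halves share a small piece of locked state so
// the receiver can queue a rare outbound message (e.g. an alert) for
// the sender to flush.
struct Shared {
    pending_alerts: Vec<Vec<u8>>,
}

pub struct Sender {
    shared: Arc<Mutex<Shared>>,
}

pub struct Receiver {
    shared: Arc<Mutex<Shared>>,
}

impl Sender {
    // Flush any alert the receiver queued, then send application data.
    // (Record protection is elided; bytes pass through unchanged.)
    pub fn send(&self, plaintext: &[u8]) -> Vec<u8> {
        let mut shared = self.shared.lock().unwrap();
        let mut wire = Vec::new();
        for alert in shared.pending_alerts.drain(..) {
            wire.extend(alert);
        }
        wire.extend(plaintext);
        wire
    }
}

impl Receiver {
    // Receiving occasionally requires writing; queueing the message
    // lets that happen without the receiver owning the write half.
    pub fn recv(&self, record: &[u8]) -> Vec<u8> {
        if record == &b"close"[..] {
            let alert = b"alert:close_notify".to_vec();
            self.shared.lock().unwrap().pending_alerts.push(alert);
        }
        record.to_vec()
    }
}

// Split one "connection" into independently usable halves.
pub fn split() -> (Sender, Receiver) {
    let shared = Arc::new(Mutex::new(Shared { pending_alerts: Vec::new() }));
    (Sender { shared: Arc::clone(&shared) }, Receiver { shared })
}

fn main() {
    let (tx, rx) = split();
    // The halves can live on separate threads; the lock is touched
    // only briefly, so contention is minimal in normal use.
    let handle = thread::spawn(move || rx.recv(b"close"));
    handle.join().unwrap();
    let wire = tx.send(b"hello");
    assert!(wire.starts_with(b"alert:close_notify"));
    println!("sender flushed the receiver's queued alert before data");
}
```

<p>The key design point this sketch shows is that the shared lock is only involved on the rare write-from-receiver path, so ordinary sending and receiving on the two threads proceed without contention.</p>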
<p>Large releases like this can be a mixed bag for performance, but the team monitors for regressions closely, so at a minimum no significant regressions are expected. For some of the closely contested results it's possible that Rustls 0.24 could pick up or lose a place. If that happens, we'll prioritize improvements that get Rustls back to its prior position.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/26q1-rustls-performance/</guid>
      </item><item>
        <title>Four Years of Momentum: Craig Newmark Philanthropies and the Future of Memory Safety</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/2026-craig-newmark/</link>
        <pubDate>Wed, 11 Feb 2026 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>We are pleased to announce that Craig Newmark Philanthropies has renewed its support for the Internet Security Research Group (ISRG) with a $100,000 grant for 2026.</p>
<p>This marks the fourth consecutive year of support from craigslist founder Craig Newmark. As a pioneer in the &quot;Cyber Civil Defense&quot; movement and a former programmer himself, Craig has long understood that memory safety vulnerabilities are an avoidable but critical threat to cybersecurity. His early and consistent support has allowed ISRG's <a href="https://www.memorysafety.org/">Prossimo</a> project to move from research and development to real-world deployments.</p>
<h2 id="delivering-on-the-promise-of-memory-safety">Delivering on the Promise of Memory Safety</h2>
<p>When we first announced Craig's support in 2023, we were focused on building the tools. In 2024/25, we were refining them. In 2026, we are seeing them deployed at scale:</p>
<ul>
<li>
<p>sudo: Our memory-safe implementation became the default in Ubuntu in late 2025, improving security for millions of users.</p>
</li>
<li>
<p>Rustls: We continue to work with organizations that have added Rustls to their roadmap, and we are seeking input from potential adopters.</p>
</li>
<li>
<p>Hickory DNS: As the world's first open-source, memory-safe, fully recursive DNS resolver, Hickory is on track for production use by mid-2026 in the Let's Encrypt infrastructure.</p>
</li>
</ul>
<h2 id="the-value-of-long-term-advocacy">The Value of Long-Term Advocacy</h2>
<p>We are grateful for Craig's continued trust as we build more resilient digital systems. This support allows our team to focus on the technical work that makes the internet fundamentally safer for everyone. We hope Craig's leadership will inspire others to consider how they, too, can leave a legacy of a better internet.</p>
<p>As we look ahead to 2026, we are proud to continue this work alongside those who understand the importance of protecting the internet.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/2026-craig-newmark/</guid>
      </item><item>
        <title>A Note from our Executive Director</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/eoy-letter-2025/</link>
        <pubDate>Mon, 29 Dec 2025 00:00:00 +0000</pubDate>
        <description><![CDATA[<div class="card border-0 pic-quote-right">
    <img alt="Josh Aas" class="mx-auto img-fluid" src="/images/blog/Josh-Aas-Headshot.jpg" />
</div>
<p>This letter was originally published in our <a href="https://abetterinternet.org/documents/2025-ISRG-Annual-Report.pdf">2025 Annual Report</a>.</p>
<p>This year was the 10th anniversary of Let's Encrypt. We've come a long way! Today we're serving more than 700 million websites, issuing ten million certificates on some days. Most importantly, when we started, 39% of page loads on the Internet were encrypted. Today, in many parts of the world, over 95% of all page loads are encrypted. We can't claim all the credit for that, but we're proud of the leading role we played. Being able to help ISRG and Let's Encrypt get to where we are today has been the opportunity of a lifetime for me.</p>
<p>There's more I could talk about from the past ten years, but this 10th year was about as good as any before it so I want to focus on our most recent work. I'll get the headline for 2025 out right away: over the past year we went from serving 492 million websites to 762 million. That's a 50% increase in a single year, equivalent to the growth we saw over our first six years of existence combined. Our staff did an amazing job accommodating the additional traffic.</p>
<p>I'm also particularly proud of the things we did to improve privacy this year, across all of our projects.</p>
<p>At the start of 2025 we were serving over four billion Online Certificate Status Protocol (OCSP) requests per day. That's 180 million per hour, or 50,000 per second. OCSP has long been an important mechanism for providing certificate revocation information, but the way it works is bad for privacy. It requires browsers to check with certificate authorities about every website they visit, which amounts to providing your browsing history to third parties. Let's Encrypt never held onto that data; it got dropped immediately. However, there is no way to know whether that was standard practice across the industry, and even well-intentioned CAs could make a mistake or be compelled to save that data. It was a system ripe for abuse, so we decided to become the first major CA to turn off its OCSP service. We couldn't be sure what the full impact would be, but this was a way in which the Internet needed to get better. In August of 2025 we did so. There was no major fallout and we haven't looked back.</p>
<p>Another big privacy-focused change we made to Let's Encrypt in 2025 was no longer storing subscriber email addresses in our CA database, associated with issuance data. In June of this year we stopped adding the optional email addresses that subscribers send to our database, and we deleted the millions of email addresses that had accumulated over the years. Making this change was not an easy thing to decide to do—it limits our ability to contact subscribers and we had to turn off our expiration reminder email service—but we feel the ecosystem has grown enough over the past ten years that the privacy implications of holding onto the email addresses outweighed the utility.</p>
<p>Privacy was at the forefront for the folks at ISRG researching human digital identity as well. They have been hard at work on an implementation of the Anonymous Credentials from ECDSA scheme, also known as <a href="https://datatracker.ietf.org/doc/draft-google-cfrg-libzk/">Longfellow</a>. This is a cryptographic library that can be used in digital identity management, including things like digital wallets, in order to improve privacy when sharing credentials. Digital identity systems should have strong privacy and compatibility requirements, but such requirements pose challenges that existing digital credential technologies are going to struggle to meet. New schemes such as Longfellow aim to address these challenges, bringing privacy improvements to systems that need to work with existing cryptographic hardware. This is exciting stuff, but not easy to build (so much math!)—watching our talented engineers make progress has been thrilling.</p>
<p>The last example of great privacy work I want to highlight from 2025 is our Prossimo project's work towards encrypted recursive-to-authoritative DNS. Prossimo is focused on bringing memory safety to critical software infrastructure, but sometimes that dovetails nicely with other initiatives. DNS queries are fundamental to the operation of the Internet. Without getting into the details here too much, there are basically two types of DNS queries: stub-to-recursive and recursive-to-authoritative. A lot of work has gone into encrypting stub queries over the past decade, mostly through DNS over HTTPS (DoH) initiatives. Authoritative queries, however, remain almost entirely unencrypted. This is a particular problem for Certificate Authorities like Let's Encrypt. During 2025, our Prossimo project started work on changing that, investing heavily in encrypted authoritative resolution by implementing <a href="https://datatracker.ietf.org/doc/rfc9539/">RFC 9539</a> Unilateral Opportunistic Deployment of Encrypted Recursive‑to‑Authoritative DNS and other related improvements in Hickory DNS. Once this is ready, early in 2026, Hickory DNS will be a high performance and memory safe option that DNS operators can use to start making and receiving encrypted authoritative DNS queries. It can also be used for integration testing with other DNS implementations.</p>
<p>It's wonderful, and a real responsibility, to be able to have this kind of positive impact on the lives of everyone using the Internet. Charitable contributions from people like you and organizations around the world make what we do possible. We are particularly grateful to Jeff Atwood, Betsy Burton, and Stina Ehrensvärd for their special gifts this year. Since 2015, tens of thousands of people have donated. They've made a case for corporate sponsorship, given through their DAFs, or set up recurring donations. If you're one of those people, thank you. If you're considering becoming a supporter, I hope this annual report will make the case that we're making every dollar count.</p>
<p>Every year we aim to make the dollars entrusted to us go as far as possible, and next year will be no exception.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/eoy-letter-2025/</guid>
      </item><item>
        <title>Improving Error Handling in Rustls</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/rustls-error-handling/</link>
        <pubDate>Wed, 01 Oct 2025 00:00:00 +0000</pubDate>
        <description><![CDATA[<div>
  <blockquote class="blockquote">
    <span class="quote"></span>
    <div class="quote-text">
      <p class="font-italic lh-170">Dirkjan Ochtman is a maintainer of the Rustls TLS library that we've invested in since 2021. While he and the other maintainers have made many improvements and landed important features, we've asked Dirkjan to talk about another important part of increasing adoptability of Rustls: reducing friction in how it handles errors.</p>
      <footer class="blockquote-footer"><cite title="Source Title">Josh Aas, Head of ISRG's Prossimo project</cite></footer>
    </div>
  </blockquote>
</div>
<p>The Rustls team has been improving error handling in Rustls over the past couple of years, and today I'd like to talk about why and how we've done it.</p>
<p>Last year, I kept getting a <code>NotValidForName</code> error from a Rustls client connecting to an HTTPS API service I was trying to debug:</p>
<pre tabindex="0"><code>invalid peer certificate: NotValidForName
</code></pre><p>I thought it was frustrating that the error didn't provide more context to help me understand how to fix the problem. As a Rustls maintainer, I shared my frustration in our <a href="https://discord.gg/MCSB76RU96">Discord channel</a>, and we agreed that this was something we could improve on. Two days later, I submitted an initial <a href="https://github.com/rustls/webpki/pull/301">PR for gathering up the required context</a>. I spent some time preparing our collection of libraries to make this work, and a few months later Rustls <a href="https://github.com/rustls/rustls/releases/tag/v%2F0.23.24">0.23.24</a> included &quot;more detailed and helpful error reporting for common certificate errors&quot;:</p>
<pre tabindex="0"><code>invalid peer certificate: certificate not valid for name &#34;example.com&#34;;
certificate is only valid for DnsName(&#34;www.example.com&#34;)
</code></pre><p>Three months ago, an issue came in from a reporter who saw Rustls clients failing with:</p>
<pre tabindex="0"><code>invalid peer certificate: BadSignature
</code></pre><p>One of our maintainers diagnosed this within an hour of it being reported, concluding that the configured crypto provider did not support the signature scheme the server was offering. We merged a PR one day later that <a href="https://github.com/rustls/rustls/pull/2479">changed this error</a> to <code>UnsupportedSignatureAlgorithm</code> instead, making it more obvious why the signature was bad -- and that this might be remediated by client-side configuration:</p>
<pre tabindex="0"><code>invalid peer certificate: UnsupportedSignatureAlgorithm
</code></pre><p>This fix was released two weeks later in <a href="https://github.com/rustls/rustls/releases/tag/v%2F0.23.28">0.23.28</a>. I then wanted to add more context to this error as well, and once <a href="https://github.com/rustls/rustls/releases/tag/v%2F0.23.29">0.23.29</a> was released a few weeks later, the error looked like this:</p>
<pre tabindex="0"><code>invalid peer certificate: UnsupportedSignatureAlgorithmForPublicKeyContext {
    signature_algorithm_id: [6, 8, 42, 134, 72, 206, 61, 4, 3, 4],
    public_key_algorithm_id: [6, 7, 42, 134, 72, 206, 61, 2, 1, 6, 8, 42, 134, 72, 206, 61, 3, 1, 7]
}
</code></pre><p>(Two months later, our maintainer additionally submitted a PR to aws-lc-rs to <a href="https://github.com/aws/aws-lc-rs/pull/857">add support for the missing signature scheme</a>, and Rustls <a href="https://github.com/rustls/rustls/releases/tag/v%2F0.23.32">0.23.32</a> comes with support for these built-in.)</p>
<p>Although we have so far <a href="https://www.memorysafety.org/blog/rustls-server-perf/">blogged</a> more about <a href="https://rustls.dev/perf/">our performance results</a>, as the two examples above show, we also invest quite a bit of effort in making sure Rustls is easy to use -- even when something fails. Partly this is for selfish reasons: if an error is clear, people are less likely to file an issue, and we can spend more time improving the code rather than helping folks diagnose issues. We're not just maintainers, we're users too, and when we break something during development or in our own Rust projects that rely on Rustls, errors will ideally be of high enough quality that we can minimize time spent troubleshooting. Of course, we have to be careful to avoid leaking sensitive data in error values.</p>
<p>As a result, in <a href="https://github.com/rustls/rustls/releases/tag/v%2F0.23.31">0.23.31</a> <a href="https://docs.rs/rustls/0.23.31/rustls/enum.Error.html">our Error type</a> distinguishes more than 200 machine-readable error variants, which help downstream libraries deal with specific errors. These are spread across 8 subcategories, including the <a href="https://www.iana.org/assignments/tls-parameters/tls-parameters.xhtml#tls-parameters-6">IANA-specified TLS alerts</a> as well as categories like PeerIncompatible and PeerMisbehaved. Some of these values have precise names like &quot;PeerMisbehaved::AttemptedDowngradeToTls12WhenTls13IsSupported&quot;, while others offer documentation explaining how or why they might occur. In our Rust code, all of these are specified as non-exhaustive enums, which lets us evolve our error types without affecting API stability. Our last API-incompatible release happened 18 months ago, and we have since shipped 31 releases that improved Rustls without making API-incompatible changes. The ability to improve our error handling without incompatible API changes is very important to our usability goals.</p>
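<p>The non-exhaustive enum technique is a standard Rust feature. Here is a minimal, self-contained sketch (the variants are modeled loosely on Rustls's naming, not copied from its definitions) showing why downstream code keeps compiling when a library adds error variants:</p>

```rust
// Minimal sketch of a non-exhaustive error enum. The variants here are
// illustrative, loosely modeled on Rustls's names, not its real definitions.
#[non_exhaustive]
#[derive(Debug)]
pub enum PeerMisbehaved {
    AttemptedDowngradeToTls12WhenTls13IsSupported,
    MessageInterleavedWithHandshakeMessage,
    // A later, semver-compatible release can add more variants here.
}

fn describe(err: &PeerMisbehaved) -> &'static str {
    match err {
        PeerMisbehaved::AttemptedDowngradeToTls12WhenTls13IsSupported => {
            "peer attempted an illegal TLS 1.3 -> 1.2 downgrade"
        }
        // Code outside the defining crate must keep a catch-all arm,
        // so newly added variants land here instead of breaking the build.
        _ => "peer misbehaved in another way",
    }
}

fn main() {
    let e = PeerMisbehaved::AttemptedDowngradeToTls12WhenTls13IsSupported;
    println!("{}", describe(&e));
}
```

<p>Because of <code>#[non_exhaustive]</code>, downstream crates are forced to handle unknown variants, which is exactly what allows the error vocabulary to grow across 31 releases without an API-incompatible change.</p>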
<p>The Rust compiler is known for emitting friendly error messages, thanks to an <a href="https://blog.rust-lang.org/2016/08/10/Shape-of-errors-to-come/">explicit effort</a> from the Rust project that continues to this day. I think this has raised ambition across the Rust ecosystem to make libraries and applications whose error messages are friendlier: easier to digest and ideally precise enough to help you quickly pinpoint the problem. In Rustls, I feel we've done pretty well, although there are always opportunities for further improvement. If you run into Rustls errors that are hard to understand, please file an issue so we can see if there's something to improve!</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/rustls-error-handling/</guid>
      </item><item>
        <title>Rustls Joins Rust Foundation&#39;s Rust Innovation Lab</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/rustls-joins-rust-foundation-rust-innovation-lab/</link>
        <pubDate>Wed, 03 Sep 2025 00:00:00 +0000</pubDate>
<description><![CDATA[<p>The <a href="https://rustfoundation.org/">Rust Foundation</a> just announced the launch of the <a href="https://rustfoundation.org/rust-innovation-lab/">Rust Innovation Lab</a>, with the <a href="https://github.com/rustls/rustls/">Rustls TLS library</a> as the inaugural hosted project. We're excited to see Rustls gain a new long-term administrative home where it will receive fundraising, governance, legal, marketing, and administrative support.</p>
<p>When we started the Prossimo project in 2020, we knew that investing in a TLS library that was both high performance and memory safe was a top priority. After looking at the options, we decided that the best path forward was to invest in the Rustls TLS library. Rustls was already a nice library <a href="https://www.memorysafety.org/blog/preparing-rustls-for-wider-adoption/">back then</a>, but it needed a number of features and performance optimizations, as well as a C API, to really become an attractive alternative to the most commonly used (but unfortunately not memory safe) TLS libraries.</p>
<p>Since we decided to invest in Rustls, we've raised well over a million dollars for work on the project and it has come a long way. It has a great feature set, the code quality is high, and the performance <a href="https://www.memorysafety.org/blog/rustls-server-perf/">is</a> <a href="https://www.memorysafety.org/blog/rustls-performance-outperforms/">excellent</a>. In 2024, it gained <a href="https://www.memorysafety.org/blog/rustls-with-aws-crypto-back-end-and-fips/">FIPS support.</a> It also has multiple C APIs (one mirroring the native Rust API, and one for OpenSSL compatibility) to allow it to be easily used from other programming languages besides Rust.</p>
<p>In our opinion, Rustls is now both the fastest and the safest TLS library out there. Now is a great time for other software projects to consider switching to Rustls in order to provide better security and performance to their users.</p>
<p>With features and performance in a good place, the priorities for Rustls are responsiveness to the needs of potential adopters and stable support for long-term maintenance. The Rust Innovation Lab will help Rustls pursue those priorities and more. We couldn't be happier to see it!</p>
<p>We'd like to thank Joe Birr-Pixton, the creator of Rustls, for our years of great collaboration. We'd also like to thank the organizations who supported our vision and investment: the Sovereign Tech Agency, Google, <a href="http://fly.io">Fly.io</a>, AWS, Alpha-Omega, and Craig Newmark Philanthropies.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/rustls-joins-rust-foundation-rust-innovation-lab/</guid>
      </item><item>
        <title>Opportunistic Encryption Is Coming to Hickory DNS</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/opportunistic-encryption-is-coming-to-hickory-dns/</link>
        <pubDate>Wed, 30 Jul 2025 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>ISRG creates more secure Internet infrastructure by operating the <a href="https://letsencrypt.org/">Let's Encrypt</a> certificate authority, and also by promoting the creation and adoption of memory safe software via the <a href="https://www.memorysafety.org/">Prossimo</a> project. Prossimo initiatives, most of which relate to critical Internet functionality, avoid memory corruption vulnerabilities that have plagued Internet server software for decades. We work to make Prossimo projects more secure in other ways as well, which is why we're going to be adding support for <a href="https://www.rfc-editor.org/rfc/rfc9539.html">RFC 9539</a> opportunistic encryption to <a href="https://github.com/hickory-dns/hickory-dns">Hickory DNS</a>.</p>
<p>Prossimo invests heavily in the Hickory DNS project, in part because we believe the Internet needs a high performance and memory safe <a href="https://en.wikipedia.org/wiki/Domain_Name_System">Domain Name System (DNS)</a> resolver, but also because we want to use it for Let's Encrypt. Let's Encrypt performs huge numbers of DNS queries in order to issue millions of certificates per day.</p>
<p>DNS is a fundamental but subtle part of the Internet infrastructure, governed by a long list of protocol specifications, involving interactions among clients and servers run by many different organizations. DNS implementations have to parse protocol traffic to extract the data they need and have been a recurrent source of exploitable security vulnerabilities. With Hickory, we are mitigating many of these risks with a modern clean-slate DNS implementation in Rust.</p>
<p>As part of our commitment to security and privacy, Hickory DNS will be adding support for <a href="https://www.rfc-editor.org/rfc/rfc9539.html">RFC 9539</a>. DNS was historically entirely unencrypted, and DNS traffic can reveal a lot of metadata about specific users' or networks' interactions with particular Internet services. Several encrypted upgrades to DNS have been created by the Internet standards community, but their rollout has been uneven. With regard to authoritative servers, there is a discoverability problem: there isn't yet a widely agreed or widely implemented way to tell DNS clients that a specific DNS server can be accessed by an encrypted mechanism. RFC 9539 is a specification that explains how DNS clients can &quot;opportunistically&quot; try to connect to authoritative servers via the encrypted <a href="https://en.wikipedia.org/wiki/DNS_over_TLS">DoT</a> and <a href="https://www.rfc-editor.org/rfc/rfc9250">DoQ</a> protocols, remembering their success or failure for reference when repeating connections to those same servers. In the future, we expect this discoverability problem will be addressed by specifications from the <a href="https://datatracker.ietf.org/wg/deleg/about/">DNS Delegation working group</a>. Once those are available and deployed, there will be a natural upgrade path from unencrypted DNS, to opportunistically encrypted protocols, to authenticated indication of encrypted protocol support.</p>
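<p>As a rough illustration of the opportunistic approach (this is a hypothetical sketch, not Hickory DNS code; the names <code>ProbeCache</code> and <code>choose</code> are invented), a resolver can probe encrypted transports first and remember the outcome per authoritative server:</p>

```rust
use std::collections::HashMap;

// Hypothetical sketch of RFC 9539-style opportunistic probing: the
// resolver remembers, per authoritative server, which transport last
// worked, and only falls back to plaintext DNS when probing fails.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Transport {
    DoQ,  // DNS over QUIC
    DoT,  // DNS over TLS
    Do53, // classic unencrypted DNS on port 53
}

struct ProbeCache {
    // server name -> transport that last succeeded (or Do53 fallback)
    known_good: HashMap<&'static str, Transport>,
}

impl ProbeCache {
    fn new() -> Self {
        Self { known_good: HashMap::new() }
    }

    // `try_encrypted` stands in for an actual connection attempt.
    fn choose(&mut self, server: &'static str, try_encrypted: impl Fn(Transport) -> bool) -> Transport {
        if let Some(&t) = self.known_good.get(server) {
            return t; // reuse the remembered outcome, no re-probe
        }
        for t in [Transport::DoQ, Transport::DoT] {
            if try_encrypted(t) {
                self.known_good.insert(server, t);
                return t;
            }
        }
        // Unilateral, opportunistic: plaintext remains the fallback.
        self.known_good.insert(server, Transport::Do53);
        Transport::Do53
    }
}

fn main() {
    let mut cache = ProbeCache::new();
    // Pretend this server answers on DoT but not DoQ.
    let t = cache.choose("ns1.example.", |t| t == Transport::DoT);
    assert_eq!(t, Transport::DoT);
    // A later query reuses the cached result without re-probing.
    let again = cache.choose("ns1.example.", |_| false);
    assert_eq!(again, Transport::DoT);
    println!("chose {:?}", t);
}
```

<p>A real implementation would also expire cached outcomes so that servers which later enable (or disable) encrypted transports get re-probed, as RFC 9539 describes.</p>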
<p>Support for RFC 9539 opportunistic encryption provides a path toward more routinely protecting the privacy of DNS queries, and a chance to give the DNS community more experience with routine use of DoT and DoQ. Proactively encrypting DNS queries will also improve privacy and security for DNS users in the future when, we hope, Hickory is used by Internet service providers and others as a DNS resolver. We look forward to a future where we can encrypt a significant fraction of the DNS traffic that Let's Encrypt generates.</p>
<p>Hickory's opportunistic encryption functionality is expected to be completed in Q4 of 2025. Financial support for RFC 9539 implementation is provided by <a href="https://www.icann.org/">ICANN</a>.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/opportunistic-encryption-is-coming-to-hickory-dns/</guid>
      </item><item>
        <title>sudo-rs Headed to Ubuntu</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/sudo-rs-headed-to-ubuntu/</link>
        <pubDate>Wed, 16 Jul 2025 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>Every day, system administrators all over the world ask their computers to perform security-sensitive tasks across privilege boundaries, such as a standard user executing a command as root. The software most commonly used to navigate privilege boundaries is <a href="https://en.wikipedia.org/wiki/Sudo">sudo</a> (pronounced like &quot;soo&quot; and &quot;do&quot;), a 1980s evolution of the classic su (&quot;substitute user&quot;) system administration tool.</p>
<p>Where su lets a user fully log in as another user with the other user's credentials, sudo carries out individual commands in the context of another user with more fine-grained security policies regarding what can be run and what credentials are required.</p>
<p>The sudo application is complex and highly configurable. It exists at the heart of Unix security policy enforcement, but, like almost all classic Unix software, it was originally written in C. The C language allows for many kinds of mistakes that can lead to vulnerabilities (<a href="https://www.memorysafety.org/docs/memory-safety/">memory safety vulnerabilities</a> in particular), and as such, implementation errors in sudo have repeatedly led to exploitable security vulnerabilities -- sometimes allowing any user or software on a system to completely take over that system.</p>
<p>This is exactly the sort of critical system software that the Prossimo project aims to make safer through reimplementation with modern software development tools and practices. So, <a href="https://www.memorysafety.org/initiative/sudo-su/">in 2022, we hired two consultancies to reimplement sudo and su</a> in Rust. We call the new implementation <a href="https://github.com/trifectatechfoundation/sudo-rs">sudo-rs</a>.</p>
<p>Less than a year from the project's start, sudo-rs was ready for users to try out. The tech infrastructure maintenance nonprofit <a href="https://trifectatech.org/">Trifecta Tech Foundation</a> then took over <a href="https://trifectatech.org/initiatives/privilege-boundary/">long-term development and support of the project</a> in 2024.</p>
<p>Now sudo-rs has reached a new milestone: the developers of Ubuntu, one of the world's most popular Linux distributions, <a href="https://discourse.ubuntu.com/t/adopting-sudo-rs-by-default-in-ubuntu-25-10/60583">have replaced the traditional C versions of sudo and su with sudo-rs</a> in the upcoming Ubuntu 25.10 release due out in October. (See also <a href="https://trifectatech.org/blog/memory-safe-sudo-to-become-the-default-in-ubuntu/">Trifecta Tech's announcement</a>.)</p>
<p>The 25.10 release is not a long-term support (LTS) Ubuntu release; it is supported only through July 2026. It's aimed at users who are interested in trying very recent software releases and don't mind upgrading their operating system more frequently. However, following Ubuntu development practice, the software choices of Ubuntu 25.10 are a trial run for the 26.04 LTS release in April 2026, which will be recommended to all Ubuntu users and supported until 2031. We anticipate that sudo-rs will also be the default in that version, which is likely to be installed by tens of millions of people.</p>
<p>This is a great demonstration that essential system software can be made memory-safe, and we and our colleagues look forward to continuing that process with other applications. Check out <a href="https://memorysafety.org">memorysafety.org</a> to see what else we're up to on this front. We send our congratulations to our friends at <a href="https://tweedegolf.nl/en">Tweede Golf</a>, <a href="https://ferrous-systems.com/">Ferrous Systems</a>, <a href="https://trifectatech.org/">Trifecta Tech Foundation</a>, and <a href="https://canonical.com/">Canonical</a>, the developer of Ubuntu.</p>
<p>Users on earlier Ubuntu versions, or other popular Linux distributions, can <a href="https://github.com/trifectatechfoundation/sudo-rs?tab=readme-ov-file#installing-sudo-rs">opt in to try sudo-rs</a>.</p>
<p>In addition to being rewritten in Rust, a safer language, sudo-rs deliberately omits some little-used sudo features in order to reduce its vulnerability surface area. This turned out to be meaningful in July of 2025, when two vulnerabilities (<a href="https://www.sudo.ws/security/advisories/host_any/">CVE-2025-32462</a> and <a href="https://www.sudo.ws/security/advisories/chroot_bug/">CVE-2025-32463</a>) were discovered in sudo features that sudo-rs does not implement. In response to one of those, sudo has deprecated and will remove the feature containing the vulnerability.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/sudo-rs-headed-to-ubuntu/</guid>
      </item><item>
        <title>Compatibility with C is Key for Memory Safe Software</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/compatibility-with-c/</link>
        <pubDate>Thu, 05 Jun 2025 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>When we're evaluating potential Prossimo initiatives we <a href="https://www.memorysafety.org/blog/initiative-criteria/">take a number of things into consideration</a>. Perhaps one of the most important is compatibility with existing C (and C++) software.</p>
<p>Most of the Internet's critical low-level software infrastructure is written in C or C++. This includes the Linux, Windows, and Apple kernels, most DNS, TLS, NTP, and BGP implementations, most popular server and proxy software, and the core utilities in Linux and most Unix-like operating systems. It's also the case for critical higher-level software infrastructure, like web browsers and media codecs.</p>
<p>This is not going to change any time soon. We're in the beginning phases of a journey towards memory safety for the Internet's critical software infrastructure, and as we get going it makes the most sense to break down big problems into smaller ones by focusing on replacing components within existing C and C++ software. This is why it's a high priority for Prossimo projects to interoperate with C and C++.</p>
<h2 id="rustls-tls">Rustls (TLS)</h2>
<p>We've invested quite a bit in the Rustls TLS library, which is now one of the fastest and safest TLS implementations out there. In particular, we've invested heavily in C compatibility.</p>
<p>Rustls has a native Rust API, but it also has a <a href="https://github.com/rustls/rustls-ffi">C interface to the Rust API</a> as well as an <a href="https://github.com/rustls/rustls-openssl-compat">OpenSSL v3 compatibility layer</a>. The former is probably most useful for C developers doing initial integration with Rustls, or for those transitioning from OpenSSL and hoping to make use of a better API. The latter is a way for existing OpenSSL 3 API users to migrate to Rustls with as little effort as possible, <a href="https://www.memorysafety.org/blog/rustls-nginx-compatibility-layer/">perhaps without even recompiling their code</a>!</p>
<p>Check out these blog posts for more information about Rustls:</p>
<ul>
<li>
<p><a href="https://www.memorysafety.org/blog/rustls-server-perf/">Rustls Server-Side Performance</a></p>
</li>
<li>
<p><a href="https://www.memorysafety.org/blog/rustls-performance-outperforms/">Rustls Outperforms OpenSSL and BoringSSL</a></p>
</li>
<li>
<p><a href="https://www.memorysafety.org/blog/rustls-ech-support/">Encrypted Client Hello (ECH) Support for Rustls</a></p>
</li>
<li>
<p><a href="https://www.memorysafety.org/blog/rustls-nginx-compatibility-layer/">Rustls Gains OpenSSL and Nginx Compatibility</a></p>
</li>
<li>
<p><a href="https://www.memorysafety.org/blog/pq-key-exchange/">The Rustls TLS Library Adds Post-Quantum Key Exchange Support</a></p>
</li>
<li>
<p><a href="https://www.memorysafety.org/blog/rustls-with-aws-crypto-back-end-and-fips/">Rustls Now Using AWS Libcrypto for Rust, Gains FIPS Support</a></p>
</li>
</ul>
<h2 id="zlib-rs">zlib-rs</h2>
<p>Similarly, the <a href="https://github.com/trifectatechfoundation/zlib-rs">zlib-rs</a> project provides a high performance Rust implementation of the zlib compression format that is drop-in compatible with the zlib C API.</p>
<p>Hopefully we (and others) can take what we've learned here and use it to build other memory safe compression libraries. The Trifecta Tech Foundation is the new home for zlib-rs, and they've published some great blog posts about their work:</p>
<ul>
<li>
<p><a href="https://trifectatech.org/blog/zlib-rs-is-faster-than-c/">zlib-rs is faster than C</a></p>
</li>
<li>
<p><a href="https://trifectatech.org/blog/simd-in-zlib-rs-part-1-autovectorization-and-target-features/">SIMD in zlib-rs (part 1): Autovectorization and target features</a></p>
</li>
<li>
<p><a href="https://trifectatech.org/blog/simd-in-zlib-rs-part-2-compare256/">SIMD in zlib-rs (part 2): compare256</a></p>
</li>
<li>
<p><a href="https://trifectatech.org/blog/fastest-wasm-zlib/">The fastest WASM zlib</a></p>
</li>
</ul>
<h2 id="rav1d-av1-decoder">rav1d (AV1 Decoder)</h2>
<p>The <a href="https://github.com/memorysafety/rav1d">rav1d</a> AV1 decoder is a fork of the <a href="https://www.videolan.org/projects/dav1d.html">dav1d</a> decoder with the C code replaced by memory safe Rust code. It comes with a C API promising drop-in compatibility with dav1d's C API so it's easy to integrate into existing C programs.</p>
<p>As with compression libraries, hopefully we (and others) can take what we've learned here and use it to build other memory safe media decoders. Here are some blog posts about what we've learned:</p>
<ul>
<li>
<p><a href="https://www.memorysafety.org/blog/porting-c-to-rust-for-av1/">Porting C to Rust for a Fast and Safe AV1 Media Decoder</a></p>
</li>
<li>
<p><a href="https://www.memorysafety.org/blog/rav1d-performance-optimization/">Optimizing rav1d, an AV1 Decoder in Rust</a></p>
</li>
<li>
<p><a href="https://www.memorysafety.org/blog/rav1d-perf-bounty/">$20,000 rav1d AV1 Decoder Performance Bounty</a></p>
</li>
</ul>
<h2 id="rust-for-linux">Rust for Linux</h2>
<p>Rust for Linux aims to allow integrating components written in Rust with the rest of the Linux kernel, which is written in C. To that end, developing Rust interfaces to C APIs is central to the work. Kernel developers working in C do not need to know Rust, and Rust code can be introduced component by component.</p>
<p>You can read more about the Rust for Linux project <a href="https://rust-for-linux.com/">here</a>.</p>
<h2 id="bindgen">bindgen</h2>
<p>From 2022-2023, we made major contributions to <a href="https://github.com/rust-lang/rust-bindgen">bindgen</a>, which automatically generates Rust FFI bindings to C (and some C++) libraries. This has made life easier for many people who integrate Rust with C and C++.</p>
<p><a href="https://ferrous-systems.com/">Ferrous Systems</a> was the contractor for this work; they wrote some great blog posts about it:</p>
<ul>
<li>
<p><a href="https://ferrous-systems.com/blog/binding-with-bindgen/">Binding with bindgen</a></p>
</li>
<li>
<p><a href="https://ferrous-systems.com/blog/bindgen/">Our latest adventures with bindgen</a></p>
</li>
<li>
<p><a href="https://ferrous-systems.com/blog/automating-releases-for-bindgen/">Automating Releases for Bindgen</a></p>
</li>
</ul>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/compatibility-with-c/</guid>
      </item><item>
        <title>$20,000 rav1d AV1 Decoder Performance Bounty</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/rav1d-perf-bounty/</link>
        <pubDate>Wed, 14 May 2025 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>In March of 2023 we <a href="/blog/safer-av1-decoder/">announced</a> that we were starting work on a safer high performance AV1 decoder called <a href="https://github.com/memorysafety/rav1d">rav1d</a>, written in Rust. We partnered with <a href="https://immunant.com/">Immunant</a> to do the engineering work. By September of 2024 rav1d was basically complete and we <a href="/blog/porting-c-to-rust-for-av1/">learned a lot</a> during the process. Today rav1d works well—it passes all the same tests as the dav1d decoder it is based on, which is written in C. It’s possible to build and run Chromium with it.</p>
<p>There’s just one problem—it’s not quite as fast as the C version. We want to change that and we need your help.</p>
<p>Our Rust-based rav1d decoder is currently about 5% slower than the C-based dav1d decoder (the exact amount differs a bit depending on the benchmark, input, and platform). This is enough of a difference to be a problem for potential adopters, and, frankly, it just bothers us. The development team worked hard to get it to performance parity. We brought in a couple of other contractors who have experience with optimizing things like this. We <a href="/blog/rav1d-performance-optimization/">wrote about the optimization work we did</a>. However, we were still unable to get to performance parity and, to be frank again, we aren’t really sure what to do next.</p>
<p>After racking our brains for options, we decided to offer a bounty pool of $20,000 for getting rav1d to performance parity with dav1d. Hopefully folks out there can help get rav1d performance advanced to where it needs to be, and ideally we and the Rust community will also learn something about how Rust performance stacks up against C.</p>
<p>The <a href="/rav1d-bounty-official-rules/">official rules are here</a>, but to summarize:</p>
<ol>
<li>The contest is open to individuals or teams of individuals who are legal residents or citizens of the United States, United Kingdom, European Union, European Economic Area, Switzerland, Canada, New Zealand, or Australia.</li>
<li>The rules provide instructions for benchmarking performance improvements.</li>
<li>You work on improving performance. Your improvements can be in rav1d, the Rust compiler, or the Rust standard library.</li>
<li>The dav1d and rav1d decoders share the exact same low-level assembly code optimizations—you cannot modify this assembly. You must improve the Rust code (or the Rust compiler), which is what differs between dav1d and rav1d. You may not introduce code into rav1d in a language other than Rust. We encourage you to ask questions early on in issues or by <a href="mailto:hello@memorysafety.org">emailing us</a> so as to avoid investing heavily in something that might not be eligible!</li>
<li>Get your performance improvements merged into the relevant project per the project's standard contribution process and under its open source license(s), then email us per the instructions in the <a href="/rav1d-bounty-official-rules/">official rules</a> to enter and potentially be rewarded for your contribution.</li>
<li>When the contest ends (likely either because we met our goal or time has run out) we will, at our discretion, divide the bounty proportionally between the largest contributors to performance gains.</li>
</ol>
<p>At the end of the day, we reserve the right to award the money to the person(s) or team(s) that we deem to have helped us reach or exceed performance parity in the best possible way.</p>
<p>If we update the rules we'll post a note here and on the official rules page.</p>
<p>Good luck! Have fun!</p>
<p><strong>2025.05.14 Notice:</strong> European Economic Area and Switzerland added to the list of places in which legal residents or citizens are eligible.</p>
<p><strong>2025.12.31 Notice:</strong> This program has concluded.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/rav1d-perf-bounty/</guid>
      </item><item>
        <title>Rustls Server-Side Performance</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/rustls-server-perf/</link>
        <pubDate>Tue, 13 May 2025 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>In past years, the <a href="https://github.com/rustls/rustls/">Rustls</a> project has been happy to receive substantial investments from the ISRG. One of our goals has been to improve performance without compromising on safety. We last posted about our <a href="https://www.memorysafety.org/blog/rustls-performance-outperforms/">performance improvements</a> in October of 2024, and we're back to talk about another round of improvements.</p>
<h2 id="what-is-rustls">What is Rustls?</h2>
<p>Rustls is a memory safe TLS implementation with a focus on performance. It is production ready and used in a wide range of applications. You can read more about its history on <a href="https://en.wikipedia.org/wiki/Rustls">Wikipedia</a>.</p>
<p>It comes with a C API and <a href="https://www.memorysafety.org/blog/rustls-with-aws-crypto-back-end-and-fips/">FIPS support</a> so that we can bring both memory safety and performance to a broad range of existing programs. This is important because OpenSSL and its derivatives, widely used across the Internet, have a long history of memory safety vulnerabilities with more being found this year. It's time for the Internet to move away from C-based TLS.</p>
<h2 id="on-the-server">On the server</h2>
<p>In our previous post we looked at handshake latency and traffic throughput for connections on the client and the server. While clients will usually have a small number of connections active at any time, TLS servers generally want to optimize for high utilization, supporting as many connections as possible at the same time. TLS server connections usually share a reference to a backing store, which can be used to resume sessions across connections for a substantial latency improvement in connection setup. Our goal is then to minimize the slowdown that sharing the resumption store imposes on individual connections.</p>
<p>We first validated the assumption that turning off resumption would allow linear scaling:</p>
<p><img src="/images/blog/blog-2025-05-14-tls-full-server-handshake-scalability-vs-thread-count.png" alt=""></p>
<p>As our testing showed, Rustls manages to avoid any impact from scaling in this case, up to the 80 cores offered by the Ampere ARM hardware used in this test. This is similar to BoringSSL, which shows no impact -- although it spends more time per handshake. OpenSSL handshake latency deteriorates as it scales, although comparing OpenSSL versions shows that its development team have made strides to improve this, as well.</p>
<h2 id="resumption-mechanisms">Resumption mechanisms</h2>
<p>TLS supports two different resumption strategies:</p>
<ul>
<li>
<p>Stateful resumption stores resumption state on the server in some kind of map (or database). The key into this map is sent across the wire. Because the key is relatively compact, this uses less bandwidth and therefore slightly reduces latency. On the other hand, it is harder to scale efficiently when multiple servers are serving the same potentially resuming clients.</p>
</li>
<li>
<p>Stateless resumption sends encrypted resumption state to the client. This is easy to horizontally scale because there is no server-side state, but the resumption state is a good deal larger, with an associated increase in bandwidth used (and the associated latency impact).</p>
</li>
</ul>
<p>The resumption state that is sent to a client is commonly called a &quot;ticket&quot;. Ticket encryption keys must be regularly rolled over because a key compromise destroys the security of all past and future tickets. In order to enable key rollover while supporting multiple concurrent sessions, Rustls 0.23.16 and earlier wrapped the encryption key in a mutex, which resulted in substantial contention as the number of concurrent server connection handshakes increased. In Rustls 0.23.17, we started using an RwLock instead, which limits contention to the short period when a key rollover happens (by default, every 6 hours).</p>
<p><img src="/images/blog/blog-2025-05-14-tls-resumed-server-handshake-scalability-vs-thread-count.png" alt=""></p>
<p>Finally, we made another change in Rustls 0.23.17: the number of tickets sent by default when stateless resumption is enabled was reduced from 4 to 2, aligning with the OpenSSL/BoringSSL default. This reduces both CPU time spent on ticket encryption and bandwidth used.</p>
<h2 id="handshake-latency-distribution">Handshake latency distribution</h2>
<p>Apart from specific resumption concerns, we also compared Rustls to other TLS implementations in terms of the latency distribution experienced on the server: not just looking at the average latency, but also at worst-case (in this case, P90 and P99) latency. Rustls does quite well here:</p>
<p><img src="/images/blog/blog-2025-05-14-tls-server-full-handshakes-latency-distribution.png" alt=""></p>
<p>While this chart shows full TLS 1.3 handshakes in particular, similar results were observed for other scenarios.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Current versions of Rustls show competitive performance when processing many connections at the same time on a server. Rustls servers scale almost linearly with the number of cores available, and server latency for core TLS handshake handling is roughly half that of OpenSSL in our benchmarks.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/rustls-server-perf/</guid>
      </item><item>
        <title>An Update on Memory Safety in the Linux Kernel</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/linux-kernel-2025-update/</link>
        <pubDate>Thu, 06 Mar 2025 00:00:00 +0000</pubDate>
        <description><![CDATA[<div class="card border-0 pic-quote-right" style="max-width: 200px">
    <img alt="Rust for Linux logo" class="mx-auto img-fluid" src="/images/blog/Rust-for-Linux.svg" />
</div>
<p><a href="https://rust-for-linux.com/">Rust for Linux</a> is an effort to support the use of the Rust language in the Linux kernel. Memory safety errors account for a large portion of kernel vulnerabilities, but this can be reduced as more drivers are written in a memory safe language. In 2021, we started working with the project's primary maintainer, <a href="https://ojeda.dev/">Miguel Ojeda</a>, shortly after he published the <a href="https://lore.kernel.org/lkml/20210414184604.23473-1-ojeda@kernel.org/">original RFC</a> for Rust in the Linux kernel:</p>
<ul>
<li>New code written in Rust has a reduced risk of memory safety bugs, data races, and logic bugs overall, thanks to the language properties.</li>
<li>Maintainers are more confident in refactoring and accepting patches for modules thanks to the safe subset of Rust.</li>
<li>New drivers and modules become easier to write, thanks to abstractions that are easier to reason about, based on modern language features, as well as backed by detailed documentation.</li>
<li>More people get involved overall in developing the kernel thanks to the usage of a modern language.</li>
<li>By taking advantage of Rust tooling, we keep enforcing the documentation guidelines we have established so far in the project. For instance, we require having all public APIs, safety preconditions, 'unsafe' blocks and type invariants documented.</li>
</ul>
<p>Progress continued, and in <a href="https://www.memorysafety.org/blog/rust-in-linux-just-the-beginning/">October 2022</a>, Rust was merged into the Linux kernel as an official language. We were excited by the milestone, but understood that this change could be rolled back if there was no support and progress in driver development, so the work continued!  Today, Miguel continues to lead the Rust for Linux effort by maintaining the development and stable branches, managing the Rust for Linux core team and building its community, along with contributing to technical development and other subsystem maintenance in order to solidify and expand the Rust community in the kernel. This is a lot of work and is critical to continuing to foster a more memory safe future for the Linux kernel.</p>
<p>While our goal was never to rewrite the entire kernel in Rust, we are glad to see growing acceptance of Rust's benefits in various subsystems. Today, multiple companies have full time engineers dedicated to working on Rust in the Linux kernel. As recently <a href="https://fosdem.org/2025/events/attachments/fosdem-2025-6507-rust-for-linux/slides/237976/2025-02-0_iwSaMYM.pdf">noted</a> by Jonathan Corbet, kernel maintainer and Executive Editor of <a href="https://lwn.net/">LWN</a>:</p>
<div>
  <blockquote class="blockquote">
    <span class="quote"></span>
    <div class="quote-text">
      <p class="font-italic lh-170">In my mind, the Rust for Linux project has already achieved an important goal: proving that Rust is indeed a viable and desirable language for kernel development... This work is important for the long-term viability of Linux, and I am glad that it is succeeding.</p>
    </div>
  </blockquote>
</div>
<p>There are several efforts that are now underway:</p>
<table>
  <thead>
      <tr>
          <th>Upstreamed Users</th>
          <th>Targeted Upstream Users</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>PHY Drivers<br>Null Block driver<br>DRM panic screen QR code generator</td>
          <td>Android Binder driver<br>Apple AGX GPU driver<br>NVMe driver<br>Nova GPU driver</td>
      </tr>
  </tbody>
</table>
<p>We expect that one of them will be merged into the mainline kernel in the next 12-18 months. In the recent 6.13 merge window, Greg Kroah-Hartman <a href="https://lore.kernel.org/lkml/Z0lG-CIjqvSvKWK4@kroah.com/">noted</a>, &quot;rust misc driver bindings and other rust changes to make misc drivers actually possible. I think this is the tipping point, expect to see way more rust drivers going forward now that these bindings are present. Next merge window hopefully we will have pci and platform drivers working, which will fully enable almost all driver subsystems to start accepting (or at least getting) rust drivers. This is the end result of a lot of work from a lot of people, congrats to all of them for getting this far, you've proved many of us wrong in the best way possible, working code :)&quot;.</p>
<p>At this point, the goal of the effort will start to be realized: products and services running Linux with Rust drivers will be more secure, and that means the people using them will be more secure, too.</p>
<p>We'd like to thank Miguel for tirelessly working on this effort and thank the <a href="https://alpha-omega.dev/">Alpha-Omega</a> project for their financial support.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/linux-kernel-2025-update/</guid>
      </item><item>
        <title>How Prossimo&#39;s Risk and Opportunity Criteria Help Us Plan</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/initiative-criteria/</link>
        <pubDate>Tue, 04 Mar 2025 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>Prossimo's primary goal is to move the Internet's most security-sensitive software infrastructure to <a href="https://www.memorysafety.org/docs/memory-safety/">memory safe</a> code. Many of the most critical software vulnerabilities are memory safety issues in C and C++ code, and while there are ways to reduce the risk, including fuzzing and static analysis, memory safety vulnerabilities continue to plague the Internet.</p>
<p>The good news is that with the rare exception of code that must be written in assembly for performance and/or security reasons (e.g. cryptographic routines), we know how to get rid of memory safety vulnerabilities entirely: write code in languages that don't allow for those kinds of mistakes. It's a more or less solved research problem, and as such we don't need to suffer from this kind of thing any more. It can be relegated to the past like smallpox; we just have to do the work.</p>
<p>We recognize that it will be a lot of work to move significant portions of the Internet's C and C++ software infrastructure to memory safe code, but the Internet will be around for a long time. There is time for ambitious efforts to pay off. The relevant stakeholders certainly have the resources to do this for the most critical software out there. By being smart about our initial investments and focusing on the most critical components, we can start seeing significant returns within a few years.</p>
<p>We don't do our work alone. We get advice from community members, and most of the work we facilitate is done by open source maintainers and contractors. Our role is to provide strategic planning, facilitation, and communication. We identify high impact projects, build and maintain relationships with open source maintainers and funders, help develop plans, coordinate the work, and communicate information about the work to the public and our partners.</p>
<p>In order to achieve the positive impact we're aiming for, the first thing we need to do is identify work that is both high impact and efficiently achievable. We do this at Prossimo with two different sets of criteria. It's important to note that our criteria are just one way of approaching this kind of work - other people and organisations might have different criteria that work well for them. This is not a situation in which there is a single correct way of looking at things.</p>
<h2 id="risk-criteria">Risk Criteria</h2>
<p>The first set of criteria are our risk criteria. These inform us about the level of risk that a software component represents. These are not the only things we consider, but we're trying to keep it somewhat simple conceptually and these things are at the top of our list.</p>
<ol>
<li>Very widely used (nearly every server and/or client)</li>
<li>On a security boundary (e.g. network boundary, privilege boundary)</li>
<li>Performing a critical function</li>
</ol>
<p>The first criterion here, widespread use, addresses a single but important aspect of determining the severity of a vulnerability. When something is widely used, there is more surface area across the Internet for attackers to choose from and more systems to exploit.</p>
<p>The second criterion, on a security boundary, relates to the fact that usually an attacker is trying to get from one position to another and must cross a boundary to get there. The closer software is to receiving data from untrusted networks like the Internet, the easier it is to exploit and, typically, the more value there is in exploitation. Privilege boundaries are often hard to exploit in terms of opportunity, but the consequences of exploitation can be more devastating. An example might be a vulnerability in a utility like <code>sudo</code>, which is why we invested in a <a href="https://github.com/trifectatechfoundation/sudo-rs">memory safe implementation of sudo</a>.</p>
<p>The third criterion is another one focused on severity. Exploits in software performing critical functions are usually (though not always) more negatively impactful than exploits in software performing less important functions.</p>
<p>It's important to note that our criteria differ from approaches based primarily on historical analysis of where we've already seen concentrations of memory safety vulnerabilities. Prioritizing work based on historical vulnerability analysis is important - we need to address known problems in the software we depend on! However, this work already gets a lot of attention, it's almost purely reactive, and it often gives too much weight to volume over severity. We think Prossimo has the most to offer by looking ahead a bit and working on some of the more difficult investments in memory safety that we ought to be making.</p>
<p>To boil all of this down to a single sentence... Widely used software performing critical functions on network boundaries is, in our opinion, a set of software with a lot of opportunity for high impact vulnerabilities.</p>
<p>To give a specific example of the kind of thing we are trying to avoid in the future: <a href="https://heartbleed.com/">Heartbleed</a>. It matches these criteria almost perfectly. People had every reason to believe that OpenSSL was dangerously vulnerable prior to Heartbleed, but only after this momentous vulnerability did the relevant stakeholders engage in a campaign to shore things up (still, a decade later, OpenSSL has suffered <a href="https://openssl-library.org/news/vulnerabilities/index.html">five more memory safety vulnerabilities</a> in the past year alone). That campaign was important - OpenSSL's security properties needed to be improved - but going forward we can and should just prevent that kind of thing from happening in the first place.</p>
<h2 id="opportunity-criteria">Opportunity Criteria</h2>
<p>Our second set of criteria helps us understand where we have the most opportunity to make a difference. Just because something is high risk doesn't mean we have the ability to do something about it with the kind of efficient investments we're able to make.</p>
<ol>
<li>Is this a library or component that can be used in many different projects?</li>
<li>Can we efficiently replace key components with existing memory safe libraries?</li>
<li>Are funders willing to fund the work?</li>
<li>Are the maintainers on board and cooperative?</li>
<li>Are we aware of likely significant adopters?</li>
</ol>
<p>The first criterion here raises the question of whether we would be able to apply the results of an investment to many different projects. An example would be a TLS library like <a href="https://www.memorysafety.org/initiative/rustls/">Rustls</a> that can be &quot;plugged in&quot; to many different applications.</p>
<p>The second criterion sort of reverses the question raised by the first: is this a piece of software in which we can simply replace certain critical components with memory safe ones? In other words, can we take a modular approach and take advantage of existing memory safe libraries?</p>
<p>The third criterion probably needs the least explanation - is anyone willing to pay for the work? We <a href="https://www.memorysafety.org/become-a-funder/">seek funding</a> from companies who understand the urgency to move toward a memory safe software stack and visionary funders like Craig Newmark who seek a positive societal impact.</p>
<p>The fourth criterion refers to the fact that it's very difficult to modify existing software if the maintainers are not on board and cooperative. If they are not, we either can't do our work or we would need to engage in a much more costly rewrite. Sometimes a rewrite is the right thing to do, but it's definitely something to consider up front.</p>
<p>The fifth criterion has to do with how quickly we think something might get adopted. Adoption is hard for most new software, and it's particularly difficult when we're talking about changes to low-level Internet infrastructure software. We're prepared to deal with long adoption timelines; most of what we do will take years to achieve strong adoption. But if we have a chance at an accelerated timeline, that's worth considering.</p>
<h2 id="conclusion">Conclusion</h2>
<p>These criteria really get to the heart of what we're trying to do with Prossimo. Hopefully this post has helped you understand them more clearly.</p>
<p>We've helped to build some great software, like the <a href="https://github.com/rustls/rustls/">Rustls TLS library</a>, <a href="https://github.com/hickory-dns/hickory-dns">Hickory DNS</a>, <a href="https://github.com/pendulum-project/ntpd-rs">a memory safe NTP implementation</a>, and a <a href="https://github.com/trifectatechfoundation/sudo-rs">memory safe implementation of sudo</a>. If you run software like this, we encourage you to try these implementations out.</p>
<p>If you're interested in updates on our memory safety work, please subscribe to the mailing list below.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/initiative-criteria/</guid>
      </item><item>
        <title>Hickory DNS is Moving Toward Production Readiness</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/hickory-update-2025/</link>
        <pubDate>Tue, 11 Feb 2025 00:00:00 +0000</pubDate>
        <description><![CDATA[<div class="card border-0 pic-quote-right">
    <img alt="Hickory DNS logo" class="mx-auto img-fluid" src="/images/blog/blog-2025-02-11-Hickory-DNS.png" />
</div>
<p>The Domain Name System (DNS) is a foundational part of the Internet. It stores data associated with domain names, like web server addresses and mail server addresses. Almost all network connections are preceded by a DNS lookup. The most popular DNS server implementations are written in C, and as a result, they have been affected by a series of memory safety vulnerabilities. These vulnerabilities can put DNS infrastructure at risk, as well as any system that depends on DNS.</p>
<p>We've been investing in <a href="https://github.com/hickory-dns/hickory-dns">Hickory DNS</a>, an open source DNS implementation, which provides a memory safe alternative. The Hickory DNS project implements all major protocol roles, including a client library, stub resolvers, forwarding resolvers, recursive resolvers, and authoritative name servers. It has a growing community of users and contributors. Our <a href="https://www.memorysafety.org/initiative/dns/">current goal</a> is to prepare Hickory DNS for deployment at Let's Encrypt, as the recursive resolver used during domain control validation. We're happy to see initial production use of the client and stub resolver growing already.</p>
<p>Over the past year, Hickory DNS contributors, including the maintainers, Ferrous Systems, and others, have improved Hickory DNS's support for Domain Name System Security Extensions (DNSSEC). They have fixed spec conformance and compatibility bugs, and added features that will be needed in production deployments.</p>
<h3 id="dnssec">DNSSEC</h3>
<p>Hickory DNS's support for DNSSEC advanced by leaps and bounds this year. DNSSEC adds digital signatures to DNS zones, allowing records to be authenticated. Implementation of these features was led by Ferrous Systems (see their <a href="https://ferrous-systems.com/blog/hickory-dns-client/">blog post</a> for more details).</p>
<p>The recursive resolver now supports DNSSEC validation. Previously, there was only basic support for DNSSEC validation on the client side. Adding this support required special handling of DNSSEC-related flags and record types throughout the resolver.</p>
<p>Support was added for generating and validating NSEC3 records. NSEC3, specified in <a href="https://www.rfc-editor.org/rfc/rfc5155">RFC 5155</a>, is the successor to the NSEC record type, and both are used to authenticate negative responses, i.e. confirm that specific records do not exist.</p>
<p>A conformance test suite was added to confirm that specific requirements in the DNSSEC RFCs were implemented. This test framework has also proven useful beyond DNSSEC because it allows running multiple DNS server implementations together in a virtual network. This both simplifies testing the recursive resolver, which expects to communicate with other hosts on port 53, and makes it possible to compare Hickory DNS's behavior against other DNS implementations. The expanded test suite identified various DNSSEC bugs, which were then fixed, especially around insecure and bogus validation results.</p>
<h3 id="specification-conformance-bug-fixes">Specification Conformance Bug Fixes</h3>
<p>There have been many protocol correctness fixes to Hickory DNS over the past year. These were driven in part by new test suites covering a variety of edge cases. In addition to the DNSSEC conformance test suites mentioned above, new <a href="https://github.com/hickory-dns/hickory-dns/issues/2572">conformance tests</a> based on <a href="https://datatracker.ietf.org/doc/html/rfc8906">RFC 8906</a> were added, which identified issues when handling unrecognized opcodes and empty question sections. Another test suite was added based on test fixtures from <a href="https://extended-dns-errors.com/">an IMC 2023 paper by Nosyk et al.</a> (<a href="https://github.com/hickory-dns/hickory-dns/pull/2385">PR #2385</a>, <a href="https://github.com/hickory-dns/hickory-dns/pull/2711">PR #2711</a>). This provides a broad variety of misconfigured zones, representative of problems seen in the wild.</p>
<p>Multiple improvements were made to the recursive resolver to fix infinite loops when encountering missing glue records, lame delegations, and other edge cases (<a href="https://github.com/hickory-dns/hickory-dns/issues/2306">issue</a>, <a href="https://github.com/hickory-dns/hickory-dns/pull/2332">PR #2332</a>, <a href="https://github.com/hickory-dns/hickory-dns/pull/2522">PR #2522</a>).</p>
<p>Error propagation was improved throughout Hickory DNS, both in plain DNS contexts and DNSSEC contexts (<a href="https://github.com/hickory-dns/hickory-dns/issues/1891">issue</a>, <a href="https://github.com/hickory-dns/hickory-dns/pull/2379">PR #2379</a>, <a href="https://github.com/hickory-dns/hickory-dns/pull/2502">PR #2502</a>). These changes correct the response codes sent when certain errors are encountered, and enable DNSSEC validation of negative responses.</p>
<p>The recursive resolver's handling of truncated responses was fixed (<a href="https://github.com/hickory-dns/hickory-dns/issues/2608">issue</a>, <a href="https://github.com/hickory-dns/hickory-dns/pull/2682">PR</a>). Initial DNS queries are sent over UDP, and thus responses are limited to a fixed maximum size, in one UDP datagram. If the authoritative name server can't fit all records into a response message, it sends as many records as it can and sets the &quot;truncation&quot; bit in the response message's header. In this case, the recursive resolver is supposed to retry its request via TCP, which is not affected by the same message size limitations. This bugfix corrected an edge case where the response received via TCP could be discarded in favor of the truncated UDP response. Now, the TCP response is always used if the UDP response was truncated.</p>
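<p>The corrected rule is simple to state in code. Below is a hedged Python sketch of the decision logic, not Hickory DNS's actual code; <code>retry_over_tcp</code> is a hypothetical callback standing in for the real TCP query path.</p>

```python
import struct

def is_truncated(message: bytes) -> bool:
    # The DNS header (RFC 1035, section 4.1.1) is 12 bytes; the flags
    # word is bytes 2-3, and the TC (truncation) bit is mask 0x0200.
    if len(message) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & 0x0200)

def pick_response(udp_response: bytes, retry_over_tcp) -> bytes:
    # The bugfix in miniature: once the UDP response is truncated, the
    # TCP answer must be used; never fall back to the partial UDP one.
    if is_truncated(udp_response):
        return retry_over_tcp()
    return udp_response
```
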
<p>The CAA record type, defined in <a href="https://www.rfc-editor.org/rfc/rfc8659">RFC 8659</a>, is relevant to Certificate Authorities, and thus it's important that Hickory DNS processes these records correctly before Let's Encrypt can deploy it. Multiple fixes were made to CAA record handling, all involving invalid property values or issuer names (<a href="https://github.com/hickory-dns/hickory-dns/issues/2415">issue</a>, <a href="https://github.com/hickory-dns/hickory-dns/pull/2373">PR #2373</a>, <a href="https://github.com/hickory-dns/hickory-dns/pull/2418">PR #2418</a>, <a href="https://github.com/hickory-dns/hickory-dns/pull/2419">PR #2419</a>). Since the specification says that invalid issuer names must be interpreted to forbid certificate issuance, invalid records must be preserved when transiting through Hickory DNS.</p>
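<p>For a sense of what is being validated, here is a minimal Python sketch of parsing CAA RDATA per RFC 8659: one flags octet, a length-prefixed alphanumeric tag, and the remaining octets as the value. This is an illustrative strict parser written for this post, not Hickory DNS's code; note that, as described above, a resolver must still preserve and relay records that fail such checks so the CA can interpret them as forbidding issuance.</p>

```python
ALLOWED_TAG_BYTES = b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

def parse_caa_rdata(rdata: bytes) -> tuple[bool, str, bytes]:
    # RFC 8659, section 4.1: flags octet, length-prefixed tag, then the
    # remaining octets are the value. Bit 0x80 of flags is "critical".
    if len(rdata) < 2:
        raise ValueError("CAA RDATA too short")
    flags, tag_len = rdata[0], rdata[1]
    tag = rdata[2:2 + tag_len]
    if tag_len == 0 or len(tag) != tag_len:
        raise ValueError("invalid CAA tag length")
    if any(c not in ALLOWED_TAG_BYTES for c in tag):
        raise ValueError("CAA tag must be ASCII letters and digits")
    value = rdata[2 + tag_len:]
    return bool(flags & 0x80), tag.decode("ascii"), value
```
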
<h3 id="production-readiness-features">Production-readiness Features</h3>
<p>Running a DNS service in production imposes a number of non-functional requirements such as robustly handling traffic from malfunctioning or malicious third parties, flexible configuration, and scalable performance.</p>
<p>We hired OSTIF to perform a security audit of Hickory DNS and addressed the issues it identified. These were all denial of service vulnerabilities via resource exhaustion, affecting <a href="https://github.com/hickory-dns/hickory-dns/pull/2522">the call stack in the recursive resolver</a>, <a href="https://github.com/hickory-dns/hickory-dns/pull/2531">memory consumption by the recursive resolver</a>, <a href="https://github.com/hickory-dns/hickory-dns/pull/2622">TCP connection setup</a>, <a href="https://github.com/hickory-dns/hickory-dns/pull/2583">TLS/QUIC handshakes</a>, and <a href="https://github.com/hickory-dns/hickory-dns/pull/2533">DNSSEC verification</a> (this is the <a href="https://www.athene-center.de/en/keytrap">KeyTrap vulnerability</a>). These fixes will improve the availability of Hickory DNS-based servers when under attack.</p>
<p>New options were added to the Hickory DNS server configuration. First, <a href="https://github.com/hickory-dns/hickory-dns/issues/1719">allow/deny lists</a> were added to enable controlling access to the server. Second, cache policy configuration was added to the recursive resolver (<a href="https://github.com/hickory-dns/hickory-dns/issues/1720">issue</a>, <a href="https://github.com/hickory-dns/hickory-dns/pull/2524">PR</a>). This allows setting minimum and maximum time-to-live values for cached records, and customizing these limits for particular resource record types.</p>
<p>The record cache used by the stub resolver, forwarding resolver, and recursive resolver was <a href="https://github.com/hickory-dns/hickory-dns/pull/2576">replaced with a cache that allows concurrent access</a>. This removed a coarse-grained lock from the main code path, improving scalability.</p>
<h2 id="looking-forward">Looking Forward</h2>
<p>In the coming year, our work will continue along the same themes. We are moving through the items on this <a href="https://github.com/hickory-dns/hickory-dns/issues/2725">GitHub issue</a> to get to our goal of deployment in Let's Encrypt. We expect to fix more spec compliance bugs and edge cases, add support for DNSSEC signing and verification using aws-lc-rs, improve performance, and more. <a href="https://dirkjan.ochtman.nl/writing/">Dirkjan Ochtman</a>, one of the Hickory DNS maintainers, will be working under contract with Prossimo to drive this work forward.</p>
<p>If you are interested in trying out Hickory DNS as a memory safe alternative, the authoritative name server and stub resolver implementations have deployment experience already in various applications. <a href="https://github.com/hickory-dns/hickory-dns/issues/2206">Version 0.25</a> will include the above improvements, and 0.25 alpha prereleases are available for testing now. If you are interested in using the recursive resolver, stay tuned for further improvements in the coming year, and keep an eye on the above tracking issue.</p>
<p>Benjamin Fry is the creator of Hickory and has been a great partner along this journey. We'd like to thank the Sovereign Tech Agency for their financial support of Prossimo to fund improvements to Hickory DNS, and Craig Newmark Philanthropies for ongoing support to improve memory safety in critical infrastructure.</p>
<p>If you're interested in updates on Hickory and our memory safety work in general, subscribe to the mailing list below.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/hickory-update-2025/</guid>
      </item><item>
        <title>A Note from our Executive Director</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/eoy-letter-2024/</link>
        <pubDate>Wed, 11 Dec 2024 00:00:00 +0000</pubDate>
        <description><![CDATA[<div class="card border-0 pic-quote-right">
    <img alt="Josh Aas" class="mx-auto img-fluid" src="/images/blog/Josh-Aas-Headshot.jpg" />
</div>
<p><em>This letter was originally published in our <a href="https://abetterinternet.org/documents/2024-ISRG-Annual-Report.pdf">2024 Annual Report</a>.</em></p>
<p>The past year at ISRG has been a great one and I couldn’t be more proud of our staff,
community, funders, and other partners that made it happen. Let’s Encrypt continues to
thrive, serving more websites around the world than ever before with excellent security
and stability. Our understanding of what it will take to make privacy-preserving
metrics more mainstream via our Divvi Up project is evolving in important ways.</p>
<p>Prossimo has made important investments in making software critical infrastructure safer, from TLS and DNS to the Linux kernel.</p>
<p>Next year is the 10th anniversary of the launch of Let’s Encrypt. Internally things have changed dramatically from what they looked like ten years ago, but outwardly our service hasn’t changed much since launch. That’s because the vision we had for how best to do our job remains as powerful today as it ever was: free 90-day TLS certificates via an automated API. Pretty much as many as you need. More than 500,000,000 websites benefit from this offering today, and the vast majority of the web is encrypted.</p>
<p>Our longstanding offering won’t fundamentally change next year, but we are going to introduce a new offering that’s a big shift from anything we’ve done before - short-lived certificates. Specifically, certificates with a lifetime of six days. This is a big upgrade for the security of the TLS ecosystem because it minimizes exposure time during a key compromise event.</p>
<p>Because we’ve done so much to encourage automation over the past decade, most of our subscribers aren’t going to have to do much in order to switch to shorter lived certificates. We, on the other hand, are going to have to think about the possibility that we will need to issue 20x as many certificates as we do now. It’s not inconceivable that at some point in our next decade we may need to be prepared to issue 100,000,000 certificates per day.</p>
<p>That sounds sort of nuts to me today, but issuing 5,000,000 certificates per day
would have sounded crazy to me ten years ago. Here’s the thing though, and this is
what I love about the combination of our staff, partners, and funders - whatever it
is we need to do to doggedly pursue our mission, we’re going to get it done. It was
hard to build Let’s Encrypt. It was difficult to scale it to serve half a billion websites. Getting our Divvi Up service up and running from scratch in three months to service exposure notification applications was not easy. Our Prossimo project was a primary contributor to the creation of a TLS library that provides memory safety while outperforming its peers - a heavy lift.</p>
<p>Charitable contributions from people like you and organizations around the world
make this stuff possible. Since 2015, tens of thousands of people have donated.
They’ve made a case for corporate sponsorship, given through their DAFs, or set up
recurring donations, sometimes to give $3 a month. That’s all added up to millions
of dollars that we’ve used to change the Internet for nearly everyone using it. I hope
you’ll join these people and help lay the foundation for another great decade.</p>
<p><strong>Josh Aas</strong><br  />
<em>Executive Director</em></p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/eoy-letter-2024/</guid>
      </item><item>
        <title>Security-Sensitive Industries Move to Memory Safety</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/rustls-adoption-grows/</link>
        <pubDate>Tue, 03 Dec 2024 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>Prossimo has been investing in the memory safe, high performance TLS library called <a href="https://www.memorysafety.org/initiative/rustls/">Rustls</a> for nearly four years. During that time, we've seen Rustls improve and we've seen growing adoption. Organizations like <a href="https://releases.1password.com/linux/0.9/">1Password</a>, <a href="https://fuchsia.googlesource.com/third_party/curl/&#43;/main/docs/RUSTLS.md">Google Fuchsia</a>, and <a href="https://fly.io/security">Fly.io</a> have been using Rustls for a while, and we're pleased that FIS is joining that list. <a href="https://www.fisglobal.com/">FIS</a>, a global fintech firm whose services underpin a huge portion of the financial world, has adopted <a href="https://en.wikipedia.org/wiki/Rustls">Rustls</a> in order to bring memory safety to TLS for critical aspects of its internal infrastructure.</p>
<p>The FIS team was able to make the switch with just a few hours of engineering time thanks to the <a href="https://www.memorysafety.org/blog/rustls-nginx-compatibility-layer/">Rustls OpenSSL compatibility layer for Nginx</a>. This recently added feature made it possible to swap in Rustls without needing to modify or recompile Nginx.</p>
<p>Moving to Rustls is an excellent response to the recent <a href="https://www.whitehouse.gov/oncd/briefing-room/2024/02/26/press-release-technical-report/">cross-industry call</a> from the White House's Office of the National Cyber Director (ONCD) for companies to add memory safety to their roadmaps. National Cyber Director Harry Coker stated, &quot;we, as a nation, have the ability -- and the responsibility -- to reduce the attack surface in cyberspace and prevent entire classes of security bugs from entering the digital ecosystem but that means we need to tackle the hard problem of moving to memory safe programming languages.&quot;</p>
<p>We see the Nginx OpenSSL compatibility layer as an important tool to accelerate the move to memory safety and encourage any organization running Nginx to try it out. If your organization is able to dedicate a few hours of engineering time like FIS did, your memory safety roadmap will have one item marked as 'complete' in 2024. &quot;This may be a multi-decade endeavor that will require all of us, those in government, the private sector, and across the technical community to play our part and that's why we must begin this work today,&quot; Anjana Rajan, Assistant National Cyber Director at the White House, stated previously.  Recently, Rajan commented: &quot;the urgency of addressing the memory safety problem cannot be overstated and it's time to start the next chapter of back to the building blocks and show how we are executing against this vision.&quot;</p>
<p>If you're interested in exploring how Rustls could work for your organization, check out the project on <a href="https://github.com/rustls/rustls">GitHub</a> or contact us at <a href="mailto:press@abetterinternet.org">press@abetterinternet.org</a>.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/rustls-adoption-grows/</guid>
      </item><item>
        <title>A new home for memory safe Zlib</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/zlib-to-trifecta-tech/</link>
        <pubDate>Thu, 07 Nov 2024 00:00:00 +0000</pubDate>
        <description><![CDATA[<div class="card border-0 mb-4 pic-quote-right">
    <img alt="Trifecta Tech Foundation Logo" class="mx-auto img-fluid" src="/images/blog/logo-trifecta.png" />
</div>
<p>Today we're pleased to announce that the recently developed open source <a href="https://www.memorysafety.org/initiative/zlib/">memory safe implementation of zlib</a> — <a href="https://github.com/trifectatechfoundation/zlib-rs">zlib-rs</a> — has a new long-term home at the <a href="https://trifectatech.org/">Trifecta Tech Foundation</a>.</p>
<p>We set out to develop a strategy, raise funds, and select a contractor for a memory safe zlib implementation in 2023. We did this because data compression algorithms, and zlib in particular, are used in a vast number of protocols and file formats throughout all of computing. In the past, compression libraries have encountered <a href="https://www.memorysafety.org/docs/memory-safety/">memory safety vulnerabilities</a>, a common phenomenon for libraries written in C/C++ and a class of issues that critical system software should not suffer from.</p>
<p>We contracted <a href="https://tweedegolf.nl/">Tweede golf</a> in December of 2023 for an initial implementation based on zlib-ng, with a focus on maintaining excellent performance while introducing memory safety. The project was made possible through funding provided by <a href="https://www.chainguard.dev/">Chainguard</a> and a time investment by Tweede golf.</p>
<p>An early release of the zlib-compatible dynamic library is available on <a href="https://crates.io/crates/libz-rs-sys">crates.io</a>.</p>
<h2 id="new-home">New home</h2>
<p>Trifecta Tech Foundation is already the long-term home of two other Prossimo initiatives: memory safe <a href="https://github.com/pendulum-project/ntpd-rs">NTP</a> and <a href="https://github.com/trifectatechfoundation/sudo-rs">sudo</a>.</p>
<p>When the Tweede golf team suggested having zlib-rs become part of Trifecta Tech Foundation's <a href="https://trifectatech.org/initiatives/data-compression/">data compression initiative</a>, it was an easy decision to make on our end. Trifecta Tech Foundation is backed by the team from Tweede golf and we know that they are good stewards of open source while also being leading experts in writing in memory safe languages.</p>
<div>
  <blockquote class="blockquote">
    <span class="quote"></span>
    <div class="quote-text">
      <p class="font-italic lh-170">Given the widespread use of zlib across the tech industry, offering a memory safe alternative to C implementations is a huge win. The investment required is tiny compared to the gain, as zlib is relatively small in terms of lines of code. When a memory safe zlib is in place, it allows adding (performance) improvements with confidence; to iterate without breaking things.</p>
      <footer class="blockquote-footer"><cite title="Source Title">Erik Jonkers, chair of Trifecta Tech Foundation and Director of Open source at Tweede golf</cite></footer>
    </div>
  </blockquote>
</div>
<p>Trifecta Tech Foundation aims to mature the zlib-rs project and support its maintainers. Zlib-rs will be part of the Foundation's data compression initiative that includes four compression libraries: <a href="https://trifectatech.org/initiatives/data-compression/">zlib, bzip2, zstd and xz</a>.</p>
<h2 id="what-s-next">What's next?</h2>
<p>Work on WebAssembly optimizations, kindly funded by <a href="https://devolutions.net/">Devolutions</a>, is underway. A security audit by Prossimo is nearing completion and is expected to be done in November 2024. Once the audit is successfully finished, the Trifecta Tech Foundation team will continue to work with Mozilla, who are interested in potentially shipping zlib-rs in Firefox.</p>
<p>That said, work on zlib-rs is not yet complete. Trifecta Tech Foundation is seeking funding to make the initial implementation ready for production. <a href="mailto:donate@trifectatech.org">Contact</a> Trifecta Tech Foundation if you're interested.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/zlib-to-trifecta-tech/</guid>
      </item><item>
        <title>Rustls Outperforms OpenSSL and BoringSSL</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/rustls-performance-outperforms/</link>
        <pubDate>Tue, 22 Oct 2024 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>ISRG has been investing heavily in the <a href="https://github.com/rustls/rustls">Rustls TLS library</a> over the past few years. Our goal is to create a library that is both memory safe and a leader in performance.</p>
<p>Back in January of this year we published a <a href="https://www.memorysafety.org/blog/rustls-performance/">post</a> about the start of our performance journey. We've come a long way since then and we're excited to share an update on Rustls performance today.</p>
<h2 id="what-is-rustls">What is Rustls?</h2>
<p>Rustls is a memory safe TLS implementation with a focus on performance. It is production ready and used in a wide range of applications. You can read more about its history on <a href="https://en.wikipedia.org/wiki/Rustls">Wikipedia</a>.</p>
<p>It comes with a C API and <a href="https://www.memorysafety.org/blog/rustls-with-aws-crypto-back-end-and-fips/">FIPS support</a> so that we can bring both memory safety and performance to a broad range of existing programs. This is important because OpenSSL and its derivatives, widely used across the Internet, have a long history of memory safety vulnerabilities with more being found this year. It's time for the Internet to move away from C-based TLS.</p>
<h2 id="handshake-performance">Handshake Performance</h2>
<p>The first metric we'll look at is the number of handshakes that can be completed per second on the same hardware with the same resource constraints. These tests connect one client to one server over a memory buffer, and then measure the time elapsed in client and server processing — therefore, they give an upper bound on performance given no network latency or system call overhead.</p>
<p><img src="/images/blog/blog-2024-10-22-chart1.png" alt=""></p>
<p><img src="/images/blog/blog-2024-10-22-chart2.png" alt=""></p>
<p>Rustls leads in every scenario tested.</p>
<h2 id="throughput-performance">Throughput Performance</h2>
<p>The next metric we'll look at is throughput on the same hardware with the same resource constraints, in terms of megabytes per second:</p>
<p><img src="/images/blog/blog-2024-10-22-chart3.png" alt=""></p>
<p>Rustls leads across the board in throughput as well.</p>
<h2 id="testing-methodology">Testing Methodology</h2>
<p>Tests were performed using Debian Linux on a bare-metal Intel Xeon E-2386G CPU with hyper-threading disabled, dynamic frequency scaling disabled, and the CPU scaling governor set to performance for all cores. More details are available <a href="https://gist.github.com/ctz/deaab7601f20831d0f9d4bf5f3ac734a">here</a>.</p>
<h2 id="try-rustls">Try Rustls!</h2>
<p>Rustls is ready for production use today and we encourage folks to <a href="https://github.com/rustls/rustls">try it out</a>. In addition to memory safety and great performance, it offers:</p>
<ul>
<li>C and Rust APIs</li>
<li>FIPS Support</li>
<li>Post-quantum key exchange (updated algorithms coming soon)</li>
<li>Encrypted Client Hello (client side)</li>
<li>OS trust verifier support</li>
</ul>
<h2 id="thank-you">Thank You</h2>
<p>Rustls uses the <a href="https://github.com/aws/aws-lc-rs">aws-lc-rs</a> cryptographic library by default. We'd like to thank the aws-lc-rs team at AWS for helping us reach our performance goals, and for being generally helpful with our adoption of their library. We couldn't have asked for better partners in this.</p>
<p>We'd also like to thank Intel for helping with AVX-512 optimizations for aws-lc-rs recently. This was an important part of achieving our performance goals.</p>
<p>We would not be able to do this work without our funders. Thank you to Sovereign Tech Fund, Alpha-Omega, Google, Fly.io, and Amazon Web Services for their support.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/rustls-performance-outperforms/</guid>
      </item><item>
        <title>River Reverse Proxy Making Great Progress</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/river-release/</link>
        <pubDate>Tue, 17 Sep 2024 00:00:00 +0000</pubDate>
        <description><![CDATA[<h2 id="our-latest-release">Our Latest Release</h2>
<p>The <a href="https://github.com/memorysafety/river">River</a> reverse proxy has come a long way since we <a href="https://www.memorysafety.org/blog/introducing-river/">announced the project</a> in February. In addition to basic proxy functionality, River now has:</p>
<p><em>Load Balancing Support</em>: The ability to divide up incoming traffic to be forwarded to different back-end destinations in order to spread out load. It is one of the most essential pieces of functionality for a reverse proxy these days.</p>
<p><em>Rate Limiting Support</em>: The ability to control the rate at which the reverse proxy will accept certain kinds of requests. This can be used for anti-abuse or load control purposes.</p>
<p><em>KDL-based Configuration</em>:  <a href="https://kdl.dev/">KDL</a> is an easy-to-use language for writing out configurations. It's particularly strong when it comes to configuration with nested directives, which is common for reverse proxy configurations. This will likely be the default configuration method in the future.</p>
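<p>To illustrate the kind of nesting KDL handles well, here is a small hypothetical service definition. This example was written for this post to show KDL's syntax; it is not River's actual configuration schema, which is documented in the River user manual.</p>

```kdl
// Hypothetical reverse-proxy service definition -- illustrative only,
// not River's real configuration format.
services {
    web {
        listeners {
            "0.0.0.0:443"
        }
        connectors {
            "10.0.0.10:8080"
            "10.0.0.11:8080"
        }
        rate-limiting requests-per-second=500
    }
}
```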
<p><em>Static File Support</em>: It's common for reverse proxy deployments to want to be able to serve static files in response to certain requests, things like HTML and CSS files or media assets.</p>
<p><em>Graceful Reloads</em>: The ability to restart River with a new configuration, without disrupting the handling of existing or new downstream connections: existing connections continue to be handled by the previous instance of River, while all new connections are handled by the new instance.</p>
<p><em>CIDR Range Blocking</em>: The ability to block entire incoming IPv4 or IPv6 ranges, for example in response to malicious traffic, where it is desirable to terminate the connection as quickly as possible to prevent attacks that would require additional resources to process.</p>
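<p>The check behind CIDR range blocking is straightforward; here is a hedged Python sketch using only the standard library. The function name and block list are hypothetical, and the ranges shown are documentation prefixes used purely as examples.</p>

```python
import ipaddress

# Hypothetical block list: 203.0.113.0/24 and 2001:db8::/32 are reserved
# documentation ranges, used here only as stand-ins for malicious sources.
BLOCKED = [ipaddress.ip_network(n) for n in ("203.0.113.0/24", "2001:db8::/32")]

def should_drop(remote_addr: str) -> bool:
    # Match the peer address against every blocked range of the same IP
    # version; a hit means the connection is terminated immediately,
    # before any further processing spends resources on it.
    ip = ipaddress.ip_address(remote_addr)
    return any(ip in net for net in BLOCKED if net.version == ip.version)
```
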
<p>Please see the <a href="https://github.com/memorysafety/river/releases">release notes</a> for more detailed information, as well as the <a href="https://github.com/memorysafety/river/blob/main/user-manual/src/SUMMARY.md">user manual</a>.</p>
<h2 id="why-river">Why River?</h2>
<p>Just about every significantly-sized deployment on the Internet makes use of reverse proxy software, and the most commonly deployed reverse proxy software is not memory safe. This means that most deployments have millions of lines of C and C++ handling incoming traffic at the edges of their networks, a risk that needs to be addressed if we are to have greater confidence in the security of the Internet.</p>
<p>There are reverse proxies written in memory safe languages, but for the most part their language choice caps performance at a level below what C and C++ can accomplish. River aims to offer safety and high performance at the same time by using Rust, as well as offering the following architectural advantages:</p>
<ul>
<li>Better connection reuse than proxies like Nginx due to a multithreading model, which greatly improves performance.</li>
<li>WASM-based scriptability means scripts will be performant and can be written in any language that compiles to WASM.</li>
<li>Simple configuration, as we've learned some lessons from configuring other software for the past couple of decades.</li>
</ul>
<p>The risk that comes with running reverse proxy software that consists of millions of lines of C and C++ on the edge of just about every significant network ought to be viewed as simply unacceptable going forward. We need a memory safe alternative with performance characteristics that can meet the needs of the most demanding high-volume environments.</p>
<h2 id="what-s-next">What's Next</h2>
<p>While we do not believe River is ready for production deployments yet, we encourage people to give it a spin and file issues and feature requests. Our own goal is to have River ready to replace Nginx and other reverse proxy software used by <a href="https://letsencrypt.org/">Let's Encrypt</a> within the next year, and we encourage other organizations to start considering where they might start to improve the security of their networks with memory safe proxy software.</p>
<p>In the next release of River we plan to include full support for getting and managing certificates using the ACME protocol, as well as a change from BoringSSL to Rustls as the default TLS library. After that, per our <a href="https://github.com/memorysafety/river/blob/main/docs/roadmap.md">roadmap</a>, we will work on adding better support for active service discovery modes followed by improvements to path control facilities. Soon after that we will likely start the process of integrating extensibility via WebAssembly (WASM), a cornerstone feature.</p>
<h2 id="thank-you">Thank You</h2>
<p>We'd like to thank Shopify and Chainguard for their financial support of this project, as well as the <a href="https://github.com/cloudflare/pingora">Pingora</a> team at Cloudflare for making their networking library available and working closely with us. Without them, this project would not have been possible. If your organization would like to contribute financial support, please reach out to us at <a href="mailto:donate@abetterinternet.org">donate@abetterinternet.org</a>. We have a lot of work left to do and your help will get us there faster.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/river-release/</guid>
      </item><item>
        <title>Optimizing rav1d, an AV1 Decoder in Rust</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/rav1d-performance-optimization/</link>
        <pubDate>Tue, 10 Sep 2024 00:00:00 +0000</pubDate>
        <description><![CDATA[<div>
  <blockquote class="blockquote">
    <span class="quote"></span>
    <div class="quote-text">
      <p class="font-italic lh-170">AV1 is an increasingly important video format and it needs a memory safe, high performance decoder. We worked with the team at Immunant to develop <code>rav1d</code>. Performance is critical in this context, so we've asked Stephen Crane, CTO of Immunant, to explain their efforts in achieving performance parity. If you'd like to dig deeper, <a href="/blog/porting-c-to-rust-for-av1/">check out our recent blog post</a> about how we ported the C AV1 decoder to Rust.</p>
      <footer class="blockquote-footer"><cite title="Source Title">Josh Aas, Head of ISRG's Prossimo project</cite></footer>
    </div>
  </blockquote>
</div>
<p><a href="https://github.com/memorysafety/rav1d"><code>rav1d</code></a> is a port of the high performance <a href="https://code.videolan.org/videolan/dav1d"><code>dav1d</code></a> AV1 decoder from C to memory safe Rust. An essential goal of this project is performance: building a memory safe decoder that is competitive with the leading C implementation. In our <a href="https://www.memorysafety.org/blog/porting-c-to-rust-for-av1/">last blog post</a>, we described our migration process for this project and some of the challenges and solutions we found while rewriting unsafe Rust transpiled from the <code>dav1d</code> C code into safe, idiomatic Rust.</p>
<p>In this post, we will further explore the performance optimization side of this project. Many of the performance critical operations in <code>dav1d</code> are implemented in native assembly, so we reused the <code>dav1d</code> assembly code for these low-level, highly-tuned routines. Rust still accounts for almost exactly half of the total run time in decoding, even with all available assembly routines enabled. Optimizing this Rust code (after transpiling) has been critical to the performance of our implementation. We will cover some of the factors that we found were important to performance (along with some that weren't) and the process of optimizing <code>rav1d</code>.</p>
<h3 id="performance-measurement">Performance Measurement</h3>
<p>Before starting in on the performance journey of <code>rav1d</code>, it's worth noting exactly how we measured performance. When not otherwise specified, we measured performance by using <a href="https://github.com/sharkdp/hyperfine"><code>hyperfine</code></a> to measure the <code>dav1d</code> CLI tool (built from either C or our <code>rav1d</code> Rust implementation) decoding the Chimera<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> 8-bit, 1080p AV1 reference video on a Ryzen 7700X with 8 threads. We measured with 10-bit input and on other CPUs (including Intel and AMD x86, and Apple and Android ARM) and found roughly similar performance results unless we note otherwise. We measured performance counters with Linux <code>perf</code> and Intel VTune, and compared sampled performance profiles using both tools.</p>
<p>All percentage performance comparisons use <code>dav1d</code> compiled with Clang 18 as a baseline. We baselined with Clang 18 because this is the same compiler backend used in the version of <code>rustc</code> we were using (1.81-nightly). Surprisingly, we found that <code>dav1d</code> built with Clang 18 had worse performance than a version built with Clang 17 (~3% slower). We do not know the source of this regression, but found it to be consistent across Clang and <code>rustc</code> using corresponding LLVM backends.</p>
<h2 id="starting-point">Starting Point</h2>
<p>We began this project from Rust code we transpiled from <code>dav1d</code> using the <a href="https://github.com/immunant/c2rust"><code>c2rust</code></a> tool. Initially we were focused on refactoring and rewriting this starting point into idiomatic, safe Rust, while attempting to not obviously degrade performance. In later measurements, we found that this initial transpiled version was 3.8% slower than the original C code, after we enabled all of the available assembly routines in both versions. This was surprising, as the functioning of the Rust code at this point should be basically identical to the C version. <code>c2rust</code> preserves low-level semantics when translating to unsafe Rust, and Rust and Clang share the same LLVM compiler backend.</p>
<p>We theorized that bounds checks introduced during transpilation could be part of this slowdown. <code>c2rust</code> replaces some fixed-size local and global arrays with Rust arrays, which introduces bounds checks on accesses. Transpiling does not add bounds checks for access to variable-sized buffers, heap allocations, and some fixed-size arrays. These are translated to unsafe raw pointer operations in Rust, so the vast majority of pointer operations are still unchecked at this point in our process. To test our theory about array bounds checking, we hacked on <code>rustc</code> to remove bounds checks from all slice and array accesses, and compiled a version with this modified compiler. This version was still 3% slower than <code>dav1d</code>, so we can only attribute less than 1% of the transpiled code overhead to bounds checking. Compiling fully ported <code>rav1d</code> with the bounds check-less <code>rustc</code> similarly resulted in only a small performance improvement.</p>
<p>Rust also defines signed integer overflow to wrap in release mode, while in C such overflow is undefined and the compiler is free to optimize assuming overflow cannot occur. <code>dav1d</code> tends to use signed integers in a lot of operations, so we suspected this could be a source of performance regression. To test whether this lost optimization potential was a factor, we modified <code>rustc</code> to generate unchecked signed arithmetic (<code>add</code>, <code>sub</code>, <code>mul</code>) rather than wrapping. Some isolated instances of unchecked signed arithmetic resulted in more efficient code, but this was not a large enough difference to be measurable in aggregate.</p>
<p>When investigating performance counters, we found that the transpiled Rust version executed significantly (7%) more instructions than the original C version. Approximately 2% of the additional 7% instructions executed are from bounds checks. We believe that some of the remaining additional instructions are from additional type conversions and arithmetic in the transpiled code, possibly along with less efficient LLVM IR generation by the Rust frontend.</p>
<h2 id="optimization-process">Optimization Process</h2>
<p>As we rewrote the unsafe <code>c2rust</code> output into safe and idiomatic Rust, we tried to avoid making changes that significantly affected performance. We monitored performance regressions by graphing decode time for each commit and checking for noticeable jumps, as well as checking performance manually on changes that we guessed might hurt performance. As we went along, we found increasingly subtle factors became performance bottlenecks. The first performance issue we hit was dynamic dispatch to assembly, as these calls are very hot. We then began adding inner mutability when necessary but had to carefully avoid contention. We found as we removed pointers and transitioned to safe Rust types that bounds checks increasingly became a larger factor. Buffer and structure initialization was also an issue as we migrated to safe, owned Rust types. Once we had rewritten the bulk of the code into safe Rust, we profiled and carefully optimized small factors such as branching, inlining, and stack usage until we hit diminishing returns.</p>
<h3 id="dynamic-dispatch">Dynamic Dispatch</h3>
<p><code>dav1d</code> and <code>rav1d</code> handle assembly routines by providing different assembly functions corresponding to different CPU SIMD features, as well as a backup function in C and Rust, respectively. If assembly was enabled at compile-time, the library then uses runtime CPU feature detection to select which function to use for each routine. <code>dav1d</code> does this using function pointers, and <code>rav1d</code> does too, with some modifications.</p>
<p>Initially, we tried replacing the (indirect) function pointer dispatch with <code>match</code>ing on the <code>enum CpuFlags</code> and then making direct function calls. This should compile down to a jump table that, like the indirect function pointer, will be perfectly predicted. This technique is known to improve performance in many other contexts, to the point that there is a crate to help do this: <a href="https://docs.rs/enum_dispatch"><code>enum_dispatch</code></a>. However, when profiling, we found this was actually slower than the function pointers, so we returned to them. It is not clear why enum dispatch was slower in this case; this may be a pattern the Rust compiler could optimize better.</p>
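<p>To make the comparison concrete, here is a minimal sketch of the two dispatch strategies. All names (<code>CpuFlags</code>, <code>avg_*</code>, <code>Dsp</code>) are illustrative stand-ins rather than actual <code>rav1d</code> symbols, and the "assembly" routines are plain Rust functions here:</p>

```rust
#[derive(Clone, Copy)]
enum CpuFlags {
    Fallback,
    Sse2,
    Avx2,
}

// Stand-ins for what would be the Rust fallback and assembly routines.
fn avg_fallback(a: u32, b: u32) -> u32 { (a + b) / 2 }
fn avg_sse2(a: u32, b: u32) -> u32 { (a + b) / 2 }
fn avg_avx2(a: u32, b: u32) -> u32 { (a + b) / 2 }

// Enum dispatch: `match` on the detected CPU features at each call site,
// intended to compile to a jump table. This was tried first but measured
// slower in hot paths.
fn avg_enum_dispatch(flags: CpuFlags, a: u32, b: u32) -> u32 {
    match flags {
        CpuFlags::Fallback => avg_fallback(a, b),
        CpuFlags::Sse2 => avg_sse2(a, b),
        CpuFlags::Avx2 => avg_avx2(a, b),
    }
}

// Function-pointer dispatch: select the routine once at initialization,
// then make (perfectly predicted) indirect calls. This is the approach kept.
struct Dsp {
    avg: fn(u32, u32) -> u32,
}

fn select_dsp(flags: CpuFlags) -> Dsp {
    Dsp {
        avg: match flags {
            CpuFlags::Fallback => avg_fallback,
            CpuFlags::Sse2 => avg_sse2,
            CpuFlags::Avx2 => avg_avx2,
        },
    }
}
```

<p>Both indirect forms are perfectly predicted in steady state; only measurement revealed which was faster in practice.</p>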
<p>One significant advantage to the <code>enum</code> dispatch approach is that it is trivial to make the fallback Rust function call be slightly different. This is needed to pass different arguments so that the fallback function can be made fully safe. With function pointers, however, this isn't as simple. We first tried changing the function pointer signature and adding shims around assembly functions that converted from the safe signature to the <code>unsafe</code> asm signature. However, this added an extra function call that hurt performance. It may be possible to remove this overhead with better cross-language link/binary-time optimization to allow inlining of assembly functions.</p>
<p>Instead, we ended up passing extra arguments to the function pointers. The assembly functions remain exactly the same and just ignore the extra arguments. This is safe to do according to the calling convention for our architectures. Passing unused arguments is inexpensive, so it ended up not hurting performance. However, we did run into issues where we wanted to pass arguments with non-stable (i.e. non <code>#[repr(C)]</code>) ABIs. In our particular case this is fine, since only the Rust fallback function will actually read and use them, but the compiler does not know this. Ideally, we could add <code>#[allow(improper_ctypes)]</code> to such parameter types, but parameter type attributes are not yet allowed. To work around this, we added an <code>FFISafe&lt;T&gt;</code> type that converts between <code>&amp;T</code> and <code>*const FFISafe&lt;T&gt;</code>. <code>FFISafe</code> contains a <code>PhantomData</code> and a bool (to not be a ZST), and the conversion is just a pointer cast. We pass this <code>FFISafe</code> type as an extra argument to the function pointer, and the Rust fallback function can then access the safe Rust type rather than an unsafe raw pointer.</p>
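<p>A simplified sketch of such a wrapper follows; the exact field layout and method names of <code>rav1d</code>'s <code>FFISafe</code> may differ. The key property is that the "conversion" is only a pointer cast, and the pointer is never dereferenced except by the Rust fallback, which casts it back:</p>

```rust
use std::marker::PhantomData;

#[repr(C)]
struct FFISafe<'a, T> {
    // A field so the type is not a ZST (illustrative).
    _not_zst: bool,
    _phantom: PhantomData<&'a T>,
}

impl<'a, T> FFISafe<'a, T> {
    // Disguise a safe reference as an FFI-stable pointer; just a cast.
    fn new(r: &'a T) -> *const Self {
        r as *const T as *const Self
    }

    // SAFETY: `ptr` must have come from `FFISafe::new` on a still-live `&T`.
    // Only the Rust fallback function ever calls this; the assembly
    // routines ignore the extra argument entirely.
    unsafe fn get(ptr: *const Self) -> &'a T {
        &*(ptr as *const T)
    }
}
```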
<h3 id="inner-mutability">Inner Mutability</h3>
<p>We had to introduce new locks and atomic fields while porting <code>dav1d</code> to safe Rust to provide inner mutability of shared data structures. However, adding a <code>Mutex</code> or <code>RwLock</code> around any fields and structures that needed inner mutability resulted in lock contention across threads, as different subfields and data were accessed by threads concurrently in <code>dav1d</code>. We had to carefully consider how to architect and split data structures so that threads would not contend with each other on write access to fields with inner mutability. We primarily ensured this by using <code>Mutex::try_lock()</code> and <code>RwLock::try_{read,write}()</code> whenever adding new locks, and testing thoroughly to make sure that taking a lock would never fail due to contention. We tried using the <a href="https://crates.io/crates/try-lock"><code>try-lock</code></a> crate but found that the fallible locking in the <a href="https://crates.io/crates/parking_lot"><code>parking_lot</code></a> crate was just as efficient for our uses.</p>
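<p>The try-lock discipline can be sketched as follows (a hypothetical example, not <code>rav1d</code> code). If the architectural analysis is right, <code>try_lock</code> never fails; a failure during testing points directly at unexpected contention rather than silently blocking:</p>

```rust
use std::sync::Mutex;

// Hypothetical shared counter with inner mutability behind a lock that,
// by design, is never contended.
fn update_stats(stats: &Mutex<u64>, delta: u64) {
    // Never block: if the lock is held, that is a design bug we want to
    // surface immediately, so we panic instead of waiting.
    let mut guard = stats.try_lock().expect("unexpected lock contention");
    *guard += delta;
}
```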
<p>When possible and reasonable, we used atomic data types to provide inner mutability rather than introducing new locks. The <a href="https://docs.rs/atomig"><code>atomig</code></a> crate was especially helpful in using atomics for enums and small aggregate types. We found that, at least on x86_64 and ARM, relaxed atomic loads and stores use the same instructions as non-atomic loads and stores but allow inner mutability of the atomic types supported in Rust. We did have to be careful not to use any of the <code>.fetch_*</code> methods of Rust atomics, as these methods lower to different instructions that require a bus lock and add noticeable overhead. Instead, we read from the atomic, perform the operation, and then write the result back to the atomic. As each individual operation is atomic, this is sound Rust, although the operation as a whole is not atomic. We were confident that these data accesses were not contended in the first place, as they were not atomic in the <code>dav1d</code> C implementation. At worst, if this assumption does not hold, we would have a (sound) race condition, not a memory safety bug. We encapsulated these relaxed atomic types in a wrapper type, <code>RelaxedAtomic&lt;T&gt;</code>, to indicate they were only atomic to provide inner mutability. This wrapper also prevents accidental use of any slow atomic operations.</p>
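<p>A minimal version of this wrapper might look like the following, simplified to <code>u32</code> for brevity (the real type is generic and, via <code>atomig</code>, also covers enums and small aggregates):</p>

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Wrapper signalling that atomics are used only for inner mutability,
// never for cross-thread synchronization.
struct RelaxedAtomic(AtomicU32);

impl RelaxedAtomic {
    fn new(v: u32) -> Self {
        RelaxedAtomic(AtomicU32::new(v))
    }

    // Relaxed loads/stores compile to plain loads/stores on x86_64 and ARM.
    fn get(&self) -> u32 {
        self.0.load(Ordering::Relaxed)
    }

    fn set(&self, v: u32) {
        self.0.store(v, Ordering::Relaxed)
    }

    // Read-modify-write as a separate load and store, deliberately avoiding
    // `fetch_add` and friends, which emit bus-locked instructions. The whole
    // update is not atomic, but each access is, so this stays sound.
    fn update(&self, f: impl FnOnce(u32) -> u32) {
        self.set(f(self.get()));
    }
}
```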
<h3 id="bounds-checking">Bounds Checking</h3>
<p>Bounds checking is one of the largest differences between our safe Rust and the original unsafe C code. C does no implicit bounds checking, while Rust implicitly checks that all buffer accesses are in-bounds. As we migrated <code>unsafe</code> Rust to safe, idiomatic Rust, we added many bounds checks through the use of safe Rust types. We tried to proactively minimize the impact of these checks, but measuring performance was critical to ensuring that our efforts were useful. The general idea in eliding unnecessary bounds checks was that we needed to expose as much information about indices and slice bounds to the compiler as possible. We found many cases where we knew, from global context, that indices were guaranteed to be in range, but the compiler could not infer this only from local information (even with inlining). Most of our effort to elide bounds checks went into exposing additional context to buffer accesses.</p>
<p>The simplest case of bounds checking is iterating over elements in one or more slices. At first we attempted to use iterators whenever possible to avoid bounds checks. We soon realized that they tended to introduce complexity at both the source code and assembly instruction levels, because loops in the <code>dav1d</code> C code are often more complex than a simple iteration. Instead, we tended toward a technique we call pre-slicing: when indexing into a slice or array in a loop (or just multiple times), first slice up to the length that will be accessed. For example, instead of this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-Rust" data-lang="Rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">square</span><span class="p">(</span><span class="n">src</span>: <span class="kp">&amp;</span><span class="p">[</span><span class="kt">u8</span><span class="p">],</span><span class="w"> </span><span class="n">dst</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="kt">u8</span><span class="p">],</span><span class="w"> </span><span class="n">len</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">for</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="mi">0</span><span class="o">..</span><span class="n">len</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">dst</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">src</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">src</span><span class="p">[</span><span class="n">i</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>we can do this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-Rust" data-lang="Rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">square</span><span class="p">(</span><span class="n">src</span>: <span class="kp">&amp;</span><span class="p">[</span><span class="kt">u8</span><span class="p">],</span><span class="w"> </span><span class="n">dst</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="kt">u8</span><span class="p">],</span><span class="w"> </span><span class="n">len</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">src</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">src</span><span class="p">[</span><span class="o">..</span><span class="n">len</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">dst</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">dst</span><span class="p">[</span><span class="o">..</span><span class="n">len</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">for</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="mi">0</span><span class="o">..</span><span class="n">len</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">dst</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">src</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">src</span><span class="p">[</span><span class="n">i</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This moves the bounds check outside of the loop to the pre-slice, which makes the loop codegen simpler and more amenable to auto-vectorization and other optimizations.</p>
<p>When pre-slicing with ranges with a non-zero start offset, eliminating the lower bounds check was an important optimization. To eliminate the lower check, the compiler must guarantee that the lower bound will always be less than or equal to the upper bound. For example, with the range <code>i..i + n</code>, if the compiler can guarantee that <code>i &lt;= i + n</code>, then checking that <code>i + n</code> is in range is sufficient to ensure the whole range is valid. This requires that we can ensure that <code>i + n</code> will not overflow and wrap around to zero. We had to be careful that our arithmetic was done with the correct precision to ensure that computation could not overflow. In this case, if <code>i</code> and <code>n</code> are <code>u32</code> values, then we can first cast to <code>usize</code> before doing the addition to ensure the operation will not overflow. We had to be particularly careful about this arithmetic precision in Rust: because signed integer overflow wraps in release builds rather than being undefined behavior as it is in C, the compiler cannot simply assume that overflow never happens.</p>
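<p>For example (with a hypothetical function), casting to <code>usize</code> before the addition lets a single upper-bound check cover the whole range on 64-bit targets:</p>

```rust
// Sum an `n`-element window of `buf` starting at index `i`.
// Panics if the window is out of range.
fn sum_window(buf: &[u8], i: u32, n: u32) -> u32 {
    // Cast before adding: `start + n as usize` cannot wrap on 64-bit
    // targets, so only the upper bound of the range needs checking.
    let start = i as usize;
    let window = &buf[start..start + n as usize];
    window.iter().map(|&b| b as u32).sum()
}
```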
<p>The type of an index can sometimes guarantee that an index is in bounds. For example, a boolean value used to index into a two element array does not require a bounds check. To take advantage of this property in <code>rav1d</code>, we had to rewrite boolean values stored as integer types in C to Rust <code>bool</code> types and only cast at the point we are indexing with the value. Similarly for small, fixed sets of named integer values, we can use a Rust <code>enum</code> and only cast the value to a <code>usize</code> when it is used as an index.</p>
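<p>Sketched with a hypothetical enum, the pattern looks like this; because the index type's range cannot exceed the array length, the compiler can drop the check (though whether it actually does should be verified in the generated code):</p>

```rust
// A hypothetical three-valued enum, standing in for the small named
// integer sets rewritten as Rust enums in rav1d.
#[derive(Clone, Copy)]
enum Plane {
    Luma = 0,
    ChromaU = 1,
    ChromaV = 2,
}

fn pick(pair: &[u32; 2], flag: bool) -> u32 {
    // `flag as usize` is 0 or 1, so indexing a `[T; 2]` never needs a check.
    pair[flag as usize]
}

fn plane_dim(dims: &[u32; 3], plane: Plane) -> u32 {
    // The enum has exactly three values; the cast happens only at the
    // point of indexing.
    dims[plane as usize]
}
```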
<p>Inlining is crucial in helping us propagate range information to uses where it allows bounds check elimination. We made our bounds checking methods (such as in our <code>DisjointMut</code> type, which we discussed further in our <a href="https://www.memorysafety.org/blog/porting-c-to-rust-for-av1/">blog post on refactoring <code>rav1d</code></a>) as small as possible and marked them with <code>#[inline]</code>. We moved the error handling code to an inner <code>#[inline(never)]</code> function, as the standard library does. The panicking and message formatting code is quite large, but since it's not on the hot path, we can put it in its own never inlined function. This way, the hot code will only have a branch and call to this error function (as well as the actual bounds check compare), and there will only be one copy of the error function. Adding this optimization resulted in all of our bounds check functions being fully inlined and made a noticeable performance improvement. Hot/cold path splitting (for panicking code paths) and partial inlining by the compiler could have eliminated the need for this manual annotation.</p>
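<p>The shape of this hot/cold split, with illustrative names, is roughly the following: the inlined accessor carries only a compare and a call, while the large formatting and panicking machinery lives in a single never-inlined function:</p>

```rust
// Cold path: formatting and panicking, kept out of the hot code and
// emitted exactly once.
#[inline(never)]
#[cold]
fn index_fail(index: usize, len: usize) -> ! {
    panic!("index {index} out of bounds for length {len}");
}

// Hot path: after inlining, this is just a compare, a branch, and the load.
#[inline]
fn checked_get(buf: &[u8], index: usize) -> u8 {
    if index >= buf.len() {
        index_fail(index, buf.len());
    }
    buf[index]
}
```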
<p>In some cases, inlining may not put range information in the same optimization scope as the check we want elided. We cannot direct the compiler to only inline the small parts of larger functions that contain just the information we want to propagate. In some cases when the range is a power of 2, we can use a cheap <code>&amp;</code> mask. In others, we can use <code>cmp::min</code>, as a <code>cmov/csel</code> is generally cheaper (and shorter!) than a panicking branch. For cases where we have a fixed, known range, we can encode this information in the type system instead. We created an <code>InRange&lt;T, MIN, MAX&gt;</code> type, which imposes the restriction that the wrapped <code>T</code> type is in the range <code>MIN..=MAX</code>. For example, for a value in the range <code>0..1024</code>, we use the type <code>InRange&lt;u16, 0, {1024 - 1}&gt;</code>. At creation we check that the value is in bounds, and then an inlined getter adds an <a href="https://doc.rust-lang.org/stable/core/hint/fn.assert_unchecked.html"><code>assert_unchecked</code></a> that the value is in the type's range. The optimizer can then rely on this property to eliminate bounds checks when using that value as an index into a sufficiently large buffer.</p>
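<p>A simplified sketch of the idea, fixed to <code>u16</code> for brevity (<code>rav1d</code>'s actual type is also generic over the integer type):</p>

```rust
// Invariant: the wrapped value is always in MIN..=MAX.
struct InRange<const MIN: u16, const MAX: u16>(u16);

impl<const MIN: u16, const MAX: u16> InRange<MIN, MAX> {
    // The only constructor checks the bounds once.
    fn new(v: u16) -> Option<Self> {
        (MIN..=MAX).contains(&v).then(|| Self(v))
    }

    #[inline]
    fn get(&self) -> u16 {
        // SAFETY: `new` is the only constructor and enforced the range,
        // so this hint restates an invariant that always holds. The
        // optimizer can use it to elide later bounds checks.
        unsafe { std::hint::assert_unchecked(self.0 >= MIN && self.0 <= MAX) };
        self.0
    }
}
```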
<h3 id="initialization">Initialization</h3>
<p>Initialization of large stack buffers is costly. <code>dav1d</code> handled this efficiently by declaring buffers uninitialized and (sometimes only partially) initializing them just before use. In Rust, access to uninitialized memory is unsafe, so <code>c2rust</code> properly initializes all variables before use. This is generally fine, as the types are small and/or the optimizer can optimize out redundant initializations. However, for large arrays with complex initialization patterns, zero-initializing stack variables proved expensive. In these cases, we had to use an array of <code>MaybeUninit</code> values, as shown in the example below, and verify that all reads only access initialized elements. When array methods for <code>MaybeUninit</code> are stabilized, some of this usage can be simplified, but the fundamental use of uninitialized memory will remain.</p>
<figure>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-Rust" data-lang="Rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">txa</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">[[[[</span><span class="n">MaybeUninit</span>::<span class="n">uninit</span><span class="p">();</span><span class="w"> </span><span class="mi">32</span><span class="p">];</span><span class="w"> </span><span class="mi">32</span><span class="p">];</span><span class="w"> </span><span class="mi">2</span><span class="p">];</span><span class="w"> </span><span class="mi">2</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="c1">// Calls to `decomp_tx` that initialize portions of the `txa` array omitted.
</span></span></span><span class="line"><span class="cl"><span class="c1">// After these calls to `decomp_tx`, the following elements of `txa` are initialized:
</span></span></span><span class="line"><span class="cl"><span class="c1">// * `txa[0][0][0..h4][0..w4]`
</span></span></span><span class="line"><span class="cl"><span class="c1">// * `txa[1][0][0..h4][0..w4]`
</span></span></span><span class="line"><span class="cl"><span class="c1">// * `txa[0][1][0..h4][x]` where `x` is the start of a block edge
</span></span></span><span class="line"><span class="cl"><span class="c1">// * `txa[1][1][y][0..w4]` where `y` is the start of a block edge
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="c1">// Subsequent use of `txa`
</span></span></span><span class="line"><span class="cl"><span class="c1">// SAFETY: y &lt; h4 so txa[0][0][y][0] is initialized.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="kd">let</span><span class="w"> </span><span class="n">txa_y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">txa</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="n">y</span><span class="p">][</span><span class="mi">0</span><span class="p">].</span><span class="n">assume_init</span><span class="p">()</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="c1">// SAFETY: h4 - 1 &lt; h4 and ..w4 &lt; w4 so txa[1][0][h4 - 1][..w4] is
</span></span></span><span class="line"><span class="cl"><span class="c1">// initialized. Note that this can be replaced by
</span></span></span><span class="line"><span class="cl"><span class="c1">// `MaybeUninit::slice_assume_init_ref` if it is stabilized.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="kd">let</span><span class="w"> </span><span class="n">txa_slice</span><span class="w"> </span><span class="o">=</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">&amp;*</span><span class="p">(</span><span class="o">&amp;</span><span class="n">txa</span><span class="p">[</span><span class="mi">1</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="n">h4</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="mi">1</span><span class="p">][</span><span class="o">..</span><span class="n">w4</span><span class="p">]</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="p">[</span><span class="n">MaybeUninit</span><span class="o">&lt;</span><span class="kt">u8</span><span class="o">&gt;</span><span class="p">]</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="p">[</span><span class="kt">u8</span><span class="p">])</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="n">a</span><span class="p">[</span><span class="o">..</span><span class="n">w4</span><span class="p">].</span><span class="n">copy_from_slice</span><span class="p">(</span><span class="n">txa_slice</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><figcaption> Partially initialized stack variable from <a href="https://github.com/memorysafety/rav1d/blob/7d7240943d519288fdc9f2b9532b750bd494bf2f/src/lf_mask.rs#L153">lf_mask.rs</a> </figcaption>
</figure>
<p><code>dav1d</code> and <code>rav1d</code> make many large allocations as they decode large videos. We noticed that <code>dav1d</code> tended to completely <code>free</code> old buffers and re-<code>malloc</code> new ones (which are subsequently lazily initialized). We initially replaced that with APIs like <code>Vec::resize</code>, which avoided having to re-initialize the existing elements. However, for large allocations like the picture data pool, this turned out to be substantially slower. After some debugging, we found that this is because large zero-initialized allocations can be optimized by the allocator and kernel: memory freshly mapped by the kernel is already zeroed and needs no additional initialization. In Rust, such initialization-free large zero allocations can be made through <a href="https://github.com/rust-lang/rust/blob/adf8d168af9334a8bf940824fcf4207d01e05ae5/library/alloc/src/alloc.rs#L169-L171"><code>alloc_zeroed</code></a>, which <code>vec![0; len]</code> is special-cased to use. Once we switched to these APIs, we regained the lost performance.</p>
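<p>The two allocation patterns can be contrasted as follows: a fresh <code>vec![0; len]</code> hits the <code>alloc_zeroed</code> fast path, while growing an existing <code>Vec</code> zeroes the new elements with an in-process <code>memset</code>:</p>

```rust
// Fresh allocation: `vec![0; len]` is special-cased to `alloc_zeroed`,
// so large buffers can come from the kernel already zeroed, with no
// element-by-element initialization in userspace.
fn fresh_zeroed_buffer(len: usize) -> Vec<u8> {
    vec![0u8; len]
}

// Reuse: keeps the existing allocation when capacity allows, but any
// newly added elements are zeroed by the process itself, which measured
// slower for very large pool buffers.
fn reuse_buffer(buf: &mut Vec<u8>, len: usize) {
    buf.clear();
    buf.resize(len, 0);
}
```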
<h3 id="branchless-instructions-and-stack-usage">Branchless Instructions and Stack Usage</h3>
<p>As we focused on the hottest functions, we noticed that certain optimizations we tried had unpredictable effects on code generation for branches. When we made an unrelated change, some other branch would suddenly be mispredicted often and degrade performance even more. It turned out that code that in most contexts was lowered to a branchless <code>cmov/csel</code> instruction was instead being lowered to a conditional branch, and avoiding this proved challenging.</p>
<p>For example, we started with this code:</p>
<p><a name="code-reference-1" id="code-reference-1"></a></p>
<figure>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-Rust" data-lang="Rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">decode_coefs_class</span><span class="o">&lt;</span><span class="k">const</span><span class="w"> </span><span class="no">TX_CLASS</span>: <span class="kt">usize</span><span class="p">,</span><span class="w"> </span><span class="no">BD</span>: <span class="nc">BitDepth</span><span class="o">&gt;</span><span class="p">(</span><span class="o">..</span><span class="p">.)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">tok</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">tok</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">u32</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">tok</span><span class="w"> </span><span class="o">*=</span><span class="w"> </span><span class="mh">0x17ff41</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">level</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">tok</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">u8</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">tok</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">tok</span><span class="w"> </span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="mi">9</span><span class="p">)</span><span class="w"> </span><span class="o">&amp;</span><span class="w"> </span><span class="p">(</span><span class="n">rc</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="o">!</span><span class="mh">0x7ff</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="n">tok</span><span class="w"> </span><span class="o">!=</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">rc</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">rc_i</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><figcaption> <a href="https://github.com/memorysafety/rav1d/pull/1246/files">https://github.com/memorysafety/rav1d/pull/1246/files</a> </figcaption>
</figure>
<p>It was already quite optimized in <code>dav1d</code>, with all of its magic numbers and bitwise operations. When we refactored <code>decode_coefs_class</code> from a C (and then Rust) macro into a const generic function, performance generally stayed the same, except that it introduced a new branch misprediction on <code>if tok != 0</code>, but only for specific combinations of <code>BitDepth</code> and <code>TxClass</code> (<code>TX_CLASS: usize</code> is the discriminant of <code>enum TxClass</code>). After some investigation, we realized that the Rust code for <code>fn decode_coefs</code> (the parent function here) used significantly more stack space than the C code did. There were many stack spills, and we theorize that tiny changes elsewhere in the code (like the <code>TxClass</code> variant) changed stack usage and led LLVM to make different branching decisions while trying to limit stack spills.</p>
<p>We also noticed this was a problem in many other very large functions, so for the large and hot functions, we attempted to minimize stack usage. We tried reducing the size of arguments where possible (e.g. <code>u32</code> instead of <code>usize</code>, <code>bool</code> instead of <code>usize</code>, a <code>u8</code>-sized <code>enum</code> instead of <code>usize</code>), but this did not consistently improve performance.</p>
<p>We noticed that the existence of panicking code significantly increased stack usage. Just as panicking/formatting code prevented crucial inlining in bounds checking code until it was moved out of the function, panicking code also increased stack usage and spills. In <code>decode_coefs</code>, we found one panic hiding in a <a href="https://docs.rs/atomig/latest/atomig/derive.Atom.html"><code>#[derive(Atom)]</code></a>. By <a href="https://github.com/memorysafety/rav1d/pull/1247">removing it with a manual <code>impl</code> of <code>Atom</code></a> that uses <code>.unwrap_or_default()</code> instead of <code>.unwrap()</code>, we reduced <code>decode_coefs</code> stack usage from 296 to 232 bytes.</p>
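The shape of that change can be sketched like this (a hypothetical discriminant-to-enum conversion, not the actual <code>Atom</code> trait implementation):

```rust
// Hypothetical enum standing in for a small rav1d discriminant type.
#[derive(Clone, Copy, PartialEq, Eq, Debug, Default)]
enum TxClass {
    #[default]
    TwoD = 0,
    H = 1,
    V = 2,
}

// Falling back to a default variant instead of calling `.unwrap()`
// removes the panic (and its formatting/unwinding machinery) from hot
// callers, which in turn reduces their stack usage.
fn tx_class_from_raw(raw: u8) -> TxClass {
    match raw {
        1 => TxClass::H,
        2 => TxClass::V,
        _ => TxClass::TwoD,
    }
}
```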
<p>However, reducing stack usage all the way to C levels proved too difficult. <code>rustc</code> seems to generate code that uses more stack space, or that is harder for LLVM to optimize, than <code>clang</code> does for C code. In most cases this doesn't turn into a performance problem, but in especially large and hot functions, the additional stack usage can affect performance by causing seemingly unrelated bottlenecks, like branch misprediction.</p>
<p>We're also aware that <a href="https://web.archive.org/web/20230101080349/http://arewestackefficientyet.com/"><code>rustc</code> tends to generate far more stack copies due to moves</a> than C compilers like <code>clang</code>, some of which have been addressed by new LLVM optimizations for Rust. We investigated whether this might be contributing to the performance overhead of Rust in <code>rav1d</code>, but we do not perform many large moves, owing to the C-style architecture of the codebase.</p>
<p>We ended up having to work around the branch misprediction, so we first tried changing the <a href="#code-reference-1">above code</a> to this:</p>
<figure>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-Rust" data-lang="Rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">decode_coefs_class</span><span class="o">&lt;</span><span class="k">const</span><span class="w"> </span><span class="no">TX_CLASS</span>: <span class="kt">usize</span><span class="p">,</span><span class="w"> </span><span class="no">BD</span>: <span class="nc">BitDepth</span><span class="o">&gt;</span><span class="p">(</span><span class="o">..</span><span class="p">.)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// If we do this after the `tok *= 0x17ff41`,
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">  </span><span class="c1">// it uses a mispredicted branch instead of `cmov`.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">tok_non_zero</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">tok</span><span class="w"> </span><span class="o">!=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">tok</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">tok</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">u32</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">tok</span><span class="w"> </span><span class="o">*=</span><span class="w"> </span><span class="mh">0x17ff41</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">level</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">tok</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">u8</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">tok</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">tok</span><span class="w"> </span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="mi">9</span><span class="p">)</span><span class="w"> </span><span class="o">&amp;</span><span class="w"> </span><span class="p">(</span><span class="n">rc</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="o">!</span><span class="mh">0x7ff</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">if</span><span class="w"> </span><span class="n">tok_non_zero</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="n">rc</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">rc_i</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><figcaption> <a href="https://github.com/memorysafety/rav1d/pull/1246/files">https://github.com/memorysafety/rav1d/pull/1246/files</a> </figcaption>
</figure>
<p>We just stored <code>tok_non_zero</code> in its own variable, and this made the branch branchless again. However, after some more optimizations elsewhere in <code>decode_coefs</code>, the branch came back again, for only one combination of <code>TxClass</code> and <code>BitDepth</code>. This time, we tried to write the code in a more obviously branchless way:</p>
<figure>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-Rust" data-lang="Rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">decode_coefs_class</span><span class="o">&lt;</span><span class="k">const</span><span class="w"> </span><span class="no">TX_CLASS</span>: <span class="kt">usize</span><span class="p">,</span><span class="w"> </span><span class="no">BD</span>: <span class="nc">BitDepth</span><span class="o">&gt;</span><span class="p">(</span><span class="o">..</span><span class="p">.)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">tok</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">tok</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="mh">0x17ff41</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">level</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">tok</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">u8</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// This is optimized differently from C to avoid branches,
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">  </span><span class="c1">// as simple branches are not always optimized to branchless `cmov`s.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">mask</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">tok</span><span class="w"> </span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="mi">9</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">tok</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">mask</span><span class="w"> </span><span class="o">&amp;</span><span class="w"> </span><span class="p">(</span><span class="n">rc</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="o">!</span><span class="mh">0x7ff</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">mask</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">mask</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">u16</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">rc</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">rc_i</span><span class="w"> </span><span class="o">&amp;</span><span class="w"> </span><span class="n">mask</span><span class="p">)</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="p">(</span><span class="n">rc</span><span class="w"> </span><span class="o">&amp;</span><span class="w"> </span><span class="o">!</span><span class="n">mask</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><figcaption> <a href="https://github.com/memorysafety/rav1d/pull/1257/files">https://github.com/memorysafety/rav1d/pull/1257/files</a> </figcaption>
</figure>
<p>We rewrote the logic to avoid an <code>if</code> altogether (a more straightforward approach, replacing the <code>if</code> with indexing like <code>rc = [rc, rc_i][(tok != 0) as usize]</code>, produced a branchless <code>setne</code> but otherwise less efficient code). So far, LLVM has not tried to optimize this back into a branch, given the much more explicit bit operations, but there is no guarantee that it will stay well optimized. One thing that could help would be to stabilize the recently added <a href="https://github.com/rust-lang/rust/pull/128250"><code>select_unpredictable</code></a> intrinsic, which appears to be exactly what we want: it marks a branch as unpredictable so that it is optimized as such.</p>
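The core trick can be distilled into a standalone helper (our illustration, not code from <code>rav1d</code>):

```rust
// Branchless select: returns `a` when `cond` is true, otherwise `b`.
// `wrapping_neg` turns the 0/1 boolean into an all-zeros/all-ones
// mask, so the selection compiles to pure bit operations with no
// branch for the predictor to get wrong.
fn select_branchless(cond: bool, a: u16, b: u16) -> u16 {
    let mask = (cond as u16).wrapping_neg(); // 0xFFFF if cond, else 0x0000
    (a & mask) | (b & !mask)
}
```

The rewritten <code>decode_coefs_class</code> body applies the same idea, deriving its mask from arithmetic it already performs on <code>tok</code>.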
<h3 id="deriving-clone-and-copy">Deriving <code>Clone</code> and <code>Copy</code></h3>
<p>A helpful user <a href="https://github.com/memorysafety/rav1d/issues/1332">pointed out to us</a> that the code generated for clones differs significantly depending on whether a type derives both <code>Clone</code> and <code>Copy</code> or only <code>Clone</code>. As noted in <a href="https://github.com/rust-lang/rust/issues/128081">the corresponding <code>rustc</code> issue</a>, the <code>Copy</code> version is usually more efficient, but there are also cases where the <code>Clone</code>-only version wins. We could make all such types both <code>Clone</code> and <code>Copy</code>, getting better optimizations on average, but we don't want to semantically mark such large types as <code>Copy</code> and risk having them accidentally copied. We hope that <a href="https://github.com/rust-lang/rust/pull/128299">a fix</a> for this issue will land in <code>rustc</code> soon.</p>
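The pattern in question looks like this (an illustrative type, not one from <code>rav1d</code>):

```rust
// Deriving only `Clone` here: adding `Copy` as well would usually let
// rustc lower `.clone()` to a plain memcpy of the whole value, but it
// would also permit large implicit copies whenever the value is used
// by value, which we want to avoid for types this big.
#[derive(Clone)]
struct BlockContext {
    levels: [u8; 1024],
    partition: [u16; 128],
}
```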
<h3 id="diminishing-returns">Diminishing Returns</h3>
<p>As we profiled <code>rav1d</code> and compared hot instructions to the corresponding C implementation, we found many opportunities for small optimizations that added up to a few percent of performance improvement. We have covered the most productive optimization opportunities above, but we also found cases where rewriting small instruction sequences to be more amenable to optimization was profitable. This was a tricky process, as most of our attempts to optimize machine code sequences turned out not to make a difference in overall performance. Accurate benchmarking was critical to ensuring that we did not land changes that added source code complexity without improving performance on all targets.</p>
<p>Inlining was a small source of differences between our Rust implementation and the original C code. Rust inlines across modules, while C (without LTO) will not. We experimented with tuning Rust inlining by adding <code>#[inline(always)]</code> and <code>#[inline(never)]</code> annotations to match the original C inlining behavior where the two differed. We found that the Rust compiler did a fairly good job of making profitable inlining choices without annotations, with the exception of a few functions that were large and cold and therefore should not be inlined. The upstream <code>dav1d</code> codebase did not benefit much from increased inlining scope (LTO did not substantially improve performance), which suggests that the C code was already structured into compilation units with an eye toward useful inlining. As we would then expect, Rust with LLVM generally made the same inlining choices as Clang with the same backend.</p>
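Where we did pin inlining decisions, the annotations looked like this (function names and bodies are illustrative, not from <code>rav1d</code>):

```rust
// Force-inline a tiny, hot helper so it never pays call overhead.
#[inline(always)]
fn hot_helper(x: u32) -> u32 {
    x.rotate_left(3) ^ 0x9e37
}

// Keep a large, cold slow path out of its callers entirely, so it
// does not inflate their code size or stack frames.
#[inline(never)]
fn cold_error_path(code: u32) -> String {
    format!("decoder error {code}")
}
```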
<h2 id="current-state">Current State</h2>
<p>After applying the optimization techniques discussed here (and more), we have reduced performance overhead from a peak of about 11% when we started optimizing the safe Rust implementation in earnest to under 6% now on x86_64. We believe there is still room for further improvement, both in how the compiler optimizes Rust and in our implementation details. This will necessarily be a process of small, incremental improvements, as our profiling now indicates that the remaining overhead is spread roughly evenly across the largest, most complex functions in <code>rav1d</code>.</p>
<p><em>We'd like to thank Amazon Web Services, Sovereign Tech Fund, and Alpha-Omega for supporting the development of this work. If you want to learn more about <code>rav1d</code> or start using it, check it out on <a href="https://github.com/memorysafety/rav1d/">GitHub.</a></em></p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p><a href="http://download.opencontent.netflix.com.s3.amazonaws.com/AV1/Chimera/Old/Chimera-AV1-8bit-1920x1080-6736kbps.ivf">http://download.opencontent.netflix.com.s3.amazonaws.com/AV1/Chimera/Old/Chimera-AV1-8bit-1920x1080-6736kbps.ivf</a>&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/rav1d-performance-optimization/</guid>
      </item><item>
        <title>Porting C to Rust for a Fast and Safe AV1 Media Decoder</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/porting-c-to-rust-for-av1/</link>
        <pubDate>Mon, 09 Sep 2024 00:00:00 +0000</pubDate>
        <description><![CDATA[<div>
  <blockquote class="blockquote">
    <span class="quote"></span>
    <div class="quote-text">
      <p class="font-italic lh-170">AV1 is an increasingly important video format and it needs a memory safe, high performance decoder. We worked with the team at <a href="https://immunant.com/">Immunant</a> to develop <code>rav1d</code>, a Rust-based port of <code>dav1d</code>, a C decoder. This is the first of two blog posts about how the team approached this effort.</p>
      <footer class="blockquote-footer"><cite title="Source Title">Josh Aas, Head of ISRG's Prossimo project</cite></footer>
    </div>
  </blockquote>
</div>
<p>Complex data parsing is one of the most security-critical operations in modern software. Browsers must decode untrusted audio and video inputs encoded with extremely complicated formats in real time. Memory safety bugs in this decoding process are disastrous and common. For example, researchers fuzzing H.264 decoder implementations have demonstrated that these decoders are a <a href="https://www.usenix.org/conference/usenixsecurity23/presentation/vasquez">dangerous source of bugs</a>. AV1 is a similarly complex, widely used video format. We need a <a href="https://www.memorysafety.org/initiative/av1/">memory safe, performant implementation</a> of AV1 format parsing to avoid parsing vulnerabilities in heavily targeted software such as browsers.</p>
<p>To create this fast and safe AV1 decoder, we have ported an existing high performance AV1 decoding library, <a href="https://code.videolan.org/videolan/dav1d"><code>dav1d</code></a>, from C to Rust: <a href="https://github.com/memorysafety/rav1d/"><code>rav1d</code></a>. Our implementation is drop-in compatible with the <code>dav1d</code> C API. Format parsers that were unsafe C are now memory safe Rust. To preserve performance, we have kept the (unsafe) native assembly routines that implement low-level decoding operations. These assembly routines primarily operate on buffers of primitive values using validated data from the Rust parsing code. Historically, most exploitable bugs are in higher level format parsing code, rather than low level data operations. We will continue to fuzz and analyze the assembly routines to mitigate memory corruption bugs at that level.</p>
<p>Our goals for the <code>rav1d</code> implementation are:</p>
<ul>
<li>Move <code>dav1d</code>'s C code to Rust for memory safety</li>
<li>Drop-in C API compatibility</li>
<li>Performance on par with the C implementation</li>
<li>Reuse the assembly code from <code>dav1d</code> and make it easy to frequently synchronize it</li>
<li>X86-64 and ARM64 support</li>
</ul>
<h2 id="migration-approach">Migration Approach</h2>
<p>Writing a high performance and complete AV1 decoder from scratch is a challenging project. It requires a deep understanding of the AV1 standard and domain knowledge about how to best implement this codec format in a performant and compatible way. The <code>dav1d</code> implementation, developed by the VideoLAN and FFmpeg communities and sponsored by the Alliance for Open Media, has been under development for 6 years. It contains about 50k lines of C code and 250k lines of assembly. This implementation is mature, fast, and widely used. Rather than try to reimplement such a decoder in Rust from scratch, we chose to migrate the existing <code>dav1d</code> C code to Rust.</p>
<p>We want to provide a compatible C API to make migration to our new Rust library smoother. We also want to reuse the <code>dav1d</code> assembly code to preserve performance. These compatibility constraints mean that we must preserve some of the same data structure layout as <code>dav1d</code> both externally and internally. We must also call the assembly routines in the exact same way as <code>dav1d</code> does. Functionally, this requires that we implement decoding in effectively the same way as <code>dav1d</code>.</p>
<p>We could have manually rewritten <code>dav1d</code> into Rust one function or module at a time. However, given the compatibility we wanted to retain, this would have been a tedious process. To get to the point where we could make the cross-cutting changes to internal data structures necessary for enforcing memory safety would have required rewriting much of the codebase. Instead we chose to migrate by initially transpiling the C code into equivalent but unsafe Rust using <a href="https://github.com/immunant/c2rust">c2rust</a>. This allowed us to start the rewriting process from a fully working, seamlessly compatible Rust codebase without introducing any new logic bugs. The bulk of the work then consisted of manually refactoring and rewriting the unsafe Rust into safe, idiomatic Rust.</p>
<p>Transpiling into unsafe Rust followed by rewriting provided two important advantages for this project: 1) starting from a fully working Rust implementation allowed us to thoroughly test decoding functionality while incrementally refactoring, and 2) transpiling complex decoding logic reduced the need for expert domain knowledge of the AV1 specification. We found that full CI testing from the beginning while rewriting and improving the Rust code was immensely beneficial. We could make cross-cutting changes to the codebase and run the existing <code>dav1d</code> tests on every commit. Between the static checking of the Rust compiler and integration testing of the full decoder, we spent comparatively less time debugging than we would have implementing a decoder from scratch. The majority of the team on this project were experts in systems programming and Rust, but did not have previous experience with AV codecs. Our codec expert on the project, <a href="http://bossentech.com/">Frank Bossen</a>, provided invaluable guidance but did not need to be involved in the bulk of the effort.</p>
<p>When we first embarked on this project, we estimated that it would take on the order of 7 person-months of effort. In total, we spent more than 20 person-months with a team of 3 developers. The rewriting process, especially while attempting to preserve performance, ended up being far more involved than we had anticipated. Some challenges were due to idiosyncrasies of the <code>dav1d</code> codebase itself or of the c2rust transpiler tooling. For example, the <code>dav1d</code> C code structure resulted in significant duplication of transpiled Rust code for the 8- and 16-bit depths, which we had to manually unify and deduplicate. Interfacing with the existing assembly in a safe and ergonomic way while preserving performance also required significant care and effort. Many of the other challenges we encountered are more fundamental to migrating C code to Rust, so this post focuses on those issues and our solutions.</p>
<h2 id="challenges">Challenges</h2>
<p>We encountered various challenges due to the mismatch between C and safe Rust patterns. Lifetime management required understanding the existing codebase in great detail, but lifetimes and borrows were not the most challenging issues. Rust thread safety, which makes sharing mutable data across worker threads difficult, was a poor fit for the <code>dav1d</code> threading model, which shares almost all data implicitly between threads. Memory ownership and buffer pointers were further important sources of difficulty, as were unions and other unsafe C patterns.</p>
<h3 id="threading">Threading</h3>
<p><code>dav1d</code> uses a pool of worker threads to concurrently execute tasks that do not depend on each other. However, these tasks operate on shared global and per-frame context structures and even share mutable access into the same buffers. For example, Figure 1a shows an excerpt from the root context structure that all threads must access. Each in-progress frame has a <code>Dav1dFrameContext</code> which is shared between worker threads operating on that frame. Each <code>Dav1dTaskContext</code> object is only ever used by a single thread, but in the C version each object is accessible to all other threads via the root <code>Dav1dContext</code>. Finally, the root context contains many other state fields, some of which must be mutable by the main thread but readable in worker threads.</p>
<figure>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-C++" data-lang="C++"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Dav1dContext</span>  <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">Dav1dFrameContext</span> <span class="o">*</span><span class="n">fc</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">unsigned</span> <span class="n">n_fc</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">Dav1dTaskContext</span> <span class="o">*</span><span class="n">tc</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">unsigned</span> <span class="n">n_tc</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">struct</span> <span class="nc">Dav1dTileGroup</span> <span class="o">*</span><span class="n">tile</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">int</span> <span class="n">n_tile_data_alloc</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">int</span> <span class="n">n_tile_data</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">int</span> <span class="n">n_tiles</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="p">}</span>
</span></span></code></pre></div><figcaption>Figure 1a: Excerpt from the C root context structure</figcaption>
</figure>
<p>Rust requires that all data shared between threads be <code>Sync</code>, which means that the data must be safe to access concurrently from multiple threads. We want to share a borrowed root context between all threads, so all data in that context must be immutable. To allow mutation of shared data, we must introduce locks to ensure thread safety is maintained at runtime. We added <code>Mutex</code> and <code>RwLock</code> as needed to allow interior mutation. If we assume that the original C code does not have data races (we did not observe any in <code>dav1d</code>), these new locks should never be contended. We made heavy use of <code>Mutex::try_lock()</code> and <code>RwLock::try_read()</code> / <code>RwLock::try_write()</code> to validate at runtime that a thread could safely access data without possibly introducing delays waiting on a lock.</p>
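<p>As a minimal sketch (not actual <code>rav1d</code> code; the helper name is illustrative), the <code>try_lock</code> pattern looks like this: instead of blocking, we assert at runtime that a lock we believe to be uncontended really is uncontended.</p>

```rust
use std::sync::Mutex;

// Illustrative sketch: access state through `try_lock` so that any
// unexpected contention fails loudly instead of silently blocking.
fn with_state<T, R>(state: &Mutex<T>, f: impl FnOnce(&mut T) -> R) -> R {
    let mut guard = state
        .try_lock()
        .expect("state lock should never be contended");
    f(&mut guard)
}
```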
<figure>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-C++" data-lang="C++"><span class="line"><span class="cl"><span class="n">pub</span> <span class="k">struct</span> <span class="nc">Rav1dContext</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">pub</span><span class="p">(</span><span class="n">crate</span><span class="p">)</span> <span class="nl">state</span><span class="p">:</span> <span class="n">Mutex</span><span class="o">&lt;</span><span class="n">Rav1dState</span><span class="o">&gt;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">pub</span><span class="p">(</span><span class="n">crate</span><span class="p">)</span> <span class="nl">fc</span><span class="p">:</span> <span class="n">Box</span><span class="o">&lt;</span><span class="p">[</span><span class="n">Rav1dFrameContext</span><span class="p">]</span><span class="o">&gt;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">pub</span><span class="p">(</span><span class="n">crate</span><span class="p">)</span> <span class="nl">tc</span><span class="p">:</span> <span class="n">Box</span><span class="o">&lt;</span><span class="p">[</span><span class="n">Rav1dContextTaskThread</span><span class="p">]</span><span class="o">&gt;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">pub</span> <span class="k">struct</span> <span class="nc">Rav1dState</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">pub</span><span class="p">(</span><span class="n">crate</span><span class="p">)</span> <span class="nl">tiles</span><span class="p">:</span> <span class="n">Vec</span><span class="o">&lt;</span><span class="n">Rav1dTileGroup</span><span class="o">&gt;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">pub</span><span class="p">(</span><span class="n">crate</span><span class="p">)</span> <span class="nl">n_tiles</span><span class="p">:</span> <span class="n">c_int</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="p">}</span>
</span></span></code></pre></div><figcaption> Figure 1b: Corresponding context excerpt from <code>rav1d</code></figcaption>
</figure>
<p>As shown in Figure 1b, we had to reorganize the <code>dav1d</code> structures to better fit into the Rust thread safety model. We refactored mutable state into a new <code>Rav1dState</code> structure and wrapped this in a mutex. It's also worth noting that <code>tc</code> no longer contains thread-local data for all threads but instead only the thread handle and <code>Sync</code> metadata for thread coordination. All thread-local data from <code>Dav1dTaskContext</code> is now managed by each worker thread independently so it does not need to be <code>Sync</code>.</p>
<p>Adding extra locks handles the case where only a single thread needs to mutate a particular field or structure. <code>dav1d</code>, in many cases, relies on concurrent but non-overlapping access to a single buffer. One thread must read or write from a range of the buffer while another thread accesses a different, disjoint range of the same buffer. This pattern, while free of data races in practice, does not map cleanly into safe Rust idioms. In safe Rust, one would generally first partition a buffer into disjoint slices then distribute these disjoint slices to different threads for processing. That pattern requires knowing the precise partitioning of each data buffer ahead of time in order to properly distribute these slices to task threads. In the case of AV1, this buffer partitioning would be extremely complicated as the partitioning is not static or even contiguous. Crates exist for storing N-dimensional arrays to allow for partitioning and chunking these buffers, such as <a href="https://crates.io/crates/ndarray">ndarray</a>, but we would need to understand the precise access patterns of all tasks for all buffers in order to properly partition these buffers. This would have required a fundamental re-architecting of the <code>rav1d</code> task scheduling.</p>
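<p>For comparison, the safe-Rust partitioning idiom described above can be sketched as follows (illustrative code, not from <code>rav1d</code>): the buffer is split into disjoint mutable slices up front, and each worker receives its own slice. This is exactly the pattern that AV1's dynamic, non-contiguous partitioning made impractical.</p>

```rust
// Split a buffer into two disjoint &mut slices and process them on
// separate scoped threads; the borrow checker accepts this because
// `split_at_mut` proves the slices cannot overlap.
fn process_in_halves(buf: &mut [u8]) {
    let mid = buf.len() / 2;
    let (left, right) = buf.split_at_mut(mid);
    std::thread::scope(|s| {
        s.spawn(move || left.fill(1));
        s.spawn(move || right.fill(2));
    });
}
```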
<p>Instead, we implemented another approach that more closely fits the model used in <code>dav1d</code>. We created a buffer wrapper type, <code>DisjointMut</code>, that allows disjoint, concurrent mutable access to a buffer. In debug builds, we track each borrowed range to ensure that each mutable borrow has exclusive access to its range of elements. We found this tracking incredibly useful for debugging and for ensuring that threads did not borrow data in a way that overlapped with other threads. However, tracking each borrow is too expensive for release builds, so in release builds the entire <code>DisjointMut</code> structure is a zero-cost wrapper over the underlying buffer. Access into the buffer is still bounds checked, so we preserve spatial safety while potentially compromising thread safety. All <code>DisjointMut</code> buffers in <code>rav1d</code> contain primitive data, so at worst this pattern could introduce nondeterminism if access is not correctly disjoint. Figure 2 shows an excerpt from a structure that is shared by all worker threads. Multiple threads concurrently mutate different blocks in the <code>b</code> field, so we wrapped this vector in <code>DisjointMut</code> to allow concurrent access.</p>
<figure>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-C++" data-lang="C++"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Dav1dFrameContext_frame_thread</span>  <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="c1"></span>
</span></span><span class="line"><span class="cl">    <span class="c1">// indexed using t-&gt;by * f-&gt;b4_stride + t-&gt;bx
</span></span></span><span class="line"><span class="cl"><span class="c1"></span>    <span class="n">Av1Block</span> <span class="o">*</span><span class="n">b</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">int16_t</span> <span class="o">*</span><span class="n">cbi</span><span class="p">;</span> <span class="cm">/* bits 0-4: txtp, bits 5-15: eob */</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="p">}</span>  <span class="n">frame_thread</span><span class="p">;</span>
</span></span></code></pre></div><figcaption> Figure 2a: Excerpt from the C frame context structure </figcaption>
</figure>
<figure>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-C++" data-lang="C++"><span class="line"><span class="cl"><span class="n">pub</span> <span class="k">struct</span> <span class="nc">Rav1dFrameContextFrameThread</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="c1"></span>
</span></span><span class="line"><span class="cl">    <span class="c1">// Indexed using `t.b.y * f.b4_stride + t.b.x`.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span>    <span class="n">pub</span> <span class="nl">b</span><span class="p">:</span> <span class="n">DisjointMut</span><span class="o">&lt;</span><span class="n">Vec</span><span class="o">&lt;</span><span class="n">Av1Block</span><span class="o">&gt;&gt;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">pub</span> <span class="nl">cbi</span><span class="p">:</span> <span class="n">Vec</span><span class="o">&lt;</span><span class="n">RelaxedAtomic</span><span class="o">&lt;</span><span class="n">CodedBlockInfo</span><span class="o">&gt;&gt;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="p">}</span>
</span></span></code></pre></div><figcaption> Figure 2b: <code>rav1d</code> Rust equivalent to Figure 2a </figcaption>
</figure>
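<p>To make the idea concrete, here is a heavily simplified, hypothetical sketch of a <code>DisjointMut</code>-style wrapper (the real <code>rav1d</code> type releases borrows, supports slices and other element accesses, and compiles the tracking out of release builds). Debug builds record each mutably borrowed range and panic on overlap; callers are trusted to keep their ranges disjoint.</p>

```rust
use std::cell::UnsafeCell;
use std::ops::Range;
use std::sync::Mutex;

// Simplified sketch: callers promise their mutable ranges never overlap;
// debug builds verify that promise at runtime. Borrows are never released
// here, which a real implementation must of course handle.
pub struct DisjointBuf<T> {
    data: UnsafeCell<Vec<T>>,
    #[cfg(debug_assertions)]
    borrows: Mutex<Vec<Range<usize>>>,
}

// SAFETY: concurrent access is only sound while callers keep their
// ranges disjoint, which debug builds check below.
unsafe impl<T: Send> Sync for DisjointBuf<T> {}

impl<T> DisjointBuf<T> {
    pub fn new(data: Vec<T>) -> Self {
        Self {
            data: UnsafeCell::new(data),
            #[cfg(debug_assertions)]
            borrows: Mutex::new(Vec::new()),
        }
    }

    /// Mutably borrow `range`; debug builds panic on an overlapping borrow.
    pub fn index_mut(&self, range: Range<usize>) -> &mut [T] {
        #[cfg(debug_assertions)]
        {
            let mut borrows = self.borrows.lock().unwrap();
            assert!(
                borrows
                    .iter()
                    .all(|r| r.end <= range.start || range.end <= r.start),
                "overlapping mutable borrow"
            );
            borrows.push(range.clone());
        }
        // Still bounds checked in release builds, preserving spatial safety.
        let data = unsafe { &mut *self.data.get() };
        &mut data[range]
    }
}
```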
<p>Where possible, we used atomic types instead of adding locking. We are relying on the code already avoiding logical data races, and atomic primitive types provide formal thread safety. We did not require any particular atomic memory ordering because we are assuming that writes to shared fields are not racy, so we used relaxed ordering. On the platforms we are targeting, naturally aligned loads and stores are already atomic, so relaxed ordering atomic operations in Rust lower to the same memory operations as in C with no additional overhead<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>. We could not use non-relaxed atomics or fetch+update methods, as these operations lower to complex, slower instructions. We added a <code>RelaxedAtomic</code> wrapper to simplify usage of these atomic fields and ensure that we did not use inefficient patterns. We also used the <a href="https://docs.rs/atomig">atomig</a> crate to make simple primitive sized structures and enums atomic.</p>
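<p>A minimal sketch of a <code>RelaxedAtomic</code>-style wrapper for one primitive type (hypothetical; the real wrapper is generic over atomizable types via <code>atomig</code>) looks like this: it only exposes relaxed loads and stores, so stronger orderings and fetch-and-update operations cannot creep in by accident.</p>

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Every access uses Ordering::Relaxed, which lowers to a plain aligned
// load or store on the platforms discussed above.
pub struct RelaxedU32(AtomicU32);

impl RelaxedU32 {
    pub fn new(v: u32) -> Self {
        Self(AtomicU32::new(v))
    }
    pub fn get(&self) -> u32 {
        self.0.load(Ordering::Relaxed)
    }
    pub fn set(&self, v: u32) {
        self.0.store(v, Ordering::Relaxed)
    }
}
```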
<p>Overall, we found the Rust thread safety model to be overly strict for this codebase. Were we to write this decoder from scratch, we would have designed clearer, more disjoint data sharing between threads. However, by using new data structures such as <code>DisjointMut</code> and <code>RelaxedAtomic</code>, which preserve the memory safety guarantees we want while relaxing data race enforcement, we were able to largely preserve performance without drastic changes to the existing logic.</p>
<h3 id="self-referential-structures">Self-Referential Structures</h3>
<p>Pointers into the same structure, or recursively between structures, are a common pattern in C that is not easily reproducible in safe Rust. The challenging pointers we encountered in porting <code>dav1d</code> largely fit into one of two categories: cursors tracking buffer positions, and links between context structures.</p>
<p>We generally refactored buffer cursor pointers into integer indices. However, this was not always straightforward -- some buffer pointers could temporarily go out of bounds before the beginning of the buffer because a positive offset would later be added or the pointer would not be dereferenced at all. We refactored these cases to ensure that offsets stayed non-negative by moving index calculations. Even for simpler cases, changing pointers to indices required that we carefully track and document which buffer each index was referencing and ensure that every use of the index had access to the corresponding buffer.</p>
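<p>As a small illustration of the pointer-to-index refactor (names are hypothetical, not taken from <code>rav1d</code>): a C cursor such as <code>uint8_t *pos</code> that advances through a buffer becomes a <code>usize</code> index, and every use of that index must be paired with a reference to the buffer it indexes into.</p>

```rust
// An index-based cursor: the index can never point "before" the buffer,
// and each access is bounds checked against the buffer it belongs to.
struct Cursor {
    pos: usize,
}

fn read_byte(buf: &[u8], cur: &mut Cursor) -> Option<u8> {
    let b = buf.get(cur.pos).copied()?;
    cur.pos += 1;
    Some(b)
}
```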
<p>We had to disentangle the <code>dav1d</code> context structures by removing pointers from child structures to their containers and then passing additional structure references as function parameters instead. For example, we added <code>Rav1dContext</code> and <code>Rav1dFrameData</code> reference parameters to <code>decode_tile_sbrow</code>, because we had to remove these pointers from the task context structure.</p>
<figure>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-C++" data-lang="C++"><span class="line"><span class="cl"><span class="c1">// Original C function
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="kt">int</span> <span class="nf">dav1d_decode_tile_sbrow</span><span class="p">(</span><span class="n">Dav1dTaskContext</span> <span class="o">*</span><span class="k">const</span> <span class="n">t</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="k">const</span> <span class="n">Dav1dFrameContext</span> <span class="o">*</span><span class="k">const</span> <span class="n">f</span> <span class="o">=</span> <span class="n">t</span><span class="o">-&gt;</span><span class="n">f</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">  <span class="n">Dav1dTileState</span> <span class="o">*</span><span class="k">const</span> <span class="n">ts</span> <span class="o">=</span> <span class="n">t</span><span class="o">-&gt;</span><span class="n">ts</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">  <span class="k">const</span> <span class="n">Dav1dContext</span> <span class="o">*</span><span class="k">const</span> <span class="n">c</span> <span class="o">=</span> <span class="n">f</span><span class="o">-&gt;</span><span class="n">c</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">  <span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="p">}</span>
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-Rust" data-lang="Rust"><span class="line"><span class="cl"><span class="c1">// Safe Rust version
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="k">pub</span><span class="p">(</span><span class="k">crate</span><span class="p">)</span><span class="w"> </span><span class="k">fn</span> <span class="nf">rav1d_decode_tile_sbrow</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">c</span>: <span class="kp">&amp;</span><span class="nc">Rav1dContext</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">t</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">Rav1dTaskContext</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">f</span>: <span class="kp">&amp;</span><span class="nc">Rav1dFrameData</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="p">(),</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">ts</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">f</span><span class="p">.</span><span class="n">ts</span><span class="p">[</span><span class="n">t</span><span class="p">.</span><span class="n">ts</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><figcaption>Figure 3: Separating context structure</figcaption>
</figure>
<h3 id="unions">Unions</h3>
<p><code>dav1d</code> makes some use of untagged C unions. In cases where an additional field was used as a tag, we rewrote these unions into safe tagged Rust enums. The discriminant for some unions, however, was implicit in C. For example, the stage of a task, stored in an entirely different context structure, would determine which union variant should be used. For these cases, rather than add a redundant tag and change the structure representation and size, we opted to use the <a href="https://docs.rs/zerocopy"><code>zerocopy</code></a> crate to reinterpret the same bytes as two different types at runtime. This was only an option because these unions consisted entirely of primitive types without padding. The <code>zerocopy</code> traits enforce this invariant and allow zero-cost access to the union contents, without requiring an explicit tag. Though this pattern is less idiomatic, we found it was necessary in a few cases for performance and compatibility.</p>
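<p>The first rewrite described above, for unions with an explicit tag field, can be sketched like this (illustrative types only; the <code>zerocopy</code> path for implicitly tagged unions is not shown):</p>

```rust
// C:  struct { int intra; union { uint8_t mode; int16_t mv[2]; }; }
// becomes an enum whose discriminant replaces the `intra` tag field.
enum BlockData {
    Intra { mode: u8 },
    Inter { mv: [i16; 2] },
}

fn describe(b: &BlockData) -> &'static str {
    match b {
        BlockData::Intra { .. } => "intra",
        BlockData::Inter { .. } => "inter",
    }
}
```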
<h2 id="conclusions">Conclusions</h2>
<p>Was transpiling and rewriting worthwhile? We believe so, at least for the <code>rav1d</code> project. Rewriting an AV1 video decoder from scratch would have introduced all sorts of new bugs and compatibility issues. We found that, despite the threading and borrowing challenges, rewriting existing C code into safe, performant Rust was possible. Our <code>rav1d</code> implementation is currently about <a href="https://github.com/memorysafety/rav1d/issues/1294">6% slower</a> than the current <code>dav1d</code> C implementation. We will go into more detail on the process of optimizing <code>rav1d</code> performance in an upcoming blog post. For applications where safety is paramount, <code>rav1d</code> offers a memory safe implementation without additional overhead from mitigations such as sandboxing. We believe that with continued optimization and improvements, the Rust implementation can compete favorably with a C implementation in all situations, while also providing memory safety.</p>
<p><em>We'd like to thank Amazon Web Services, Sovereign Tech Fund, and Alpha-Omega for supporting the development of this work. If you want to learn more about <code>rav1d</code> or start using it, check it out on <a href="https://github.com/memorysafety/rav1d/">GitHub</a>.</em></p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>The only overhead is from not being able to combine, for example, two consecutive, aligned <code>AtomicU8</code> loads into a single 16-bit load, which the compiler would transparently do for plain <code>u8</code>s and <code>u16</code>s. For individual fields accessed separately, this is not a problem. It does, however, introduce more overhead when working with arrays and slices.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/porting-c-to-rust-for-av1/</guid>
      </item><item>
        <title>A new home for memory safe sudo/su</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/trifecta-tech-foundation-sudo-su/</link>
        <pubDate>Wed, 17 Jul 2024 00:00:00 +0000</pubDate>
        <description><![CDATA[<div class="card border-0 mb-4 pic-quote-right">
    <img alt="Trifecta Tech Foundation Logo" class="mx-auto img-fluid" src="/images/blog/logo-trifecta.png" />
</div>
<p>Today we're pleased to announce that an open source <a href="https://www.memorysafety.org/initiative/sudo-su/">memory safe implementation of sudo/su</a> — <a href="https://github.com/trifectatechfoundation/sudo-rs">sudo-rs</a> — has a new long-term home at the <a href="https://trifectatech.org/">Trifecta Tech Foundation</a>.</p>
<p>ISRG's Prossimo project set out to develop a strategy, raise funds, and select a contractor for a memory safe sudo/su implementation in early 2022. We did this because sudo and su are critical utilities managing control of the user privilege boundary on most Linux systems. The original utilities are written in C and have a history of <a href="https://www.memorysafety.org/docs/memory-safety/">memory safety vulnerabilities</a>, a class of issues that critical system software should not suffer from.</p>
<p>During 2022 we made a <a href="https://www.memorysafety.org/blog/sudo-and-su/">plan</a> and selected a joint team from <a href="https://tweedegolf.nl/">Tweede golf</a> and <a href="https://ferrous-systems.com/">Ferrous Systems</a> as the contractors. Funding was generously provided by <a href="https://aws.amazon.com/">Amazon Web Services</a>. The first release was made in August 2023. A <a href="https://github.com/trifectatechfoundation/sudo-rs/blob/main/docs/audit/audit-report-sudo-rs.pdf">third party security audit</a> was completed in September of 2023.</p>
<p>There are software packages for <a href="https://packages.debian.org/trixie/sudo-rs">Debian,</a> <a href="https://packages.ubuntu.com/noble/sudo-rs">Ubuntu</a> and <a href="https://packages.fedoraproject.org/pkgs/sudo-rs/sudo-rs/">Fedora</a>. It's also available on <a href="https://crates.io/crates/sudo-rs">crates.io</a>.</p>
<p>We recently decided that Trifecta Tech Foundation would become the long-term maintainer of sudo-rs. It was founded by the team from Tweede golf, and since they worked on sudo-rs and we're big fans of their approach to open source, it was an easy decision to make on our end. </p>
<p>Trifecta Tech Foundation aims to provide stability to the sudo-rs project and support its maintainers. Their work will be supported by soliciting contracts and sponsorship for features and maintenance.</p>
<p>If you're using sudo (who isn't?) you can help make your systems and the Internet as a whole safer by becoming an adopter of sudo-rs and providing feedback. <a href="mailto:donate@trifectatech.org">Contact</a> Trifecta Tech Foundation if you're interested!</p>
<h3 id="support-our-work">Support Our Work</h3>
<p>ISRG is a 501(c)(3) nonprofit organization that is 100% supported through the generosity of those who share our vision for ubiquitous, open Internet security. If you'd like to support our work, please consider <a href="https://www.abetterinternet.org/getinvolved/">getting involved</a>, <a href="https://www.abetterinternet.org/donate/">donating</a>, or encouraging your company to <a href="https://www.abetterinternet.org/sponsor/">become a sponsor</a>.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/trifecta-tech-foundation-sudo-su/</guid>
      </item><item>
        <title>More Memory Safety for Let’s Encrypt: Deploying ntpd-rs</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/ntpd-rs-deployment/</link>
        <pubDate>Mon, 24 Jun 2024 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>When we look at the general security posture of Let's Encrypt, one of the things that worries us most is how much of the operating system and network infrastructure is written in unsafe languages like C and C++. The <a href="https://github.com/letsencrypt/boulder">CA software</a> itself is written in memory safe Golang, but from our server operating systems to our network equipment, lack of memory safety routinely leads to vulnerabilities that need patching.</p>
<p>Partially for the sake of Let's Encrypt, and partially for the sake of the wider Internet, we started a new project called <a href="https://www.memorysafety.org/">Prossimo</a> in 2020. Prossimo's goal is to make some of the most critical software infrastructure for the Internet memory safe. Since then we've invested in a range of software components including the <a href="https://github.com/rustls/rustls/">Rustls TLS library</a>, <a href="https://github.com/hickory-dns/hickory-dns">Hickory DNS</a>, <a href="https://github.com/memorysafety/river">River reverse proxy</a>, <a href="https://github.com/memorysafety/sudo-rs">sudo-rs</a>, <a href="https://rust-for-linux.com/">Rust support for the Linux kernel</a>, and <a href="https://github.com/pendulum-project/ntpd-rs">ntpd-rs</a>.</p>
<p>Let's Encrypt has now taken a step that was a long time in the making: we've deployed <a href="https://github.com/pendulum-project/ntpd-rs">ntpd-rs</a>, the first piece of memory safe software from Prossimo that has made it into the Let's Encrypt infrastructure.</p>
<p>Most operating systems use the Network Time Protocol (NTP) to accurately determine what time it is. Keeping track of time is a critical task for an operating system, and since it involves interacting with the Internet it's important to make sure NTP implementations are secure.</p>
<p>In April of 2022, Prossimo started work on a memory safe and generally more secure <a href="https://en.wikipedia.org/wiki/Network_Time_Protocol">NTP</a> implementation called <a href="https://github.com/pendulum-project/ntpd-rs">ntpd-rs</a>. Since then, the implementation has matured and is now maintained by <a href="https://github.com/pendulum-project">Project Pendulum</a>. In April of 2024 ntpd-rs was deployed to the Let's Encrypt staging environment, and as of now it's in production.</p>
<p>Over the next few years we plan to continue replacing C or C++ software with memory safe alternatives in the Let's Encrypt infrastructure: OpenSSL and its derivatives with <a href="https://www.memorysafety.org/initiative/rustls/">Rustls</a>, our DNS software with <a href="https://www.memorysafety.org/initiative/dns/">Hickory</a>, Nginx with <a href="https://www.memorysafety.org/initiative/reverse-proxy/">River</a>, and sudo with <a href="https://www.memorysafety.org/initiative/sudo-su/">sudo-rs</a>. Memory safety is just part of the overall security equation, but <a href="https://www.whitehouse.gov/oncd/briefing-room/2024/02/26/press-release-technical-report/">it's an important part</a> and we're glad to be able to make these improvements.</p>
<p>We depend on contributions from our community of users and supporters in order to provide our services. If your company or organization would like to <a href="https://www.abetterinternet.org/sponsor/">sponsor</a> Let's Encrypt please email us at <a href="mailto:sponsor@letsencrypt.org">sponsor@letsencrypt.org</a>. We ask that you make an <a href="https://letsencrypt.org/donate/">individual contribution</a> if it is within your means.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/ntpd-rs-deployment/</guid>
      </item><item>
        <title>Encrypted Client Hello (ECH) Support for Rustls</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/rustls-ech-support/</link>
        <pubDate>Thu, 13 Jun 2024 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>We're pleased to announce that the <a href="https://github.com/rustls/rustls/">Rustls</a> TLS library now has <a href="https://github.com/rustls/rustls/pull/1718">experimental support</a> for client-side <a href="https://datatracker.ietf.org/doc/draft-ietf-tls-esni/">Encrypted Client Hello (ECH)</a>.</p>
<p>When client software wants to connect to a server it uses an IP address typically obtained via DNS. However, it's quite common for a server at a single IP address to host content from multiple different domain names. To make sure a server knows which domain name the client wants to access it will specify the domain name in the TLS connection request using the <a href="https://en.wikipedia.org/wiki/Server_Name_Indication">SNI extension</a>. This happens during the early Client Hello stage of setting up a TLS connection, and the domain name is not encrypted by default. ECH is a <a href="https://datatracker.ietf.org/doc/draft-ietf-tls-esni/">proposed Internet standard</a> created in order to encrypt the domain. Without ECH, anyone who can see the network traffic can see which website the connection is intended for. With ECH, the domain name is encrypted, resulting in greater privacy when connecting to hosts serving many domains.</p>
<p>ECH is part of a trifecta of technologies that helps keep connections private: DNS over HTTPS (DoH) for protecting DNS requests, ECH for protecting the connection destination, and TLS for protecting content. You can read more about ECH in this <a href="https://blog.cloudflare.com/announcing-encrypted-client-hello">excellent blog post</a> from Cloudflare.</p>
<p>Support will be experimental for at least a few months as we get feedback, and we hope to add server-side support by the end of the year.</p>
<p>We'd like to thank <a href="https://www.sovereigntechfund.de/">Sovereign Tech Fund</a> and <a href="https://alpha-omega.dev/">Alpha-Omega</a> for funding this work.<br>
With the advancements made to Rustls over the last few years (including a <a href="https://www.memorysafety.org/blog/rustls-with-aws-crypto-back-end-and-fips/">FIPS-supported cryptography library</a>, <a href="https://www.memorysafety.org/blog/pq-key-exchange/">post-quantum key exchange support</a> and <a href="https://www.memorysafety.org/blog/rustls-performance/">robust benchmarking</a>), we now see it as a viable, performant, and memory safe alternative to OpenSSL. We're pleased to see its adoption picking up. If your organization is interested in exploring the use of Rustls, reach out and let us know! We'll continue making our <a href="https://github.com/rustls/rustls/blob/main/ROADMAP.md">planned improvements</a> to Rustls but would love adopter feedback.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/rustls-ech-support/</guid>
      </item><item>
        <title>Providing official Fedora Linux RPM packages for ntpd-rs and sudo-rs</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/fedora-linux-for-ntp-and-sudo/</link>
        <pubDate>Thu, 09 May 2024 00:00:00 +0000</pubDate>
        <description><![CDATA[<div>
  <blockquote class="blockquote">
    <span class="quote"></span>
    <div class="quote-text">
      <p class="font-italic lh-170">Fabio Valentini is a longtime maintainer of many RPM packages for Fedora Linux. He recently helped us get sudo-rs and ntpd-rs packaged in Fedora Linux so we asked him to share his thoughts on the process.</p>
      <footer class="blockquote-footer"><cite title="Source Title">Josh Aas, Head of ISRG's Prossimo project</cite></footer>
    </div>
  </blockquote>
</div>
<p>Fedora Linux aims to provide high-quality RPM packages for Rust applications, which makes it a great fit for security critical applications like ntpd-rs and sudo-rs.</p>
<h2 id="the-fedora-approach-to-rust-packaging">The Fedora approach to Rust packaging</h2>
<p><a href="https://docs.fedoraproject.org/en-US/packaging-guidelines/Rust/">Our approach</a> towards building and distributing Rust applications differs from most other distribution mechanisms in some key aspects. Most importantly, we try to avoid building packages for Rust applications with vendored dependencies, and instead attempt to provide packages for individual crates whenever possible. These packages for crates are then used for building application packages, and are shared between all applications that depend on them.</p>
<p>Since everything that is redistributed by Fedora Linux (either as source code or as compiled packages) needs to comply with both technical and legal requirements, sharing dependencies this way makes it easier to review and audit dependencies (i.e. thorough peer review when adding a package for a new crate, and some quick checks when pushing updates for existing crates). By comparison, auditing all vendored dependencies whenever an update for an application package is pending is usually not possible due to the time that would be required, and is often neglected as a result.</p>
<p>Additionally, this approach allows us to ship updates for security vulnerabilities in Rust crates quickly. Only the package for the affected library needs to be updated, and packages for applications that use this library can just be rebuilt against the &quot;fixed&quot; version without requiring any code changes. This allows us to push security updates for Rust applications to users reliably, whereas other distribution mechanisms either don't support pushing updates to users at all (like <code>cargo install</code>), or would require updating vendored dependencies and/or code changes individually for each affected application.</p>
<h2 id="packaging-ntpd-rs-and-sudo-rs">Packaging ntpd-rs and sudo-rs</h2>
<p>I was initially approached by Josh Aas from Prossimo in January 2023 because there was interest in providing official Fedora Linux RPM packages for <a href="https://www.memorysafety.org/initiative/ntp/">ntpd-rs</a>. At this point, our tools did not yet support building projects like ntpd-rs (i.e. projects that were organized as multiple crates within a &quot;cargo workspace&quot;). Over the following months, I worked on both updating our tools to support this use case and on packaging and getting packages for crate dependencies through peer review.</p>
<p>While I was able to finish the tooling support relatively quickly (I <a href="https://pagure.io/fedora-rust/rust2rpm/raw/8ca9320/f/NEWS">released</a> official support for building &quot;workspace&quot; projects in February 2023), getting packages for crate dependencies reviewed and/or updated to the versions that were needed by ntpd-rs took longer than expected. One of the final blockers was resolved with the 0.17.0 release of <code>ring</code> in October 2023 (and the accompanying release of <code>rustls</code>), as this version finally included support for all CPU architectures that are supported by Fedora Linux.</p>
<p>Additionally, the effort to reduce the number of dependencies of ntpd-rs in the versions approaching the 1.0.0 release helped as well - even though it rendered obsolete some work that had already been done to package the since-dropped dependencies (e.g. the <code>axum</code> web framework) for Fedora (one of the dangers of working towards a moving target, I suppose).</p>
<p>My package <a href="https://bugzilla.redhat.com/show_bug.cgi?id=2246730">review request for ntpd-rs</a> from October 2023 was finally approved at the end of the year, and I was able to push packages for ntpd-rs to the <a href="https://bodhi.fedoraproject.org/updates/?packages=ntpd-rs">official Fedora repositories</a> at the start of 2024. All current branches of Fedora Linux now provide up-to-date packages for ntpd-rs.</p>
<p>Finally, last month, Josh Aas reached out to me again, asking me about providing packages for sudo-rs for Fedora Linux as well. Since this project is much smaller and has very few dependencies (which were all already available), I was able to get the package <a href="https://bugzilla.redhat.com/show_bug.cgi?id=2264457">through review</a> and publish <a href="https://bodhi.fedoraproject.org/updates/?packages=sudo-rs">official Fedora packages</a> for sudo-rs within 10 days.</p>
<p>Due to the status of <code>sudo</code> as a non-removable package on Fedora Linux, sudo-rs currently cannot provide the <code>sudo</code> executable, but it should be possible to make the necessary changes to allow the packages to be truly interchangeable in the future.</p>
<h2 id="conclusion">Conclusion</h2>
<p>While our approach to packaging Rust applications for Fedora is sometimes difficult and time consuming compared to other distribution mechanisms, I think the unique benefits (especially the possibility of reliably pushing security updates to users and technical / legal review of crate dependencies) currently still outweigh the cost. I'm confident that we can continue providing high-quality, up-to-date packages for ntpd-rs and sudo-rs -- and Rust applications in general -- for our users.</p>
<p><em>Prossimo is able to take on the challenging work of rewriting critical components of the Internet thanks to our community of funders from around the world. We'd like to thank the NLnet Foundation for their funding of the audit of sudo-rs. We'd also like to thank Cisco and Amazon Web Services for supporting this work and supporting the transition to memory safe software.</em></p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/fedora-linux-for-ntp-and-sudo/</guid>
      </item><item>
        <title>Rustls Gains OpenSSL and Nginx Compatibility</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/rustls-nginx-compatibility-layer/</link>
        <pubDate>Wed, 08 May 2024 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>The Rustls TLS library can now be used with Nginx via an <a href="https://github.com/rustls/rustls-openssl-compat">OpenSSL compatibility layer</a>. This means that Nginx users can switch from OpenSSL to Rustls with minimal effort - users can simply swap in a new TLS library without needing to modify or recompile Nginx.</p>
<p>We have targeted Nginx versions greater than 1.18 on Ubuntu 22.04 or newer for initial support. Here's how easy it is to get going on x86_64 Ubuntu Linux 22.04:</p>
<pre tabindex="0"><code>$ wget https://github.com/rustls/rustls-openssl-compat/releases/latest/download/rustls-libssl_amd64.deb
$ sudo dpkg -i rustls-libssl_amd64.deb
$ sudo rustls-libssl-nginx enable
$ sudo systemctl daemon-reload
$ sudo service nginx restart
</code></pre><p>After <a href="https://www.memorysafety.org/initiative/rustls/">investing heavily</a> in Rustls over the last few years, we now see it as a viable, performant, and memory safe alternative to OpenSSL. Recent releases have brought pluggable cryptography with <a href="https://www.memorysafety.org/blog/rustls-with-aws-crypto-back-end-and-fips/">FIPS support</a>, <a href="https://www.memorysafety.org/blog/rustls-performance/">performance optimizations</a>, <a href="https://www.memorysafety.org/blog/pq-key-exchange/">post-quantum key exchange</a>, and numerous other improvements. In the coming months, we will focus on improving performance in the few areas where Rustls doesn't already surpass OpenSSL and on adding support for <a href="https://datatracker.ietf.org/doc/html/rfc8879.html">RFC 8879</a> certificate compression. ISRG's <a href="https://letsencrypt.org/">Let's Encrypt</a> certificate authority will begin replacing OpenSSL with Rustls later this year.</p>
<p>The importance of memory safety has been expounded upon recently by a number of groups, including the <a href="https://www.whitehouse.gov/oncd/briefing-room/2024/02/26/press-release-technical-report/">White House Office of the National Cyber Director</a>. Anjana Rajan, Assistant National Cyber Director for Technology Security, The White House, adds: &quot;Moving the cyber ecosystem toward memory safe programming languages is not only good engineering practice, but imperative for our national security. Achieving this will require a pragmatic and methodical approach. Securing the building blocks of cyberspace is critical and there is no better place to start than with TLS.&quot;</p>
<p>Regarding Rustls, Ms. Rajan adds &quot;The White House, Office of the National Cyber Director, commends the Prossimo team for their outstanding work in building Rustls, a FIPS compliant memory safe TLS implementation. By prioritizing integration with Nginx, the Prossimo team is actively ensuring a good developer experience when pursuing stronger cybersecurity.&quot;</p>
<p>We're pleased to see Rustls adoption picking up. If your organization is interested in exploring the use of Rustls, reach out and let us know! We'll continue making our <a href="https://github.com/rustls/rustls/blob/main/ROADMAP.md">planned improvements</a> to Rustls but would love adopter feedback.</p>
<p>We'd like to thank Sovereign Tech Fund, Fly.io, Google, AWS, and Alpha-Omega for supporting the work to advance Rustls.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/rustls-nginx-compatibility-layer/</guid>
      </item><item>
        <title>A Readout from Tectonics</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/tectonics-readout/</link>
        <pubDate>Fri, 29 Mar 2024 00:00:00 +0000</pubDate>
        <description><![CDATA[<div class="card border-0 mb-3 pic-quote-right">
    <img alt="Tectonics Readout" class="mx-auto img-fluid" src="/images/blog/Tectonics-Readout-Social-Share.jpg" />
</div>
<p>In November of 2023, ISRG held an event in San Francisco called <a href="https://tectonics.memorysafety.org/">Tectonics</a>. Our goal was to discuss solutions for moving forward with memory safety for critical Internet infrastructure. We had a group of about 50 people at the center of the memory safety movement, from engineers to public policy and corporate decision makers. We could not have asked for a better group.</p>
<p>There were four tracks, plus substantial open time for all attendees to discuss amongst themselves. The tracks were:</p>
<ul>
<li>Facilitating adoption of memory safe code for Internet critical infrastructure</li>
<li>Memory safety roadmaps for organizations</li>
<li>Facilitating the inclusion of Rust in operating systems</li>
<li>Improving trust in Rust dependency trees</li>
</ul>
<p>In this post we'd like to communicate some of the takeaways from the group as a whole as well as from each track.</p>
<h2 id="making-connections">Making Connections</h2>
<p>Just from having spent the day together, we've seen connections and conversations continue between groups that previously weren't working together.</p>
<p>For example - folks from Tweede golf met folks from Immunant at the event. Tweede golf is now contributing to the <a href="https://github.com/memorysafety/rav1d">memory safe AV1 decoder</a> that Immunant is working on, and Immunant is contributing to the <a href="https://github.com/memorysafety/zlib-rs">memory safe zlib implementation</a> that Tweede golf is working on. We love to see it!</p>
<h2 id="general-memory-safe-language-adoption-issues">General Memory Safe Language Adoption Issues</h2>
<p>Across all of our tracks there was quite a bit of discussion about general issues that prevent developers and organizations from moving to safer languages. The main issues identified were:</p>
<ul>
<li>Developer fondness for and commitment to C/C++, unwillingness to learn a new language.
<ul>
<li>Fear that the knowledge one has built up over many years is now obsolete.</li>
<li>Fear that any new language that isn't 20+ years old already is a fad, won't be around long enough to justify commitment.</li>
<li>Lack of understanding about just how unsafe C and C++ are, and associated belief that one knows how to write/ship safe C/C++.</li>
</ul>
</li>
<li>Need to invest in making operating system support for memory safe languages like Rust equivalent to or better than support for C/C++. Compilers need to be included, packaging systems need work, policies need updating.</li>
<li>Lack of resources to rewrite components when people and orgs are stressed with maintenance and other demands for their current C/C++ software.</li>
<li>Concern that new code will introduce an unacceptable number of new logic bugs while resolving memory safety issues.</li>
<li>Complications and security risks associated with languages that tend to produce programs with large numbers of dependencies.</li>
</ul>
<h2 id="facilitating-adoption-of-memory-safe-code-for-internet-critical-infrastructure">Facilitating Adoption of Memory Safe Code for Internet Critical Infrastructure</h2>
<p>This track covered a lot of ground trying to identify roadblocks and paths forward. The group examined dynamics within and between private companies, the open source community, philanthropy, and government.</p>
<p>Ease of use was a major topic, with the conversation frequently returning to improving the toolchains for memory safe languages and making them, and various domain specific frameworks, more readily accessible to developers.</p>
<p>There was also quite a bit of discussion about the need for more regular communication between people working on memory safety issues at various organizations. There was general agreement that de-siloing some of the problem solving would help move things along faster.</p>
<p>Policy making came up frequently, and memory safety was identified as an interesting policy problem because this is an engineering problem that we know how to solve. The people and resources are out there, we just need to bring it all together to move forward. It's likely that there are many policy levers worth pulling to help move things forward. A key driver behind this view was the observation that in many contexts, a migration to a memory safe language is entirely a question of whether or not the project is resourced.</p>
<p>The group also discussed whether there are more places where we can build collective commitments that we could seek funding from companies and governments for.</p>
<p>We'd like to thank Alex Gaynor and Paul Kehrer for leading this track.</p>
<h2 id="memory-safety-roadmaps-for-organizations">Memory Safety Roadmaps for Organizations</h2>
<p>Various people and groups have been considering the role that <a href="https://media.defense.gov/2023/Dec/06/2003352724/-1/-1/0/THE-CASE-FOR-MEMORY-SAFE-ROADMAPS-TLP-CLEAR.PDF">memory safe roadmaps</a> for organizations might have to play in moving things forward. The goal for this track was to spend time examining the potential in more depth.</p>
<p>There was general agreement that there isn't a single kind of roadmap likely to work across the entire spectrum of sizes and types of organizations. To get coverage across the organizations that matter, we'll probably have to pursue multiple strategies.</p>
<p>The three sources of influence in the space are regulation, market forces, and distribution channels. We're looking for roadmap solutions that help these sources of influence make good decisions and exert their influence in the right areas. In order to do this, we need ways to measure the safety of software and perhaps also the soundness of organizational policy and direction.</p>
<p>There are questions about how to get executive buy-in for producing roadmaps, and how to make sure there is organizational follow-through. On the subject of what it would take to get executive buy-in, there was discussion about what other benefits might be bundled with the security benefits. For example - Moore's law is over and parallelism is the way forward for performance. Memory safety <a href="https://medium.com/@adetaylor/fearless-concurrency-a-practical-win-ae59e613c7ab">really helps with this</a>.</p>
<p>When we think about the substance of a plan in a roadmap, there is broad agreement that we want organizations to commit to writing all new projects in a memory safe language, followed by a commitment to moving critical components of existing software (e.g. media decoders, TLS libraries) to memory safe software. What exactly we're looking for beyond that, and what's realistic, is not clear enough.</p>
<p>It was proposed that if a roadmap plan encounters too many challenges, we could pivot to some kind of external analysis for gaining insight into progress. One option for external analysis is something like SSL Labs but for memory safety. Pieces of software could be scored based on their memory safety, and organizations could be scored based on their software, policies, and practices.</p>
<p>We'd like to thank Eric Mill and Bob Lord for leading this track.</p>
<h2 id="facilitating-the-inclusion-of-rust-in-operating-systems">Facilitating the inclusion of Rust in operating systems</h2>
<p>Rust is a key tool for programs that need to be both high performance and memory safe. Strong support for Rust in operating systems can greatly improve security. This track explored the challenges and possibilities for Rust support in operating systems. The takeaways included:</p>
<ul>
<li>Interfacing between Rust and C++ is extremely difficult. Because of limitations imposed by the complexity of C++, Bindgen forces a C interface model.
<ul>
<li>Can interoperability with C++ be improved with deeper integration of clang++ in the Rust compiler?</li>
<li>Would it be possible to create a model in which C++ and Rust are subsumed into one model, with Rust bringing the memory management verification?</li>
</ul>
</li>
<li>Need to make a better plan for long-term support of older Rust compiler/toolchain versions. This is also a problem for LLVM to some extent.</li>
<li>Porting some parts of libc (e.g. DNS, malloc) to Rust would certainly be a boon for operating system security.</li>
<li>Static compilation presents challenges for updating dependencies because each package using a dependency must update instead of a single shared library. Operating systems will take some time to adjust to this.
<ul>
<li>Is there a future in which it's possible to ship dependencies as dynamically linked libraries?</li>
</ul>
</li>
<li>Integrating additional memory safe code in operating systems will involve more cross-language boundaries in binaries. We should try to minimize the possibility of memory management issues on these boundaries, including the ones described <a href="https://zhuohua.me/assets/ESORICS2022-FFIChecker.pdf">here</a>.</li>
<li>More a la carte separation of std and core (e.g. the stack unwinder) would be helpful, as would the ability to factor out platform-independent components in order to support additional platforms.</li>
</ul>
<p>We'd like to thank Arlie Davis and Siddarth Pandit for leading this track.</p>
<h2 id="improving-trust-in-rust-dependency-trees">Improving Trust in Rust Dependency Trees</h2>
<p>Rust makes it easy to include dependencies, but this has led to a tendency for Rust programs to <a href="https://tweedegolf.nl/en/blog/104/dealing-with-dependencies-in-rust">include many dependencies</a>. It's not uncommon to see 100+ dependencies even for modest programs. The problem is that this necessitates an extensive web of trust that is a serious security liability. We have seen similarly vulnerable supply chains in Node and Python lead to disaster [<a href="https://www.techtarget.com/searchsecurity/news/252525335/Malicious-NPM-package-discovered-in-supply-chain-attack?Offer=abMeterCharCount_ctrl">1</a>][<a href="https://www.mend.io/blog/npm-package-javascript-library-compromised-via-account-takeover/">2</a>].</p>
<p>Logistically, having many dependencies can create problems for operating system packagers trying to introduce Rust programs into contexts where having so many dependencies is rare (e.g., C).</p>
<p>While it is <a href="https://tweedegolf.nl/en/blog/119/sudo-rs-depencencies-when-less-is-better">possible to build applications with fewer dependencies</a>, the problem is endemic within the Rust ecosystem. It's not clear that the Rust project today is keen to address the problem, so Tectonics attendees have been discussing two possible options, which could be pursued in parallel:</p>
<ol>
<li>Building and promoting a more advanced version of <a href="https://blessed.rs">blessed.rs</a>, one that offers more assurances about the blessed packages. The idea is to get programs and libraries to use the same set of dependencies and take more steps to ensure that the blessed packages are well-maintained. We could also focus on reducing indirect dependencies for the blessed packages.</li>
<li>Building a set of libraries outside of the official Rust project, to be maintained by the community that builds it. This would offer the same guarantees and benefits that a standard library would: security update guarantees, trustworthy ownership, consistent and thorough testing, consistent naming and searchability, etc.</li>
</ol>
<p>Of these options, #1 would probably be an easier but less complete solution. We have already heard from multiple parties interested in option #2 who could bring significant resources to bear.</p>
<p>We'd like to thank Florian Gilcher and Dirkjan Ochtman for leading this track.</p>
<h2 id="conclusion">Conclusion</h2>
<p>This was the first event that ISRG has ever produced. We learned a lot, and we're pleased with the outcomes, including stronger relationships between various people and organizations. We may do it again if we feel the timing is right at some point in the future.</p>
<p>As we shared in the opening, this group of about 50 people at the center of the memory safety movement made Tectonics the success that it was. We're grateful for everyone listed below for their time and contributions:</p>
<div class="tectonics-readout-name-container">
    <div class="tectonics-readout-name">ALEX GAYNOR</div>
    <div class="tectonics-readout-name">ALEX REBERT</div>
    <div class="tectonics-readout-name">AMIT LEVY</div>
    <div class="tectonics-readout-name">ANDREW WHALLEY</div>
    <div class="tectonics-readout-name">ARLIE DAVIS</div>
    <div class="tectonics-readout-name">BOB LORD</div>
    <div class="tectonics-readout-name">CHRIS PALMER</div>
    <div class="tectonics-readout-name">CRAIG NEWMARK</div>
    <div class="tectonics-readout-name">DAVID WESTON</div>
    <div class="tectonics-readout-name">DAN FERNELIUS</div>
    <div class="tectonics-readout-name">DIRKJAN OCHTMAN</div>
    <div class="tectonics-readout-name">DOUG GREGOR</div>
    <div class="tectonics-readout-name">EDWARD WANG</div>
    <div class="tectonics-readout-name">FIONA KRAKENBUERGER</div>
    <div class="tectonics-readout-name">FOLKERT DE VRIES</div>
    <div class="tectonics-readout-name">GAIL FREDERICK</div>
    <div class="tectonics-readout-name">HUGO VAN DE POL</div>
    <div class="tectonics-readout-name">JEFF HODGES</div>
    <div class="tectonics-readout-name">JOEL MARCEY</div>
    <div class="tectonics-readout-name">JOSH AAS</div>
    <div class="tectonics-readout-name">KEES COOK</div>
    <div class="tectonics-readout-name">KEVIN RIGGLE</div>
    <div class="tectonics-readout-name">LUIS VILLA</div>
    <div class="tectonics-readout-name">MATTHEW RILEY</div>
    <div class="tectonics-readout-name">MICHAEL BRENNAN</div>
    <div class="tectonics-readout-name">PAUL KEHRER</div>
    <div class="tectonics-readout-name">PER LARSEN</div>
    <div class="tectonics-readout-name">POWEN SHIAH</div>
    <div class="tectonics-readout-name">RAMON DE C VALLE</div>
    <div class="tectonics-readout-name">SARAH GRAN</div>
    <div class="tectonics-readout-name">SHAI CASPIN</div>
    <div class="tectonics-readout-name">STEPHEN CRANE</div>
    <div class="tectonics-readout-name">STEPHEN LUDIN</div>
    <div class="tectonics-readout-name">STEW SCOTT</div>
    <div class="tectonics-readout-name">TYLER MCMULLEN</div>
    <div class="tectonics-readout-name">WALTER PEARCE</div>
    <div class="tectonics-readout-name">WINDOW SNYDER</div>
    <div class="tectonics-readout-name">YAEL GRAUER</div>
    <div class="tectonics-readout-name">YUCHEN WU</div>
</div>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/tectonics-readout/</guid>
      </item><item>
        <title>The Rustls TLS Library Adds Post-Quantum Key Exchange Support</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/pq-key-exchange/</link>
        <pubDate>Tue, 26 Mar 2024 00:12:00 +0000</pubDate>
        <description><![CDATA[<p>The <a href="https://github.com/rustls/rustls/">Rustls TLS library</a> has added experimental support for post-quantum key exchange (specifically, a <a href="https://datatracker.ietf.org/doc/draft-tls-westerbaan-xyber768d00/">Kyber/X25519 hybrid scheme</a>). This feature prevents a post-quantum adversary from discovering the encryption keys to be used in a TLS connection. While no post-quantum adversary is known to exist today, it's important that we prepare for the eventuality now. This change is just the latest in a flurry of progress on Rustls, including the recent addition of a <a href="https://www.memorysafety.org/blog/rustls-with-aws-crypto-back-end-and-fips/">FIPS-supported cryptography library </a>and the design and utilization of a robust <a href="https://www.memorysafety.org/blog/rustls-performance/">performance benchmarking system</a>.</p>
<p>One of the first tasks in setting up a TLS connection is figuring out which encryption keys will be used. That process is called key exchange. Most modern TLS connections use an algorithm called X25519 to exchange keys today, but that algorithm is not safe in a post-quantum context. In order to secure key exchanges for a post-quantum world, Rustls is using a hybrid approach combining the well-tested X25519 algorithm with a post-quantum <a href="https://en.wikipedia.org/wiki/Kyber">Kyber</a> key encapsulation mechanism (KEM).</p>
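<p>To illustrate the general hybrid idea (this is a toy sketch, not Rustls's actual implementation - in the TLS draft the two shared secrets are combined directly in the key schedule), the classical and post-quantum shared secrets can be mixed so that the result remains unpredictable as long as either component resists attack. The hash-based combiner and placeholder random secrets below are illustrative assumptions:</p>

```python
import hashlib
import os

def hybrid_shared_secret(classical_secret: bytes, pq_secret: bytes) -> bytes:
    """Toy combiner: concatenate both shared secrets, then hash.

    An attacker must know BOTH inputs to predict the output, so breaking
    only X25519 (e.g. with a quantum computer) or only the KEM is not enough.
    """
    return hashlib.sha256(classical_secret + pq_secret).digest()

# Placeholder secrets standing in for real X25519 / Kyber KEM outputs.
x25519_secret = os.urandom(32)  # classical ECDH shared secret
kyber_secret = os.urandom(32)   # post-quantum KEM shared secret

key = hybrid_shared_secret(x25519_secret, kyber_secret)
```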
<p>Support for post-quantum key exchange is currently considered to be experimental and is implemented as <a href="https://crates.io/crates/rustls-post-quantum">a separate crate</a>. As the specification standardizes and the code stabilizes, we will eventually make it part of the standard Rustls crate as a stable feature.</p>
<p>We recently learned that nearly two percent of all TLS 1.3 connections established with Cloudflare are secured with post-quantum technology. In February of this year, Apple announced that iMessage will be secured with post-quantum algorithms before the end of the year. Signal has already deployed post-quantum protection. We love to see the world proactively prepare itself for upcoming threats on a responsible timeline rather than rushing to react, and we're excited that Rustls is now on this path as well.</p>
<p>There are a few things that need to happen to prepare the Internet for a post-quantum world. Protecting TLS key exchange is just one of them. You can learn more about the state of post-quantum defense on the Internet in <a href="https://blog.cloudflare.com/pq-2024/">this excellent blog post from Cloudflare</a>.</p>
<p>With the advancements made to Rustls over the last few years, we now see it as a viable, performant, and memory safe alternative to OpenSSL. We're pleased to see its adoption picking up. If your organization is interested in exploring the use of Rustls, reach out and let us know! We'll continue making our <a href="https://github.com/rustls/rustls/blob/main/ROADMAP.md">planned improvements</a> to Rustls but would love adopter feedback.</p>
<p>We'd like to thank Sovereign Tech Fund, Fly.io, Google, AWS, and Alpha-Omega for supporting the work to advance Rustls.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/pq-key-exchange/</guid>
      </item><item>
        <title>White House, Craig Newmark Support Memory Safe Software</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/growing-support/</link>
        <pubDate>Tue, 12 Mar 2024 00:12:00 +0000</pubDate>
        <description><![CDATA[<div class="card border-0 pic-quote-right mw-240px mx-auto mb-4">
  <img alt="Logos for Craig Newmark Philanthropies and the White House" class="rounded mx-auto img-fluid" src="/images/blog/blog-white-house-craig-newmark-philanthropies.png" />
</div>
<p>Initial signs point to 2024 being a big year for memory safety and we aim to continue Prossimo's work to accelerate the momentum.</p>
<p>Last month, the White House's Office of the National Cyber Director (ONCD) issued a <a href="https://www.whitehouse.gov/oncd/briefing-room/2024/02/26/memory-safety-fact-sheet/">report</a> that strongly endorses the use of memory safe languages. We've been formally working on improving memory safety for critical Internet infrastructure for years now and are proud to be the only 501c3 nonprofit referenced in this report. The report highlights a few points that are well-aligned with Prossimo's outlook:</p>
<ol>
<li>
<p>Now is the time to make memory safe choices since it effectively solves an avoidable problem,</p>
</li>
<li>
<p>There is clear evidence that switching to memory safe languages has a positive impact on digital security, and</p>
</li>
<li>
<p>Everything everywhere doesn't need to be re-written; instead take a tactical approach that prioritizes security-sensitive functions.</p>
</li>
</ol>
<p>The positive industry response to the report is encouraging as well. &quot;Memory safety vulnerabilities pose a significant security risk to software systems and are a root cause of many of the most damaging cyberattacks. To address this, we need to adopt memory safe programming languages for new applications and rewrite code using modern memory safe languages with secure development practices from the start. We're pleased to see the ONCD raise this issue because the integrity of the global software supply chain is critical for national and international security,&quot; said John Delmare, Global Cloud and Security Application Lead, Accenture.</p>
<p>We also received a vote of confidence from one of cybersecurity's most influential philanthropists: Craig Newmark. <a href="https://craignewmarkphilanthropies.org/">Craig Newmark Philanthropies</a> renewed a grant for $100,000 to support Prossimo's efforts toward better memory safety in critical open source software. Since its founding, 100% of Prossimo's funding has come from contributions, and support from industry leaders like Craig Newmark continues to sustain our momentum across a wide range of initiatives:</p>
<p><strong>Sudo/su</strong>: A trimmed down, <a href="https://www.memorysafety.org/initiative/sudo-su/">memory safe version of Sudo/su</a> is ready for use in <a href="https://bodhi.fedoraproject.org/updates/?search=sudo-rs-0.2.2">Fedora</a> and <a href="https://packages.debian.org/sid/sudo-rs#:~:text=sudo%2Drs%20is%20a%20safety,vulnerabilities%20related%20to%20memory%20management.">Debian</a>.</p>
<p><strong>Rustls</strong>: This <a href="https://www.memorysafety.org/initiative/rustls/">memory safe TLS library</a> has a strong culture and practice of benchmarking for improved performance and initial indicators show it will surpass OpenSSL on a variety of metrics this year. In addition, Rustls <a href="https://www.memorysafety.org/blog/rustls-with-aws-crypto-back-end-and-fips/">now</a> has a FIPS-certified cryptography library and <a href="https://github.com/rustls/rustls/blob/main/ROADMAP.md">will soon land</a> an OpenSSL compatibility layer, making the transition from OpenSSL seamless. The world has needed a better TLS library for a long time, and 2024 will be the year for Rustls to step up.</p>
<p><strong>Reverse Proxy</strong>: Nearly every big deployment on the Internet uses a reverse proxy, and that software <a href="https://www.memorysafety.org/initiative/reverse-proxy/">needs to be memory safe</a>. We are <a href="https://www.memorysafety.org/blog/introducing-river/">building</a> just that on top of Cloudflare's recently open sourced Pingora framework. It's called River, and it will bring many improvements in and beyond memory safety.</p>
<p><strong>AV1</strong>: Media decoders are some of the most prolific sources of memory safety vulnerabilities (see the recent <a href="https://blog.cloudflare.com/uncovering-the-hidden-webp-vulnerability-cve-2023-4863/">WebP vulnerability</a>). We're working to create a suite of media decoders and compression libraries that are safer without sacrificing performance, which is critical for adoption. We're currently developing a safer <a href="https://www.memorysafety.org/initiative/av1/">AV1 decoder</a> and we're seeing strong interest in adoption from major companies.</p>
<p>We're excited by the growing community invested in building a memory safe future. If you or your organization is interested in <a href="https://www.memorysafety.org/become-a-funder/">helping us get there</a>, please reach out at <a href="mailto:sponsor@abetterinternet.org">sponsor@abetterinternet.org</a>.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/growing-support/</guid>
      </item><item>
        <title>Sudo-rs dependencies: when less is better</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/reducing-dependencies-in-sudo/</link>
        <pubDate>Thu, 07 Mar 2024 00:00:00 +0000</pubDate>
        <description><![CDATA[<div class="card border-0">
    <div class="pt-4 pb-4">
        <blockquote class="blockquote">
            <span class="quote"></span>
            <div class="quote-text">
                <p class="font-italic lh-170">The sudo utility represents a critical privilege boundary, so it should be memory safe. We <a href="https://www.memorysafety.org/initiative/sudo-su/">rewrote</a> it in Rust with partners at <a href="https://tweedegolf.nl/">Tweede golf</a> and <a href="https://ferrous-systems.com/">Ferrous Systems</a>. Ruben Nijveld from the Tweede golf team offers his perspective here on one of the greatest challenges we faced when developing software that can be widely adopted: Rust crate dependencies.</p>
                <footer class="blockquote-footer"><cite title="Source Title">Josh Aas, Head of ISRG's Prossimo project</cite></footer>
            </div>
        </blockquote>
    </div>
</div>
<div class="card border-0 pic-quote-right mw-240px mx-auto mb-4 pt-4">
  <img alt="Reducing dependencies in sudo" class="rounded mx-auto img-fluid" src="/images/blog/blog-reducing-dependencies-in-sudo-overview.png" />
</div>
<p>When <a href="https://github.com/memorysafety/sudo-rs">sudo-rs</a> development started, we pulled in several dependencies from Rust's crates ecosystem so we could ramp up quickly. Over the course of development we accrued approximately 135 transitive (direct and indirect) dependencies. Once we identified this, we reduced the total to three. In this blog, we explain why and how we did that.</p>
<h2 id="growing-and-shrinking-our-dependencies">Growing and shrinking our dependencies</h2>
<p>As we <a href="https://www.memorysafety.org/initiative/sudo-su/sudo-su-work-plan/">ramped up</a> development, we wanted to quickly get to a working prototype. This allowed us to work in more detail on parts of the program while still being able to quickly run the entire program to validate any changes we made. During this ramp-up, we added approximately 10 direct dependencies, which in turn pulled some 125 indirect dependencies into our project. Those indirect dependencies in particular might sound alarming for a security-oriented application like sudo-rs, but the number is somewhat artificially inflated: Cargo resolves crates for all supported platforms, including platforms such as Windows, which a Unix utility like ours would obviously never compile in.</p>
<p>After identifying our dependencies as a potential issue, we started working on reducing our usage of them. Over the course of a few months we carefully removed almost all of them, ending up with only three crate dependencies in the current version: <code>libc</code>, <code>glob</code>, and <code>log</code>. All three are maintained under the rust-lang GitHub organization, indicating that they are very much at the core of the Rust ecosystem.</p>
<p><img src="/images/blog/blog-reducing-dependencies-in-sudo-chart.png" alt="Sudo-rs dependency graph before trimming"></p>
<p><em>Our dependency graph before we finished trimming</em></p>
<p>Most of the time, our usage of a removed crate was limited to a few places and touched little of its functionality. In other cases we had to put in more effort, but none of our usage was so extensive that it couldn't be replaced relatively easily with code we wrote ourselves.</p>
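<p>To make this concrete, here is a hypothetical example of the kind of replacement involved. The scenario and function are illustrative, not actual sudo-rs code: instead of pulling in a regex crate just to validate environment variable names, a few lines of hand-written Rust cover the one pattern needed.</p>

```rust
// Hypothetical example: a hand-written check replacing a regex crate
// that was only used to match the pattern [A-Za-z_][A-Za-z0-9_]*.
fn is_valid_env_name(name: &str) -> bool {
    let mut chars = name.chars();
    // The first character must be an ASCII letter or underscore.
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false,
    }
    // Remaining characters may also be ASCII digits.
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}
```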
<h2 id="how-dependencies-can-cause-trouble">How dependencies can cause trouble</h2>
<p>Sudo-rs team member Marc <a href="https://tweedegolf.nl/en/blog/104/dealing-with-dependencies-in-rust">wrote about dependencies previously</a> on the Tweede golf blog. His post gives a good overview of our considerations concerning which dependencies to keep and which to get rid of. In essence, having a lot of dependencies causes two problems. The first is a burden problem: each added dependency requires extra effort. That effort shows up in tasks such as keeping dependencies up to date, and it also creates extra work for downstream users, like the people packaging the project for a Linux distribution.</p>
<p>The second problem is a trust problem: each additional dependency is another team to trust and another codebase to validate. This trust problem is especially important to sudo-rs. As a <code>setuid</code> program meant for elevating privileges, all code that is compiled into sudo-rs has the potential to accidentally (or intentionally) give access to system resources to people who should not have that access. The <code>setuid</code> context additionally puts some constraints on how code is executed, and dependencies might not have accounted for that context. We could not expect any of our dependencies to take into account such a context either.</p>
<p>Of course there is also the other side of the coin: dependencies are not solely a bad thing. As Marc noted in his blog post, we all stand on the shoulders of giants. If we cannot rely on the wider community we might end up repeating mistakes or missing knowledge. We might have to repeat writing code that has already been battle-tested and perfected over many years and by many people. Additionally, being able to contribute back to a wider ecosystem helps everyone, and not just ourselves.</p>
<h2 id="evaluate-and-re-evaluate">Evaluate and re-evaluate</h2>
<p>After <a href="https://www.memorysafety.org/blog/sudo-and-su/">announcing</a> the plan to rewrite sudo in Rust, one of the most common pieces of feedback we read online was that our reliance on external dependencies made it harder to validate the correctness of the code. At that time we were already working on reducing that reliance, and those voices confirmed that we were on the right track. While that meant looking critically at our dependencies, it did not mean removing them just for the sake of it; we constantly weighed the pros and cons of each dependency. In the end, however, so much of the functionality in sudo-rs is special-cased that we opted to remove all but the most essential crates.</p>
<p>As an example of how we evaluated our dependencies, we previously used <code>clap</code> for command line argument parsing. We replaced it with our own argument parsing once we noticed that adopting <code>clap</code> was taking more code than doing it ourselves. Additionally, we saw that <code>clap</code> offered far more features than we needed, which in turn meant pulling in a significant number of additional dependencies. That resulted in too large of a library, too many teams to trust, and too many possibilities for bad <code>setuid</code> behavior for sudo purposes. In the end, we chose the potential dangers of reimplementing command line parsing over the potential issues of including <code>clap</code>, even though it is a great library for general-purpose command line parsing.</p>
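<p>A hand-rolled parser for a small, fixed set of options can be quite short. The sketch below is illustrative only; the option names and struct fields are hypothetical, not the actual sudo-rs code.</p>

```rust
// Hypothetical sketch of hand-rolled argument parsing for a sudo-like
// tool: a couple of known flags, then the command to run.
#[derive(Debug, Default, PartialEq)]
struct Options {
    user: Option<String>,
    command: Vec<String>,
}

fn parse_args<I: IntoIterator<Item = String>>(args: I) -> Result<Options, String> {
    let mut opts = Options::default();
    let mut args = args.into_iter();
    while let Some(arg) = args.next() {
        match arg.as_str() {
            "-u" | "--user" => {
                opts.user =
                    Some(args.next().ok_or_else(|| "missing argument for -u".to_string())?);
            }
            _ => {
                // The first non-flag argument starts the command to run;
                // everything after it is passed through untouched.
                opts.command.push(arg);
                opts.command.extend(args);
                return Ok(opts);
            }
        }
    }
    Ok(opts)
}
```

Rejecting unknown flags explicitly, rather than treating them as the start of the command, would be an easy extension of the same loop.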
<h2 id="conclusion">Conclusion</h2>
<p>We believe that the current set of crates is a good trade-off between potential risk and gained benefits. Our packaging story is relatively easy for most Linux distributions (with sudo-rs already being available for <a href="https://packages.debian.org/sid/sudo-rs">Debian</a> and <a href="https://packages.fedoraproject.org/pkgs/sudo-rs/sudo-rs/">Fedora</a>), but that would have been a different story had we kept our 135 dependencies. Also, companies or institutions that require legal review of the entire transitive dependency tree might look at our code much more favorably now. Of course, not every project is like sudo-rs, and other projects might come to different conclusions on how valuable their dependencies are. But halting development, taking a step back, and looking critically at your dependencies is a valuable exercise. Dependencies are often overlooked in that regard, but your responsibility as a developer doesn't end where your code ends; it extends to your dependencies as well.</p>
<h6 id="image-credits">Image credits</h6>
<p>Both dependency graph images were generated with <a href="https://github.com/jplatte/cargo-depgraph">cargo depgraph</a>.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/reducing-dependencies-in-sudo/</guid>
      </item><item>
        <title>Rustls Now Using AWS Libcrypto for Rust, Gains FIPS Support</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/rustls-with-aws-crypto-back-end-and-fips/</link>
        <pubDate>Thu, 29 Feb 2024 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>As of today, the <a href="https://github.com/rustls/rustls">Rustls TLS library</a> is using AWS Libcrypto for Rust (<a href="https://github.com/aws/aws-lc-rs">aws-lc-rs</a>) for cryptography by default, with the option to enable FIPS support. This removes a major roadblock for safer TLS in many organizations.</p>
<p>Over the past couple of years it became clear to us that in order to bring the best possible version of Rustls to a wider audience, we would need to make changes to the cryptographic support offered. The first step was to introduce pluggable cryptography: the ability to choose a cryptographic back-end at build time. Work on this began in Q3 2023 and is now complete. You can now choose to build Rustls with <a href="https://github.com/aws/aws-lc-rs">aws-lc-rs</a> or <a href="https://github.com/briansmith/ring/"><em>ring</em></a> for cryptography. The community has started to add support for cryptography from Rust Crypto, Mbed TLS, and BoringSSL. We hope to add support for <a href="https://github.com/microsoft/SymCrypt">SymCrypt</a> soon.</p>
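<p>At its core, pluggable cryptography means the TLS code calls into a provider abstraction rather than a concrete library, so back-ends can be swapped without touching the calling code. The sketch below illustrates that pattern in self-contained Rust; the trait and type names are invented for illustration and are not the actual Rustls API.</p>

```rust
// Illustrative sketch of a pluggable back-end (names are hypothetical,
// not Rustls's real CryptoProvider type): the TLS stack codes against a
// trait, and a concrete implementation is selected at build or
// configuration time.
trait CryptoProvider {
    fn name(&self) -> &'static str;
    // Toy stand-in for a real primitive such as a hash or AEAD.
    fn checksum(&self, data: &[u8]) -> u64;
}

struct ProviderA;

impl CryptoProvider for ProviderA {
    fn name(&self) -> &'static str {
        "provider-a"
    }
    fn checksum(&self, data: &[u8]) -> u64 {
        data.iter().map(|&b| b as u64).sum()
    }
}

// The TLS stack holds a trait object, so swapping back-ends does not
// require changing any calling code.
struct TlsStack {
    provider: Box<dyn CryptoProvider>,
}

impl TlsStack {
    fn new(provider: Box<dyn CryptoProvider>) -> Self {
        Self { provider }
    }
}
```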
<p>We chose to make aws-lc-rs the new default because it's a high quality implementation with FIPS support. The AWS cryptography team had already developed excellent Rust bindings by the time we needed them, and their team has been a joy to work with. We also appreciate that their project is open source, hosted and developed in public, on GitHub.</p>
<p>Over the same period of time that we worked on pluggable cryptography, ISRG engaged Adolfo Ochagavía to build <a href="https://www.memorysafety.org/blog/rustls-performance/">comprehensive benchmarking</a> for Rustls. This system serves two purposes: 1) helping to prevent performance regressions, and 2) informing us about how Rustls performance compares to other libraries. It has already prevented some code from being merged that would have regressed performance, and we'll have more to say soon about the performance advantages of Rustls.</p>
<p>We're incredibly proud of the <a href="https://www.memorysafety.org/initiative/rustls/">big steps</a> we've taken recently towards a safer TLS implementation for much of the Internet. We're excited about the <a href="https://github.com/rustls/rustls/blob/main/ROADMAP.md">next phases of our work</a>, including an OpenSSL compatibility layer. The OpenSSL compatibility layer will allow Rustls to act as a drop-in replacement for many OpenSSL users.</p>
<p>Prossimo is able to take on the challenging work of rewriting critical components of the Internet thanks to our community of funders from around the world. We'd like to thank AWS, Sovereign Tech Fund, Google, Fly.io, and Alpha-Omega for their support of our work on Rustls.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/rustls-with-aws-crypto-back-end-and-fips/</guid>
      </item><item>
        <title>Announcing River: A High Performance and Memory Safe Reverse Proxy Built on Pingora</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/introducing-river/</link>
        <pubDate>Wed, 28 Feb 2024 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>Today we are announcing plans to build a new <a href="https://github.com/memorysafety/river">high performance and memory safe reverse proxy</a> in partnership with Cloudflare, Shopify, and Chainguard. The new software will be built on top of Cloudflare's <a href="https://github.com/cloudflare/pingora">Pingora</a>, a Rust-based HTTP proxy, the open sourcing of which was <a href="https://blog.cloudflare.com/pingora-open-source">announced today</a>.</p>
<p>Just about every significant deployment on the Internet makes use of reverse proxy software, and the most commonly deployed reverse proxy software is not memory safe. This means that most deployments have millions of lines of C and C++ handling incoming traffic at the edges of their networks, a risk that <a href="https://www.whitehouse.gov/oncd/briefing-room/2024/02/26/press-release-technical-report/">needs to be addressed</a> if we are to have greater confidence in the security of the Internet. In order to change this, Prossimo is investing in new reverse proxy software called <a href="https://www.memorysafety.org/initiative/reverse-proxy/">River</a>, which will offer excellent performance while reducing the chance of memory safety vulnerabilities to near zero.</p>
<p>“Cloudflare announced plans to open source Pingora, a high performance, memory-safe framework to build large scale networked systems a little over a year ago. After the announcement, it became clear that we were not the only ones striving to replace unsafe legacy Internet infrastructure,” said Aki Shugaeva, Engineering Director, Cloudflare. “Cloudflare's mission is to help build a better Internet and we look forward to partnering with the ISRG and others on the River project to continue to improve the Internet for everyone.”</p>
<div class="card border-0 pic-quote-right">
    <div class="pt-4 pb-4">
        <blockquote class="blockquote">
            <span class="quote"></span>
            <div class="quote-text">
                <p class="font-italic lh-170">&ldquo;<em>At Shopify, we support the efforts to improve the efficiency of crucial web traffic handling infrastructure for a safer and faster HTTP routing system. We appreciate Cloudflare's contribution on this front through the open-sourcing of Pingora, providing the project with a solid foundation to build upon.</em>&rdquo;</p>
                <footer class="blockquote-footer"><cite title="Source Title">Mike Shaver, Distinguished Engineer at Shopify</cite></footer>
            </div>
        </blockquote>
    </div>
</div>
<p>ISRG contracted with <a href="https://onevariable.com/">James Munns</a> over the past few months to lead an effort with our partners to create an architecture and engineering plan, which can be viewed <a href="https://github.com/memorysafety/river">here</a>.</p>
<p>Some of the most compelling features include:</p>
<ul>
<li>
<p>Better connection reuse than proxies like Nginx due to a multithreading model, which greatly improves performance.</p>
</li>
<li>
<p>WASM-based scripting, which will be performant and will let River be scripted in any language that can compile to WASM.</p>
</li>
<li>
<p>Simple configuration, as we've learned some lessons from configuring other software for the past couple of decades.</p>
</li>
<li>
<p>It's written in Rust so you can deploy without worrying about memory safety issues.</p>
</li>
</ul>
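<p>As a rough illustration of the first point: in a multithreaded model, every worker thread can draw from one shared pool of idle upstream connections, whereas a worker-per-process model fragments that pool. The sketch below is purely illustrative; the names and design are assumptions, not River's actual architecture.</p>

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

// One pool of idle upstream connections, shared by every worker thread.
// Plain integers stand in for real sockets in this sketch.
struct ConnectionPool {
    idle: Mutex<VecDeque<u32>>,
}

impl ConnectionPool {
    fn new() -> Arc<Self> {
        Arc::new(Self { idle: Mutex::new(VecDeque::new()) })
    }
    // Reuse an idle connection if any worker has returned one.
    fn checkout(&self) -> Option<u32> {
        self.idle.lock().unwrap().pop_front()
    }
    // Return a connection so any other worker can reuse it.
    fn checkin(&self, conn: u32) {
        self.idle.lock().unwrap().push_back(conn);
    }
}
```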
<p>The engineering work to implement the plan is expected to begin in Q2 2024.</p>
<p>Building a piece of software like this is no small task and we recognize the ambition it will take. We're grateful for the help we've gotten from our partners so far. “Chainguard is proud to support and promote the use of memory safe software, and Prossimo's new reverse proxy, River, is a powerful step forward in securing critical parts of our collective Internet infrastructure,” said Dan Lorenc, CEO of Chainguard. “We commend the Cloudflare team for their open source contribution of the Pingora framework and ISRG's continued work to help developers everywhere build with more memory safe technologies and eliminate entire classes of these vulnerabilities.”</p>
<p>If you or your organization are interested in contributing engineering hours or financial support to help us get there, please reach out to <a href="mailto:sponsor@abetterinternet.org">sponsor@abetterinternet.org</a> to begin the conversation.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/introducing-river/</guid>
      </item><item>
        <title>Automating Releases for Bindgen</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/automating-releases-bindgen/</link>
        <pubDate>Thu, 08 Feb 2024 00:00:00 +0000</pubDate>
        <description><![CDATA[<div class="card border-0">
    <div class="pt-4 pb-4">
        <blockquote class="blockquote">
            <span class="quote"></span>
            <div class="quote-text">
                <p class="font-italic lh-170"><code>Bindgen</code> is an important tool for helping to accelerate the transition from C and C++ to Rust because it generates FFI bindings. We knew that improving the robustness of <code>bindgen</code> would advance our efforts of bringing memory safety to critical infrastructure. We've been working with <a href="https://ferrous-systems.com/">Ferrous Systems</a> to make improvements to <code>bindgen</code>. This post summarizes their most recent work.</p>
                <footer class="blockquote-footer"><cite title="Source Title">Josh Aas, Head of ISRG's Prossimo project</cite></footer>
            </div>
        </blockquote>
    </div>
</div>
<p>Ferrous Systems <a href="https://ferrous-systems.com/blog/bindgen">maintained</a> <code>bindgen</code> through the end of 2023, and we're excited to share an update on our contributions to this project.</p>
<p>Our plan has always been to hand the project back to its original maintainer, so we decided to focus on easing the maintenance of the project instead of adding new features. We made several improvements to the documentation (<a href="https://github.com/rust-lang/rust-bindgen/pull/2607">2607</a>, <a href="https://github.com/rust-lang/rust-bindgen/pull/2613">2613</a>, <a href="https://github.com/rust-lang/rust-bindgen/pull/2615">2615</a> and <a href="https://github.com/rust-lang/rust-bindgen/pull/2634">2634</a>), and fixed small bugs that were introduced in some of the restructuring work we did in the past (<a href="https://github.com/rust-lang/rust-bindgen/pull/2614">2614</a>, <a href="https://github.com/rust-lang/rust-bindgen/pull/2621">2621</a>, <a href="https://github.com/rust-lang/rust-bindgen/pull/2625">2625</a>, <a href="https://github.com/rust-lang/rust-bindgen/pull/2629">2629</a>, <a href="https://github.com/rust-lang/rust-bindgen/pull/2633">2633</a>, <a href="https://github.com/rust-lang/rust-bindgen/pull/2637">2637</a>, <a href="https://github.com/rust-lang/rust-bindgen/pull/2648">2648</a> and <a href="https://github.com/rust-lang/rust-bindgen/pull/2676">2676</a>).</p>
<p>However, our most relevant contribution to the project is a complete overhaul of the release process of <code>bindgen</code>.</p>
<h2 id="automated-releases">Automated releases</h2>
<p><code>Bindgen</code> does not stick to a periodic release schedule; instead, releases are done when users need a specific feature that has not been released yet. This motivated us to make the release process as easy and short as possible.</p>
<p>Creating a new release previously consisted of the following steps:</p>
<ul>
<li>
<p>Ensure that all the changes done to the project's API since the last release are included in the unreleased section of the changelog</p>
</li>
<li>
<p>Move all the changes in the unreleased section of the changelog to the new version's section</p>
</li>
<li>
<p>Update the table of contents of the changelog using <code>doctoc</code></p>
</li>
<li>
<p>Bump the version of the <code>bindgen</code> and <code>bindgen-cli</code> crates</p>
</li>
<li>
<p>Publish a new version of both crates in <code>crates.io</code></p>
</li>
<li>
<p>Generate a new git tag with the same commit published in <code>crates.io</code></p>
</li>
<li>
<p>Create a new Github release pointing to this tag</p>
</li>
</ul>
<p>Although most of these steps were not difficult, they took more than a few minutes to complete and forced the maintainer to go back and forth between windows to remember the next step each time. However, the only step that requires human intervention is updating the changelog. The rest is something that <a href="https://en.wikipedia.org/wiki/Turing_machine">a machine with a long enough strip of tape and a good table of rules</a> could do.</p>
<p>Luckily for us, there are already some excellent tools in the Rust ecosystem focused on releasing and distributing software.</p>
<p>First we have <code><a href="https://github.com/crate-ci/cargo-release">cargo-release</a></code>, which allows us to automatically bump the version of the crates, create the git tag and publish the crates to <code>crates.io</code>. It also has this neat feature of running hooks before the release is done, which we used to update the sections and table of contents of the changelog automatically.</p>
<p>Then we have <code><a href="https://github.com/axodotdev/cargo-dist">cargo-dist</a></code>, which is able to create binaries and installers, and also generate Github releases automatically.</p>
<p>This means that not only were we able to automate most of the release process, but we were also able to produce binary releases so users no longer have to compile <code>bindgen-cli</code> themselves every time a new version of <code>bindgen</code> is released.</p>
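<p>As a hedged sketch of what this can look like in practice (the hook script path and the exact settings below are illustrative assumptions, not bindgen's actual configuration), cargo-release reads a <code>release.toml</code> file and can run a hook before publishing, which is where a changelog-updating script can be plugged in:</p>

```toml
# Illustrative release.toml sketch; the hook script is hypothetical.
# cargo-release runs the hook before bumping versions and publishing.
pre-release-hook = ["./ci/update-changelog.sh"]
# Keep the bindgen and bindgen-cli crates on the same version number.
shared-version = true
tag-name = "v{{version}}"
```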
<p>We didn't get this process right initially. It required a couple of iterations and we even botched one release in the process.</p>
<p>As an aside, it was lovely to see the axodotdev team, who are behind <code>cargo-dist</code>, <a href="https://mastodon.social/@axodotdev/111862739310089765">celebrate</a> the news that the (grand old) <code>bindgen</code> project is benefitting from the fruits of their labor.</p>
<h2 id="what-s-next">What's next?</h2>
<p>With our <a href="https://www.memorysafety.org/">Prossimo</a> contract ending in 2023, we handed back complete maintenance of <code>bindgen</code> to its original maintainer, <a href="https://github.com/emilio">Emilio</a>. We would like to thank Prossimo for the opportunity, as well as Emilio for trusting us to help maintain the project.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/automating-releases-bindgen/</guid>
      </item><item>
        <title>Securing the Web: Rustls on track to outperform OpenSSL</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/rustls-performance/</link>
        <pubDate>Thu, 04 Jan 2024 00:00:00 +0000</pubDate>
        <description><![CDATA[<h2 id="securing-the-web-rustls-on-track-to-outperform-openssl">Securing the Web: Rustls on track to outperform OpenSSL</h2>
<p>Prossimo is funding the development of <a href="https://github.com/rustls/rustls">Rustls</a>, a high-quality TLS implementation written in Rust, with the aim of replacing less safe alternatives such as OpenSSL. This article goes into recent developments in performance tracking for Rustls and provides a performance comparison between Rustls 0.22.0 and OpenSSL 3.2.0 - the latest releases of both projects at the time of writing.</p>
<p>Our investment in benchmarking has helped confirm that Rustls is competitive with OpenSSL. In some scenarios Rustls is already faster, or less resource intensive. In other cases the benchmarking has highlighted areas we can target for improvements. Tight integration with the development process has already paid dividends in identifying regressions and helping inform architectural choices.</p>
<h2 id="performance-as-a-feature">Performance as a feature</h2>
<p>Aside from correctness and security, it is important for a TLS implementation to keep overhead at a minimum. Consider, for instance, the case of a web server under heavy load: a performant TLS implementation will be able to serve more clients than a less performant one. Historically, this has led the industry to treat performance as a non-negotiable feature, preferring TLS implementations with low latency and a low resource footprint, even if they are written in unsafe languages such as C.</p>
<p>With the rise of Rust, however, safer alternatives have become possible without compromising on performance. This was confirmed in 2019 when Joseph Birr-Pixton's <a href="https://jbp.io/2019/07/01/rustls-vs-openssl-performance.html">benchmarks</a> showed Rustls beat OpenSSL in data transfer throughput, handshakes per second and memory usage. Though later versions of OpenSSL caught up in some of the benchmarks, the results made clear that Rustls is a contender to keep an eye on.</p>
<p>As Rustls grows in popularity and the industry trends towards memory safety<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, it becomes more and more important to guarantee top-notch performance. For that reason, I spent the months between August and December developing an advanced setup to track the library's performance in a more principled way. Thanks to this work, maintainers now receive automatic feedback on the performance impact of each pull request and have the data to drive performance optimization efforts.</p>
<h2 id="automated-feedback-on-pull-requests">Automated feedback on pull requests</h2>
<p>The topic of benchmarking in a continuous integration setup is challenging. Rustls' issue tracker states the problem as follows in <a href="https://github.com/rustls/rustls/issues/1385">issue 1385</a>:</p>
<p>It would be very useful to have automated and accurate feedback on a PR's performance impact compared to the main branch. It should be automated, to ensure it is always used, and it should be accurate, to ensure it is actionable (i.e. too much noise would train reviewers to ignore the information). The <a href="https://github.com/rust-lang/rust/pull/112849#issuecomment-1661062264">approach used by rustc</a> [the Rust compiler] is a good example to follow, though its development required a daunting amount of work.</p>
<p>After careful research, prototyping, and talking to people involved in benchmarking the Rust compiler, we arrived at a design with the following setup:</p>
<ol>
<li>
<p>Hardware: the benchmarks run on a bare-metal server at <a href="https://www.ovhcloud.com/en/">OVHcloud</a>, configured in a way that reduces variability of the results.</p>
</li>
<li>
<p>Scenarios: we exercise the code for bulk data transfers and handshakes (full and resumed<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>), with code that has been carefully tuned to be as deterministic as possible.</p>
</li>
<li>
<p>Metrics: we measure executed CPU instructions and wall-clock time (the former because of its stability, the latter because it is the metric end users care about).</p>
</li>
<li>
<p>Reporting: once a benchmark run completes, its respective pull request gets a comment showing an overview of the results, highlighting any significant changes to draw the reviewer's attention (<a href="https://github.com/rustls/rustls/pull/1640#issuecomment-1854147668">here</a> is an example). <a href="https://valgrind.org/docs/manual/cg-manual.html">Cachegrind</a> diffs are also available to aid in identifying the source of any performance difference.</p>
</li>
<li>
<p>Tracking: each scenario keeps track of measured performance over time, to automatically derive a significance threshold based on how noisy the results are. This threshold is used during reporting to determine whether a result should be highlighted.</p>
</li>
</ol>
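<p>To illustrate the tracking idea in the last step, a significance threshold can be derived from the spread of historical measurements. The sketch below is an assumption about the general approach (the three-sigma cutoff is invented for illustration), not the actual rustls-bench-app code.</p>

```rust
// Illustrative sketch: derive a significance threshold from the noise in
// historical benchmark deltas. Results beyond the threshold get
// highlighted in the PR report; the three-sigma rule is an assumption.
fn significance_threshold(historical_deltas: &[f64]) -> f64 {
    let n = historical_deltas.len() as f64;
    let mean = historical_deltas.iter().sum::<f64>() / n;
    let variance = historical_deltas
        .iter()
        .map(|d| (d - mean).powi(2))
        .sum::<f64>()
        / n;
    // Flag results further than three standard deviations from the mean.
    mean + 3.0 * variance.sqrt()
}
```

A noisy scenario thus automatically earns a wider threshold, which is exactly what keeps the reports actionable.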
<p>You can find the code for the benchmarked scenarios in the <a href="https://github.com/rustls/rustls/tree/75edb20a1e6a894089516053348b6137a425b9b4">main rustls repository</a>, under ci-bench. The code for the application that coordinates benchmark runs and integrates with GitHub lives in its <a href="https://github.com/rustls/rustls-bench-app/">own repository</a>.</p>
<h2 id="trophy-case">Trophy case</h2>
<p>In the past months, early versions of the benchmarking system have already helped drive development of Rustls. Below are some examples:</p>
<ul>
<li>
<p><a href="https://github.com/rustls/rustls/pull/1448">PR 1448</a>: introducing dynamic dispatch for the underlying cryptographic library was necessary to make the API more user-friendly, but maintainers were concerned about potential performance regressions. The automated benchmark report revealed that the change had a mildly positive effect on handshake latency, and no effect at all in other scenarios. With this, maintainers were able to merge the pull request with confidence.</p>
</li>
<li>
<p><a href="https://github.com/rustls/rustls/pull/1492">PR 1492</a>: a security feature was introduced to zeroize fields containing secrets, which was expected to have some performance impact. The automated benchmarks showed that the regressions were manageable (between 0.5% and 0.85% for resumed handshake latency, and lower to no impact in other scenarios). Again, this information allowed the maintainers to merge the pull request with confidence. Quoting <a href="https://discord.com/channels/976380008299917365/1015156984007381033/1184153108599803924">ctz</a>: "[There] was a clear security/performance tradeoff, and being able to transparently understand the performance cost was very useful."</p>
</li>
<li>
<p><a href="https://github.com/rustls/rustls/pull/1508">PR 1508</a>: upgrading the <em>ring</em> dependency, which Rustls uses by default for cryptographic operations, caused an up to 21% regression for server-side handshake latency. After some investigation and discussion with <em>ring</em>'s maintainer, we <a href="https://github.com/rustls/rustls/pull/1528#issuecomment-1754786446">concluded</a> that the regression was due to missed optimizations in GCC. The regression was filed to <a href="https://bugs.chromium.org/p/boringssl/issues/detail?id=655">BoringSSL</a> and <a href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111774">GCC</a> issue trackers, but there is currently no planned fix. The recommended solution is to compile <em>ring</em> using Clang, or to use a different cryptographic library such as aws-lc-rs.</p>
</li>
<li>
<p><a href="https://github.com/rustls/rustls/pull/1551#issuecomment-1780734571">PR 1551</a>: a refactoring caused a mild regression in handshake latency, but it was caught during review thanks to the automated benchmarks. The regression was promptly fixed, and the fix even yielded a mild performance improvement.</p>
</li>
</ul>
<h2 id="comparison-against-openssl">Comparison against OpenSSL</h2>
<p>The system described above is well suited to tracking performance differences between versions of Rustls, but it cannot be used to compare against other TLS implementations. For one, CPU instruction counts are an unsuitable metric when comparing entirely different codebases. Using the secondary wall-clock time metric is not an option either, because the scenarios are tuned for determinism and for detecting relative variations in performance, not for achieving the maximum possible throughput.</p>
<p>Fortunately, the Rustls repository provides a set of benchmarks meant to obtain absolute measurements, making it possible to answer questions like: what is the maximum throughput the library can achieve when transferring data over TLS 1.2 with the ECDHE-RSA-AES128-GCM-SHA256 cipher suite? These benchmarks were used in the <a href="https://jbp.io/2019/07/01/rustls-vs-openssl-performance.html">2019 comparison against OpenSSL</a>, and we recently reused them to generate up-to-date results on server-grade hardware.</p>
<p>The full results, including details about our hardware and methodology, are available on <a href="https://github.com/aochagavia/rustls-bench-results">GitHub</a>. The most important conclusions from comparing Rustls 0.22.0 and OpenSSL 3.2.0 follow:</p>
<ol>
<li>
<p>Rustls achieves the best overall performance when used together with the <a href="https://aws.amazon.com/blogs/opensource/introducing-aws-libcrypto-for-rust-an-open-source-cryptographic-library-for-rust/">aws-lc-rs</a> cryptography provider instead of <em>ring</em>. For the highest throughput, the <a href="https://crates.io/crates/jemallocator">jemalloc</a> allocator should be used (it more than doubles the throughput for outgoing data transfers, compared to Rust's default allocator, glibc's malloc). Since this is the most performant configuration, we use it when discussing further results below.</p>
</li>
<li>
<p>Rustls uses significantly less memory than OpenSSL. At peak, a Rustls session costs ~13KiB and an OpenSSL session costs ~69KiB in the tested workloads. We measured a <a href="https://en.wikipedia.org/wiki/C10k_problem">C10K</a> memory usage of 132MiB for Rustls and 688MiB for OpenSSL.</p>
</li>
<li>
<p>Rustls offers roughly the same data send throughput as OpenSSL when using AES-based cipher suites. Data receive throughput is 7% to 17% lower, due to a limitation in the Rustls API that forces an extra copy. <a href="https://github.com/rustls/rustls/pull/1420">Work is ongoing</a> to make that copy unnecessary.</p>
</li>
<li>
<p>Rustls offers around 45% less data transfer throughput than OpenSSL when using ChaCha20-based cipher suites. Further research reveals that OpenSSL's underlying cryptographic primitives are better optimized for server-grade hardware by taking advantage of AVX-512 support (disabling AVX-512 results in similar performance between Rustls and OpenSSL). Curiously, OpenSSL compiled with Clang degrades to the same throughput levels as when AVX-512 is disabled.</p>
</li>
<li>
<p>Rustls handles 30% (TLS 1.2) or 27% (TLS 1.3) fewer full RSA handshakes per second on the server side, but significantly more on the client side (up to 106% more, that is, a factor of 2.06x). These differences are presumably due to the underlying RSA implementation, since the situation is reversed when using ECDSA (Rustls beats OpenSSL by a wide margin in server-side performance, and lags slightly behind in client-side performance).</p>
</li>
<li>
<p>Rustls handles 80% to 330% (depending on the scenario) more resumed handshakes per second, whether using session-ID-based or ticket-based resumption.</p>
</li>
</ol>
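<p>The allocator swap mentioned in point 1 is a one-line change in a Rust binary. A sketch, assuming the <a href="https://crates.io/crates/jemallocator">jemallocator</a> crate is added as a dependency (the version shown is illustrative):</p>

```rust
// Cargo.toml (assumed):
// [dependencies]
// jemallocator = "0.5"

use jemallocator::Jemalloc;

// Route every heap allocation in this binary through jemalloc instead of
// the system allocator (glibc's malloc on Linux).
#[global_allocator]
static GLOBAL: Jemalloc = Jemalloc;

fn main() {
    // Allocation-heavy work, such as buffering TLS records, now uses jemalloc.
    let buf = vec![0u8; 16 * 1024];
    assert_eq!(buf.len(), 16 * 1024);
}
```

<p>Because allocator gains are workload-dependent, it is worth benchmarking with your own traffic before committing to the change.</p>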
<h2 id="the-future">The future</h2>
<p>As far as performance goes, Rustls is steadily positioning itself to become the default TLS implementation on the internet. Besides confirming the library's potential, the benchmark results reveal where Rustls needs to improve. Now that we have the necessary benchmarking infrastructure in place, one of the priorities for 2024 will be to outperform OpenSSL on all fronts. Stay tuned!</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Consider, for instance, <a href="https://msrc.microsoft.com/blog/2019/07/a-proactive-approach-to-more-secure-code/">Microsoft's stance</a> on the matter, <a href="https://www.memorysafety.org/blog/aws-funding/">AWS' commitment</a> to fund memory safety initiatives, and the recently published <a href="https://media.defense.gov/2023/Dec/06/2003352724/-1/-1/0/THE-CASE-FOR-MEMORY-SAFE-ROADMAPS-TLP-CLEAR.PDF">Case for Memory Safe Roadmaps</a>.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>It is important to test both from-scratch (or full) and resumed handshakes, because the performance characteristics of the two are very different.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/rustls-performance/</guid>
      </item><item>
        <title>A Year-End Letter from our Vice President</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/eoy-letter-2023/</link>
        <pubDate>Thu, 28 Dec 2023 00:00:00 +0000</pubDate>
        <description><![CDATA[<div class="card border-0 pic-quote-right">
    <img alt="Sarah Gran" class="mx-auto img-fluid" src="/images/blog/Sarah-Gran-Headshot.jpg" />
</div>
<p><em>This letter was originally published in our <a href="https://www.abetterinternet.org/documents/2023-ISRG-Annual-Report.pdf">2023 Annual Report</a>.</em></p>
<p>We typically open our annual report with a letter from our Executive Director and co-founder, Josh Aas, but he's on parental leave so I'll be filling in. I've run the Brand &amp; Donor Development team at ISRG since 2016, so I've had the pleasure of watching our work mature, our impact grow, and I've had the opportunity to get to know many great people who care deeply about security and privacy on the Internet.</p>
<p>One of the biggest observations I've made during Josh's absence is that all 23 people who work at ISRG fall into that class of folks. Of course I was a bit nervous as Josh embarked on his leave to discover just how many balls he has been keeping in the air for the last decade. Answer: it's a lot. But the roster of staff that we've built up made it pretty seamless for us to keep moving forward.</p>
<p><a href="http://letsencrypt.org">Let's Encrypt</a> is supporting 40 million more websites than a year ago, bringing the total to over <a href="http://letsencrypt.org/stats">360 million</a>. The engineering team has grown to 12 people who are responsible for our continued reliability and ability to scale. But they're not maintaining the status quo. Let's Encrypt engineers are pushing forward our expectations for ourselves and for the WebPKI community. We've added shorter-lived certificates to our 2024 roadmap. We're committing to this work because sub-10-day certificates significantly reduce the impact of key compromise and broaden the universe of people who can use our certs. In addition, the team started an ambitious project to develop a new Certificate Transparency implementation because the only existing option cannot scale for the future and is prone to operational fragility. These projects are led by two excellent technical leads, Aaron Gable and James Renken, who balance our ambition with our desire for a good quality of life for our teams.</p>
<p><a href="http://memorysafety.org">Prossimo</a> continues to deliver highly performant and memory safe software and components in a world that is increasingly eager to address the memory safety problem. This was evidenced by participation at <a href="https://tectonics.memorysafety.org/">Tectonics</a>, a gathering we hosted which drew industry leaders for <a href="https://www.memorysafety.org/blog/tectonics-recap/">invigorated conversation</a>. Meanwhile, initiatives like our <a href="https://www.memorysafety.org/initiative/av1/">memory safe AV1 decoder</a> are in line to replace a C version in Google Chrome. This change would improve security for billions of people. We're grateful to the community that helps to guide and implement our efforts in this area, including Dirkjan Ochtman, the firms Tweede golf and Ferrous Systems, and the maintainers of the many projects we are involved with.</p>
<p>Our newest project, <a href="http://divviup.org">Divvi Up</a>, brought on our first two subscribers in 2023. <a href="https://wearehorizontal.org/index">Horizontal</a>, a small international nonprofit serving Human Rights Defenders, will be <a href="https://divviup.org/blog/horizontal/">collecting privacy-preserving telemetry metrics</a> about the users of their Tella app, which people use to document human rights violations. Mozilla is using Divvi Up to <a href="https://divviup.org/blog/divvi-up-in-firefox/">gain insight into aspects of user behavior</a> in the <a href="https://www.mozilla.org/en-US/firefox/new/">Firefox</a> browser. It took a combination of focus and determination to get us to a production-ready state, and our technical lead, Brandon Pitman, played a big role in getting us there.</p>
<p>We hired Kristin Berdan to fill a new role as General Counsel and her impact is already apparent within our organization. She joins Sarah Heil, our CFO, Josh, and me in ISRG leadership.</p>
<p>Collectively, we operate three impactful and growing projects for $7 million a year. This is possible because of the amazing leadership assembled across our teams and the ongoing commitment from our community to validate the usefulness of our work. As we look toward 2024 and the challenges and opportunities that face us, I ask that you join us in building a more secure and privacy respecting Internet by sponsoring us, making a donation or gift through your DAF, or sharing with the folks you know why security and privacy matter to them.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/eoy-letter-2023/</guid>
      </item><item>
        <title>Tectonics 2023: a Productive Convening to Accelerate Memory Safety</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/tectonics-recap/</link>
        <pubDate>Fri, 03 Nov 2023 00:00:00 +0000</pubDate>
        <description><![CDATA[<div class="card border-0 mb-3 pic-quote-right">
    <img alt="Tectonics Event - November 2, San Francisco" class="mx-auto img-fluid" src="/images/blog/Tectonics-Photo-2023.jpg" />
</div>
<p>We are so pleased to share that <a href="https://tectonics.memorysafety.org/">Tectonics</a> was an invigorating and productive convening on how to advance memory safety. Leaders like Window Snyder, Doug Gregor, David Weston, and Fiona Krakenbürger joined many others for a day-long conversation. We are grateful to everyone who joined us.</p>
<p>A few initial observations struck me from the day's conversations:</p>
<ol>
<li>
<p>We set out to make yesterday's conversation a &quot;2.0&quot;, which moved past the discussion of the problem to focus on solutions. I was pleased with how many stories of experience were shared; it was a reminder of how much great progress has already been made.</p>
</li>
<li>
<p>There were perspectives coming from practitioners, policy makers, advocacy folks, and people in a position to make engineering priority decisions, and participants really valued hearing how others are using their skills and energy to tackle this enormous challenge.</p>
</li>
<li>
<p>Improving memory safety is not just a technological challenge. The day's conversations were a good reminder that people are at the heart of changing how security-sensitive software is written, used, and thought about.</p>
</li>
</ol>
<p>We made a decision early on in the planning of Tectonics to create a format that allowed for in-depth conversation by breaking attendees into three-hour tracks focused on a specific topic. We'd like to especially thank our track leaders, Alex Gaynor, Paul Kehrer, Bob Lord, Eric Mill, Siddarth Pandit, Arlie Davis, Dirkjan Ochtman, and Florian Gilcher, whose guidance made the track format so productive.</p>
<p>We'd also like to thank our event sponsors, Ford Foundation, Google, Tweede golf, and Heroku for making this day possible.</p>
<p>Our next step for Tectonics will be to compile pages of notes from the day into a series of readouts that we'll publish in the weeks ahead. To be sure you receive these and other updates from ISRG, <a href="https://www.memorysafety.org/#:~:text=SIGN%20UP%20FOR%20THE%20PROSSIMO%20NEWSLETTER">subscribe to our newsletter</a>.</p>
<p>All of our work, including Tectonics, is made possible thanks to financial support from the people and companies who value better security and privacy for the Internet. <a href="https://abetterinternet.org">Internet Security Research Group (ISRG)</a> is the parent organization of <a href="http://memorysafety.org">Prossimo</a>, <a href="http://letsencrypt.org">Let's Encrypt</a>, and <a href="http://divviup.org">Divvi Up</a>. ISRG is a 501(c)(3) nonprofit. If you'd like to support our work, please consider <a href="https://www.abetterinternet.org/getinvolved/">getting involved</a>, <a href="https://www.abetterinternet.org/donate/">donating</a>, or encouraging your company to <a href="https://www.abetterinternet.org/sponsor/">become a sponsor</a>.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/tectonics-recap/</guid>
      </item><item>
        <title>Announcing Hickory DNS</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/announcing-hickory-dns/</link>
        <pubDate>Thu, 05 Oct 2023 00:00:00 +0000</pubDate>
        <description><![CDATA[<div class="">
  <blockquote class="blockquote">
    <span class="quote"></span>
    <div class="quote-text">
      <p class="font-italic lh-170">Benjamin Fry is the founder and maintainer of DNS software that has attracted growing industry interest due to its progress toward being one of the only open source, high performance, memory safe DNS resolvers. We've invited Benjamin to provide his thoughts on the growth of this project and make an exciting announcement.</p>
      <footer class="blockquote-footer"><cite title="Source Title">Josh Aas, Head of ISRG's Prossimo project</cite></footer>
    </div>
  </blockquote>
</div>
<br  />
<br  />
<p><img src="/images/blog/Hickory-DNS.png" alt=""></p>
<br  />
<br  />
<p>Trust-DNS, a project I started on August 7, 2015, will now be known as Hickory DNS. Since its inception, the project has been an exciting experiment in building a systems-level component in Rust, a memory safe, low overhead language. Over the years it's gone from a standard implementation of classic DNS to a toolbox capable of serving many needs. The Hickory DNS project supports DNSSEC, DoT, DoH, and DoQ. It has been used to build stub resolvers, DNS authorities, experimental recursive resolvers, and low-level protocol implementations. I had a lot of personal goals for this project, and with this name change I hope it will attract more interest so we can achieve them all.</p>
<h2 id="why-make-this-change">Why make this change?</h2>
<p>I chose the name Trust-DNS for good reasons: Rust was in the name, and I wanted to make something trustworthy. This has always been a project supported out of passion, and as it's attracted more interest from others we want to ensure that it has a future with a more widely appealing identity for consumers, developers, and funders. On top of that, we wanted to select a defensible trademark. The Trust-DNS brand name was deemed less defensible since it is a combination of generic words, especially in an industry where any DNS provider will want to talk about the trust their service provides. Hickory, in this context, is far less likely to occur unintentionally without directly referring to this project. Over the past year there's been an ongoing conversation with <a href="https://www.abetterinternet.org/about/">ISRG</a> about receiving support by way of their <a href="https://www.memorysafety.org/about/">Prossimo</a> project to add features and robustness to this software. They also intend to eventually use it in <a href="https://letsencrypt.org/about/">Let's Encrypt</a>. Since we are in the midst of a major push to advance functionality, it seemed like a good time to make the necessary changes to the name as well.</p>
<p>Where did the name Hickory DNS come from? While considering the goals of this project, trust, security, reliability, and safety arose as top priorities; the Hickory tree seemed like a good representation of those goals. A tree has some obvious relationships to the hierarchical structure of DNS. The Hickory tree itself is a strong hardwood that tends to grow straight and look quite elegant. It's known for being hardy, growing all over North America, and it's tough and shock-resistant. These are all properties we've grown to appreciate while developing the project in Rust. By choosing something from nature to represent the project, we're calling out the organic way in which it has attracted interest and development from all over the world. As the project's founder, I hope that with this name we can all feel proud of what's been built so far and recognize how much further the project can grow. Our contribution policies will remain unchanged, and we will always welcome new contributions assuming they fit with our goals. We want the trust we've gained over the years to be something people can continue to rely on.</p>
<h2 id="how-will-this-impact-existing-users-and-contributors">How will this impact existing users and contributors?</h2>
<p>We will be moving the project into the <a href="https://github.com/hickorydns">Hickory DNS</a> organization on GitHub. During this move we will also be changing the project name to <a href="https://github.com/hickorydns/hickorydns">Hickory DNS</a>. This transition to an organization and away from my personal GitHub account will allow for a greater set of administrative options for other collaborators, like <a href="https://github.com/djc">Dirkjan Ochtman</a>, who's been helping maintain the project for a couple of years now. All of the crates will start being published under the hickory-* name; for example, the popular crates trust-dns-resolver and trust-dns-proto will become hickory-resolver and hickory-proto, while the server will be hickory-dns. There may be other changes that need to occur; if others have gone through similar moves, we'd welcome feedback about how to make the transition most effective. Things that we will not change: licensing (MIT or Apache 2.0), the Code of Conduct, and our goal of making the best DNS software available.</p>
<p>I want to thank the many users of and contributors to the Trust-DNS project, and I hope you will join us by continuing to collaborate on the same project under the new name, Hickory DNS. This project would not be where it is without your support: 2,560 commits and 71,318 lines of code written by 170 collaborators. All of that has led to, according to GitHub, 1,136 dependent projects that use the software. Crates.io shows 20,310,799 downloads of the <a href="https://crates.io/crates/trust-dns-proto">proto</a> crate and 19,003,768 downloads of the <a href="https://crates.io/crates/trust-dns-resolver">resolver</a> crate. With these new investments in the project, we hope to grow the usage of the <a href="https://crates.io/crates/trust-dns">server</a> and make it something people are comfortable deploying in their production systems. Thank you everyone for your confidence; I am excited for the future of this project, and I hope you are too. If you have any questions or concerns related to this change, feel free to reach out to me. Here's a Hickory tree in the woods near my parents' house in New York; it's probably 20 or 30 years old. I hope that together we can build something as beautiful.</p>
<p><img src="/images/blog/Hickory-trees.jpg" alt=""></p>
<p>Again, thank you!</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/announcing-hickory-dns/</guid>
      </item><item>
        <title>Advancing Rustls and Rust for Linux with OpenSSF Support</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/rustls-and-rust-for-linux-funding-openssf/</link>
        <pubDate>Mon, 18 Sep 2023 00:00:00 +0000</pubDate>
        <description><![CDATA[<p><img src="/images/OpenSSF-Blog-Post-Cover.png" alt="Rustls and Rust for Linux funding OpenSSF"></p>
<p>Prossimo continues to advance the functionality and scalability of the Rustls TLS library and the Rust for Linux effort thanks to $530,000 in funding from the <a href="https://alpha-omega.dev/">OpenSSF’s Alpha-Omega project</a>. This funding will further Prossimo’s efforts to bring memory safety to critical components of the Internet and further OpenSSF’s Alpha-Omega project’s mission to protect society by improving the security of open source software. &quot;As a memory-safe language, Rust plays a pivotal role in fortifying critical software infrastructure,” said OpenSSF Alpha-Omega co-lead, Michael Scovetta. “Alpha-Omega is proud to support Prossimo's efforts to enhance the Rustls cryptographic library and bolster Rust's integration within the Linux kernel.&quot;</p>
<h5 id="rustls">Rustls</h5>
<p><a href="https://www.memorysafety.org/initiative/rustls/">Rustls</a> is a memory safe library that implements the TLS protocol. TLS is the most ubiquitous protocol for encrypting traffic on the Internet as well as internal networks. Since we began funding Rustls in 2020, we’ve made <a href="https://www.memorysafety.org/blog/preparing-rustls-for-wider-adoption/">continued</a> <a href="https://www.memorysafety.org/blog/rustls-new-features/">efforts</a> to move it toward being a more performant and memory safe alternative to OpenSSL.</p>
<p>Funding will support progress on our <a href="https://www.memorysafety.org/initiative/rustls/rustls-work-plan/">work plan</a>. One of the most exciting priorities is the enablement of pluggable cryptographic backends. This feature will make it possible for Rustls users to choose among cryptographic backends, bringing an important degree of diversity, flexibility and resiliency to the cryptography underlying the Internet. This optionality will also reduce friction for large organizations looking at moving to a memory safe option.</p>
<p>We are also planning to implement a C-based OpenSSL compatibility layer so that OpenSSL consumers can easily switch to Rustls without needing to make major changes to their code or learn Rust.</p>
<h5 id="rust-for-linux">Rust for Linux</h5>
<p>OpenSSF funding will help us maintain <a href="https://www.memorysafety.org/initiative/linux-kernel/">Rust as a supported second language for Linux kernel development</a>, and to foster the creation of drivers and modules written in Rust. Rust support was <a href="https://www.memorysafety.org/blog/rust-in-linux-just-the-beginning/">merged into Linux kernel v6.1</a>, an incredible achievement. Work is now focused on improving that support and getting larger modules and drivers contributed. “Bringing memory safety to a piece of software as critical as the kernel is a watershed moment for our efforts,” said the head of Prossimo and Executive Director of ISRG, Josh Aas. “With funding from OpenSSF, we remain resolutely focused on building a more secure Internet for everyone, everywhere.”</p>
<p>The primary maintainer of <a href="https://rust-for-linux.com/">Rust for Linux</a>, Miguel Ojeda, has been working full time under contract with Prossimo since <a href="https://www.memorysafety.org/blog/supporting-miguel-ojeda-rust-in-linux/">April of 2021</a>. His leadership has ushered Rust into the stable Linux kernel, a feat that required gaining the trust of Linux kernel maintainers and decision makers. Ojeda has also fostered the development of an invested and growing community of contributors. “Since the merge, the kernel has been steadily gaining support for the dependencies that key use cases need, as well as new contributors and companies supporting us with engineering time,” commented Ojeda. This growing momentum for the Rust for Linux project means that now is not the time to take our foot off the gas. “The following months will be critical, because the first uses of Rust in the kernel will be submitted to be evaluated by kernel maintainers. If successful, over time, some of those use cases could have a security impact in billions of devices,” said Ojeda.</p>
<p>We hope to see more public and private organizations that rely on critical open source digital infrastructure step up and support it. If you or your organization would like to come on board as a funder of Prossimo, we would be excited to begin a conversation with you at <a href="mailto:donate@abetterinternet.org">donate@abetterinternet.org</a>.</p>
<p><a href="https://abetterinternet.org">Internet Security Research Group (ISRG)</a> is the parent organization of <a href="http://memorysafety.org">Prossimo</a>, <a href="http://letsencrypt.org">Let’s Encrypt</a>, and <a href="http://divviup.org">Divvi Up</a>.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/rustls-and-rust-for-linux-funding-openssf/</guid>
      </item><item>
        <title>The First Stable Release of a Memory Safe sudo Implementation</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/sudo-first-stable-release/</link>
        <pubDate>Tue, 29 Aug 2023 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>Prossimo is pleased to announce the <a href="https://crates.io/crates/sudo-rs">first stable release</a> of <a href="https://github.com/memorysafety/sudo-rs">sudo-rs</a>, our Rust rewrite of the critical sudo utility.</p>
<p>The sudo utility is one of the most common ways for engineers to cross the privilege boundary between user and administrative accounts in the ubiquitous Linux operating system. As such, its security is of the utmost importance.</p>
<p>The <a href="https://www.memorysafety.org/initiative/sudo-su/">sudo-rs project</a> improves on the security of the original sudo by:</p>
<ul>
<li>
<p>Using a memory safe language (Rust), as it's estimated that one out of three security bugs in the original sudo has been a memory management issue</p>
</li>
<li>
<p>Leaving out less commonly used features so as to reduce attack surface</p>
</li>
<li>
<p>Developing an extensive test suite which even managed to <a href="https://ferrous-systems.com/blog/testing-sudo-rs/">find bugs in the original sudo</a></p>
</li>
</ul>
<p>The Wolfi Linux OS already includes sudo-rs and we hope that others will follow their lead. &quot;When we first set out to build Wolfi, making sure it was memory safe was always a top priority,&quot; said Dan Lorenc, CEO and Co-founder at Chainguard. &quot;The sudo utility is a perfect example of a security-critical tool that's both pervasive and under-appreciated. Security improvements to tools like this will have an outsized impact on the entire industry. The work that went into building the first sudo-rs release is a great step forward in eliminating potential security issues by adopting memory safe languages like Rust. This is critical for upholding and maintaining Wolfi as the secure-by-default foundation for developers who want to address most modern supply chain threats.&quot;</p>
<p>A joint team from <a href="https://tweedegolf.nl/">Tweede Golf</a> and <a href="https://ferrous-systems.com/">Ferrous Systems</a> built sudo-rs under contract with Prossimo. We're pleased with how much progress they've made since <a href="https://www.memorysafety.org/blog/sudo-and-su/">starting this project</a> in December, 2022. An external security audit of the sudo-rs code is scheduled to start in September 2023. After that, the team will start on Milestone 4 of our <a href="https://www.memorysafety.org/initiative/sudo-su/sudo-su-work-plan/">work plan</a>, which focuses on enterprise features.</p>
<p>The original <a href="https://www.sudo.ws/">C-based sudo utility</a> has been maintained by Todd C. Miller for many years now, and we're grateful to him for taking on this huge and important task. We're also grateful that Todd has made time to offer us excellent advice on implementing sudo-rs.</p>
<p>Prossimo is able to take on the challenging work of rewriting critical components of the Internet thanks to our community of funders from around the world. We’d like to thank the NLnet Foundation for funding the audit of sudo-rs. We'd also like to thank Amazon Web Services for supporting this work and the transition to memory safe software.</p>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/sudo-first-stable-release/</guid>
      </item><item>
        <title>Prossimo announces Tectonics: an event to shift the work of memory safety forward </title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/tectonics-2023/</link>
        <pubDate>Wed, 26 Jul 2023 00:00:00 +0000</pubDate>
        <description><![CDATA[<div class="card border-0 mb-3 pic-quote-right">
    <img alt="Tectonics Event - November 2, San Francisco" class="mx-auto img-fluid" src="/images/blog/Tectonics-Social-Share-Square.jpg" />
</div>
<p>Not all that long ago, the idea of rewriting much of the Internet's critical software to make it memory safe was, if thought about at all, quickly dismissed as an unrealistic endeavor. And while this idea may still be ambitious today, there's momentum towards shifting the focus from dialogue to planning and execution.</p>
<p>That momentum inspired us to come up with <a href="https://tectonics.memorysafety.org">Tectonics</a>. The vision for Tectonics, happening November 2 in San Francisco, is to move the conversation around memory safety from &quot;why&quot; and &quot;what if&quot; to &quot;how to.&quot; As one of our funders, Craig Newmark, noted, &quot;I learned about memory safety bugs the hard way, back in 1985 when I was a programmer. We now have the tools to address this problem, so it's time to take action and eliminate these bugs and vulnerabilities by using memory safe code.&quot;</p>
<p>We recognize and are encouraged by the breadth and frequency of conversations around memory safety. While that is a strong wind in the sails of this work, Tectonics will be a day of proctored conversations guided by individuals leading the effort. Our goal is to collaboratively create a series of recommendations and guidance on how we can proliferate memory safety across the Internet.</p>
<p>In a day-long convening, Tectonics will use part of the day to hear from leaders like Window Snyder, CEO at Thistle Technologies, and Bob Lord, Senior Technical Advisor at CISA. The afternoon working group conversations will focus on addressing three topics:</p>
<ol>
<li>
<p>Adoption of memory safe languages in operating systems</p>
</li>
<li>
<p>Dependency management</p>
</li>
<li>
<p>Organizational roadmaps for deploying memory safe software</p>
</li>
</ol>
<p>Through proctored conversations, Tectonics will produce clear and actionable recommendations that Prossimo will publish and distribute. We're excited about the idea of a 2.0 conversation that will bring clarity to our collective work to build a more secure Internet for everyone, everywhere.</p>
<p>Tectonics <a href="https://tectonics.memorysafety.org/pdf/Memory%20Safety%20Event%20Prospectus.pdf">sponsorships begin at $5,000</a> and are available now. Registration will open later this year; however, you can <a href="https://tectonics.memorysafety.org/#save-the-date">save the date</a> now to be notified once registration opens.</p>
<p><a href="https://memorysafety.org">Prossimo</a> is a project of <a href="https://abetterinternet.org">Internet Security Research Group (ISRG)</a>, a 501(c)(3) nonprofit organization. ISRG launched Prossimo in 2020 to bring greater attention and resources to tackling the lack of memory safety in the Internet's critical infrastructure. Since its founding, Prossimo has funded nine initiatives with more than $5M to rewrite critical components of the Internet.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/tectonics-2023/</guid>
      </item><item>
        <title>$1.5M from Sovereign Tech Fund to Fuel Memory Safety</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/1.5m-from-sovereign-tech-fund/</link>
        <pubDate>Tue, 11 Jul 2023 00:00:00 +0000</pubDate>
        <description><![CDATA[<p><a href="https://sovereigntechfund.de/en/">Sovereign Tech Fund</a> will be supporting three Prossimo initiatives over the next 18 months with work contracts totaling $1.5M. This is the largest single contract Prossimo has received to date. This funding enables our continued work with the wonderful maintainers, developers, and funders who have helped us make such great progress so far.</p>
<div class="card border-0 pic-quote-right">
    <img alt="Sovereign Tech Fund logo" class="mx-auto img-fluid" src="/images/blog/logo-sovereign-tech-fund.webp" />
</div>
<h2 id="rustls">Rustls</h2>
<p>This funding supports the development of both foundational features and general improvements. <a href="https://www.memorysafety.org/initiative/rustls/">Rustls</a> is well-positioned to replace OpenSSL in many scenarios and our work will make it more appealing for a wider user base.  For example, in 2023 we plan to:</p>
<ul>
<li>
<p>Enable pluggable cryptographic back-ends</p>
</li>
<li>
<p>Add the option to rely on OS trust verifier platforms</p>
</li>
<li>
<p>Develop a comprehensive performance benchmarking system</p>
</li>
<li>
<p>Change the default cryptographic library to one that is FIPS certified</p>
</li>
</ul>
<h2 id="rav1d">rav1d</h2>
<p>We will continue work to port the C code in the dav1d AV1 video and image decoder to Rust. We expect that <a href="https://www.memorysafety.org/initiative/av1/">rav1d</a> will be ready for initial users in early 2024. Codecs have a long history of memory safety problems so we are excited to build one in a memory safe language just as more companies are making the switch from other media types to AV1.</p>
<h2 id="dns">DNS</h2>
<p>We will accelerate the development and maturation of a high-potential <a href="https://www.memorysafety.org/initiative/dns/">DNS resolver</a>. It will be highly performant, open source, memory safe, and fully recursive. <a href="https://letsencrypt.org">Let's Encrypt</a>, a sibling project of Prossimo also run by <a href="https://abetterinternet.org">ISRG</a>, will be one of the first large-scale deployments.</p>
<p>There is strong alignment between the work of Prossimo and the goal of the Sovereign Tech Fund, which is to strengthen digital infrastructure and open source ecosystems in the public interest. Fiona Krakenbürger, co-founder of the Sovereign Tech Fund, commented &quot;The memory safety work that the Internet Security Research Group does with Prossimo is absolutely essential. It exemplifies the digital infrastructure and open source ecosystem the Sovereign Tech Fund wants to support. By investing in making TLS, the AV1 media decoder, and a DNS resolver more secure, we're acting in the public interest by improving the security of everyone using the internet, from individuals to companies and governments. Together, we're safeguarding our shared digital infrastructure for the common good.&quot;</p>
<p>Since Prossimo only focuses on critical infrastructure that is widely used, our work can have a broad impact across a large number of people using the Internet (even if those people never know it!). This approach helps us do the most good with our resources.</p>
<p>We applaud the Sovereign Tech Fund and the German government for recognizing the connection between strong, well-supported digital infrastructure and innovation and economic growth (the fund is financed by the German Federal Ministry for Economic Affairs and Climate Action).</p>
<p>With better and more secure tools, people, companies, and institutions can focus more on the task at hand. We hope to see more public and private organizations who rely on open source critical digital infrastructure to step up and support it. If you or your organization would like to come on board as a funder of Prossimo, we would be excited to begin a conversation with you at <a href="mailto:donate@abetterinternet.org">donate@abetterinternet.org</a>.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/1.5m-from-sovereign-tech-fund/</guid>
      </item><item>
        <title>ISRG’s 10th Anniversary</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/isrg-10th-anniversary/</link>
        <pubDate>Wed, 24 May 2023 00:00:00 +0000</pubDate>
        <description><![CDATA[<div class="card border-0 pic-quote-right">
    <img alt="Celebrating 10 Years of ISRG" class="mx-auto img-fluid" src="/images/blog/Blog-2023-05-24-ISRG_10th_anniversary_-_short.gif" />
</div>
<p>It's hard to believe 10 years have passed since Eric Rescorla, Alex Halderman, Peter Eckersley and I founded ISRG as a nonprofit home for public benefit digital infrastructure. We had an ambitious vision, but we couldn't have known then the extent to which that vision would become shared and leveraged by so much of the Internet.</p>
<p>Since its founding in 2013, ISRG's <a href="https://letsencrypt.org/">Let's Encrypt</a> certificate authority has come to serve hundreds of millions of websites and protect just about everyone who uses the Web. Our <a href="https://www.memorysafety.org/">Prossimo</a> project has brought the urgent issue of memory safety to the fore, and <a href="https://divviup.org/">Divvi Up</a> is set to revolutionize the way apps collect metrics while preserving user privacy. I've tried to comprehend how much data about people's lives our work has and will protect, and tried even harder to comprehend what that means if one could quantify privacy. It's simply beyond my ability.</p>
<p><a href="https://www.abetterinternet.org/tenth-anniversary/">Some of the highlights</a> from the past ten years include:</p>
<ul>
<li>
<p>May 24, 2013: ISRG is incorporated, intending to build Let's Encrypt</p>
</li>
<li>
<p>November 18, 2014: The Let's Encrypt project is <a href="https://letsencrypt.org/2014/11/18/announcing-lets-encrypt.html">announced publicly</a></p>
</li>
<li>
<p>September 14, 2015: Let's Encrypt <a href="https://letsencrypt.org/2015/09/14/our-first-cert.html">issues its first certificate</a></p>
</li>
<li>
<p>October 19, 2015: Let's Encrypt <a href="https://letsencrypt.org/2015/10/19/lets-encrypt-is-trusted.html">becomes publicly trusted</a></p>
</li>
<li>
<p>December 3, 2015: Let's Encrypt <a href="https://letsencrypt.org/2015/12/03/entering-public-beta.html">becomes generally available</a></p>
</li>
<li>
<p>March 8, 2016: Let's Encrypt <a href="https://letsencrypt.org/2016/03/08/our-millionth-cert.html">issues its millionth certificate</a></p>
</li>
<li>
<p>June 28, 2017: Let's Encrypt <a href="https://letsencrypt.org/2017/06/28/hundred-million-certs.html">issues its 100 millionth certificate</a></p>
</li>
<li>
<p>March 11, 2019: The ACME protocol <a href="https://letsencrypt.org/2019/03/11/acme-protocol-ietf-standard.html">becomes an IETF standard</a></p>
</li>
<li>
<p>February 27, 2020: Let's Encrypt <a href="https://letsencrypt.org/2020/02/27/one-billion-certs.html">issues its billionth certificate</a></p>
</li>
<li>
<p>October 26, 2020: ISRG board approves a privacy preserving metrics project, now Divvi Up</p>
</li>
<li>
<p>December 9, 2020: ISRG board approves a memory safety project, now Prossimo</p>
</li>
<li>
<p>December 18, 2020: Divvi Up starts <a href="https://divviup.org/blog/prio-services-for-covid-en/">servicing COVID exposure notification</a></p>
</li>
<li>
<p>October 3, 2022: Support for Rust is <a href="https://www.memorysafety.org/blog/rust-in-linux-just-the-beginning/">merged into the Linux kernel</a></p>
</li>
</ul>
<p>All this wouldn't be possible without our staff, community, donors, <a href="https://www.abetterinternet.org/sponsors/">funders</a>, and other partners, all of whom I'd like to thank wholeheartedly.</p>
<p>I feel so fortunate that we've been able to thrive. We're fortunate primarily because great people got involved and funders stepped up, but there's also just a bit of good fortune involved in any success story. The world is a complicated place; there is complex context around every effort that one can't control. Despite our best efforts, fortune plays a role in the degree to which the context swirling around us helps or hinders. We have been fortunate in every sense of the word, and for that I am grateful.</p>
<p>Our work is far from over. Each of our three projects has challenges and opportunities ahead.</p>
<p>For Let's Encrypt, which is more critical than ever and relatively mature, our focus over the next few years will be on long-term sustainability. More and more people working with certificates can't recall a time when Let's Encrypt didn't exist, and most people who benefit from our service don't need to know it exists at all (by design!). Let's Encrypt is just part of how the Internet works now, which is great for many reasons, but it also means it's at risk of being taken for granted. We are making sure that doesn't happen so we can keep Let's Encrypt running reliably and make investments in its future.</p>
<p>Prossimo is making a huge amount of progress moving critical software infrastructure to memory safe code, from the <a href="https://www.memorysafety.org/initiative/linux-kernel/">Linux kernel</a> to <a href="https://www.memorysafety.org/initiative/ntp/">NTP</a>, <a href="https://www.memorysafety.org/initiative/rustls">TLS</a>, <a href="https://www.memorysafety.org/initiative/av1/">media codecs</a>, and even <a href="https://www.memorysafety.org/initiative/sudo-su/">sudo/su</a>. We have two major challenges ahead of us here. The first is to raise the money we need to complete development work. The second is to get the safer software we've been building adopted widely. We feel pretty good about our plans but it's not going to be easy. Things worth doing rarely are.</p>
<p>Divvi Up is exciting technology with a bright future. Our biggest challenge here, like most things involving cryptography, is to make it easy to use. We also need to make sure we can provide the service at a cost that will allow for widespread adoption, so we'll be doing a lot of optimization. Our hope is that over the next decade we can make privacy respecting metrics the norm, just like we did for HTTPS.</p>
<p>The Internet wasn't built with security or privacy in mind, so there is a bountiful opportunity for us to improve its infrastructure. The Internet is also constantly growing and changing, so it is also our job to look into the future and prepare for the next set of threats and challenges as best we can.</p>
<p>Thanks to our supporters, we'll continue adapting and responding to help ensure the Web is more secure long into the future. Please consider <a href="https://www.abetterinternet.org/sponsor/">becoming a sponsor</a> or <a href="https://www.abetterinternet.org/donate/">making a donation</a> in support of our work.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/isrg-10th-anniversary/</guid>
      </item><item>
        <title>AWS commits $1M to bring memory safety to critical parts of the Web</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/aws-funding/</link>
        <pubDate>Thu, 11 May 2023 12:00:00 +0000</pubDate>
<description><![CDATA[<p>Amazon Web Services (AWS) has long supported ISRG's mission through sponsorships of projects such as Let's Encrypt. Today, we're pleased to announce that AWS has continued its commitment to Prossimo through a contribution of $1 million, funding four initiatives focused on improving <a href="/docs/memory-safety/">memory safety</a>: building a memory safe AV1 decoder, <a href="https://github.com/memorysafety/rav1d">rav1d</a>; rewriting <a href="/initiative/sudo-su/">sudo/su</a>; furthering our efforts with <a href="/initiative/rustls/">Rustls</a>; and building out <a href="/initiative/ntp/">ntpd-rs</a>.</p>
<p>&quot;At AWS, security is job zero and we are constantly looking for ways to help us and our customers operate more securely. With this funding, we're furthering ISRG's mission to build a more memory safe internet through the creation of new solutions for securing critical software tools. Investing in open source communities is essential to their long-term sustainability so they can continue to help tackle complex problems like memory safety,&quot; remarked David Nalley, Head of Open Source Strategy and Marketing at AWS.</p>
<p>Our work with the AV1 Decoder initiative is a unique opportunity because it's a relatively new media format and we have a chance to develop a safe decoder option before many organizations make their initial choices about AV1 implementations. This piece of infrastructure can be memory safe from the start. The plan for <a href="https://github.com/memorysafety/rav1d">rav1d</a> is that it performs as well or better than the C-based <a href="https://code.videolan.org/videolan/dav1d">dav1d</a> decoder.</p>
<p>Work on rav1d started towards the end of February 2023. The primary contractor is <a href="https://immunant.com/">Immunant</a>, with veteran codec expert Frank Bossen advising and contributing part-time. The plan is to transpile the C code in dav1d to Rust, then most of the time will be spent cleaning it up from unsafe transpiled Rust to safe, idiomatic Rust. The initial transpile has been completed already and work is well under way to get tests passing.</p>
<p>The sudo and su utilities mediate a critical privilege boundary on just about every open source operating system that powers the Internet. Unfortunately, these utilities have a long history of memory safety issues.</p>
<p>Work started on sudo and su in December of 2022. The contractors are a combined team from <a href="https://tweedegolf.nl/en">Tweede Golf</a> and <a href="https://ferrous-systems.com/">Ferrous Systems</a>. The maintainer of the traditional sudo program, Todd Miller, is volunteering as an advisor to the team.</p>
<p>Our goal with <a href="/initiative/rustls/">Rustls</a> is to build a safer TLS library that can largely replace OpenSSL over time. Rustls will be performant and memory safe. This work began in 2022 and is picking up great speed both in terms of new contributions and new consumers of Rustls.</p>
<p><a href="/initiative/ntp/">NTP</a> is how the Internet keeps track of time, but most of today's popular implementations are written in C. Our work has <a href="https://github.com/pendulum-project/ntpd-rs">produced</a> a new client and server that are both ready for use. We've also added Network Time Security (NTS) to both the NTP server and client.</p>
<p>We're grateful for the longtime commitment from AWS to helping ISRG and its projects build a more secure and privacy-respecting Web for everyone, everywhere. If you or your organization would like to come on board as a funder of Prossimo, we would be excited to begin a conversation with you at <a href="mailto:donate@abetterinternet.org">donate@abetterinternet.org</a>.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/aws-funding/</guid>
      </item><item>
        <title>Bringing Memory Safety to sudo and su</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/sudo-and-su/</link>
        <pubDate>Wed, 26 Apr 2023 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>Our Prossimo project has historically focused on creating safer software on network boundaries. Today however, we're announcing work on another critical boundary - permissions. We're pleased to announce that we're <a href="/initiative/sudo-su/">reimplementing the ubiquitous <code>sudo</code> and <code>su</code> utilities in Rust</a>.</p>
<p><code>Sudo</code> was first developed in the 1980s. Over the decades, it has become an essential tool for performing changes while minimizing risk to an operating system. But because it's written in C, <code>sudo</code> has experienced many vulnerabilities related to memory safety issues.</p>
<p>When we're thinking about what software we want to invest in we think primarily about <a href="/about/#identifying-risk">four risk criteria</a>:</p>
<ol>
<li>Very widely used (nearly every server and/or client)</li>
<li>On a critical boundary</li>
<li>Performing a critical function</li>
<li>Written in languages that are not memory safe (e.g. C, C++, asm)</li>
</ol>
<p>The program <code>sudo</code> fits all four of those risk criteria. It's important that we secure our most critical software, particularly from memory safety vulnerabilities. It's hard to imagine software that's much more critical than <code>sudo</code> and <code>su</code>.</p>
<p>This work is being done by a joint team from <a href="https://ferrous-systems.com/">Ferrous Systems</a> and <a href="https://tweedegolf.nl/">Tweede Golf</a> with generous support from Amazon Web Services. The work plan is viewable <a href="/initiative/sudo-su/sudo-su-work-plan/">here</a>. The GitHub repository is <a href="https://github.com/memorysafety/sudo-rs">here</a>.</p>
<p>If you'd like to support Prossimo's work to <a href="/about/">improve</a> <a href="/docs/memory-safety/">memory safety</a>, please consider <a href="/become-a-funder/">contributing</a>.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/sudo-and-su/</guid>
      </item><item>
        <title>Memory Safe Network Time (NTP) Has New Home, Seeks Early Adopters</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/ntp-and-nts-have-arrived/</link>
        <pubDate>Mon, 17 Apr 2023 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>Today we're pleased to announce that an open source <a href="/initiative/ntp/">memory safe implementation of NTP</a> - <a href="https://github.com/pendulum-project/ntpd-rs">ntpd-rs</a> - has a new long-term home and is looking for early adopters.</p>
<p>The implementation includes a server and client, as well as full support for Network Time Security (NTS), which brings encryption and greater integrity to time synchronization. Timing is precise and stable, as reflected by excellent performance in the NTP pool.</p>
<p>ISRG's Prossimo project set out to develop a strategy, raise funds, and select a contractor for a memory safe NTP implementation in early 2022. We did this because NTP is a critical network-based service and the most widely used implementations are written in C. This is a recipe for exploitable <a href="/docs/memory-safety/">memory safety vulnerabilities</a>, a class of issues that critical system software should not suffer from.</p>
<p>During Q1 2022 we made a plan and selected <a href="https://tweedegolf.nl/">Tweede golf</a> as the contractor. Funding was generously provided by <a href="https://www.cisco.com/">Cisco</a> and <a href="https://aws.amazon.com/">Amazon Web Services</a>. Work started on April 1, 2022. A security audit of the initial production-ready code, performed by <a href="https://www.radicallyopensecurity.com/">Radically Open Security</a> and funded by <a href="https://nlnet.nl/">NLNet Foundation</a>, was completed in March of 2023.</p>
<p>During the <a href="/blog/memory-safe-ntp/">course of the work</a> it was decided that Tweede golf would become the long-term maintainer of ntpd-rs as part of their <a href="https://github.com/pendulum-project">Pendulum Project</a>. Since their team wrote ntpd-rs and we're big fans of their approach to open source, it was an easy decision to make on our end. Their work will be supported by soliciting contracts and sponsorship for features and maintenance.</p>
<p>If you're running NTP services you can help make your systems and the Internet as a whole safer by becoming an early adopter of ntpd-rs and providing feedback to Tweede golf. Contact Tweede golf via <a href="mailto:pendulum@tweedegolf.com">pendulum@tweedegolf.com</a> if you are interested!</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/ntp-and-nts-have-arrived/</guid>
      </item><item>
        <title>Rustls 0.21.0 Released With Exciting New Features</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/rustls-new-features/</link>
        <pubDate>Wed, 29 Mar 2023 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>We're incredibly excited about the latest release of <a href="https://github.com/rustls/rustls">Rustls</a>, a memory safe TLS implementation. This release has two major new features and a number of other improvements.</p>
<p>The first big feature is support for TLS certificates containing IP addresses. Rustls can now be used to set up TLS connections addressed by IP rather than a domain name. This is useful for things like Kubernetes pods, which often use IP addresses instead of domain names, and for DNS over HTTPS/TLS which need an IP address for the server to avoid circular dependency on name resolution. TLS certificates for IP addresses have been the most heavily requested feature for quite a while now and it's great to have it completed.</p>
<p>The second big feature is support for <a href="https://www.rfc-editor.org/rfc/rfc8446#appendix-C.4">RFC8446 C.4 client tracking prevention</a>. This means that passive network observers will no longer be able to correlate connections from ticket reuse.</p>
<p>Version 0.21.0 also contains a number of other improvements. Rustls gets contributions from many individuals, but we'd like to give particular thanks to Joe Birr-Pixton, Dirkjan Ochtman, Rafael López, Daniel McCarney, Jacob Hoffman-Andrews, and Jacob Rothstein for their work on this release.</p>
<p>ISRG, via our <a href="/">Prossimo</a> project, is <a href="/initiative/rustls/">investing heavily in Rustls</a>. It's our goal to make Rustls the most attractive option for software needing TLS support. Daniel McCarney and Jacob Rothstein are currently working on Rustls under Prossimo contracts that tackle the items on our work plan. One of the most important priorities is the enablement of pluggable cryptographic backends. This feature will make it possible for Rustls users to choose among cryptographic backends like <a href="https://github.com/briansmith/ring">Ring</a> or <a href="https://github.com/microsoft/SymCrypt">SymCrypt</a>. We intend for this optionality to reduce the friction for large organizations looking at moving to a memory safe option.</p>
<p>The team is already hard at work on the next version. If you're as excited as we are about the progress and potential, please join Google, Fly.io, and Amazon Web Services in <a href="/become-a-funder/">supporting this work</a>.</p>
<h2 id="about-us">About Us</h2>
<p><a href="https://www.abetterinternet.org/">ISRG</a> is a 501(c)(3) nonprofit organization that is 100% supported through the generosity of those who share our vision for ubiquitous, open Internet security. If you'd like to support our work, please consider <a href="https://abetterinternet.org/getinvolved/">getting involved</a>, <a href="https://abetterinternet.org/donate/">donating</a>, or encouraging your company to <a href="/become-a-funder/">become a funder</a>.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/rustls-new-features/</guid>
      </item><item>
        <title>A Safer High Performance AV1 Decoder</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/safer-av1-decoder/</link>
        <pubDate>Thu, 09 Mar 2023 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>Prossimo is excited to announce that we are working on an <a href="/initiative/av1/">AV1 decoder</a> called <a href="https://github.com/memorysafety/rav1d">rav1d</a>, which can be used for both video and images. Our strategy is to move the C code in the <a href="https://code.videolan.org/videolan/dav1d">dav1d</a> decoder to Rust, while retaining the high performance assembly code.</p>
<p>Image and video decoders have historically been a major source of exploitable <a href="/docs/memory-safety/#types-of-memory-safety-bugs">memory safety vulnerabilities</a> because they often process data from networks in complex ways. Improving memory safety for media decoders is important if we want to reduce the number of exploitable vulnerabilities people are exposed to on the Internet.</p>
<p><a href="https://en.wikipedia.org/wiki/AV1">AV1</a> is a relatively new, open, royalty-free video coding format. AV1 compression can be used for both video and images (the image format is called AVIF). AV1 is rapidly gaining popularity, and we expect that many applications will need to select an AV1 decoder soon. We want to make sure everyone has a safe option to choose.</p>
<p><a href="https://immunant.com/">Immunant</a> is the primary contractor for this work, with assistance from veteran codec expert Frank Bossen. They are going to use a strategy that's new for Prossimo - transpiling.</p>
<p>The C code in dav1d was initially transpiled to Rust using the <a href="https://github.com/immunant/c2rust">c2rust</a> transpiler built by Immunant. With the transpile complete, the team is now working to manually change unsafe transpiled Rust to safe, idiomatic Rust. Along the way they will make sure all tests are passing and that performance is the same or better than dav1d. The final product will include a C API compatible with the dav1d API, so that C consumers can use rav1d with minimal effort, just like they use dav1d.</p>
<p>When combined with a memory safe <a href="https://en.wikipedia.org/wiki/Demultiplexer_%28media_file%29">demuxer</a> like <a href="https://github.com/mozilla/mp4parse-rust">mp4parse-rust</a> it will be possible to do a lot of the work to decode AV1 images and video with a relatively high degree of memory safety. Some assembly code that is not memory safe will still be part of the decoding process, which is necessary in order to retain great performance.</p>
<p>The first four milestones in the <a href="/initiative/av1/av1-work-plan/">work plan</a> have been generously funded by Amazon Web Services. We are working to raise an additional $400k to complete the work.</p>
<p>You can follow our work on this initiative <a href="/initiative/av1/">here</a>.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/safer-av1-decoder/</guid>
      </item><item>
        <title>Klint: Compile-time Detection of Atomic Context Violations for Kernel Rust Code</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/gary-guo-klint-rust-tools/</link>
        <pubDate>Wed, 08 Mar 2023 12:00:00 +0000</pubDate>
        <description><![CDATA[<div class="">
  <blockquote class="blockquote">
    <span class="quote"></span>
    <div class="quote-text">
      <p class="font-italic lh-170">Gary Guo is helping our efforts to bring <a href="/initiative/linux-kernel">Rust into the Linux kernel</a> by building a tool called <code>klint</code>. We asked him to provide his perspective on the work in this blog post. Thank you for your partnership and contributions, Gary!</p>
      <footer class="blockquote-footer"><cite title="Source Title">Josh Aas, Head of ISRG's Prossimo project</cite></footer>
    </div>
  </blockquote>
</div><br  />
<p>For the last couple of months, I have been working on a static analysis tool called <a href="https://github.com/rust-for-linux/klint"><code>klint</code></a> that is able to detect, at compile time, coding errors related to atomic contexts in Rust kernel code.</p>
<p>In this blog post, I'll talk about what atomic context is, what can happen when it is misused, how it relates to Rust, and why it is important to detect these errors.</p>
<h2 id="atomic-contexts">Atomic Contexts</h2>
<p>Generally, a piece of Linux kernel code runs in one of two contexts: atomic context or task context (we are not going to discuss the raw atomic context in this blog post). Code running in task context is allowed to sleep, e.g. to reschedule or acquire a mutex. Code running in atomic context, on the other hand, is not allowed to sleep.</p>
<p>One obvious example of an atomic context is an interrupt handler. Apart from interrupts, the kernel also moves from task context into atomic context when a spinlock is acquired, or when the code is inside an RCU critical section.</p>
<p>Sleeping inside an atomic context is bad -- if you acquire a spinlock and then go to sleep, and another piece of code tries to acquire the same spinlock, it is very likely that the system will be locked up.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">spin_lock</span><span class="p">(</span><span class="o">&amp;</span><span class="n">lock</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="n">mutex_lock</span><span class="p">(</span><span class="o">&amp;</span><span class="n">mutex</span><span class="p">);</span><span class="w"> </span><span class="c1">// BAD
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="n">spin_unlock</span><span class="p">(</span><span class="o">&amp;</span><span class="n">lock</span><span class="p">);</span></span></span></code></pre></div>
<p>These kinds of mistakes are easy to make and hard to debug, especially when the sleepable call is deeply nested. To help debug this, kernel C code has <code>might_sleep()</code> annotations throughout (e.g. inside the <code>mutex_lock</code> function). If you have the <code>DEBUG_ATOMIC_SLEEP</code> config option enabled, then the kernel will track the preemption count. This counter is incremented whenever you enter an atomic section (e.g. by acquiring a spinlock) and decremented on exit. If the counter is non-zero, the kernel is inside an atomic context -- calling <code>might_sleep()</code> in this case will produce a warning to aid debugging.</p>
<h2 id="memory-safety-aspects-of-atomic-context">Memory Safety Aspects of Atomic Context</h2>
<p>The Rust for Linux project tries hard to ensure that it can provide safe abstractions of the kernel C API and empower drivers to be written in safe Rust code. We already have a <a href="https://rust-for-linux.github.io/docs/kernel/sync/index.html">list of synchronisation primitives</a> implemented, and this includes spinlocks and mutexes. Therefore, the concept of atomic context is as relevant in Rust code as in C code.</p>
<p>You might ask: how is <a href="/docs/memory-safety/">memory safety</a> related here? If you are familiar with Rust, you likely know what &quot;memory safety&quot; means there. Safe Rust code should not be able to cause use-after-free or data races, but causing a deadlock is memory safe. If a Rust kernel driver sleeps while inside an atomic context, it might cause a deadlock -- which is bad and should be avoided -- but it should still be memory safe, right?</p>
<p>This would be true if spinlocks were the only source of atomic contexts. However, the kernel very widely employs RCU (read-copy-update). Details of RCU can be found in <a href="https://www.kernel.org/doc/html/v6.1/RCU/whatisRCU.html">the kernel documentation</a>, but in a nutshell, RCU is a synchronisation mechanism to provide efficient read access to shared data structures. It allows multiple readers to access shared data structures without locking. A data structure accessible from an RCU read-side critical section will stay alive and will not be deallocated until all read-side critical sections that may access it have been completed.</p>
<p>In the kernel, an RCU read-side critical section starts with <code>rcu_read_lock()</code> and ends with <code>rcu_read_unlock()</code>. To free a data structure after unpublishing it from RCU, one calls <code>synchronize_rcu()</code> before dropping it:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cm">/* CPU 0 */</span><span class="w">                 </span><span class="cm">/* CPU 1 */</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="n">rcu_read_lock</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="n">ptr</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">rcu_dereference</span><span class="p">(</span><span class="n">v</span><span class="p">);</span><span class="w">   </span><span class="n">old_ptr</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">rcu_dereference</span><span class="p">(</span><span class="n">v</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="cm">/* use ptr */</span><span class="w">               </span><span class="n">rcu_assign_pointer</span><span class="p">(</span><span class="n">v</span><span class="p">,</span><span class="w"> </span><span class="n">new_ptr</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                            </span><span class="n">synchronize_rcu</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                            </span><span class="cm">/* waiting for RCU read to finish */</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="n">rcu_read_unlock</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                            </span><span class="cm">/* synchronize_rcu() returns */</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                            </span><span class="cm">/* destruct and free old_ptr */</span></span></span></code></pre></div>
<p>If you look at the implementation of <code>rcu_read_lock()</code>, however, you will see that it compiles down to a single compiler barrier — <code>asm volatile(&quot;&quot;:::&quot;memory&quot;)</code> — when all the debugging facilities are off. Yes, absolutely no instructions are generated for <code>rcu_read_lock()</code> and <code>rcu_read_unlock()</code>! The Linux kernel plays a trick here -- it implements <code>synchronize_rcu()</code> such that it returns only after every CPU core has gone through a context switch at least once. The kernel considers an RCU read-side critical section to be an atomic context, so no code inside it may sleep and thus cause a context switch. By this reasoning, if all CPU cores have gone through context switches, then all read-side critical sections that were live must have completed. <strong>The soundness of <code>synchronize_rcu()</code> relies on the fact that code cannot sleep inside RCU read-side critical sections!</strong> If such a sleep does happen, it can cause <code>synchronize_rcu()</code> to return early, freeing memory before <code>rcu_read_unlock()</code> and leading to use-after-free.</p>
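<p>The grace-period contract (though not the kernel's zero-cost implementation) can be modeled in userspace with an explicit reader count. The <code>ToyRcu</code> type below is a hypothetical sketch; unlike real RCU, its <code>synchronize</code> also waits for readers that start after it is called:</p>

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Toy grace-period model: unlike kernel RCU, readers are counted explicitly,
// so read_lock/read_unlock are not free. The semantics are the point here:
// synchronize() may not return while any read-side critical section is live.
struct ToyRcu {
    readers: AtomicUsize,
}

impl ToyRcu {
    fn read_lock(&self) {
        self.readers.fetch_add(1, Ordering::SeqCst);
    }
    fn read_unlock(&self) {
        self.readers.fetch_sub(1, Ordering::SeqCst);
    }
    // Waits until no read-side critical section is active.
    fn synchronize(&self) {
        while self.readers.load(Ordering::SeqCst) != 0 {
            thread::yield_now();
        }
    }
}

fn main() {
    let rcu = Arc::new(ToyRcu { readers: AtomicUsize::new(0) });
    let r = Arc::clone(&rcu);
    let reader = thread::spawn(move || {
        r.read_lock();
        // ... read shared data; sleeping here would stall synchronize() ...
        r.read_unlock();
    });
    reader.join().unwrap();
    rcu.synchronize(); // returns only once all read-side sections have ended
    println!("grace period elapsed; safe to free old data");
}
```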
<p>TL;DR: the way RCU is implemented in the Linux kernel lifts sleep in atomic context from &quot;it's bad because it might cause deadlock&quot; to &quot;it's bad because it can cause use-after-free&quot;.</p>
<h2 id="rcu-abstractions-in-rust">RCU Abstractions in Rust</h2>
<p>Rust code, unlike C, usually does not use separate <code>lock</code> and <code>unlock</code> calls for synchronisation primitives -- instead, <a href="https://doc.rust-lang.org/rust-by-example/scope/raii.html">RAII</a> is used: a <code>lock</code> function returns a <code>Guard</code>, and unlocking happens when the <code>Guard</code> is dropped.</p>
<p>For example, an RCU read-side critical section could be implemented like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">RcuReadGuard</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">_not_send</span>: <span class="nc">PhantomData</span><span class="o">&lt;*</span><span class="k">mut</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">rcu_read_lock</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">RcuReadGuard</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">rcu_read_lock</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">RcuReadGuard</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">_not_send</span>: <span class="nc">PhantomData</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="k">impl</span><span class="w"> </span><span class="nb">Drop</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">RcuReadGuard</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">drop</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">rcu_read_unlock</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="c1">// Usage
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">guard</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">rcu_read_lock</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cm">/* Code inside RCU read-side critical section here */</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// `guard` is dropped automatically when it goes out of scope,
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">    </span><span class="c1">// or can be dropped manually by `drop(guard)`.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="p">}</span></span></span></code></pre></div>
<p>If we disregard the memory safety issues discussed above just for a second, Rust lifetimes can model RCU fairly well:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">RcuProtectedBox</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">write_mutex</span>: <span class="nc">Mutex</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">ptr</span>: <span class="nc">UnsafeCell</span><span class="o">&lt;*</span><span class="k">const</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">RcuProtectedBox</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">read</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;a</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">guard</span>: <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="nc">RcuReadGuard</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="nc">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// SAFETY: We can deref because `guard` ensures we are protected by RCU read lock
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">ptr</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="fm">rcu_dereference!</span><span class="p">(</span><span class="o">*</span><span class="bp">self</span><span class="p">.</span><span class="n">ptr</span><span class="p">.</span><span class="n">get</span><span class="p">())</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// SAFETY: The lifetime is the shorter of `self` and `guard`, so it can only be used until RCU read unlock.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">        </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">&amp;*</span><span class="n">ptr</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">write</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">p</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Box</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">g</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">write_mutex</span><span class="p">.</span><span class="n">lock</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">old_ptr</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// SAFETY: We can deref and assign because we are the only writer.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">        </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">old_ptr</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">rcu_dereference!</span><span class="p">(</span><span class="o">*</span><span class="bp">self</span><span class="p">.</span><span class="n">ptr</span><span class="p">.</span><span class="n">get</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="fm">rcu_assign_pointer!</span><span class="p">(</span><span class="o">*</span><span class="bp">self</span><span class="p">.</span><span class="n">ptr</span><span class="p">.</span><span class="n">get</span><span class="p">(),</span><span class="w"> </span><span class="nb">Box</span>::<span class="n">into_raw</span><span class="p">(</span><span class="n">p</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">drop</span><span class="p">(</span><span class="n">g</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">synchronize_rcu</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// SAFETY: We now have exclusive ownership of this pointer as `synchronize_rcu` ensures that all reader that can read this pointer has ended.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">        </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nb">Box</span>::<span class="n">from_raw</span><span class="p">(</span><span class="n">old_ptr</span><span class="p">)</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span></span></span></code></pre></div>
<p>Note that in <code>read</code>, the returned lifetime <code>'a</code> is tied to both <code>self</code> and the <code>RcuReadGuard</code>. That is, the <code>RcuReadGuard</code> must outlive the returned reference -- leaving the RCU read-side critical section by dropping the <code>RcuReadGuard</code> ensures that references obtained through the <code>read</code> method can no longer be used.</p>
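<p>Here is a minimal userspace sketch of this lifetime-tying technique (<code>Protected</code> and <code>ReadGuard</code> are hypothetical stand-ins with no real locking behind them). The borrow checker rejects any use of the returned reference once the guard has been dropped:</p>

```rust
// Hypothetical stand-in for an RCU read guard; no actual locking.
struct ReadGuard;

struct Protected<T> {
    value: T,
}

impl<T> Protected<T> {
    // The returned reference borrows both `self` and `guard`,
    // so it cannot outlive the guard.
    fn read<'a>(&'a self, _guard: &'a ReadGuard) -> &'a T {
        &self.value
    }
}

fn main() {
    let b = Protected { value: 42 };
    let guard = ReadGuard;
    let p = b.read(&guard);
    assert_eq!(*p, 42);
    drop(guard); // "rcu_read_unlock"
    // println!("{}", p); // rejected by the compiler: `p` cannot outlive `guard`
}
```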
<p>However, such an abstraction is not sound, due to the sleep-in-atomic-context issue that we have described above.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">b</span>: <span class="kp">&amp;</span><span class="nc">RcuProtectedBox</span><span class="o">&lt;</span><span class="n">Foo</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">        </span><span class="k">fn</span> <span class="nf">bar</span><span class="p">(</span><span class="n">b</span>: <span class="kp">&amp;</span><span class="nc">RcuProtectedBox</span><span class="o">&lt;</span><span class="n">Foo</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">guard</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">rcu_read_lock</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">b</span><span class="p">.</span><span class="n">read</span><span class="p">(</span><span class="o">&amp;</span><span class="n">guard</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                              </span><span class="kd">let</span><span class="w"> </span><span class="n">old</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">b</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="nb">Box</span>::<span class="n">new</span><span class="p">(</span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">sleep</span><span class="p">();</span><span class="w">                                  </span><span class="c1">// `synchronize_rcu()` returns
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">                                              </span><span class="nb">drop</span><span class="p">(</span><span class="n">old</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Rust allows us to use `p` here
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">    </span><span class="c1">// but it is already freed!
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="p">}</span></span></span></code></pre></div>
<p>Over the past two years there have been discussions about how to provide sound RCU abstractions. One approach is to make all RCU abstractions unsafe -- this is bad from a usability point of view, and it wouldn't solve the issue when a Rust callback is called from C code inside an RCU read-side critical section. We could force preemption count and atomic context checking to always be enabled, but this would introduce overhead to all kernel code that makes use of RCU and spinlocks. In fact, this approach was <a href="https://lore.kernel.org/rust-for-linux/Yyh3kFUvt2aMh4nq@wedsonaf-dev/">proposed</a> by Wedson Almeida Filho and faced <a href="https://lore.kernel.org/rust-for-linux/CAHk-=whm5Ujw-yroDPZWRsHK76XxZWF1E9806jNOicVTcQC6jw@mail.gmail.com/">some rather significant pushback</a> from Linus Torvalds.</p>
<p>People familiar with paradigms in Rust might also wonder whether a token type, or some possible <a href="https://tmandry.gitlab.io/blog/posts/2021-12-21-context-capabilities/">context and capabilities</a> extension, might help with this, but unfortunately it would not. <a href="https://lore.kernel.org/rust-for-linux/20220920233947.0000345c@garyguo.net/">You can't do negative reasoning with token types</a>, so a token-based approach would require almost all functions to carry tokens in their signatures.</p>
<p>In the end, we took none of the above approaches. There are no safeguards in the kernel's Rust API abstractions that prevent sleep-in-atomic-context from happening. This means that if you compile your kernel with preemption count tracking disabled, it's possible to write a Rust driver with only safe code that results in a use-after-free. Pragmatism is prioritised over soundness.</p>
<h2 id="custom-compile-time-checking-with-klint">Custom Compile-time Checking with <code>klint</code></h2>
<p>While we have now established that we can't deal with sleep in atomic context through API design, nor through runtime checking (at least not in all configurations), there is still a way out -- custom linting tools. This is where <code>klint</code> comes into play.</p>
<p><code>klint</code> checks atomic context violation by tracking preemption count at compile-time. Each function is given two properties:</p>
<ul>
<li>
<p>The <strong>adjustment</strong> to the preemption count after calling this function.</p>
</li>
<li>
<p>The <strong>expected range</strong> of preemption counts allowed when calling the function.</p>
</li>
</ul>
<p>Here's a list of properties for some locking-related functions:</p>
<table>
  <thead>
      <tr>
          <th>Function name       </th>
          <th>Adjustment   </th>
          <th>Expectation   </th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><code>spin_lock</code></td>
          <td><code>1</code></td>
          <td><code>0</code>.. (any value)</td>
      </tr>
      <tr>
          <td><code>spin_unlock</code></td>
          <td><code>-1</code></td>
          <td><code>1</code>.. (≥1)</td>
      </tr>
      <tr>
          <td><code>mutex_lock</code></td>
          <td><code>0</code></td>
          <td><code>0</code></td>
      </tr>
      <tr>
          <td><code>mutex_unlock</code></td>
          <td><code>0</code></td>
          <td><code>0</code></td>
      </tr>
      <tr>
          <td><code>rcu_read_lock</code></td>
          <td><code>1</code></td>
          <td><code>0</code>..</td>
      </tr>
      <tr>
          <td><code>rcu_read_unlock</code></td>
          <td><code>-1</code></td>
          <td><code>1</code>..</td>
      </tr>
      <tr>
          <td><code>synchronize_rcu</code></td>
          <td><code>0</code></td>
          <td><code>0</code></td>
      </tr>
  </tbody>
</table>
<p>As you can see, sleepable functions (like <code>synchronize_rcu</code> and <code>mutex_lock</code>) have an adjustment of 0 (they do not change the preemption count) and expect a preemption count of precisely 0 (i.e. not in atomic context). <code>spin_lock</code> can be called from any context (hence the <code>0</code>.. expectation) but adjusts the preemption count upon returning.</p>
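<p>The check along a single call path can be sketched as follows. This is a hypothetical simplification of what <code>klint</code> does (the real tool infers ranges of possible counts across control flow, not one concrete value per path):</p>

```rust
// Per-function properties, mirroring the table above.
struct FnProps {
    name: &'static str,
    adjust: i32,
    expect_min: i32,
    expect_max: Option<i32>, // None models an open range like "0.."
}

// Walk a call sequence: at each call, verify the current count lies in the
// callee's expected range, then apply the callee's adjustment.
fn check_path(path: &[FnProps]) -> Result<i32, String> {
    let mut count = 0;
    for f in path {
        let in_range = count >= f.expect_min
            && f.expect_max.map_or(true, |max| count <= max);
        if !in_range {
            return Err(format!(
                "{} expects preemption count {}..{:?}, but it is {}",
                f.name, f.expect_min, f.expect_max, count
            ));
        }
        count += f.adjust;
    }
    Ok(count)
}

fn main() {
    // spin_lock; spin_unlock -> fine, net count is 0 again
    let ok = [
        FnProps { name: "spin_lock", adjust: 1, expect_min: 0, expect_max: None },
        FnProps { name: "spin_unlock", adjust: -1, expect_min: 1, expect_max: None },
    ];
    assert_eq!(check_path(&ok), Ok(0));

    // spin_lock; mutex_lock -> mutex_lock expects exactly 0, but count is 1
    let bad = [
        FnProps { name: "spin_lock", adjust: 1, expect_min: 0, expect_max: None },
        FnProps { name: "mutex_lock", adjust: 0, expect_min: 0, expect_max: Some(0) },
    ];
    assert!(check_path(&bad).is_err());
}
```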
<p><code>klint</code> provides a <code>#[klint::preempt_count]</code> attribute that can be applied to functions to annotate their properties. There is also a <code>#[klint::drop_preempt_count]</code> attribute that annotates the behaviour when a struct/enum is dropped. For example, the <code>RcuReadGuard</code> (and similarly, <code>SpinLock</code>) above could be annotated like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[klint::drop_preempt_count(adjust = -1, expect = 1.., unchecked)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="k">struct</span> <span class="nc">RcuReadGuard</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* ... */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="cp">#[klint::preempt_count(adjust = 1, expect = 0.., unchecked)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">rcu_read_lock</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">RcuReadGuard</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* ... */</span><span class="w"> </span><span class="p">}</span></span></span></code></pre></div>
<p>and a sleep function could look like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[klint::preempt_count(adjust = 0, expect = 0, unchecked)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">coarse_sleep</span><span class="p">(</span><span class="n">duration</span>: <span class="nc">Duration</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* ... */</span><span class="w"> </span><span class="p">}</span></span></span></code></pre></div>
<p><code>klint</code> will analyse all functions, inferring possible preemption count values at each function call site, and will raise errors if the annotated expectation is violated. For example, if some code calls <code>coarse_sleep</code> with spinlock or RCU read lock held, then <code>klint</code> will give an error:</p>
<pre class="code-block">
<b><span class="code-red">error</span>: this call expects the preemption count to be 0</b>
  <span class="code-blue">--&gt;</span> samples/rust/rust_sync.rs:76:17
   <span class="code-blue">|</span>
<span class="code-blue">76 |</span>  kernel::delay::coarse_sleep(core::time::Duration::from_secs(1));
   <span class="code-blue">|</span>  <span class="code-red">^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^</span>
   <span class="code-blue">|</span>
   <span class="code-blue">=</span> <b>note</b>: but the possible preemption count at this point is 1
</pre>
<p><code>klint</code> will also perform inference on annotated functions to check that your annotation is correct, unless the <code>unchecked</code> option is supplied:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[klint::preempt_count(expect = 0..)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">callable_from_atomic_context</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">kernel</span>::<span class="n">delay</span>::<span class="n">coarse_sleep</span><span class="p">(</span><span class="n">core</span>::<span class="n">time</span>::<span class="n">Duration</span>::<span class="n">from_secs</span><span class="p">(</span><span class="mi">1</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span></span></span></code></pre></div>
<p>will give</p>
<pre class="code-block">
<b><span class="code-red">error</span>: function annotated to have preemption count expectation of 0..</b>
  <span class="code-blue">--&gt;</span> samples/rust/rust_sync.rs:97:1
   <span class="code-blue">|</span>
<span class="code-blue">97 |</span> pub fn callable_from_atomic_context() {
   <span class="code-blue">|</span> <span class="code-red">^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^</span>
   <span class="code-blue">|</span>
   <span class="code-blue">=</span> <b>note</b>: but the expectation inferred is 0
<span class="code-green">note</span>: which may call this function with preemption count 0..
  <span class="code-blue">--&gt;</span> samples/rust/rust_sync.rs:98:5
   <span class="code-blue">|</span>
<span class="code-blue">98 |</span>       kernel::delay::coarse_sleep(core::time::Duration::from_secs(1));
   <span class="code-blue">|</span>       <span class="code-green">^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^</span>
   <span class="code-blue">=</span> <b>note</b>: but this function expects preemption count 0
</pre>
<p>In an ideal world, apart from FFI functions that need to be annotated, all other functions could have these properties inferred. But in reality there are additional difficulties from:</p>
<ul>
<li>
<p>Generic functions</p>
</li>
<li>
<p>Indirect function calls (trait objects, function pointers)</p>
</li>
<li>
<p>Recursion</p>
</li>
</ul>
<p>For recursive functions, <code>klint</code> will simply assume a default property, and if the inferred result differs, it will raise an error asking for an explicit annotation.</p>
<p>Generic functions are tricky because it is impossible for us to assign a single property to a generic function. For example, we can't tell whether <code>Option::map</code> will sleep or not -- its property depends on its type argument, that is, the function/closure that we give it. Therefore, instead of treating a generic function as one entity, <code>klint</code> will check each monomorphized instance of a generic function separately. <code>klint</code> does attempt to optimise this process -- it will try to infer properties on a generic function first before bailing out and checking again after monomorphization.</p>
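<p>To make the monomorphization point concrete, here's a minimal plain-Rust sketch (with hypothetical <code>might_sleep</code>/<code>never_sleeps</code> stand-ins -- no kernel APIs or <code>klint</code> annotations are involved): the two <code>Option::map</code> calls below produce two distinct monomorphized instances, and only one of them would carry the sleeping property.</p>

```rust
// Hypothetical stand-ins: imagine `might_sleep` calls a sleeping kernel API.
fn might_sleep() {}
fn never_sleeps() {}

fn main() {
    let x = Some(1);

    // Instance 1: `Option::map` with a closure that (hypothetically) sleeps.
    // A checker like `klint` would flag this instance in atomic context.
    let a = x.map(|v| {
        might_sleep();
        v + 1
    });

    // Instance 2: the same generic function, but a different closure type,
    // hence a separate monomorphized instance -- this one never sleeps.
    let b = x.map(|v| {
        never_sleeps();
        v + 1
    });

    assert_eq!(a, Some(2));
    assert_eq!(b, Some(2));
}
```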
<p><code>klint</code> assumes all function pointers to be sleepable and makes no adjustment to preemption counts. <code>klint</code> will warn if a Rust function that adjusts the preemption count is converted to a function pointer. For callers that can ensure their function pointers won't sleep, <code>klint</code> provides a way to annotate a function with its properties and skip checks and inferences.</p>
<p>For trait objects, by default <code>klint</code> will similarly assume these functions are sleepable and make no adjustment. Unlike function pointers though, trait methods can be annotated. Those annotations will be used on virtual function calls, and they will be checked against their implementations. For example, here's how the <code>ArcWake</code> trait is annotated in the <code>kasync</code> module:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="sd">/// A waker that is wrapped in [`Arc`] for its reference counting.
</span></span></span><span class="line"><span class="cl"><span class="sd">///
</span></span></span><span class="line"><span class="cl"><span class="sd">/// Types that implement this trait can get a [`Waker`] by calling [`ref_waker`].
</span></span></span><span class="line"><span class="cl"><span class="sd"></span><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">ArcWake</span>: <span class="nb">Send</span> <span class="o">+</span><span class="w"> </span><span class="nb">Sync</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="sd">/// Wakes a task up.
</span></span></span><span class="line"><span class="cl"><span class="sd"></span><span class="w">    </span><span class="cp">#[klint::preempt_count(expect = 0..)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">wake_by_ref</span><span class="p">(</span><span class="bp">self</span>: <span class="nc">ArcBorrow</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span><span class="o">&gt;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="sd">/// Wakes a task up and consumes a reference.
</span></span></span><span class="line"><span class="cl"><span class="sd"></span><span class="w">    </span><span class="cp">#[klint::preempt_count(expect = 0..)]</span><span class="w"> </span><span class="c1">// Functions callable from `wake_up` must not sleep
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">    </span><span class="k">fn</span> <span class="nf">wake</span><span class="p">(</span><span class="bp">self</span>: <span class="nc">Arc</span><span class="o">&lt;</span><span class="bp">Self</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">as_arc_borrow</span><span class="p">().</span><span class="n">wake_by_ref</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span></span></span></code></pre></div>
<p>These annotations and inferred results are absent from <code>rustc</code>'s metadata, so <code>klint</code> persists them in a separate metadata file. Similar to <code>clippy</code>, <code>klint</code> is implemented as a custom <code>rustc</code> driver, so to use it, simply replace <code>rustc</code> invocations with <code>klint</code>.</p>
<h2 id="klint-in-action"><code>klint</code> in Action</h2>
<p><a href="https://github.com/Rust-for-Linux/linux/pull/958">https://github.com/Rust-for-Linux/linux/pull/958</a> is an experimental PR which includes necessary changes to Rust for Linux code to make it work with <code>klint</code>. While still not production-ready, <code>klint</code> is already able to find bugs.</p>
<p>If the <code>FIXME</code> line in <code>rust/kernel/kasync/executor/workqueue.rs</code> is commented out, compiling the &quot;Rust&quot; branch (note that this is the branch with experimental code and is not the branch for upstreaming) with <code>klint</code> will fail with the following error:</p>
<pre class="code-block">
<b><span class="code-red">error</span>: trait method annotated to have preemption count expectation of 0..</b>
   <span class="code-blue">--&gt;</span> rust/kernel/kasync/executor/workqueue.rs:147:5
    <span class="code-blue">|</span>
<span class="code-blue">147 |</span>     fn wake(self: Arc&lt;Self&gt;) {
    <span class="code-blue">|</span>     <span class="code-red">^^^^^^^^^^^^^^^^^^^^^^^^</span>
    <span class="code-blue">|</span>
    <span class="code-blue">=</span> <b>note</b>: but the expectation of this implementing function is 0
<span class="code-green">note</span>: the trait method is defined here
   <span class="code-blue">--&gt;</span> rust/kernel/kasync/executor.rs:73:5
    <span class="code-blue">|</span>
<span class="code-blue">73  |</span>     fn wake(self: Arc&lt;Self&gt;) {
    <span class="code-blue">|</span>     <span class="code-green">^^^^^^^^^^^^^^^^^^^^^^^^</span>
<span class="code-green">note</span>: which may drop type `kernel::sync::Arc&lt;kernel::kasync::executor::workqueue::Task&lt;core::future::from_generator::GenFuture&lt;[static generator@samples/rust/rust_echo_server.rs:25:75: 31:2]&gt;&gt;&gt;` with preemption count 0..
   <span class="code-blue">--&gt;</span> rust/kernel/kasync/executor/workqueue.rs:149:5
    <span class="code-blue">|</span>
<span class="code-blue">147 |</span>     fn wake(self: Arc&lt;Self&gt;) {
    <span class="code-blue">|</span>             <span class="code-blue">---- value being dropped is here</span>
<span class="code-blue">148 |</span>         Self::wake_by_ref(self.as_arc_borrow());
<span class="code-blue">149 |</span>     }
    <span class="code-blue">|</span>     <span class="code-green">^</span>
<span class="code-green">note</span>: which may call this function with preemption count 0..
   <span class="code-blue">--&gt;</span> rust/kernel/sync/arc.rs:236:5
    <span class="code-blue">|</span>
<span class="code-blue">236 |</span>     fn drop(&mut self) {
    <span class="code-blue">|</span>     <span class="code-green">^^^^^^^^^^^^^^^^^^</span>
<span class="code-green">note</span>: which may drop type `kernel::sync::arc::ArcInner&lt;kernel::kasync::executor::workqueue::Task&lt;core::future::from_generator::GenFuture&lt;[static generator@samples/rust/rust_echo_server.rs:25:75: 31:2]&gt;&gt;&gt;` with preemption count 0..
   <span class="code-blue">--&gt;</span> rust/kernel/sync/arc.rs:255:22
    <span class="code-blue">|</span>
<span class="code-blue">255 |</span>             unsafe { core::ptr::drop_in_place(inner) };
    <span class="code-blue">|</span>                      <span class="code-green">^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^</span>
    <span class="code-blue">=</span> <b>note</b>: which may drop type `kernel::kasync::executor::workqueue::Task&lt;core::future::from_generator::GenFuture&lt;[static generator@samples/rust/rust_echo_server.rs:25:75: 31:2]&gt;&gt;` with preemption count 0..
    <span class="code-blue">=</span> <b>note</b>: which may drop type `kernel::sync::Arc&lt;kernel::kasync::executor::workqueue::Executor&gt;` with preemption count 0..
<span class="code-green">note</span>: which may call this function with preemption count 0..
   <span class="code-blue">--&gt;</span> rust/kernel/sync/arc.rs:236:5
    <span class="code-blue">|</span>
<span class="code-blue">236 |</span>     fn drop(&mut self) {
    <span class="code-blue">|</span>     <span class="code-green">^^^^^^^^^^^^^^^^^^</span>
<span class="code-green">note</span>: which may drop type `kernel::sync::arc::ArcInner&lt;kernel::kasync::executor::workqueue::Executor&gt;` with preemption count 0..
   <span class="code-blue">--&gt;</span> rust/kernel/sync/arc.rs:255:22
    <span class="code-blue">|</span>
<span class="code-blue">255 |</span>             unsafe { core::ptr::drop_in_place(inner) };
    <span class="code-blue">|</span>                      <span class="code-green">^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^</span>
    <span class="code-blue">=</span> <b>note</b>: which may drop type `kernel::kasync::executor::workqueue::Executor` with preemption count 0..
    <span class="code-blue">=</span> <b>note</b>: which may drop type `kernel::Either&lt;kernel::workqueue::BoxedQueue, &kernel::workqueue::Queue&gt;` with preemption count 0..
    <span class="code-blue">=</span> <b>note</b>: which may drop type `kernel::workqueue::BoxedQueue` with preemption count 0..
<span class="code-green">note</span>: which may call this function with preemption count 0..
   <span class="code-blue">--&gt;</span> rust/kernel/workqueue.rs:433:5
    <span class="code-blue">|</span>
<span class="code-blue">433 |</span>     fn drop(&mut self) {
    <span class="code-blue">|</span>     <span class="code-green">^^^^^^^^^^^^^^^^^^</span>
    <span class="code-blue">=</span> <b>note</b>: but this function expects preemption count 0
</pre>
<p>The problematic call trace that <code>klint</code> supplies ends at <code>BoxedQueue::drop</code>. If we navigate to that function, we will see that it ends with a call to <code>destroy_workqueue</code>, which indeed might sleep. This can happen if the <code>Waker</code> is invoked after the executor has been dropped and its tasks cancelled.</p>
<h2 id="limitation">Limitation</h2>
<p>Currently, <code>klint</code> does not have a way to represent a <code>try_lock</code>-like function for spinlocks (<code>try_lock</code> for mutexes is fine as it doesn't change the preemption count).</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">SpinLock</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Preemption count adjustment of this function is 0 or 1 depending on the variant of the return value.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">    </span><span class="k">fn</span> <span class="nf">try_lock</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">Guard</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="n">WriteLock</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span></span></span></code></pre></div>
<p>Although it's possible to rewrite the <code>try_lock</code> function to take a callback to avoid this limitation:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">SpinLock</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Preemption count adjustment of this function is 0!
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">    </span><span class="k">fn</span> <span class="nf">try_lock</span><span class="o">&lt;</span><span class="n">R</span><span class="p">,</span><span class="w"> </span><span class="n">F</span>: <span class="nb">FnOnce</span><span class="p">(</span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">Guard</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="n">WriteLock</span><span class="o">&gt;&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">R</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">f</span>: <span class="nc">F</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">R</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span></span></span></code></pre></div>
<p>this is not as easy to use as a simple <code>try_lock</code> function that returns an <code>Option</code>.</p>
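<p>To illustrate the ergonomic difference, here's a sketch of both styles built on <code>std::sync::Mutex</code> purely for comparison -- the kernel lock types and <code>klint</code> semantics are not involved, and <code>try_lock_with</code> is a made-up helper:</p>

```rust
use std::sync::Mutex;

// Callback-style try_lock, analogous to the sketch above: the closure
// receives `Some(data)` if the lock was acquired, `None` otherwise, and
// the lock is released before this function returns.
fn try_lock_with<T, R>(lock: &Mutex<T>, f: impl FnOnce(Option<&mut T>) -> R) -> R {
    match lock.try_lock() {
        Ok(mut guard) => f(Some(&mut *guard)),
        Err(_) => f(None),
    }
}

fn main() {
    let lock = Mutex::new(0u32);

    // Option-returning style: short, direct, early-return friendly.
    if let Ok(mut guard) = lock.try_lock() {
        *guard += 1;
    }

    // Callback style: the same logic, nested inside a closure.
    let updated = try_lock_with(&lock, |data| match data {
        Some(value) => {
            *value += 1;
            true
        }
        None => false,
    });

    assert!(updated);
    assert_eq!(*lock.lock().unwrap(), 2);
}
```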
<p>Similarly, this pattern is not yet supported by <code>klint</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">take_lock</span>: <span class="kt">bool</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="n">take_lock</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">spin_lock</span><span class="p">(</span><span class="o">..</span><span class="p">.);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="n">take_lock</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">spin_unlock</span><span class="p">(</span><span class="o">..</span><span class="p">.);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span></span></span></code></pre></div>
<p>We anticipate that this pattern will be less common in Rust code thanks to RAII guards, but similar patterns can still arise from implicit drop flags introduced by the compiler:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">take_lock</span>: <span class="kt">bool</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">guard</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// An implicit bool will be introduced here by the compiler to track if `guard` is initialised
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="n">take_lock</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">guard</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="no">SPINLOCK</span><span class="p">.</span><span class="n">lock</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// An implicit branch will be introduced here by the compiler to drop `guard` only if it has been initialised
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="p">}</span></span></span></code></pre></div>
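<p>The drop-flag behaviour above can be observed in plain Rust. In this sketch, <code>Guard</code> is an assumed stand-in type with a counting <code>Drop</code> impl (unrelated to any kernel type), and its destructor runs only on the path where it was initialised:</p>

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times a `Guard` has been dropped.
static DROPS: AtomicUsize = AtomicUsize::new(0);

// Stand-in for a lock guard; its only job is to make drops observable.
struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

fn foo(take_lock: bool) {
    let _guard;
    // The compiler inserts a hidden drop flag here to record whether
    // `_guard` was initialised on this path.
    if take_lock {
        _guard = Guard;
    }
    // At the end of scope, a compiler-inserted branch checks the drop
    // flag and only runs `Guard::drop` if it was set.
}

fn main() {
    foo(true);
    foo(false);
    // Only the `take_lock == true` call dropped a guard.
    assert_eq!(DROPS.load(Ordering::SeqCst), 1);
}
```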
<p>We’d like to thank Futurewei for generously funding the work covered in this blog post!</p>
<h2 id="future-work">Future Work</h2>
<p>While <code>klint</code> has already proven useful, it is still largely a prototype and needs more work to be production-ready. Some possible future work includes:</p>
<ul>
<li>
<p>Extending the analysis to work with acquiring/releasing locks conditionally (as discussed in the previous section);</p>
</li>
<li>
<p>Improving diagnostic messages to be more intuitive for kernel programmers (e.g. use <code>this function might sleep</code> as opposed to <code>this function expects preemption count 0</code>);</p>
</li>
<li>
<p>Exploring ways to annotate FFI functions automatically or to derive annotations from C source files;</p>
</li>
<li>
<p>Expanding the checks to cover misuse of raw atomic contexts;</p>
</li>
<li>
<p>Integrating <code>klint</code> into CI.</p>
</li>
</ul>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/gary-guo-klint-rust-tools/</guid>
      </item><item>
        <title>Improving Rust compile times to enable adoption of memory safety</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/remy-rakic-compile-times/</link>
        <pubDate>Tue, 31 Jan 2023 12:00:00 +0000</pubDate>
        <description><![CDATA[<div class="">
  <blockquote class="blockquote">
    <span class="quote"></span>
    <div class="quote-text">
      <p class="font-italic lh-170">Rémy Rakic is helping us enable the adoption of memory safe software through work to improve Rust compile times. We asked him to provide his perspective on the work in this blog post. Thank you for your partnership and contributions, Rémy!</p>
      <footer class="blockquote-footer"><cite title="Source Title">Josh Aas, Head of ISRG's Prossimo project</cite></footer>
    </div>
  </blockquote>
</div>
<h2 id="introduction">Introduction</h2>
<p>Over the past few months I've been working as part of the Rust compiler performance working group on the initiative for better build times. I've been working on this effort through a contract with Prossimo that was generously supported by Google.</p>
<h2 id="context">Context</h2>
<p>Rust compile times are very often brought up as an area needing improvement, including in the yearly <a href="https://blog.rust-lang.org/2022/02/15/Rust-Survey-2021.html">community survey</a>. Slow compile times can be a barrier to adoption, and improving them could therefore help broaden the language's impact. That's one of Prossimo's goals: improving the potential for more memory safe software, particularly in the most critical parts of Internet infrastructure (e.g., networking, TLS, DNS, OS kernels).</p>
<p>In my mind, Rust has historically focused more on runtime performance than on compilation times, much like <a href="https://llvm.org/">LLVM</a>, one of the most important components used in Rust compilation. That's a common story for modern compilers, in both engineering and academia. Unlike some older languages designed around a lightning-fast single-pass compiler, Rust did not treat compilation speed as its most important design principle: the primary focus of the designers was making sure the language would offer the desired safety guarantees without compromising on the performance of programs.</p>
<div class="card border-0 pic-quote-right mw-240px mx-auto mb-4">
  <img alt="Rust Logo" class="rounded mx-auto img-fluid" src="/images/blog/Blog-2023-01-31-rust-logo.png" />
</div>
<p>Nonetheless, compile times have received a lot more attention recently and have noticeably improved over the years. There's more work we can do though. To help move things forward, tools and processes have been adopted and refined over time to help foster a culture of performance. Examples include:</p>
<ul>
<li>
<p>A suite of benchmarks for the compiler, used for every PR that is merged. Each one is benchmarked under various use-cases: in check, debug or release modes, with or without incremental compilation (with different granularity of changes), with many different hardware metrics that are recorded and graphed over time, as well as from the compiler's internal profiling infrastructure.</p>
</li>
<li>
<p>These benchmarks can be triggered on-demand for a PR, prior to merging, in order to avoid surprises.</p>
</li>
<li>
<p>A summary of the results is posted on each merged PR, to notify the authors, reviewers, and the working group, if a PR needs attention.</p>
</li>
<li>
<p>A weekly triage process to summarize the week's results and keep a friendly human in the loop when there are calls to be made: weeding out sources of noise in the results (it happens sometimes), distinguishing small inconsequential regressions that can be ignored from ones that require more work, and catching unforeseen performance issues that require a revert. We also celebrate the wins!</p>
</li>
<li>
<p>These summaries are used to notify the compiler team in their weekly meeting of the recent progress, as well as the community, via This Week in Rust.</p>
</li>
</ul>
<h2 id="priorities">Priorities</h2>
<p>I worked with Prossimo on the following priorities:</p>
<ul>
<li>
<p>Make pipelined compilation as efficient as possible</p>
</li>
<li>
<p>Improve raw compilation speed</p>
</li>
<li>
<p>Improve support for persistent, cached and distributed builds</p>
</li>
</ul>
<p>We started by looking for what's slow. Looking at this holistically, at the level of whole crate builds, can provide new insights, especially since it has rarely been done before. I gathered the 1000 most popular crates from <a href="http://crates.io/">crates.io</a> and <a href="https://github.com/lqd/rustc-benchmarking-data/tree/main/results">gathered data</a> for complete cargo builds, including dependencies. I also gathered rustc self-profiling data for a higher-level view and profiled for sources of high memory usage. All of this was done in check, debug, and release modes, with varying degrees of parallelism.</p>
<p>From this high level view, we could see a few promising ways to move forward:</p>
<ul>
<li>
<p>Improvements to the compilation pipeline: profiling to find sources of slowness, and then find solutions and mitigations to these issues. That could be in rustc, cargo, or even rustup.</p>
</li>
<li>
<p>Improve compile times of targeted crates: if popular crates contain sources of slowness, this in turn impacts all the crates that transitively depend on them. In some situations, it's possible to improve the crate itself in addition to the compiler and tools.</p>
</li>
<li>
<p>Preventing future slowness: analyzing, tracking, mitigating regressions and bugs (e.g., incremental compilation issues that could lead to turning the feature off, as has happened before).</p>
</li>
<li>
<p>And finally, help people achieve the above (both contributors and crate authors). People often want to see the sources of slowness in their own projects, and having the compiler display this information would help them organize or refactor their code accordingly.</p>
</li>
</ul>
<p>Based on these findings, the compiler performance working group drafted a <a href="https://hackmd.io/YJQSj_nLSZWl2sbI84R1qA">roadmap</a>, <a href="https://hackmd.io/d9uE7qgtTWKDLivy0uoVQw">updated our benchmark suite</a> to stay representative of the practices people use, and developed <a href="https://github.com/rust-lang/rustc-perf/pull/1318">a policy</a> to periodically refresh the benchmarks so they stay relevant. We saw new hotspots and inefficiencies in these new crates, along with some surprising findings about pipelining and scheduling, the common presence of build scripts, and the relative importance of proc-macros.</p>
<h2 id="an-overview-of-the-items-i-worked-on">An Overview of the Items I Worked On</h2>
<p>The Compile-Time Function Evaluation Engine is seeing more use with the ongoing improvements and expansions to the &quot;const&quot; parts of the language, so its efficiency is important and will matter more and more in the future. Some speedups were made here in the interning of allocations (which matters, for example, when traversing static arrays) by <a href="https://github.com/rust-lang/rust/pull/97585">limiting it</a> to the allocations that are known to contain references or interior mutability. Some of these buffers can be big (another area where improvements can be made in the future), and they also carry derived data, such as masks tracking whether each byte of an allocation is correctly initialized. The hashing algorithm the compiler uses (<a href="https://github.com/rust-lang/rustc-hash/">FxHash</a>) is better suited to shorter keys, so we were able to limit some of the costly interning work <a href="https://github.com/rust-lang/rust/pull/98097">by hashing less data overall</a>.</p>
<p>Hashing is used heavily in the compiler, and improvements to FxHash (or switching to a different algorithm) would have noticeable effects on compile times. While FxHash is <a href="https://nnethercote.github.io/2021/12/08/a-brutally-effective-hash-function-in-rust.html">surprisingly effective</a>, we tried and measured an <a href="https://github.com/rust-lang/rustc-hash/pull/18">interesting variation</a> that we ultimately chose not to land: it performed slightly differently on Intel vs AMD CPUs (looking slightly better on the latter on some metrics). There are still possible improvements to be made for bigger buffers, for example, where we could make better use of SIMD, but at the moment rustc still targets baseline x86-64 CPUs (SSE2), so that's a work item left for the future.</p>
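<p>For a sense of why FxHash favours short keys, here is a simplified sketch of its core 64-bit mixing step, as found in the <code>rustc-hash</code> crate at the time: one rotate, one xor, and one wrapping multiply per word. The real hasher also processes byte-slice tails smaller than a word, which is omitted here.</p>

```rust
// FxHash's 64-bit multiply constant (from the rustc-hash crate).
const K: u64 = 0x51_7c_c1_b7_27_22_0a_95;

// Fold one word into the running hash: rotate, xor, wrapping multiply.
fn fx_add_word(hash: u64, word: u64) -> u64 {
    (hash.rotate_left(5) ^ word).wrapping_mul(K)
}

// Hash a slice of whole words (tail handling for partial words omitted).
fn fx_hash_words(words: &[u64]) -> u64 {
    words.iter().fold(0, |h, &w| fx_add_word(h, w))
}

fn main() {
    // Very cheap per word, which is why it shines on the short keys the
    // compiler hashes most often -- and why long buffers are costlier.
    let h = fx_hash_words(&[1, 2, 3]);
    assert_eq!(h, fx_hash_words(&[1, 2, 3])); // deterministic
}
```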
<p>A related area is memory allocation (there's a saying that &quot;rustc is a hashing and memory allocation benchmark in disguise&quot;), and we already use jemalloc on Linux and macOS. Since this is an active area of performance work for systems software, we regularly try alternative allocators as they keep improving, for example Microsoft's <a href="https://github.com/rust-lang/rust/pull/103944">mimalloc</a> or <a href="https://github.com/microsoft/snmalloc">snmalloc</a>. We've also tracked, tested, and updated rustc to the long-awaited <a href="https://github.com/rust-lang/rust/pull/96790">jemalloc 5.3 release</a>. Due to the way rustc is architected in order to be usable from our other tools, using custom allocators is harder than in a regular Rust program (<code>#[global_allocator]</code> can't be used, and there are additional requirements due to interacting with LLVM), so there are still improvements to be made here: some (hard-to-avoid) inefficiencies on macOS remain, and Windows is still using the default system allocator.</p>
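<p>For contrast, here's what the <code>#[global_allocator]</code> mechanism looks like in an ordinary Rust program -- the mechanism rustc itself cannot use. In this sketch, <code>std::alloc::System</code> stands in for an allocator like jemalloc or mimalloc, which would normally come from a crate:</p>

```rust
use std::alloc::System;

// In an ordinary Rust binary, swapping the global allocator is a single
// attribute on a static. (`System` is only a stand-in here; a real program
// would typically use an allocator type from a crate such as
// `tikv-jemallocator` or `mimalloc`.)
#[global_allocator]
static GLOBAL: System = System;

fn main() {
    // Every heap allocation below is routed through `GLOBAL`.
    let v: Vec<u32> = (0..4).collect();
    let s = String::from("allocated via the chosen allocator");
    assert_eq!(v.len(), 4);
    assert!(!s.is_empty());
}
```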
<p>As we mentioned above, rustc for x86-64 targets is built and distributed for baseline x86-64 CPUs, so we also tried to see whether performance would improve with more recent microarchitecture levels: <a href="https://github.com/rust-lang/rust/pull/95302#issuecomment-1079837472">x86-64-v2 (&quot;SSE4&quot;)</a> and <a href="https://github.com/rust-lang/rust/pull/95302#issuecomment-1080142891">x86-64-v3 (&quot;AVX/AVX2&quot;)</a>. While there are improvements via auto-vectorization, they are not yet worth the complications that distributing such artifacts would bring to the infrastructure and CI. It remains an interesting area for future work for both the libs and compiler teams: having the compiler and standard library able to make use of modern SIMD algorithms more easily would bring benefits beyond auto-vectorization.</p>
<p>We mentioned above that the Linux and macOS targets use a custom allocator, but that's part of a bigger set of <a href="https://github.com/rust-lang/rust/issues/103595">configuration and distribution features</a> aimed at better performance, each of which can independently be enabled or disabled (mostly depending on whether the CI builders have the capacity to do the additional work). For example, in addition to using a custom allocator: doing Link-Time Optimization when building LLVM or rustc, doing Profile-Guided Optimization for LLVM or rustc, and using tools to optimize the final binaries (e.g. <a href="https://github.com/llvm/llvm-project/tree/main/bolt">BOLT</a>). All of these are enabled on <code>x86_64-unknown-linux-gnu</code>. In the most recent survey, 60-70% of respondents answered that they were mainly targeting Linux, but more than 30% were also targeting macOS, and similarly more than 30% were also targeting Windows. We've therefore worked on improving these latter two targets to match the level of polish seen by Linux users. I was able to update our <a href="https://github.com/rust-lang/rust/pull/96978">bootstrap code, CI</a> and perf collector to use PGO for LLVM and rustc on Windows, as well as enabling <a href="https://github.com/rust-lang/rust/pull/103591">ThinLTO when building rustc</a> for additional improvements. The macOS builders didn't have the capacity to support such a change at the time, but the situation has since improved, and similar improvements will be made there in the future (<a href="https://github.com/rust-lang/rust/pull/105845">ThinLTO for rustc</a> has already been enabled on nightly).</p>
<p>We were able to upgrade and deduplicate some rustc dependencies (and remove a now-unused jemalloc wrapper) to make the contributor experience slightly better, by improving rustc compile times locally and on CI.</p>
<p>We looked at speeding up the parts of &quot;trait coherence&quot; (checking that there is at most one implementation of a trait for any given type) that handle negative impls (an unstable feature used in parts of the standard library), by <a href="https://github.com/rust-lang/rust/pull/93343">doing fewer traversals</a> when looking for their dedicated attributes.</p>
<p>Cargo and rustc support <a href="https://github.com/rust-lang/cargo/issues/6660">&quot;pipelining&quot;</a> for faster compilation: if there's no linking involved, a crate can start compiling before its dependencies have completely finished their build; it only needs what is called &quot;metadata&quot; from them. So cargo asks rustc to emit the metadata first, and as soon as it's available, dependent crates can start building if there's enough parallelism available. In benchmarks, we saw that the popular library <code>hyper</code> wasn't benefiting from this pipelining, and neither were its users when building it. There were several ways to fix this, one of which took advantage of a cargo feature that was still under development, so we opted to help test and benchmark cargo's unstable <code>--crate-type</code> option, and helped <code>hyper</code> <a href="https://github.com/hyperium/hyper/pull/2770">use it</a> to fix the issue.</p>
<p>The benchmarks showed that <code>build.rs</code> scripts were surprisingly common and could be slow to compile. They highlighted missing pieces <a href="https://github.com/rust-lang/rfcs/pull/3239">in the language</a>: this cargo-specific mechanism is often needed only to achieve goals tied to a target or language version (MSRV), yet such scripts were slower to build than we'd expect and involved <a href="https://github.com/rust-lang/rustup/issues/3031">overhead from rustup</a>. We helped revive the RFC, so that <a href="https://github.com/rust-lang/log/issues/489">some of these scripts</a> could be removed in the future to improve compile times, and made plans for compiling the remaining ones faster.</p>
<p>Most of the time, users will not rebuild their whole set of dependencies (only after upgrading rustc, changing <code>RUSTFLAGS</code>, switching feature combinations, etc.), but that happens often on CI (on builders with generally low core counts), and there were some improvements we could make in cargo here. First, better defaults could be chosen when compiling build dependencies (build scripts, proc-macros, and their dependencies): in particular, debuginfo is less commonly useful there than for an actual binary or library. We've made a <a href="https://github.com/lqd/rustc-benchmarking-data/tree/main/experiments/cargo-build-defaults">prototype and benchmarked it</a> on the 1000 crates dataset. There are <a href="https://github.com/rust-lang/cargo/pull/11252">open PRs</a> to add this to cargo, but review hasn't finished yet.</p>
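<p>Projects can already opt into similar behavior today with cargo's stable profile overrides. A minimal sketch:</p>

```toml
# In Cargo.toml: build scripts, proc-macros, and their dependencies
# are compiled with the `build-override` settings below, separately
# from the profile used for the final binary/library.
[profile.dev.build-override]
debug = false    # skip debuginfo for build dependencies
opt-level = 0    # keep their compilation itself fast
```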
<p>Then we also saw that proc-macros could hinder build parallelism: they seemed to be built later than we anticipated, sometimes participating in pipeline stalls. Looking at cargo's timing graphs, there were cases where compilation could be improved by changing the scheduling of the crate graph. Some prototypes were made and benchmarked on the 1000 crates dataset. The <a href="https://github.com/lqd/rustc-benchmarking-data/tree/main/experiments/cargo-schedules/pending-queue-sorted">first one</a> made use of cargo's existing notion of &quot;priority&quot; (a proxy for the number of work items depending on a crate) to bias towards higher-priority crates whenever the next crate to build was chosen, and has since <a href="https://github.com/rust-lang/cargo/pull/11032">landed</a> in cargo. This is noticeable with a large number of dependencies or at low core counts (matching the configuration seen on CI, especially on free tiers). The <a href="https://github.com/lqd/rustc-benchmarking-data/tree/main/experiments/cargo-schedules/pending-queue-prioritized">second prototype</a> additionally allowed scheduling hints, so users could assign higher priorities to some crates (in the spirit of this <a href="https://github.com/rust-lang/cargo/issues/7437">feature request</a>), for example to build common proc-macro crates sooner (or any crate that could make better use of parallelism by being scheduled differently).</p>
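<p>The timing graphs mentioned above can be produced with cargo's built-in timings support, for example:</p>

```shell
# Record per-crate compilation timings and concurrency during a build;
# cargo writes an HTML report under target/cargo-timings/.
cargo build --timings
```

<p>The report shows which crates were compiled when, making pipeline stalls like those described above visible.</p>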
<p>And finally, to help users see whether proc-macros slow down their builds, I <a href="https://github.com/rust-lang/rust/pull/95689">expanded the self-profiler</a> to show <a href="https://github.com/rust-lang/rust/pull/95473">proc-macro expansion</a> and to <a href="https://github.com/rust-lang/rust/pull/95739">opt into finer details on demand</a>, making it possible to analyze each proc-macro use and its duration. (Similarly supporting regular macros would be interesting in the future.)</p>
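<p>The self-profiler is an unstable nightly feature; a sketch of how it can be invoked through cargo (flag names may change, since the interface is not stabilized):</p>

```shell
# Profile compilation with nightly rustc's self-profiler; results are
# written as measureme files that tools like `summarize` can report on.
RUSTFLAGS="-Zself-profile -Zself-profile-events=default,args" \
    cargo +nightly build
```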
<p>In the same spirit, since monomorphization is often a compilation-time cost that is not easy to see, work on showing the number and sizes of generic function instantiations <a href="https://github.com/rust-lang/rust/pull/105481">has been started</a>. (It's helpful, but there are still improvements to be made in the future: this is currently computed on the MIR, while stats on the actual LLVM IR emitted, like <a href="https://github.com/dtolnay/cargo-llvm-lines/">cargo llvm-lines</a> provides, would be even more helpful.)</p>
<p>We were able to benchmark <a href="https://github.com/bjorn3/rustc_codegen_cranelift">bjorn3's cranelift codegen backend</a> on full crates as well as on the build dependencies specifically (since they're also built for <code>cargo check</code> builds, and are always built without optimizations): there were no issues, and it performed impressively. It's well on its way to becoming a viable alternative to the LLVM backend for debug builds.</p>
<p>As improving compilation times for specific popular crates would be impactful throughout the ecosystem (in particular when rustc itself could have different tradeoffs when fixing a given issue), we noticed a few hotspots and made PRs to help out <a href="https://github.com/async-rs/async-std/pull/1004">async-std</a>, <a href="https://github.com/dtolnay/quote/pull/210">quote</a>, <a href="https://github.com/diesel-rs/diesel/pull/3163">diesel</a>.</p>
<p>I also restarted a long-term piece of work that is still ongoing: the preparation required to <a href="https://github.com/rust-lang/compiler-team/issues/510">switch to the lld linker by default on Linux</a>. People tend to use the default linker (often <code>ld.bfd</code>), which is quite a bit slower than lld. Making cargo/rustc use it <a href="https://perf.rust-lang.org/compare.html?start=0d13f6afeba4935499abe0c9a07426c94492c94e&amp;end=1b74e096b9bfb06f84a3007193dcd2f059cbdf6a">would be noticeable</a> for projects that aren't already using a faster linker.</p>
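<p>Until that becomes the default, projects can opt into a faster linker themselves. One common configuration sketch, assuming clang and lld are installed:</p>

```toml
# In .cargo/config.toml: link with lld via clang on Linux.
[target.x86_64-unknown-linux-gnu]
linker = "clang"
rustflags = ["-C", "link-arg=-fuse-ld=lld"]
```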
<p>We saw opportunities to remove some items from crate metadata: sometimes information is stored redundantly, sometimes it's only used by a tool (e.g. rustdoc), and have started <a href="https://github.com/rust-lang/rust/pull/98450">this clean-up</a>. More work in this direction could be interesting in the future (and a good amount has already been done by other contributors): improvements in this area apply to the loading of all crates, and speeding up the decoding of libstd/libcore's metadata in particular (although that is already quite fast) would ultimately benefit most compilation sessions.</p>
<p>As mentioned earlier, a big part of getting faster is not getting slower, so there were a lot of regression analyses (including the ones specific to our infrastructure, where we often roll up multiple PRs into one in order to save CI time) in performance and incremental compilation for example.</p>
<h2 id="conclusion">Conclusion</h2>
<p>This was a quick look at some of the work achieved in 2022, and other members of the working group have also published similar reports: <a href="https://nnethercote.github.io/">Nick Nethercote</a> does so regularly, and Jakub Beránek <a href="https://kobzol.github.io/rust/rustc/2022/10/27/speeding-rustc-without-changing-its-code.html">recently did</a> as well. Many others have also contributed various improvements, from the standard library to the infrastructure; all of that work combined resulted in noticeable improvements to compile times.</p>
<p>The compiler performance working group has completed many if not all of the items in the roadmap, but performance work is never really done, and continues as we speak. Explorations and plans are being drafted for 2023, for example reviving the effort dedicated to making better use of parallelism in the compiler, and more.</p>
<p>I'd also like to thank Felix Klock and Wesley Wiser for their ideas, time, and guidance, the other members of the working group Ryan Levick, Mark Rousskov, Jakub Beránek and Nick Nethercote for their help, talent and the great work they did, and Prossimo for giving me the opportunity to contribute to that effort.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/remy-rakic-compile-times/</guid>
      </item><item>
        <title>Assessing Progress on Memory Safety at USENIX Enigma Conference</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/memory-safety-at-usenix/</link>
        <pubDate>Thu, 26 Jan 2023 12:00:00 +0000</pubDate>
        <description><![CDATA[<p>I had the pleasure of discussing the state of memory safety at this year's USENIX Enigma conference, one of our industry's leading fora for important security and privacy issues. There were several salient points made about how to eliminate code that lacks memory safety, and I want to highlight a few that I see as most actionable.</p>
<ul>
<li>
<p>Start by insisting that new modules and programs be written in memory safe languages. It's going to take a while to replace unsafe code that has already been written, but we can stop creating more unsafe code now.</p>
</li>
<li>
<p>We don't need to rewrite everything at once. Pick the most security-critical modules and start there. It can be relatively easy to replace an unsafe module with a safe one.</p>
</li>
<li>
<p>Maintainers don't necessarily need to learn another language. For example, many Rust-based modules come with C APIs so you can integrate them easily without needing to know Rust.</p>
</li>
<li>
<p>Help stakeholders understand that we don't have to live with the constant stream of memory safety vulnerabilities that come out of code that is not memory safe. It will take some work, but we have the knowledge and tools to make memory safety vulnerabilities a rarity.</p>
</li>
</ul>
<p>In 2023, Prossimo will continue to make headway on improving memory safety in critical infrastructure. I'm particularly excited about new work on the memory safe TLS library called <a href="/initiative/rustls/">Rustls</a>.</p>
<p>I'd like to thank Yael Grauer for organizing this panel, and my fellow panelists, Amira Dhalla and Alex Gaynor for the energizing conversation. Thank you to the USENIX Enigma conference and its organizers for giving us the opportunity to discuss and get the word out about this important topic. Finally, thanks to our funders who have supported our memory safety efforts: Acton Family Giving, AWS, Cisco, Futurewei, Fly.io, and Google.</p>
<p>If you're interested in learning more about memory safety, check out this <a href="https://advocacy.consumerreports.org/research/report-future-of-memory-safety/">new report</a> from Consumer Reports.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/memory-safety-at-usenix/</guid>
      </item><item>
        <title>A Year-End Letter from our Executive Director</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/ed-letter-2022/</link>
        <pubDate>Mon, 05 Dec 2022 12:00:00 +0000</pubDate>
        <description><![CDATA[<p><em>This letter was originally published in our <a href="https://www.abetterinternet.org/documents/2022-ISRG-Annual-Report.pdf">2022 annual report</a>.</em></p>
<p>The past year at ISRG has been a great one and I couldn't be more proud of our staff, community, funders, and other partners that made it happen. <a href="https://letsencrypt.org/">Let's Encrypt</a> continues to thrive, serving more websites around the world than ever before with excellent security and stability.</p>
<p>A particularly big moment was when Let's Encrypt surpassed 300,000,000 websites served. When I was informed that we had reached that milestone, my first reaction was to be excited and happy about how many people we've been able to help. My second reaction, following on quickly after the first, was to take a deep breath and reflect on the magnitude of the responsibility we have here.</p>
<p>The way ISRG is translating that sense of responsibility to action today is probably best described as a focus on agility and resilience. We need to assume that, despite our best efforts trying to prevent issues, unexpected and unfortunate events will happen and we need to position ourselves to handle them.</p>
<p>Back in March of 2020 Let's Encrypt needed to respond to a compliance incident that affected nearly three million certificates. That meant we needed to get our subscribers to renew those three million certificates in a very short period of time or the sites might have availability issues. We dealt with that incident pretty well considering the remediation options available, but it was clear that incremental improvements would not make enough of a difference for events like this in the future. We needed to introduce systems that would allow us to be significantly more agile and resilient going forward.</p>
<p>Since then we've developed a specification for <a href="https://datatracker.ietf.org/doc/draft-ietf-acme-ari/00/">automating certificate renewal signals</a> so that our subscribers can handle revocation/renewal events as easily as they can get certificates in the first place (it just happens automatically in the background!). That specification is making its way through the IETF standards process so that the whole ecosystem can benefit, and we plan to deploy it in production at Let's Encrypt shortly. Combined with other steps we've taken in order to more easily handle renewal traffic surges, Let's Encrypt should be able to respond on a whole different level the next time we need to ask significant numbers of subscribers to renew early.</p>
<p>This kind of work on agility and resilience is critical if we're going to improve security and privacy at scale on the Web.</p>
<p>Our <a href="https://divviup.org/">Divvi Up</a> team has made a huge amount of progress implementing a new service that will bring privacy respecting metrics to millions of people. Applications collect all kinds of metrics: some of them are sensitive, some of them aren't, and some of them seem innocuous but could reveal private information about a person. We're making it possible for apps to get aggregated, anonymized metrics that give insight at a population level while protecting the privacy of the people who are using those apps. Everybody wins - users get great privacy and apps get the metrics they need without handling individual user data. As we move into 2023, we'll continue to grow our roster of beta testers and partners.</p>
<p>Our <a href="https://www.memorysafety.org/">Prossimo</a> project started in 2020 with a clear goal: move security sensitive software infrastructure to memory safe code. Since then, we've gotten a lot of code written to improve memory safety on the Internet.</p>
<p>We're ending the year with <a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8aebac82933ff1a7c8eede18cab11e1115e2062b">Rust support being merged into the Linux kernel</a> and the completion of a <a href="https://github.com/memorysafety/ntpd-rs">memory safe NTP client and server implementation</a>. We're thrilled about the potential for a more memory safe kernel, but now we need to see the development of drivers in Rust. We're particularly excited about an <a href="https://lpc.events/event/16/contributions/1180/attachments/1017/1961/deck.pdf">NVMe driver</a> that shows excellent initial performance metrics while coming with the benefit of never producing a memory safety bug. We are actively working to make similar progress on <a href="https://www.memorysafety.org/initiative/rustls/">Rustls</a>, a high-performance TLS library, and <a href="https://www.memorysafety.org/initiative/dns/">Trust-DNS</a>, a fully recursive DNS resolver.</p>
<p>All of this is made possible by charitable contributions from people like you and organizations around the world. Since 2015, tens of thousands of people have given to our work. They've made a case for corporate sponsorship, given through their DAFs, or set up recurring donations. That's all added up to $17M that we've used to change the Internet for nearly everyone using it. I hope you'll join these people and support us financially if you can.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/ed-letter-2022/</guid>
      </item><item>
        <title>Rust in the Linux Kernel: Just the Beginning</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/rust-in-linux-just-the-beginning/</link>
        <pubDate>Tue, 18 Oct 2022 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>Support for using Rust in the Linux Kernel was <a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8aebac82933ff1a7c8eede18cab11e1115e2062b">recently merged</a> by Linus Torvalds. This is important because Rust is memory safe, meaning that code written in Rust does not suffer from things like buffer overflows, use-after-free, and other memory management vulnerabilities that plague software written in unsafe languages like C and C++. Being able to use Rust in the Linux kernel is an incredible milestone on the road to a more secure future for the Internet and everything else that depends heavily on Linux.</p>
<p>This milestone is the result of amazing work led by <a href="https://ojeda.dev/">Miguel Ojeda</a>. Miguel has been doing his work under contract with our <a href="https://www.memorysafety.org/">Prossimo project</a>, which was made possible with generous financial support from <a href="https://opensource.google/">Google</a>.</p>
<p>We will soon be lending more support to the Rust for Linux project by financially supporting <a href="https://github.com/nbdd0121">Gary Guo</a>'s work on improving the Rust compiler's support for features needed in the kernel. This work has been made possible with generous support from <a href="https://www.futurewei.com/">Futurewei</a>.</p>
<p>While adding Rust support to the Linux kernel is an almost unbelievable achievement requiring years of hard work, this is just the beginning. Now this new capability needs to be used by developing and merging safer device drivers and possibly other kernel components written in Rust.</p>
<p>We are working to identify Rust drivers that would benefit from investment so that we can coordinate fundraising and contractors to help. We're also talking with companies that maintain drivers for their hardware and encouraging them to experiment with moving drivers to Rust.</p>
<p>A lot of progress has been made on an <a href="https://lpc.events/event/16/contributions/1180/attachments/1017/1961/deck.pdf">NVMe driver</a> that we're excited about. We hope to make significant investments in this effort soon. The Android team at Google has been experimenting with porting their Binder IPC driver, and we expect this may be one of the first to achieve production status. Other vendors in the Android ecosystem have also expressed an interest in using Rust for new driver development. There's additionally a <a href="https://twitter.com/LinaAsahi/status/1575343067892051968">GPU driver for Apple's M1</a> platform being worked on by &quot;Asahi Lina,&quot; a member of the <a href="https://asahilinux.org/">Asahi Linux</a> community.</p>
<p>If you'd like to help us support work on Rust drivers for the Linux kernel please get in touch!</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/rust-in-linux-just-the-beginning/</guid>
      </item><item>
        <title>A Memory Safe Implementation of the Network Time Protocol</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/memory-safe-ntp/</link>
        <pubDate>Tue, 11 Oct 2022 00:00:00 +0000</pubDate>
        <description><![CDATA[<div class="card border-0 pic-quote-right mb-4">
    <div class="pt-4">
        <blockquote class="blockquote">
            <span class="quote"></span>
            <div class="quote-text">
                <p class="font-italic lh-170">Folkert and the team at <a href="https://tweedegolf.nl/en" target="_blank" rel="noopener noreferrer">Tweede golf</a> are helping us to build a memory safe NTP implementation. We asked them to share their experience in this blog post. Thank you for your partnership and contributions, Tweede golf team!</p>
                <footer class="blockquote-footer"><cite title="Source Title">Josh Aas, Head of ISRG’s Prossimo project</cite></footer>
            </div>
        </blockquote>
    </div>
</div>
<p>For the last few months we at Tweede golf have been working on implementing a <a href="https://github.com/memorysafety/ntpd-rs">Network Time Protocol (NTP) client and server in Rust</a>.</p>
<p>The project is a <a href="https://www.memorysafety.org/">Prossimo</a> initiative and is supported by their sponsors, Cisco and AWS. Our first short-term goal is to deploy our implementation at <a href="https://letsencrypt.org">Let's Encrypt</a>. The long-term goal is to develop an alternative fully-featured NTP implementation that can be widely used.</p>
<p>In this blog post we'll talk about the process of implementing a new open-source version of the protocol in Rust, why an alternative NTP implementation is important, and our experiences along the way.</p>
<p>Our project is called ntpd-rs. More information and the initial release of the client can be found <a href="https://github.com/memorysafety/ntpd-rs">here</a>.</p>
<h2 id="what-is-ntp">What is NTP?</h2>
<p>The network time protocol synchronizes time between devices connected to a network. Accurate time is essential when your device communicates with other devices, mostly to make sure events are ordered correctly. The device you are reading this article on is probably running an NTP process that regularly synchronizes itself with the real time.</p>
<p>NTP is one of the oldest Internet protocols, and although it is less known than HTTP or DNS for example, the Internet and its billions of devices depend on it every day.</p>
<p>The clocks in our devices are reasonably accurate, but can drift meaningfully in the space of hours. The real time is kept with atomic clocks. Many technology companies and foundations provide NTP servers that make this time available to the Internet.</p>
<p><img src="/images/blog/Blog-2022-10-11-relationship-various-levels-ntp.png" alt="A diagram showing the relationships between the various levels of NTP servers (Wikipedia)"></p>
<p>But if you ask such a server what time it is, then by the time its response reaches you, that time is out of date. You need to somehow correct for the transmission delay.</p>
<p>The NTP client performs this correction and maintains connections with multiple NTP servers for increased reliability.</p>
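<p>Concretely, the correction uses four timestamps: the client's send time (t1), the server's receive time (t2), the server's send time (t3), and the client's receive time (t4). A small illustrative sketch of the standard NTP offset and round-trip delay formulas, assuming symmetric network delay (this is not code from ntpd-rs):</p>

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Standard NTP clock offset and round-trip delay.

    t1: client send, t2: server receive, t3: server send,
    t4: client receive. All in seconds.
    """
    # How far the server's clock is ahead of the local clock,
    # assuming the network delay is the same in both directions.
    offset = ((t2 - t1) + (t3 - t4)) / 2
    # Total time on the network, excluding server processing time.
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay

# Example: server clock ~0.5 s ahead, ~0.1 s one-way network delay.
offset, delay = ntp_offset_delay(10.0, 10.6, 10.6, 10.2)
print(offset, delay)  # approximately 0.5 and 0.2
```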
<h2 id="goals">Goals</h2>
<p>NTP is a foundation of the Internet, and must be absolutely secure and reliable. For example, precise time is used to check TLS certificates. It would be a disaster if an attacker could adjust the system time such that an outdated certificate is seen as valid.</p>
<p>The primary goal for this project is to provide an alternative implementation of NTP, that is just that: secure and reliable.</p>
<h2 id="about-prossimo-and-the-relevance-of-our-project">About Prossimo and the relevance of our project</h2>
<p>Our NTP project fits seamlessly into Prossimo's objectives. <a href="https://www.memorysafety.org/docs/memory-safety/">Memory safety</a> is a requirement to achieve a secure implementation of any critical software. Prossimo's mission states:</p>
<blockquote>
<p>Memory safety for the Internet's most critical infrastructure</p></blockquote>
<p>Simply put: we should provide memory-safe implementations for pieces of software that run the Internet wherever we can. This has materialized in work being done on TLS (Rustls), curl, the Linux kernel, and this project.</p>
<h3 id="security-of-open-source-software">Security of open source software</h3>
<p>Prossimo's mission has recently attracted more attention since the Linux Foundation project OpenSSF published <a href="https://openssf.org/oss-security-mobilization-plan/">The Plan</a> and selected <em>Memory Safety</em> as one of ten points of action to improve security of open source software. The plan is backed by open source foundations including the Rust Foundation, the technology industry, as well as the White House. The execution of the memory safety work in The Plan consists in large part of Prossimo projects.</p>
<p>We will not dive into all the pros and cons of re-implementations in memory-safe languages here, but we do hope to show that these types of efforts do not need to be lengthy and painful. You can deliver solid results far more quickly than you might think, if you use a programming language like Rust, with its strong ecosystem and tools.</p>
<h3 id="rust">Rust</h3>
<p>Using Rust for our implementation means that the client we build is memory safe. We don't do many allocations, but we can be confident that we'll have no segfaults, buffer overflows or memory leaks ever. The same is true for anyone building on top of our implementation.</p>
<p>Another benefit of Rust is that we can use its standard library and package ecosystem, so our NTP implementation is much smaller (hence easier to validate) than the alternatives. This small size also makes it easier to play around with extensions to the NTP specification (e.g. in the development of the next version of NTP, NTPv5).</p>
<p>A nice side-effect of our effort is that more people are now familiar with how NTP works, and hopefully our implementation is more accessible. It's a bit of a niche thing to get into, but as we noted above accurate time is crucial for a lot of modern Internet infrastructure.</p>
<h2 id="our-experiences">Our experiences</h2>
<h3 id="the-protocol">The protocol</h3>
<p>We initially struggled with turning the NTP specification text into code. What seems reasonable at a distance becomes painfully imprecise when you actually implement it. Still, after some deliberation, we were ready to get to work.</p>
<p>The core of our implementation is a collection of data structures and algorithms that implement the logic of NTP. This part does not do any network calls or clock modification, it just describes as data what calls and modifications are needed.</p>
<p>Modeling this core, <code>ntp-proto</code>, as pure functions makes it easily testable, and means we have no unsafe code in this part of the code base.</p>
<p>Our implementation is not a port of the logic of the existing implementations, ntpd or chrony. In fact, we barely looked at their code at all, because we found it hard to map their source code to the NTP specification. Our implementation is based solely on the specification.</p>
<h3 id="the-network">The network</h3>
<p>Our networking requirements go beyond what the Rust standard library provides. NTP uses UDP to send packets, which is well-supported, but we want more.</p>
<p>The NTP algorithm uses send and receive timestamps, which means we have to know the time right before a packet was sent, and right after it was received.</p>
<p>We could create these timestamps in our program with <code>Instant::now</code>, but that time would include the time spent between the socket receiving the message, and our program actually resuming to process the message.</p>
<p>Unix sockets provide more accurate send and receive timestamps. But to get them, we must configure the socket with functions from <code>libc</code>, which is unsafe and wildly underdocumented. After lots of digging, we figured out the mechanism, <a href="https://github.com/rust-lang/libc/pull/2881">made a tiny contribution to libc</a>, and implemented a safe <code>async</code> version of kernel software timestamping.</p>
<h3 id="the-clock">The clock</h3>
<p>Manipulating the system clock is not exposed by the standard library, so here too we must drop down to <code>libc</code>. Luckily we had some experience with manipulating the clock on Linux from our <a href="https://tweedegolf.nl/en/blog/72/announcing-statime-a-rust-ptp-implementation">Precision Time Protocol project</a>.</p>
<h3 id="the-system">The system</h3>
<p>The above components are combined in what NTP calls the &quot;system&quot;: it orchestrates how the NTP daemon interacts with the outside world and the system clock.</p>
<p>The system relies on Rust's <code>async</code> support and the <a href="https://tokio.rs/">tokio</a> async runtime. We use <code>clap</code> to define our command-line interfaces.</p>
<p>The existing C solutions need to build concurrency, argument parsing and other basic functionality from scratch. Having no dependencies brings distribution advantages, but has serious downsides in terms of lines of code and maintainability. Verifying memory safety and other security properties in such a big code base is hard.</p>
<p>In Rust, checking properties of the system is much easier: memory safety is guaranteed by the compiler, and the standard library and many of the popular libraries have undergone serious security scrutiny. Our code, and by extension the work involved in security reviews, is comparatively small.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Once we had a handle on NTP's core ideas, development went smoothly. Setting up all the other parts, networking, clock adjustment and the asynchronous tasks, was a joy.</p>
<p>Our smooth experience strengthens our conviction that Rust is the right choice for projects of this type: all of the protocol logic is completely safe, and only where we interact directly with the OS do we need to reason about unsafe code. We start the subsequent phases of the project with confidence.</p>
<p>Our code is available <a href="https://github.com/memorysafety/ntpd-rs">on GitHub</a>.</p>
<h2 id="roadmap">Roadmap</h2>
<p>With Prossimo's support we aim to build a complete NTP implementation that provides a modern alternative for ntpd and chrony. In the short term, there are two milestones on our roadmap:</p>
<p><strong>NTP server</strong></p>
<p>Development of the NTP server is nearly complete, expected to be finished Nov 2022.</p>
<p><strong>NTS support</strong></p>
<p>Plain NTP is unencrypted and does not establish a trusted connection. NTS adds these features on top of NTP. NTS is important when using NTP servers on the public Internet, but is not widely deployed yet. We hope that supporting it will help with adoption.</p>
<p>Check Prossimo's <a href="https://www.memorysafety.org/initiative/ntp/ntp-work-plan/">project plan</a> for more details and for options to support their work.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/memory-safe-ntp/</guid>
      </item><item>
        <title>Memory Safety for the World’s Largest Software Project</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/memory-safety-in-linux-kernel/</link>
        <pubDate>Thu, 23 Jun 2022 00:00:00 +0000</pubDate>
        <description><![CDATA[<h2 id="rust-for-linux-and-prossimo">Rust for Linux and Prossimo</h2>
<p>The Rust for Linux project aims to bring a new system programming language into the Linux kernel. Rust has a key property that makes it very interesting to consider as the second language in the kernel: it guarantees no undefined behavior takes place (as long as unsafe code is sound), particularly in terms of memory management. This includes no use-after-free issues, no double frees, no data races, etc.</p>
<p>Prossimo is an <a href="https://www.abetterinternet.org/">Internet Security Research Group</a> (ISRG) project. Its goal is to improve the Internet's security-sensitive software infrastructure by addressing memory safety issues in C and C++ code via the use of memory safe languages. One critical example of such infrastructure is the Linux kernel, used in most servers in the world as well as in billions of devices.</p>
<h2 id="the-origins-of-rust-and-the-kernel">The origins of Rust and the kernel</h2>
<p>The desire to write Linux kernel code in Rust has been around for quite a while, and different people have created out-of-tree modules with Rust over the years. The earliest attempt I am aware of is from 2013 by <a href="https://github.com/tsgates/rust.ko">Taesoo Kim</a>, which was even before Rust 1.0 was released.</p>
<p>The <a href="https://github.com/Rust-for-Linux">Rust for Linux project</a> was created with a dream goal beyond out-of-tree modules: providing Rust support within the kernel itself. If GitHub logs are to be believed, I created the organization back in the summer of 2019, although it did not really have activity until the next summer, when several things happened in a row.</p>
<p>In July 2020, Nick Desaulniers sent an email to the Linux Kernel Mailing List (LKML) about putting together an “in-tree Rust” session for the 2020 Linux Plumbers Conference (LPC). That email resulted
in a few of us presenting the <a href="https://lpc.events/event/7/contributions/804/">&quot;Barriers to in-tree Rust&quot;</a> talk in August 2020, which triggered quite a few discussions and feedback in the LPC hackrooms. While it was still a moonshot, it seemed like enough people had an interest in Rust around the kernel; thus I thought it would be a good time to get everyone together working in a single place.</p>
<p>To that end, a handful of days later I submitted the first pull request of the Rust for Linux project which added <a href="https://github.com/Rust-for-Linux/linux/pull/4">the initial Rust support</a>, including the Kbuild integration, initial built-in modules support and the beginning of the <code>kernel</code> crate (which contained Alex Gaynor's and Geoffrey Thomas' <a href="https://www.youtube.com/watch?v=RyY01fRyGhM">abstractions</a>).</p>
<p>Others joined the effort over the next few months, such as Wedson Almeida Filho from Google, who is a maintainer of the project and the main contributor to the abstractions and drivers. Soon after that, the Internet Security Research Group contacted me and offered to support my working on Rust for Linux full time for a year, with funding from Google.</p>
<h2 id="progress-this-year">Progress this year</h2>
<p>We have had a lot of progress since the Request for Comments was submitted to the Linux Kernel Mailing List. On the infrastructure side, some relevant changes include:</p>
<ul>
<li>Removed panicking allocations by integrating a subset of the <code>alloc</code> standard library.</li>
<li>Moved to Edition 2021 of the Rust language.</li>
<li>Moved to stable releases of the Rust compiler (<a href="https://github.com/Rust-for-Linux/linux/issues/2">unstable features are still used</a>) and started to track the latest version.</li>
<li>Added <code>arm</code> (32-bit) and <code>riscv</code> basic architecture support.</li>
<li>Testing support, including running documentation tests inside the kernel as KUnit tests.</li>
<li>Added support for hostprogs (userspace programs used during build) written in Rust.</li>
<li>On-the-fly generation of target specification files based on the kernel configuration.</li>
<li>Expanded the documentation, including a new example repository showing a Rust out-of-tree module based on the in-tree Rust support.</li>
</ul>
<p>On the abstractions and driver side, some important changes have been:</p>
<ul>
<li>PrimeCell PL061 GPIO example driver.</li>
<li>Functionality for platform and AMBA drivers, red-black trees, file descriptors, efficient bit iterators, tasks, files, IO vectors, power management callbacks, IO memory, IRQ chips, credentials, VMA, Hardware Random Number Generators, networking...</li>
<li>Synchronization features such as RW semaphores, revocable mutexes, raw spinlocks, a no-wait lock, sequence locks...</li>
<li>Replaced <code>Arc</code> and <code>Rc</code> from the <code>alloc</code> crate with a simplified kernel-based <code>Ref</code>.</li>
<li>Better panic diagnostics and simplified pointer wrappers.</li>
<li>The beginning of Rust <code>async</code> support.</li>
</ul>
<p>Related projects saw a lot of progress too:</p>
<ul>
<li>Rust stabilized a few <a href="https://github.com/Rust-for-Linux/linux/issues/2">unstable features we used</a>.</li>
<li>Improvements on the Rust compiler, standard library and tooling, such as making <code>rustc_parse_format</code> compile on stable or the addition of the <code>no_global_oom_handling</code> and <code>no_fp_fmt_parse</code> gates.</li>
<li><code>binutils</code>/<code>gdb</code>/<code>libiberty</code> got support for Rust v0 demangling.</li>
<li><code>pahole</code> is getting support for excluding Rust compilation units.</li>
<li>Intel's 0Day/LKP kernel test robot started testing a build with Rust support enabled.</li>
<li>KernelCI is also looking forward to enabling Rust in their runs.</li>
<li>Linaro's TuxSuite added Rust support.</li>
<li><code>rustc_codegen_gcc</code> (the <code>rustc</code> backend for GCC) got merged into the Rust repository.</li>
<li>GCC Rust (a Rust frontend for GCC) is working towards compiling <code>core</code>, which would be quite a milestone.</li>
<li>Compiler Explorer added the alternative compilers for Rust (GCC Rust, <code>rustc_codegen_gcc</code> and <code>mrustc</code>), as well as other features such as MIR and macro expansion views.</li>
</ul>
<p>On top of that, we got contacted by several entities from the industry about their interest in the project, such as Google, Arm, Microsoft and Red Hat. Other companies have also privately expressed support for the project and/or are giving time to their engineers to explore its usage.</p>
<p>We have also been in contact with several academics, including researchers at the University of Washington: “Rust for Linux is a key step towards reducing security-critical kernel bugs, and on the path towards our ultimate goal of making Linux free of security-critical bugs. We are using Rust in our OS research, and adoption is easier with an existing Rust in the Linux kernel framework in place”. They recently published <a href="https://sigops.org/s/conferences/hotos/2021/papers/hotos21-s09-li.pdf">An Incremental Path Towards a Safer OS Kernel</a>.</p>
<p>Similarly, members of <a href="https://www.lse.epita.fr">LSE</a> (Systems Research Laboratory) at EPITA (École pour l'informatique et les techniques avancées) used Rust for Linux because they “are convinced that Rust is changing the landscape of system programming by applying the research done on programming languages in the last decades. We wanted to see how the language was able to help us write code we are really comfortable with thanks to the extensive static checking.”</p>
<p>In addition, we presented Rust for Linux (and Rust in general) at a few venues: <a href="https://resources.linaro.org/en/resource/A4Z6FpSkwsWJdfixEbNTiZ">Linaro Virtual Connect Fall</a>, <a href="https://clangbuiltlinux.github.io/cbl-meetup/">Clang Built Linux Meetup</a>, Linux Plumbers Conference (LPC) (e.g. on <a href="https://www.youtube.com/watch?v=jTWdk0jYy54">Rust in the Linux ecosystem</a>, on <a href="https://www.youtube.com/watch?v=46Ky__Gid7M">Rust for Linux</a> and on <a href="https://www.youtube.com/watch?v=aRbxBeaFf54&amp;t=8367s">Android drivers in Rust</a>), <a href="https://www.srib.in/SEConference/">Samsung Engineering Summit</a>, <a href="https://www.youtube.com/watch?v=yxau9EJq9NE&amp;t=2061s">Open Source Summit Japan</a>, <a href="https://rust-lang.github.io/ctcft/meetings/2021-11-22.html">Rust Cross Team Collaboration Fun Times (CTCFT)</a> and <a href="https://www.youtube.com/watch?v=fVEeqo40IyQ">Rust Linz</a>. As a fun fact, according to the keynote’s <a href="https://www.youtube.com/watch?v=2SHwVhWzwa0&amp;t=256s">informal poll</a> at LPC 2021, Rust was the emerging technology attendees were most excited about.</p>
<p>We also gave a couple of Linux Foundation Live Mentorship Series sessions on <a href="https://linuxfoundation.org/webinars/rust-for-linux-writing-abstractions-and-drivers/">an introduction to Rust safety and abstractions</a> and on <a href="https://linuxfoundation.org/webinars/rust-for-linux-code-documentation-tests/">code documentation and tests</a>.</p>
<p>Finally, we organized <a href="https://kangrejos.com">Kangrejos</a>, the Rust for Linux Workshop (using the LPC infrastructure – thanks!), as a place to meet with everyone interested in a single place just before LPC.</p>
<h2 id="how-the-last-year-felt">How the last year felt</h2>
<p>The kernel is a huge project with a lot of stakeholders. Since the beginning, it was clear that adding a second &quot;main&quot; language to the kernel would have both technical and management challenges.</p>
<p>For instance, we have had discussions and feedback from maintainers of the kernel build system, documentation, testing, CIs, architecture, tooling (e.g. <code>pahole</code>), etc. We also had contact with several Rust teams for discussions around features the kernel needed and other topics. We talked to organizations like the Linux Foundation as well as with news organizations such as Linux Weekly News (LWN), which covered several <a href="https://lwn.net/Kernel/Index/#Development_tools-Rust">Rust-related topics</a> and <a href="https://lwn.net/Archives/ConferenceIndex/#Kangrejos">Kangrejos</a>.</p>
<p>All in all, most of my work over the last year has gone into working with all the stakeholders to get everyone on board. There have been <em>many</em> people that have contributed to the project in many different ways: code contributions, reviews, documentation, tooling support, Rust
features... to the point that it is hard to list them all. Please take a look at the &quot;Acknowledgments&quot; section of each <a href="https://lore.kernel.org/lkml/20220507052451.12890-1-ojeda@kernel.org/">cover letter</a>
throughout the months. Some recurring contributors have been <a href="https://github.com/bjorn3">Björn Roy Baron</a> and <a href="https://github.com/nbdd0121">Gary Guo</a> (as experts on the Rust compiler), <a href="https://github.com/m-falkowski1">Maciej Falkowski</a>, <a href="https://github.com/adamrk">Adam Bratschi-Kaye</a>, <a href="https://github.com/TheSven73">Sven Van Asbroeck</a>, <a href="https://github.com/fbq">Boqun Feng</a>, <a href="https://github.com/Kloenk">Finn Behrens</a>...</p>
<p>Overall, it has been a blast working with all the different teams and people, and I hope we continue forward getting everyone together to make this happen. Prossimo's commitment to this project has allowed me to work full time on it for which I am very grateful – thank you!</p>
<h2 id="the-next-year">The next year</h2>
<p>In this second year since the RFC, we are looking forward to several milestones that we hope to achieve:</p>
<ul>
<li>More users or use cases inside the kernel, including example drivers – this is pretty important to get merged into the kernel.</li>
<li>Splitting the <code>kernel</code> crate and managing dependencies to allow better development.</li>
<li>Extending the current integration of the kernel documentation, testing and other tools.</li>
<li>Getting more subsystem maintainers, companies and researchers involved.</li>
<li>Seeing most of the remaining Rust features stabilized.</li>
<li>Possibly being able to start compiling the Rust code in the kernel with GCC.</li>
<li>And, of course, getting merged into the mainline kernel, which should make everything else easier!</li>
</ul>
<p>In terms of events, we are looking forward to:</p>
<ul>
<li>Open Source Summit North America – <a href="https://ossna2022.sched.com/event/11Npq/rust-for-linux-status-update-miguel-ojeda-rust-maintainer-wedson-almeida-filho-google">tune in</a> and ask questions!</li>
<li>The second edition of <a href="https://kangrejos.com">Kangrejos</a>, the Rust for Linux Workshop, face-to-face this time.</li>
<li>Linux Plumbers Conference 2022. This year there will be a <a href="https://lpc.events/event/16/contributions/1159/">Rust MC</a> (microconference), and we intend to cover talks and discussions on both Rust for Linux as well as other non-kernel Rust topics. <a href="https://lpc.events/event/16/abstracts/">The Call for Proposals is open!</a></li>
<li>Three more Linux Foundation Live Mentorship Series sessions are coming.</li>
<li>Participation planned in a few more venues, to be announced.</li>
</ul>
<p>If all this happens, it may turn out to be an even more exciting year than the first!</p>
<h2 id="about-prossimo">About Prossimo</h2>
<p>ISRG is the 501(c)(3) nonprofit organization behind Prossimo and <a href="https://letsencrypt.org/">Let's Encrypt</a>. We are 100% supported through the generosity of those who share our vision for ubiquitous, open Internet security. If you'd like to support our work, please consider getting involved, <a href="https://www.abetterinternet.org/donate/">donating</a>, or encouraging your company to become a <a href="/become-a-funder/">funder</a>.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/memory-safety-in-linux-kernel/</guid>
      </item><item>
        <title>Bringing Memory Safe TLS to Apache httpd</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/memory-safe-httpd/</link>
        <pubDate>Tue, 01 Mar 2022 00:00:00 +0000</pubDate>
        <description><![CDATA[<div class="card border-0 pic-quote-right">
    <div class="pt-4">
        <blockquote class="blockquote">
            <span class="quote"></span>
            <div class="quote-text">
                <p class="font-italic lh-170">Stefan Eissing helped us complete one of our first Prossimo initiatives so we asked him to provide his perspective on the work in this blog post. Thank you for your partnership and contributions, Stefan! </p>
                <footer class="blockquote-footer"><cite title="Source Title">Josh Aas, Head of ISRG’s Prossimo project</cite></footer>
            </div>
        </blockquote>
    </div>
</div>
<h2 id="motivation">Motivation</h2>
<p>One goal of ISRG's <a href="https://www.memorysafety.org">Prossimo memory safety initiative</a> is to show how software infrastructure on the Internet can be improved by enhancing existing software with solutions that provide better security via memory safety.</p>
<p>What better target could there be than the ancient (nearly Internet age!) juggernaut of HTTP, the <a href="https://httpd.apache.org">Apache httpd server</a>? It has been around since 1995 and is written in the security-fragile C language. As a longtime member of the Apache community, I was eager to explore how memory safe code could play nicely within the great tool this group has built.</p>
<h2 id="the-project">The Project</h2>
<p>The goal of this effort is to develop a memory safe TLS backend for Apache httpd. TLS is a critical function within httpd so creating a backend with improved memory safety should be a meaningful security improvement for people who choose to use it.</p>
<p>Several components made this work possible. <a href="https://github.com/rustls/rustls">Rustls</a> is a complete implementation of the TLS protocol written in Rust. In order to make it possible to use Rustls from a C program, <a href="https://github.com/jsha">Jacob Hoffman-Andrews</a> and <a href="https://kevin.burke.dev">Kevin Burke</a> created a C API for Rustls called <a href="https://github.com/rustls/rustls-ffi">rustls-ffi</a>. The new mod_tls module uses Rustls via rustls-ffi.</p>
<p>Rustls-ffi has nothing specific to Apache in it, but it was developed in coordination with the Apache work, as well as <a href="https://www.memorysafety.org/initiative/curl/">similar work being done on curl</a>, in order to ensure that it works well for real consumers. If you want to make TLS in your own C application safer, rustls-ffi is an option for you now.</p>
<p>On the Apache side, I did two things:</p>
<ol>
<li>
<p>I developed <a href="https://github.com/abetterinternet/mod_tls">mod_tls</a> around Rustls, via rustls-ffi, as a memory safe alternative to the existing TLS module <a href="https://httpd.apache.org/docs/2.4/mod/mod_ssl.html">mod_ssl</a>, which uses OpenSSL and compatible libraries. Apache modules are dynamically loadable extensions, so people can choose what they need.</p>
</li>
<li>
<p>I enhanced Apache's internal infrastructure to allow not just one, but several, TLS modules to be loaded in the same server instance.</p>
</li>
</ol>
<p>The second point means you can phase in mod_tls alongside mod_ssl in your installations. Rustls only supports TLS 1.2 and higher because earlier TLS versions are not considered to be secure. However, some people still need to support older versions of TLS for legacy systems. If you want to use mod_tls for TLS 1.2 and higher while still supporting older versions of the protocol with mod_ssl, you can do that today.</p>
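<p>As an illustration, a configuration along these lines could serve modern clients with mod_tls while keeping mod_ssl around for legacy protocol versions. The directive names follow the mod_tls documentation, but verify them against your httpd version; the host names and file paths here are placeholders:</p>

```apache
# Sketch: mod_tls and mod_ssl loaded side by side in one server.
LoadModule ssl_module modules/mod_ssl.so
LoadModule tls_module modules/mod_tls.so

# Modern clients (TLS 1.2+) on port 443, served by the memory safe mod_tls.
TLSEngine 443
<VirtualHost *:443>
    ServerName modern.example.org
    TLSCertificate "cert.pem" "key.pem"
</VirtualHost>

# Legacy clients on port 8443, served by mod_ssl with older protocol versions.
<VirtualHost *:8443>
    ServerName legacy.example.org
    SSLEngine on
    SSLCertificateFile "cert.pem"
    SSLCertificateKeyFile "key.pem"
</VirtualHost>
```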
<p>The mod_tls documentation offers a <a href="https://github.com/abetterinternet/mod_tls#comparison-with-mod_ssl">feature table</a> comparing it to mod_ssl, letting people determine if they can make the complete switch or which parts of the server they may migrate to the memory safe implementation. <a href="https://github.com/abetterinternet/mod_tls#peace-and-harmony">Possible scenarios</a> are described in the documentation. Client certificate support is not available yet, but we are hoping to add it in the future.</p>
<h2 id="the-cost">The Cost</h2>
<p>I spent approximately six months building mod_tls and making it possible to load multiple TLS modules at the same time in httpd. This was made possible through Google's generous support of Prossimo and its mission to improve memory safety (thank you!).</p>
<p>Compared to the impact this work can have, it was not a big investment. Making it possible to swap memory safe modules into existing C programs for critical functionality is a strategy that ISRG's Prossimo project is particularly interested in. It means they can invest in a small set of important libraries and then use them in critical software infrastructure with minimal effort. As much as I would like to learn Rust in the future, I was able to bring a Rust-based TLS module into httpd without having to know any Rust at all!</p>
<p>As mentioned before, the Prossimo project did something very similar with curl, but they didn't stop at TLS: they were also able to do the same thing for HTTP, making it possible to build curl with a memory-safe <a href="https://github.com/hyperium/hyper">Hyper</a> HTTP implementation.</p>
<h2 id="the-result">The Result</h2>
<p>In December 2021, mod_tls was released as part of Apache httpd v2.4.52. It has experimental status, as is common when new functionality is phased in: the project wants feedback from the field and reserves the possibility of making non-backward-compatible changes.</p>
<p>Load tests on developer machines are very promising. Performance of Rustls is on the level of OpenSSL (the v1.1.x release line was used in the test), memory use appears to be reduced, and the module remains solid under load. Note that Rust compiles to native code and has no garbage collection or other runtime overhead.</p>
<p><img src="/images/blog/Blog-2022-03-01-Image1.png" alt=""></p>
<h2 id="availability">Availability</h2>
<p>Right now you need to build rustls-ffi separately, preferably from the GitHub repositories. Once you have rustls-ffi installed, you can build mod_tls as part of the Apache httpd release, just like other modules.</p>
<p>We will be working to get various Linux and BSD distributions that already ship the Rust toolchain to make a package for mod_tls (as they do for other Apache modules already). That will be the preferred way for you to use it and also receive updates.</p>
<h2 id="about-prossimo">About Prossimo</h2>
<p>ISRG is the 501(c)(3) nonprofit organization behind Prossimo and <a href="https://letsencrypt.org/">Let's Encrypt</a>. We are 100% supported through the generosity of those who share our vision for ubiquitous, open Internet security. If you'd like to support our work, please consider getting involved, <a href="https://www.abetterinternet.org/donate/">donating</a>, or encouraging your company to become a <a href="/become-a-funder/">funder</a>.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/memory-safe-httpd/</guid>
      </item><item>
        <title>A Year-End Letter from our Executive Director</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/ed-letter-2021/</link>
        <pubDate>Thu, 16 Dec 2021 00:00:00 +0000</pubDate>
        <description><![CDATA[<p><em>This letter was originally published in our <a href="https://www.abetterinternet.org/annual-reports/">2021 annual report</a>.</em></p>
<p>We can do a lot to improve security and privacy on the Internet by taking existing ideas and applying them in ways that benefit the general public at scale. Our work certainly does involve some research, as our name implies, but the success that we’ve had in pursuing our mission largely comes from our ability to go from ideas to implementations that improve the lives of billions of people around the world.</p>
<p>Our first major project, <a href="https://letsencrypt.org/">Let’s Encrypt</a>, now helps to protect more than <a href="https://letsencrypt.org/stats/">260 million websites</a> by offering free and fully automated TLS certificate issuance and management. Since it launched in 2015, encrypted page loads have gone from under 40% to 92% in the U.S. and 83% globally.</p>
<p>We didn’t invent certificate authorities. We didn’t invent automated issuance and management. We refined those ideas and applied them in ways that benefit the general public at scale.</p>
<p>We launched our <a href="https://www.memorysafety.org/">Prossimo</a> project in late 2020. Our hope is that this project will greatly improve security and privacy on the Internet by making memory safety vulnerabilities in the Internet’s most critical software a thing of the past. We’re bringing a healthy dose of ambition to the table and we’re backing it up with effective strategies and strong partnerships.</p>
<p>Again, we didn’t invent any memory safe languages or techniques, and we certainly didn’t invent memory safety itself. We’re simply taking existing ideas and applying them in ways that benefit the general public at scale. We’re getting the work done.</p>
<p>With our latest project, <a href="https://www.abetterinternet.org/divviup/">Divvi Up</a> for Privacy Preserving Metrics (PPM), the core ideas are a bit newer than the ideas behind our other projects, but we didn’t invent them either. Over the past decade or so some bright people have come up with a way to resolve the tension between <em>wanting</em> to collect metrics about populations and <em>needing</em> to collect data about individuals.</p>
<p>We believe those ideas have matured enough that it’s time to deploy them to the public’s benefit. We started by building and <a href="https://www.abetterinternet.org/post/prio-services-for-covid-en/">deploying a PPM service</a> for Covid-19 Exposure Notification applications in late 2020, in partnership with Apple, Google, the Bill &amp; Melinda Gates Foundation and the Linux Foundation. We’re expanding that service so any application can collect metrics in a privacy-preserving way.</p>
<p>Being ready to bring ideas to life means a few different things.</p>
<p>We need to have an excellent engineering team that knows how to build services at scale. It’s not enough to just build something that works: the quality and reliability of our work need to inspire confidence. People need to be able to rely on us.</p>
<p>We also need to have the experience, perspective, and capacity to effectively consider ideas. We are not an organization that “throws things at the wall to see what sticks.” Between our staff, our board of directors, our partners, and our community, we’re able to do a great job evaluating opportunities to understand technical feasibility, potential impact, and alignment with our public benefit mission—to reduce financial, technological, and educational barriers to secure communication over the Internet.</p>
<p>Administrative and communications capabilities are essential. From fundraising and accounting to legal and social media, our administrative teams exist in order to support and amplify the critical work that we do. We're proud to run a financially efficient organization that provides services for billions of people on only a few million dollars each year.</p>
<p>Finally, it means having the financial resources we need to function. As a nonprofit, 100% of our funding comes from charitable contributions from people like you and <a href="https://www.abetterinternet.org/sponsors/">organizations</a> around the world. But global impact doesn’t necessarily require million dollar checks: since 2015 tens of thousands of people have given to our work. They’ve made a case for <a href="/become-a-funder/">corporate sponsorship</a>, given through their DAFs, or set up recurring <a href="https://www.abetterinternet.org/donate/">donations</a>, sometimes to give $3 a month. That’s all added up to $17M that we’ve used to change the Internet for nearly everyone using it. I hope you’ll join these people and support us financially if you can.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/ed-letter-2021/</guid>
      </item><item>
        <title>Supporting Miguel Ojeda’s Work on Rust in the Linux Kernel</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/supporting-miguel-ojeda-rust-in-linux/</link>
        <pubDate>Thu, 17 Jun 2021 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>ISRG’s Prossimo project for memory safety aims to coordinate efforts to move the Internet’s critical software infrastructure to <a href="https://www.memorysafety.org/docs/memory-safety/">memory safe code</a>. When we think about what code is most critical for today’s Internet, the Linux kernel is at the top of the list. Bringing memory safety to the Linux kernel is a big job, but the <a href="https://github.com/Rust-for-Linux/">Rust for Linux</a> project is making great progress. We’re pleased to announce that we started formally supporting this work in April 2021 by providing Miguel Ojeda with a contract to work on Rust for Linux and other security efforts full time for one year. This was made possible through financial support from Google. Prior to working with ISRG, Miguel was undertaking this work as a side-project. We are happy to do our part in supporting digital infrastructure by enabling him to work full-time on it.</p>
<p>We’ve worked closely with Dan Lorenc, Software Engineer at Google to make this collaboration possible. &quot;Google has found time after time that large efforts to eliminate entire classes of security issues are the best investments at scale. We understand work in something as widely used and critical as the Linux kernel takes time, but we're thrilled to be able to help the ISRG support Miguel Ojeda's work dedicated to improving the memory safety of the kernel for everyone,&quot; Dan said.</p>
<p>Miguel recently posted an <a href="https://lkml.org/lkml/2021/4/14/1023">RFC for Rust support in the Linux kernel</a>. We’ve been watching Miguel’s work with great interest, and this RFC is a perfect example of the consideration and diligence that goes into his efforts. “Adding a second language to the Linux kernel is a decision that needs to be carefully weighed. Rust brings enough improvements over C to merit such consideration,” Miguel said about his motivation for this work. We’re excited for Miguel to be able to focus on this work over the next year.</p>
<p>The Linux kernel is at the heart of the modern Internet, from servers to client devices. It’s on the front line for processing network data and other forms of input. As such, vulnerabilities in the Linux kernel can have a wide-ranging impact, putting security and privacy for people, organizations, and devices at risk. Since it’s written largely in the C language, which is not memory-safe, memory safety vulnerabilities such as buffer overflows and use-after-frees are a constant concern. By making it possible to write parts of the Linux kernel in Rust, which is memory-safe, we can entirely eliminate memory safety vulnerabilities from certain components, such as drivers.</p>
<p>We’d like to thank Alex Gaynor, Geoffrey Thomas, Nick Desaulniers, Wedson Almeida Filho, and Miguel Ojeda for their work to get the Rust for Linux project off the ground and build the strong community that supports it today, as well as all of the folks who have contributed to the effort. We’d also like to thank Google again for their financial support.</p>
<p>While this is the first memory safety effort we’ve announced under our new Prossimo project name, our memory safety work began in 2020. You can read about our efforts to bring memory safety to <a href="https://www.memorysafety.org/blog/memory-safe-curl/">curl</a> and the <a href="https://www.memorysafety.org/blog/memory-safe-tls-apache/">Apache HTTP server</a>, and to add improvements to the <a href="https://www.memorysafety.org/blog/preparing-rustls-for-wider-adoption/">Rustls TLS library</a>.</p>
<p>ISRG is the 501(c)(3) nonprofit organization behind Prossimo and <a href="https://letsencrypt.org/">Let’s Encrypt</a>. We are 100% supported through the generosity of those who share our vision for ubiquitous, open Internet security. If you’d like to support our work, please consider getting involved, <a href="https://www.abetterinternet.org/donate/">donating</a>, or encouraging your company to become a <a href="/become-a-funder/">funder</a>.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/supporting-miguel-ojeda-rust-in-linux/</guid>
      </item><item>
        <title>Preparing Rustls for Wider Adoption</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/preparing-rustls-for-wider-adoption/</link>
        <pubDate>Tue, 20 Apr 2021 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>SSL/TLS libraries are critical software infrastructure for the Internet. Unfortunately, most of them have a long history of serious security issues. Many of those issues stem from the fact that the libraries are usually written in languages like C, which are not <a href="https://www.abetterinternet.org/docs/memory-safety/">memory safe</a>. It’s time for the Internet to move on to more secure software, and that’s why our Memory Safety Initiative is coordinating work to make further improvements to the <a href="https://github.com/ctz/rustls">Rustls TLS library</a>.</p>
<p>Rustls is an excellent alternative to OpenSSL and similar libraries. Much of its critical code is written in Rust so it’s largely memory-safe without sacrificing performance. It has <a href="https://github.com/ctz/rustls/blob/main/audit/TLS-01-report.pdf">been audited</a> and found to be a high quality implementation. Here’s one of our favorite lines from the report:</p>
<p>“Using the type system to statically encode properties such as the TLS state transition function is just one example of great defense-in-depth design decisions.”</p>
<p>With financial support from Google, we’ve contracted with Dirkjan Ochtman, an experienced Rust developer and Rustls contributor, to make a number of additional improvements to Rustls, including:</p>
<ul>
<li><a href="https://github.com/ctz/rustls/issues/447">Enforce a no-panic policy</a> to eliminate the potential for undefined behavior when Rustls is used across the C language boundary.</li>
<li>Improve the <a href="https://github.com/rustls/rustls-ffi">C API</a> so that Rustls can even more easily be integrated into existing C-based applications. Merge the C API into the main Rustls repository.</li>
<li>Add support for validating certificates that contain an IP address in the Subject Alternative Name extension.</li>
<li>Make it possible to configure server-side connections based on client input.</li>
</ul>
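<p>The no-panic item deserves a word of explanation: a Rust panic that unwinds across an <code>extern "C"</code> boundary is undefined behavior, so a TLS library exposing a C API must guarantee its entry points never panic. As a minimal sketch of why this matters (this is illustrative only, not rustls-ffi's actual code; <code>guarded</code> and <code>FfiResult</code> are hypothetical names), a defensive C-callable wrapper can catch any unwind and turn it into an error code:</p>

```rust
use std::panic;

// Error-code style result a C caller could understand.
#[derive(Debug, PartialEq)]
enum FfiResult {
    Ok,
    Panic,
}

// Run `body`, converting any panic into an error code so the unwind
// never crosses the C ABI boundary.
fn guarded<F: FnOnce() + panic::UnwindSafe>(body: F) -> FfiResult {
    match panic::catch_unwind(body) {
        Ok(()) => FfiResult::Ok,
        Err(_) => FfiResult::Panic,
    }
}

fn main() {
    // Silence the default panic message for a clean demonstration.
    panic::set_hook(Box::new(|_| {}));
    assert_eq!(guarded(|| {}), FfiResult::Ok);
    assert_eq!(guarded(|| panic!("boom")), FfiResult::Panic);
    println!("ok");
}
```

<p>A stricter no-panic policy goes further than catching: it removes the panic paths entirely, so there is nothing to catch in the first place.</p>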
<p>These improvements should make Rustls a more attractive option for many projects. We are already integrating it into <a href="https://www.abetterinternet.org/post/memory-safe-curl/">curl</a> and <a href="https://www.abetterinternet.org/post/memory-safe-tls-apache/">Apache httpd</a>, and we hope to replace OpenSSL and other memory unsafe TLS libraries in use at <a href="https://letsencrypt.org/">Let’s Encrypt</a> with Rustls.</p>
<p>We currently live in a world where deploying a few million lines of C code on a network edge to handle requests is standard practice, despite all of the evidence we have that such behavior is unsafe. Our industry needs to get to a place where deploying code that isn’t memory safe to handle network traffic is widely understood to be dangerous and irresponsible. People need memory safe software that suits their needs to be available to them though, and that’s why we’re getting to work.</p>
<p>ISRG is a 501(c)(3) nonprofit organization that is 100% supported through the generosity of those who share our vision for ubiquitous, open Internet security. If you’d like to support our work, please consider <a href="https://www.abetterinternet.org/getinvolved/">getting involved</a>, <a href="https://www.abetterinternet.org/donate/">donating</a>, or encouraging your company to <a href="/become-a-funder/">become a funder</a>.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/preparing-rustls-for-wider-adoption/</guid>
      </item><item>
        <title>A Memory Safe TLS Module for the Apache HTTP Server</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/memory-safe-tls-apache/</link>
        <pubDate>Tue, 02 Feb 2021 00:00:00 +0000</pubDate>
        <description><![CDATA[<div class="card border-0 pic-quote-right">
    <div class="pt-4">
        <blockquote class="blockquote">
            <span class="quote"></span>
            <div class="quote-text">
                <p class="font-italic lh-170">Apache httpd is still a critically important piece of infrastructure, 26 years after its inception. As an original co-developer, I feel a serious revamp like this has the potential to protect a lot of people and keep httpd relevant far into the future.</p>
                <footer class="blockquote-footer"><cite title="Source Title">Brian Behlendorf</cite></footer>
            </div>
        </blockquote>
    </div>
</div>
<p>The <a href="https://httpd.apache.org/">Apache HTTP Server</a>, httpd, is an important piece of the Internet’s infrastructure. Hundreds of millions of websites use it every day to serve requests. As such, improvements to httpd security have broad impact.</p>
<p>One of the biggest issues with httpd is the fact that it’s written in C, which is not a memory safe language. Memory safety issues dominate its list of <a href="https://www.cvedetails.com/vulnerability-list/vendor_id-45/product_id-66/Apache-Http-Server.html">security vulnerabilities</a>. Rewriting httpd from scratch or moving its users to a memory safe alternative would be incredibly difficult, but fortunately we can tackle httpd’s memory safety problem incrementally.</p>
<p>ISRG is starting by facilitating the creation of a new TLS module for httpd called mod_tls. The new module will use the excellent <a href="https://github.com/ctz/rustls">Rustls</a> library for TLS instead of OpenSSL. We hope that someday mod_tls will replace mod_ssl as the default in httpd.</p>
<p>We have contracted <a href="https://eissing.org/">Stefan Eissing</a> of <a href="https://www.greenbytes.de/">Greenbytes</a>, also an httpd committer, to do the work. <a href="https://www.google.com/">Google</a> has generously provided the funding.</p>
<p>We currently live in a world where deploying a few million lines of C code on a network edge to handle requests is standard practice, despite all of the evidence we have that such behavior is unsafe. Our industry needs to get to a place where deploying code that isn’t memory safe to handle network traffic is widely understood to be dangerous and irresponsible. People need memory safe software that suits their needs to be available to them though, and that’s why we’re getting to work.</p>
<p>ISRG is a 501(c)(3) nonprofit organization that is 100% supported through the generosity of those who share our vision for ubiquitous, open Internet security. If you’d like to support our work, please consider <a href="https://www.abetterinternet.org/getinvolved/">getting involved</a>, <a href="https://www.abetterinternet.org/donate/">donating</a>, or encouraging your company to <a href="/become-a-funder/">become a funder</a>.</p>]]></description>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/memory-safe-tls-apache/</guid>
      </item><item>
        <title>Memory Safe ‘curl’ for a More Secure Internet</title>
        <link>https://deploy-preview-138--prossimo.netlify.app/blog/memory-safe-curl/</link>
        <pubDate>Fri, 09 Oct 2020 00:00:00 +0000</pubDate>
        <description><![CDATA[<p>Memory safety vulnerabilities represent one of the biggest threats to Internet security. As such, we at ISRG are interested in finding ways to make the most heavily relied-upon software on the Internet memory safe. Today we’re excited to announce that we’re working with <a href="https://daniel.haxx.se/">Daniel Stenberg</a>, author of ubiquitous <a href="https://curl.haxx.se/">curl</a> software, and <a href="https://www.wolfssl.com/">WolfSSL</a>, to make critical parts of the curl codebase memory safe.</p>
<p>ISRG is funding Daniel to work on adding support for <a href="https://hyper.rs/">Hyper</a> as an HTTP back-end for curl. Hyper is a fast and safe HTTP implementation written in Rust.</p>
<p>At the same time, ISRG engineers will add support for <a href="https://github.com/ctz/rustls">Rustls</a> as a TLS back-end for curl. Rustls is a memory safe TLS implementation written in Rust, covering both certificate verification and the network protocol. It has been <a href="https://github.com/ctz/rustls/blob/main/audit/TLS-01-report.pdf">audited</a>, and we suggest reading the conclusions on page 11 of the report if you want to get even more excited about Rustls.</p>
<p>At first the memory safe HTTP and TLS back-ends will be opt-in. We will work with Daniel and various partners to make sure they are extensively tested, and if all goes well the plan is for the memory safe back-ends to become the default. By making the most frequently used networking code in curl memory safe by default, we’ll better protect the billions of people who rely on systems using curl.</p>
<p>Users who need to continue using the unsafe C back-ends for whatever reason will be able to continue doing so by building curl with the C back-ends enabled.</p>
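<p>For illustration, opting in at build time might look like the following. This is a sketch, assuming curl's configure script exposes <code>--with-rustls</code> and <code>--with-hyper</code> as the opt-in flags; consult curl's build documentation for the current option names.</p>

```shell
# Opt in to the memory safe back-ends when building curl.
# Flag names are assumptions; check `./configure --help` in your curl tree.
./configure --with-rustls --with-hyper
make

# A default build keeps the existing C back-ends, e.g.:
./configure --with-openssl
make
```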
<p>We’d like to thank Daniel for his willingness to be a leader on this issue. It’s not easy to make such significant changes to how wildly successful software is built, but we’ve come up with a great plan and together we’re going to make one of the most critical pieces of networking software in the world significantly more secure. We think this project can serve as a template for how we might secure more critical software, and we’re excited to learn along the way.</p>
<p>We’d also like to thank everyone involved in creating Hyper, Rustls, and the libraries they depend on. In particular we’d like to thank Sean McArthur for his work on <a href="https://hyper.rs/">Hyper</a>, Joseph Birr-Pixton for his work on <a href="https://github.com/ctz/rustls">Rustls</a>, and Brian Smith for his work on <a href="https://github.com/briansmith/ring">ring</a> (which Rustls uses).</p>
<p>The mission of the Internet Security Research Group (ISRG) is to reduce financial, technological, and educational barriers to secure communication over the Internet. ISRG is a California public benefit corporation, recognized by the IRS as a tax-exempt organization under Section 501(c)(3). Our work is funded, in part, by individuals from more than 55 countries around the world. To donate, visit <a href="https://letsencrypt.org/donate">https://letsencrypt.org/donate</a>.</p>
        <guid isPermaLink="true">https://deploy-preview-138--prossimo.netlify.app/blog/memory-safe-curl/</guid>
      </item>
  </channel>
</rss>
