Lecture 8: Security

The Most Important Lesson About Security

As a software engineer, you’ll probably make bad decisions about security, and the illusion of a secure system is even worse than an insecure one. Trust the work of security experts, either by consulting them or by using tools developed by them. And when designing tools for other developers, let the secure way of doing things be the default.

That said, here’s the least you need to know about security.

Code Injection

The most common kinds of vulnerabilities have to do with running untrusted code. Typically that happens when you incorrectly interpret user input as code.

Let’s look at the two most common flavors of code injection.

SQL Injection

See the TODOOSE code base at the sql-injection branch. Edit the description of a TODO item to something like groceries"-- and see the description for all the items change to groceries. The " closes the first \" in the SQL we wrote in the code base, and the -- comments out the rest of the line, which would say to only update WHERE identifier is the one given in the URL. The usual way to add values in a prepared statement with statement.setString() fixes this by sanitizing the input—that is, converting things like " into \".

See the obligatory xkcd comic about SQL Injection.

Cross-Site Scripting (XSS)

See the TODOOSE code base at the xss branch. Edit the description of a TODO item to something like groceries <button onclick="alert('Not really');">Add New Item</button> and see a misleading button appear. The dangerouslySetInnerHTML attribute we used in the code base disables React’s sanitization, which would convert something like <button> into <button>. See MDN’s documentation on HTML entities.

User Management

See the TODOOSE code base at the user-management branch.

The two parts of user management are authentication (user signup and login) and authorization (determining whether the user can do whatever they’re trying to do, for example, what we did with the GET /items request). Javalin has some native support for authorization.

For authentication, we added an users table to the database to store the login and password. We added an UsersRepository, UsersController, and routes for user signup and login.

For authorization, we protected the GET /items request such that only users who are logged in can see the items.

To implement login, we used Javalin’s support for sessions (see the methods ctx.sessionAttribute(key, value) and ctx.sessionAttribute(key)), which depends on cookies. Typically HTTP requests are completely independent of one another, but the server may set cookies as part of a response. Cookies are just pieces of data the browser (or Postman) will and send along with every request to the same server, and the server will use this piece of data to tell who’s who.

This implementation is still insecure because it stores passwords in plain text. Consider what would happen if an attacker used SQL Injection to gain access to the database.

The solution to this is to use a Key Derivation Function (KDF). You can think of a KDF as a function that takes the user password and returns an arbitrary string. What’s important about KDFs is that it’s really hard to come up with a password that will result in a given arbitrary string. At user signup, we use the KDF on the password before storing it in the database. At user login, we use the KDF on the given password, and compare the result to what was stored on the database.

A KDF is like a hash (if you know what that is), except that unlike typical hash functions like SHA and MD5 (if you know what those are) KDFs are designed to be slow! This is good, because attackers can try fewer passwords per minute.

Also, we want to prevent attackers from computing huge tables of possible passwords and their corresponding hashes—the so-called rainbow tables. So we concatenate a salt (a random string) with the password before giving it to the KDF. We can store the salt in the clear in the database.

We recommend bcrypt, which is implemented in this Java library. The library takes care of salting, hashing, and so forth.

See the documentation for this implementation of bcrypt in Ruby for some more information on bcrypt.

bcrypt is also used by Ruby on Rails and Spring.

Refactoring

We didn’t talk about this in lecture because we covered refactoring indirectly before. But here are some notes about it.

What?

To change how something is done in code without changing what’s being done.

Often people will abuse the term and say that they’re refactoring something even though they’re changing what’s being done as well.

Examples

TODOOSE video series starting on Session 24.

The Technology part of Assignment 3.

Even simple things like renaming variables/attributes/methods, like we did in previous lectures and assignments. Most refactorings have to do with introducing or removing abstractions, for example, extracting a piece of code into a function; or its converse, inlining a function body where a function is called; and so forth.

Why?

Because getting code to work and do what you want is only half the battle. Code must easy to understand and maintain.

Because it allows you to explore different solutions to the same problem and learn more about their trade-offs.

Because it helps you actively read a code base you’re unfamiliar with.

Because it helps you shape the code base toward a goal, for example, preparing the code base to add a new feature.

for each desired change, make the change easy (warning: this may be hard), then make the easy change

–Kent Beck

When?

When you’re new to a code base, refactor as a way to explore the code base and appreciate the decisions made by the original developers. Rename variables to names that make more sense to you; move code around to understand what it’s doing, and so forth. If you think your refactorings improve the code base, that may be your first contribution. But often you’ll end up just discarding the refactorings, and that’s fine.

When you’re developing a new feature, you may start refactoring as soon as you get things to work, turning a first draft into understandable code.

When you’re navigating the code base, you may make small refactorings here and there to make things better.

Before you even start a more complicated feature, it may make sense to refactor the code base to prepare it for the feature.

But don’t fall into the trap of perpetual refactoring cycles. Code can’t be perfect; good enough is enough.

How?

Ensure you have good tests, because they’re the only guarantee that you haven’t broken anything with the refactorings. Then perform the smallest change you can while keeping the tests passing. Rinse and repeat.

There are many refactoring techniques that are commonly applied, and these techniques received names. This is a similar to what hapenned with design principles and design patterns.

IntelliJ can perform simple refactorings automatically; right-click on some code and look at the Refactor menu.

For more comprehensive catalogs, see the Refactoring book by Martin Fowler. Or, try the refactoring.guru. Avoid the similar-looking SourceMaking, which isn’t as good.

This is an archived version of https://www.jhu-oose.com that I (Leandro Facchinetti) developed when teaching the course in the Fall of 2019. Some of the links may be broken.