Field Validation for APIs with Python, Flask, and SQLAlchemy

This is the article I wish existed three days ago.

I'm building a data exploration and visualization web app (narratus) with a Flask API backend supporting a React frontend. I decided to accept user data posted as JSON instead of form data, and discovered that there is not much documentation on how to add field validation if you are not using WTForms. Hopefully this article will help.

Let's look at how to accomplish validation by using SQLAlchemy's built-in validates() decorator.

Validation Strategy Options for Flask

Probably the most common way to validate user input in Flask is to use the validators provided by the WTForms library. Its ease of use and thorough documentation make it a good choice. The drawbacks are that you must add WTForms as a dependency and that it is built around data submitted through web forms, which may not be ideal when implementing something like a RESTful API.

Other options include the colander and marshmallow libraries. These are both powerful libraries, but their added complexity might not be ideal for all use cases.

Finally, we can validate using SQLAlchemy itself, through its built-in validates() decorator. That is the approach this article takes.

Validating with SQLAlchemy

As with any ORM, we can set constraints at the database level. For more complex validation, however, we can use SQLAlchemy's validates() decorator.

We initialize our app in the app/__init__.py file:

from configparser import ConfigParser
from flask import Flask
from flask_sqlalchemy import SQLAlchemy

# import config file to global object
config = ConfigParser()
config_file = '../config.ini'
config.read(config_file)

# instantiate flask app
app = Flask(__name__)
app.config['SECRET_KEY'] = config.get('flask', 'secret_key')
app.config['SQLALCHEMY_DATABASE_URI'] = config.get('flask', 'database_uri')
app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False

db = SQLAlchemy(app)

# imported last, because views and models import app and db from this module
from app import views, models

Note that there are many ways to initialize a Flask app. The above configuration is recommended by Miguel Grinberg's excellent tutorial.
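For reference, here is a minimal sketch of what the config.ini read above might contain. The section and key names match the config.get() calls; the values are placeholders I made up, and the sketch parses them from a string so it runs on its own.

```python
from configparser import ConfigParser

# Hypothetical contents of config.ini. Only the section name and the two
# key names come from the config.get() calls; the values are placeholders.
SAMPLE_CONFIG = """
[flask]
secret_key = change-me
database_uri = sqlite:///app.db
"""


def load_flask_settings(text):
    """Parse INI text and return the settings Flask expects."""
    parser = ConfigParser()
    parser.read_string(text)
    return {
        "SECRET_KEY": parser.get("flask", "secret_key"),
        "SQLALCHEMY_DATABASE_URI": parser.get("flask", "database_uri"),
    }


settings = load_flask_settings(SAMPLE_CONFIG)
```

One thing to watch for: ConfigParser.read() silently skips files it cannot find, so a wrong relative path only shows up later as a NoSectionError from config.get().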

We set up our User model in our app/models.py file:

import re
from werkzeug.security import generate_password_hash, check_password_hash
from sqlalchemy.orm import validates
from app import db

class User(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(64), index=True, unique=True, nullable=False)
    email = db.Column(db.String(120), index=True, nullable=False)
    # 256 leaves room for the longer hashes produced by recent Werkzeug versions
    password_hash = db.Column(db.String(256))
    role = db.Column(db.Enum('basic', 'admin', name='user_roles'), default='basic')

    def set_password(self, password):
        # store only the salted hash, never the plain-text password
        self.password_hash = generate_password_hash(password)

    def check_password(self, password):
        # compare a login attempt against the stored hash
        return check_password_hash(self.password_hash, password)

Notice we've added a couple of methods to our User class to hash and check passwords, as passwords should never be stored as plain text in a database.
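To illustrate what hashing and checking involve, here is a rough standard-library stand-in. It is not how Werkzeug implements generate_password_hash() or check_password_hash(); the function names, the storage format, and the iteration count are all assumptions made for this sketch. In the app itself, keep using the Werkzeug helpers.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative value, not a recommendation


def hash_password(password, salt=None):
    """Return 'salt_hex$digest_hex' for the given password (hypothetical format)."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt.hex() + "$" + digest.hex()


def verify_password(stored, password):
    """Re-hash the candidate with the stored salt and compare in constant time."""
    salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), ITERATIONS
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))


stored = hash_password("Secret123")
```

The stored string never contains the password itself, and verification works by repeating the hash with the saved salt.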

Right now the only validation comes from the constraints enforced by the database, which would reject, for example, an attempt to save a string in the id field. We can use @validates() to add some reasonable limits on what counts as acceptable input for these fields:

# these methods go inside the User class

@validates('username')
def validate_username(self, key, username):
    if not username:
        raise AssertionError('No username provided')

    # reject usernames that already exist in the database
    if User.query.filter(User.username == username).first():
        raise AssertionError('Username is already in use')

    if len(username) < 5 or len(username) > 20:
        raise AssertionError('Username must be between 5 and 20 characters')

    return username

@validates('email')
def validate_email(self, key, email):
    if not email:
        raise AssertionError('No email provided')

    # raw string, so the backslash reaches the regex engine unchanged
    if not re.match(r"[^@]+@[^@]+\.[^@]+", email):
        raise AssertionError('Provided email is not an email address')

    return email
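To get a feel for what the email pattern accepts, here is a small standalone sketch. The helper name is_plausible_email is my own, and it uses fullmatch() as a stricter variant: re.match() only anchors at the start of the string, so it would let trailing text slip through.

```python
import re

# Same pattern as in the validator: something, an @, something, a dot, something
EMAIL_PATTERN = re.compile(r"[^@]+@[^@]+\.[^@]+")


def is_plausible_email(value):
    """Return True if the whole string has one '@' with a dot somewhere after it."""
    return bool(value) and EMAIL_PATTERN.fullmatch(value) is not None


samples = [
    "sseaborn@example.com",  # accepted
    "sseaborn@example",      # rejected: no dot after the @
    "sseaborn.example.com",  # rejected: no @
    "a@b.c@d.e",             # rejected by fullmatch, accepted by match
]
results = {s: is_plausible_email(s) for s in samples}
```

This is a loose sanity check, not full address validation. Strings containing spaces, for instance, still pass.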

Each validation function must return the value it validated, because the return value is what gets assigned to the field. With these functions added to the User class, an AssertionError is raised as soon as a field is set to a value that violates the rules. For example:

>>> user = User(username='Sam', email='sseaborn@example.com')
AssertionError: Username must be between 5 and 20 characters
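The error above is raised by the constructor, before anything touches the database. Here is a minimal sketch of that behavior using plain SQLAlchemy (1.4 or newer) without Flask. The Account model and the try_username helper are hypothetical, and the uniqueness check is left out so that no database is needed.

```python
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base, validates

Base = declarative_base()


class Account(Base):
    """Hypothetical stand-in for the User model, trimmed to one field."""
    __tablename__ = "account"

    id = Column(Integer, primary_key=True)
    username = Column(String(64), nullable=False)

    @validates("username")
    def validate_username(self, key, username):
        if not username:
            raise AssertionError("No username provided")
        if len(username) < 5 or len(username) > 20:
            raise AssertionError("Username must be between 5 and 20 characters")
        return username


def try_username(value):
    """Build an Account and report the validation error, if any."""
    try:
        Account(username=value)  # the validator runs here, on assignment
        return "ok"
    except AssertionError as exc:
        return str(exc)
```

Because validators fire when an attribute is set, a field that is never assigned is never validated. Account() with no username raises nothing here; the nullable=False constraint would only catch it at commit time.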

Handling the password validation is a little different, because we should only be updating the password_hash field using the set_password() method to hash our passwords. Since what we want to validate is the provided password, not the password_hash field, we can do our validation right in the set_password() method:

def set_password(self, password):
    if not password:
        raise AssertionError('Password not provided')

    # re.search looks anywhere in the string; re.match would only accept
    # passwords that start with a digit or a capital letter
    if not re.search(r'\d.*[A-Z]|[A-Z].*\d', password):
        raise AssertionError('Password must contain 1 capital letter and 1 number')

    if len(password) < 8 or len(password) > 50:
        raise AssertionError('Password must be between 8 and 50 characters')

    self.password_hash = generate_password_hash(password)
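The three password rules can be tried out on their own with a sketch like the one below. The function name password_problem is made up, and it returns the message instead of raising so the cases are easy to compare. It uses re.search(), since re.match() anchors at the start of the string and would only accept passwords that begin with a digit or a capital letter.

```python
import re

# A digit followed later by a capital, or a capital followed later by a digit
CAPITAL_AND_DIGIT = r"\d.*[A-Z]|[A-Z].*\d"


def password_problem(password):
    """Return the message for the first failed rule, or None if all rules pass."""
    if not password:
        return "Password not provided"
    if not re.search(CAPITAL_AND_DIGIT, password):
        return "Password must contain 1 capital letter and 1 number"
    if len(password) < 8 or len(password) > 50:
        return "Password must be between 8 and 50 characters"
    return None


checks = {p: password_problem(p) for p in ["", "secret", "A1", "secretA1"]}
```

Note the order of the checks: a short password with no capital letter or digit gets the character-class message first, not the length message.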

A nice benefit of this method of validation is that the custom error messages can be passed along by the view function to tell the user what went wrong. In app/views.py:

from flask import jsonify, request
from app import app, db
from app.models import User

@app.route('/api/create_user', methods=['POST'])
def create_user():
    # silent=True returns None instead of raising on a missing or malformed JSON body
    data = request.get_json(silent=True) or {}

    try:
        # validators run when the attributes are set, so the constructor
        # and set_password() must be inside the try block
        user = User(username=data.get('username'), email=data.get('email'))
        user.set_password(data.get('password'))
        db.session.add(user)
        db.session.commit()
        return jsonify(msg='User successfully created', user_id=user.id), 200
    except AssertionError as exception_message:
        return jsonify(msg='Error: {}. '.format(exception_message)), 400

So if we send this data: {"username":"sseaborn", "password":"secret", "email":"sseaborn@example.com"} to the /api/create_user endpoint, we get a 400 response with this body: {"msg": "Error: Password must contain 1 capital letter and 1 number. "}
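Here is one way to send that request from Python using only the standard library. This is a sketch: the BASE_URL is an assumed local development address and the helper names are mine. The JSON content type matters, because Flask's request.get_json() expects it by default.

```python
import json
import urllib.error
import urllib.request

# Hypothetical address of a locally running development server
BASE_URL = "http://localhost:5000"


def build_create_user_request(username, password, email):
    """Build a POST request carrying the user data as a JSON body."""
    payload = {"username": username, "password": password, "email": email}
    return urllib.request.Request(
        BASE_URL + "/api/create_user",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req):
    """Send the request and return (status_code, decoded_json_body)."""
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx and 5xx, but the JSON error body is still readable
        return err.code, json.loads(err.read().decode("utf-8"))


req = build_create_user_request("sseaborn", "secret", "sseaborn@example.com")
# status, body = send(req)  # uncomment once the Flask app is running
```

With the app running, send(req) should return the 400 status together with the msg field described above.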

Happy coding.