

Thoughts on PyCon 2017, Day 1

by phildini on May 19, 2017


Here we are at the end of the first conference day of PyCon. Thinking over the day, and including thoughts from the opening reception last night, I'm struck by something that is even more true this year than it was last year:

The Python community is incredible. We are at an inflection point where we need to be making measured, conscious decisions to keep Python and its community thriving. 

I'm going to be writing even more about this in the coming weeks, but let me jot down some observations, and then try to sum them up at the end:

1. It was pretty obvious to anyone who had been here last year that there were fewer sponsor booths in the expo hall. Noticeably fewer. Speaking to someone who had a booth last year and chose not to return this year, I heard there was perhaps a level of disgruntlement with the organizers that caused them to skip this year. That's troubling. What's even more troubling is talking to conference organizers for other Python and Django conferences about how it's gotten harder this year to find sponsors.

2. Jake VanderPlas' keynote this morning highlighted some areas where Python is making incredible inroads, and showcased how Python is becoming the de facto tool in many areas of the science community. Scientists choose Python for many reasons, and one of them is:

[Embedded tweet from the keynote]

Or, to put it more succinctly:

[Embedded tweet]

These thoughts really resonated with the audience, based on the number of likes and retweets I got. And I think these are sentiments the community at large shares. We don't (necessarily) choose Python because it's the fastest language on the planet. We choose it because we like working in the language and we love the community that comes with it.

3. Speaking of that community, I didn't get a chance to see as many of the talks as I would have liked, because I spent so much time chatting with people about fascinating topics in the hallway track. The hallway track continues to be one of the best parts of PyCon, and it was especially noticeable this year that people were being encouraged to participate. One of the amazing things about PyCon is that all the talks are recorded and put online for free, sometimes within hours of their being given, so attending a talk can often be considered secondary to meeting interesting people in the hallway.

4. This is even more pure anecdote than (3), but it felt like I heard of more people finding it harder to get jobs in Python building web applications, and easier in things like data or science. I can't prove this is true, and it might not be all bad, but it's something to watch. Any area where it's suddenly harder to find work in Python means a pillar of our community is weakening, and we should be aware of it.

5. The day ended with lightning talks, and I hope everyone in the audience saw Cameron Dershem's talk about what the Rust community is doing better than the Python community, especially when it comes to improving usability of the language and making it easier to contribute. Furthermore, I hope it was a call to arms for all of us to start pushing for making every aspect of our community feel welcoming, and like new people can make a difference.

Summing up: Python's community still feels like home, to me and many others, and PyCon feels like a homecoming. If we want to make sure the community continues to be incredible, we need to keep an eye on trends in where people and companies are using Python. We also need to continue to be excellent to each other, whether the person we're talking to has been here for years or just learned about Python today.

I'm going to keep pushing to make Python better, and I look forward to seeing you all at PyCon tomorrow.


Using Django Channels as an Email Sending Queue

by phildini on April 8, 2016


Channels is a project led by Andrew Godwin to bring native asynchronous processing to Django. Most of the tutorials for integrating Channels into a Django project focus on Channels' ability to let Django "speak WebSockets", but Channels has enormous potential as an async task runner. Channels could replace Celery or RQ for most projects, and do so in a way that feels more native.

To demonstrate this, let's use Channels to add non-blocking email sending to a Django project. We're going to add email invitations to a pre-existing project, and then send those invitations through Channels.

First, we'll need an invitation model. This isn't strictly necessary, as you could instead pass the right properties through Channels itself, but having an entry in the database provides a number of benefits, like using the Django admin to keep track of what invitations have been sent.

from django.db import models
from django.contrib.auth.models import User


class Invitation(models.Model):

    email = models.EmailField()
    # Set when the invitation email actually goes out; null until then.
    sent = models.DateTimeField(null=True)
    sender = models.ForeignKey(User)
    # Random string used in the invitation URL.
    key = models.CharField(max_length=32, unique=True)

    def __str__(self):
        return "{} invited {}".format(self.sender, self.email)

We create these invitations using a ModelForm.

from django import forms
from django.utils.crypto import get_random_string

from .models import Invitation


class InvitationForm(forms.ModelForm):

    class Meta:
        model = Invitation
        fields = ['email']

    def save(self, *args, **kwargs):
        self.instance.key = get_random_string(32).lower()
        return super(InvitationForm, self).save(*args, **kwargs)

Connecting this form to a view is left as an exercise for the reader.
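For illustration, though, a minimal version of that view might look something like this sketch; the URL name, template path, and the use of the logged-in user as the sender are assumptions, not taken from the original project.

from django.contrib.auth.decorators import login_required
from django.shortcuts import redirect, render

from .forms import InvitationForm


@login_required
def send_invitation(request):
    form = InvitationForm(request.POST or None)
    if request.method == 'POST' and form.is_valid():
        # The form only collects an email, so attach the sender here.
        form.instance.sender = request.user
        form.save()
        return redirect('invite-sent')  # hypothetical URL name
    return render(request, 'invitations/send.html', {'form': form})

What we'd like to have happen now is for the invitation to be sent in the background as soon as it's created. Which means we need to install Channels.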

pip install channels

We're going to be using Redis as a message carrier, also called a layer in Channels-world, between our main web process and the Channels worker processes. So we also need the appropriate Redis library.

pip install asgi-redis

Redis is the preferred Channels layer and the one we're going to use for our setup. (The Channels team has also provided an in-memory layer and a database layer, but use of the database layer is strongly discouraged.) If we don't have Redis installed in our development environment, we'll need instructions for installing it on our OS. (This possibly means googling "install redis {OUR OS NAME}".) If we're on a Debian-based Linux system, this will be something like:

apt-get install redis-server

If we're on a Mac, we'll use Homebrew, and install Redis through it:

brew install redis

The rest of this tutorial is going to assume we have Redis installed and running in our development environment.

With Channels, Redis, and asgi-redis installed, we can start adding Channels to our project. In our project's settings.py, add 'channels' to INSTALLED_APPS and add the channels configuration block (note the import of os, which the Redis URL lookup needs).

import os  # needed for the environment lookup below

INSTALLED_APPS = (
    ...,
    'channels',
)

CHANNEL_LAYERS = {
    "default": {
        "BACKEND": "asgi_redis.RedisChannelLayer",
        "CONFIG": {
            "hosts": [os.environ.get('REDIS_URL', 'redis://localhost:6379')],
        },
        "ROUTING": "myproject.routing.channel_routing",
    },
}

Let's look at the CHANNEL_LAYERS block. If it looks like Django's database settings, that's not an accident. Just as we have a default database defined elsewhere in our settings, here we're defining a default Channels configuration. Our configuration uses the Redis backend, specifies the URL of the Redis server, and points at a routing configuration. The routing configuration works like our project's urls.py. (We're also assuming our project is called 'myproject'; you should replace that with your project's actual package name.)

Since we're just using Channels to send email in the background, our routing.py is going to be pretty short.

from channels.routing import route

from invitations.consumers import send_invite

channel_routing = [
    route('send-invite', send_invite),
]

Hopefully this structure looks somewhat like how we define URLs. What we're saying here is that we have one route, 'send-invite', and what we receive on that channel should be consumed by the 'send_invite' consumer in our invitations app. The consumers.py file in our invitations app is similar to a views.py in a standard Django app, and it's where we're going to handle the actual email sending.

import logging
from django.contrib.sites.models import Site
from django.core.mail import EmailMessage
from django.utils import timezone

from invitations.models import Invitation

logger = logging.getLogger('email')

def send_invite(message):
    try:
        invite = Invitation.objects.get(
            id=message.content.get('id'),
        )
    except Invitation.DoesNotExist:
        logger.error("Invitation to send not found")
        return

    subject = "You've been invited!"
    body = "Go to https://%s/invites/accept/%s/ to join!" % (
        Site.objects.get_current().domain,
        invite.key,
    )
    try:
        # Name this 'email' so we don't shadow the incoming Channels message.
        email = EmailMessage(
            subject=subject,
            body=body,
            from_email="Invites <invites@%s>" % Site.objects.get_current().domain,
            to=[invite.email],
        )
        email.send()
        invite.sent = timezone.now()
        invite.save()
    except Exception:
        logger.exception("Problem sending invite %s", invite.id)

Consumers consume messages from a given channel, and messages are wrapper objects around blocks of data. That data must reduce down to a JSON blob, so it can be stored in a Channels layer and passed around. In our case, the only data we're using is the ID of the invite to send. We fetch the invite object from the database, build an email message based on that invite object, then try to send the email. If it's successful, we set a 'sent' timestamp on the invite object. If it fails, we log an error.

The last piece to set in motion is sending a message to the 'send-invite' channel at the right time. To do this, we modify our InvitationForm:

from django import forms
from django.utils.crypto import get_random_string

from channels import Channel

from .models import Invitation


class InvitationForm(forms.ModelForm):

    class Meta:
        model = Invitation
        fields = ['email']

    def save(self, *args, **kwargs):
        self.instance.key = get_random_string(32).lower()
        response = super(InvitationForm, self).save(*args, **kwargs)
        notification = {
            'id': self.instance.id,
        }
        Channel('send-invite').send(notification)
        return response

We import Channel from the channels package, and send a data blob on the 'send-invite' channel when our invite is saved.

Now we're ready to test! Assuming we've wired the form up to a view, and set the correct email host settings in our settings.py, we can test sending an invite in the background of our app using Channels. The amazing thing about Channels in development is that we start our devserver normally, and, in my experience at least, It Just Works.

python manage.py runserver
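We can also push a test message onto the channel ourselves, from a manage.py shell in a second terminal while runserver keeps going. A small sketch, assuming at least one Invitation row already exists:

# python manage.py shell
from channels import Channel

from invitations.models import Invitation

# Grab an existing invitation and ask the worker to send it.
invite = Invitation.objects.first()
Channel('send-invite').send({'id': invite.id})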

Congratulations! We've added background tasks to our Django application, using Channels!

Now, I don't believe something is done until it's shipped, so let's talk a bit about deployment. The Channels docs make a great start at covering this, but I use Heroku, so I'm adapting the excellent tutorial written by Jacob Kaplan-Moss for this project.

We start by creating an asgi.py, which lives in the same directory as the wsgi.py Django created for us.

import os
import channels.asgi

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")
channel_layer = channels.asgi.get_channel_layer()

(Again, remembering to replace "myproject" with the actual name of our package directory)

Then, we update our Procfile to include the main Channels process, running under Daphne, and a worker process.

web: daphne myproject.asgi:channel_layer --port $PORT --bind 0.0.0.0 -v2
worker: python manage.py runworker --settings=myproject.settings -v2

We can use Heroku's free Redis hosting to get started, deploy our application, and enjoy sending email in the background without blocking our main app serving requests.

Hopefully this tutorial has inspired you to explore Channels' background-task functionality, and think about getting your apps ready for when Channels lands in Django core. I think we're heading towards a future where Django can do even more out-of-the-box, and I'm excited to see what we build!


Special thanks to Jacob Kaplan-Moss, Chris Clark, and Erich Blume for providing feedback and editing on this post.


Why Doesn't the Django CSRF Cookie Default to 'httponly'?

by phildini on October 19, 2015


Recently, some questions asked by a friend prompted me to look deeper into how Django actually handles its CSRF protection, and something stuck out that I want to share.

As a refresher, Cross-Site Request Forgery (CSRF) is a vulnerability in web applications where the server will accept state-changing requests without validating they came from the right client. If you have example.com/user/delete, where normally a user would fill out a form to delete that account, and you're not checking for CSRF, potentially any site the user visits could delete the account on your site.

Django, that marvelous framework for perfectionists with a deadline, does some things out-of-the-box to try and defend you from CSRF attacks. It comes default-configured with the CSRF middleware active in the middleware stack, and this is where most of the magic happens.

The middleware works like so: When it gets a request, it tries to find a csrf_token in the request's cookies (all cookies the browser knows about for a URL are sent with every request to that URL, and you can read about some interesting side-effects of that here: Cookies Can Be Costly On CDNs). If it finds a token in the cookie, and the request is a POST request, it looks for a matching token in the request's POST data. If it finds both tokens, and they match, hooray! The middleware approves the request, and the request marches forward. In all other cases, the middleware rejects the request, and an error is returned.
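One way to watch the middleware do its job is with Django's test client, which can be told to enforce CSRF checks (it skips them by default). A quick sketch, where '/user/delete/' stands in for the hypothetical state-changing endpoint from the example above:

from django.test import Client

# The test client skips CSRF checks unless we ask for them.
client = Client(enforce_csrf_checks=True)

# A POST with no CSRF token in the cookie or POST data gets rejected.
response = client.post('/user/delete/', {'confirm': 'yes'})
assert response.status_code == 403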

The CSRF middleware also modifies the response on its way out, in order to do one important thing: set the cookie containing the CSRF token. It's here that I noticed something curious: the CSRF cookie doesn't default to 'httponly'.

When a site sets a cookie in the browser, it can choose to set an 'httponly' property on that cookie, meaning the cookie can only be read by the server, and not by anything in the browser (like, say, JavaScript). When I first read this, I thought this was weird, and possibly a mistake. Not setting the CSRF token 'httponly' means that anyone who can run JS on your pages could steal and modify the CSRF cookie, rendering its protection meaningless.

Another way to read what I just wrote would be: "If my site is vulnerable to Cross-Site Scripting (XSS) attacks, then they can break my CSRF protection!" This phrasing highlights a bit more why what I just said is funny: If your site is vulnerable to an XSS attack, that's probably game over, and worrying about the CSRF protection is akin to shutting the barn door after the horse has been stolen.

Still, if the CSRF cookie defaulted to 'httponly', and you discovered your site had an XSS, you might breathe a little easier knowing that bad state-changing requests had a harder time getting through. (Neglecting other ways the cookie could be broken in an XSS attack, like cookie jar overflow). I was talking to Asheesh Laroia about this, and he called this the "belt-and-suspenders" approach to securing this facet of your web application. He's not wrong, but I was still curious why Django, which ships with pretty incredible security out-of-the-box, didn't set the default to 'httponly'.

We don't know the answer for sure (and I would love to have someone who knows give their thoughts in the comments!), but the best answer we came up with is: AJAX requests.

The modern web is composed less and less of static pages. Increasingly, we're seeing rich client-side apps, built in JavaScript and HTML, with simple-yet-strong backends fielding requests from those client-side apps. In order for state-changing AJAX requests to get the same CSRF protection that forms on the page get, they need access to the CSRF token in the cookie.

It's worth noting that we're not certain about this, and the Django git history isn't super clear on an answer. There is a setting you can adjust to make your CSRF cookie 'httponly', and it's probably good to set that to 'True', if you're certain your site will never-ever need CSRF protection on AJAX requests.
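For reference, that setting is a one-liner (CSRF_COOKIE_HTTPONLY, available since Django 1.6):

# settings.py
# Only safe if no JavaScript on your site ever needs to read the CSRF cookie.
CSRF_COOKIE_HTTPONLY = True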

Thanks for reading, let me know what you think in the comments!

Update (2015-10-19, 10:28 AM): Reader Kevin Stone left a comment with one implementation of what we’re talking about:

$.ajaxSetup({
    headers: {
        'X-CSRFToken': $.cookie('csrftoken')
    }
});

Django will also accept CSRF tokens in the header ('X-CSRFToken'), so this is a great example. 

Also! Check out the comment left by Andrew Godwin for confirmation of our guesses.


Porting Django Apps to Python 3, Part 1

by phildini on May 26, 2015


Hello! Welcome to the first in a series of posts about my experiences making Django apps Python 3 compatible. Through these posts I'll start with a Django app that is currently written for Python 2.7, and end up with something that can be run on Python 3.4 or greater.

Some quick notes before we begin:

  • Why am I doing this? Because we have 5 years until Python 2.7 goes end-of-life, and I want to be as ready as possible for making that change in the code that I write for my job. To prep for that, I'm converting all the Django apps I can find, from side-projects and Open Source projects.
  • Why 5 years? Because that's the time outlined in PEP-0373, and based on Guido's keynote at PyCon 2015, that's the timeline we all should be sticking to. It's also recently been brought to my attention that further Python 2.7 releases are really the responsibility of one person, the inimitable Benjamin Peterson, and if he for any reason decides to stop making updates, that 2020 timeline may get drastically shortened. It's better to be prepared now.
  • Why "Python 3 compatible"? Why not fully Python 3? Because I believe the best way forward for the next 5 years will be writing polyglot code that can be run in either Python 2.7 or Python3.4+ environments. (I'm going to start shortening those to py2 and py3 for the rest of this post.) So I won't be using 2to3, but I will be using six.

With those pieces in mind, let's begin!

I started with Cards Against Django, a Django implementation of Cards Against Humanity that I wrote with some friends a couple years ago. We didn't own Cards Against Humanity, and hilariously thought it would be easier to build it than to buy it. (We also may have just wanted the challenge of building a usable Django app from scratch). The end result was a game that could be played with an effectively unlimited number of players, each on their own device, and which was partially optimized for mobile play. To get a sense of what the code was like before I started the migration, browse the Github repo at this commit.

Now it turns out I made one assumption right at the beginning of this port that made things a bit harder, and may have distracted from the original mission. The assumption was that Django 1.5 is not py3 compatible, when in fact it was the first py3-compatible version. Had I found and read this Python 2 to 3 porting guide for Django, I may have saved myself some headache. You now get the benefit of a free mini-lesson on upgrading from Django 1.5 to Django 1.8.

Resource #1: The Django Python 3 Porting Guide

Real quick, I'm going to go through how my environment was set up at the beginning of this project, based on the starting commit listed above.

The snippet below will set up a virtual environment using mkvirtualenv, install the local requirements for the app, and initialize the db using the local settings.
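Something like this, where the virtualenv name, requirements file, and settings module are stand-ins rather than the project's actual names:

$ mkvirtualenv cards
$ pip install -r requirements.txt
$ python manage.py syncdb --settings=settings_local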

Ok, let's upgrade to Django 1.8:

$ pip install -U Django

...and naively try to run the dev server.

Well, that's a bummer, but it's fairly expected that I wouldn't be able to make the jump to 1.8 easily. What's interesting about this error is that it's not my code that seems to be the problem -- it looks like the problem is in django-nose.

$ pip install -U django-nose nose

Try runserver again...

Hmm... obviously the API for transactions changed between Django 1.5 and Django 1.8. Here I looked at the Django release notes, and noticed that 'commit_on_success' was deprecated in 1.6 and removed in 1.8. Digging into the new transaction API, it looked like 'transaction.atomic' was pretty much the behavior I wanted, so I went with that.
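The change itself is mechanical. A hedged sketch of what it looks like, using a made-up helper from the game for illustration:

from django.db import transaction


def play_card(game, card):
    # Django 1.5 spelled this 'with transaction.commit_on_success():'.
    # In 1.8, atomic() gives the same all-or-nothing behavior.
    with transaction.atomic():
        game.cards_played.add(card)  # hypothetical relation
        game.save()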

Resource #2: The Django Release Notes

Third time's the charm, yes?

Apparently not. This one was weird to me, because I didn't have South in my installed apps. Through a sense of intuition that I can't really explain, I suspected django-allauth, the authentication package this project uses. I wondered if an older version of django-allauth was trying to do South-style migrations.

$ pip install -U django-allauth

Sure enough, an old version of allauth was the culprit, and an upgraded version allowed the runserver to launch successfully.

So now I have the development server running, but I've got that warning about needing to run migrations. This is the part of the upgrade that I knew was coming, and the part I was most worried about. I already have the database initialized from Django 1.5's 'syncdb' -- what will happen when I run 'migrate'?

It turns out, not a whole lot. Running this command gave me a 'table already exists' DatabaseError. Googling for this issue left me a little stumped, so eventually I turned to the #django channel on Freenode IRC. (If you're curious how to get a persistent connection to IRC, check out this post.) I was able to get some great help there, and it was suggested I try the one-two punch of:
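Something like this pair of commands (reconstructed from the '--fake' description that follows):

$ python manage.py migrate --fake
$ python manage.py migrate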

That '--fake' bit did the trick, convincing Django I had run the migrations (since the tables were already correctly created), and silencing the warning.

With the development server running on Django 1.8 (including the very limited test suite), I'm feeling confident about the migration to Python 3. Is my confidence misplaced? Find out in part 2!

If you'd like to see the totality of the work required to migrate this Django app from 1.5 to 1.8, check out this commit.

If you have feedback about what I did wrong or right, or have questions about what's here, leave a comment, and I'll respond as soon as I'm able!


IRC all the way down (ZNC + IRCCloud + Quassel)

by phildini on May 2, 2015


For years, I felt that IRC was something I had to put up with. Most of the communities I want to be part of have a large IRC presence, and so I would fire up my trusty local IRC client, connect to Freenode or OFTC, and try to learn from the excellent people who also hang out in various IRC communities. But I was always frustrated by the fact that I would miss discussions when I wasn't connected.

A few months back, a friend of mine introduced me to Quassel, an open source software package that gets around IRC's major limitation (from my point of view): that your ability to read the contents of a channel is limited by your client being connected to the network. (The number of IRC loggers and other workarounds for persistence indicates others also find this a limitation.)

Quassel, in its preferred configuration, requires at least two machines: a core that runs on an always-on server, and a client that connects to that core. The core is what actually connects to the IRC networks with your ident, and keeps a persistent connection for you. On the surface, this might not seem like an improvement over, say, irssi running on a server. It's an improvement for me because, despite several attempts, I have never been able to wrap my mind or fingers around irssi's keyboard shortcuts. Quassel has a nicer interface, a good desktop app, and some mobile app support.

How do you get Quassel? Quite easily, if you're on an Ubuntu system. I recommend one of the cheap boxes from DigitalOcean. They're easy to use, and only $5/month for a 512MB RAM / 20GB disk box.

On the server where you want your Quassel core to run, add the Quassel ppa to your apt repositories:

sudo add-apt-repository ppa:mamarley/quassel

Install the Quassel core package:

sudo apt-get update; sudo apt-get install quassel-core

You also want to make sure you've opened up port 4242 to outside traffic, as that's the port Quassel runs on. If you're not running a firewall (you probably should be!), you don't have to do anything. If you're running ufw like I am, you'll need to do this:

sudo ufw allow 4242
sudo ufw reload

Now that your core is all set up, let's configure it! One of the amazing things about Quassel is that you configure the core through the client. Download the client for your OS of choice, and it will walk you through how to get everything up and running.

So Quassel is great, and for a few months it served all my IRC needs perfectly well. But as I started getting more and more involved in communities on IRC, I started to feel the desire for a more mobile-ready solution. Quassel does have a free Android app, but I currently run iOS, and the iOS app didn't thrill me based on what I saw of it. I started looking for a better solution.

Some of my friends on IRC have been using IRCCloud for months, and they seemed to really enjoy it. I got an invite to the service from one of them, played around a bit, but didn't immediately see the appeal. At the time, I was still happy with my Quassel core and client. When I started hankering for a mobile solution, I gave IRCCloud another look, but didn't feel I could leave Quassel completely behind. By this point, I had given accounts on the core to some other friends interested in IRC, so I knew I couldn't shut it down. Plus, having Quassel as a backup in case IRCCloud ever went down seemed like a great idea. How could I get the best of both worlds, where Quassel and IRCCloud could use the same IRC connection, and I would never lose uptime?

Enter ZNC. ZNC is an IRC bouncer, a piece of software that essentially proxies IRC connections for you. It connects to IRC, and you connect to it, much like Quassel. The difference is that the Quassel client speaks to the Quassel core over the Quassel protocol, while you can connect to ZNC over plain IRC, using any client: IRCCloud, the Quassel core, or anything else.

How do you get setup with ZNC? On the same box where you're running that Quassel core, do:

sudo apt-get install python-software-properties
sudo add-apt-repository ppa:teward/znc
sudo apt-get update
sudo apt-get install znc znc-dbg znc-dev znc-perl znc-python znc-tcl

This will add the ZNC ppa to your apt repositories, and install ZNC. Next you need to choose a user that will run the ZNC service. This could be your default user, although that's not recommended, and it most certainly shouldn't be the root user. I created a new user for running ZNC like this:

sudo adduser znc-admin

Before you configure ZNC to run under this user, you'll need to open another port in your firewall.

sudo ufw allow 5000
sudo ufw reload

Now you're ready to start up ZNC.

sudo su znc-admin
znc --makeconf

ZNC will ask you a whole bunch of questions, like what port to run on, what users to create, and how connections should be set up. The directions starting about halfway down this DigitalOcean article are pretty good, and I followed most of their options, changing the user details to match what I needed. Once you've finished setup, ZNC will give you two important URLs: The URL to connect to the ZNC web interface, where you'll most likely configure ZNC going forward, and the URL for connecting an IRC client to ZNC. That connection URL will be in the form of:

{your server address or IP}:{port you chose} {username}:{password}

If you have an IRCCloud account, you'll need to pay special attention to those last bits, because {username}/{network name}:{password} will be your full server password to connect to the right account. For example:

UserName/freenode:password

When you add the network to IRCCloud, it'll look something like this:

[Screenshot: IRCCloud network settings]

You can use similar settings to connect Quassel to the same ZNC server.

Unfortunately, IRCCloud makes you upgrade your account to add servers with passwords. But in my opinion, IRCCloud is totally worth the $5/month. The more I use it, the more I like the service, the interface, and the mobile support. IRCCloud plus ZNC, with Quassel as a backup client connected to the same ZNC service, solves all my IRC woes. Hopefully, some combination of these services will be helpful to you as well.

And I'll see you on IRC.