Redis: Relations in a NoSQL world
Posted: March 23rd, 2010 | Author: Adam | Filed under: NoSQL, Python, Redis, programming | 4 Comments »
In the first article in our series on Redis we talked about how to get started and covered the basics of the simple data structures available in redis. The simple structures are good for basic operations like storing strings and keeping counters, but using redis for anything more complex requires relating one set of data to another. At first glance this is a bit of a problem, since redis is by design a flat dictionary with no relations, but with a bit of application code and adherence to some self-imposed naming conventions you can build quite complex applications using redis.
We hinted at this style of structure in the first article when we got to talking about sets, with our reddit-like example of story voting.
r_server.sadd("story:5419:upvotes", "userid:9102")
Here we have a set in redis called “story:5419:upvotes”, and every element stored in the set has the form “userid:xxxx”. This is an example of a basic relation: the key name ties the set to one story, and each member ties that story to a user.
Let’s make this example simpler and go with the concept of username / password storage for a website. Instead of using MySQL to store this information we will use redis.
Relations in a NoSQL world
We have to cheat in redis’s flat namespace to make relations in our data. Redis isn’t going to be aware of these relations, and unlike an RDBMS (like MySQL), redis does nothing to help us out. No indexes, no nifty SQL syntax with WHERE or JOIN to do the work for us. We have to handle all of the relational logic in application code, which in turn means you (the developers) have to write extra documentation explaining just how everything fits together in redis, or you are going to lose track of your data. The benefit you get, though, is raw speed and flexibility during development. Don’t like the constraints one data model is giving you? Start over and just recode; no need to alter a DB schema or change a DB server; redis doesn’t care.
Our requirements
- Users have a username
- Usernames have a password in a one-to-one fashion
- We must be able to create a user and assign a password
- Users must be able to login (check username against password)
- We must be able to delete a username and their associated password
Designing our redis namespace
This is the format we’ll use to describe the redis namespace:
[key] – {datatype} – (example values)
We will also use the notation *variable* to mark the part of a key name that is a variable relating to something else.
[users] – {set} – (“adam”, “bob”, “carol”)
[user:*username*:fullname] – {string} – (“Adam Smith”, “Bob Barker”, “Carol Burnett”)
[user:*username*:password] – {string} – (md5 hash password, no example)
That’s it. As long as we stick with this design pattern here we can keep track of all of the users in our system and handle logins.
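One way to make it harder to drift from this convention is to build every key name in a single place. The following is only a sketch: the names USERS_KEY, user_key, fullname_key and password_key are hypothetical helpers, not part of redis or of the code in this post.

```python
# Hypothetical helpers (assumed names) that centralize the key layout
# described above, so no other code has to format key strings by hand.

USERS_KEY = "users"  # the set holding every username


def user_key(username, field):
    """Build a per-user key such as 'user:bob:fullname'."""
    return "user:%s:%s" % (username, field)


def fullname_key(username):
    """Key of the string holding the user's full name."""
    return user_key(username, "fullname")


def password_key(username):
    """Key of the string holding the user's password hash."""
    return user_key(username, "password")
```

With helpers like these, a typo in a key name shows up in one function instead of being scattered across the application.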
Creating a new user
import redis
from hashlib import md5

r = redis.Redis("localhost")

def add_user(username, fullname, password):
    # SADD returns a false value when the username is already in the set
    if r.sadd("users", username):
        r.set("user:%s:fullname" % username, fullname)
        r.set("user:%s:password" % username, md5(password).hexdigest())
        return True
    else:
        return False
We’ve created a function called add_user that takes a username, the full name of the user, and a password, and returns True if we successfully create a new user and False if we don’t. Remember the redis command SADD (Set Add) returns False if the object already exists in the set, so if a user with that username already exists, we can’t create a new user and we return False; otherwise the username is added to the set and we set the associated keys “fullname” and “password” to the appropriate values. We hash the password with md5 before storing it, since we never store passwords anywhere in plain text.
Let’s take a look at what this looks like being called.
>>> add_user("bob", "Bob Barker", "priceisright")
True
>>> add_user("bob", "Bob Barker", "priceisright")
False
The first time we try to add the username “bob” we succeed, but the second time it fails since there is already a user with that username. Let’s look at the data stored in redis.
>>> r.smembers("users")
set(['bob'])
>>> r.get("user:bob:fullname")
'Bob Barker'
>>> r.get("user:bob:password")
'543f24fdc95f4e3f52fc7a4f2166167e'
We see our key “users”, which holds a set with only one element, our username “bob”. And now we have our related keys “user:*username*:fullname” and “user:*username*:password”, which in practice are “user:bob:fullname” and “user:bob:password”. As long as we stick to the naming convention we should always know where everything is!
Let’s go ahead and create a few more users just for good measure
>>> add_user("adam", "Adam Smith", "wealthofnations")
True
>>> add_user("carol", "Carol Burnett", "eartug")
True
>>> r.smembers("users")
set(['carol', 'bob', 'adam'])
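Because redis will not JOIN anything for us, following a relation means walking the keys ourselves. Here is a minimal sketch of that idea; list_users is a hypothetical function name, and it assumes the client returns plain strings (with redis-py on Python 3 that means creating the client with decode_responses=True).

```python
def list_users(client):
    """Return {username: fullname} by following the naming convention.

    `client` is assumed to expose smembers() and get() the way a
    redis-py client does, and to return str values rather than bytes.
    """
    result = {}
    # The "users" set is our index; each member leads to a related key.
    for username in sorted(client.smembers("users")):
        result[username] = client.get("user:%s:fullname" % username)
    return result
```

This costs one GET per user on top of the SMEMBERS call, which is fine for a demo but is exactly the kind of work an RDBMS would have done for us in a single query.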
Logging a user in
Now that we have created our relational data store in redis and have a few users in there, let’s try to log one in.
def authenticate_user(username, password):
    # Equivalent but slower: if username in r.smembers("users"):
    if r.sismember("users", username):
        passhash = md5(password).hexdigest()
        if passhash == r.get("user:%s:password" % username):
            return True
        else:
            return False
    else:
        return False
The first thing we do is check whether the username is a member of the set stored at the key “users”, using the method sismember (set – is a member). If the username isn’t there we return False, since that user doesn’t even exist. If the user does exist, we take a hash of the submitted password just like we did when we added the user, then fetch the stored hash from the related key and see if they are the same. If they are, we return True!
Let’s take a look at this in action
>>> authenticate_user("ghost", "idontexist!")
False
>>> authenticate_user("adam", "keynes")
False
>>> authenticate_user("adam", "wealthofnations")
True
>>> authenticate_user("bob", "priceisright")
True
So we can see here that a user that doesn’t exist is not authenticated. A user that does exist but has the wrong password is denied. Given the right username and the right password, a user gets in! And just to show we can handle multiple users like a proper relational system should, “bob” authenticates properly as well.
Deleting a user
Eventually we are going to need to delete a user from our system, and that is as easy as everything else we have done as long as we stick to the same namespace we laid out before.
def delete_user(username):
    if username in r.smembers("users"):
        r.srem("users", username)
        r.delete("user:%s:fullname" % username)
        r.delete("user:%s:password" % username)
        return True
    else:
        return False
Again the first thing we do is check that the user exists, since if they don’t there is no need to do anything else. Once we know the user exists, we remove them from the set of “users” and then remove their related fullname and password keys. Let’s check this out in action.
>>> authenticate_user("adam", "wealthofnations")
True
>>> delete_user("adam")
True
>>> authenticate_user("adam", "wealthofnations")
False
>>> delete_user("adam")
False
>>> r.smembers("users")
set(['bob', 'carol'])
We can authenticate the user “adam” properly. Then we delete him, and now he can’t authenticate any more. If we try to delete him again we see that we can’t since he doesn’t exist.
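The version above reads the whole “users” set just to test for one name. As an alternative sketch, we can lean on the return value of SREM, which tells us whether anything was actually removed. The name delete_user_via_srem is hypothetical, and the function takes the client as a parameter (an assumption made here so the example does not depend on a running server).

```python
def delete_user_via_srem(client, username):
    """Delete a user without fetching the whole "users" set.

    `client` is assumed to expose srem() and delete() like a redis-py
    client. SREM returns a false value when the member was not in the
    set, which doubles as our "does this user exist?" check.
    """
    if not client.srem("users", username):
        return False  # nothing removed, so the user never existed
    # DELETE accepts several keys, so both related keys go in one call.
    client.delete(
        "user:%s:fullname" % username,
        "user:%s:password" % username,
    )
    return True
```

With redis-py this would be called as delete_user_via_srem(r, "adam"), and it should behave like delete_user above while sending less data over the wire.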
Wrapping Up
We can make very complex relational data structures in redis as long as we document and properly use the namespace. Keep everything as simple and as straightforward as possible and document everything thoroughly. Also make sure all your actions on redis are kept as modular as possible by wrapping them in functions or classes, just like in this example. Then if for some reason you need to alter your “schema” you only have one place to change it in your application.
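To illustrate the “one place to change it” idea, here is a rough sketch of the same logic wrapped in a class. UserStore and its method names are hypothetical, the client is passed in rather than created globally, and it assumes the client returns str values (for redis-py on Python 3, a client created with decode_responses=True). MD5 is used only to mirror the example above.

```python
from hashlib import md5


class UserStore(object):
    """Hypothetical wrapper that keeps the whole key layout in one class."""

    USERS_KEY = "users"

    def __init__(self, client):
        # Any object exposing sadd/sismember/set/get like redis-py will do.
        self.client = client

    def _key(self, username, field):
        # The only place that knows the "user:*username*:*field*" layout.
        return "user:%s:%s" % (username, field)

    @staticmethod
    def _hash(password):
        # Mirrors the article's MD5 usage; not a safe password hash.
        return md5(password.encode("utf-8")).hexdigest()

    def add(self, username, fullname, password):
        # SADD is falsy when the username is already taken.
        if not self.client.sadd(self.USERS_KEY, username):
            return False
        self.client.set(self._key(username, "fullname"), fullname)
        self.client.set(self._key(username, "password"), self._hash(password))
        return True

    def authenticate(self, username, password):
        if not self.client.sismember(self.USERS_KEY, username):
            return False
        stored = self.client.get(self._key(username, "password"))
        return stored == self._hash(password)
```

If the key layout ever changes, only _key and USERS_KEY need to be touched; callers keep using add and authenticate unchanged.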
Footnote:
This is an example and is not a secure method for handling user-submitted data; it does no sanitization or sanity checks on the information. I am aware of this, but this post was not designed to be a comprehensive analysis of AAA methods, just a quick example of how to do basic relations in a key-value store.
[...] just yesterday we posted a tutorial on how to use redis to store relational despite relations not being supported. Soon after we published the documentation on the new redis hash type went online. Now hashes by [...]
Bug correction from Brian: http://degizmo.com/2010/03/24/redis-relations-in-a-nosql-world-using-hashes/#comments
Good intro on how to structure data on Redis.
Readers may be interested in how Ohm does it. Ohm is a Ruby library that maps objects to Redis back and forth. It supports the native Redis data types and higher level features like relationships, indexes, validations, etc.
http://ohm.keyvalue.org
As a quick note on delete_user: we first retrieve the entire set of users and then check whether the to-be-deleted user is present, which is fine for a small demo but not good on a production site. Instead, to check for the existence of the user to be deleted, one can use r.sismember("users", username), which returns just a boolean (this is also the method the author used in authentication).