Motor vs PyMongo - Asynchronous vs Synchronous DB calls

Hello people! This is my first comparison. In case you feel I need to change my method of testing, shout it out in the comments.

Let me explain the purpose behind this experiment. I'm part of a startup, and we were creating an admin page for our service. We needed to stream customer data. We were on Flask, Werkzeug, MongoDB (using PyMongo), and Python3. We didn't really think of having an admin page beforehand, so we essentially came up with a quick hack (on the day our service was kickstarted) of querying the database each time we refreshed our admin page. We created a simple route on Flask that queried the DB and returned the data. Although this seemed to work with little data in the DB, it started throwing up when there were DB insertion requests from customers. We were using the $5 plan from Digital Ocean, which meant we had a not-so-fancy machine. At times, it would seem that the machine had stalled. Even a simple ping would have a huuuge latency. I'm not blaming PyMongo (although I initially did, but as the results below would show, the real culprit wasn't PyMongo). We blamed ourselves for going into production with a development server (Werkzeug).

We clearly didn't heed to the copious amounts of advice that Werkzeug wouldn't handle connections in a live environment. Call it a rookie mistake. To be honest, we didn't expect to have even a 100 requests on the server. Poor extrapolation, I guess. Also, when we checked the system resources, MongoDB was using about 400MB of RAM. I was sure that PyMongo was behind this fiasco.

Okay, enough of history, let's delve into the wonderful world of asynchronous server programming. I'll explain WebSockets and the current admin page in later posts. But for this post, I'll restrict myself to a comparison of Motor and PyMongo, which are two drivers for MongoDB.

A little about MongoDB - it is a document-based database, which means it provides us the flexibility to alter the schema at any time. Or, in other words, there is no schema at all. One can insert any integer, string, dict, serialized data, etc. without having to tinker with the schema.

A little about PyMongo - every DB has a connector. I've previously used mysql.connector when I was writing python code that interacts with MySQL. And this is the official Python connector for MongoDB. I think it does a wonderful job, but the only problem though, is that it is synchronous, and hence, scaling might be a problem.

A little about Motor - Just like PyMongo, this is another connector for MongoDB. This provides an asynchronous mechanism for DB calls and insertions. Before you think that this is the magic potion for writing superfast, ultra-scalable software, complete reading the rest of the post.

Okay! First test...
Pretty simple, really. Use the time utility on any linux machine to calculate the CPU time.

There are two files here - pymongotest.py (which consists code using PyMongo) and motortest.py (which consists code using Motor)

Here's the code for pymongotest.py

from pymongo import MongoClient         #import PyMongo driver
client = MongoClient()
db = client['Test1']

def callback1():
    print("Callback1 is called")
       callback2()

def callback2():
    print("Callback2 is called")

def start():
    document = {"1":"one", "2":"two"}
    db.users.insert(document)
    print("Printed in start")
    callback1()

start()

And here's the code for motortest.py

from tornado.ioloop import IOLoop
from motor import MotorClient       #Import motor driver
client = MotorClient()
db = client['Test2']

def callback1(result, error):
    print("Result is %s"%result)
    print("Callback1 is called")
    callback2()

def callback2():
    print("Callback2 is called")
    IOLoop.instance().stop()        #Stops the tornado instance

def start():
    document = {"1":"one", "2":"two"}
    db.users.insert(document, callback=callback1)       #users is the collection 
    print("Printed in start")
    IOLoop.instance().start()       #Begin the tornado IO-Loop

start()

Let's go through the code. In pymongotest.py, I connected to a collection called Test1. The function start() starts the execution of the program. Once called, it inserts the variable "document" into the DB, then prints a "Printed in start" statement, follows it up with a call to callback1(), which prints a "Callback1 is called" statement. This further calls callback2(), which finishes the execution after printing a "Callback2 is called" line.

In motortest.py, I connected it to a collection called "Test2". But the first line is what sort of lends the asynchronous nature to the program. I'm not really sure, since this is my first attempt at writing code in Tornado, so please feel free to correct me. It imports the IOLoop, wcich handles the IO of the program (at least, that's what I think it does). Just as it was in the previous case, the start() function begins the flow of execution. There's something different though, when you see the line where the DB is being updated. There's a callback given as an argument. This is the callback function which is called after the insertion takes place. Also, there's an IOLoop.instance().start(), which as far as I know, begins the IOLoop. Right after the insertion, callback1() is called. Again, callback1() has 2 arguments. If you notice, in pymongotest.py, the callback1() function didn't take any arguments. But this one does. I think the database outputs an ID for the transaction, and this is passed on to the callback function along with an error code, in case an error is thrown. If there's no error, on printing the value of the "error" argument, it prints None.

Okay, now that we are done with the code, let's time it.
But, to see the difference between the two, let's look at the output printed in both the cases. Let's see if there's any difference at all!

When I ran pymongotest.py, this is the output

localhost:~ name$ python3 pymongotest.py
Printed in start
Callback1 is called
Callback2 is called

Let's look at motortest.py

localhost:~ name$ python3 motortest.py
Printed in start
result is 5459xxxxxxx
Callback1 is called
Callback2 is called

Find anything unusual or surprising? Notice the difference? If you observe the output of motortest.py, you will see that the result has been printed after the "Printed in start" statement. It means that the CPU processed the statement after the DB insertion before the insertion has completed. This is asynchronousness (I think I just made up a word) at work. But in pymongotest.py, the database insertion completes before printing the "Printed in start" line. This is because PyMongo is sequential. Now that we've understood the difference, let's take a look at the CPU time...Why don't you take a guess at which one would be faster? PyMongo? Motor? Obviously Motor, duh!

Time for pymongotest.py

localhost:~ name$ time python3 pymongotest.py
Printed in start
Callback1 is called
Callback2 is called

real    0m0.115s
user    0m0.093s
sys     0m0.022s

Check out time for motortest.py

localhost:~ name$ time python3 motortest.py
Printed in start
Result is 5459f062bd0c8e0d014354c2
Callback1 is called
Callback2 is called

real    0m0.185s
user    0m0.155s
sys     0m0.029s

Wait, what? PyMongo is faster than Motor? How can it be? Isn't that against logic? Isn't asynchronous >>> synchronous? Maybe this is a one-off instance. Let's try running it over a 1000 DB insertions. Motor will slaaayyy PyMongo!

So, the same code with the only modification being:

for i in range(1000):
    db.users.insert(document)

Let's run the tests and see..

pymongotest.py at 1000 insertions

localhost:~ name$ time python3 pymongotest.py
Printed in start
Callback1 is called
Callback2 is called

real    0m0.415s
user    0m0.285s
sys     0m0.033s

motortest.py at 1000 insertions

localhost:~ name$ time python3 motortest.py
Printed in start
Result is 5459f234bd0c8e0d174a2d45
Callback1 is called
Callback2 is called

real    0m0.993s
user    0m0.849s
sys     0m0.142s

You've got to be KIDDING me! PyMongo is almost 4.3 times faster than Motor! This is wroooong! Why?

Let's analyze the disadvantage that Motor has, since we have to come up with an answer for this wierd behavior. Motor is a an asynchronous wrapper around PyMongo. So, in fact, it adds to the overhead. Its max speed is bound by the speed of PyMongo, and its min speed can be anything lesser than the speed of PyMongo. It can only ever hope to be as fast as PyMongo, but not faster (in sequential programming).

But I thought asynchronous calls are supposed to be ridiculously faster than synchronous calls. But the results have disproved my beliefs.

Hold on, though. Let's check what happens if there are CPU intensive operations as well. What then? Will PyMongo leave Motor way behind or will Motor at least, put up a decent fight?

Let's modify the code to make it CPU and IO intensive. The existing code spends a lot of time doing DB insertions, and since PyMongo is the native driver, it logically should've shined, and shine it did.

pymongotest.py looks like this..

from pymongo import MongoClient
client = MongoClient()
db = client['Test1']

count = 0

def callback1():
    for i in range(100):    
        print("Callback1 is called")
        callback2()

def callback2():
    for i in range(100):    
        print("Callback2 is called")
    global count
    count += 1

def start():
    for i in range(100):
        document = {"%s"%i:"one", "%s"%(i+1):"two"} 
        db.users.insert(document)
        print("Printed in start")
        callback1()

start()
print("Count is %s"%count)

I've made it a little CPU intensive by throwing in a few calculations (which I think might make the CPU sweat, but I'm pretty certain that there would definitely be hardware support for simple increments, so I'm still not really sure of my method of testing). Let's look at the results..

localhost:~ name$ time python3 pymongotest.py
...
...
...
real    0m53.153s
user    0m5.960s
sys     0m3.772s

The ellipsis represent the print statements..

Now to motortest.py..

from tornado.ioloop import IOLoop
from motor import MotorClient
client = MotorClient()
db = client['Test2']
count = 0

def callback1(result, error):
    for i in range(100):    
        print("Result is %s"%result)        
        print("Callback1 is called")
        callback2()

def callback2():
    for i in range(100):            
        print("Callback2 is called")
    global count
    count += 1
    IOLoop.instance().stop()

def start():
    for i in range(100):        
        document = {"%s"%i:"one", "%s"%(i+1):"two"}     
        db.users.insert(document, callback=callback1)
        IOLoop.instance().start()       

start()
print("Count is %s"%count)  

I'm not an expert at Tornado, so forgive me if this is a mistake to fire up IOLoop.instance().start() at every iteration. The reason I'm doing this is because I think every IOLoop.instance.start() has a counterbalancing IOLoop.instance.stop() inside callback2(). If this has an effect on the results, kindly let me know. Even better, if you could improve this rudimentary test.

So, the results are as follows:

localhost:~ name$ time python3 motortest.py
...
...
...
real    0m54.175s
user    0m6.116s
sys     0m4.183s

I don't understand! Motor is still slower than PyMongo, even though I think the entire process is CPU intensive. What is it that I'm doing wrong? Is it the case that my design is wrong or is it that Motor shines, perhaps, in other cases? From the looks of it, Motor seems to delegate the process flow to a callback function. But shouldn't that make it faster than PyMongo? Or is it that the machine uses some exotic caching mechanism that is aiding PyMongo? I'm on Ubuntu 14.04 with an Intel core i3 3.07Ghz (first gen, desktop) with 2GB RAM.

I'll try the test with random data and check.

Also, during the course of this test, I found out something incredible that I didn't know previously. Printing control statements vs no statements at all vastly improves the speed. I'm not talking about percentage improvements here..I'm talking about speed gains of about 100x on the system time.

When I ran pymongotest.py without the print statements, the time function reported the following result:

localhost:~ name$ time python3 pymongotest.py
real    0m0.653s
user    0m0.176s
sys     0m0.022s

motortest.py returned the following:

localhost:~ name$ time python3 motortest.py
real    0m0.723s
user    0m0.263s
sys     0m0.032s

Even here, Motor lagged behind PyMongo, but the real revelation is that in this case, I didn't print any of those silly lines and the sys time reported is 0.022s/0.032s. Previously, when I printed all those print statements, the time reported is 3.772s/4.183s (PyMongo/Motor respectively). That's about a 171x/130x gain in performance. I didn't know the variation would be this dramatic.

Anyhow, I was hoping Motor to be the panacea that would dramatically make my code faster. Maybe Motor would do well when there are multiple connections racing to perform DB queries.