Why is my cursor closed in my generator function by the time I pass it to StreamingHttpResponse?


I have a potentially large queryset that I don't want to load into memory. It's a custom SQL statement.

django.http.StreamingHttpResponse takes an iterator (generator) argument. To avoid loading everything into memory, I'm making use of Postgres server-side cursors and fetchmany (though I'm yet to verify that this actually works).

Here's my generator function that I pass:

def queryset_generator(cursor, chunk_size=CHUNK_SIZE):
    while True:
        if cursor.closed:
            yield "cursor closed!"
            break
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            break
        yield rows

I test if the cursor is closed because otherwise psycopg2 complains when the code afterwards tries to access a closed cursor.

And here's how I pass it in my view (simplifying the SQL):

with connections['mydb'].cursor() as cursor:
    cursor.execute("SELECT * FROM foobar;")
    return StreamingHttpResponse(queryset_generator(cursor))

That consistently gives me

cursor closed!

Why is the cursor closed in my generator function? If I do this in my view it works fine:

with connections['mydb'].cursor() as cursor:
    cursor.execute("SELECT * FROM foobar;")
    return StreamingHttpResponse(cursor.fetchall())

It's also potentially noteworthy that this works fine in the shell:

cursor = connections['mydb'].cursor()
cursor.execute(...)
for x in StreamingHttpResponse(queryset_generator(cursor))._iterator:
    print(x)

Answer

Why is the cursor closed in my generator function?

Because you exit the context manager with return:

return StreamingHttpResponse(queryset_generator(cursor))

This exits the with block, triggers the __exit__ method on the context manager, and this method closes the cursor. The with statement or the context manager can't know that you just passed a reference to the cursor object to something else that still needs it to remain open. with doesn't care about references, only about the semantic block ending.

If you need to keep the cursor open until the StreamingHttpResponse() instance is done streaming data, you can't use the context manager around the return statement.

Pass in a cursor without using it in a context manager, and make the queryset_generator() function responsible for using with instead:

def queryset_generator(cursor, chunk_size=CHUNK_SIZE):
    with cursor:
        while True:
            if cursor.closed:
                yield "cursor closed!"
                break
            rows = cursor.fetchmany(chunk_size)
            if not rows:
                break
            yield rows

and

cursor = connections['mydb'].cursor()
cursor.execute("SELECT * FROM foobar;")
return StreamingHttpResponse(queryset_generator(cursor))

Now the cursor is kept open until the while loop in queryset_generator() is done.