Reading from MongoDB with Azure Functions

As part of my Cardalog web app, I’m writing an API to interact with a MongoDB database using Azure Functions. In the series introduction, I set up the app’s components. In this installment, I’ll implement a Function to read all cards from the database.

I’ve got the API and UI repositories out on GitHub. This Gist contains the complete implementation I’ll reference throughout this post.

Connecting to MongoDB

First up, install the Mongo driver package with func extensions install -p MongoDB.Driver -v 2.9.3.
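
If the Functions CLI gives you trouble there, running dotnet add package MongoDB.Driver --version 2.9.3 from the API project’s directory should accomplish the same thing.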

Since I’m using Functions, I don’t retain any state or resources, like database connections, between invocations. The connection is cheap to spin up, so only a little boilerplate is needed.

// Requires the MongoDB.Driver and MongoDB.Bson namespaces.
var client = new MongoClient("mongodb://127.0.0.1:27017");
var db = client.GetDatabase("cardalog");
var coll = db.GetCollection<BsonDocument>("cards");

Once I mature the APIs a bit, I won’t use magic strings for configuration details and I’ll harden access rights. As long as I’m prototyping, though, I’ll do it the quick and dirty way.

coll is how I’ll interact with the cards collection. For my use case, this collection is similar to a relational DB table. There are some fundamental differences worth knowing. For now, just keep in mind that the collection doesn’t have a schema.

Seeding the database

Before going further, I’ll seed the database with some dummy data I created with Mockaroo. You can copy the schema and use it yourself, or run curl "https://api.mockaroo.com/api/bb6e1fd0?count=1000&key=d0846c90" > "MtG.json" in a terminal.

Once you have the mocked data, use mongoimport -d cardalog -c cards .\MtG.json to write it to the cards collection.

Compass has an “import data” option but, for whatever reason, the data never appeared in my collection. The CLI command has always worked for me.

Be careful with the word “schema” in this context. I mentioned that Mongo is schemaless but, for Mockaroo to generate data, I had to define a schema. You can create a different Mockaroo schema to mock up data from a different card game and those can live side-by-side in the same collection.
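
To make that concrete, here’s a sketch of writing two differently-shaped documents to the same collection. The card fields are invented for illustration; the point is that nothing stops both shapes from coexisting.

// Two hypothetical cards from different games. Their fields don't
// match, and Mongo doesn't care: the collection enforces no schema.
await coll.InsertOneAsync(new BsonDocument
{
  { "name", "Lightning Bolt" },
  { "manaCost", "R" }
});
await coll.InsertOneAsync(new BsonDocument
{
  { "name", "Blue-Eyes White Dragon" },
  { "attack", 3000 },
  { "defense", 2500 }
});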

Reading cards

This first read implementation will be sort of naive. I want it to respond with all cards in JSON format so I’ll just create a list of BsonDocument, convert it to JSON, and ship it back to the client.

var projection = Builders<BsonDocument>.Projection.Exclude("_id");
var cardsBson = await coll
  .Find(new BsonDocument())
  .Project(projection).ToListAsync();
var cards = new BsonArray(cardsBson);
return (ActionResult)new OkObjectResult(cards.ToJson());

I had some trouble deserializing the auto-generated _id, so I used a projection to exclude it from the results. Then I used IMongoCollection<T>.Find with an empty filter to get everything back.

cardsBson is a List<BsonDocument>. I can’t return that collection to the client as-is, so I fit it into a single BsonArray object, then call ToJson, which serializes the array of cards as a valid JSON string.

I intend to have thousands of cards in here so I’m going to implement paging eventually. For now, I’m happy sending everything back.
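
When that time comes, the driver makes paging straightforward. Here’s a rough sketch of what it might look like, where page and pageSize are hypothetical query parameters (req being the Function’s incoming HttpRequest):

// Hypothetical paging parameters pulled from the query string;
// not part of the current implementation.
var page = int.Parse(req.Query["page"]);
var pageSize = int.Parse(req.Query["pageSize"]);

var pagedBson = await coll
  .Find(new BsonDocument())
  .Project(projection)
  .Skip(page * pageSize) // skip past earlier pages
  .Limit(pageSize)       // cap the result count
  .ToListAsync();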

Routing the read call

Before I can move on to implementing the write Function, I want to make a couple of tweaks to the request parameter’s decoration. The Function template allows both GET and POST calls and doesn’t specify a route. Both of these are controlled via HttpTriggerAttribute.

[HttpTrigger(AuthorizationLevel.Anonymous, "get", "post", Route = null)]

If the route is left null, Functions will use the FunctionNameAttribute to create the URI. This first Function would then live at http://localhost:7071/api/ReadCards. I’m doing my best to adhere to REST conventions so I want to take the verb out.

Change null to "cards" and remove "post" because this Function is only for reading the collection. I now have this.

[HttpTrigger(AuthorizationLevel.Anonymous, "get", Route = "cards")]
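
Putting it all together, the whole Function body looks something like this. It’s a sketch assembled from the snippets above; the using directives, Run signature, and logging line are assumptions based on the standard HttpTrigger template rather than a verbatim copy of my code.

using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;
using MongoDB.Bson;
using MongoDB.Driver;

namespace Cardalog.Api
{
  public static class ReadCards
  {
    [FunctionName("ReadCards")]
    public static async Task<IActionResult> Run(
      [HttpTrigger(AuthorizationLevel.Anonymous, "get", Route = "cards")] HttpRequest req,
      ILogger log)
    {
      log.LogInformation("Reading all cards.");

      // Hard-coded connection details while prototyping.
      var client = new MongoClient("mongodb://127.0.0.1:27017");
      var coll = client.GetDatabase("cardalog").GetCollection<BsonDocument>("cards");

      // Exclude the auto-generated _id, fetch everything, and ship it back as JSON.
      var projection = Builders<BsonDocument>.Projection.Exclude("_id");
      var cardsBson = await coll
        .Find(new BsonDocument())
        .Project(projection)
        .ToListAsync();
      return new OkObjectResult(new BsonArray(cardsBson).ToJson());
    }
  }
}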

Test it out

The code’s in place so all that’s left is to give it a whirl. From the CLI, run the app with func host start. The runner prints the URI of each Function in the project once it’s listening; this first one shows up as ReadCards: [GET] http://localhost:7071/api/cards. Fire up a browser or your favorite tool for playing with HTTP requests (I like Postman) and send a GET request to http://localhost:7071/api/cards.

Recap

I’ve seeded my database with dummy data and implemented a basic “read everything” Function which crams all of my cards into a JSON array. I’ll need to add paging at some point and my error handling is still primitive, but I’m ready to create my first UI to display my cards.

Cardalog: Blazor, MongoDB, and Azure Functions

I’ve played collectible card games (CCGs) for over twenty years and, in that time, have gotten a lot of cards. My collection is a mess of abandoned cataloging implementations so finding any particular card is next to impossible.

Instead of relying on some complex physical storage scheme for my cards, I’m going to make an API to manage the data about my collection and bolt on other applications and products to do stuff with that data. All together, this is my Cardalog.

The application stack

I tried a few combinations of technologies before settling into Blazor WebAssembly for the web GUI, Azure Functions for a serverless API, and MongoDB for storage. By the end of this article, I’ll have installed all the necessary tech and I’ll be ready to start the real implementation.

While this is the stack I’m starting out with, I’m not going to paint myself into a corner with it. I’ve got a lot of ideas bouncing around about other potential front ends and some side projects I can implement once I’ve got a good volume of data.

Unless otherwise noted, I use Visual Studio Code and PowerShell as my primary development tools.

Blazor WebAssembly

Blazor is a framework for building SPAs using C# instead of JavaScript. WebAssembly (Wasm) provides the means for .NET code to run in browsers. A key advantage is that, after the initial download of binaries from the server, the application runs entirely client-side at near-native speed.

Blazor WebAssembly is in preview.
  • Get started by installing the .NET Core 3.1 preview SDK
  • Install the Blazor Wasm template at the terminal with dotnet new -i Microsoft.AspNetCore.Blazor.Templates::3.1.0-preview3.19555.2
  • Navigate to the directory you want the project to live in and execute the command dotnet new blazorwasm -o Cardalog
  • Run the application from the new Cardalog directory with dotnet run

The Blazor Wasm template scaffolds a couple of things for you: a home page, a “counter” app, and a weather app which reads JSON from a local file. Check out the code and the site to see some of the basic concepts available. They’re a decent launchpad into working with Blazor but we’ll throw it all away very soon.
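
For a taste of the syntax, the scaffolded counter page looks something like this (reproduced from memory of the preview template, so yours may differ slightly):

@page "/counter"

<h1>Counter</h1>

<p>Current count: @currentCount</p>

<button class="btn btn-primary" @onclick="IncrementCount">Click me</button>

@code {
    private int currentCount = 0;

    // Each click re-renders the component with the updated count.
    private void IncrementCount()
    {
        currentCount++;
    }
}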

Azure Functions: a serverless API

There’s a lot to say about Functions Apps and serverless tech in general. While I love the managed product, the thing that really brings me back to Functions is that it forces me to think atomically. Instead of god classes and huge services, each Function in a Functions App is meant to serve one purpose. The first one I’ll create, for instance, is meant only to read all of the cards from my database.

  • First, install the Azure Functions extension for VS Code.
  • Open the command palette in Code (ctrl + shift + p) and run the Azure Functions: Create New Project... command. It’ll take you through an interactive process for the rest.
  • Choose your root directory (I chose the same directory the Cardalog GUI lives in) and create a folder called Cardalog.Api.
  • Choose C# as the language unless you want to translate the API to another language.
  • Select HttpTrigger as the template for your first Function.
  • Name the Function ReadCards.
  • Use the namespace Cardalog.Api.
  • Select Anonymous for the access rights.

We’ll be throwing out all of the scaffolded code when I go over the read implementation in my next article.

MongoDB

Lastly, install MongoDB Community Server and Compass. The server, by default, listens on port 27017 and requires no authentication. Open Compass and connect to it using the default connection settings (localhost, port 27017, no authentication).

After you’ve connected, find the green “CREATE DATABASE” button near the top left of the main window. Name the database cardalog and the first collection cards. Later in the series, we’ll seed the collection with some JSON.

Recap

By now, we’ve got the Blazor front end, Azure Functions API, and MongoDB created. None of the components are talking yet, but it doesn’t take a lot to get there.

Coming up, I’ll seed the database with some test data, write a Function which replies to the caller with those data, and write a new page in the UI to get the card data from the API.

Scalability in Cosmos DB

Even when just sitting idle, Cosmos DB racks up charges. You provision throughput in Request Units per second (RU/s), and that capacity stays provisioned so Cosmos never needs “warm up time”. Naturally, you get billed by the hour whether you use the capacity or not.

I’ve spent more money than I had to by manually provisioning the throughput or, more often, failing to scale down during non-peak hours. For my ETL to run effectively without racking up charges, I decided to set up some programmatic scaling.

Use case: scaling before large writes

My Cosmos consumers don’t need a lot of power, so I only need to ramp up the RUs for roughly a 90-minute window in the early morning when I throw a lot of writes at it. Altogether, I need the following so the throughput is only increased while it’s needed.

  • My load subsystem must be able to increase the throughput just before it starts loading.
  • It must be able to decrease the throughput as soon as it’s done loading.
  • Since Azure Data Factory orchestrates my ETL, it must be able to invoke the solution.

Cosmos APIs

I had a lot of choices for programmatically interacting with Cosmos including a RESTful API, the Azure CLI, PowerShell, and .NET. Look for other options in the Quickstarts in the official documentation.

Some of these, like PowerShell, can’t modify the throughput. Make sure to RTFM.

Azure Functions as a proxy

ADF has a first-class Azure Functions Activity. Functions are a lightweight compute product, which makes them perfect for my on-demand requirement. And since Functions are a managed product, I don’t have to worry about configuration beyond choosing my language.

It was easiest for me to implement this in C#, so I chose the .NET library and created a .NET Core Function with an HTTP trigger.

Implementation

My implementation is stateless and is not coupled to any of my ETL subsystems. Microsoft.Azure.DocumentDB.Core is the only dependency not already bundled with the Functions App template. Ultimately, the app is plug-and-play.

Here are a couple of edited excerpts from the Function to show the most important parts. The repo is linked in the last section for your free use.

Communicating with Cosmos

Cosmos references are wired up in two stages. First, the DocumentClient (i.e. a reference to the Cosmos instance) is spun up using the instance’s URL and one of its keys.

using (var client = new DocumentClient(
  new Uri(dbUrl),
  key,
  new ConnectionPolicy { UserAgentSuffix = " samples-net/3" }))
{
  // Ensure the database and collection exist, then change the throughput.
  client.CreateDatabaseIfNotExistsAsync(new Database { Id = dbName }).Wait();
  var coll = CreateCollection(client).Result;
  ChangeThroughput(client, coll).Wait();
}

CreateCollection is a helper method which gets a reference to a collection within the DB.

async Task<DocumentCollection> CreateCollection(DocumentClient client) {
  return await client.CreateDocumentCollectionIfNotExistsAsync(
    UriFactory.CreateDatabaseUri(dbName),
    new DocumentCollection {
      Id = collectionName
    },
    new RequestOptions {
      OfferThroughput = throughput
    });
}

CreateDocumentCollectionIfNotExistsAsync is part of the Cosmos .NET API. As the name implies, it conditionally creates the collection. This is necessary even if you know the collection already exists – it’s how the Function gets a reference to the collection.

Modifying the offer

The throughput is changed by first getting the existing offer for the collection, then replacing that offer with one at the throughput you want.

OfferV2.OfferType is always “Invalid”. It doesn’t mean the Function failed to fetch or replace the offer.

async Task ChangeThroughput(DocumentClient client,
  DocumentCollection simpleCollection)
{
  // Find the offer attached to this collection by matching its SelfLink.
  var offer = client.CreateOfferQuery().Where(o =>
    o.ResourceLink == simpleCollection.SelfLink)
    .AsEnumerable().Single();
  // Replace it with an offer at the new throughput.
  await client.ReplaceOfferAsync(new OfferV2(offer, throughput));
}

There’s some more stuff to glue it together and some error handling I’ve added. The repo layers all of that in.

Using the Function

You’ll need to send along five values in the request headers.

  • The name of the Cosmos DB
  • The name of the collection
  • The Cosmos’ URL
  • One of the read-write keys
  • The number of RUs you want to set.
    • This must be a multiple of 100.

The read-write key is transmitted in plain text in my current implementation. This is an obvious security flaw which I haven’t gotten around to fixing.
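
Inside the Function, those values just come off the request headers. A minimal sketch (the header names here are invented; the real names live in the repo):

// Hypothetical header names; check the repo for the real ones.
string dbName = req.Headers["dbName"];
string collectionName = req.Headers["collectionName"];
string dbUrl = req.Headers["dbUrl"];
string key = req.Headers["key"];
int throughput = int.Parse(req.Headers["throughput"]); // must be a multiple of 100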

Source code

The source is on GitHub. Feel free to use it. If you have ideas for improvements, I’d love to hear them.