Adding collaboration on uMap, third update

2024-02-15

I’ve spent the last few weeks working on uMap, still with the goal of bringing real-time collaboration to the maps. I’m not there yet, but I’ve made some progress that I will relate here.

JavaScript modules

uMap has been there since 2012, at a time when ES6 wasn’t out there yet

At the time, it wasn’t possible to use JavaScript modules, nor modern JavaScript syntax. The project stayed with these requirements for a long time, in order to support people with old browsers. But as time goes on, we can now have access to more features.

The team has been working hard on bringing modules to the mix, and it wasn’t a piece of cake. But, the result is here: we’re now able to use modern JavaSript modules and we are now more confident about which features of the languages we can use or not


I then spent ~way too much~ some time trying to integrate existing CRDTs like Automerge and YJS in our project. These two libs are unfortunately expecting us to use a bundler, which we aren’t currently.

uMap is plain old JavaScript. It’s not using react or any other framework. The way I see this is that it makes it possible for us to have something “close to the metal”, if that means anything when it comes to web development. We’re not tied to the development pace of these frameworks, and have more control on what we do. It’s easier to debug.

So, after making tweaks and learning how “modules”, “requires” and “bundling” was working, I ultimately decided to take a break from this path, to work on the wiring with uMap. After all, CRDTs might not even be the way forward for us.

Internals

I was not expecting this to be easy and was a bit afraid. Mostly because I’m out of my comfort zone. After some time with the head under the water, I’m now able to better understand the big picture, and I’m not getting lost in the details like I was at first.

Let me try to summarize what I’ve learned.

uMap appears to be doing a lot of different things, but in the end it’s:

  • Using Leaflet.js to render features on the map ;
  • Using Leaflet Editable to edit complex shapes, like polylines, polygons, and to draw markers ;
  • Using the Formbuilder to expose a way for the users to edit the features, and the data of the map
  • Serializing the layers to and from GeoJSON. That’s what’s being sent to and received from the server.
  • Providing different layer types (marker cluster, chloropleth, etc) to display the data in different ways.

Naming matters

There is some naming overlap between the different projects we’re using, and it’s important to have these small clarifications in mind:

Leaflet layers and uMap features

In Leaflet, everything is a layer. What we call features in geoJSON are leaflet layers, and even a (uMap) layer is a layer. We need to be extra careful what are our inputs and outputs in this context.

We actually have different layers concepts: the datalayer and the different kind of layers (chloropleth, marker cluster, etc). A datalayer, is (as you can guess) where the data is stored. It’s what uMap serializes. It contains the features (with their properties). But that’s the trick: these features are named layers by Leaflet.

GeoJSON and Leaflet

We’re using GeoJSON to share data with the server, but we’re using Leaflet internally. And these two have different way of naming things.

The different geometries are named differently (a leaflet Marker is a GeoJSON Point), and their coordinates are stored differently: Leaflet stores lat, long where GeoJSON stores long, lat. Not a big deal, but it’s a good thing to know.

Leaflet stores data in options, where GeoJSON stores it in properties.

This is not reactive programming

I was expecting the frontend to be organised similarly to Elm apps (or React apps): a global state and a data flow (a la redux), with events changing the data that will trigger a rerendering of the interface.

Things work differently for us: different components can write to the map, and get updated without being centralized. It’s just a different paradigm.

A syncing proof of concept

With that in mind, I started thinking about a simple way to implement syncing.

I left aside all the thinking about how this would relate with CRDTs. It can be useful, but later. For now, I “just” want to synchronize two maps. I want a proof of concept to do informed decisions.

Syncing map properties

I started syncing map properties. Things like the name of the map, the default color and type of the marker, the description, the default zoom level, etc.

All of these are handled by “the formbuilder”. You pass it an object, a list of properties and a callback to call when an update happens, and it will build for you form inputs.

Taken from the documentation (and simplified):

var tilelayerFields = [
    ['name', {handler: 'BlurInput', placeholder: 'display name'}],
    ['maxZoom', {handler: 'BlurIntInput', placeholder: 'max zoom'}],
    ['minZoom', {handler: 'BlurIntInput', placeholder: 'min zoom'}],
    ['attribution', {handler: 'BlurInput', placeholder: 'attribution'}],
    ['tms', {handler: 'CheckBox', helpText: 'TMS format'}]
];
var builder = new L.FormBuilder(myObject, tilelayerFields, {
    callback: myCallback,
    callbackContext: this
});

In uMap, the formbuilder is used for every form you see on the right panel. Map properties are stored in the map object.

We want two different clients work together. When one changes the value of a property, the other client needs to be updated, and update its interface.

I’ve started by creating a mapping of property names to rerender-methods, and added a method renderProperties(properties) which updates the interface, depending on the properties passed to it.

We now have two important things:

  1. Some code getting called each time a property is changed ;
  2. A way to refresh the right interface when a property is changed.

In other words, from one client we can send the message to the other client, which will be able to rerender itself.

Looks like a plan.

Websockets

We need a way for the data to go from one side to the other. The easiest way is probably websockets.

Here is a simple code which will relay messages from one websocket to the other connected clients. It’s not the final code, it’s just for demo puposes.

A basic way to do this on the server side is to use python’s websockets library.

import asyncio
import websockets
from websockets.server import serve
import json

# Just relay all messages to other connected peers for now

CONNECTIONS = set()

async def join_and_listen(websocket): CONNECTIONS.add(websocket) try: async for message in websocket: # recompute the peers-list at the time of message-sending. # doing so beforehand would miss new connections peers = CONNECTIONS - {websocket} websockets.broadcast(peers, message) finally: CONNECTIONS.remove(websocket)

async def handler(websocket): message = await websocket.recv() event = json.loads(message)

<span class="c1"># The first event should always be 'join'</span>
<span class="k">assert</span> <span class="n">event</span><span class="p">[</span><span class="s2">"kind"</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"join"</span>
<span class="k">await</span> <span class="n">join_and_listen</span><span class="p">(</span><span class="n">websocket</span><span class="p">)</span>

async def main(): async with serve(handler, "localhost", 8001): await asyncio.Future() # run forever

asyncio.run(main())

On the client side, it’s fairly easy as well. I won’t even cover it here.

We now have a way to send data from one client to the other. Let’s consider the actions we do as “verbs”. For now, we’re just updating properties values, so we just need the update verb.

Code architecture

We need different parts:

  • the transport, which connects to the websockets, sends and receives messages.
  • the message sender to relat local messages to the other party.
  • the message receiver that’s being called each time we receive a message.
  • the sync engine which glues everything together
  • Different updaters, which knows how to apply received messages, the goal being to update the interface in the end.

When receiving a message it will be routed to the correct updater, which will know what to do with it.

In our case, its fairly simple: when updating the name property, we send a message with name and value. We also need to send along some additional info: the subject.

In our case, it’s map because we’re updating map properties.

When initializing the map, we’re initializing the SyncEngine, like this:

// inside the map
let syncEngine = new umap.SyncEngine(this)

// Then, when we need to send data to the other party let syncEngine = this.obj.getSyncEngine() let subject = this.obj.getSyncSubject()

syncEngine.update(subject, field, value)

The code on the other side of the wire is simple enough: when you receive the message, change the data and rerender the properties:

this.updateObjectValue(this.map, key, value)
this.map.renderProperties(key)

Syncing features

At this stage I was able to sync the properties of the map. A small victory, but not the end of the trip.

The next step was to add syncing for features: markers, polygon and polylines, alongside their properties.

All of these features have a uMap class representation (which extends Leaflets ones). All of them share some code in the FeatureMixin class.

That seems a good place to do the changes.

I did a few changes:

  • Each feature now has an identifier, so clients know they’re talking about the same thing. This identifier is also stored in the database when saved.
  • I’ve added an upsert verb, because we don’t have any way (from the interface) to make a distinction between the creation of a new feature and its modification. The way we intercept the creation of a feature (or its update) is to use Leaflet Editable’s editable:drawing:commit event. We just have to listen to it and then send the appropriate messages !

After some giggling around (ah, everybody wants to create a new protocol !) I went with reusing GeoJSON. It allowed me to have a better understanding of how Leaflet is using latlongs. That’s a multi-dimensional array, with variable width, depending on the type of geometry and the number of shapes in each of these.

Clearly not something I want to redo, so I’m now reusing some Leaflet code, which handles this serialization for me.

I’m now able to sync different types of features with their properties.

Point properties are also editable, using the already-existing table editor. I was expecting this to require some work but it’s just working without more changes.

What’s next ?

I’m able to sync map properties, features and their properties, but I’m not yet syncing layers. That’s the next step! I also plan to make some pull requests with the interesting bits I’m sure will go in the final implementation:

  • Adding ids to features, so we have a way to refer to them.
  • Having a way to map properties with how they render the interface, the renderProperties bits.

When this demo will be working, I’ll probably spend some time updating it with the latest changes (umap is moving a lot these weeks). I will probably focus on how to integrate websockets in the server side, and then will see how to leverage (maybe) some magic from CRDTs, if we need it.

See you for the next update!