RetroDRY police violence project

RetroDRY ( https://github.com/starlys/RetroDRY ) is a framework in C# and JavaScript/React that automates a lot of the repetitive work involved in data-driven web development. It’s “Retro” because it partially replicates the ease of development in pre-web systems like Access or FoxPro, in which the database system already allows all CRUD operations by default and you only have to code for edge cases or specialized input forms. It’s “DRY” (Don’t Repeat Yourself) because a lot of the logic is programmed only on the server side and gets pushed to the client automatically.

After developing RetroDRY I wanted to use it in a real-world test to stretch its limits. So I picked a need that’s been in the news a lot: tracking police violence. This paper outlines what was easy and what was hard about that exercise, comparing the developer experience of using Retro with hand coding.

The app allows the public to log accounts of police violence and then shows the data in grids, maps, and charts, as well as calculating danger rates by population and change over time. The front page of the app shows maps and graphs that allow drill-down so users can see what happened in a particular area. They can get the names, details, and web links for each incident.

I’ll skip the design and data modeling aspects, which are the same with or without Retro, and assume the database already exists.

Server side approach

On the server side, the basic steps to code the API are to start from a template based on Microsoft’s .NET Core API template, with Retro already plugged in. From there the steps are quite different from typical hand coding. Instead of thinking through each type of data that the client needs to load or save and writing endpoints for it, we describe the shape of the data using annotated classes, much like you would with Entity Framework (EF). But unlike EF, the units of data are called “datons” and they can consist of multiple rows from multiple tables, whereas EF treats a unit of data as a single row from a single table.

An example from the project is the Observation class, which stores one eyewitness account of a police violence incident. This class contains two nested classes to store related persons and web links about the incident. So the unit (the “daton” in Retro terms) is one Observation record, plus any number of ObsPerson child records and any number of ObsLink child records. This daton is always loaded, saved, and communicated as a whole. The other main datons are Users and Incidents. An Incident is a parent of Observation, since more than one eyewitness could record the same incident.
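
To make that concrete, here is a rough sketch of the shape of this daton. It is simplified: the member names are only illustrative, and the RetroDRY base class and annotations (primary keys, prompts, validation rules and so on) are left out rather than reproduced here. The point is simply that one Observation row plus all of its child rows travel as a single unit.

    using System;
    using System.Collections.Generic;

    // Illustrative sketch only (not the project's literal code).
    // One Observation row plus all of its ObsPerson and ObsLink child rows
    // form a single daton, loaded and saved as a whole.
    public class Observation
    {
        public int ObservationId { get; set; }
        public int IncidentId { get; set; }       // parent Incident
        public DateTime HappenedAt { get; set; }
        public string Narrative { get; set; }

        public List<ObsPerson> Persons { get; set; } = new();
        public List<ObsLink> Links { get; set; } = new();

        // persons named in this eyewitness account
        public class ObsPerson
        {
            public int ObsPersonId { get; set; }
            public string Name { get; set; }
            public bool IsOfficer { get; set; }
        }

        // supporting web links for this account
        public class ObsLink
        {
            public int ObsLinkId { get; set; }
            public string Url { get; set; }
        }
    }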

For this demo project I didn’t want it to be too easy – it had to stretch the framework with real-world requirements. So there’s also an IncidentSummary table with a stored procedure that collates the observations and denormalizes the counts, victim names, and keywords into a single indexed record. That table has full-text indexes so that users can efficiently search incidents by name, badge number, location, and keywords.

After describing the shape of the persistent data, we describe in a similar way the queryable read-only data sets and the search criteria they support. These are called “viewons” in Retro. This project has eight viewons: users, US states, US counties, incidents, valid-incidents, observations, audits, and incident-detail. The incident-detail viewon has three levels of parent-child relationships and feeds the drill-down view on the web site. The valid-incident viewon is where a lot of the server work went, because it feeds the graphs and maps on the app’s front page using several child tables and has to be optimized for performance.
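
A viewon declaration is similar in spirit: it pairs the criteria the user may search on with the shape of the read-only rows that come back. Here is a simplified, illustrative sketch (again not the project’s literal code, and with the Retro annotations left out) standing in for something like the incidents viewon; declaring it once on the server is what lets the client render matching search inputs and result grids with no extra code.

    using System;
    using System.Collections.Generic;

    // Illustrative sketch of a viewon: declared criteria plus read-only result rows.
    public class IncidentList
    {
        // criteria the client may search on
        public class Criteria
        {
            public string VictimName { get; set; }
            public string State { get; set; }
            public DateTime? FromDate { get; set; }
            public DateTime? ToDate { get; set; }
        }

        // one result row per matching incident
        public class Row
        {
            public int IncidentId { get; set; }
            public DateTime HappenedAt { get; set; }
            public string County { get; set; }
            public int VictimCount { get; set; }
        }

        public List<Row> Rows { get; set; } = new();
    }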

Because this project has some tricky data semantics, RetroDRY wasn’t able to do all the loading and saving with its default behavior. A small example was the geography table, which stores both state- and county-level information, but I wanted the queries to be separate (one for states and one for counties). I could have built the tables separately to match Retro’s expectations, but as I said, I was trying to make the framework work for nonstandard cases, so the geography table has custom load functions. A more complex case was loading the valid-incident dataset, which has to make decisions server-side, like whether to show the whole-country map with counts in each state or a zoomed-in regional map with pins identifying each incident. So I mostly bypassed Retro for loading this dataset. Even with all the customization, I only had to populate instances of the viewon class, and Retro still handles the client communications.
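
To give a feel for what such an override amounts to, here is a minimal, simplified loader in the spirit of the valid-incident customization, reusing the IncidentList sketch above. The signature and the way it would be wired into Retro are placeholders rather than the framework’s real override API; the essential idea is that the custom code runs its own SQL, makes the server-side decisions, fills the viewon’s rows, and hands the populated object back for Retro to serialize and deliver.

    using System.Data.Common;
    using System.Threading.Tasks;

    // Hypothetical custom loader (placeholder signature, not Retro's real override API).
    public static class ValidIncidentLoader
    {
        public static async Task LoadAsync(DbConnection db, IncidentList viewon)
        {
            using var cmd = db.CreateCommand();
            cmd.CommandText =
                "select incidentid, happenedat, county, victimcount from IncidentSummary";
            using var reader = await cmd.ExecuteReaderAsync();
            while (await reader.ReadAsync())
            {
                // fill the viewon's rows ourselves; Retro still handles
                // serialization and delivery to the client
                viewon.Rows.Add(new IncidentList.Row
                {
                    IncidentId = reader.GetInt32(0),
                    HappenedAt = reader.GetDateTime(1),
                    County = reader.GetString(2),
                    VictimCount = reader.GetInt32(3)
                });
            }
        }
    }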

Finally, we have to tell Retro about the roles and permissions. Without this, it could allow anonymous users to edit any record in the database. The roles for this project are the public (no account needed), reviewers who approve submissions, and the administrator with full permissions. Each role contains a list of table names and permissions, with optional overrides by column.
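
As an illustration of that model (not the project’s actual configuration, and using made-up types rather than Retro’s real classes), the roles might be sketched like this: each role lists table-level permissions, and individual columns can override the table-level setting.

    // Made-up types to illustrate the permission model described above.
    public enum Access { None, View, Edit, Create, Delete }

    public record ColumnPermission(string Column, Access Level);

    public record TablePermission(string Table, Access Level,
        ColumnPermission[] ColumnOverrides = null);

    public record Role(string Name, TablePermission[] Tables);

    public static class Roles
    {
        // anonymous public users: may submit observations and view summaries only
        public static readonly Role Public = new("public", new[]
        {
            new TablePermission("Observation", Access.Create),
            new TablePermission("IncidentSummary", Access.View)
        });

        // reviewers: may edit observations and incidents, but on user accounts
        // may only see the display name column
        public static readonly Role Reviewer = new("reviewer", new[]
        {
            new TablePermission("Observation", Access.Edit),
            new TablePermission("Incident", Access.Edit),
            new TablePermission("AppUser", Access.None,
                new[] { new ColumnPermission("DisplayName", Access.View) })
        });
    }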

Server classes in detail

Here is a complete list of the classes I had to write server side. I know, the list is short. That’s the idea!

  • AppUser – 125 lines to describe the roles and permissions
  • UserController – one API endpoint with 50 lines to authenticate users
  • Persistons – 200 lines to describe the persistent datons, including validation (essentially a data dictionary: a class-based model of the database)
  • Viewons – 375 lines to describe the search criteria and result sets for the eight viewons
  • SqlOverrides – 600 lines of SQL load/save customizations, most of which is the valid-incident collation logic
  • Startup – 20 lines to register custom columns and overrides

Total

  • Around 1400 lines… for an entire scalable real-world app with user accounts, audit logs, per-column permissions and locks!

Client side approach

Using React I made two separate single-page apps. The front page is for most users; it runs as a public, non-authenticated user that has view permissions for the valid-incident and incident-detail viewons. The “back” page is for entering new incidents and for administrators maintaining all data. Only the back page has a login option.

Because Retro does so much for you, the client doesn’t have to deal with permissions, prompts, entry forms, different API endpoints, or a lot of the repetitive stuff in most apps. For example, there is a “Search Incidents” button in the app whose handler is a one-liner specifying which viewon to show. Once the viewon is shown, the user magically has all the CRUD operations available: they can search incidents by all the criteria defined in the server tier, and pull up any incident and edit it. Foreign keys are handled for you, so for columns defined as lookups the user sees dropdowns with readable choices, even though only the lookup key is stored in the database. No code is needed for any of this; Retro handles permissions, validation, locking for multiuser access, language prompts, entry forms for all supported data types, loading, and saving.

The self-imposed requirements for public entry of new observations could not be met purely with Retro, because I wanted the user to first enter basic info, then choose the location on a map, then add further details in a wizard-like way. Retro out of the box only gives you a single default entry form, and we can’t expect users to type in things like latitude and longitude numbers. Even though I had to hand code the entry flow, I could still use Retro entry cards for each part of the process. In this way my code says “allow the user to enter these specific fields here” without having to implement any of the entry forms itself.

Client classes in detail

There are 21 files in all. Omitting the small stuff like globals and utility classes, the React components are as follows:

Front page components

  • SearchMain – 100 lines as a container for the components below, and server communications and refresh logic
    • SearchCriteria – 100 lines for entry of the search criteria, using Retro components for the actual inputs (so we didn’t need to code tedious date range validation, for example)
      • MapCriteria – 100 lines for showing a google map, allowing place name search, and allowing the user to draw a rectangle to identify the search location (shown as dialog by SearchCriteria)
    • AnalyzeRegionMap – 80 lines for showing a google map with pins for each incident, popup details and link to full details in a dialog
    • AnalyzeGrid – 40 lines for showing incidents in a grid with link to full details
    • AnalyzeDanger – 40 lines for showing a CSS-based graph of danger levels by state or county
    • AnalyzeCountryMap – 50 lines for showing a vector based national map (using react-simple-maps) with incident counts in each state, and drill down capability
    • AnalyzeChange – 60 lines for showing a bar graph (using nivo graphs) of change in victims and deaths by time period
    • IncidentDetail – 70 lines for formatting the incident detail in a dialog (eyewitness accounts, location, persons and links); this uses Retro cards to streamline number/date formatting and nested grids

Back page components

  • EditMain – 40 lines as a container for the components below and all the default Retro CRUD operations on users, incidents and observations
    • NewObservation – 190 lines for defining the wizard-like entry process, making sure the user goes through the cards in the right order, then submitting to the server
      • FindLocation – 90 lines for showing a google map, allowing place name search, and allowing the user to select an exact location (shown as a dialog by NewObservation)

Total

  • Around 1000 lines… for a complete real-world client app with security, maps, graphs, reports, and entry forms for several data types!

The takeaway

While Retro can’t do everything, it radically cut down the lines of code needed, even for a highly customized application.

Is notifyplex.com useful?

I built notifyplex.com because I wanted its benefits for myself, but I wasn’t sure who else would want to use it. While it was interesting and fun to build, the gap between what senders want and what receivers want may be hard to bridge.

The balance of annoyance

Email and other digital “communications” are often an adversarial game in which senders try to buy or manipulate people’s attention, and recipients have to ward off as much intrusion as possible. We either painstakingly unsubscribe from each unwanted sender, mark them as spam, or ignore the inbox altogether. We spend energy trying not to be victims of communications, yet at the same time we still want to know certain things, which are sometimes buried among all the junk.

Twitter addresses this with the theory that everyone can talk and you only listen to the people you choose. Your only choice is to follow or not follow, but what we really care about is the relevance of the content, not just the sender. Some platforms use AI to guess what we want to see, but that takes away our control.

Broadcast media deal with a similar tension: if they run too many ads they lose listeners, which could collapse their ad revenue, but if they run too few, they are leaving revenue on the table. They find that special balance of annoyance that sits right at the edge of what people will accept.

Classifying and censoring

My thought with notifyplex is that you should be able to say everything (like twitter) but you should be honest about classifying what you are sending. That way recipients can decide in a clear way (not using a mystery AI algorithm) which items to receive. The sender still has to play the “balance of annoyance” game because they know that by misclassifying or underclassifying, they will lose readers.

One of the first people I asked about it, once it was in production, was someone who runs a large email list about local business. She knows that people remain on the list because they trust her to send only things that are appropriate. Most weeks there are a few emails which sometimes contain something I am happy to learn about, and the rest are easy to delete, so the balance of annoyance is mostly in my favor.

She also acts as a censor on my behalf, sometimes rejecting requests to send out information. That’s good. There are breakdowns though: everything goes through her, she’s busy, and she doesn’t do a lot of editing. For example, sometimes the subject line is something like “Fwd: re: Pls send to your contacts!”, and the email body is a nested chain of emails that I would have to dig through to even understand what it is about. Major formatting failures are common.

Qualified messages

Borrowing the sales terminology of “qualified leads”, a “qualified message” is one that the recipient actually wants. If a sender classifies honestly over a long period of time, and their messages are consistently formatted to be easy to read, their subscriber list will grow and most messages will be qualified. If they are inconsistent, or send too much without classifying, messages will be unqualified and people will unsubscribe.

Notifyplex.com should make it possible to build up very large and highly qualified audiences through trust, rather than through the adversarial system we are used to. That’s because it allows senders complete control over classifying, using the terms specific to their organization; and it allows recipients complete control over message selection using the sender’s system of classification.

With notifyplex we can think about sending out huge amounts of information knowing that recipients can be highly selective. It’s okay that most messages don’t reach most people; what matters is that the right people receive the gems of information they want.

Is it useful?

I believe that notifyplex will only be useful if we are willing to shift from adversarial to trust based communications, and if senders can envision the benefits.

Megaworkarounds (long)

Computing technology is still in its infancy with respect to durability, reusability, and correctness of code, as evidenced by the fact that developers have to deal with system issues constantly in application code. We are always writing almost the same code over and over because the new code runs in a different language or on a different platform, takes advantage of new system features, or needs different dependencies. We don’t have a space in which to construct abstracted data and operations free from thinking about machine boundaries, memory management, and other concerns that detract from the purity of the logic. Therefore most of what we write stays within the space bounded by those limits.

Engineering today is therefore an art of tying together the abstract perfection of pure logic with the finesse of manipulating actual machines. That art of working around limitations is essentially what makes the work difficult, and it’s hard mainly because the technology is immature.

To be highly effective, we have to stay focused on what’s coming and meanwhile work with what we have. That’s the concept of “megaworkarounds” – the collection of work patterns and techniques used to effectively work around the limitations of the state of technology. We need to be aware when we are doing a megaworkaround and why, and that’s what this article covers. It’s not about “best practices” (as if that characterizes some kind of end state) but it’s rather about ways to do things solidly despite being amidst chaos.

Read more

Writing Requirements

This article captures the art of software requirements writing as it’s done at Divergent Labs.

The requirements document lists everything the product or feature will do, in a way that everyone on a team can read. Let’s start with a few examples from a hypothetical project that allows a user to enter comments attached to specific frames in a video. These requirements are in no particular order for this example.

Read more

Software Engineering Overview at Divergent Labs

This article describes the software development approach used at Divergent Labs. In the early part of my career I focused on small projects ranging in size from $500 to $5,000 each, completing a dozen or more of them per year. Through that experience I learned how to manage rapid life cycles efficiently and comprehensively. The process I developed works for larger projects too, and so it became standard for the work we do.

Read more

Software Engineering Versus the Alternative

In my decades of work as a custom software developer, I’ve concluded that engineering is a good idea.

Engineering is what happens when requirements are distilled into the simplest, most elegant, fast and beautiful machine that enables long-lasting performance. If you just manage to get something to work through repeated fiddling, it probably shouldn’t be called “engineering”. Engineering requires a plan and process, projecting into the future, knowing the steps in advance, and reliably achieving the results.

Read more