28/01/2022

Refactoring—by Nick Downing

So after gaining considerable momentum to complete the refactoring project of the last 3–4 weeks, I am happy with things and can take a breather. Whilst the site still has the same basic functionality (read static content, sign up, verify email, password reset, sign in/out, change details and password), things are very different under the hood, and there are also visible UX improvements.

The LogJSON database has been in a useable state for about 2 weeks and since then I have been changing all the flows to store the data (such as user-account data upon sign up) in the central LogJSON database for the site, rather than individual JSON files. But since this required rewriting almost all code, I decided the rewrite would encompass the following points:

  • Change data storage to use LogJSON (the original goal of the rewrite).
  • Partition the logic into API endpoints and pages.
  • Rearrange the code into a sensible directory structure.
  • UX improvements, such as custom forms and fewer pop-ups.

I will discuss each point in further detail next.

Change data storage to use LogJSON

The experience with LogJSON to date

After some weeks of using LogJSON for the backend storage, the experience has been very good. Compared to the traditional 2-dimensional structure of a relational database (SQL), the tree-like structure of LogJSON proves to be quite convenient.

The JSON structure of the database at the moment looks like

{
  "globals": {...},
  "sessions": {...},
  "accounts": {...},
  "navigation: {...}
  "nodemailers: {...}
}
where globals stores miscellaneous information such as the site’s title and copyright message; sessions stores information about each browser session connected to the site (including sign-in status); accounts stores information about signed-up users; navigation stores the navigation tree of the site (basically a mapping from directory names to page titles, in a hierarchical structure); and nodemailers stores information about mail accounts and servers which we use for verifying users’ email addresses and similar purposes.

The sessions and accounts JSON Objects are keyed by the user’s session or account identifier respectively, and lead to a JSON Object per session or account. These latter do not have a predefined structure, as any part of the business logic is able to add a new key to the session or account as required. This is useful because normally this kind of addition would require a SQL structure change or a new table, both of which are inconvenient and time-consuming to implement.

Of course, it is also easy to add new top-level keys to the database for new functionality that doesn’t reside in the session or account. The general approach I have taken is that any code accessing the database and expecting a particular key should add that key if it does not exist. Ultimately we might want to have a schema for the database, and predefined objects with predefined field sets. This would also work well, but would require a bit more effort upfront.
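
As a tiny sketch of this convention (the helper name, and the idea that a transaction exposes the database root as a plain object, are assumptions for illustration rather than LogJSON’s actual API):

// Hypothetical helper: create a key with a default value if it is missing,
// then return it, so callers can rely on the key being present.
function ensureKey(obj, key, defaultValue) {
  if (!Object.prototype.hasOwnProperty.call(obj, key))
    obj[key] = defaultValue
  return obj[key]
}

// Example usage inside a transaction, assuming 'root' is the database root:
// const accounts = ensureKey(root, 'accounts', {})
// const account = ensureKey(accounts, 'jane@doe.com', {})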

From a historical perspective, my understanding is that IBM was building hierarchical and navigational databases (such as IMS) in the 70s and 80s, before Oracle popularized the relational paradigm with SQL in the late 80s and 90s. Having taught relational algebra at undergraduate level, I appreciate there are many good things about it, and I would not want to dismiss it out of hand. But I still think that the idea of non-relational, tree-structured and object databases deserves reconsideration, hence my experiment here.

Use of LogJSON transactions

In my initial attempt at the conversion to LogJSON, I was trying to keep the number of transactions to a minimum, particularly considering that all parent objects to a changed object must be rewritten to the log every transaction.

So each API endpoint would, for example, open the transaction it would use to perform its own operation, and then within that same transaction invoke the code to generate a new session cookie and create or locate a session object in the database. This avoided having an extra transaction just to deal with the session cookie, but also made the logic for each API endpoint more complex. A similar issue arose when serving pages and dealing with analytics or sign in/out status for the navbar.

A particularly annoying thing about this ‘optimization’ was that transactions which were inherently read-only were becoming read-write in general, since they might write a new session even if the rest of the transaction was only reading. This added considerable complexity to the read-only transactions such as ‘get’ API endpoints and the navbar template.

Therefore, I eventually realized that this was not really an optimization, and so I changed things so that a single hit on the server will usually run 3 or 4 smaller database transactions throughout the process, e.g. for the session cookie, the analytics, the navbar, and whatever operation is needed by the page itself. Some of these are read-only, which will improve concurrency. A cost is that the log grows slightly faster, as one request can now produce multiple write transactions.
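
In rough pseudo-JavaScript, the per-request flow now looks something like the sketch below; db.transaction(), newSessionId(), req.cookies and the readOnly option are assumptions made for illustration, not the actual LogJSON interface:

// Sketch only: several small transactions per request instead of one big one.
async function handleRequest(req) {
  // 1. Read-write: find or create the session for this browser's cookie.
  const sessionId = await db.transaction(async root => {
    const sessions = root.sessions || (root.sessions = {})
    const id = req.cookies.session || newSessionId()
    if (!sessions[id]) sessions[id] = { created: Date.now() }
    return id
  })

  // 2. Read-write: record the page view for analytics.
  await db.transaction(async root => {
    const globals = root.globals || (root.globals = {})
    globals.pageViews = (globals.pageViews || 0) + 1
  })

  // 3. Read-only: gather what the navbar needs (navigation tree, sign-in state).
  const navbar = await db.transaction(async root => ({
    navigation: root.navigation,
    signedIn: Boolean(root.sessions[sessionId].account)
  }), { readOnly: true })

  // 4. Whatever transaction(s) the page or API endpoint itself requires...
  return { sessionId, navbar }
}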

Potential issues with LogJSON and solutions

It must be admitted that the accounts object is not scalable as things stand. For example, when I create a management interface for users, I will want to display something like 20 users per page sorted by email, and at present you would have to extract the entire set of keys from the accounts object at once and sort them to make any sense of them.

Therefore, I plan to create a B-Tree index of email addresses in sorted order, which will be maintained as accounts are added and removed. This B-Tree will be encoded into JSON and stored in the database as well. I will do this before attempting to create the management interface.

Another scalability issue is that LogJSON generally writes out an entire JSON Object (plus all its parent objects) when it changes. Although it does not write out the child objects that have not changed, an object with many keys will still place a strain on the system (since all keys and pointers to their corresponding child objects must be written out even if only one key or pointer has changed). I see two possible solutions to this issue:

Improve LogJSON to only write out changed keys. Later on we will need to track the deltas (changed keys) of each transaction precisely in order to improve concurrency (needed in turn if the load on the webserver is high). If we write the transactions to the log in delta-format, then they will be quicker to write. But conversely, they will be slower to read.

Modify database clients to avoid writing large objects. We could use a hashing approach to implement this. Suppose the user’s account key is jane@doe.com; we could hash this to a hex value of, say, 0x569a, and then break up the accounts object by storing under the hash first, e.g.

{
  ...
  "accounts": {
    ...
    "56": {
      ...
      "9a": {
        ...
        "jane@doe.com": {...},
        ...
      },
      ...
    },
    ...
  },
  ...
}
and this way the top 2 levels of the accounts table would have at most 256 entries each (easy to write out each time they change) and the next level would have about 1/65536 as many entries as the original table. This would scale to at least hundreds of thousands of accounts. An adaptive scheme might also be considered.
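
A sketch of how the bucket path could be derived; the choice of hash and the helper name are assumptions for illustration (with SHA-256 the actual hex digits for jane@doe.com will differ from the 0x569a example above):

// Derive a two-level bucket path from the account key, e.g.
// accountPath('jane@doe.com') -> ['56', '9a', 'jane@doe.com'] (illustrative digits).
const crypto = require('crypto')

function accountPath(email) {
  const digest = crypto.createHash('sha256').update(email).digest('hex')
  return [digest.slice(0, 2), digest.slice(2, 4), email]
}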

A cheap way to implement the second solution might be simply to store the data in B-Trees (implemented via JSON Arrays of some fixed maximum size corresponding to the B-Tree block size) rather than JSON Objects directly. It would give the advantage of being able to extract data in order as well as by key. Even if we didn’t use this latter feature, the nature of the B-Tree itself provides an adaptive way of keeping the index in manageably sized chunks.

Final comments on the LogJSON experiment

I also fixed several minor bugs in LogJSON over the last few weeks and it appears to be stable at this point. I must admit I have not constructed a large-scale programmatic test (add and remove thousands of items at different hierarchical levels, perhaps concurrently, and test that everything is stored correctly throughout the process) but I will do so eventually. I’ll also make a similar test to simulate heavy load on the website and exercise the database in the process.

Overall, I’m satisfied with the progress of the LogJSON experiment, though I haven’t implemented the planned B-Tree layer yet. (I have been doing some research into B-Trees in preparation and have found some MIT-licensed sample code, although it’s not perfect for my requirement).

Partition the logic into API endpoints and pages

Under the new scheme, API endpoints use POST requests and return machine-readable responses, whereas pages use GET requests and return human-readable markup. This helps to separate business logic from presentation logic. Whilst I’m not generally a huge fan of model/view/controller type schemes, I must admit that my previous code was a bit messy when both logics were intermixed, and it was hard to see what was happening when you returned to the code later on.

In the previous way, with the business logic encoded into the particular pages it was used on, the page would generally check for a POST request and do something different in that case; for instance, it might save something in the database and return a thank-you page.

I did have a few machine-readable API endpoints for special cases such as the sign-in, and some interesting API endpoints such as the sign-out, which would return status text to be displayed in the page. But as there was no clear separation between API endpoints and pages, these endpoints simply appeared as pages that the user would not normally read directly.

The new API-based way is much better from the UX perspective too, as we do not have to return a new page to the user each time they interact with the site, and given that APIs return machine-readable responses it is easy to indicate the progress of an operation with ticks, crosses and the like. And we can make the API more granular and present a running status to the user during a sequence of several API calls, which is easier to implement and better for the UX.

Interestingly, I can still use some text-based status messages (where the API endpoint returns text to be embedded in the page, like it used to) because I have standardized all API errors to return an application/problem+json object mostly following the IETF standard (RFC 7807), and this contains a detail field where I can describe an error message for the user. However, in the success case, any message for the user is generated by client-side code rather than being returned from the API as in the old way.
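
As an illustration, a failed sign-in might produce a response with Content-Type application/problem+json and a body along the following lines (the fields follow RFC 7807, but the exact values and wording here are only an example, not the site’s actual output):

{
  "type": "about:blank",
  "title": "Unauthorized",
  "status": 401,
  "detail": "The email address or password does not match our records."
}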

The new API endpoints are very easy to use, since I have the utility class Problem and utility function api_call() which take care of marshalling the call across the client/server boundary. At the server side, each endpoint such as /api/account/sign_in.json is held in a corresponding source file such as /api/account/sign_in.json.jst and the marshalling is done by a subroutine post_request() in /_lib/post_request.jst, which also avoids a lot of duplication compared to before. To create a new endpoint, you simply copy a similar endpoint source file to the new endpoint name.
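
To give a feel for the client side, here is a rough sketch of what a call to such an endpoint might look like if written directly with fetch(); the real api_call() and Problem utilities may differ, and the request field names are assumptions for illustration:

// Hypothetical client-side sketch: post to the sign-in endpoint and convert an
// application/problem+json error into a message suitable for the user.
async function signIn(email, password) {
  const response = await fetch('/api/account/sign_in.json', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ email, password })  // field names assumed
  })
  if (!response.ok) {
    const problem = await response.json()      // application/problem+json body
    throw new Error(problem.detail || 'Sign in failed')
  }
  return response.json()                       // machine-readable success result
}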

Rearrange the code into a sensible directory structure

Basically under the new system, things are placed in different directories from the root according to their function, similar to how in a unix system you have /bin, /lib, /usr and so on, and a package will divide up its files between these directories. Whilst there is some argument about this (would it not be better to have each package keep all its files in the same place?), there are also advantages to separating things out by function.

So under the new scheme, API endpoints are under /api, utility code and templates are under /_lib, and pages are under the original directory structure reflecting the navigation tree of the site.

In the previous way, the layout mainly followed the navigation tree of the site. This was not very flexible, since the navigation tree does not necessarily reflect the structure we need for the logic, and it was also quite messy, as API, utility and template code occurred at various levels of the hierarchy among the code that generated the pages.

The previous theory was that if for example I wanted to add a blog functionality to some other website, I’d just copy the /blog directory and all business logic and templates would automatically be installed as well. But I have decided that mixing everything together was too impractical, so I might have to write an installation procedure at some point if I want a component system for sites. For the time being I’ll just install things in the needed places manually, which is not really much of a hassle.

UX improvements, such as custom forms and fewer pop-ups

Old style form filling

In the old way we were simply using plain old HTML forms with a ‘Submit’ button, and the browser would encode the fields as application/x-www-form-urlencoded key=value pairs (not JSON) and submit them with a POST request.

I created a little in-browser simulation to demonstrate how the old style of form worked. Please have a play with it, and then we will demonstrate some UX improvements. Note that for example purposes, you can trigger a server-side error by entering the name ‘Jane Doe’.

[Interactive form simulation (not reproduced here): a Refresh control, the prompt ‘Please enter your details to receive a greeting message!’, a ‘* This field is required.’ note, and a readout showing details = undefined.]

In the above demonstration form, we had followed one good UX practice:

Tell the user why we need this information and what we will do with it.

However, we had also broken another recommended UX practice:

Don’t repeat the field name in the placeholder text.
Use a real person’s name or details for example purposes, as this will speed data entry.

We are using the browser’s built-in validation style above, which isn’t consistent across browsers. On Chromium, if you do not fill in a required field, a tooltip appears saying ‘Please fill in this field’. I don’t think this text can be changed, and more annoyingly, only the first error is indicated. Whilst this doesn’t specifically violate any UX guidelines, we can do better:

Use a custom validation style to highlight errors with colour and provide useful hints for the expected input.
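
A custom validation style can be as simple as a few lines of client-side script. The following is only a sketch: the element ids, the ‘invalid’ CSS class and the hint wording are assumptions for illustration.

// Minimal sketch of custom validation: highlight the field and show a useful
// hint instead of relying on the browser's built-in tooltip.
const field = document.getElementById('given-name')
const hint = document.getElementById('given-name-hint')

function validateGivenName() {
  if (field.value.trim() === '') {
    field.classList.add('invalid')   // e.g. a red border defined in CSS
    hint.textContent = 'Please enter a name we can address you by.'
    return false
  }
  field.classList.remove('invalid')
  hint.textContent = ''
  return true
}

field.addEventListener('blur', validateGivenName)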

Another UX issue that is demonstrated in the above example is:

Don’t throw away the user’s partial input if they have to refresh the page!
Save a draft to the server during data entry, and pre-populate the form on refresh.

Form filling was one of the killer applications for the early web, and the old way of doing things worked well for its time. Whilst the traditional way is often like it is for a reason, and I do not agree with change for the sake of change, there are definitely improvements to be made in this case. We will look at the improved form UX next.

New style form filling

In the new way we use an API endpoint, or several, to receive the user’s input. This allows us to stay on the same page and provide interactive feedback to the user, e.g. if something goes wrong. We can also implement advanced features, such as the saving of drafts, etc.

Here is an in-browser simulation to demonstrate the improved form-filling UX. Please have a play with it, and recall that for example purposes, you can trigger a server-side error by entering the name ‘Jane Doe’.

[Interactive form simulation (not reproduced here): a Refresh control, the prompt ‘Please enter your details to receive a greeting message!’, field hints (‘Please enter a name we can address you by.’ and ‘Please enter something. You can enter ‘X’ if you do not have a family name.’), a ‘* This field is required.’ note, and readouts showing draft_details = undefined and details = undefined.]

In this simulation we are able to see all communication with the server, including the saving of the draft, which is normally silent from the user’s viewpoint. The draft is saved every 3 seconds (plus the round-trip time) whilst editing is taking place. The simulation also correctly shows that you can trick the system by refreshing the page very quickly after an edit, in which case the latest changes are lost.

Note that the draft can be saved after the form is submitted, which is probably redundant but harmless. It might be good to clear the form once submitted, and the corresponding draft, but I have not formed a clear policy on this. For the time being, I’m relying on stale drafts expiring at the server after a timeout of 1 day.
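
The draft-saving behaviour can be sketched roughly as follows; the form id and the draft endpoint name are hypothetical, invented purely for illustration.

// Rough sketch: save a draft at most once every 3 seconds while editing.
const form = document.getElementById('greeting-form')
let saveScheduled = false

form.addEventListener('input', () => {
  if (saveScheduled)
    return                           // a save is already pending
  saveScheduled = true
  setTimeout(async () => {
    saveScheduled = false
    const draft = Object.fromEntries(new FormData(form))
    await fetch('/api/greeting/save_draft.json', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(draft)
    })
  }, 3000)
})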

Some other good UX practices that we are adopting in the improved form are:

  • Provide an icon in every button; this speeds entry and also assists non-English-speaking users.
  • Provide strong visual indication of completion, success or failure by means of icons and colours.
  • Show a pacifier when undertaking an operation that takes time; this confirms that the last click was received and also makes the operation seem quicker.
  • Disable UI elements such as buttons when it doesn’t make sense to interact with them yet.

Final comments on form filling

Whilst the old way was sufficient, it was only that, sufficient. The new way provides a more vibrant experience for the user, while capturing the data in a more efficient way by guiding the user through the process. We are also striving to reduce user frustration that could result in them leaving the form (and lost sales/conversions in a commercial context).

Sadly I have not had time to make the form accessible yet, and that will probably be the subject of a future article. Apart from that, the experiment with providing a modern form-filling UX has been very successful. Although it has taken a week or so of experimentation, I consider this time well spent, as I learned something about modern UX design.

Conclusions

The new internals of the site are much more pleasant to work on, as things are much more organized, and the LogJSON database allows plenty of flexibility. Thus we can quickly implement UX features such as the saving of drafts. Moreover, the new UX (implemented during the database upgrade) is much more visually pleasing and easier to use.