
A prototype is, well, a prototype. It breathes life into an idea, gives it an interactive form, and that generates new ideas and insights. Alongside functional ideas and insights (hey! wouldn’t it be cool if the user could do *that*!) come engineering ideas and insights (hey! wouldn’t it be cool if we could make it do *that* more quickly/flexibly/reliably!).

Now that most of the functionality I wanted is in place, I’m literally losing sleep over those engineering insights…

I may occasionally write some down here, if they are clear enough for me to articulate – and this is one such case.

Until now I have treated the idea of “conflicting/non-conflicting” backgrounds as a specific piece of functionality appropriate only to certain scenes. Equally, I assumed that calibrating eye misalignment, or the luminance ratio for each eye, ought to take place in dedicated calibration scenes whose results are stored for later use.

Planning is everything. The plan is nothing.  This plan didn’t survive first contact with reality!

As we have been testing with a wide range of people with strabismus/amblyopia, we have noticed not only how diverse their symptoms are, but also how diverse the circumstances are in which different symptoms are triggered. In some cases, for instance, we can calibrate their eye misalignment in the eye-misalignment calibration scene (logical, eh!) but when we get to the depth perception scene, something odd happens. Some people switch back to monocular vision, and others see 8 rings, as the four rings presented to each eye are no longer fusing: their misalignment angle has changed.

At first, I proceeded to add the capability to recalibrate eye misalignment specifically within the depth perception scene. I actually refactored the entire code base to separate “model” control from the “input” controllers which drive the models, so that we could have prefabricated “micro-controllers” which we drag and drop into scenes to quickly enable specific pieces of functionality.

This refactoring is useful as we move towards multi-modal games, which may behave differently when used at home as opposed to in the clinic… but somehow I was bothered that it still required too much boilerplate for each scene. There was also the question of how to avoid interference between all these micro-controllers: what if they were mapped to identical sets of inputs? Although I’ve expanded the list of control inputs with a custom controller which adds support for short/long/hold presses, that’s still only 12 axes plus 6 button press modes. We aren’t going to be able to cover all situations.
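
(To make the press-mode idea concrete, a classifier along these lines can sit on top of Unity’s Input.GetButton. The PressType names and thresholds below are placeholders of mine, not the actual EyeSkills controller.)

using UnityEngine;

// A rough sketch of classifying a single button into short / long / hold presses.
// PressType and the two thresholds are placeholders, not the real EyeSkills controller.
public enum PressType { None, Short, Long, Hold }

public class PressClassifier
{
    const float longThreshold = 0.5f;  // a release after this counts as a "long" press
    const float holdThreshold = 3.0f;  // continuous pressing beyond this fires a "hold"

    float pressStart = -1f;
    bool holdFired = false;

    // Call once per frame with the button's current state, e.g.
    // classifier.Classify(Input.GetButton("EyeSkills Confirm"))
    public PressType Classify(bool isDown)
    {
        if (isDown)
        {
            if (pressStart < 0f)
            {
                pressStart = Time.time;   // press began this frame
                holdFired = false;
            }
            else if (!holdFired && Time.time - pressStart >= holdThreshold)
            {
                holdFired = true;         // report HOLD once while still pressed
                return PressType.Hold;
            }
            return PressType.None;
        }

        if (pressStart < 0f) return PressType.None;  // button was never down

        float duration = Time.time - pressStart;
        pressStart = -1f;
        if (holdFired) return PressType.None;        // already reported as a hold
        return duration >= longThreshold ? PressType.Long : PressType.Short;
    }
}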

This led me to thinking about a “special mode” switch, much like the classical approach of “escaping” in CS. For instance, a new button only available via a keyboard (e.g. “C”) could switch into a Control mode: the currently active scene controller is disabled and control is passed to the camera, so that we can freely alter some of the core EyeSkills CameraRig properties until we switch out of that mode (with another press) and re-activate the previous scene controller.

Then, in a blinding flash of the obvious, I realised that the EyeSkills CameraRig contains almost all the functionality we need to explore scenes that aren’t behaving as we’d like. We could make this “camera mode controller” support switching into a misalignment mode, allowing recalibration on the spot; or a binocular suppression mode, allowing the eye conflict ratio to be altered in whatever scenario we find ourselves in; or a conflict mode (if we add a default conflict background to the camera itself at a very large Z distance). We can do the same with the monocular/biocular cues for helping establish how many eyes a participant is currently using to see (this is not at all obvious to them!).

In turn, this would mean that we could store per-scene camera configurations which really work for a person in that specific circumstance (where the nature of the images on the screen may be, in fact is, stimulating the visual system differently). We can then measure progress against those scene-specific baselines.
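
To sketch what storing such a baseline might look like, something as simple as a blob keyed on the active scene’s name would do. The three values below are placeholders rather than the real CameraRig properties:

using UnityEngine;
using UnityEngine.SceneManagement;

// A sketch of a per-scene baseline persisted under the active scene's name.
// The three values are placeholders, not the actual CameraRig properties.
[System.Serializable]
public class SceneCameraBaseline
{
    public float misalignmentAngle;   // eye misalignment that worked in this scene
    public float suppressionRatio;    // left/right luminance balance
    public bool conflictBackground;   // whether the conflict background was needed

    static string Key()
    {
        return "EyeSkillsBaseline_" + SceneManager.GetActiveScene().name;
    }

    public void Save()
    {
        PlayerPrefs.SetString(Key(), JsonUtility.ToJson(this));
        PlayerPrefs.Save();
    }

    public static SceneCameraBaseline Load()
    {
        string json = PlayerPrefs.GetString(Key(), "");
        return string.IsNullOrEmpty(json)
            ? new SceneCameraBaseline()
            : JsonUtility.FromJson<SceneCameraBaseline>(json);
    }
}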

This makes the most important next refactoring an amalgamation of the core strabismic/amblyopic “lock breaking” functionality into the CameraRig.

I’m looking forward to a fresh start in the morning!


Now that I am looking more deeply into what would need to be done, I ask myself whether it would be simpler, and safer, to give the CameraRig prefab its own dedicated input controller which responds to a totally unique set of inputs. There is no chance of interference with another loaded controller in the scene this way, and we do not multiply the number of elements the camera needs to know about (e.g. in order to disable other controllers), where those disabled controllers might, in turn, need to know they have been disabled! It might also make a great deal of sense to allow the participant to keep interacting with the underlying scene while a practitioner “picks their mental lock” by manipulating the camera and its new-found super powers.

Of course, we will always have a potential collision in input control where we are using the headset itself to control the binocular suppression and misalignment positioning :-/  Any secondary (e.g. keyboard-driven) set of inputs will also be inaccessible to a participant who is only using a Bluetooth controller with a headset. This is the downside to the approach.

As always, every solution involves trade-offs. What is the better choice here? Let’s define “better”. The user (participant) is at the center. As their vision system is dynamic, it is probably imperative that they have a means to adjust the environment they are in to the current state of their vision (which can change from moment to moment, and in interaction with any given visual environment). This implies not only that we must be able to temporarily “escape” from a scene into managing the CameraRig, but that this must be a capability made available to the participant.

From an engineering point of view, we will probably (as this is a prototype) make this power available to the participant and practitioner using a dedicated input which we do not use for any other scenario. Unfortunately, it is probably too risky to use modified OK/Cancel buttons (e.g. a long press), as user mistakes could have frustrating consequences. One possibility would be to watch for a “HOLD” event on the “EyeSkills Up” button, which means the participant needs to hold the button for over three seconds. This also prevents excessive “meddling”. To release themselves from the mode, a repeated HOLD event would toggle that state.

Once in the escaped state, the camera is responsible for temporarily disabling other control scripts. IFF (if and only if) the scripts containing input control are nicely segregated (or marked by inheriting a special class, but that way lies pain), we could even automatically disable all scripts found in a particular portion of the scene hierarchy. Such decisions would, however, reduce our compatibility with other prefab games and start imposing unnecessary restrictions on developers. For now, we will allow the CameraRig to be populated with a list of game objects to be enabled/disabled.

Slowly, the sketch of a design is coming together:

The CameraRig will contain additional GameObjects for:

  • Monocular/Biocular detection
  • Conflicting/Non-conflicting (but biocular) backgrounds
  • (but NOT eye misalignment fixation assets – they will be implicit to the containing scene)

This implies that the EyeSkillsCameraRig.cs script will need to expose a List of GameObjects to the IDE, so that a developer can say “these are the ones you need to manage when escaping control”. The script will also contain a dedicated input handler which is always listening for, and toggling on, a long HOLD. When it is in an escaped state, it will be necessary not only to activate (and later deactivate) the appropriate set of visible objects and provide a means to control them, but also to choose which functionality one wishes to access!
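
In rough outline, the escape handler could look something like this (the class name, the three-second threshold and the blunt SetActive toggling are my assumptions, just to pin the idea down):

using System.Collections.Generic;
using UnityEngine;

// Sketch of the escape handler described above. The class name, the three second
// threshold and the simple SetActive toggling are assumptions for illustration.
public class EscapeModeSketch : MonoBehaviour
{
    // Registered in the Inspector: "these are the ones you need to manage when escaping control"
    public List<GameObject> managedObjects = new List<GameObject>();

    const float holdSeconds = 3f;
    float downSince = -1f;
    bool escaped = false;

    void Update()
    {
        if (Input.GetButton("EyeSkills Up"))
        {
            if (downSince < 0f)
            {
                downSince = Time.time;         // press began
            }
            else if (Time.time - downSince >= holdSeconds)
            {
                ToggleEscape();
                downSince = float.MaxValue;    // don't re-trigger until released
            }
        }
        else
        {
            downSince = -1f;
        }
    }

    void ToggleEscape()
    {
        escaped = !escaped;
        // While escaped, the scene's registered controllers are switched off so the
        // CameraRig has exclusive control; when we leave the mode they come back.
        foreach (GameObject go in managedObjects)
            go.SetActive(!escaped);
    }
}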

This becomes quite involved. It is possibly beyond the scope of this iteration. For now, with the time available, I feel it is worth laying a marker. At the very least, we should be able to enter this escaped mode and activate the monocular/biocular cues. Depending on how long this takes and how difficult it is, we may proceed to implement another layer of virtual menu (e.g. would you like mono/bioc cues, eye misalignment, binocular suppression testing, or conflict backgrounds?).


The first step is to create a separate EyeSkillsCameraRigEditMode script which we will attach to our camera prefab. This helps us enforce a clean CameraRig API, avoid breaking existing scenes, and back away from the idea more easily should it show itself to be unworthy!

Another interesting point immediately arises: when other scenes dedicated to one aspect of brain hacking (e.g. misalignment measurement) already do exactly what this extended CameraRig offers, why duplicate the effort? The CameraRig may also need a way for a dev to say that a particular piece of functionality is already “on”. We will add this for the Xcular cues (monocular/binocular cues) too.


Time for another quick decision. Either our EyeSkillsCameraRigEditMode script becomes a bit of a monolith containing the command code for all the various modes it supports, or it is simply a coordinating script which activates and deactivates micro-controllers depending on the desired control functionality. TBH, the latter option is appealing. We start with a set of deactivated micro-controllers, and our main script simply monitors for entry to our special mode. It then handles activating those micro-controllers, which follow a convention for their own object setup and tear-down. We are effectively recreating the app structure (a flexible menu which leads to scenes) as an in-scene arrangement in which a flexible menu enables/disables micro-controllers.


Well, that was a very frustrating experience. Ideally, we would enable or disable entire micro-controller scripts, as this means we waste no cycles processing Update() methods in those classes. However, after a series of very contradictory problems, it’s become clear that enabling a script doesn’t instantly make it available, so certain properties may simply be inaccessible if used immediately afterwards. It also seems that delegate-based callbacks don’t play nicely when scripts are enabled/disabled.

Without going into it more deeply, I’ve fallen back to simpler techniques. The main problem was how to allow a micro-controller to indicate to the parent class that it has finished being used. Because our core parent class for managing the micro-controllers will not change (or at least, that is unlikely), I pass it in as an object, safe in the knowledge that the micro-controller can then call a known shutdown method on the parent. Crude, but effective.
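
For reference, the convention boils down to something like the following base class and parent hook. This is a sketch reconstructed from the calls used in the example below; the ActivateMicroController method and the namespace placement are guesses, and the real EyeSkillsCameraRigEditMode does rather more than this:

using UnityEngine;

namespace EyeSkills
{
    // Sketch of the micro-controller convention, reconstructed from how it is used
    // in the XCues example below. ActivateMicroController is my own naming, and the
    // real EyeSkillsCameraRigEditMode contains considerably more than this.
    public abstract class MicroController : MonoBehaviour
    {
        // Called by the parent when this controller is handed control.
        public abstract void Startup(EyeSkillsCameraRigEditMode parent);

        // Called when the controller is finished (by itself, or by the parent on cleanup).
        public abstract void Shutdown();
    }

    public class EyeSkillsCameraRigEditMode : MonoBehaviour
    {
        MicroController current;

        // Hand control to a micro-controller, passing ourselves in so it can report back.
        public void ActivateMicroController(MicroController controller)
        {
            current = controller;
            current.Startup(this);
        }

        // The known shutdown hook a micro-controller calls when it has finished.
        public void MicroControllerShutdown()
        {
            current = null;
            // ...return to mode selection / hand control back to the scene here.
        }
    }
}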

Implementing a MicroController isn’t hard. Here’s the XCues example:

using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using EyeSkills;

public class MicroControllerXCues : MicroController {

    protected AudioManager audioManager;
    private EyeSkillsCameraRigEditMode parentManager;
    public GameObject leftCue, rightCue;
    private bool active = false;

    public void Start()
    {
        deactivateCues();
        audioManager = AudioManager.instance;
    }

    public override void Shutdown()
    {
        active = false;
        deactivateCues();
        parentManager.MicroControllerShutdown();
    }

    public override void Startup(EyeSkillsCameraRigEditMode _parent)
    {
        active = true;
        audioManager.Say("XCuesAvailable");
        parentManager = _parent;
    }

    private void activateCues(){
        leftCue.SetActive(true);
        rightCue.SetActive(true);
    }

    private void deactivateCues(){
        leftCue.SetActive(false);
        rightCue.SetActive(false);
    }

    public void Update()
    {       
        if (active)
        {
            if (Input.GetButton("EyeSkills Up"))
            {
                Debug.Log("Activating Cues");
                activateCues();
            }
            else if (Input.GetButton("EyeSkills Down"))
            {
                Debug.Log("Deactivating Cues");
                deactivateCues();
            }
            else if (Input.GetButton("EyeSkills Confirm") || Input.GetButton("EyeSkills Cancel"))
            {
                Shutdown();
            }
        }
    }

}

The main thing to notice is that we only have to implement two methods: Startup and Shutdown (where Shutdown can also be called should the parent object receive notification that the user is returning to the menu, at which point it’s time to clean up any micro-controllers). We only activate our input and graphical elements on Startup (with the unfortunate side effect that we constantly check a flag in Update, as disabling the script causes too many side effects). On Shutdown we only need to call the supplied parent’s MicroControllerShutdown().

Now we can implement more micro controllers…
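
For instance, a second micro-controller toggling the conflict background might follow exactly the same shape (the class name and the conflictBackground field here are hypothetical):

using UnityEngine;
using EyeSkills;

// A hypothetical second micro-controller following the same convention: it toggles a
// conflict background on the rig. The class name and field are illustrative only.
public class MicroControllerConflictBackground : MicroController
{
    public GameObject conflictBackground;   // assumed: a background on the CameraRig at a large Z distance
    private EyeSkillsCameraRigEditMode parentManager;
    private bool active = false;

    public override void Startup(EyeSkillsCameraRigEditMode _parent)
    {
        parentManager = _parent;
        active = true;
    }

    public override void Shutdown()
    {
        active = false;
        conflictBackground.SetActive(false);
        parentManager.MicroControllerShutdown();
    }

    public void Update()
    {
        if (!active) return;

        if (Input.GetButton("EyeSkills Up"))
            conflictBackground.SetActive(true);
        else if (Input.GetButton("EyeSkills Down"))
            conflictBackground.SetActive(false);
        else if (Input.GetButton("EyeSkills Confirm") || Input.GetButton("EyeSkills Cancel"))
            Shutdown();
    }
}

Each of these plugs into the same escape mode without the containing scene needing to know anything about it.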

 
