*I originally sent this to some team lists. Cross-posting this to mobile-firefox-dev, so it is fully public.*

Accessibility Changes in GeckoView 64

Hello!

We have some major changes coming to our accessibility API support in GV 64/65. Since this stuff straddles two modules, I figured it would be good to write what may be the start of a design doc, or at the very least a big-picture brain dump. Sorry if it's a wall of text; maybe if I recorded this as a podcast you would listen? I hope this is helpful to someone.

tl;dr

64 introduces a major overhaul of our accessibility support. The goal is to provide a WebView-comparable API and to support multiple use cases. Android's accessibility API is synchronous on the UI thread, so we put in some interesting caching measures to make sure we never block.

------------------------------

In the last few months, I have been redoing a lot of our accessibility support in Fennec and GeckoView. Most of this got fleshed out in Bugzilla. Since only a few reviewers saw a piecemeal picture of these changes, I wanted to lay out here what has changed, why, and what has improved. The changes are split among two modules, GeckoView (mobile/) and accessibility (accessible/), and so were the reviewers. My aim here is to give a better picture of what's up.

The time before

When Fennec was first released as a native application (after the rewrite from the XUL app), Gingerbread was the largest portion of our install base. Gingerbread had very little accessibility support to begin with: a minimal API with no concept of node hierarchies and only simple accessibility events, with AccessibilityEvent.TYPE_VIEW_FOCUSED being the most used primitive.

In Fennec, we decided to use the platform's a11y API and tool it in a way that would optimize the screen reader user's experience. We did this mostly through the addition of a JS module that would pass screen reader tailored events back to the platform. By "screen reader tailored", I mean that they would account for things that screen reader users care about, like announcing roles (e.g. 'read more, link') and context (e.g. 'first item in list with 10 items'). On a typical platform it is the screen reader's job to deduce the context of the user's cursor position and synthesize a helpful utterance. In Fennec we were essentially spoon-feeding the screen reader with helpful utterances.

In addition to the tailored output, we implemented other features that are traditionally left to the assistive tech. We brewed our own "accessibility cursor" that followed different navigation rules and allowed the user to explore the page with a keyboard (slider keyboards were common), and later with gestures. We also rendered a rectangle around the current accessibility cursor's position.

The JS layer acted as an intermediary that encapsulated the Gecko accessibility tree structure and hid it from the platform, while at the same time providing useful output for screen reader interactions. This allowed us to offer advanced features that were not available in Gingerbread and were in their infancy in Ice Cream Sandwich: things like explore by touch, advanced gestures, and navigating pages by granularity.

When Mozilla pivoted to Firefox OS, we took our Android JS layer and expanded it into a standalone screen reader. Since we had already assumed many roles of a screen reader in Android (utterance generation, gesture detection, tree navigation, etc.), we already had what we needed. We just sent our utterances directly to speech synthesis instead of to an Android 'focus' accessibility event.

The Now

A jury-rigged solution for TalkBack has served its purpose, and we are in need of a generalized accessibility API that can handle any number of use cases. Since 2011 some new realities have emerged:

1. Android's accessibility API has evolved and is fully featured, including accessible object hierarchy support, many more event types, native support for tables and collections, advanced text support, etc.
2. TalkBack has matured and gained many more features and advanced gestures.
3. Google's accessibility suite has expanded and includes other accessibility services besides the TalkBack screen reader, like switch access and tap to speak.
4. Other OEMs package their own specialized screen reader experiences, like Amazon's VoiceView or Samsung's Voice Assistant.
5. Developers who use GeckoView expect to have UIAutomator support. This is a UI test automation framework that leverages the a11y API.
6. Password manager apps have grown to rely on accessibility APIs for form autofill (while Google addressed this specific use case with a new autofill API introduced in API 26, many older devices still use the a11y API).

Instead of the JS intermediary, we need to expose the accessibility tree and its functionality to the platform and allow a diverse set of accessibility services to interface with it, just like we do on desktop platforms. In the first set of patches in bug 1479037, I did just that.

The changes in bug 1479037 remove a lot of the JS bits we used for AccessibilityEvent generation and replace them with proper proxying of Gecko accessibility events and the exposure of our full Gecko accessibility tree. This gives accessibility services like TalkBack, and form autofill apps, enough information to be useful. This change was also a first naive pass, and it introduced some major performance regressions.

Blockage

The Android API call for retrieving an accessible node object from a custom view (createAccessibilityNodeInfo) is synchronous and is called on the UI thread. In Gecko accessibility, we rely heavily on DOM, and many of the calls require us to be on the main Gecko thread. So in order to fully populate an Android AccessibilityNodeInfo object, we need to synchronously call into the Gecko thread from the UI thread. That's not good. What makes it really bad is e10s. If e10s is in play, we need to perform a handful of synchronous IPDL calls into a content process to get all the info we need to populate a node. The UI thread remains blocked until a full accessible node is returned.

With all of that horribleness, I still decided to start with a naive implementation to see how bad it actually was from a perceived performance standpoint. It's bad. Not only because of the issues above, but because accessibility services (and often core Android components) assume traversing accessible trees is cheap, so they do it very often. This makes things like scrolling janky: an autofill service will poll the entire tree for input fields on each scroll event.

Early on, when I looked into how Chrome was managing this, it became clear that they had a full, thread-safe cache of all accessibles in all open tabs in the top-level process. So this brings up the solution, and loaded issue, of caching.

Caching

The design debates around e10s and accessibility are long-running. In the run-up to e10s, we needed to make some choices about how we were going to support it in a performant way on desktop. Assistive technologies (accessibility services in Android, or clients in Windows) see each application as a unified tree of accessible nodes. For us to represent a unified tree, we would need to query the content processes synchronously for content accessibility nodes. Chrome has solved this issue by caching the entire accessible tree of every tab in the top-level process (some more background: it took them several years to develop a robust caching scheme, which is hard and still buggy, and in the meantime they had no accessibility API support). On Windows, we decided to use COM redirection tricks. They essentially allow assistive technologies to query information directly from the content processes while still maintaining the illusion of a unified tree.

So what were we to do? Do like Chrome and Cache All the Things? Yes and no. I have experimented with two cache types that give us just enough information to be useful, but are lightweight enough that they can be frequently invalidated and not become stale.
By default, GeckoView's accessibility cache is *masked*, meaning a node will exclude non-cached child nodes from its child list, so an accessibility service only knows about the cached ones and will not query us for nodes outside the cache. We currently serve up two *unified* caches, meaning we query both caches and take the most recent attributes from each. With these two caches we hope to get the best of both worlds: immediate access to accessible nodes with just enough information to be useful, while keeping the cache current and uncorrupted. This lets us avoid the pitfalls of maintaining a long-term cache that would need to stay in sync via tree deltas and could potentially diverge. In short, our cache might not reflect the true accessibility state 100% of the time, but at least it is short lived and gets updated often enough that users won't perceive it.

Here are the two caches:

1. Viewport Cache

The viewport cache consists of all accessible nodes that are visible in the GeckoView viewport. It is fully repopulated on every scroll and on each tree mutation. It is throttled so it won't happen more than twice a second. The information on each node includes its screen bounds, its role (AccessibilityNodeInfo.getClassName()), and its state (AccessibilityNodeInfo.is*()). The information in this cache has proven to be enough for form autofill services and switch access, neither of which requires low-latency cache updates or full text contents support.

2. Focus Path Cache

The focus path cache gets replaced on each focus or accessibility focus change. It consists of the focused node and its chain of ancestors. It has a higher fidelity of information on each node compared to the viewport cache, and contains text, localized role descriptions, as well as range/collection info. This cache primarily serves screen readers like TalkBack, which need a lot of information about a single focused node and are primarily focus-driven.
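
To make "masked" and "unified" a bit more concrete, here is a minimal sketch of how two such caches could be queried together. It is a plain Java model, not the GeckoView code: the class, method, and attribute key names are all made up for illustration, and "most recent" is modeled with a simple counter that records which cache was replaced last.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative model only: every name here is hypothetical, not GeckoView's. */
final class NodeCaches {
    /** One cached node: its id, the ids of all its children, and string attributes. */
    static final class CachedNode {
        final int id;
        final List<Integer> childIds;
        final Map<String, String> attrs;

        CachedNode(int id, List<Integer> childIds, Map<String, String> attrs) {
            this.id = id;
            this.childIds = childIds;
            this.attrs = attrs;
        }
    }

    private Map<Integer, CachedNode> viewport = new HashMap<>();
    private Map<Integer, CachedNode> focusPath = new HashMap<>();
    private long clock = 0;          // bumped on every cache replacement
    private long viewportStamp = 0;  // clock value when each cache was last replaced
    private long focusPathStamp = 0;

    /** Replace the whole viewport cache, e.g. after a scroll or a tree mutation. */
    void replaceViewport(List<CachedNode> nodes) {
        viewport = index(nodes);
        viewportStamp = ++clock;
    }

    /** Replace the whole focus path cache, e.g. after a focus change. */
    void replaceFocusPath(List<CachedNode> nodes) {
        focusPath = index(nodes);
        focusPathStamp = ++clock;
    }

    private static Map<Integer, CachedNode> index(List<CachedNode> nodes) {
        Map<Integer, CachedNode> byId = new HashMap<>();
        for (CachedNode n : nodes) {
            byId.put(n.id, n);
        }
        return byId;
    }

    boolean isCached(int id) {
        return viewport.containsKey(id) || focusPath.containsKey(id);
    }

    /** Unified lookup: merge attributes from both caches; the newer cache wins on conflicts. */
    Map<String, String> lookup(int id) {
        boolean viewportNewer = viewportStamp > focusPathStamp;
        CachedNode older = (viewportNewer ? focusPath : viewport).get(id);
        CachedNode newer = (viewportNewer ? viewport : focusPath).get(id);
        Map<String, String> merged = new HashMap<>();
        if (older != null) merged.putAll(older.attrs);
        if (newer != null) merged.putAll(newer.attrs);
        return merged; // empty when the node is in neither cache
    }

    /** Masked child list: only children present in at least one cache are reported. */
    List<Integer> maskedChildren(int id) {
        CachedNode node = viewport.containsKey(id) ? viewport.get(id) : focusPath.get(id);
        List<Integer> visible = new ArrayList<>();
        if (node == null) return visible;
        for (int childId : node.childIds) {
            if (isCached(childId)) visible.add(childId);
        }
        return visible;
    }
}
```

In this model, a list node whose second child is scrolled out of view only reports the first child, and a focused link picks up its role from the viewport cache and its text from the focus path cache in a single lookup.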

Cache updates

Generally, the caches are fully replaced each time a cache-relevant event happens (e.g. focus invalidates the focus path cache, scroll invalidates the viewport cache). But there are several events that mutate a cached node in place instead. Selected or checked events will change the selected or checked state of a cached node. Focus or accessibility focus events will update the stored ID of the focused/accessibility focused node.

When not to cache

By default, the cache is on. A cache miss doesn't fall back on the slow blocking path in Gecko/e10s; instead, the root node is returned. So there is no risk of ever getting into a blocking call.

When a developer wishes to test their app with UIAutomator, they may turn off caching and access the canonical Gecko accessibility tree directly. That will give them full access to all of the tree's nodes and their resource IDs (aka DOM node IDs).

Comparison with Chrome's implementation

The advantage that this scheme has over Chrome's is that it is leaner. It loads faster and it is theoretically lighter on memory (it's not really optimized for storage yet, since it uses GeckoBundle string key and value pairs). I'm proud to say that from a TalkBack performance perspective we are doing just as well as, and often better than, Chrome. There are some issues with responsiveness that are out of our control; for example, gesture detection and speech synthesis have constant latencies.

Schedule

The naive full tree implementation, with all the horrible blocking, landed in 64. In the meantime caching was implemented and landed in Nightly. I wish the timing was better on that. We hope to have it uplifted to Beta this week.

Further Work

Now that the main foundation is set, there are other items and features we can work on to harden and extend this accessibility support. Here is a partial unordered list:

- A live regions cache for better screen reader experiences
- Tuning our support for switch access
- Tuning our support for braille
- Adding table support
- Adding heading level support
- Further performance tweaks
- Look into Amazon's VoiceView and Samsung's Voice Assistant
- Look into tap to speak support
- Look into d-pad support (for FireTV remotes)

Thanks for reading this through!!