At the current level of abstraction, some people may be concerned about how "inefficient" or "intractable" such-and-such a part of the design is, citing various AI war stories about how some approach was already tried, and failed, back in the 1960s, 70s, 80s and so on.
At the present (3rd) level of abstraction, what we mostly care about is that information flows through the machine in such a way that it behaves functionally in the way that we want (maximally-safe, -benevolent, and -trustworthy), largely ignoring time and space requirements for the time being.
As we refine each roadmap step down towards a concrete implementation, the physical resources required will become substantial (particularly in respect of C01/UBT). This is to be expected: AI/AGI is computationally undecidable, semi-decidable, or intractable whichever way you try to achieve it. (In fact, this is where the unprecedented value generated by AI/AGI is derived: any specific case in which you can solve a problem that is "provably impossible in the general case" will often generate much greater real-world value than a general solution to a much easier, fully-decidable problem.)
Thus, as we refine the present design down to a concrete implementation, we will gradually apply various speedup techniques, for example:
- Algorithmic acceleration - This is where we take a much closer look at the time and space complexity of our implementations ("algorithm X runs in O(n²) time, whereas algorithm Y runs in O(n log n) time" etc). For example, some data representations (data structures) allow the variant of a loop to be decreased by a greater amount on each iteration (corresponding to greater computational work being performed per step), thereby allowing the computation to progress more quickly.
- Statistical acceleration, Type 1: Statistical information about the data to which an algorithm is applied in actual use (which can only be collected empirically) may inform the decision as to which of several competing algorithms is optimal in terms of time and space complexity.
- Statistical acceleration, Type 2: A complex decision point within an algorithm (such as state space search) may be facilitated using a neural net or other machine learning model trained on data gleaned from many prior executions (e.g. AlphaGo). Note that using e.g. a neural net to speed up a state space search algorithm such as theorem proving or witness synthesis is broadly analogous to Kahneman's System 1 vs System 2 ("Thinking, Fast and Slow"). A lower-level, intuitive, pattern-matching mechanism (System 1), massively parallel and trained on a lifetime's experience of problem examples, makes a very fast, but not necessarily correct, "quick-and-dirty guess". This guess is then double-checked, and applied as appropriate, by a much more sequential, deliberate, and above all precise higher-level algorithm (System 2), such that the two systems working together produce something greater than the sum of its parts. Anywhere there's state-space search (e.g. deduction, abduction), we can apply this speedup.
- Parallel acceleration - Some algorithms may be partitioned into sub-tasks which are then distributed over many cores executing in parallel. (In some cases tight coupling will be possible; in others, for example where a lot of information needs to be shared between cooperating processes, only loose coupling will be appropriate.)
- Hardware acceleration - If profiling reveals a program hotspot, then that specific functionality may be directly implemented in hardware (e.g. FPGA, ASIC). Note that Amazon AWS now supports FPGA-accelerated instances.
- Targeted hardware - Hardware designed for a specific algorithm will generally execute that algorithm (many thousands or even millions of times) more quickly than general purpose hardware. Cue B03 and B04!
- Auto-refactoring - Once the program, hardware, and/or system synthesis tools developed at roadmap steps B02 to B04 have achieved greater-than-human capability in respect of any of the first six techniques above, then the machine will be able to synthesise a provably correct, faster implementation of itself.
- Uncle Bob's Law - In the preface to his 2018 book Clean Architecture, veteran software engineer Robert C. Martin describes how computer technology improved by 22 orders of magnitude in his first 50 years as a programmer. Given that ours is a 50-100 year project, it would not be unreasonable to anticipate a further 10-20 orders of magnitude improvement over the next 50-100 years. I say this despite the fact that Moore's Law is coming to an end: given the economic drivers, it would be unwise to underestimate the ingenuity of the hardware industry!
- Attention - In principle, given sufficient compute, an intelligent system (such as an AGI) should be able to process an arbitrarily large number of distinct tasks simultaneously. Nevertheless, however much compute is available, the possibility will always exist that some tasks will need to be prioritised (given disproportionate attention) over others. Thus any sensible AGI design will assume that compute is always a scarce resource and incorporate some kind of attention mechanism, effectively rationing compute over tasks.
- Approximation - It is often faster to compute an approximate result (corresponding to a relatively shallow semantic understanding) rather than a fully precise result, and such techniques are often touted as being indispensable for practical AI/AGI to be possible. Nevertheless, we must not forget that it is primarily the digital computer's precision of thought that makes super-intelligence (far exceeding that of humans, potentially by many orders of magnitude) - and the corresponding practical utility that comes with it - possible. Should the technique of approximation be overly or inappropriately applied in an AI/AGI system, merely in order to preserve scarce compute (which, thanks to Uncle Bob's Law, we can be pretty sure will become less scarce with every year that passes), then we risk throwing the baby out with the bathwater. Consequently, whenever the necessary compute is available, an AGI should always perform the precise calculation, only falling back on approximation (via graceful degradation) as a last resort. Thus precision (corresponding to a deep semantic understanding) should be the default, and approximation the fallback.
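To make the algorithmic-acceleration point concrete, here is a minimal (purely illustrative) Python sketch. Linear search decreases its loop variant (the number of elements left to check) by 1 per iteration, whereas binary search halves its variant (the width of the candidate interval) each time, giving O(n) vs O(log n) behaviour; the choice of data representation (unsorted vs sorted) is what makes the faster variant-decrease possible.

```python
def linear_search(xs, target):
    """O(n): the loop variant (items left to check) shrinks by 1 per iteration."""
    for i, x in enumerate(xs):
        if x == target:
            return i
    return -1

def binary_search(sorted_xs, target):
    """O(log n): the variant (width of the candidate interval) halves per iteration.

    Requires the sorted data representation; that is what buys the speedup.
    """
    lo, hi = 0, len(sorted_xs)
    while lo < hi:
        mid = (lo + hi) // 2
        if sorted_xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(sorted_xs) and sorted_xs[lo] == target:
        return lo
    return -1
```

Both functions return the same answers; only the amount of work per unit of progress differs.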
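Type 1 statistical acceleration can be sketched as follows (a toy illustration; the function and algorithm names are invented for the example): two competing implementations of the same operation are timed against a sample of inputs drawn from the real workload, and the empirically fastest one is selected.

```python
import timeit

def pick_fastest(candidates, sample_inputs, repeats=3):
    """Empirically time each candidate algorithm on inputs drawn from the
    actual workload and return the one with the lowest observed runtime."""
    def best_time(fn):
        return min(
            timeit.timeit(lambda: [fn(x) for x in sample_inputs], number=1)
            for _ in range(repeats)
        )
    return min(candidates, key=best_time)

# Two hypothetical competing implementations of "count distinct elements".
def count_unique_sorted(xs):
    """O(n log n) comparison-based approach."""
    if not xs:
        return 0
    ys = sorted(xs)
    return 1 + sum(1 for a, b in zip(ys, ys[1:]) if a != b)

def count_unique_set(xs):
    """O(n) expected, hashing-based approach."""
    return len(set(xs))
```

Which candidate wins depends on the statistics of the data actually encountered (size, duplication rate, hashability cost), which is exactly why the measurement must be empirical.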
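The System 1 / System 2 pattern of Type 2 statistical acceleration can be sketched as below. Here a hand-written Manhattan-distance heuristic stands in for the trained neural net (a loud simplification: in a real system the scoring function would be learned from many prior searches); it orders the search frontier (the fast guess), while the search loop itself only accepts a state after an exact goal check (the precise double-check).

```python
import heapq

def system1_score(state, goal):
    """Stand-in for a trained model: a cheap heuristic guessing how promising
    a state is. In a real system this would be a learned function (e.g. a
    neural net) trained on data from many prior executions."""
    return abs(state[0] - goal[0]) + abs(state[1] - goal[1])

def system2_search(start, goal, neighbours):
    """Best-first search: System 1 orders the frontier; System 2 (the exact
    goal test in the loop) decides what is actually accepted as a solution."""
    frontier = [(system1_score(start, goal), start, [start])]
    seen = {start}
    while frontier:
        _, state, path = heapq.heappop(frontier)
        if state == goal:  # the precise check, not the heuristic's guess
            return path
        for nxt in neighbours(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(
                    frontier,
                    (system1_score(nxt, goal), nxt, path + [nxt]),
                )
    return None

def grid_neighbours(state):
    x, y = state
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
```

The heuristic may mis-rank states without harming correctness; it only affects how quickly the precise mechanism reaches an answer, which is the essence of the speedup.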
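Parallel acceleration in its loosely-coupled form can be sketched as below (illustrative names throughout): the input is partitioned into independent chunks, each chunk is processed by a separate worker process with no shared state, and the partial results are combined at the end.

```python
from concurrent.futures import ProcessPoolExecutor

def partial_sum_of_squares(chunk):
    """Independent sub-task: shares no data with other workers (loose coupling)."""
    return sum(x * x for x in chunk)

def partition(xs, n_parts):
    """Split the input into roughly equal, independent chunks."""
    k, r = divmod(len(xs), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + k + (1 if i < r else 0)
        chunks.append(xs[start:end])
        start = end
    return chunks

def parallel_sum_of_squares(xs, workers=4):
    """Distribute the chunks over worker processes and combine the results."""
    chunks = partition(xs, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum_of_squares, chunks))

if __name__ == "__main__":
    assert parallel_sum_of_squares(list(range(1000))) == sum(x * x for x in range(1000))
```

A tightly-coupled variant would instead use shared memory between workers; that buys lower communication cost at the price of synchronisation complexity.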
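One minimal way to realise the attention mechanism described above is a priority queue that rations a fixed compute budget over tasks, one unit at a time, always serving the task currently deserving the most attention. This is a toy sketch (class and method names are invented for the example), not a proposal for the actual mechanism.

```python
import heapq
from itertools import count

class AttentionScheduler:
    """Rations a fixed compute budget over tasks by priority: each step,
    the highest-priority unfinished task receives one unit of compute."""

    def __init__(self):
        self._heap = []
        self._tie = count()  # FIFO ordering among equal priorities

    def submit(self, priority, steps, task):
        """Lower priority number = more attention. `steps` is the task's
        remaining units of work; `task` is called once per unit."""
        heapq.heappush(self._heap, (priority, next(self._tie), steps, task))

    def run(self, budget):
        """Spend `budget` units of compute; return the tasks that finished."""
        finished = []
        while budget > 0 and self._heap:
            prio, tie, steps, task = heapq.heappop(self._heap)
            task()  # one unit of work on this task
            budget -= 1
            if steps > 1:
                heapq.heappush(self._heap, (prio, tie, steps - 1, task))
            else:
                finished.append(task)
        return finished
```

Note that strict priority ordering deliberately starves low-priority tasks when the budget is tight, which is exactly the "disproportionate attention" behaviour described above; a production design would likely add ageing or fairness on top.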
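The precision-by-default, approximation-as-fallback policy above can be sketched as follows (a toy example with invented names): when the compute budget covers the exact computation, perform it; only when it does not, degrade gracefully to a sampled estimate.

```python
import random

def mean_with_fallback(xs, compute_budget, rng=None):
    """Precision by default: scan every element when the budget allows.
    Graceful degradation: fall back to a random-sample estimate (an
    approximation) only when the exact computation exceeds the budget.

    Returns (value, mode) where mode records which path was taken.
    """
    if len(xs) <= compute_budget:
        return sum(xs) / len(xs), "exact"
    rng = rng or random.Random(0)  # seeded here only for reproducibility
    sample = rng.sample(xs, compute_budget)
    return sum(sample) / len(sample), "approximate"
```

Returning the mode alongside the value matters: a deep-semantic system should know, and be able to report, whether a given answer was computed precisely or merely estimated.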