sprint-econtai/Temporal Coherence Blog Post Results.md
Félix Dorn 43076bcbb1 old
2025-07-15 00:41:05 +02:00


Methodology

To estimate the importance of temporal coherence in the economy, we used the O*NET database, which lists the occupations in the US economy and the tasks involved in each. We used Barnett's (2025) classification of the remotable tasks in this database.¹ We restrict attention to remotable tasks because the automation potential of AI currently comes mostly from systems that can act in digital environments, rather than in the physical world.

We used a large language model to estimate how long each remotable task would take a human to perform. Where such an estimate is possible, this gives us a lower and an upper bound on the task's duration.

We exclude non-estimable tasks from the analysis since, arguably, these are the tasks most likely to remain human.
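As a minimal sketch of the data this step produces, each task can be thought of as a record holding two optional bounds. The class name, field names, and sample rows below are hypothetical and are not taken from our pipeline or from O*NET; they only illustrate the exclusion rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskEstimate:
    """One remotable task with duration bounds in hours (hypothetical schema).

    Both bounds are None when the task was judged non-estimable.
    """
    task: str
    lower_hours: Optional[float]
    upper_hours: Optional[float]

    @property
    def estimable(self) -> bool:
        return self.lower_hours is not None and self.upper_hours is not None


def keep_estimable(estimates):
    """Drop non-estimable tasks, mirroring the exclusion described above."""
    return [e for e in estimates if e.estimable]


# Made-up rows for illustration only; these are not real O*NET tasks.
sample = [
    TaskEstimate("Prepare a monthly budget report", 2.0, 8.0),
    TaskEstimate("Maintain relationships with clients", None, None),
    TaskEstimate("Draft a grant proposal", 16.0, 80.0),
]
kept = keep_estimable(sample)
```

Storing both bounds, rather than a single point estimate, is what makes the later uncertainty checks possible.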

![][image1]

Results

Our main findings can be summarised in two simple graphs. Figure 1 shows the distribution of tasks by length, as classified by the LLM. The result is clear: more than 80% of tasks can be performed by humans in less than 8 hours of autonomous work. Comparatively few tasks take more than a week, and just a handful take more than 6 months.

![][image2]

The implications of these estimates, if we take them at face value, are striking: it doesn't seem that AIs will have to improve much at temporal coherence for this automation bottleneck to be solved for most tasks.

This is precisely what the second graph shows. Figure 2 combines our estimates of task length with METR's projections for the increase in the temporal coherence of AI models. The result is that by 2026, fewer than 40% of tasks would remain out of reach if automation depended only on temporal coherence. By the end of the decade, AI systems will have enough temporal coherence to essentially take over the US economy (again, if temporal coherence were the only capability left standing). Perhaps even more importantly, the change in AI automation potential looks discrete rather than continuous, going from nearly 0% of tasks in 2025 to more than 60% in 2026. This is because most tasks take around 8 hours to complete. Though we don't explore them here, it's clear this has important policy implications.

![][image3]
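The logic behind Figure 2 can be sketched as follows, assuming the time horizon grows exponentially with a fixed doubling time and that each task is summarised by a single duration. The starting horizon, the doubling time, and the task durations below are illustrative placeholders; they are neither METR's fitted parameters nor our data.

```python
def horizon_hours(year, base_year, base_horizon_hours, doubling_months):
    """Time horizon in hours under exponential growth with a fixed doubling time."""
    months_elapsed = (year - base_year) * 12.0
    return base_horizon_hours * 2.0 ** (months_elapsed / doubling_months)


def automatable_share(task_hours, horizon):
    """Fraction of tasks whose estimated duration fits within the horizon."""
    if not task_hours:
        return 0.0
    return sum(1 for h in task_hours if h <= horizon) / len(task_hours)


# Placeholder durations (hours), deliberately clustered around 8 hours.
task_hours = [0.5, 2, 4, 8, 8, 8, 8, 40, 160, 1000]

for year in (2025, 2026, 2027):
    # base_horizon_hours and doubling_months are assumptions for illustration.
    h = horizon_hours(year, base_year=2025, base_horizon_hours=1.0, doubling_months=4.0)
    print(year, h, automatable_share(task_hours, h))
```

Because the placeholder durations cluster at 8 hours, the share jumps sharply once the horizon crosses that threshold, which is the kind of step change described above.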

To understand how much economic value could be unlocked if temporal coherence were solved, conditional on it being the fundamental bottleneck, we investigated how much temporal coherence the five most remotable occupations require.

![][image4]

As the graph shows, in all of the top five occupations except Architecture and Engineering, 80% of tasks can be completed in a day (8 hours of work). Because AI agents could achieve this level of temporal coherence in 2026 according to METR's projections, they could unlock a tremendous amount of economic value. To get a rough sense of the magnitude, suppose temporal coherence were enough for task automation. Then, if 80% of tasks in all of these occupations can be automated, this could amount to $3.72 trillion of economic value. This is of course a simplification and an upper bound, but it illustrates how quickly AI could transform the economy thanks to improvements in temporal coherence.

One question remains: how confident are we in our estimates of task duration?

Half of all tasks are estimated to take between two and twenty hours. For 75% of tasks, the lower-bound estimate is at most 1 day and the upper-bound estimate is at most 6 days. These estimates sound reasonable for many O*NET tasks.

We find that 90% of task duration estimates have an upper-to-lower-bound ratio of at most 10.

![][image5]

The very large ratios accurately reflect ambiguity in the task description. For instance, the task "Conduct research in a particular field of knowledge and publish findings in professional journals, books, or electronic media" is estimated to take between 40 hours and 1 year (a 216x difference). Another example is the task "Create or use statistical models for the analysis of genetic data", which is estimated to take between one day and six months (a 180x difference).
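For readers who want to check the two ratios, here is a small sketch. The unit conversions (24-hour days, 30-day months, 12-month years) are an assumption on our part, chosen because they reproduce the quoted figures; the helper names are hypothetical.

```python
# Assumed conversions: calendar time, not 8-hour working days.
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30
MONTHS_PER_YEAR = 12


def bound_ratio(lower_hours, upper_hours):
    """Ratio of the upper duration bound to the lower duration bound."""
    if lower_hours <= 0:
        raise ValueError("lower bound must be positive")
    return upper_hours / lower_hours


def share_within(ratios, max_ratio):
    """Fraction of tasks whose bound ratio is at most max_ratio."""
    if not ratios:
        return 0.0
    return sum(1 for r in ratios if r <= max_ratio) / len(ratios)


one_day = HOURS_PER_DAY
six_months = 6 * DAYS_PER_MONTH * HOURS_PER_DAY
one_year = MONTHS_PER_YEAR * DAYS_PER_MONTH * HOURS_PER_DAY

research_ratio = bound_ratio(40, one_year)         # 40 hours to 1 year
genetics_ratio = bound_ratio(one_day, six_months)  # 1 day to 6 months
```

Under these assumptions `research_ratio` is 216 and `genetics_ratio` is 180. `share_within(ratios, 10)` is the quantity behind the 90% figure, given the full list of per-task ratios.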

The value of our results

Why should you care about these results? First, our estimates of the distribution of task length for the US economy suggest that temporal coherence (or whatever name you want to give to the capacity to work autonomously on a task until completion) is a less important factor than we expected. The fact that 80% of tasks can be completed in 8 hours or less suggests that, regardless of the actual importance of temporal coherence for unlocking economic value, it will not be a bottleneck for very long. When we combine our estimates with METR's projections of the increasing capacity of AI agents to perform tasks over longer time horizons, we find that by 2026, temporal coherence won't be an issue for automating more than 60% of all tasks.

Second, once you add the assumption that temporal coherence is likely to be the biggest bottleneck among the abilities that AIs currently lack (or have not perfected), our results suggest that a lot of economic value could be unlocked by increases in AI agents' temporal coherence,² and that this could happen fairly soon. This has potential policy implications that we do not explore in this piece but that seem particularly relevant to concerns about gradual disempowerment.


  1. We think Barnett's classification could be improved further, for example by using O*NET's Physical Work Conditions annotations. ↩︎

  2. This is not guaranteed, however, because actual automation depends on other factors that we don't analyze here, such as the cost of running AI agents and integrating them into existing businesses, regulation, and other frictions. ↩︎