sprint-econtai/Temporal Coherence Blog Post Results.md at master

Félix Dorn 43076bcbb1 old

2025-07-15 00:41:05 +02:00


Methodology

To estimate the importance of temporal coherence in the economy, we used the O*NET database, which lists all occupations in the US economy and the tasks involved in each occupation. We restricted the analysis to the tasks that Barnett 2025’s classification marks as remotable¹. The reason is that the automation potential of AI currently comes mostly from systems that can act in digital environments, rather than in the physical world.

We used a large language model to estimate how long each remotable task would take a human to perform. Where such an estimate is possible, this gives us a lower and an upper bound for each task.

We excluded non-estimable tasks from the analysis, since these are arguably the tasks most likely to remain with humans.
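The filtering step above can be sketched as follows. This is a minimal illustration, not the pipeline we ran: the `TaskEstimate` schema, its field names, and the example rows are all hypothetical, and the durations are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskEstimate:
    """One remotable task with an LLM-estimated duration range (hypothetical schema)."""
    task: str
    lower_hours: Optional[float]  # None when no estimate could be produced
    upper_hours: Optional[float]  # None when no estimate could be produced


def estimable(tasks):
    """Keep only the tasks for which both duration bounds are available."""
    return [
        t for t in tasks
        if t.lower_hours is not None and t.upper_hours is not None
    ]


# Illustrative rows only; these durations are invented, not taken from our data.
tasks = [
    TaskEstimate("Prepare a weekly status report", 1.0, 4.0),
    TaskEstimate("Draft a project proposal", 8.0, 40.0),
    TaskEstimate("Provide ongoing mentorship", None, None),
]

kept = estimable(tasks)
print(len(kept), "of", len(tasks), "tasks kept")
```

Representing a missing estimate as `None` keeps non-estimable tasks in the raw data, so the share of excluded tasks can still be reported.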

![][image1]

Results

Our main findings can be summarised in two simple graphs. Figure 1 shows the distribution of tasks by length, as classified by the LLM. The result is clear: more than 80% of tasks can be performed by a human in less than 8 hours of autonomous work. Comparatively few tasks take more than a week, and just a handful take more than 6 months.

![][image2]

The implications of these estimates, if we take them at face value, are striking: AIs may not have to improve much at temporal coherence for this automation bottleneck to be ‘solved’ for most tasks.

This is precisely what the second graph shows. Figure 2 combines our task-length estimates with METR’s projections for the increase in the temporal coherence of AI models. If automation depended only on temporal coherence, fewer than 40% of tasks would remain out of reach by 2026. By the end of the decade, AI systems would have enough temporal coherence to essentially take over the US economy, again assuming temporal coherence were the only capability left standing. Perhaps even more importantly, the change in AI automation potential looks discrete rather than continuous, going from nearly 0% of tasks in 2025 to more than 60% in 2026. This is because most tasks take around 8 hours to complete. Though we don’t explore them, this clearly has important policy implications.

![][image3]
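The way a task-length distribution and a time-horizon projection combine can be sketched as below. Everything here is an assumption made for illustration: the exponential trend and its parameters are placeholders rather than METR’s actual projection, the durations are invented, and using the upper bound as the cut-off is only one possible choice.

```python
def share_within_horizon(upper_bounds_hours, horizon_hours):
    """Fraction of tasks whose upper-bound duration fits within the AI time horizon."""
    if not upper_bounds_hours:
        return 0.0
    covered = sum(1 for h in upper_bounds_hours if h <= horizon_hours)
    return covered / len(upper_bounds_hours)


def horizon_at(year, base_year, base_horizon_hours, doubling_years):
    """Time horizon under a hypothetical trend that doubles every `doubling_years`."""
    return base_horizon_hours * 2 ** ((year - base_year) / doubling_years)


# Made-up durations (hours) and trend parameters, for illustration only.
durations = [0.5, 2.0, 8.0, 8.0, 8.0, 40.0, 160.0, 1000.0]

for year in (2025, 2026, 2028, 2030):
    horizon = horizon_at(
        year, base_year=2025, base_horizon_hours=1.0, doubling_years=0.25
    )
    share = share_within_horizon(durations, horizon)
    print(year, round(horizon, 1), share)
```

With these invented numbers the covered share jumps from 12.5% to 62.5% between 2025 and 2026, because several tasks cluster at 8 hours. This is the same mechanism that makes the change look discrete rather than continuous.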

To understand how much economic value could be unlocked if temporal coherence were solved, conditional on it being the fundamental bottleneck, we investigated how much temporal coherence the five most remotable occupations require.

![][image4]

As the graph shows, in all of the top five occupations except Architecture and Engineering, 80% of tasks can be completed in a day (8 hours of work). Because AI agents could reach this level of temporal coherence in 2026, according to METR’s projections, they could unlock a tremendous amount of economic value. To get a rough sense of the magnitude, suppose temporal coherence were enough for task automation. Then, if 80% of tasks in all of these occupations can be automated, this could amount to $3.72 trillion of economic value. This is of course a simplification and an upper bound, but it illustrates how quickly AI could transform the economy thanks to improvements in temporal coherence.

One question remains: how confident are we in our estimates of task duration?

Half of all tasks are estimated to take between two and twenty hours. 75% of lower-bound estimates are at most 1 day, and 75% of upper-bound estimates are at most 6 days. These estimates sound reasonable for many O*NET tasks.

We find that 90% of task duration estimates have an upper-to-lower bound ratio of 10 or less.

![][image5]

Tasks with very large ratios reflect the ambiguity of the task description. For instance, the task ‘Conduct research in a particular field of knowledge and publish findings in professional journals, books, or electronic media’ takes between 40 hours and 1 year (216x difference). Another example is the task ‘Create or use statistical models for the analysis of genetic data’, which takes between one day and six months (180x difference).
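The two ratios quoted above can be reproduced with a short calculation. The unit conversions are an assumption, since they are not stated in the text: calendar hours (24 per day) with 30-day months and a 360-day year is one convention that yields both figures.

```python
HOURS_PER_DAY = 24    # assumption: calendar hours, not 8-hour working days
DAYS_PER_MONTH = 30   # assumption
DAYS_PER_YEAR = 360   # assumption: 12 months of 30 days

HOURS_PER_UNIT = {
    "hour": 1,
    "day": HOURS_PER_DAY,
    "month": DAYS_PER_MONTH * HOURS_PER_DAY,
    "year": DAYS_PER_YEAR * HOURS_PER_DAY,
}


def to_hours(value, unit):
    """Convert a duration to hours under the conventions assumed above."""
    return value * HOURS_PER_UNIT[unit]


def bound_ratio(lower, upper):
    """Ratio of the upper to the lower duration bound, each given as (value, unit)."""
    return to_hours(*upper) / to_hours(*lower)


research_ratio = bound_ratio((40, "hour"), (1, "year"))
genetics_ratio = bound_ratio((1, "day"), (6, "month"))
print(research_ratio, genetics_ratio)
```

Under an 8-hour working day the same bounds would give different ratios, so the convention matters when comparing ratios across tasks.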

The value of our results

Why should you care about these results? First, our estimates of the distribution of task length for the US economy point to temporal coherence (or whatever name you want to give to the capacity to work autonomously on a task until completion) being a less important factor than we expected. The fact that 80% of tasks can be completed in 8 hours or less suggests that, regardless of the actual importance of temporal coherence for unlocking economic value, it will not be a bottleneck for very long. When we combine our estimates with METR’s, which project the increasing capacity of AI agents to perform tasks over longer time horizons, we find that by 2026 temporal coherence won’t be an issue for automating more than 60% of all tasks.

Second, if we add the assumption that temporal coherence is the biggest bottleneck among the abilities that AIs currently lack (or have not perfected), our results suggest that a lot of economic value could be unlocked by increases in AI agents’ temporal coherence,² and that this could happen fairly soon. This has potential policy implications that we do not explore in this piece, but that seem particularly relevant to concerns about gradual disempowerment.

¹ We think Barnett’s classification could be improved further, for example by using O*NET’s Physical Work Conditions annotations.

² This is not guaranteed, however, because actual automation depends on other factors which we don’t analyze here, such as the cost of running AI agents and integrating them into existing businesses, regulation, and other frictions.
