Radio frequency (RF) technologies are often used to track assets in indoor environments. Among them, ultra-wideband (UWB) has steadily gained interest thanks to its ability to achieve typical errors of 30 cm or lower, making it more accurate than other wireless technologies such as WiFi, which typically locates a device only to within several meters. However, mainly due to technical requirements that are part of the standard, conventional medium access strategies such as clear channel assessment are not straightforward to implement. Since most scientific papers focus on improving UWB accuracy for a single user, it is not clear to what extent this limitation and other design choices impact the scalability of UWB indoor positioning systems. We investigated the scalability of indoor localization solutions to show that UWB can be used when hundreds of tags are active in the same system. This paper provides mathematical models that calculate the theoretical supported user density for multiple localization approaches, namely Time Difference of Arrival (TDoA) and Two-Way Ranging (TWR), combined with different MAC protocols, i.e., ALOHA and TDMA. Moreover, this paper applies these formulas to a number of realistic UWB configurations to study the impact of different UWB schemes and settings. When applied to the IEEE 802.15.4a-compliant Decawave DW1000 chip, the scalability degrades dramatically if the system operates with uncoordinated protocols and two-way communication schemes. In the best-case scenario, DW1000 chips can actively support up to 6171 tags in a single domain cell (no handover) with well-selected settings, i.e., when combining TDoA (one-way link) with TDMA. As a consequence, UWB can be used to simultaneously localize thousands of nodes in a dense network. However, we also show that the number of supported devices varies greatly depending on the MAC and PHY configuration choices.
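As a rough illustration of the kind of capacity calculation described above, the sketch below contrasts a coordinated (TDMA) and an uncoordinated (pure ALOHA) channel. It is not the model derived in this paper. The function names, the Poisson-traffic assumption behind the ALOHA success probability exp(-2G), and the example airtime, update rate, and packets-per-fix values are all assumptions chosen for illustration.

```python
import math


def tdma_capacity(slot_s: float, update_rate_hz: float, slots_per_fix: int = 1) -> int:
    """Tags supported when every position fix occupies `slots_per_fix`
    collision-free slots of `slot_s` seconds (idealised TDMA, no guard
    overhead beyond what is already included in `slot_s`)."""
    channel_time_per_tag = update_rate_hz * slot_s * slots_per_fix
    return math.floor(1.0 / channel_time_per_tag)


def aloha_capacity(airtime_s: float, update_rate_hz: float,
                   min_success: float, packets_per_fix: int = 1) -> int:
    """Tags supported by pure ALOHA while each packet still succeeds with
    probability >= `min_success`.

    Assumes Poisson arrivals, so P(success) = exp(-2 * G), where G is the
    offered load (fraction of channel time requested by all tags)."""
    max_load = -math.log(min_success) / 2.0
    load_per_tag = update_rate_hz * airtime_s * packets_per_fix
    return math.floor(max_load / load_per_tag)


if __name__ == "__main__":
    airtime = 1.0 / 1024   # hypothetical packet/slot duration in seconds
    rate = 1.0             # hypothetical update rate: one fix per second

    # One-way scheme (a single blink per fix) versus a two-way scheme that
    # needs several packets per fix; "3" is an illustrative value only.
    for packets in (1, 3):
        print(packets,
              tdma_capacity(airtime, rate, packets),
              aloha_capacity(airtime, rate, 0.9, packets))
```

Even in this simplified form, the two effects highlighted in the abstract are visible: requiring more packets per fix (two-way instead of one-way) divides the capacity, and an uncoordinated MAC supports far fewer tags than a scheduled one at the same airtime.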