While several methods have been proposed to assess the influence of continuous visual cues in parallel numerosity estimation, the impact of temporal magnitudes on sequential numerosity judgments has been largely ignored. To overcome this issue, we extend a recently proposed framework that makes it possible to separate the contribution of numerical and non-numerical information in numerosity comparison by introducing a novel stimulus space designed for sequential tasks. Our method systematically varies the temporal magnitudes embedded into event sequences through the orthogonal manipulation of numerosity and two latent factors, which we designate as "duration" and "temporal spacing". This allows us to measure the contribution of finer-grained temporal features on numerosity judgments in several sensory modalities. We validate the proposed method on two different experiments in both visual and auditory modalities: results show that adult participants discriminated sequences primarily by relying on numerosity, with similar acuity in the visual and auditory modality. However, participants were similarly influenced by non-numerical cues, such as the total duration of the stimuli, suggesting that temporal cues can significantly bias numerical processing. Our findings highlight the need to carefully consider the continuous properties of numerical stimuli in a sequential mode of presentation as well, with particular relevance in multimodal and cross-modal investigations. We provide the complete code for creating sequential stimuli and analyzing participants' responses.