Results indicate that neither the WAB-R fluency scale nor the machine learning algorithms were as useful (that is, as reliable and valid) as a simple trichotomous judgement of fluent, non-fluent, or mixed made by SLPs. Together with data from the literature, these results indicate that it is time to reconsider the use of the WAB-R fluency scale for classifying aphasia. It is also premature to rely on machine learning to rate spoken language fluency.