Machine learning techniques such as support vector machines are applied to a text classification task to identify mental health problems. The inputs are transcribed speech samples from a “structured-narrative task” and the outputs are psychiatric categories such as schizophrenia. In a preliminary trial, subjects from three groups generated speech samples: patients with clinically diagnosed schizophrenia (31), patients with clinically diagnosed mania (16) and controls (9). Although the structured-narrative task led all subjects to use a limited vocabulary (only 1100 distinct words in total), classification accuracy approaching 80% was achieved for the schizophrenia versus control task. Performance at this level suggests that the method is suitable for diagnostic or screening purposes. Results are expected to improve further in experiments utilising free-speech samples. Diagnostic categories in psychiatry can be broad and heterogeneous; schizophrenia, for example, covers a range of very different symptoms. In further experiments, clustering techniques are used to extract task-relevant diagnostic categories from psychiatric reports. In these reports, psychiatrists typically include biographic, background and referral information, a description of symptoms and an opinion on treatment recommendations. At the task level, diagnostic reports are written for a specific audience or decision-making body. In preliminary experiments, detailed and specific diagnostic categories have been extracted from psychiatric reports by unsupervised learning. These categories reflect the everyday practice of a mental health professional.
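
The classification setup described above (transcribed speech as input, a support vector machine as classifier, a diagnostic group as output) can be illustrated with a minimal sketch. It rests on assumptions that the text does not confirm: word-count ("bag-of-words") features, a linear kernel, and training by plain subgradient descent on the regularised hinge loss. The toy transcripts are invented for illustration and are not data from the trial, the labels +1 and -1 are placeholders for two groups, and all function names are hypothetical.

```python
from collections import Counter


def bag_of_words(text, vocab):
    """Map a transcript to a vector of word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]


def decision(w, b, x):
    """Signed score of the linear classifier: w . x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=50):
    """Subgradient descent on the L2-regularised hinge loss (labels are +1 or -1)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # The margin is checked with the current weights, before this step's update.
            violated = yi * decision(w, b, xi) < 1.0
            # Regulariser: shrink the weights slightly on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if violated:
                # Hinge-loss subgradient: push the boundary away from this sample.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the decision boundary."""
    return 1 if decision(w, b, x) >= 0.0 else -1


# Invented toy transcripts (assumption: NOT data from the study).
# +1 and -1 are placeholder group labels with no clinical meaning.
texts = ["then walks home", "then buys bread", "again words again", "again sounds"]
labels = [1, 1, -1, -1]

vocab = sorted({word for t in texts for word in t.split()})
X = [bag_of_words(t, vocab) for t in texts]
w, b = train_linear_svm(X, labels)

for t in texts + ["then walks bread"]:
    print(t, "->", predict(w, b, bag_of_words(t, vocab)))
```

With a real corpus, the vocabulary would be built from the training transcripts only, and accuracy would be estimated on held-out subjects (for example by cross-validation) rather than on the training samples themselves.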