Maximilian Renz

Master's Thesis

Advisors
Ann-Kristin Seifer, Robert Richer, Prof. Dr. B. Eskofier

Abstract

Depression is a prevalent clinical condition, affecting more than 280 million individuals globally, corresponding to about 5% of the adult population [1]. For those affected, the condition represents a considerable burden, entailing a loss of quality of life and a reduction in life expectancy [2]. Early detection is crucial, as timely intervention can significantly enhance outcomes [3]. The duration of untreated depression plays a critical role in recovery, with earlier treatment being associated with higher remission rates [4], ultimately improving mental and social functioning [5].

Depression is usually diagnosed by a trained professional (e.g., a physician, psychiatrist, or psychologist) using structured interviews and standardized questionnaires such as the Hamilton Rating Scale for Depression (HAM-D) [6], the Beck Depression Inventory-II (BDI-II) [7], and the Patient Health Questionnaire (PHQ) [8]. However, these assessments can only be performed when patients actively seek medical care from such professionals.

There is an emerging field of research investigating alternative approaches for depression detection by assessing the potential of objective digital biomarkers, such as voice, activity, gait, and sleep duration [9]. A frequently employed biomarker for the detection of depression is voice [10,11], given that individuals with depression exhibit notable alterations in their speech patterns. These include, among others, a reduction in intensity, a decrease in fundamental frequency deviation (FFD) or pitch range, and a slowing of speech rate [12]. The application of machine learning has enabled the development of numerous automated speech analysis techniques that use these paralinguistic features [13]. Gait and posture constitute another objective digital biomarker indicative of depression: affected individuals show a slumped posture, impaired dynamic balance, and reduced gait speed [14]. In addition, generic and expert motion features have already been shown to be promising for assessing the influence of stress on body posture and movement [15].
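To give an intuition for these paralinguistic markers, intensity and fundamental frequency can be estimated with very simple signal-processing primitives. The following is a minimal, illustrative sketch (NumPy only); the autocorrelation-based pitch estimator and the frame length are simplifying assumptions for illustration, not the feature-extraction method used in this thesis:

```python
import numpy as np

def rms_intensity(frame):
    """Root-mean-square energy of an audio frame (a proxy for vocal intensity)."""
    return np.sqrt(np.mean(frame ** 2))

def estimate_f0(frame, sr, fmin=60.0, fmax=400.0):
    """Crude fundamental-frequency estimate from the autocorrelation peak."""
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)   # lag search range for fmin..fmax
    lag = lo + np.argmax(corr[lo:hi])
    return sr / lag

sr = 16_000
t = np.arange(sr) / sr
signal = 0.3 * np.sin(2 * np.pi * 220.0 * t)   # synthetic 220 Hz "voice"
f0 = estimate_f0(signal[:1024], sr)            # close to 220 Hz
```

The reduced FFD reported for depressed speech would then correspond to a smaller standard deviation of such frame-wise F0 estimates across an utterance.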

Data from these digital biomarkers can be obtained in a number of ways. In addition to external or ambient sensors, which are fixed in the environment, wearable sensing devices are often used, for instance, inertial sensors, smartphones, or smartwatches [16]. While it has been shown that voice analysis, as well as motion and gait patterns, can be used to detect depression, it has not yet been analyzed whether combining these modalities enhances depression detection. Furthermore, the majority of existing approaches are not suitable for everyday use, as they require the user to wear one or more dedicated sensors. A hearing aid, a wearable device worn on the head, can therefore record voice, gait, and other movements directly, as it inherently integrates both a motion sensor and a microphone. While earables in general are gaining popularity [18, 21], combining sensors on the ear (or head) to detect depression represents a novel approach that has not yet been documented in the literature.

The objective of this thesis is to develop an algorithm that can predict depression based on the combined analysis of voice, gait, and motion from hearing aid sensors. This entails data acquisition, feature extraction, and the development and evaluation of various machine and deep learning approaches. The data will be acquired using the hearing aid-integrated sensors, specifically the inertial sensor and the microphone. A study will be planned and conducted to collect data from 20 participants (10 healthy and 10 depressed individuals). The PHQ-8 will be used to assess the presence and severity of depression. To obtain voice data, a semi-structured interview will be conducted and recorded by the hearing aid microphones worn by the participants. To gather data related to the participants' gait, a 6-minute walking test will be conducted. Additionally, head movements will be recorded in different situations to gain insight into the participants' activity and movement. All tests will be conducted in a laboratory setting.
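For reference, PHQ-8 scoring is a simple item sum: each of the eight items is rated 0 to 3, giving a total between 0 and 24, and a total of 10 or above is the commonly used cut-off for clinically relevant depressive symptoms. A minimal sketch (function names are illustrative, not part of the study protocol):

```python
def phq8_score(items):
    """Total PHQ-8 score: sum of eight items, each rated 0-3 (range 0-24)."""
    if len(items) != 8 or any(not 0 <= i <= 3 for i in items):
        raise ValueError("PHQ-8 expects eight item scores in the range 0-3")
    return sum(items)

def phq8_positive(items, cutoff=10):
    """True if the total meets the commonly used cut-off of >= 10."""
    return phq8_score(items) >= cutoff
```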

To predict the depression score, different algorithmic approaches will be implemented and evaluated, and different classifiers will be compared with respect to how well they distinguish healthy from depressed individuals. In the first approach, voice features will be extracted using established frameworks such as openSMILE [17], and gait and motion features will be extracted, e.g., using EarGait [18] and tsfresh [19]; a classifier is then trained on these features. The second approach differs primarily in the processing of the voice signal: wav2vec 2.0, a recent deep learning model previously applied to emotion recognition [20], is used for depression detection. The model is first pre-trained on speech data from other datasets and subsequently fine-tuned on speech recorded by the hearing aid microphones. Finally, different information fusion techniques will be applied to find the best-performing depression classification method.
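As an illustration of the first approach, the sketch below trains a classifier on feature-level-fused voice and motion features using leave-one-subject-out cross-validation, which suits the small cohort size. The feature matrix here is random placeholder data standing in for openSMILE and EarGait/tsfresh outputs, and the SVM is one plausible classifier choice, not a result of this thesis:

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Placeholder features: one row per participant; columns would hold fused
# voice statistics (e.g., openSMILE) and gait/motion descriptors.
X = rng.normal(size=(20, 12))
y = np.array([0] * 10 + [1] * 10)   # 0 = healthy, 1 = depressed (PHQ-8 label)
groups = np.arange(20)              # one group per participant (LOSO CV)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
scores = cross_val_score(clf, X, y, groups=groups, cv=LeaveOneGroupOut())
mean_accuracy = scores.mean()       # near chance level on random features
```

The same evaluation scheme can be reused unchanged for the wav2vec 2.0 embeddings and for comparing early (feature-level) against late (decision-level) fusion.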