Sensor Fusion for Identification of Freezing of Gait Episodes Using Wi-Fi and Radar Imaging

Parkinson’s disease (PD) is a progressive neurodegenerative condition that causes motor impairments. One of the most challenging motor-related impairments in Parkinson’s patients is freezing of gait (FOG). During an FOG episode, the patient is unable to initiate, control or sustain a gait, which affects the Activities of Daily Living (ADLs) and increases the occurrence of critical events such as falls. This paper presents continuous monitoring of ADLs and classification of FOG episodes using Wi-Fi and radar imaging. The idea is to exploit the multi-resolution scalograms generated from the channel state information (CSI) imprint and the micro-Doppler signatures produced by the reflected radar signal. A total of 120 volunteers took part in the experimental campaign and were asked to perform different activities including walking fast, walking slow, voluntary stop, sitting down and standing up, and freezing of gait. Two neural networks, namely an Autoencoder and a proposed enhanced Autoencoder, were used to classify ADLs and FOG episodes using a data fusion process that combines the images acquired from both sensing techniques. The Autoencoder provided an overall classification accuracy of ~87% for the combined datasets. The proposed algorithm provided significantly better results, with an overall accuracy of ~98% using data fusion.

disease, patients' health is badly affected. The enduring ailment forces them to take more frequent sick leave and eventually leaves them no choice but to consider early retirement from work, which gives them sufficient time to look after their health and to prevent sudden and fatal disorders. Early retirement in turn leads to a loss in productivity along with extreme healthcare and other societal expenditures.
In medical healthcare, one of the most challenging motor-related impairments in Parkinson's patients is freezing of gait (FOG). During an FOG episode, the patient is unable to initiate, control or sustain a gait. FOG events are usually confined to a short period of time, after which the patient regains control and continues regular walking. Statistics indicate that half of Parkinson's patients experience these episodes twice a month and a third experience them on a day-to-day basis [2]. This results in limited mobility, deterioration in mental well-being and an adverse effect on activities of daily living, and puts patients at high risk of falls that cause fatal injuries [3]. Critical events such as falls account for 20% to 30% of injuries among elderly people [4].
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
These consequences make FOG episodes a serious health concern for patients. It is thus highly important to provide means for identifying FOG episodes and reducing their effect. Rhythmic auditory stimulation (RAS), such as playing marching music and dance therapy, has been shown to be a safe, inexpensive and effective method for resuming normal gait in PD patients [5].
Non-invasive wireless sensing that leverages channel state information (CSI) and micro-Doppler signatures is one of the most promising solutions in medical healthcare for detecting FOG episodes in real time and delivering a feedback cue that alerts the patient, helps restore normal gait and reduces the risk of critical events such as falls. Providing a timely cue to Parkinson's disease patients has been shown to decrease the severity of particular episodes [6] and to reduce the overall duration of the episodes by 35% [7]. Several researchers have designed FOG classifiers based on either software or hardware devices. The software-based FOG detector [8] presents several constraints, such as mapping machine learning algorithms onto hardware, whereas FOG detectors for PD patients that provide a hardware implementation are obtrusive and bulky to wear on the body. Several case studies have mapped complex machine learning classifiers onto real-time hardware implementations for wearable healthcare applications. Field Programmable Gate Array (FPGA) based systems have been proposed for heartbeat monitoring [9], [10], detection of seizure episodes in epilepsy [11]-[13] and activities of daily living [14], [15]. In light of these advances in the healthcare sector, this article presents a non-invasive, low-cost and easily deployable FOG detector based on the data fusion of scalograms obtained using Wi-Fi sensing and spectrograms acquired using a radar sensor. A novel deep neural network, namely the enhanced Autoencoder, is proposed that classifies activities of daily living and detects FOG episodes with high accuracy.
The paper is organized as follows: Section II discusses the related work on freezing of gait detection. Section III introduces the preliminaries of Wi-Fi channel state information and radar micro-Doppler signatures. Section IV presents the proposed FOG detection system, and Section V explains the pre-processing used to generate scalograms from CSI and spectrograms from the signals received by the radar sensor. Section VI gives a brief introduction to the Autoencoder and discusses the proposed enhanced Autoencoder. Section VII describes the experimental setup and how the data were acquired, Section VIII details the results obtained, and the final section concludes the paper.

II. RELATED WORK
Several researchers have addressed the application of software-based systems and wearable devices in the healthcare sector to recognize human activities [16] and detect inertial movements [17]. Extensive work that leverages wireless sensing for medical healthcare applications has been published in [18], [19]. In addition, several recent research articles have been published on patient monitoring and the quantification of Parkinson's disease motor impairments using wearable sensors only. The main issues with these articles are the locations of the sensors placed on the body, the experimental setup and the achieved accuracy. Comprehensive reviews of medical healthcare detection systems are given in [20], [21]. Several systems exploit data mining and machine learning algorithms to determine the severity of a particular disease. Arora et al. [19] recruited ten clinically diagnosed PD patients and ten healthy volunteers. Android smartphones were placed on each subject's body, and the subjects were asked to perform activities such as finger tapping, the finger-to-nose test and walking back and forth. The authors claim that this method could easily discriminate PD patients from healthy volunteers using machine learning algorithms; however, due to the limited number of subjects and the low classification accuracy, this solution is not suitable for FoG detection. Numerous wearable sensors have been used for objective assessment of FoG episodes, yet there is little or no agreement with regard to sensor location, number of participants, experimental setup and data-processing techniques. A stand-alone tri-axial accelerometer is widely used, as in [22], as is a magnetometer or a combination of accelerometers and gyroscopes. Wearable and non-wearable sensors are used for cardiovascular activity monitoring, gait identification and activities of daily living [23]-[27].
The data-processing and classification algorithms differ across the aforementioned systems. Threshold-based algorithms are easily implemented and deliver adequate performance [28]; however, they require the threshold levels to be tuned for every patient. Support vector machine (SVM), Naïve Bayes (NB) and random forest (RF) are commonly used machine learning algorithms for classification tasks. NB and RF are less computationally complex than SVM, but the latter presents higher classification accuracy and robustness against a large amount of data. Capecci et al. [29] used a smartphone worn on the waist to run a gait test in which accelerometer data are processed with a threshold-based method, obtaining an accuracy of 84.4%. Rodríguez et al. [30] used an SVM algorithm with data obtained from 21 participants equipped with a wearable sensor to detect FoG, reporting a specificity and sensitivity of 79.0% and 74.7%, respectively. In [31], the authors used a convolutional neural network model on accelerometer data from 21 volunteers, providing an accuracy of 92.3%.

III. PRELIMINARIES

A. Wireless Wi-Fi Signals & Channel State Information Extraction
Wi-Fi signals based on Orthogonal Frequency Division Multiplexing (OFDM), such as IEEE 802.11a/n/ac, can efficiently overcome the multipath frequency-selective fading experienced in an indoor environment. The OFDM frequency spectrum is split into several orthogonal frequency carriers onto which the data to be transmitted are encoded and mapped using the same modulation scheme. The signal at the receiving side is down-converted into a baseband signal. The sub-frequency channels can be converted from the time domain into the frequency domain using a serial-to-parallel signal converter and then applying the Fast Fourier Transform (FFT) on all frequency channels, as shown in figure 1. The input serial data stream is formatted into the word size required for transmission and shifted into a parallel format. The data are then transmitted in parallel by assigning each data word to one carrier in the transmission. The serial stream is converted into parallel because all subcarriers are transmitted at once in one burst rather than one after another, as is the case for serial transmission, so that each frequency channel carries adequate information. The operating frequency of the Wi-Fi router is set to 2.4 GHz.
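The serial-to-parallel conversion and the FFT step can be illustrated with a minimal NumPy sketch. It is not the transceiver used in this work: the FFT size, the number of bursts and the QPSK mapping are assumptions chosen for illustration, and the channel is taken to be ideal.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subcarriers = 64      # assumed FFT size, not taken from the paper
n_ofdm_symbols = 10     # assumed number of bursts

# Serial stream of QPSK data words (one complex word per subcarrier per burst).
bits = rng.integers(0, 2, size=(n_subcarriers * n_ofdm_symbols, 2))
serial_words = ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)

# Serial-to-parallel: each row is one burst, each column one subcarrier.
parallel_words = serial_words.reshape(n_ofdm_symbols, n_subcarriers)

# Transmitter: the inverse FFT turns the parallel words into a time-domain burst.
tx_time = np.fft.ifft(parallel_words, axis=1)

# Receiver (ideal channel): the FFT recovers the word carried by each subcarrier.
rx_words = np.fft.fft(tx_time, axis=1)

recovered_serial = rx_words.reshape(-1)
max_error = np.max(np.abs(recovered_serial - serial_words))
```

In a real channel each recovered word would additionally be scaled and rotated by the complex channel response of its subcarrier, which is the quantity that CSI reports.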
Commodity devices such as the Intel 5300 and Atheros ar5b225 have open-source device drivers that allow recording the channel state information of each frequency channel (carrier). The CSI represents fine-grained physical layer channel measurements and captures wireless channel characteristics such as power distortion, multipath fading and the shadowing effect.
Let $H_i$ denote the channel state information value of frequency channel $i$, which is a complex value given as:

$$H_i = |H_i| \, e^{j \angle H_i}$$

Here $|H_i|$ and $\angle H_i$ denote the amplitude and phase information of the $i$-th subcarrier, respectively. The measured phase of an individual subcarrier $i$, $\angle \hat{H}_i$, is written as follows:

$$\angle \hat{H}_i = \angle H_i + (\lambda_p + \lambda_s)\, m_i + \lambda_c + \beta + Z$$

where $\beta$ indicates the initial phase offset of the phase-locked loop for the $i$-th subcarrier and $m_i$ is the subcarrier index of the $i$-th frequency channel. The environmental noise is represented by $Z$, and $\lambda_p$, $\lambda_s$ and $\lambda_c$ are the phase error, sampling frequency offset and central frequency offset, respectively. The CSI phase information is not reliable because of the random noise in the radio frequency channels introduced by the off-the-shelf Intel 5300 network interface. Hence, in this article, we have only considered the amplitude information retrieved from the channel state information.
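As a minimal sketch of this step, the amplitude can be separated from the complex CSI values as shown next. The array shape (packets by 30 subcarriers), the dB conversion and the function name `csi_amplitude_db` are illustrative assumptions, and the CSI values are synthetic rather than recorded data.

```python
import numpy as np

def csi_amplitude_db(csi):
    """Return the amplitude of complex CSI values in dB."""
    amplitude = np.abs(csi)
    # A small floor avoids log10(0) for empty subcarriers.
    return 20.0 * np.log10(np.maximum(amplitude, 1e-12))

# Synthetic stand-in for recorded CSI: 5 packets x 30 subcarriers.
rng = np.random.default_rng(1)
csi = rng.normal(size=(5, 30)) + 1j * rng.normal(size=(5, 30))

amp_db = csi_amplitude_db(csi)      # used for further processing
phase = np.angle(csi)               # available, but discarded in this work
```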

B. Radar Technology & Micro-Doppler Signatures
Radar sensors use radio signals to identify and locate a target. A traditional radar sensor system consists of a transmitter, a receiver and a signal processing functional unit. When operating, the sensor continuously radiates electromagnetic signals, and a target within the area of interest reflects the signal back to the radar system. Frequency modulated continuous wave (FMCW) radar uses a transmission frequency that varies linearly across the waveform, which overcomes range-profile issues, and is now widely used in short-range wireless sensing applications, including activities of daily living and fall detection in healthcare [23]. The FMCW radar operating at 5.8 GHz is extremely robust against interference from RF signal sources and has the capacity to record human micro-Doppler signatures while maintaining high resolution. The specifications of the FMCW radar sensor used in this work are presented in table I. The main reason for using a radar in the 5.6-6 GHz region with an operating frequency of 5.8 GHz (unlicensed ISM band) is that the Wi-Fi transmitter was working at 2.4 GHz (also an unlicensed ISM band). To avoid co-channel interference, we opted for a Wi-Fi router and a radar sensor that both work in an unlicensed band but operate at different frequencies.
In the mathematical expression of the RF signal transmitted by an FMCW radar sensor, $T_F$ is the total duration of a frame, as indicated in Figure 1, and $N_F$ represents the total number of transmitted frames. In the transmitted FMCW signal comprising $L$ chirps at the $i$-th frame, $f_0$ is the operating frequency and $\mu$ denotes the rate of change of the instantaneous frequency of an FMCW chirp signal. The value of $\mu$ is determined by dividing the bandwidth ($B$) by the duration of a frame ($T_F$), i.e. $\mu = B / T_F$.
In equation 6, $a$ indicates the complex-valued amplitude of the reflected signal of the $m$-th target for the $i$-th frame, $d_s$ is the distance between the corresponding arrays, and $\tau$ and $w^{(i)}_{l,k}(t)$ carry the range, Doppler and direction-of-arrival information.
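To make the role of the chirp rate concrete, the following sketch computes $\mu$ and the beat frequency that a static target at a given range would produce. The 400 MHz bandwidth is the value reported for the radar in the data acquisition section, whereas the 1 ms sweep time is an assumption made only for illustration; the function names are hypothetical.

```python
import numpy as np

C = 3.0e8                 # speed of light in m/s
bandwidth = 400e6         # 400 MHz sweep, as used by the radar in this work
sweep_time = 1e-3         # assumed sweep duration in seconds (not from the paper)
mu = bandwidth / sweep_time   # chirp rate in Hz/s

def beat_frequency(target_range_m):
    """Beat frequency of a static target: chirp rate times round-trip delay."""
    round_trip_delay = 2.0 * target_range_m / C
    return mu * round_trip_delay

def range_resolution():
    """Smallest separable range difference for the given bandwidth."""
    return C / (2.0 * bandwidth)

# Baseband chirp phase (in cycles) is (mu/2)*t^2; its derivative is mu*t.
t = np.linspace(0.0, sweep_time, 1001)
instantaneous_freq = np.gradient(0.5 * mu * t**2, t)
```

Under these assumptions a target at 3 m maps to a beat frequency of 8 kHz, and the 400 MHz bandwidth corresponds to a range resolution of 0.375 m.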

IV. THE PROPOSED FOG DETECTION SYSTEM
The architecture of the proposed freezing of gait detection system is presented in figure 2. It solves a multi-class classification problem to identify FOG episodes and detect ADLs. The proposed system involves a low-cost, commercially available 2.4 GHz Wi-Fi router and a 5.8 GHz Ancortek radar sensor. Each body movement induces a unique CSI imprint, represented as a scalogram extracted from the variances of the amplitude information, and a micro-Doppler signature generated by the received signal reflected from the subject's body. The multiresolution time-frequency scalograms are obtained using the continuous wavelet transform, and the micro-Doppler signatures are acquired using the short-time Fourier transform. The Wi-Fi-based and radar-based signatures are used as image objects to train and test the proposed enhanced classifier to detect FOG episodes.

V. PRE-PROCESSING
This section discusses the pre-processing of the CSI data obtained using Wi-Fi signals and the micro-Doppler signatures received using the radar sensor. The channel state information is extracted from received Internet Control Message Protocol (ICMP) packets. Consequently, in theory, the total number of CSI packets received is the same as the number of ICMP packets. However, it was observed that the number of received CSI packets was slightly lower than the number of transmitted ICMP packets. To synchronize and calibrate the frequency of the collected data, we performed a linear transformation on the raw CSI. IEEE 802.11n Wi-Fi networks use multiple data subcarriers that transmit Wi-Fi signals simultaneously by exploiting orthogonal frequency division multiplexing. In principle, each subcarrier carries independent information; however, adjacent subcarriers at times carry similar information. For this reason, we used principal component analysis to obtain an independent dataset for each human activity, so that the CSI data packets are combined into multiple independent principal components.
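The two pre-processing steps can be sketched as follows. This is an illustration under stated assumptions rather than the processing chain of this work: the linear transformation is interpreted here as linear interpolation onto a uniform time grid, the 50 Hz grid rate is arbitrary, the data are synthetic, and `resample_uniform` and `pca` are hypothetical helper names.

```python
import numpy as np

def resample_uniform(timestamps, amplitudes, rate_hz):
    """Linearly interpolate each subcarrier onto a uniform time grid."""
    grid = np.arange(timestamps[0], timestamps[-1], 1.0 / rate_hz)
    columns = [np.interp(grid, timestamps, amplitudes[:, k])
               for k in range(amplitudes.shape[1])]
    return grid, np.stack(columns, axis=1)

def pca(data, n_components):
    """Project mean-centred data onto its leading principal components."""
    centred = data - data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, singular_values, vt = np.linalg.svd(centred, full_matrices=False)
    scores = centred @ vt[:n_components].T
    explained = singular_values**2 / np.sum(singular_values**2)
    return scores, explained[:n_components]

# Synthetic stand-in: 30 correlated subcarriers with irregular packet arrival times.
rng = np.random.default_rng(2)
timestamps = np.sort(rng.uniform(0.0, 5.0, size=250))
common = np.sin(2 * np.pi * 1.5 * timestamps)
amplitudes = np.outer(common, rng.uniform(0.5, 1.5, size=30))
amplitudes += 0.05 * rng.normal(size=amplitudes.shape)

grid, uniform = resample_uniform(timestamps, amplitudes, rate_hz=50.0)
scores, explained = pca(uniform, n_components=3)
```

Because the synthetic subcarriers share one underlying motion pattern, almost all of the variance ends up in the first principal component, which mirrors the observation that adjacent subcarriers carry similar information.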

A. Scalogram
Time-frequency scalograms are extracted from the CSI amplitude information of each received packet and are used as features to detect FoG episodes. The extracted features are a multiresolution energy density function obtained from 5000 packets using the Continuous Wavelet Transform (CWT). The energy density function $E(t, f)$ is obtained by squaring the magnitude of the CWT $C_d(t, f)$ of a discrete sequence. The time-frequency scalogram can be computed from the CWT $C_c(t, s)$ of a continuous-time signal $x(t)$, represented in terms of time $t$ and scale $s$ as in equation 7:

$$C_c(t, s) = \frac{1}{\sqrt{|s|}} \int_{-\infty}^{\infty} x(v) \, \psi^{*}\!\left(\frac{v - t}{s}\right) dv$$

Here $\psi\left(\frac{v - t}{s}\right)$ denotes the dilation of the wavelet $\psi(t)$. We substitute $v - t = \tau$ and express the scale $s$ as a function of frequency $f$, with $s = g_1(w) = g_2(f)$. The continuous wavelet transforms of discrete and continuous signals are represented in equations 8 and 9. Here $x(kT)$ is a discrete sequence of samples with period $T = 1/F$, where $F$ is the sampling frequency. The CWT of a discrete signal is obtained when $x(kT)$ is replaced with $CSI_{SC}(kT)$, as expressed in equation 10.
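The discrete transform and the squared-magnitude scalogram described above can be sketched in NumPy as follows. This is only an approximation of the processing in this work: a complex Morlet wavelet is swapped in for the Morse wavelet, the input is a synthetic 2 Hz oscillation rather than measured CSI, the frequency grid is arbitrary, and the function names are hypothetical.

```python
import numpy as np

def morlet(t, scale, w0=6.0):
    """Complex Morlet wavelet dilated by `scale` (stand-in for the Morse wavelet)."""
    x = t / scale
    return np.exp(1j * w0 * x) * np.exp(-0.5 * x**2) / np.sqrt(scale)

def scalogram(signal, fs, freqs, w0=6.0):
    """Squared magnitude of a discrete CWT, one row per analysis frequency."""
    n = len(signal)
    t = (np.arange(n) - n // 2) / fs          # wavelet support centred on zero
    energy = np.empty((len(freqs), n))
    for row, f in enumerate(freqs):
        scale = w0 / (2 * np.pi * f)          # scale expressed as a function of frequency
        wavelet = morlet(t, scale, w0)
        # Correlate the signal with the dilated wavelet at every time shift.
        coeffs = np.convolve(signal, np.conj(wavelet[::-1]), mode="same") / fs
        energy[row] = np.abs(coeffs) ** 2
    return energy

fs = 60.0                                      # sampling frequency F quoted in the text
time = np.arange(600) / fs                     # 10 seconds of samples
amplitude = np.sin(2 * np.pi * 2.0 * time)     # synthetic 2 Hz "CSI amplitude"
freqs = np.linspace(0.5, 10.0, 40)
E = scalogram(amplitude, fs, freqs)
peak_freq = freqs[np.argmax(E.mean(axis=1))]
```

The time-averaged energy of this toy signal concentrates in the row closest to 2 Hz, which is the behaviour a scalogram is expected to show for a periodic body movement.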
In equation 10, $F = 60$ Hz is the sampling frequency of the CSI amplitude information and $T = 0.02$ seconds. The mother wavelet used in this work is the "morse" wavelet. The scalogram $E(t, f)$ can further be described mathematically as follows:

$$E(t, f) = |C_d(t, f)|^2$$

The continuous wavelet transform-based scalograms deliver multiresolution time-frequency analyses that depend primarily on the window size, which results from the various dilations of the mother wavelet. The scalograms capture transient changes in the amplitude information of the CSI packets caused by human body movements and present high resolution because shorter window durations are used at higher frequencies. Moreover, the scalograms can detect smooth features in waveforms owing to the longer window durations at lower frequencies. Figure 3(a) shows the variations in amplitude information for a single frequency carrier obtained using Wi-Fi sensing. A clear transition between human activities can be seen, implying that each body motion induces a unique imprint in terms of amplitude variation in dB. The scalograms presented in

B. Micro-Doppler Signature
Any subject moving within radar range exhibits mechanical rotation or vibration in addition to its bulk translation. This produces a frequency modulation on the reflected electromagnetic signal that generates sidebands around the subject's Doppler frequency shift, known as the micro-Doppler effect [24], [25]. An object in motion at a point $P$ vibrates with frequency $f_v$ and displacement amplitude $D_v$, so that its displacement is $D(t) = D_v \sin(2\pi f_v t) \cos\beta \cos\alpha_p$. Let $R_0$ be the distance between the radar sensor and the initial position $O$ of the moving target. The total range between the two changes constantly with time and, including the target's micro-movements, is expressed as $R(t) = R_0 + D(t)$. The reflected signal received can be written as:

$$s_r(t) = \rho \exp\left\{ j \left[ 2\pi f_0 t + \frac{4\pi R(t)}{\lambda} \right] \right\}$$

Here $f_0$ is the carrier frequency, $\lambda$ is its wavelength and $\rho$ is the backscattering coefficient. Substituting $R(t)$, the received signal can be described as follows:

$$s_r(t) = \rho \exp\left( j \frac{4\pi R_0}{\lambda} \right) \exp\left\{ j \left[ 2\pi f_0 t + \frac{4\pi}{\lambda} D_v \sin(w_v t) \cos\beta \cos\alpha_p \right] \right\}$$

Here $w_v = 2\pi f_v$. The derivative of the second phase component provides the micro-Doppler shift, which is expressed as:

$$f_{mD} = \frac{1}{2\pi} \frac{d}{dt}\left[ \frac{4\pi}{\lambda} D_v \sin(w_v t) \cos\beta \cos\alpha_p \right] = \frac{2}{\lambda} D_v w_v \cos\beta \cos\alpha_p \cos(w_v t)$$

VI. AUTOENCODER AND PROPOSED ENHANCED AUTOENCODER FOR FOG DETECTION
One of the major challenges researchers face when applying deep neural networks to classify RF signals (Wi-Fi and radar in our case) is the limited size of the available dataset. RF data acquisition is time-consuming and expensive and involves a large number of volunteers [26]. It is highly unlikely that hundreds of thousands of Wi-Fi and radar recordings can be collected, and hence novel algorithms have to be introduced to avoid underfitting and overfitting problems. In this context, a 14-layer deep convolutional neural network architecture has been proposed for human gait classification. In this work, we consider a conventional Autoencoder and compare the results with our proposed three-layer enhanced Autoencoder.

A. Auto-Encoder
An Autoencoder is a neural network that reproduces its input values at the output side under specific constraints. For instance, given an input matrix $x$, the network aims to estimate $h_w(x) \approx x$. The unsupervised algorithm was introduced to initialize the weights and biases of an Autoencoder and is extremely effective when only a limited amount of training data is available. The Autoencoder neural network implements the unsupervised pretraining process by encoding and then decoding the input data.
The Autoencoder approximates a nonlinear mapping on the input data matrix $x$ as follows:

$$e(x) = \sigma(W x + b)$$

where $\sigma$ describes the nonlinear activation function, and $W$ and $b$ are the weights and biases at the encoder side, respectively. The features encoded in the network, $e(x)$, are then decoded in order to reconstruct the input matrix $x$ using the following function [27]:

$$\tilde{x} = \sigma(\tilde{W} e(x) + \tilde{b})$$

The values of $\tilde{W}$ and $\tilde{b}$ are the weights and biases at the decoder side, respectively. During the unsupervised pretraining process, the neural network tries to reduce the reconstruction error to its minimum value:

$$\min_{W, b, \tilde{W}, \tilde{b}} J = \lVert x - \tilde{x} \rVert^2$$

To prevent the Autoencoder from simply copying its input, a cost function with a sparsity parameter is applied to force the network to learn the correlation among the given input data. With the addition of the sparsity parameters, the cost function can thus be written as:

$$J_{sparse} = J + \beta \sum_{j=1}^{h} KL(p \,\|\, p_j)$$

where $h$ denotes the number of hidden neurons, $\beta$ is the sparsity proportion and $KL$ describes the Kullback-Leibler divergence, which can be expressed as follows:

$$KL(p \,\|\, p_j) = p \log\frac{p}{p_j} + (1 - p) \log\frac{1 - p}{1 - p_j}$$

where $p_j$ represents the average activation of the $j$-th hidden neuron and $p$ denotes the target activation value. After pretraining the network, the decoder is removed and the encoder is trained with a supervised learning method by adding a SoftMax classifier with six neurons after the encoder. An input vector of length $K$ is fed to the SoftMax function, which normalizes it into a probability distribution consisting of $K$ probabilities, $P(y_k | x_i)$ for $k = 1, 2, \ldots, K$, proportional to the exponentials of the input numbers [28], [29]. In the probability distribution function, the input value $x_i$ corresponds to the class label $y_k$. The class probability $p_k$ can be mathematically expressed as:

$$p_k = P(y_k | x_i) = \frac{\exp(a_k)}{\sum_{j=1}^{K} \exp(a_j)}$$

where $a_k$ is the $k$-th element of the input vector.
The optimum values of the weights and biases are obtained when the cost function is minimized. The gradient-based 'fine-tuning' technique is applied to resolve the values as indicated in
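A minimal NumPy sketch of a sparse Autoencoder cost and a SoftMax output is given below for illustration. It is not the network trained in this work: the sigmoid activation, the layer sizes, the target activation `p = 0.05`, the weight `beta = 1.0` and all function names are assumptions, and the inputs are random toy vectors.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def kl_bernoulli(p, p_j):
    """KL divergence between Bernoulli variables with means p and p_j."""
    return p * np.log(p / p_j) + (1 - p) * np.log((1 - p) / (1 - p_j))

def sparse_autoencoder_cost(x, W, b, W_dec, b_dec, p=0.05, beta=1.0):
    """Reconstruction error plus a KL sparsity penalty on the hidden activations."""
    hidden = sigmoid(x @ W + b)                # encoder
    recon = sigmoid(hidden @ W_dec + b_dec)    # decoder
    mse = np.mean(np.sum((x - recon) ** 2, axis=1))
    p_j = np.clip(hidden.mean(axis=0), 1e-8, 1 - 1e-8)   # mean activation per neuron
    return mse + beta * np.sum(kl_bernoulli(p, p_j)), hidden

def softmax(a):
    """Normalise each row into a probability distribution."""
    shifted = a - a.max(axis=1, keepdims=True)  # improves numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(3)
x = rng.uniform(size=(8, 20))                   # 8 flattened toy "images"
W, b = 0.1 * rng.normal(size=(20, 5)), np.zeros(5)
W_dec, b_dec = 0.1 * rng.normal(size=(5, 20)), np.zeros(20)
cost, hidden = sparse_autoencoder_cost(x, W, b, W_dec, b_dec)
probs = softmax(hidden @ rng.normal(size=(5, 6)))   # six output classes
```

Pretraining would minimize `cost` with respect to all four parameter arrays; afterwards the decoder parameters are discarded and the encoder output feeds the SoftMax layer, as described above.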

B. Proposed Enhanced Autoencoder
The enhanced Autoencoder combines the advantages of the convolutional filtering of convolutional neural networks with the unsupervised pretraining of an Autoencoder. In contrast to the topology of a conventional Autoencoder network, the encoder of the proposed neural network consists of convolutional layers instead of fully connected layers, and the decoder consists of deconvolutional layers. The deconvolutional filters can simply be transposed copies of the convolutional filters or, as done in this work, learned from scratch. Moreover, every deconvolutional layer is followed by an unpooling layer [30]. The unpooling operation relies on saving the locations of the largest values during the pooling process, which preserves these values while zeroing the remaining ones. The proposed system is ten times faster than the conventional Autoencoder and can provide adequate results when only a limited number of observations are available for training, as is the case for RF sensing systems such as ours. The spatial locality of the neural network is preserved by applying a convolutional function at each neuron. Hence, for a specific input $P$, the encoder calculates the values as follows:

$$e_n = \sigma(P * F_n + b)$$

Here $\sigma$ is the activation function, $*$ is the 2-D convolution, $F_n$ is the $n$-th 2-D convolutional filter and $b$ represents the bias of the encoder. The unsupervised pretraining process applied to the neural network aims to minimize the corresponding reconstruction cost. After the unsupervised pretraining process, the decoding part is removed and fully connected layers and a softmax classifier are added at the end of the neural network. Next, the neural network is fine-tuned by an optimization function, as done in a convolutional neural network. Similarly, the ADAM algorithm is used in conjunction with the ReLU activation function to optimize the two fully connected layers, which have 150 hidden neurons each.
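The pooling and unpooling mechanism described above can be illustrated with a short NumPy sketch. The 2 by 2 non-overlapping pooling window, the single-channel input and the function names are assumptions for illustration; the input height and width must be divisible by the window size.

```python
import numpy as np

def max_pool_with_switches(x, size=2):
    """Non-overlapping max pooling that also records where each maximum was."""
    h, w = x.shape
    blocks = x.reshape(h // size, size, w // size, size).transpose(0, 2, 1, 3)
    flat = blocks.reshape(h // size, w // size, size * size)
    switches = flat.argmax(axis=2)         # position of the maximum inside each block
    pooled = flat.max(axis=2)
    return pooled, switches

def unpool(pooled, switches, size=2):
    """Place each pooled value back at its recorded location and zero the rest."""
    ph, pw = pooled.shape
    flat = np.zeros((ph, pw, size * size), dtype=pooled.dtype)
    rows, cols = np.indices((ph, pw))
    flat[rows, cols, switches] = pooled
    blocks = flat.reshape(ph, pw, size, size).transpose(0, 2, 1, 3)
    return blocks.reshape(ph * size, pw * size)

x = np.array([[1., 2., 0., 1.],
              [3., 4., 5., 0.],
              [0., 1., 2., 2.],
              [9., 0., 1., 7.]])
pooled, switches = max_pool_with_switches(x)
restored = unpool(pooled, switches)
```

For this input the pooled map is `[[4, 5], [9, 7]]`, and the restored map contains exactly those four values at their original positions with zeros everywhere else.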
The hyperparameters of the proposed enhanced Autoencoder were optimized by the grid search method, as discussed in a subsequent section. The overall architecture of the proposed enhanced Autoencoder is presented in figure 6. The deep network architecture was implemented using 14 hidden neurons. Selecting the optimum number of hidden neurons involved rigorous experimentation. Three methods were used, namely fixed, constructive and destructive. In the fixed approach, a group of neural networks with different numbers of hidden neurons was trained and evaluated on the available test dataset using a number of randomly selected starting weights. The number of hidden neurons was incremented by one, two or more, depending on the computational resources available. Plotting the evaluation criterion (e.g. the sum of squared errors) on the test set as a function of the number of hidden neurons for each neural network generally produces a bowl-shaped error graph. The network with the least error, found at the bottom of the bowl, was selected because it generalized best. This approach was time-consuming but generally worked very well. The constructive and destructive approaches change the number of hidden neurons during training rather than creating separate networks with different numbers of hidden neurons, as in the fixed approach. The constructive approach adds hidden neurons until the network performance starts deteriorating. The destructive approach is similar, except that hidden neurons are removed during training.
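The fixed approach can be sketched as a sweep over candidate hidden-layer sizes. The example below is only illustrative: the stand-in model (a random tanh hidden layer with a least-squares readout), the toy regression data, the candidate sizes and the names `evaluate_network` and `fixed_approach` are all assumptions and do not correspond to the networks trained in this work.

```python
import numpy as np

def evaluate_network(n_hidden, seed, x_train, y_train, x_test, y_test):
    """Sum of squared test errors of a toy one-hidden-layer network.

    Stand-in model: random tanh hidden layer with a least-squares readout.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(x_train.shape[1], n_hidden))   # random starting weights
    b = rng.normal(size=n_hidden)
    readout, *_ = np.linalg.lstsq(np.tanh(x_train @ W + b), y_train, rcond=None)
    pred = np.tanh(x_test @ W + b) @ readout
    return float(np.sum((pred - y_test) ** 2))

def fixed_approach(candidates, seeds, data):
    """Average the test error over seeds and return the best hidden-layer size."""
    mean_errors = {n: np.mean([evaluate_network(n, s, *data) for s in seeds])
                   for n in candidates}
    best = min(mean_errors, key=mean_errors.get)
    return best, mean_errors

rng = np.random.default_rng(4)
x = rng.uniform(-1, 1, size=(60, 2))
y = np.sin(3 * x[:, 0]) + 0.5 * x[:, 1] + 0.1 * rng.normal(size=60)
data = (x[:40], y[:40], x[40:], y[40:])
best, errors = fixed_approach(candidates=[2, 4, 8, 16, 32], seeds=range(5), data=data)
```

Plotting `errors` against the candidate sizes gives the error curve discussed above, and `best` is the size at its minimum.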

VII. EXPERIMENTAL SETTINGS AND DATA ACQUISITION
The data were recorded from two different sources, i.e. the radar sensor and the Wi-Fi router. The experiments were performed in a large room at the University of Glasgow, as shown in figure 8. The radar sensor antennas were kept at a distance of 0.25 meter, as shown in figure 8. All the activities were performed with the aspect angle parallel to the radar sensor and Wi-Fi transmitter. ... challenge the classifier. Accurate classification of FOG episodes with minimum false alarms and few missed detections is vital, because inaccurate detection of these events can have extremely adverse effects on the patients. In this context, all of the 120 volunteers were asked to repeat each activity more than six times, resulting in 4320 observations in total. The duration of each activity was 5 seconds, with the exception of the walking activity, which was recorded for 10 seconds.

A. Data Acquisition Using Wi-Fi
The experimental design to detect FOG episodes in indoor settings (a 10 meters by 12 meters room at the University of Glasgow, UK) is shown in figure 6. The transmitter, in this case, is a Wi-Fi router that operates at 2.4 GHz and is deployed 8 meters away from the receiving antenna, in line-of-sight. The receiver is an omni-directional antenna wired to the Intel 5300 network interface card installed in the PCIe slot of a desktop computer (Dell Inspiron, Intel® Core™ i7-9700 processor, 8 GB RAM). The experimental procedure involved the acquisition of multiple data frequency carriers, where 30 OFDM subcarriers were obtained from each CSI packet. These packets contain the variances of the amplitude and phase information of a specific human activity, which were processed using signal processing techniques in Matlab 2019 in order to obtain the time-frequency scalograms. Figure 7 shows the perturbations of the amplitude information for five human activities, obtained from channel state information using Wi-Fi sensing. Figure 7(a) shows the box plot (amplitude level against the group of 30 subcarriers) for a person walking fast in indoor settings. A large variation can be seen, with the maximum fluctuation occurring between 12 dB and 25 dB. As the person stops walking, the change in amplitude level decreases dramatically and only slight variations can be observed, as in figure 7(b). There was a sudden increase in the variances of the amplitude information when the person started walking very fast, as in figure 7(c). For sitting down on a chair, the CSI signatures are distinct from the other four imprints, as shown in figure 7(d). When the person was experiencing an FOG episode, the feet felt as if they were glued to the ground while the upper body was trying to move; hence only slight variations can be seen in figure 7(e).
The channel state information measurements for activities of daily living and FOG episodes show distinguishable variances in the amplitude information against individual subcarriers, as shown in figure 8. The box plot for CSI data against individual subcarriers is shown in figure 9, indicating the first

$$IQR = Q3_{CSI_{sc}} - Q1_{CSI_{sc}} \quad (28)$$

$$min_{CSI_{sc}} = Q1_{CSI_{sc}} - 1.5\,(IQR)$$

The box plot in figure 8 primarily presents statistical information regarding the variances of the amplitude information of the CSI packets across all 30 subcarriers. It is evident from the box plot that the statistical features of an individual frequency carrier, including quartiles, interquartile ranges and median, do not provide adequate information to distinguish activities of daily living and may be similar for specific subcarriers.
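The box plot statistics can be computed per subcarrier as in the sketch below. The amplitude matrix is synthetic, the function name `box_plot_stats` is hypothetical, and the upper whisker limit follows the usual symmetric convention rather than an expression given in the text.

```python
import numpy as np

def box_plot_stats(amplitudes_db):
    """Per-subcarrier quartiles, IQR and whisker limits (rows: packets)."""
    q1, median, q3 = np.percentile(amplitudes_db, [25, 50, 75], axis=0)
    iqr = q3 - q1
    return {
        "q1": q1, "median": median, "q3": q3, "iqr": iqr,
        "lower": q1 - 1.5 * iqr,     # lower whisker limit, as in the text
        "upper": q3 + 1.5 * iqr,     # upper limit by the usual symmetric convention
    }

# Synthetic stand-in: 200 packets x 30 subcarriers of amplitude values in dB.
rng = np.random.default_rng(5)
amplitudes_db = 18.0 + 3.0 * rng.normal(size=(200, 30))
stats = box_plot_stats(amplitudes_db)
```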

B. Data Acquisition Using Radar
The Ancortek radar sensor used to collect data transmitted signals with a power of approximately +20 dBm at 5.8 GHz and a bandwidth of 400 MHz. Two Yagi antennas with a gain of 10 dBi and horizontal and vertical beamwidths of 60 degrees were used as transmitter and receiver. The Yagi antenna is primarily manufactured from conductors with high reflectivity. It has numerous applications owing to its unidirectional pattern, high gain and broad bandwidth. It is designed around a folded dipole and directors, where the dipole is electrically connected to the feeder. The dipole used here is resonant, with a length of half the wavelength (at the 5.8 GHz operating frequency), and is fed by the feeder. The radiator is set to 5% less than the folded dipole and is deployed at the front at a distance of less than a quarter of the operating wavelength. The directors are made progressively shorter outwards so that the Yagi antennas deployed at the transmitter and receiver sides taper in the direction of radiation and reception, which makes the antenna extremely directional for detecting activities of daily living. The radiation pattern of the Yagi antenna used in the proposed system is presented in figure 10.
The radar sensor was powered through a Universal Serial Bus (USB) cable, since its power consumption is within the limits of the USB standard. In addition, with regard to the life span and autonomy of the proposed system in a realistic deployment, it would always be connected to a computer for data acquisition and operational tasks, or to the electricity mains in indoor settings.
The FMCW radar sensor provides both a range profile and Doppler information. In this work, we have only considered the latter to limit the computational cost, since ADLs and FOG can be accurately identified using this information alone. For data acquisition, the radar was placed in the same room where the Wi-Fi system was deployed. The participants were asked to perform the activities within a short range of about 1 meter to 3 meters, and the radar data were recorded simultaneously with the Wi-Fi data. The two antennas were deployed so as to keep the torso of the volunteers in the center of the radar beam and enhance the strength of the reflected signal. The recorded data were processed using the Short Time Fourier Transform (STFT) to obtain the spectrograms and produce the micro-Doppler signatures.
The Range-Time-Intensity graph can be obtained by stacking the reflected signals in a matrix and applying the Fast Fourier Transform (FFT) along the fast-time direction to produce the range cells of the human action. Next, the STFT algorithm is applied to the range profile that contains the targets, in our case people in action, to generate their micro-Doppler signatures [31]. The STFT applies a sequence of FFTs with narrow, overlapping windows along the total duration of the collected data; the absolute squared value of the resulting complex matrix is known as the spectrogram, which is a plot of the velocities of the body parts in action (obtained through the Doppler effect) with respect to time. A notch MTI filter is applied to mitigate the contribution of static targets near 0 Hz, such as walls, ceiling, floor and furniture. Figure 11 shows the micro-Doppler signatures of activities of daily living and FOG episodes obtained in the room at the University of Glasgow. Positive micro-Doppler components correspond to movements towards the radar, while negative values correspond to movements away from it.
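The spectrogram computation can be illustrated with the following NumPy sketch, which operates on a single synthetic range-bin signal. It is a simplified stand-in for the processing described above: the 1 kHz repetition rate, the window length and hop size are assumptions, mean removal replaces the notch MTI filter, and the scatterer model is invented for illustration.

```python
import numpy as np

def spectrogram(slow_time, prf, window_len=128, hop=16):
    """Squared magnitude of an STFT built from overlapping Hann-windowed FFTs."""
    window = np.hanning(window_len)
    starts = range(0, len(slow_time) - window_len + 1, hop)
    frames = np.stack([slow_time[s:s + window_len] * window for s in starts])
    stft = np.fft.fftshift(np.fft.fft(frames, axis=1), axes=1)
    doppler_hz = np.fft.fftshift(np.fft.fftfreq(window_len, d=1.0 / prf))
    return doppler_hz, np.abs(stft.T) ** 2       # rows: Doppler bins, columns: time

prf = 1000.0                                     # assumed chirp repetition rate in Hz
t = np.arange(4000) / prf
# Synthetic range-bin signal: strong static clutter plus an oscillating scatterer.
clutter = 5.0 * np.ones_like(t, dtype=complex)
mover = np.exp(1j * 2 * np.pi * (80.0 * t + 10.0 * np.sin(2 * np.pi * 1.0 * t)))
signal = clutter + mover

# Simple stand-in for the notch MTI filter: remove the mean (0 Hz component).
filtered = signal - signal.mean()
doppler_hz, S = spectrogram(filtered, prf)
dominant = doppler_hz[np.argmax(S.mean(axis=1))]
```

Without the clutter removal step the 0 Hz bin dominates the spectrogram; after it, the energy sits at the positive Doppler frequencies of the moving scatterer.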

VIII. RESULTS AND DISCUSSIONS
In this article, the neural network models are implemented in Python using Keras, which uses TensorFlow as its tensor manipulation library. Both the Autoencoder and the proposed enhanced Autoencoder classification algorithms are tested on the scalograms of the Wi-Fi data and the spectrograms of the radar sensor, using tenfold cross-validation on both datasets as described in an earlier section. Both neural networks are trained for 300 epochs with a mini-batch size of 80. The validation accuracy for both datasets is obtained by holding out 25% of the training data as the validation set, and the models are evaluated on this validation set after the completion of each iteration.
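One way to organize the evaluation protocol above is sketched below. It only generates index sets and is an assumption about the procedure: the text does not state whether the folds were shuffled or stratified by class, and the function name and seed are hypothetical.

```python
import numpy as np

def tenfold_splits(n_samples, n_folds=10, val_fraction=0.25, seed=0):
    """Yield (train_idx, val_idx, test_idx) for each cross-validation fold."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)      # shuffle once (assumption)
    folds = np.array_split(order, n_folds)  # ten nearly equal folds
    for k in range(n_folds):
        test_idx = folds[k]
        rest = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Hold out 25% of the training portion as the validation set.
        n_val = int(round(val_fraction * len(rest)))
        yield rest[n_val:], rest[:n_val], test_idx
```

Each fold would then be used to train a fresh model for 300 epochs with a mini-batch size of 80, monitoring accuracy on the validation indices and reporting accuracy on the test indices.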
The adaptive moment estimation (Adam) algorithm is used to optimize the pretraining process and for fine-tuning, with a learning rate of 0.001. Using a grid search, we identified the optimum values for width and depth that do not lead to overfitting, as described in Table II (Wi-Fi sensing), Table III (radar sensor) and Table IV (data fusion, combining scalograms and spectrograms). We opted for a three-layer Autoencoder with layers of 100, 50 and 25, respectively. The best classification accuracy with optimized parameters is shown in red.
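The grid search over width and depth can be illustrated with the following sketch. The candidate values and the `toy_evaluate` scoring function are hypothetical placeholders: in practice the scoring function would train a network with the given width and depth and return its validation accuracy.

```python
from itertools import product

def grid_search(evaluate, widths, depths):
    """Return (best_score, best_width, best_depth) over all combinations."""
    best = None
    for width, depth in product(widths, depths):
        score = evaluate(width, depth)  # e.g. mean validation accuracy
        if best is None or score > best[0]:
            best = (score, width, depth)
    return best

# Toy stand-in for training and validating a network; NOT a real model.
# Its peak at width 50, depth 3 is arbitrary and unrelated to the paper.
def toy_evaluate(width, depth):
    return 1.0 - abs(width - 50) / 1000 - abs(depth - 3) / 10

best_score, best_width, best_depth = grid_search(
    toy_evaluate, widths=[25, 50, 100], depths=[1, 2, 3]
)
```

Because every combination is trained and scored independently, the cost grows with the product of the number of candidate widths and depths, which is why the search is usually restricted to a small grid.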
For all three scenarios (Wi-Fi sensing, radar sensing, and data fusion with the combined datasets), the best classification performance was obtained with layers of 100, 50 and 25. Scalogram classification for Wi-Fi sensing provided an accuracy of 84.5%, and spectrogram classification for radar sensing delivered an accuracy of 85.5%. When the two datasets were combined, an accuracy of 87.1% was achieved, a small improvement.
The hyperparameters of the proposed enhanced Autoencoder were also optimized using grid search, as presented in Table V to Table VII. The enhanced Autoencoder was implemented with three convolutional layers. Both the convolutional and deconvolutional layers were populated with 30 concatenated filters of sizes 9 by 9 and 3 by 3. The best classification accuracy using scalograms with Wi-Fi sensing, 92.3%, was obtained with a width of 100 and a depth of 2, with filter sizes of 9 by 9 and 3 by 3. In addition, the classification accuracy of radar sensing, 94.7%, was slightly better than that of Wi-Fi sensing, as shown in Table VI. When the same network configuration was applied to the combined datasets of both devices, a significant change in performance was observed: the accuracy of the system increased to 98.1% when classifying all human activities. Table VII shows the highest classification accuracy (in red), obtained for a width of 100 and a depth of 2.
The confusion matrices of data fusion for recognizing ADLs and identifying FOG episodes (walking slow, voluntary stop, walking fast, sitting down and standing up, and FOG episodes) are shown in Table VI and Table VII for the Autoencoder and the proposed enhanced Autoencoder, respectively. In Table VI, the Autoencoder provided adequate classification accuracy. However, similar activities, namely walking fast and walking slow, and voluntary stop and FOG episodes, were deliberately chosen to confuse the classifier and thereby challenge its performance. For instance, walking fast was classified with an accuracy of around 90%, but it was sometimes misclassified as walking slow, giving an error of ∼10%. A similar pattern was observed for the other pair of activities, voluntary stop and FOG episodes. The freezing of gait event was identified with an accuracy of 85%, but the algorithm showed a false negative rate of 14% for voluntary stop.
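The per-activity figures quoted from a confusion matrix can be derived as in the sketch below. It assumes that rows hold the true class and columns the predicted class, and that every class has at least one sample; the function name is hypothetical and the counts in the usage note are toy values, not results from this study.

```python
import numpy as np

def per_class_rates(confusion):
    """Return (per-class accuracy, per-class miss rate, overall accuracy).

    Assumes rows are true classes and columns are predicted classes.
    """
    cm = np.asarray(confusion, dtype=float)
    row_totals = cm.sum(axis=1)                 # samples per true class
    class_accuracy = np.diag(cm) / row_totals   # correctly classified share
    miss_rate = 1.0 - class_accuracy            # share assigned elsewhere
    overall_accuracy = np.trace(cm) / cm.sum()
    return class_accuracy, miss_rate, overall_accuracy
```

For a toy two-class matrix `[[8, 2], [1, 9]]`, the per-class accuracies are 0.8 and 0.9 and the overall accuracy is 0.85. The off-diagonal entries of each row show which other activity absorbs the misclassified samples, which is how confusion between similar pairs such as voluntary stop and FOG becomes visible.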
A significant improvement in the classification accuracy of individual activities can be seen in Table VII. The proposed algorithm detected the walking fast, sitting down and standing up activities with 100% accuracy. The two pairs of similar activities, which were included to confuse the proposed algorithm, were also classified with accuracies of ∼99%.

IX. CONCLUSION
This paper presented the detection of freezing of gait in patients with Parkinson's disease, a condition that primarily causes motor impairments leading to a high risk of falls. In this context, activities of daily living, namely walking fast, walking slowly, voluntary stop, sitting down on and standing up from a chair, and freezing of gait episodes, were monitored using two non-invasive sensing techniques: Wi-Fi-based sensing and radar-based sensing. Multi-resolution scalograms were produced from the CSI data, and spectrograms were obtained from the signal received by the radar sensor. An extensive experimental campaign was conducted involving more than 100 volunteers aged from 30 to 76 years. For classification, two deep neural networks, an Autoencoder and the proposed convolutional neural network based enhanced Autoencoder, were used to classify activities of daily living and detect FOG events. The Autoencoder delivered an overall classification accuracy of ∼87% for the combined datasets, whereas the proposed algorithm outperformed it, providing a classification accuracy of 98.1% for the same datasets.