
Google Summer of Code 2017 Summary

This is a summary of my Google Summer of Code project containing the definition of the initial task, the implemented solution and references to show the work I have done.

GOAL

In this project I aimed to implement an application that will enable:

  1. Storage of large amounts of EEG data on a remote cluster using the Hadoop Distributed File System (HDFS)
  2. Fast distributed processing, analysis and machine learning on the EEG data using Apache Spark and dl4j
  3. User interface for managing the data and building data analysis workflows allowing the researchers to choose from different signal processing methods and machine learning techniques

WORK REPORT

In order to achieve this goal, I implemented a solution with three components:

  • Client GUI: a portable desktop Java application that lets the user browse and manage data on the remote Hadoop Distributed File System and visually build data analysis workflows from feature extraction and classification methods
  • Data Analysis Package: a Java application built on the Apache Spark, Apache Hadoop and Deep Learning for Java (dl4j) frameworks that provides a modular way of specifying data pipelines consisting of input sources, data processing, feature extraction and classification methods
  • Remote Server: a Spring Boot server that is the main communication point between the Client GUI and the Data Analysis Package on the Hadoop server. It listens for requests such as job submissions, fetching the results of a job, and listing trained classifiers.
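
To give a feel for what "a modular way of specifying data pipelines" can look like, here is a minimal sketch in plain Java. It is not the actual API of the Data Analysis Package: the `PipelineSketch` class, the `Stage` interface, the stage names and the toy data are all hypothetical, and it uses only the standard library instead of Spark, Hadoop or dl4j. It only illustrates the idea of chaining an input source, a feature extraction step and a classifier as interchangeable stages.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class PipelineSketch {

    /** One stage of a workflow: consumes the previous stage's output. (Hypothetical interface.) */
    interface Stage<I, O> extends Function<I, O> {}

    /** Hypothetical input source: stands in for reading EEG epochs from HDFS. */
    static List<double[]> loadEpochs() {
        return Arrays.asList(
            new double[] {1.0, 2.0, 3.0, 4.0},
            new double[] {-1.0, -2.0, -3.0, -4.0});
    }

    /** Hypothetical feature extraction: mean amplitude of each epoch. */
    static final Stage<List<double[]>, List<Double>> MEAN_FEATURE = epochs -> {
        List<Double> features = new ArrayList<>();
        for (double[] epoch : epochs) {
            double sum = 0.0;
            for (double v : epoch) {
                sum += v;
            }
            features.add(sum / epoch.length);
        }
        return features;
    };

    /** Hypothetical classifier: label 1 if the feature is positive, else 0. */
    static final Stage<List<Double>, List<Integer>> THRESHOLD_CLASSIFIER = features -> {
        List<Integer> labels = new ArrayList<>();
        for (double f : features) {
            labels.add(f > 0.0 ? 1 : 0);
        }
        return labels;
    };

    /** Chains the stages in order: input source, feature extraction, classification. */
    static List<Integer> run() {
        return MEAN_FEATURE.andThen(THRESHOLD_CLASSIFIER).apply(loadEpochs());
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [1, 0]
    }
}
```

Because every stage is just a function from one type to another, swapping the feature extraction or the classifier means replacing one stage without touching the rest of the chain.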

This is a demo of the Client GUI:

[youtube=https://www.youtube.com/watch?v=48r53zLVOLM&w=320&h=266]

You can find a more detailed explanation of the solution architecture here.

Each of these components has a separate GitHub repository of which I am the main (and only) contributor:

Although most of my work was in the above-mentioned repositories, I also created a few repositories containing guides or documentation for the GSoC project:

Lastly, all the commits I made for this project (70+) can be found in this list:

This post is licensed under CC BY 4.0 by the author.
