IPE 05-2023 Master Thesis or Internship: Development of a Large-Scale Data Monitoring System for High-Throughput Streaming and Real-Time Analysis of Data-Intensive Scientific Equipment
- Institute for Data Processing and Electronics (IPE)
Industrial and scientific applications increasingly rely on data-intensive scientific equipment (e.g., high-throughput cameras). Real-time data evaluation is therefore highly valuable for interactive monitoring. However, processing large amounts of data is challenging: it requires orchestrating an efficient data flow and implementing efficient parallel algorithms suited to modern hardware architectures.
In this project, you will build a data monitoring system for data-intensive scientific equipment using data processing and statistical methods. The real-time data processing will identify high-value targets for further analysis, thereby accelerating scientific discovery compared with traditional methods. The system will be used in several scientific pilot experiments. Notably, the KATRIN experiment (Karlsruhe Tritium Neutrino Experiment) has numerous sensor instruments that generate data at an unprecedented rate. You will work closely with our team of experienced researchers and engineers to design and implement a robust system that can handle the challenges of processing and analyzing large volumes of data in real time.
The goal of the work is to develop a data monitoring system that can manage such demanding experiments. You are expected to design the architecture of the system and integrate it with the real-time data processing pipelines of the experiment. In particular, you will implement several communication interfaces (e.g., REST or gRPC) and use Redis as the intermediate data storage.
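To give a feel for the kind of component involved, here is a minimal Python sketch of a rolling statistical check on a stream of sensor readings. It is an illustration only, not the project's design: the class name `RollingMonitor`, the window size, and the threshold are hypothetical, and an in-memory `deque` stands in for the intermediate store (Redis in the actual system) so that the example runs without a server. The communication interfaces (REST or gRPC) are not shown.

```python
from collections import deque
from statistics import mean, pstdev


class RollingMonitor:
    """Keeps the last `window` readings and flags values far from the rolling mean.

    Hypothetical sketch: the deque is an in-memory stand-in for the
    intermediate data store; window and threshold are illustrative values.
    """

    def __init__(self, window=100, threshold=3.0):
        self.buffer = deque(maxlen=window)  # oldest readings drop out automatically
        self.threshold = threshold          # allowed deviation, in standard deviations

    def push(self, value):
        """Add a reading; return True if it deviates strongly from recent history."""
        flagged = False
        if len(self.buffer) >= 2:
            mu = mean(self.buffer)
            sigma = pstdev(self.buffer)
            flagged = sigma > 0 and abs(value - mu) > self.threshold * sigma
        self.buffer.append(value)
        return flagged


monitor = RollingMonitor(window=50, threshold=3.0)
readings = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 25.0, 10.1]  # made-up sample values
flags = [monitor.push(r) for r in readings]
print(flags)
```

In this made-up sample only the reading `25.0` is flagged. A real deployment would read from and write to the shared store and would likely need more robust statistics, since a single outlier inflates the rolling standard deviation for the readings that follow it.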
- Collaborate with the team to understand the requirements of the data monitoring system for the different scientific experiments.
- Design and develop scalable and efficient algorithms and data structures for high-throughput data streaming and real-time analysis.
- Implement software components and modules that integrate with existing data acquisition systems.
- Test and debug the system to ensure its reliability, performance, and accuracy.
- Optimize the system to handle large-scale data and maximize its efficiency and scalability.
- Document the development process, including design decisions, implementation details, and troubleshooting steps.
- Collaborate with the scientists to integrate the data monitoring system into the overall data analysis workflow.
- Very good knowledge of, and practical experience with, Python and web-development programming languages
- Good understanding of cloud architecture development and/or database storage is a plus
Carrasco Sanchez, Raquel
Tel: +49 721 608-42016