Summary
Tech lead, senior software engineer, and research scientist building intelligent data and AI platforms that accelerate scientific discovery. With 15+ years at IBM, ORNL, SLAC, and UFRJ, I practice user-centric system design: I keep domain experts in the development loop and rapidly translate abstract requirements from domains such as Energy, Chemistry, Biology, and Climate into production-grade systems that are easier to operate, maintain, and scale. My research focuses on highly scalable, low-latency, observable, provenance- and metadata-first architectures that support comprehensive data analysis across heterogeneous infrastructure, bridging edge instruments, cloud clusters, and leadership-class supercomputers, as well as data integration across SQL/NoSQL databases, knowledge graphs, messaging, streaming, and parallel file systems. My current focus includes AI/ML, LLM-driven, and agentic workflows. I have authored 50+ papers, won best thesis and best paper awards, and reviewed for major venues such as IEEE TPDS, IEEE Big Data, IEEE eScience, FGCS, VLDB, and Supercomputing, and I hold 10+ USPTO patents.
Areas of Expertise
- AI/ML, LLM-driven, and Agentic workflows
- Edge-Cloud-HPC Computing
- Provenance-driven data analysis, lineage, and observability
- Scalable data engineering (SQL, NoSQL, KGs, Streaming, Parallel File Systems)
Education
Federal University of Rio de Janeiro, Brazil
Ph.D. in Computer Science | Sep 2015 — Dec 2019
Thesis: Supporting User Steering in Large-scale Workflows with Provenance Data
M.Sc. in Computer Science | Jan 2013 — Jul 2015
Thesis: Controlling the Parallel Execution of Workflows Relying on a Distributed Database
B.Sc. in Computer Science | Jan 2009 — Dec 2012
Thesis: Linked Open Data Publication Strategies: An Application in Network Performance Data
International experience:
Visiting Ph.D. Student - Inria/Univ. Montpellier, France | Jan 2019 — Mar 2019
Computer Science exchange student - Missouri State University, U.S. | Jun 2011 — Jun 2012
Experience
Oak Ridge National Laboratory Oct 2022 — Present
Staff Scientist & Sr. Software Engineer, HPC Workflows, Data & AI | Knoxville, USA
- Led R&D on workflow provenance and observability for AI-driven science, focusing on transparency, reliability, and reproducibility in end-to-end workflows.
- Developed provenance models and system mechanisms to connect user intent, agent decisions, workflow executions, and downstream results in unified traces.
- Validated methods through real deployments spanning Edge-Cloud-HPC environments and interactive workflows that require low-latency, auditable responses.
- Published and presented results in workflow and eScience venues, and drove community engagement through tutorials and reference architectures.
IBM Research Apr 2015 — Oct 2022
Staff Scientist & Sr. Software Engineer, Cloud, Data & AI | Rio de Janeiro, Brazil
- Conducted applied R&D in data management and AI systems, producing peer-reviewed outputs and patented innovations.
- Explored scalable data services and governance-aware approaches that support AI lifecycle requirements in enterprise contexts.
- Collaborated internationally across research and engineering teams to validate ideas in real deployments and user scenarios.
SLAC National Accelerator Laboratory, Stanford University May 2013 — Dec 2014
Research Software Engineering Intern | Menlo Park, USA
- Applied semantic web and scalable data management methods to publish structured measurement data for broad community use.
Federal University of Rio de Janeiro Jan 2010 — Sep 2014
Software Engineer (Intern → Engineer) | Rio de Janeiro, Brazil
- Built applied semantic web and linked data solutions, grounding research ideas in real systems and user needs.
- Early applied work on integrating heterogeneous data sources for analytics and reporting.
Petrobras May 2007 — May 2008
IT Intern | Rio de Janeiro, Brazil
- Early industry experience in software delivery and operations.
Selected Ongoing Projects
American Science Cloud (AmSC)
AmSC integrates DOE computing and experimental facilities, high-performance networks, and data resources into a secure, federated environment for AI-ready datasets and model services. I design and implement multi-agent workflow frameworks within ORNL use cases and align them with platform APIs for cross-facility interoperability.
Orchestrated Platform for Autonomous Laboratories (OPAL)
OPAL seeks to create a network of autonomous laboratories that can learn and adapt by combining AI, robotics, and automated experimentation. I lead agentic AI systems that translate expert intent into HPC-scale execution and multimodal analytics, enabling human-in-the-loop discovery workflows.
Advanced Manufacturing into Leadership-class Supercomputers via AI Agents
A cross-facility system integrating a natural language interface, a multi-agent decision framework, programmable facility APIs, and provenance-aware infrastructure for adaptive and reproducible workflows. I led the multi-agent communication layer and dynamic steering mechanisms that connect decisions to actions across facilities.
Flowcept
Flowcept captures runtime provenance with low overhead and links tasks, lineage, telemetry, and AI-agent interactions into end-to-end traces for accountability and reproducibility. I created and lead the platform, which underpins multiple DOE cross-facility initiatives and current work on agentic provenance.
Technical Knowledge
- Programming Languages: Python, Java, C, C++, C#, Shell, NodeJS, Scala, Lua
- Data Science/ML: PyTorch, MLFlow, Airflow, Pandas, Polars, Jupyter, Matplotlib, Plotly
- Agentic AI: MCP, CrewAI, LangChain, Streamlit, Chainlit, RAG, LLM-based orchestration
- Big Data & Streaming Platforms: Apache Spark, Dask, Kafka, Redis, RabbitMQ
- Parallel & Distributed Programming: MPI, OpenMP, CUDA, PubSub
- Cloud, HPC, DevOps: Kubernetes, OpenShift, Docker, Slurm, LSF, PBS; CI/CD (GitHub Actions, Jenkins, Travis); Grafana
- Databases & Knowledge Graphs: PostgreSQL/PostGIS, MySQL, MongoDB, Elasticsearch, HBase, Hive, Redis, LMDB, Polystores; AllegroGraph, Jena, Virtuoso, RDF, SPARQL, OWL
Grants and Awards
- ORNL performance award for top performers (2025).
- Best paper award at WORKS@Supercomputing for the paper "The (R)evolution of Scientific Workflows in the Agentic AI Era: Towards Autonomous Science" (2025).
- LDRD-funded project leadership: managed nearly $2M and led 10+ researchers and developers (2023–2025).
- IBM Patent Plateaus for high-impact software and data innovations (8+ USPTO patents) (2020, 2021).
- SBBD Honorable Mention for the Best Ph.D. Thesis Award (2021).
- Early Academic Honors: SBBD Best M.Sc. Thesis and CAPES International Science Grant (2012–2017).
Teaching and Supervisions
Courses
- Databases Laboratory (UFRJ), 2017. Teaching assistant to Prof. Marta Mattoso
- Logics for Computer Science (UFRJ), 2012–2013. Teaching assistant to Prof. Mario Benevides
Supervisions
- Pedro Paiva Miranda: A Mechanism for Fault Tolerance in Parallel Executions of Workflows supported by a Database. Undergraduate (UFRJ), 2015.
- Rachel Gonçalves de Castro: Publication of Workflow Provenance Data in the Semantic Web. Undergraduate (UFRJ), 2015.
Scientific Community Service
- IEEE Transactions on Parallel and Distributed Systems - Reviewer
- Future Generation Computer Systems - Reviewer
- Concurrency and Computation: Practice and Experience - Reviewer
- Journal of Parallel and Distributed Computing - Reviewer
- The Very Large Data Bases (VLDB) Journal - Reviewer
- IEEE Transactions on Big Data - Reviewer
- Journal of Cloud Computing - Reviewer
- Computer Physics Communications - Reviewer
- Discover Data - Reviewer
- Frontiers in High Performance Computing - Reviewer and Editorial Board
- International Workshop on AI Principles in Science Communication (AISC) - PC
- International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD’25) - PC
- Workflows in Distributed Environments (WiDE’24) - PC
- IEEE/ACM Supercomputing (SC’24) - PC
- IEEE International Conference on e-Science (eScience’23) - Session Chair, PC
- Workflows in Support of Large-Scale Science (WORKS’20, 21, 23, 24, 25) - PC
- Brazilian Workshop on Database and Artificial Intelligence Integration - PC
- Brazilian Symposium on Databases (SBBD’20, 23, 24, 25, 26) - PC, Session Chair
- Brazilian e-Science (BreSci’26) - PC
- Innovation Summit on Information Systems (at SBSI’19, 20) - PC
Badges and Certifications
- Machine Learning Specialist Professional: Exploratory Data Analysis, Regression, Classification, Deep Learning, Reinforcement Learning, Unsupervised Learning, Time Series and Survival Analysis, AI Ethics and Explainability (2022).
- Trustworthy AI and AI Ethics: Trustworthy AI foundations and applied AI ethics (2022).
- LinkedIn Skill Assessment: Python, MySQL, Linux, T-SQL, NoSQL
Languages
- English: Full professional proficiency
- Portuguese: Native
- Spanish: Fluent reading; intermediate speaking and listening
Publications and Events
For all publications and patents, visit renansouza.org/publications.
For all event participation, visit renansouza.org/events.