Applied Scientist & Sr. Software Engineer | Data & AI Platforms for Edge-Cloud-HPC

Summary

Tech lead, senior software engineer, and researcher building intelligent data and AI platforms that accelerate discovery. With 15+ years at IBM, ORNL, SLAC, and UFRJ, I translate domain expertise into scalable, production-grade systems spanning edge, cloud, and leadership-class supercomputers. My work centers on highly scalable, low-latency, observable architectures that put provenance and metadata first and integrate heterogeneous data systems to enable reliable, reproducible, and explainable large-scale agentic and AI workflows.

Areas of Expertise

  • AI/ML, LLM-driven, and agentic workflows
  • Edge-Cloud-HPC Computing
  • Provenance-driven data analysis, lineage, and observability
  • Scalable data engineering (SQL, NoSQL, KGs, Streaming, Parallel File Systems)

Education

Federal University of Rio de Janeiro, Brazil

    Ph.D. in Computer Science | Sep 2015 — Dec 2019

    M.Sc. in Computer Science | Jan 2013 — Jul 2015

    B.Sc. in Computer Science | Jan 2009 — Dec 2012

Experience

Oak Ridge National Laboratory Oct 2022 — Present

Staff Scientist & Sr. Software Engineer, HPC Workflows, Data & AI | Knoxville, USA

  • Led R&D on workflow provenance and observability for AI-driven science, focusing on transparency, reliability, and reproducibility in end-to-end workflows.

  • Developed provenance models and system mechanisms to connect user intent, agent decisions, workflow executions, and downstream results in unified traces.

  • Validated methods through real deployments spanning Edge-Cloud-HPC environments and interactive workflows that require low-latency, auditable responses.

  • Published and presented results in workflow and eScience venues, and drove community engagement through tutorials and reference architectures.

IBM Research Apr 2015 — Oct 2022

Staff Scientist & Sr. Software Engineer, Cloud, Data & AI | Rio de Janeiro, Brazil

  • Conducted applied R&D in data management and AI systems, producing peer-reviewed outputs and patented innovations.

  • Explored scalable data services and governance-aware approaches that support AI lifecycle requirements in enterprise contexts.

  • Collaborated internationally across research and engineering teams to validate ideas in real deployments and user scenarios.

SLAC National Accelerator Laboratory, Stanford University May 2013 — Dec 2014

Research Software Engineering Intern | Menlo Park, USA

  • Applied semantic web and scalable data management methods to publish structured measurement data for broad community use.

Federal University of Rio de Janeiro Jan 2010 — Sep 2014

Software Engineer (Intern → Engineer) | Rio de Janeiro, Brazil

  • Built applied semantic web and linked data solutions, grounding research ideas in real systems and user needs.

  • Integrated heterogeneous data sources for analytics and reporting in early applied work.

Petrobras May 2007 — May 2008

IT Intern | Rio de Janeiro, Brazil

  • Gained early industry experience in software delivery and operations.

Selected Ongoing Projects

American Science Cloud (AmSC)

A secure, federated cloud environment integrating DOE facilities and data to enable AI-ready datasets and scalable model services. I develop multi-agent methods and implementations within ORNL use cases for cross-facility science workflows.

Orchestrated Platform for Autonomous Laboratories (OPAL)

A DOE multi-lab initiative to make biological discovery self-driving using AI, robotics, and automated experimentation. I lead the development of agentic AI systems that connect interactive intent to Frontier-scale execution and multimodal analysis.

Advanced Manufacturing into Leadership-class Supercomputers via AI Agents

A modular architecture for autonomous cross-facility experimentation using AI agents, programmable facility APIs, and provenance-aware workflows. I led the multi-agent communication and workflow steering components.

Flowcept

A provenance system for traceable, auditable, and reproducible agentic workflows that unifies runtime signals, lineage, and agent interactions. I created and lead Flowcept, and we are advancing agentic provenance capabilities.

Technical Knowledge

  • Programming Languages: Python, Java, C, C++, C#, Shell, NodeJS, Scala, Lua

  • Data Science/ML: PyTorch, MLFlow, Airflow, Pandas, Polars, Jupyter, Matplotlib, Plotly

  • Agentic AI: MCP, CrewAI, LangChain, Streamlit, Chainlit, RAG, LLM-based orchestration

  • Big Data & Streaming Platforms: Apache Spark, Dask, Kafka, Redis, RabbitMQ

  • Parallel & Distributed Programming: MPI, OpenMP, CUDA, PubSub

  • Cloud, HPC, DevOps: Kubernetes, OpenShift, Docker, Slurm, LSF, PBS; CI/CD (GitHub Actions, Jenkins, Travis); Grafana

  • Databases & Knowledge Graphs: PostgreSQL/PostGIS, MySQL, MongoDB, Elasticsearch, HBase, Hive, Redis, LMDB, Polystores; AllegroGraph, Jena, Virtuoso, RDF, SPARQL, OWL

Publications and Events

For all publications and patents, visit renansouza.org/publications.

For all event participation, visit renansouza.org/events.