Renan Souza, Ph.D.
Data Science and Engineering Research Staff at Oak Ridge National Laboratory


All Publications and Patents

Please feel free to reach out to me if you need a preprint of a paper that is not available here.

Theses

Supporting User Steering in Large-scale Workflows with Provenance Data
R. Souza
COPPE/Federal University of Rio de Janeiro, Ph.D. Thesis, 2019.
[T1] [online] [pdf] [bibtex]
Controlling the Parallel Execution of Workflows Relying on a Distributed Database
R. Souza
COPPE/Federal University of Rio de Janeiro, M.Sc. Thesis, 2015.
[T2] [online] [pdf] [bibtex]
Linked Open Data Publication Strategies: An Application in Network Performance Data (in Portuguese)
R. Souza
DCC/Federal University of Rio de Janeiro, B.Sc. Thesis, 2013.
[T3] [pdf] [bibtex]

Journal Articles

A Polystore Architecture Using Knowledge Graphs to Support Queries on Heterogeneous Data Stores
L. Azevedo, R. Souza, E. S. Soares, R. Thiago, J. Tesolin, A. Oliveira, and M. Moreno
arXiv preprint Databases (cs.DB), 2023.
[J1] [doi] [online] [pdf] [bibtex]
Workflows Community Summit 2022: A Roadmap Revolution
R. da Silva, R. Badia, V. Bala, D. Bard, P. Bremer, I. Buckley, S. Caino-Lores, K. Chard, C. Goble, S. Jha, ..., R. Souza, et al.
arXiv preprint Distributed, Parallel, and Cluster Computing (cs.DC), 2023.
[J2] [doi] [online] [pdf] [bibtex]
Workflow Provenance in the Lifecycle of Scientific Machine Learning
R. Souza, L. G. Azevedo, V. Lourenço, E. Soares, R. Thiago, R. Brandão, D. Civitarese, E. Vital Brazil, M. Moreno, P. Valduriez, M. Mattoso, R. Cerqueira, and M. A. S. Netto
Concurrency and Computation: Practice and Experience, 2021.
[J3] [abstract] [online] [pdf] [bibtex]
Distributed In-memory Data Management for Workflow Executions
R. Souza, V. Silva, A. Lima, D. Oliveira, P. Valduriez, and M. Mattoso
PeerJ Computer Science, 2021.
[J4] [abstract] [doi] [online] [pdf] [bibtex]
Workflows Community Summit: Advancing the State-of-the-art of Scientific Workflows Management Systems Research and Development
R. da Silva, H. Casanova, K. Chard, ..., R. Souza, et al.
arXiv preprint Distributed, Parallel, and Cluster Computing (cs.DC), 2021.
[J5] [online] [pdf] [bibtex]
Adding Hyperknowledge-enabled data lineage to a machine learning workflow management system for oil and gas
L. Azevedo, R. Souza, R. Brandão, V. Lourenço, M. Costalonga, M. de Machado, M. Moreno, and R. Cerqueira
First Break, 2020.
[J6] [doi] [bibtex]
Keeping Track of User Steering Actions in Dynamic Workflows
R. Souza, V. Silva, J. Camata, A. Coutinho, P. Valduriez, and M. Mattoso
Future Generation Computer Systems, 2019.
[J7] [abstract] [doi] [online] [pdf] [bibtex]
Adding Domain Data to Code Profiling Tools to Debug Workflow Parallel Execution
V. Silva, L. Neves, R. Souza, A. Coutinho, D. de Oliveira, and M. Mattoso
Future Generation Computer Systems, 2018.
[J8] [doi] [bibtex]
Data Reduction in Scientific Workflows Using Provenance Monitoring and User Steering
R. Souza, V. Silva, A. Coutinho, P. Valduriez, and M. Mattoso
Future Generation Computer Systems, 2017.
[J9] [abstract] [doi] [pdf] [bibtex]
A Hybrid Architecture for Multi-party Conversational Systems
M. de Bayser, P. Cavalin, R. Souza, A. Braz, H. Candello, C. Pinhanez, and J. Briot
arXiv preprint Computation and Language (cs.CL), 2017.
[J10] [online] [pdf] [bibtex]

Conference and Workshop Papers

Towards Cross-Facility Workflows Orchestration through Distributed Automation
T. Skluzacek, R. Souza, M. Coletti, F. Suter, and R. Silva
Practice and Experience in Advanced Research Computing (PEARC ’24), 2024.
[C1] [bibtex]
Advancing Computational Earth Sciences: Innovations and Challenges in Scientific HPC Workflows
R. da Silva, K. Maheshwari, T. Skluzacek, R. Souza, and S. Wilkinson
European Geosciences Union (EGU), 2024.
[C2] [bibtex]
HKPoly: A Polystore Architecture to Support Data Linkage and Queries on Distributed and Heterogeneous Data
L. Azevedo, R. Souza, E. Soares, R. Thiago, J. Tesolin, A. Oliveira, and M. Moreno
Proceedings of the 20th Brazilian Symposium on Information Systems (SBSI), 2024.
[C3] [doi] [online] [bibtex]
Towards Lightweight Data Integration using Multi-workflow Provenance and Data Observability
R. Souza, T. Skluzacek, S. Wilkinson, M. Ziatdinov, and R. da Silva
IEEE International Conference on e-Science, 2023.
[C4] [abstract] [doi] [online] [pdf] [bibtex]
ProvLight: Efficient Workflow Provenance Capture on the Edge-to-Cloud Continuum
D. Rosendo, M. Mattoso, A. Costan, R. Souza, D. Pina, P. Valduriez, and G. Antoniu
IEEE International Conference on Cluster Computing, 2023.
[C5] [doi] [online] [pdf] [bibtex]
Context-aware Execution Migration Tool for Data Science Jupyter Notebooks on Hybrid Clouds
R. Cunha, L. Real, R. Souza, B. Silva, and M. Netto
IEEE International Conference on e-Science, 2021.
[C6] [doi] [pdf] [bibtex]
Supporting Polystore Queries using Provenance in a Hyperknowledge Graph
L. Azevedo, R. Souza, E. Soares, R. Thiago, A. Oliveira, and M. Moreno
International Semantic Web Conference (ISWC), 2021.
[C7] [pdf] [bibtex]
User Steering Support in Large-scale Workflows
R. Souza
PhD Thesis Contest: Brazilian Symposium on Databases (SBBD), 2021.
[C8] [pdf] [bibtex]
A Recommender for Choosing Data Systems based on Application Profiling and Benchmarking
E. Soares, R. Souza, R. Thiago, M. Machado, and L. Azevedo
Brazilian Symposium on Databases (SBBD), 2021.
[C9] [bibtex]
Cycle Orchestrator: A Knowledge-Based Approach for Structuring Cyclic ML Pipelines in the O&G Industry
R. Brandão, V. Lourenço, M. Machado, L. Azevedo, M. Cardoso, R. Souza, G. Lima, R. Cerqueira, and M. Moreno
International Semantic Web Conference (ISWC), 2020.
[C10] [bibtex]
A Knowledge-Based Approach for Structuring Cyclic Workflows
R. Brandão, V. Lourenço, M. Machado, L. Azevedo, M. Cardoso, R. Souza, G. Lima, R. Cerqueira, and M. Moreno
International Semantic Web Conference (ISWC), 2020.
[C11] [bibtex]
Runtime Steering of Parallel CFD Simulations
R. Souza, J. Camata, M. Mattoso, and A. Coutinho
International Conference on Parallel Computational Fluid Dynamics, 2020.
[C12] [bibtex]
Experiencing ProvLake to Manage the Data Lineage of AI Workflows
L. Azevedo, R. Souza, R. Thiago, E. Soares, and M. Moreno
Innovation Summit on Information Systems (EISI) at the Brazilian Symposium on Information Systems (SBSI), 2020.
[C13] [bibtex]
Modern Federated Databases: an Overview
L. Azevedo, R. Souza, E. Soares, and M. Moreno
International Conference on Enterprise Information Systems (ICEIS), 2020.
[C14] [bibtex]
Supporting the Training of Physics Informed Neural Networks for Seismic Inversion Using Provenance
R. Souza, A. Codas, J. Nogueira Junior, M. Quinones, L. Azevedo, R. Thiago, E. Soares, M. Cardoso, and L. Martins
American Association of Petroleum Geologists Annual Convention and Exhibition (AAPG), 2020.
[C15] [bibtex]
Managing Data Lineage of O&G Machine Learning Models: The Sweet Spot for Shale Use Case
R. Thiago, R. Souza, L. Azevedo, E. Soares, R. Santos, W. Santos, M. De Bayser, M. Cardoso, M. Moreno, and R. Cerqueira
European Association of Geoscientists and Engineers (EAGE) Digitalization Conference and Exhibition, 2020.
[C16] [doi] [pdf] [bibtex]
Efficient Runtime Capture of Multiworkflow Data Using Provenance
R. Souza, L. Azevedo, R. Thiago, E. Soares, M. Nery, M. Netto, E. Brazil, R. Cerqueira, P. Valduriez, and M. Mattoso
IEEE International Conference on e-Science, 2019.
[C17] [abstract] [doi] [online] [pdf] [bibtex]
Managing Data Traceability in the Data Lifecycle for Deep Learning Applied to Seismic Data
R. Souza, E. Brazil, L. Azevedo, R. Ferreira, E. Soares, R. Thiago, M. Nery, V. Torres, and R. Cerqueira
American Association of Petroleum Geologists Annual Convention and Exhibition (AAPG), 2019.
[C18] [online] [bibtex]
Provenance Data in the Machine Learning Lifecycle in Computational Science and Engineering
R. Souza, L. Azevedo, V. Lourenço, E. Soares, R. Thiago, R. Brandão, D. Civitarese, E. Vital Brazil, M. Moreno, P. Valduriez, M. Mattoso, R. Cerqueira, and M. A. S. Netto
Workflows in Support of Large-Scale Science (WORKS) workshop co-located with the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2019.
[C19] [abstract] [doi] [pdf] [bibtex]
Towards a human-in-the-loop library for tracking hyperparameter tuning in deep learning development
R. Souza, L. Neves, L. Azeredo, R. Luiz, E. Tady, P. Cavalin, and M. Mattoso
Latin America Data Science (LADaS) workshop co-located with the International Conference on Very Large Data Bases (VLDB), 2018.
[C20] [pdf] [bibtex]
Capturing Provenance for Runtime Data Analysis in Computational Science and Engineering Applications
V. Silva, R. Souza, J. Camata, D. de Oliveira, P. Valduriez, A. Coutinho, and M. Mattoso
Provenance and Annotation of Data and Processes - International Provenance and Annotation Workshop (IPAW), 2018.
[C21] [doi] [bibtex]
Provenance of Dynamic Adaptations in User-Steered Dataflows
R. Souza and M. Mattoso
Provenance and Annotation of Data and Processes - International Provenance and Annotation Workshop (IPAW), 2018.
[C22] [doi] [pdf] [bibtex]
Ravel: A MAS orchestration platform for Human-Chatbots Conversations
M. de Bayser, C. Pinhanez, H. Candello, M. Affonso, M. Vasconcelos, M. Guerra, P. Cavalin, and R. Souza
International Workshop on Engineering Multi-Agent Systems (EMAS@AAMAS 2018), 2018.
[C23] [pdf] [bibtex]
Scientific Data Analysis Using Data-Intensive Scalable Computing: the SciDISC Project
P. Valduriez, M. Mattoso, R. Akbarinia, H. Borges, J. Camata, A. Coutinho, D. Gaspar, N. Lemus, J. Liu, H. Lustosa, F. Masseglia, F. Nogueira Da Silva, V. Silva, R. Souza, K. Ocaña, E. Ogasawara, D. Oliveira, E. Pacitti, F. Porto, and D. Shasha
LADaS: Latin America Data Science Workshop, 2018.
[C24] [online] [pdf] [bibtex]
Spark Scalability Analysis in a Scientific Workflow
R. Souza, V. Silva, P. Miranda, A. Lima, P. Valduriez, and M. Mattoso
Brazilian Symposium on Databases (SBBD), 2017.
[C25] [pdf] [bibtex]
Parallel Execution of Workflows driven by Distributed Database Techniques
R. Souza
MSc Thesis Contest: Brazilian Symposium on Databases (SBBD), 2017.
[C26] [pdf] [bibtex]
Tracking of online parameter fine-tuning in scientific workflows
R. Souza, V. Silva, J. Camata, A. Coutinho, P. Valduriez, and M. Mattoso
Workflows in Support of Large-Scale Science (WORKS) workshop co-located with the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2017.
[C27] [online] [bibtex]
Integrating Domain-data Steering with Code-profiling Tools to Debug Data-intensive Workflows
V. Silva, L. Neves, R. Souza, A. Coutinho, D. Oliveira, and M. Mattoso
Workflows in Support of Large-Scale Science (WORKS) workshop co-located with the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2016.
[C28] [bibtex]
Online Input Data Reduction in Scientific Workflows
R. Souza, V. Silva, A. Coutinho, P. Valduriez, and M. Mattoso
Workflows in Support of Large-Scale Science (WORKS) workshop co-located with the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2016.
[C29] [online] [bibtex]
Building a question-answering corpus using social media and news articles
P. Cavalin, F. Figueiredo, M. de Bayser, L. Moyano, H. Candello, A. Appel, and R. Souza
International Conference on Computational Processing of the Portuguese Language, 2016.
[C30] [bibtex]
Enhancing Energy Production with Exascale HPC Methods
J. Camata, J. Cela, D. Costa, A. Coutinho, D. Fernández-Galisteo, C. Jiménez, V. Kourdioumov, M. Mattoso, R. Mayo-García, T. Miras, J. Moríñigo, J. Navarro, P. Navaux, D. de Oliveira, M. Rodríguez-Pascual, V. Silva, R. Souza, and P. Valduriez
CARLA: Latin American High Performance Computing Conference, 2016.
[C31] [doi] [online] [pdf] [bibtex]
Applying future Exascale HPC methodologies in the energy sector
J. Camata, J. Cela, D. Costa, A. Coutinho, D. Fernández-Galisteo, R. Souza, C. Jiménez, V. Kourdioumov, M. Mattoso, R. Mayo-García, T. Miras, J. Moríñigo, J. Navarro, D. de Oliveira, M. Rodríguez-Pascual, V. Silva, and P. Valduriez
Russian Supercomputing Days, 2016.
[C32] [online] [pdf] [bibtex]
Parallel Execution of Workflows Driven by a Distributed Database Management System
R. Souza, V. Silva, D. Oliveira, P. Valduriez, A. Lima, and M. Mattoso
ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2015.
[C33] [online] [pdf] [bibtex]
Applying data warehousing and big data techniques to analyze internet performance
T. Barbosa, R. Souza, S. Cruz, M. Campos, and R. Cottrell
International Conference on Internet Applications, Protocols, and Services (NETAPPS), 2015.
[C34] [pdf] [bibtex]
Uma Abordagem para Publicação de Dados de Proveniência de Workflows Científicos na Web Semântica
R. Castro, R. Souza, V. Silva, K. Ocaña, D. Oliveira, and M. Mattoso
Brazilian Symposium on Databases (SBBD), 2015.
[C35] [bibtex]
Linked open data publication strategies: Application in networking performance measurement data
R. Souza, L. Cottrell, B. White, M. Campos, and M. Mattoso
ASE BigData/SocialCom/CyberSecurity, Stanford, CA, 2014.
[C36] [pdf] [bibtex]

Patents

Shortened narrative instruction generator for software code change
M. Netto, L. Real, B. Silva, and R. Souza, 2024.
[P1] [online] [bibtex]
Data transformation for acceleration of context migration in interactive computing notebooks
L. Real, R. Cunha, R. Souza, and M. Netto, 2023.
[P2] [online] [bibtex]
Remotely healing crashed processes
M. Netto, B. Silva, R. Cunha, R. Souza, and L. Real, 2023.
[P3] [online] [bibtex]
Asset identification for collaborative projects in software development
L. Real, R. Cunha, M. dos Santos, and R. Souza, 2022.
[P4] [online] [bibtex]
Program context migration
L. Real, M. Netto, R. Cunha, R. Souza, and A. Braz, 2022.
[P5] [online] [bibtex]
Model Document Creation in Source Code Development Environments using Semantic-aware Detectable Action Impacts
A. Appel, C. De Freitas, R. Souza, C. Mendes, A. Vital, N. Dos, S. Marcelo, M. Stelmar Netto, P. Avegliano, and C. Villas, 2022.
[P6] [online] [bibtex]
Continuous storage of data in a system with limited storage capacity
L. Real, M. dos Santos, and R. Souza, 2021.
[P7] [online] [bibtex]
Metadata-based scientific data characterization driven by a knowledge database at scale
R. Souza, R. Mozart, F. Da Silva, A. Vital, and V. Silva, 2021.
[P8] [online] [bibtex]
Creating coordinated multi-chatbots using natural dialogues by means of knowledge base
M. De Bayser, A. Braz, P. Cavalin, F. Figueiredo, and R. Souza, 2018.
[P9] [online] [bibtex]
System and method for managing artificial conversational entities enhanced by social knowledge
A. Braz, P. Cavalin, F. Figueiredo, M. De Bayser, and R. Souza, 2018.
[P10] [online] [bibtex]
Predicting user question in question and answer system
A. Appel, A. Gama Leal, and R. Souza, 2017.
[P11] [online] [bibtex]

Last updated on 2024-06-27.

Complete version of the curriculum vitae: Renan-Souza-CV-full.pdf.

Source code to generate this website and the CV: